Appium Commands
The Appium Vega Driver supports the following World Wide Web Consortium (W3C) WebDriver commands:
find_element
The find_element command finds the first UI element that matches a specified selector strategy.
element = driver.find_element('UiSelector', '{"args": {"text": "Element Text"}}')
The command supports the UiSelector and XPath (beta) selector strategies. If no element matches the specified selector, the command raises an exception.
To use XPath with the Appium Vega Driver, use the By.xpath() locator strategy.
appium_driver.find_element(by="xpath", value='//*[contains(@clickable, "true")]')
Arguments
Name | Type | Required? | Description | Example |
---|---|---|---|---|
selector | string | Yes | The locator strategy to use for finding elements. | UiSelector |
args | string | Yes | The value of the locator to search for. | {"args": {"text": "Element Text"}} |
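Hand-written JSON selector strings are easy to get wrong. As a minimal sketch, the selector value can be built with Python's json module. The helper name ui_selector is hypothetical and not part of the driver, and it assumes the selector always has the {"args": {...}} shape used in the example above.

```python
import json


def ui_selector(**args):
    """Build a UiSelector value string of the form {"args": {...}}.

    Hypothetical helper: the wrapper shape is copied from the
    find_element example, not from a driver API.
    """
    return json.dumps({"args": args})


selector = ui_selector(text="Element Text")
print(selector)
# The string can then be passed as the locator value, for example:
# element = driver.find_element('UiSelector', selector)
```

Building the string from keyword arguments keeps quoting and escaping consistent when the text itself contains quotes.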
Common XPath expression components
An XPath expression includes the following components:
- Axis - Specifies the direction of the search, such as descendant, ancestor, and following.
- Node test - Specifies the node type to select, such as element, attribute, or text().
- Predicate - Filters the selected nodes that are based on various conditions, such as attribute values, position, or custom functions.
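The following example demonstrates the various components of an XPath expression. Here, // is the abbreviated descendant-or-self axis, Text is the node test, and the bracketed condition is the predicate:
//Text[@text='Click me' and @test-id='1234']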
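In XPath, several conditions inside one predicate are joined with the and operator. The sketch below assembles such an expression from a node name and a dictionary of attribute values. The function name xpath_for is illustrative only, and the helper doesn't escape values that contain a single quote.

```python
def xpath_for(node, attributes):
    """Return an XPath that matches `node` elements anywhere in the tree
    whose attributes all equal the given values.

    Illustrative helper; values containing a single quote are not handled.
    """
    predicate = " and ".join(
        f"@{name}='{value}'" for name, value in attributes.items()
    )
    return f"//{node}[{predicate}]" if predicate else f"//{node}"


print(xpath_for("Text", {"text": "Click me", "test-id": "1234"}))
```

A dictionary is used instead of keyword arguments because attribute names such as test-id aren't valid Python identifiers.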
find_elements
The find_elements command finds multiple UI elements, supports various locator strategies, and returns an array of elements, even if empty. Use this command for:
- Iterating through lists
- Verifying multiple controls
For optimal performance and reliability, use specific locator strategies and appropriate timeouts. In complex UI scenarios, use advanced techniques like hierarchical XPath.
elements = driver.find_elements('UiSelector', '{"args": {"text": "Element Text"}}')
Arguments
Name | Type | Required? | Description | Example |
---|---|---|---|---|
locator | string | Yes | The locator strategy to use for finding the elements. | UiSelector |
args | string | Yes | The value of the locator to search for. | {"args": {"text": "Element Text"}} |
Comparison of find_element and find_elements
Aspect | find_element | find_elements |
---|---|---|
Return value | Single WebElement object | Array of WebElement objects (even if empty) |
When no element found | Raises NoSuchElementException | Returns an empty array |
Use case | Single unique elements | Multiple elements |
Performance | Faster (stops at first match) | Slower (finds all matches) |
Flexibility | Single element operation | Multiple element operations |
Error handling | Requires exception handling | Handles missing elements |
Iteration | Doesn't support iteration | Supports iteration |
Verification | Confirms element presence | Confirms element count and state |
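Because find_elements returns an empty array instead of raising, it can back an optional lookup that avoids exception handling. The following is a minimal sketch; find_or_none is a hypothetical helper name, and driver is assumed to be an already created session.

```python
def find_or_none(driver, strategy, value):
    """Return the first matching element, or None when nothing matches.

    Relies on find_elements returning an empty list rather than raising.
    """
    matches = driver.find_elements(strategy, value)
    return matches[0] if matches else None


# Example usage with an existing session:
# banner = find_or_none(driver, "UiSelector", '{"args": {"text": "Banner"}}')
# if banner is not None:
#     banner.click()
```

The trade-off from the table still applies: this lookup collects all matches before returning the first one, so it can be slower than find_element.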
To locate a child element within the context of a parent element, first find the enclosing parent element, and then call find_element on that parent.
window = driver.find_element("xpath", '//window[@id="{window_id}"]')
button_child_xpath = window.find_element("xpath", '//child[@test_id="button-insert"]')
button_child_uiselector = window.find_element("UiSelector", '{"args": {"test_id": "button-insert"}}')
click
The click command simulates clicking on a specific element at its center point. While various locator strategies are available, clicking on elements by their ID provides a reliable and maintainable approach.
button = appium_driver.find_element("UiSelector", '{"args":{ "text": "Button 1" }}')
button.click()
Arguments
The click command doesn't require arguments as it operates on a previously located element.
send_keys
The send_keys command simulates typing text into an element and expects a string argument.
# After launching the test app
editableEle = appium_driver.find_element("UiSelector", '{"args":{ "role": "edit" }}')
editableEle.send_keys("the quick brown fox")
Arguments
Name | Type | Required? | Description | Example |
---|---|---|---|---|
text | string | Yes | The text to type into the element. | "Hello World" |
send_keys doesn't open or close the keyboard automatically. To open the keyboard, click any editable item. To close it, use the dismiss or back button.
Best practices:
- Verify that the target element has focus and is ready to receive input.
- Check element focus before sending keys. Use commands like click() to select the element before sending keys.
- Add short delays between key sends as rapid or simultaneous input might cause issues on some devices or apps.
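The focus and delay recommendations above can be combined in one small helper. This is a sketch under assumptions: type_text is a hypothetical name, the 0.05 second default pause is an arbitrary starting point, and sending one character per call is only one way to space out input.

```python
import time


def type_text(element, text, delay=0.05):
    """Click the element to give it focus, then send the text one
    character at a time with a short pause after each character."""
    element.click()
    for char in text:
        element.send_keys(char)
        time.sleep(delay)


# Example usage with a previously located element:
# type_text(editableEle, "the quick brown fox")
```

Tune or remove the delay per device, because a pause after every character slows down long inputs.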
get_attribute
The get_attribute command retrieves the value of a specified attribute for a UI element.
button = appium_driver.find_element("UiSelector", '{"args":{ "text": "My Button" }}')
buttonText = button.get_attribute("UiObject:text")
assert buttonText == '["My Button"]'
Arguments
Name | Type | Required? | Description | Example |
---|---|---|---|---|
Attribute | string | Yes | The name of the attribute to retrieve. | UiObject:text |
List of attributes
Attribute | Expected response format |
---|---|
UiObject:text | "text" |
UiObject:scroll-offset | "x", "y" |
UiObject:scroll-directions | "up", "down", "left", "right" |
UiObject:current-page | "1" |
UiObject:page-count | "1" |
UiObject:focused | "true" |
UiObject:enabled | "true" |
UiObject:long-clickable | "true" |
UiObject:draggable | "true" |
UiObject:pinchable | "true" |
UiObject:editable | "true" |
UiObject:checkable | "true" |
UiObject:checked | "true" |
UiObject:focusable | "true" |
UiObject:pageable | "true" |
UiObject:scrollable | "true" |
UiObject:clickable | "true" |
UiObject:test-id | "test-id" |
UiObject:page-direction | "up", "down", "left", "right" |
UiObject:description | "description" |
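Many of the attributes listed above are boolean flags returned as text. The sketch below converts such a value to a Python bool. It assumes that the response is the text true or false, possibly wrapped in double quotes; the exact response format isn't confirmed here, and attribute_is_true is a hypothetical helper name.

```python
def attribute_is_true(element, name):
    """Return True when the named attribute reads as "true".

    Assumption: boolean attributes come back as the text true/false,
    possibly wrapped in double quotes. Missing attributes count as False.
    """
    value = element.get_attribute(name)
    if value is None:
        return False
    return str(value).strip().strip('"').lower() == "true"


# Example usage with a previously located element:
# if attribute_is_true(button, "UiObject:clickable"):
#     button.click()
```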
get_element_text
The get_element_text command retrieves the visible text content of a UI element. It returns the text that would be visible to the user, excluding any hidden text.
element = appium_driver.find_element("UiSelector", '{"args":{ "text": "My Button" }}')
text = element.text
print(f"Element text: {text}")
Arguments
The get_element_text command doesn't require arguments as it operates on a previously located element.
Best practices:
- Verify element existence before attempting to get its text.
- Handle potential empty text returns appropriately.
- Consider text formatting and special characters in assertions.
- Use with proper wait strategies when needed.
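To make text assertions less sensitive to empty results and stray whitespace, the text can be normalized before comparison. This is a minimal sketch; visible_text is a hypothetical helper, and collapsing whitespace is an assumption about what your assertions should tolerate.

```python
def visible_text(element):
    """Return the element's text with surrounding whitespace removed and
    internal runs of whitespace collapsed. Returns an empty string when
    the element has no text."""
    raw = element.text
    return " ".join(raw.split()) if raw else ""


# Example usage with a previously located element:
# assert visible_text(element) == "My Button"
```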
press_keycode
The press_keycode command simulates pressing a specific hardware key on the device. Use it with caution as it can impact app state and test flow. It accepts only numerical keycodes (no hexadecimal codes).
driver.press_keycode({KEYCODE})
Arguments
Name | Type | Required? | Description | Example |
---|---|---|---|---|
keycode | integer | Yes | The keycode of the hardware key to press. | 10 |
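Key tables often list codes in hexadecimal, while press_keycode accepts only numerical keycodes. As a sketch, a small converter can turn either form into a decimal integer before the call. The name to_decimal_keycode is hypothetical.

```python
def to_decimal_keycode(value):
    """Convert an int, or a string such as "0x42" or "66", to a decimal int.

    int(text, 0) infers the base from the prefix, so "0x42" is read as
    hexadecimal and "66" as decimal.
    """
    if isinstance(value, int):
        return value
    return int(str(value).strip(), 0)


print(to_decimal_keycode("0x42"))
# Example usage with an existing session:
# driver.press_keycode(to_decimal_keycode("0x42"))
```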
tap
The tap command simulates tapping a specific set of coordinates on the screen. This command offers precise control over touch interactions but requires caution.
buttonCoordinates = (200, 350)
appium_driver.tap([buttonCoordinates])
Arguments
Name | Type | Required? | Description | Example |
---|---|---|---|---|
x | integer | Yes | The x-coordinate of the tap location. | 10 |
y | integer | Yes | The y-coordinate of the tap location. | 10 |
Best practices:
While tap offers fine-grained control, you should prioritize element-based interactions for test stability and maintainability. Use it only when element-based methods are impractical.
is_enabled
The is_enabled command checks if a UI element is currently enabled or disabled in the app. It returns true for enabled (interactive) and false for disabled (non-interactive).
element = appium_driver.find_element("UiSelector", '{"args":{ "text": "Button 1" }}')
element.is_enabled()
An enabled element accepts user interaction in principle. However, being enabled doesn't guarantee that users can interact with the element: it might be enabled but not visible, or blocked by other interaction barriers. To validate interactivity, combine is_enabled with other commands like is_displayed.
Arguments
The is_enabled command doesn't take arguments as it operates on a previously located element.
Best practices:
- Use is_enabled as part of a comprehensive element state check.
- Don't rely on is_enabled alone to determine element interactivity.
- Implement wait strategies before checking element state.
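A combined state check can be sketched as a single function that requires an element to be both displayed and enabled. The name is_interactable is hypothetical, and other interaction barriers, such as an overlay covering the element, aren't detected by this check.

```python
def is_interactable(element):
    """Return True only if the element is both displayed and enabled."""
    return element.is_displayed() and element.is_enabled()


# Example usage with a previously located element:
# if is_interactable(element):
#     element.click()
```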
is_displayed
The is_displayed command checks if a UI element is visible on the screen. It returns true for visible and false for hidden.
element = driver.find_element('UiSelector', '{"args": {"test_id": "my_button"}}')
element.is_displayed()
An element is displayed when it is visible within the bounds of the screen.
Arguments
The is_displayed command doesn't take arguments as it operates on a previously located element.
Best practices:
- Use is_displayed as part of a comprehensive element state check.
implicitly_wait
The implicitly_wait command sets how long the driver waits when searching for unavailable elements. While this command enhances test stability, use it with caution. For complex scenarios, combine implicitly_wait with explicit waits to balance reliability and execution speed.
driver.implicitly_wait(5)
For detailed implementation, see Set Implicit Wait Time.
Arguments
Name | Type | Required? | Description | Example |
---|---|---|---|---|
seconds | integer | Yes | The amount of time to wait, in seconds. | 10 |
Best practices:
- Set implicitly_wait at the start of the test.
- Reset to a reasonable default (for example, 0 ms) when necessary.
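An explicit wait can be sketched as a polling loop around find_elements, which returns an empty array instead of raising. The helper name wait_for_elements and the default timeout and poll interval are assumptions. If an implicit wait is also set, each lookup inside the loop might block for that duration, so the total wait can exceed the timeout.

```python
import time


def wait_for_elements(driver, strategy, value, timeout=10.0, poll=0.5):
    """Poll find_elements until it returns a non-empty list or the
    timeout (in seconds) elapses.

    Returns the list from the last attempt, which is empty on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        matches = driver.find_elements(strategy, value)
        if matches or time.monotonic() >= deadline:
            return matches
        time.sleep(poll)


# Example usage with an existing session:
# items = wait_for_elements(driver, "xpath", "//Text", timeout=5)
# assert items, "No Text elements appeared within 5 seconds"
```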
get_window_rect
The get_window_rect command retrieves the size and position of the current app window relative to the overall screen size.
Example:
Python
res = driver.get_window_rect()
Java
Dimension windowSize = driver.manage().window().getSize(); // returns the window size (width, height)
Point windowOrigin = driver.manage().window().getPosition(); // returns the window origin coordinate (x, y)
Where:
- x - Left coordinate of the window
- y - Top coordinate of the window
- width - Width of the window
- height - Height of the window
Arguments
No arguments required.
Best practices:
Use get_window_rect with other commands to create robust test scripts that handle varying screen sizes and layouts.
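One way to handle varying screen sizes is to derive tap coordinates from the window rectangle instead of hard-coding them. The sketch below assumes that the Python get_window_rect call returns a dictionary with the x, y, width, and height keys described above. The helper names point_in_window and tap_window_center are hypothetical.

```python
def point_in_window(rect, fraction_x=0.5, fraction_y=0.5):
    """Return (x, y) at the given fractional position inside the window.

    (0.5, 0.5) is the center; (0, 0) is the top-left corner.
    """
    x = rect["x"] + int(rect["width"] * fraction_x)
    y = rect["y"] + int(rect["height"] * fraction_y)
    return (x, y)


def tap_window_center(driver):
    """Tap the center of the current app window."""
    rect = driver.get_window_rect()
    driver.tap([point_in_window(rect)])


print(point_in_window({"x": 0, "y": 0, "width": 1920, "height": 1080}))
```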
get_page_source
The get_page_source command retrieves the current page source of the app, which represents the underlying UI structure.
Python
page_source = driver.page_source
print(page_source)
Java
String pageSource = driver.getPageSource();
System.out.println(pageSource);
Arguments
The get_page_source command doesn't require any arguments.
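The page source can also be inspected offline, for example to count nodes before writing a selector. The sketch below assumes that the page source is well-formed XML, which isn't confirmed here. The node and attribute names in the sample string are made up for illustration, and count_matching_nodes is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def count_matching_nodes(page_source, attribute, value):
    """Count nodes in an XML page source whose attribute equals value.

    Assumption: the page source parses as XML.
    """
    root = ET.fromstring(page_source)
    return sum(1 for node in root.iter() if node.get(attribute) == value)


# Made-up sample; a real call would pass driver.page_source instead.
sample = (
    '<window>'
    '<Text clickable="true"/>'
    '<Text clickable="false"/>'
    '<Button clickable="true"/>'
    '</window>'
)
print(count_matching_nodes(sample, "clickable", "true"))
```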
get_screenshot
The get_screenshot command captures the current app state as a base64-encoded string. Consider user privacy when using this command. Don't capture digital rights management (DRM) content.
Python
screenshot_base64 = driver.get_screenshot_as_base64() # obtains the screenshot as a base64-encoded string
Java
File screenshotFile = ((TakesScreenshot)driver).getScreenshotAs(OutputType.FILE); // obtains the screenshot and stores it in a file
byte[] screenshotRaw = ((TakesScreenshot)driver).getScreenshotAs(OutputType.BYTES); // obtains the raw bytes of the screenshot
String screenshotBase64 = ((TakesScreenshot)driver).getScreenshotAs(OutputType.BASE64); // obtains the screenshot as a base64-encoded string
For a complete list of methods and examples, see Appium documentation.
Arguments
No arguments required for this command.
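A base64-encoded screenshot must be decoded before it can be saved or compared. The sketch below decodes the string and checks for the PNG file signature. It assumes the driver follows the W3C WebDriver convention of returning PNG data; decode_screenshot and is_png are hypothetical helper names.

```python
import base64

# The first eight bytes of every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_screenshot(screenshot_base64):
    """Decode a base64-encoded screenshot string into raw bytes."""
    return base64.b64decode(screenshot_base64)


def is_png(data):
    """Return True if the bytes start with the PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


# Example usage with an existing session:
# data = decode_screenshot(driver.get_screenshot_as_base64())
# if is_png(data):
#     with open("screen.png", "wb") as image_file:
#         image_file.write(data)
```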
install_app
The install_app command installs an app on the device. The VPKG file path must be available on the file system of the host that runs the Appium server. Use this command at the beginning of the test to automate environment setup.
In Java, use executeScript with the mobile: installApp command.
Example:
Python:
driver.install_app('/path/to/app.vpkg')
Java:
driver.executeScript("mobile: installApp", Map.of("appPath", "/path/to/app.vpkg"));
Arguments
Name | Type | Required? | Description | Example |
---|---|---|---|---|
app_path | string | Yes | The path to the app's VPKG file on the Appium server. | /path/to/application.vpkg |
activate_app
The activate_app command launches or brings to the foreground a specified application on the device. This command is useful for testing app switching scenarios or ensuring a specific app is active before performing further test actions.
Example:
Python:
driver.activate_app('com.example.myapp.main')
Java:
boolean isInstalled = (boolean) driver.executeScript("mobile: activateApp",
Map.of("appId", "com.example.myapp.main"));
Arguments
Name | Type | Required? | Description | Example |
---|---|---|---|---|
app_id | string | Yes | The package name or bundle identifier of the app to activate. | com.example.myapp.main |
Best practices
To get the correct appId for launching the app, run vpm list applications.
terminate_app
The terminate_app command stops a running app. Use this command when testing app behavior during forced closures or to ensure a clean state between test cases.
Example:
driver.terminate_app('com.example.myapp')
Arguments
Name | Type | Required? | Description | Example |
---|---|---|---|---|
app_id | string | Yes | The package identifier of the app to stop. | com.example.myapp |
remove_app
The remove_app command uninstalls an app from the device.
Example:
driver.remove_app('com.example.myapp')
Arguments
Name | Type | Required? | Description | Example |
---|---|---|---|---|
app_id | string | Yes | The package identifier of the app to uninstall. | com.example.myapp |
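The four app management commands can be combined into a setup and teardown pattern so that each test starts from a clean state. The following is a sketch that uses a context manager; installed_app is a hypothetical name, and the path and app ID in the usage comment are placeholders.

```python
from contextlib import contextmanager


@contextmanager
def installed_app(driver, vpkg_path, app_id):
    """Install and launch an app, then always terminate and remove it,
    even if the test body raises."""
    driver.install_app(vpkg_path)
    try:
        driver.activate_app(app_id)
        yield driver
    finally:
        driver.terminate_app(app_id)
        driver.remove_app(app_id)


# Example usage with an existing session:
# with installed_app(driver, "/path/to/app.vpkg", "com.example.myapp"):
#     driver.find_element("UiSelector", '{"args": {"text": "Button 1"}}').click()
```

The try/finally block ensures that cleanup runs when an assertion fails inside the with block.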
execute_script
The execute_script command runs shell commands or invokes jsonrpc APIs.
To execute shell commands on the device:
appium_driver.execute_script("shell", "echo hello world")
To invoke jsonrpc APIs not exposed in the Appium public interface:
appium_driver.execute_script("jsonrpc: getScreenContext", "")
push_file
The push_file command allows you to transfer files directly to a device. The file content must be provided as a base64-encoded string. This is useful for uploading test data, configuration files, or any other files needed during test execution.
Python:
import base64
# Read the local file and encode its bytes as a base64 string
with open('test_image.png', 'rb') as image_file:
    image_data = base64.b64encode(image_file.read()).decode('utf-8')
driver.push_file('/data/local/tmp/test_image.png', image_data)
Java:
File fileToUpload = new File("/path/to/your/file.txt");
String base64Content;
try {
// Convert file content to Base64
byte[] fileContent = FileUtils.readFileToByteArray(fileToUpload);
base64Content = Base64.getEncoder().encodeToString(fileContent);
// Execute the pushFile command
driver.executeScript("mobile: pushFile", Map.of(
"remotePath", "/tmp/uploaded_file.txt",
"payload", base64Content
));
System.out.println("File uploaded successfully");
} catch (IOException e) {
System.err.println("Error uploading file: " + e.getMessage());
e.printStackTrace();
}
Arguments
Name | Type | Required? | Description | Example |
---|---|---|---|---|
remote_path | string | Yes | The path on the device where you save the file. | /data/local/tmp/test_file.txt |
data | string | Yes | The file's Base64-encoded content. |
pull_file
The pull_file command allows you to transfer files directly from a device. The file content is returned as a base64-encoded string. This is useful for downloading test results, configuration files, or any other files needed during and after test execution.
Python:
base64_string = driver.pull_file("/tmp/testFile.txt")
Java:
// Requires: import org.openqa.selenium.WebDriverException;
try {
// Pull the file from the device as a base64-encoded string
Object fileBase64 = driver.executeScript("mobile: pullFile", Map.of("remotePath", "/tmp/testFile.txt"));
System.out.println("File downloaded successfully");
} catch (WebDriverException e) {
System.err.println("Error downloading file: " + e.getMessage());
e.printStackTrace();
}
Arguments
Name | Type | Required? | Description | Example |
---|---|---|---|---|
remote_path | string | Yes | The path on the device where you pull the file from. | /data/local/tmp/test_file.txt |
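Because pull_file returns base64 text, the content must be decoded before you can read it. The following is a minimal sketch for text files; decode_pulled_text is a hypothetical helper, and UTF-8 is an assumed default encoding.

```python
import base64


def decode_pulled_text(base64_string, encoding="utf-8"):
    """Decode the base64 string returned by pull_file into text."""
    return base64.b64decode(base64_string).decode(encoding)


print(decode_pulled_text("aGVsbG8="))  # prints: hello
# Example usage with an existing session:
# log_text = decode_pulled_text(driver.pull_file("/tmp/testFile.txt"))
```

For binary files, skip the text decoding step and write the bytes from base64.b64decode directly to a local file.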
D-pad navigation
The execute_script function sends Linux input event codes through the jsonrpc: injectInputKeyEvent method. This functionality lets your app simulate user input events programmatically for automation and testing.
The function runs specified commands and key events within a script, enabling D-pad navigation in apps designed for directional pad input.
The table lists the input event codes.
Key code | Value |
---|---|
KEY_LEFT | 105 |
KEY_RIGHT | 106 |
KEY_UP | 103 |
KEY_DOWN | 108 |
KEY_PENTER (select) | 96 |
KEY_BACK | 158 |
KEY_HOMEPAGE | 170 |
For a comprehensive list of defined keycodes, see input event codes. The input event codes reference helps create accessible and usable keyboard experiences. Support for specific keycodes varies by device type and input configuration. Test keyboard input handling for non-standard keyboard layouts or behaviors.
Use the following command for injectInputKeyEvent:
driver.execute_script("jsonrpc: injectInputKeyEvent", [{"inputKeyEvent": "{EVENT}", "holdDuration": 1000}])
For example, use the following command to navigate down.
driver.execute_script("jsonrpc: injectInputKeyEvent",[{"inputKeyEvent": "108" , "holdDuration": 1000}])
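To avoid scattering numeric codes through a test, the key table above can be wrapped in a lookup. The sketch below builds the same argument shape as the command shown above and sends a sequence of keys. The names INPUT_EVENT_CODES, key_event_args, and press_keys are hypothetical, and the default holdDuration of 1000 is copied from the example.

```python
# Input event codes copied from the table above.
INPUT_EVENT_CODES = {
    "KEY_UP": 103,
    "KEY_LEFT": 105,
    "KEY_RIGHT": 106,
    "KEY_DOWN": 108,
    "KEY_PENTER": 96,
    "KEY_BACK": 158,
    "KEY_HOMEPAGE": 170,
}


def key_event_args(key_name, hold_duration=1000):
    """Build the argument list for jsonrpc: injectInputKeyEvent."""
    code = str(INPUT_EVENT_CODES[key_name])
    return [{"inputKeyEvent": code, "holdDuration": hold_duration}]


def press_keys(driver, key_names, hold_duration=1000):
    """Send each named key in order through execute_script."""
    for name in key_names:
        driver.execute_script(
            "jsonrpc: injectInputKeyEvent",
            key_event_args(name, hold_duration),
        )


# Example usage with an existing session: move down twice, then select.
# press_keys(driver, ["KEY_DOWN", "KEY_DOWN", "KEY_PENTER"])
```

An unknown key name raises KeyError, which surfaces typos before any event reaches the device.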
Last updated: Sep 30, 2025