What are the different types of modes in prompting?
Autonomous
Autonomous mode lets you describe an entire workflow in a single prompt. The system understands the app structure and independently builds and executes the full flow step by step. It reasons using multiple signals including XML, visible text, images, and screen coordinates to decide the best action at each step.
This mode is best suited for complex or non-linear tasks that require deeper understanding and decision-making, such as multi-step user journeys or flows with dynamic UI changes.
Example
"Open the shopping app, search for running shoes under Rs 3,000, apply size and brand filters, select the second product, add it to cart, log in with test credentials, and proceed till the payment screen."
Guided
Guided mode executes one instruction at a time, relying primarily on the underlying XML structure of the app. It is optimized for speed and predictability, and can be up to five times faster than Autonomous mode for straightforward tasks. When the system does not have enough information to safely execute an instruction, it automatically falls back to Autonomous mode to reason through the scenario and continue execution without breaking the flow.
Example
"Tap on Search"
"Enter 'running shoes' in the search bar"
"Tap on the first result"
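The fallback behaviour described for Guided mode can be pictured with a small sketch. This is not the product's implementation: the XML layout, the attribute names (`text`, `clickable`, `bounds`), and the function `resolve_tap` are all assumptions made up for illustration. The idea shown is only that an instruction is handled in Guided mode when the XML identifies exactly one target, and is handed to Autonomous mode otherwise.

```python
import xml.etree.ElementTree as ET

# Hypothetical screen dump. The node and attribute names are illustrative
# assumptions, not the actual schema used by the product.
SCREEN_XML = """
<hierarchy>
  <node text="Search" clickable="true" bounds="[0,100][200,180]"/>
  <node text="Cart" clickable="true" bounds="[220,100][420,180]"/>
  <node text="Offers" clickable="true" bounds="[0,200][200,280]"/>
  <node text="Offers" clickable="true" bounds="[0,300][200,380]"/>
</hierarchy>
"""


def resolve_tap(label, screen_xml):
    """Decide how a 'Tap on <label>' instruction would be handled (sketch)."""
    root = ET.fromstring(screen_xml.strip())
    # Collect clickable nodes whose visible text matches the label exactly.
    matches = [
        node for node in root.iter("node")
        if node.get("text") == label and node.get("clickable") == "true"
    ]
    if len(matches) == 1:
        # Unambiguous target: the XML alone is enough, so stay in Guided mode.
        return {"mode": "guided", "action": "tap", "bounds": matches[0].get("bounds")}
    # Zero or several candidates: not enough information to act safely,
    # so hand the step over to Autonomous mode.
    return {"mode": "autonomous", "reason": f"{len(matches)} XML matches for {label!r}"}


if __name__ == "__main__":
    for label in ("Search", "Offers", "Login"):
        print(label, resolve_tap(label, SCREEN_XML))
```

In this sketch, "Search" matches a single node and stays in Guided mode, while "Offers" (two matches) and "Login" (no match) are both routed to Autonomous mode.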