UI Automation Activities
UiPath.Semantic.Activities.NUITask
ScreenPlay is UiPath®’s next-generation automation agent, designed to bring agentic behavior and cognitive capabilities to the desktop. ScreenPlay interacts with applications much like a human would—navigating interfaces, adapting to change, and handling complex tasks that were previously infeasible with traditional automation methods.
To learn more, refer to ScreenPlay.
This activity must be added inside a Use Application/Browser activity.
- Task - Prompt describing the UI task to be performed, with the ability to:
- Use Variables
- Add image from screen (inline with the text)
- View last execution trace. For more details, see the Running and inspecting the execution results page.
- Model - Indicates the underlying LLM used by ScreenPlay for task execution planning and reasoning. The following options are available:
  - UiPath (with Gemini 2.5 Flash) - Basic model
    - Works best on browsers
    - Uses a proprietary implementation based on the page's DOM, using Gemini Flash for reasoning and image understanding
    - Moderately fast
  - Standard model - for complex tasks (GPT-4.1)
    - Works best on browsers
    - Uses a proprietary implementation based on the page's DOM and image understanding, using GPT-4.1 for reasoning
    - Not very fast
  - Basic model - faster, cheaper (GPT-4.1 mini)
    - Works best on browsers
    - Uses a proprietary implementation based on the page's DOM and image understanding, using GPT-4.1 mini for reasoning
    - Moderately fast
  - Standard model - for complex tasks (GPT-5)
    - Works best on browsers
    - Uses a proprietary implementation based on the page's DOM and image understanding, using GPT-5 for reasoning
    - Slow
  - Basic model - faster, cheaper (GPT-5 mini)
    - Works best on browsers
    - Uses a proprietary implementation based on the page's DOM and image understanding, using GPT-5 mini for reasoning
    - Moderately fast
  - Standard model - for complex tasks (OpenAI Operator)
    - Works on any type of application, including image-based interfaces
    - Uses OpenAI Operator, an image-based reasoning model. Likely the best of the bunch
    - Slow
  - Standard model - for complex tasks (Anthropic Computer Use)
    - Works on any type of application, including image-based interfaces
    - Uses Anthropic Computer Use, an image-based reasoning model
    - Slow
- Rate this activity - Good or Poor
Additional options
Options
- Max number of steps - The maximum number of steps that ScreenPlay can take to achieve its goal. Use this property as a basic guardrail to prevent infinite agentic loops.
- Type by clipboard - Indicates whether the clipboard is used to type the given text. The following options are available:
  - Never - Never use the clipboard
  - Always - Always use the clipboard
  - Whenever possible - Use the clipboard when possible. This depends on the OS and the text to be typed (for example, if a special key is used, the clipboard is not used)
- Use DOM when available - Indicates whether DOM data is sent to the LLM model for applications where the DOM can be extracted. The DOM can be used only by UiPath LAM implementations. Disable this option if DOM-based targeting leads to incorrect element coordinates. The default value is True.
- Disable variable security - Indicates whether variable security should be disabled. Variable security ensures that prompt instructions cannot be passed via variable values. This feature is designed to prevent prompt injection attacks. Because it is LLM-based, false positives can occur, in which case you can disable it for each ScreenPlay activity individually. To determine whether a result is a false positive, inspect the execution trace, system prompt, reasoning, and actions. Enable this option only if you need to pass prompt instructions via variable values or if a false positive occurred. The default value is False.
- Input mode - Select which method should be used to generate keyboard and mouse input:
  - Same as App/Browser - Uses the Input mode setting of the parent Use Application/Browser activity.
  - Hardware Events - Acts as a real user by using hardware inputs such as mouse movements or keystrokes to interact with applications. These are hardware-triggered events sent directly to the operating system. While this method offers 100% behavioral emulation, some events may occasionally be lost. As a developer, it is your responsibility to ensure that all events reliably reach the target application.
  - Chromium API - Performs actions using debugger APIs. Works only for Chromium elements. Sends all text in one go. Works even if the target application is not in focus. For more details, refer to Chromium API.
  - Simulate - Simulates input using accessibility APIs. Recommended for browsers, Java-based applications, and SAP. Usually more reliable than Hardware Events. Sends all text in a single action. Works even if the target application is not in focus. Test whether your target UI element supports this method.
  - Window messages - Simulates input using Win32 messages. Recommended for desktop applications. Usually more reliable than Hardware Events. Sends all text in one go. Works even if the target application is not in focus. Test whether your target UI element supports this method.
- Continue on error - Specifies whether the automation should continue even when the activity throws an error. This field supports only Boolean values (True, False). The default value is False. As a result, if the field is blank and an error is thrown, the execution of the project stops. If the value is set to True, the execution of the project continues regardless of any error.
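The Max number of steps option caps the plan-act loop described above. As a rough sketch of the idea (plain Python, with `plan_next_action` and `execute` as hypothetical stand-ins for ScreenPlay's internal planner and UI executor, not its actual implementation):

```python
def run_task(plan_next_action, execute, max_steps):
    """Cap an agentic loop: keep planning and executing UI actions until the
    planner signals completion (returns None) or the step budget runs out.
    Both callables are illustrative stand-ins, not real ScreenPlay APIs."""
    for _ in range(max_steps):
        action = plan_next_action()  # the LLM decides the next UI action
        if action is None:           # goal reached
            return "completed"
        execute(action)
    # The guardrail: stop rather than loop forever
    raise RuntimeError("Max number of steps reached before the task completed")
```

Without such a cap, a planner that never reports completion would loop indefinitely, which is exactly the failure mode this property guards against.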
Timings
- Delay before - Delay (in seconds) between the time the previous activity is completed and the time this activity begins performing any operations. The default value is 0.2 seconds. Adding a delay between activities ensures that one activity has enough time to complete before the next activity begins.
- Delay after - Delay (in seconds) between the time this activity is completed and the time the next activity begins any operations. The default value is 0.3 seconds. Adding a delay between activities ensures that one activity has enough time to complete before the next activity begins.
- Timeout - Specifies the amount of time (in seconds) to wait for the activity to be executed before throwing an error. The default value is 30 seconds.
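Taken together, the timing properties behave roughly like the following sequence. This is a minimal Python sketch of the ordering (delay, run with timeout, delay), not the actual runtime:

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def run_with_timings(activity, delay_before=0.2, delay_after=0.3, timeout=30.0):
    """Sketch of how Delay before, Timeout, and Delay after wrap one activity.
    Defaults mirror the documented property defaults."""
    time.sleep(delay_before)  # Delay before: wait after the previous activity completes
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        # Timeout: raise an error if the activity runs longer than `timeout` seconds
        result = pool.submit(activity).result(timeout=timeout)
    except FutureTimeout:
        raise RuntimeError("Activity timed out")
    finally:
        pool.shutdown(wait=False)
    time.sleep(delay_after)  # Delay after: wait before the next activity starts
    return result
```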
Output
- Result - The result of the task, if any. Currently, only String output is supported.