AI Task nodes can be equipped with a wide range of capabilities. Each capability unlocks additional tools the agent can use during execution. We suggest starting with the default capabilities and enabling additional ones only when strictly needed.
API Tool Identifiers: When configuring AI Task node tools via the API or MCP server, use the snake_case identifiers listed in parentheses after each tool description. You can also call the listAvailableTools MCP tool or the GET /agents/available-tools API endpoint to discover all available tools programmatically.
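
As a rough illustration of programmatic discovery, the sketch below builds a request for the GET /agents/available-tools endpoint named above. Only the endpoint path comes from this page. The base URL, the API key placeholder, the Bearer-style Authorization header, and the helper names are assumptions made for the example, and the shape of the JSON response is not specified here, so the code returns it undecoded into any particular structure.

```python
import json
import urllib.request

# Hypothetical placeholders: the real base URL and auth scheme are not stated on this page.
BASE_URL = "https://api.example.com"
API_KEY = "YOUR_API_KEY"


def build_available_tools_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the tool-discovery endpoint."""
    url = base_url.rstrip("/") + "/agents/available-tools"
    # Assumption: bearer-token auth. Check the API reference for the actual header.
    headers = {"Authorization": f"Bearer {api_key}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_available_tools(base_url: str, api_key: str):
    """Send the request and return the parsed JSON body, whatever its shape."""
    request = build_available_tools_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_available_tools(BASE_URL, API_KEY)` with real values would perform the network call. Keeping request construction separate from sending makes the URL and headers easy to inspect before anything goes over the wire.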

Web Browsing Essentials

Core tools for basic navigation and DOM interaction.

Go to URL

Navigate the browser to any URL (go_to_url)

Refresh Page

Reload the current webpage (refresh_page)

Basic Web Interaction

Click, type, select elements, and interact via the DOM (dom_browser_interaction)
Critical: Basic Web Interaction is Required

Basic Web Interaction is enabled by default and should always remain enabled for AI Task nodes. This capability provides essential tools for clicking, typing, selecting elements, and interacting with the DOM.
  • Disabling Basic Web Interaction will prevent AI Task nodes from performing basic browser interactions
  • You should not disable this capability unless you have a very specific reason
  • This capability is part of the default “Web Browsing Essentials” set and is required for AI Task nodes to function properly
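
One way to guard against accidentally dropping this capability is to normalize a tool selection before sending it to the API. The following is a minimal sketch: the identifiers come from this page, while the helper name `with_required_tools` and the idea of passing tools as a plain list of strings are assumptions for illustration, not the documented request format.

```python
# Identifier taken from this page; it should stay enabled for AI Task nodes.
REQUIRED_TOOL = "dom_browser_interaction"


def with_required_tools(selected):
    """Return the selection with duplicates removed (order preserved)
    and dom_browser_interaction guaranteed to be present."""
    result = list(dict.fromkeys(selected))  # de-duplicate while keeping first occurrences
    if REQUIRED_TOOL not in result:
        result.append(REQUIRED_TOOL)
    return result


tools = with_required_tools(["go_to_url", "get_text", "go_to_url"])
print(tools)
```

Appending the required identifier rather than raising an error is a design choice for this sketch; a stricter setup might reject the configuration instead so the omission is noticed.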

Advanced Web Browsing Toolkit

Tools for extraction, evaluation, and page manipulation.

Extract HTML

Download or inspect the full HTML of the page (extract_html)

Get Text

Extract visible text as clean markdown (get_text)

Evaluate JavaScript

Run custom JS in the page context (evaluate_javascript)

Zoom Out

Reduce zoom level to view more content (zoom_out)

Zoom In

Increase zoom level for readability (zoom_in)

Take Screenshot

Take a screenshot of the current page (take_screenshot)

Solve Captcha

Trigger automatic captcha solving (solve_captcha)

Save PDF

Save the current page as a PDF file (save_pdf)

Computer Vision

Tools for image-based interaction when DOM access is insufficient.

Computer Vision

Click, locate, and interact using visual recognition (computer_use)

Communication

Send messages or work with email during execution.

Send User Message

Ask the user questions or request clarification (send_user_message)

Send Mail

Send emails with custom subject and body (send_mail)

Get Mail

Retrieve inbound emails from the agent’s inbox (get_mail)

Send API Request

Make HTTP API requests to external services (send_api_request)

File System

Work with files locally or inside the browser session.

List Files

View all files available in the execution context (list_files)

Read Files

Read text, images, PDFs, or downloaded files (read_files)

Upload File

Upload files to file input elements on a webpage (upload_file)

Memory & Storage

Store and retrieve execution-scoped data.

Write Scratchpad

Save notes or structured data to memory (write_scratchpad)

Read Scratchpad

Retrieve previously stored information (read_scratchpad)

Read Clipboard

Access the current clipboard contents (read_clipboard)

Google Sheets

Read and write spreadsheet data.

Sheets: Get Data

Retrieve values from cell ranges like A1:B10 (google_sheets_get_data)

Sheets: Set Value

Update values in specific cells (google_sheets_set_value)

Authentication

Generate tokens and handle one-time passwords.

Generate TOTP Code

Produce 6-digit MFA codes using stored credentials (generate_totp_secret)

Context & Utilities

Access deeper execution context or system utilities.

Query Context

Ask questions about past actions and stored information (query_context)

Get Datetime

Fetch the current datetime in any timezone (get_datetime)
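
For reference, the identifiers listed on this page can be collected into a single mapping keyed by capability group. This is only a convenience sketch: the group names and identifiers are copied from the sections above, but the mapping itself and the helper functions `all_tools` and `tools_for` are hypothetical and not part of the API. The authoritative list is whatever listAvailableTools or GET /agents/available-tools returns.

```python
import re

# Capability groups and tool identifiers as listed on this page.
TOOLS_BY_CAPABILITY = {
    "Web Browsing Essentials": ["go_to_url", "refresh_page", "dom_browser_interaction"],
    "Advanced Web Browsing Toolkit": [
        "extract_html", "get_text", "evaluate_javascript", "zoom_out",
        "zoom_in", "take_screenshot", "solve_captcha", "save_pdf",
    ],
    "Computer Vision": ["computer_use"],
    "Communication": ["send_user_message", "send_mail", "get_mail", "send_api_request"],
    "File System": ["list_files", "read_files", "upload_file"],
    "Memory & Storage": ["write_scratchpad", "read_scratchpad", "read_clipboard"],
    "Google Sheets": ["google_sheets_get_data", "google_sheets_set_value"],
    "Authentication": ["generate_totp_secret"],
    "Context & Utilities": ["query_context", "get_datetime"],
}

SNAKE_CASE = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*")


def all_tools():
    """Flatten the mapping into one list, in page order."""
    return [tool for tools in TOOLS_BY_CAPABILITY.values() for tool in tools]


def tools_for(capabilities):
    """Return the identifiers for the named capability groups, in the order given."""
    selected = []
    for name in capabilities:
        if name not in TOOLS_BY_CAPABILITY:
            raise ValueError(f"Unknown capability group: {name!r}")
        selected.extend(TOOLS_BY_CAPABILITY[name])
    return selected


# Sanity check: every identifier follows the snake_case convention.
assert all(SNAKE_CASE.fullmatch(tool) for tool in all_tools())
print(tools_for(["Web Browsing Essentials", "Google Sheets"]))
```

A hard-coded mapping like this can drift from the platform over time, so it is best treated as a readable starting point rather than a source of truth.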