📚 API Reference

Complete reference documentation for Browser AI Agent. This guide covers all classes, methods, and configuration options.

Installation

Browser AI Agent can be installed using the provided setup script or manually:

# Automated installation
python setup.py

# Manual installation
pip install -r requirements.txt
playwright install

Configuration

Configure Browser AI Agent using environment variables in a .env file:

Variable           Type     Default   Description
DEEPSEEK_API_KEY   string   required  Your DeepSeek API key
BROWSER_HEADLESS   boolean  true      Run browser in headless mode
BROWSER_TYPE       string   chromium  Browser type: chromium, firefox, webkit
AGENT_DEBUG_MODE   boolean  false     Enable detailed logging
AGENT_MAX_RETRIES  integer  3         Maximum retry attempts

🤖 BrowserAIAgent

Main class for browser automation with AI capabilities

class BrowserAIAgent:
    def __init__(
        self,
        headless: bool = True,
        browser_type: str = "chromium",
        debug: bool = False,
        max_retries: int = 3
    )

Constructor Parameters

Parameter     Type  Default     Description
headless      bool  True        Run browser without GUI
browser_type  str   "chromium"  Browser engine to use
debug         bool  False       Enable debug logging
max_retries   int   3           Maximum retry attempts
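
The environment variables in the Configuration table appear to mirror these constructor parameters. This reference does not say whether the library reads them automatically, so the sketch below assumes it does not and builds the keyword arguments by hand. `agent_kwargs_from_env` and the accepted boolean spellings are hypothetical, and `ValueError` stands in for the library's `ConfigurationError` so the block runs on its own.

```python
import os
from typing import Mapping, Optional

_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def _parse_bool(raw: str, name: str) -> bool:
    """Parse a boolean environment value; reject anything unrecognised."""
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"{name} must be a boolean, got {raw!r}")


def agent_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build BrowserAIAgent keyword arguments from the documented variables.

    Hypothetical helper: the variable-to-parameter mapping is an assumption.
    """
    env = os.environ if env is None else env
    if not env.get("DEEPSEEK_API_KEY"):
        raise ValueError("DEEPSEEK_API_KEY is required")
    browser_type = env.get("BROWSER_TYPE", "chromium")
    if browser_type not in {"chromium", "firefox", "webkit"}:
        raise ValueError(f"unsupported BROWSER_TYPE: {browser_type!r}")
    return {
        "headless": _parse_bool(env.get("BROWSER_HEADLESS", "true"), "BROWSER_HEADLESS"),
        "browser_type": browser_type,
        "debug": _parse_bool(env.get("AGENT_DEBUG_MODE", "false"), "AGENT_DEBUG_MODE"),
        "max_retries": int(env.get("AGENT_MAX_RETRIES", "3")),
    }
```

With the package installed, the result could be passed straight through as `BrowserAIAgent(**agent_kwargs_from_env())`.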

execute_task()

async def execute_task(
    self,
    task_description: str,
    timeout: int = 120,
    screenshot_on_completion: bool = True
) -> TaskResult

Execute a browser automation task using AI reasoning.

Parameters

Parameter                 Type  Description
task_description          str   Natural language description of the task
timeout                   int   Maximum execution time in seconds
screenshot_on_completion  bool  Take screenshot when task completes

Returns

TaskResult - Object containing execution results and metadata

Example:
async with BrowserAIAgent() as agent:
    result = await agent.execute_task(
        "Navigate to Google and search for 'Python programming'"
    )
    print(f"Success: {result.success}")
    print(f"Steps: {result.steps_executed}")

navigate_to()

async def navigate_to(
    self,
    url: str,
    wait_for_load: bool = True
) -> bool

Navigate to a specific URL.

Parameters

Parameter      Type  Description
url            str   Target URL to navigate to
wait_for_load  bool  Wait for page to fully load

execute_multiple_tasks()

async def execute_multiple_tasks(
    self,
    tasks: List[str],
    parallel: bool = False
) -> List[TaskResult]

Execute multiple tasks in sequence or parallel.

Parameters

Parameter  Type       Description
tasks      List[str]  List of task descriptions
parallel   bool       Execute tasks in parallel
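
How the library schedules tasks internally is not documented here. As an illustration of the difference the parallel flag implies, the sketch below runs a list of task descriptions either one at a time or concurrently with asyncio.gather. `run_tasks` and `fake_task` are hypothetical stand-ins, not part of the Browser AI Agent API.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_tasks(
    tasks: List[str],
    run_one: Callable[[str], Awaitable[str]],
    parallel: bool = False,
) -> List[str]:
    """Run tasks one after another, or concurrently when parallel=True."""
    if parallel:
        # gather() schedules every coroutine at once and returns the
        # results in the same order as the inputs.
        return list(await asyncio.gather(*(run_one(t) for t in tasks)))
    results = []
    for task in tasks:
        # Each task finishes before the next one starts.
        results.append(await run_one(task))
    return results


async def fake_task(description: str) -> str:
    """Stand-in for real task execution: sleeps briefly, then echoes."""
    await asyncio.sleep(0.01)
    return f"done: {description}"
```

Both modes return results in input order. Concurrent execution only helps when the tasks do not depend on each other's page state.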

📊 TaskResult

Result object returned by task execution methods

@dataclass
class TaskResult:
    success: bool
    steps_executed: int
    execution_time: float
    final_screenshot: Optional[str]
    extracted_data: Optional[Dict]
    error_message: Optional[str]
    ai_reasoning: List[str]

Attributes

Attribute         Type            Description
success           bool            Whether the task completed successfully
steps_executed    int             Number of automation steps performed
execution_time    float           Total execution time in seconds
final_screenshot  Optional[str]   Path to final screenshot
extracted_data    Optional[Dict]  Data extracted during task execution
error_message     Optional[str]   Error message if task failed
ai_reasoning      List[str]       AI reasoning steps and decisions
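
Because several attributes are Optional, callers should check for None before using them. The sketch below shows one way to reduce a result to a single log line. It redefines TaskResult locally with the documented fields so it runs standalone. The default values and the `summarize` helper are assumptions added for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TaskResult:
    """Local mirror of the documented fields (defaults are an assumption)."""
    success: bool
    steps_executed: int
    execution_time: float
    final_screenshot: Optional[str] = None
    extracted_data: Optional[Dict] = None
    error_message: Optional[str] = None
    ai_reasoning: List[str] = field(default_factory=list)


def summarize(result: TaskResult) -> str:
    """Hypothetical helper: one-line summary of a TaskResult."""
    if result.success:
        return (f"OK: {result.steps_executed} steps "
                f"in {result.execution_time:.1f}s")
    # error_message is Optional, so fall back to a generic label.
    return f"FAILED: {result.error_message or 'unknown error'}"
```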

🚀 execute_browser_task()

Convenience function for simple task execution

async def execute_browser_task(
    task_description: str,
    headless: bool = True,
    browser_type: str = "chromium",
    debug: bool = False,
    timeout: int = 120
) -> TaskResult

Execute a single browser task without managing agent lifecycle.

Parameters

Parameter         Type  Default     Description
task_description  str   required    Natural language task description
headless          bool  True        Run browser in headless mode
browser_type      str   "chromium"  Browser engine to use
debug             bool  False       Enable debug logging
timeout           int   120         Maximum execution time in seconds

Example:
import asyncio
from browser_ai_agent import execute_browser_task

async def main():
    result = await execute_browser_task(
        "Go to GitHub and search for 'browser automation'",
        headless=False,
        debug=True
    )
    
    if result.success:
        print("Task completed successfully!")
    else:
        print(f"Task failed: {result.error_message}")

asyncio.run(main())
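
"Without managing agent lifecycle" presumably means the function opens an agent, runs the task, and closes the agent for you. The source does not show how this is implemented, so the sketch below only illustrates that general pattern with a stand-in `FakeAgent` class. `run_once` is hypothetical, and using asyncio.wait_for to enforce the timeout is an assumption.

```python
import asyncio


class FakeAgent:
    """Stand-in agent that records when it is opened, used, and closed."""

    def __init__(self) -> None:
        self.events = []

    async def __aenter__(self) -> "FakeAgent":
        self.events.append("open")
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs even if the task raised, so cleanup is never skipped.
        self.events.append("close")

    async def execute_task(self, description: str) -> str:
        self.events.append(f"run:{description}")
        return "ok"


async def run_once(agent: FakeAgent, description: str, timeout: float = 120) -> str:
    """Open the agent, run one task under a time limit, then close it."""
    async with agent:
        return await asyncio.wait_for(agent.execute_task(description), timeout)
```

Calling `asyncio.run(run_once(FakeAgent(), "demo"))` opens the agent, runs the task, and closes the agent in that order.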

⚠️ Error Types

Exception classes for error handling

BrowserAIError

Base exception class for all Browser AI Agent errors.

ConfigurationError

Raised when configuration is invalid or missing.

TaskExecutionError

Raised when task execution fails.

BrowserError

Raised when browser operations fail.

AIReasoningError

Raised when AI reasoning fails or returns invalid results.

Error Handling Example:
from browser_ai_agent import BrowserAIAgent, TaskExecutionError

async def safe_execution():
    try:
        async with BrowserAIAgent() as agent:
            result = await agent.execute_task("Complex task")
    except TaskExecutionError as e:
        print(f"Task failed: {e}")
    except Exception as e:
        print(f"Unexpected error: {e}")
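
Since BrowserAIError is documented as the base class for all library errors, catching it should also catch the more specific exceptions. The sketch below demonstrates that ordering with local stand-in classes, so it runs without the package installed. The direct subclass relationships and the `classify` helper are assumptions.

```python
class BrowserAIError(Exception):
    """Stand-in for the documented base exception."""


class TaskExecutionError(BrowserAIError):
    """Stand-in; assumed to subclass BrowserAIError."""


class BrowserError(BrowserAIError):
    """Stand-in; assumed to subclass BrowserAIError."""


def classify(exc: Exception) -> str:
    """Map an exception to a coarse category, most specific check first."""
    if isinstance(exc, TaskExecutionError):
        return "task"
    if isinstance(exc, BrowserAIError):
        # Any other library error, e.g. BrowserError.
        return "library"
    return "unexpected"
```

The same ordering applies to except clauses: list the specific exception first, then the base class, then any general fallback.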

Best Practices

⚠️ Important Guidelines:
  • Always use async context managers (async with) for proper resource cleanup
  • Enable debug mode during development for better error diagnosis
  • Use descriptive task descriptions for better AI understanding
  • Handle exceptions appropriately in production code
  • Monitor execution time and adjust timeouts as needed
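
One way to act on the last guideline is to derive a timeout from the execution_time values of earlier runs. The helper below is a hypothetical heuristic, not part of the library: it takes the slowest observed run, adds headroom, and never goes below a floor. The 1.5 factor and the 30 second floor are arbitrary illustrative choices.

```python
import math
from typing import Iterable


def suggest_timeout(
    execution_times: Iterable[float],
    headroom: float = 1.5,
    floor: int = 30,
) -> int:
    """Suggest a timeout in seconds from previously observed run times."""
    times = list(execution_times)
    if not times:
        # No history yet: fall back to the floor value.
        return floor
    return max(floor, math.ceil(max(times) * headroom))
```

The inputs would typically be the `execution_time` attribute collected from past TaskResult objects for the same kind of task.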

Performance Tips

  • Headless Mode: Use headless=True in production for better performance
  • Browser Choice: Chromium is the default and is generally the fastest option; use Firefox or WebKit (the engine behind Safari) when a task depends on behavior in those engines
  • Parallel Execution: Use parallel=True for independent tasks
  • Resource Management: Always close agents properly to free resources
  • Caching: Reuse agent instances for multiple related tasks
