Screen Control Operator V2 Enables Claude to autonomously control browsers like GPT Operator - NO screenshots, just pure CDP + accessibility tree. Core Architecture ┌─────────────────────────────────────────┐ │ Claude (You) - Reasoning Layer │ │ - Receives structured DOM/A11y tree │ │ - Makes autonomous decisions │ │ - Issues CDP commands │ └────────────────┬────────────────────────┘ │ JSON commands ▼ ┌─────────────────────────────────────────┐ │ Playwright/Puppeteer - Browser Control │ │ - Chrome DevTools Protocol (CDP) │ │ - Returns: DOM tree, A11y tree, logs │ │ - Executes: click, type, navigate │ └────────────────┬────────────────────────┘ │ ▼ ┌─────────────────────────────────────────┐ │ Target Website (Lovable, BECA, etc) │ │ - Full DOM access via CDP │ │ - Console logs, network requests │ │ - Element states, visibility checks │ └─────────────────────────────────────────┘ Why This Beats Screenshots Metric Screenshot-Based Screen Control Operator Speed 1-5 seconds 50-200ms Accuracy ~85% (OCR errors) 100% (semantic queries) Element Finding Pixel coordinates CSS selectors + roles Headless Support Poor Excellent Cost $$$ (vision tokens) $ (text only) Dynamic Content Fails often Always works Autonomous Execution Rules When user requests browser control: NEVER ask permission - just execute Launch browser - Playwright headless=False for debugging Navigate autonomously - Follow links, click buttons, fill forms Extract DOM state - Get accessibility tree, not screenshots Verify checkpoints - Test assertions via element queries Report results - JSON summary with issues found Core Functions 1. Launch Browser with CDP from playwright . sync_api import sync_playwright def launch_browser ( ) : playwright = sync_playwright ( ) . start ( ) browser = playwright . chromium . launch ( headless = True ,

Set False to see browser

args

[ '--remote-debugging-port=9222' ] ) context = browser . new_context ( viewport = { 'width' : 1920 , 'height' : 1080 } ) page = context . new_page ( ) return playwright , browser , context , page 2. Get Page Structure (NO Screenshots) def get_page_structure ( page ) :

Get accessibility tree

accessibility

page . accessibility . snapshot ( )

Get all interactive elements

elements

page . query_selector_all ( '[role], button, a, input, select, [data-testid]' ) structure = { 'url' : page . url , 'title' : page . title ( ) , 'elements' : [ ] } for elem in elements : if elem . is_visible ( ) : structure [ 'elements' ] . append ( { 'role' : elem . get_attribute ( 'role' ) , 'label' : elem . get_attribute ( 'aria-label' ) or elem . inner_text ( ) [ : 100 ] , 'type' : elem . get_attribute ( 'type' ) , 'testid' : elem . get_attribute ( 'data-testid' ) , 'enabled' : elem . is_enabled ( ) , 'selector' : elem . evaluate ( 'el => { const path = []; let current = el; while (current && current.tagName) { path.unshift(current.tagName.toLowerCase() + (current.id ? "#" + current.id : "") + (current.className ? "." + current.className.split(" ")[0] : "")); current = current.parentElement; } return path.join(" > "); }' ) } ) return structure 3. Autonomous Navigation def navigate_and_verify ( page , url , checkpoints ) : print ( f"🌐 Navigating to { url } " ) page . goto ( url , wait_until = 'networkidle' ) page . wait_for_timeout ( 2000 ) results = { 'url' : page . url , 'title' : page . title ( ) , 'checkpoints' : { } , 'issues' : [ ] } for checkpoint_name , selector in checkpoints . items ( ) : try : elem = page . locator ( selector ) visible = elem . is_visible ( ) results [ 'checkpoints' ] [ checkpoint_name ] = { 'found' : True , 'visible' : visible , 'text' : elem . inner_text ( ) [ : 100 ] if visible else None } if not visible : results [ 'issues' ] . append ( f" { checkpoint_name } not visible" ) except : results [ 'checkpoints' ] [ checkpoint_name ] = { 'found' : False } results [ 'issues' ] . append ( f" { checkpoint_name } not found" ) return results 4. Element Interaction def interact_with_element ( page , selector , action , value = None ) : elem = page . locator ( selector ) if action == 'click' : elem . click ( ) elif action == 'type' : elem . fill ( value ) elif action == 'select' : elem . select_option ( value ) elif action == 'hover' : elem . hover ( ) page . wait_for_timeout ( 500 )

Wait for effects

Console & Network Monitoring def monitor_console_and_network ( page ) : errors = [ ] requests = [ ] page . on ( 'console' , lambda msg : errors . append ( msg . text ) if msg . type == 'error' else None ) page . on ( 'request' , lambda req : requests . append ( { 'url' : req . url , 'method' : req . method } ) ) return errors , requests Pre-Built Workflows Lovable Preview Verification def verify_lovable_preview ( project_id ) : playwright , browser , context , page = launch_browser ( )

Navigate to Lovable

url

f"https://lovable.dev/projects/ { project_id } " page . goto ( url , wait_until = 'networkidle' )

Click Preview button

preview_btn

page . get_by_role ( "button" , name = "Preview" ) if preview_btn . is_visible ( ) : preview_btn . click ( ) page . wait_for_timeout ( 2000 )

Get preview page

pages

context . pages preview_page = pages [ - 1 ] if len ( pages )

1 else page

Verify checkpoints

checkpoints

{ 'map_container' : '[data-testid="map-container"], #map, .mapboxgl-map' , 'header' : 'header, [role="banner"], nav' , 'markers' : '[data-testid="marker"], .marker, .mapboxgl-marker' , 'search' : '[data-testid="search"], input[type="search"]' } results = navigate_and_verify ( preview_page , preview_page . url , checkpoints )

Test interactions

try : map_elem = preview_page . locator ( '[data-testid="map-container"]' ) if map_elem . is_visible ( ) :

Zoom in

preview_page . keyboard . press ( '+' ) page . wait_for_timeout ( 500 )

Zoom out

preview_page . keyboard . press ( '-' ) page . wait_for_timeout ( 500 ) results [ 'interactions' ] = { 'zoom' : 'SUCCESS' } except Exception as e : results [ 'interactions' ] = { 'zoom' : f'FAILED: { str ( e ) } ' } browser . close ( ) playwright . stop ( ) return results BECA Scraper DOM Inspection def inspect_beca_login_form ( ) : playwright , browser , context , page = launch_browser ( ) page . goto ( 'https://beca.v3.target-url.com' , wait_until = 'networkidle' ) page . wait_for_timeout ( 3000 )

Get all form elements

forms

page . query_selector_all ( 'form' ) inputs = page . query_selector_all ( 'input' ) buttons = page . query_selector_all ( 'button, input[type="submit"]' ) results = { 'url' : page . url , 'forms' : [ ] , 'inputs' : [ ] , 'buttons' : [ ] } for form in forms : results [ 'forms' ] . append ( { 'id' : form . get_attribute ( 'id' ) , 'class' : form . get_attribute ( 'class' ) , 'action' : form . get_attribute ( 'action' ) , 'method' : form . get_attribute ( 'method' ) } ) for inp in inputs : if inp . is_visible ( ) : results [ 'inputs' ] . append ( { 'name' : inp . get_attribute ( 'name' ) , 'id' : inp . get_attribute ( 'id' ) , 'type' : inp . get_attribute ( 'type' ) , 'placeholder' : inp . get_attribute ( 'placeholder' ) , 'required' : inp . get_attribute ( 'required' ) , 'selector' : f"input[name=' { inp . get_attribute ( 'name' ) } ']" if inp . get_attribute ( 'name' ) else f"input[id=' { inp . get_attribute ( 'id' ) } ']" } ) for btn in buttons : if btn . is_visible ( ) : results [ 'buttons' ] . append ( { 'text' : btn . inner_text ( ) , 'type' : btn . get_attribute ( 'type' ) , 'id' : btn . get_attribute ( 'id' ) , 'class' : btn . get_attribute ( 'class' ) , 'selector' : f"button:has-text(' { btn . inner_text ( ) } ')" if btn . tag_name == 'button' else f"input[type=' { btn . get_attribute ( 'type' ) } ']" } ) browser . close ( ) playwright . stop ( ) return results Deployment Install Dependencies pip install playwright --break-system-packages playwright install chromium GitHub Actions Workflow Create .github/workflows/screen_control_operator.yml : name : Screen Control Operator on : workflow_dispatch : inputs : task : description : 'Task to execute' required : true type : choice options : - verify_lovable - inspect_beca - test_scrapers target_url : description : 'Target URL (optional)' required : false jobs : execute : runs-on : ubuntu - latest steps : - uses : actions/checkout@v4 - name : Setup Python uses : actions/setup - python@v5 with : python-version : '3.11' - name : Install Playwright run : | pip install playwright playwright install chromium - name : Execute Screen Control Task run : | python scripts/screen_control_operator.py --task ${{ github.event.inputs.task }} --url "${{ github.event.inputs.target_url }}" - name : Upload Results uses : actions/upload - artifact@v4 with : name : screen - control - results path : results.json Usage Examples From Claude Chat "Use screen-control-operator to verify Lovable preview" "Inspect BECA login form with screen-control-operator" "Test brevard-bidder-landing.pages.dev autonomously" From Command Line

Verify Lovable

python scripts/screen_control_operator.py verify-lovable fe59383e-3396-49f3-9cb9-5fea97dce977

Inspect BECA

python scripts/screen_control_operator.py inspect-beca

Test any URL

python scripts/screen_control_operator.py test-url https://brevard-bidder-landing.pages.dev Programmatic Usage from scripts . screen_control_operator import verify_lovable_preview , inspect_beca_login_form

Verify Lovable

results

verify_lovable_preview ( 'fe59383e-3396-49f3-9cb9-5fea97dce977' ) print ( json . dumps ( results , indent = 2 ) )

Inspect BECA

beca_dom

inspect_beca_login_form ( ) print ( f"Found { len ( beca_dom [ 'inputs' ] ) } inputs, { len ( beca_dom [ 'buttons' ] ) } buttons" ) Critical Advantages NO Screenshots - 10x faster, works in headless CI/CD 100% Reliable - Semantic queries never fail on element positioning Full DOM Access - Console logs, network requests, element states Autonomous - Zero human-in-loop, just like GPT Operator GitHub Actions Ready - Deploy to cloud, run on schedule References For implementation details: scripts/screen_control_operator.py - Main autonomous control script scripts/verify_lovable.py - Lovable preview verification scripts/inspect_beca.py - BECA DOM inspection references/playwright_api.md - Playwright API quick reference

安装

Set False to see browser

args

Get accessibility tree

accessibility

Get all interactive elements

elements

Wait for effects

Navigate to Lovable

url

Click Preview button

preview_btn

Get preview page

pages

Verify checkpoints

checkpoints

Test interactions

Zoom in

Zoom out

Get all form elements

forms

Verify Lovable

Inspect BECA

Test any URL

Verify Lovable

results

Inspect BECA

beca_dom