Changelog
All notable changes to Charlotte, documented by release.
v0.6.2
--cdp-endpoint CLI option — Connect to a running Chrome/Chromium instance via its DevTools Protocol endpoint instead of launching a new browser. Supports ws:// URLs and channel:chrome shorthand. Closes GAP-33.
Iframe interaction — Interaction tools (click, type, select, toggle, submit, scroll, hover, key, wait_for, upload, fill_form) now work against elements inside child frames. Closes #66.
Reduced CDP session churn — Interaction helpers reuse CDP sessions instead of repeatedly attaching and detaching. Closes #113.
Cross-frame drag validation — charlotte_drag now rejects drags between elements in different frames with a CharlotteError instead of silently producing undefined behavior.
Stale frame sessions in CDPSessionManager — Frame sessions are cleaned up on frame detach and empty reverse-index entries are pruned. Closes #67.
Startup tool.disable() calls are now batched into a single sendToolListChanged() notification, mirroring the v0.6.1 runtime fix.
Viewport of pre-existing pages is preserved when connecting via --cdp-endpoint.
v0.6.1
Runtime tool group activation — Tools enabled via charlotte_tools were not callable by MCP clients due to notification flooding. Now sends a single batched notification per enable/disable action. Fixes #146.
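The batching behavior described in this fix can be pictured roughly as follows. This is a minimal sketch, not Charlotte's implementation: ToolRegistry, setGroupEnabled, and notifyListChanged are hypothetical names, and the notification is replaced by a counter so the effect is visible.

```typescript
// Hypothetical sketch: these names are illustrative, not Charlotte's actual API.
type Tool = { name: string; enabled: boolean };

class ToolRegistry {
  private tools: Map<string, Tool>;
  private notifications = 0;

  constructor(tools: Map<string, Tool>) {
    this.tools = tools;
  }

  // Stand-in for sending a "tool list changed" notification to the client.
  private notifyListChanged(): void {
    this.notifications += 1;
  }

  // Toggle every tool in the group first, then notify at most once.
  setGroupEnabled(names: string[], enabled: boolean): void {
    let changed = false;
    for (const name of names) {
      const tool = this.tools.get(name);
      if (tool && tool.enabled !== enabled) {
        tool.enabled = enabled;
        changed = true;
      }
    }
    if (changed) this.notifyListChanged();
  }

  get notificationCount(): number {
    return this.notifications;
  }
}

const registry = new ToolRegistry(
  new Map<string, Tool>([
    ["charlotte_console", { name: "charlotte_console", enabled: false }],
    ["charlotte_requests", { name: "charlotte_requests", enabled: false }],
  ]),
);
registry.setGroupEnabled(["charlotte_console", "charlotte_requests"], true);
```

The point of the design is that the client sees one list change per enable/disable action, regardless of how many tools the group contains.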
v0.6.0
charlotte_fill_form — Batch form fill tool. Fill an entire form in one tool call with an array of {element_id, value} pairs. Closes GAP-04.
Slow typing — charlotte_type now accepts slowly and character_delay parameters for character-by-character input. Closes GAP-05.
Lazy Chromium initialization — Browser launches on first tool call instead of at startup, preventing idle instances.
MCP logging capability — Server declares logging capability for client compatibility.
CLI improvements — parseArgs migration, --help flag, improved --no-headless handling.
Default viewport increased to 1440×900 for more realistic rendering.
BREAKING: All tool names renamed from charlotte:xxx to charlotte_xxx for MCP spec compliance. Closes #57.
Node.js requirement relaxed from >=22 to >=20.
pollUntilCondition JS evaluation — Replaced new Function with CDP Runtime.evaluate for consistency.
Screenshot stale compositor frame — Flush before capture to prevent stale SPA screenshots.
CVE-2026-31988 — Override yauzl to 3.2.1.
Server version now read from package.json instead of being hardcoded.
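As an illustration of the charlotte_fill_form entry in this release, the request arguments might be shaped like the sketch below. Only the {element_id, value} pair shape comes from the changelog. The wrapper name fields, the string-only value type, and the element IDs are assumptions made for the example.

```typescript
// Shape taken from the changelog entry: an array of {element_id, value} pairs.
interface FormFieldEntry {
  element_id: string;
  value: string; // assumption: the real tool may accept other value types
}

// Hypothetical argument wrapper; the real parameter name may differ.
interface FillFormArgs {
  fields: FormFieldEntry[];
}

// Made-up element IDs, for illustration only.
const fillFormArgs: FillFormArgs = {
  fields: [
    { element_id: "inp-a1b2", value: "ada@example.com" },
    { element_id: "inp-c3d4", value: "Ada Lovelace" },
  ],
};

// A sanity check a caller might run before sending the request:
// returns every element_id that appears more than once.
function findDuplicateIds(args: FillFormArgs): string[] {
  const seen = new Set<string>();
  const duplicates: string[] = [];
  for (const { element_id } of args.fields) {
    if (seen.has(element_id)) duplicates.push(element_id);
    seen.add(element_id);
  }
  return duplicates;
}
```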
v0.5.1
Popup and target="_blank" tab capture — Clicks on target="_blank" links and window.open() now auto-capture the new tab in PageManager. Surfaced as opened_tabs in tool responses. Fixes #103, #98.
Contributor issue templates — Bug report, feature request, and tool request templates. Community links in README.
Renamed AXIOM to ASM across the Charlotte site.
v0.5.0
Iframe content extraction — Child frame content (interactive elements, text, summaries) merged into parent page representation. Configurable depth.
Structural tree view — charlotte:observe accepts view: "tree" and "tree-labeled" for hierarchical text output with optional element IDs.
File output — charlotte:observe and charlotte:screenshot accept output_file to write results to disk, reducing token consumption. Closes GAP-13.
Screenshot artifact management — screenshots, screenshot_get, screenshot_delete tools for persistent screenshot files.
wait_for JS evaluation now uses CDP Runtime.evaluate, fixing multi-statement conditions that silently returned undefined.
Browser reconnection race — getBrowser() auto-recovers via ensureConnected() instead of throwing immediately.
Renderer pipeline resilience — malformed AX properties, failed content nodes, bad iframes, and transient CDP errors no longer crash renders.
Event listener cleanup on tab close, dialog handler error handling, and dev mode shutdown resilience.
Fixed a missing null guard in form field matching, landmark ID collisions across frames, CLI argument parsing when a path contains =, and Zod bounds validation.
README rewritten with problem-first opening. MCP client configs added for Cursor, Windsurf, VS Code, Cline, and Amp.
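The CLI fix for = in paths is not described in detail here. One common way to handle that class of problem is to split a --flag=value argument on the first = only, as in the sketch below. The function splitFlag and the flag --example-path are hypothetical and are not taken from Charlotte's CLI.

```typescript
// Split "--flag=value" on the first "=" only, so the value may itself contain "=".
function splitFlag(arg: string): { name: string; value: string | undefined } {
  const eq = arg.indexOf("=");
  if (eq === -1) return { name: arg, value: undefined };
  return { name: arg.slice(0, eq), value: arg.slice(eq + 1) };
}

// Hypothetical flag and path, for illustration only.
const parsed = splitFlag("--example-path=/tmp/run=1/page.json");
// parsed.name is "--example-path"; parsed.value is "/tmp/run=1/page.json"
```

A naive arg.split("=") would instead break the path into three pieces at every =.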
v0.4.2
charlotte:upload — Set files on <input type="file"> elements via CDP. Validates file existence and element type. Closes GAP-02.
File input detection — File inputs now correctly identified as file_input type instead of button.
charlotte:key enhancement — Added keys (sequence), element_id (focus targeting), and delay parameters for keyboard-driven UIs.
Boolean parameter validation — All boolean parameters now accept string-coerced values ("true"/"false") from MCP clients.
click_at hover simulation — Moves mouse to coordinates and pauses before clicking, fixing framework-managed link navigation.
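The boolean parameter validation change in this release amounts to accepting both real booleans and their string forms. A minimal sketch of that coercion is shown below. It is written as a plain function for self-containment, and coerceBoolean is a hypothetical name. Charlotte's own validation layer may implement this differently.

```typescript
// Accepts real booleans and the strings "true"/"false"; rejects everything else.
function coerceBoolean(value: unknown): boolean {
  if (typeof value === "boolean") return value;
  if (value === "true") return true;
  if (value === "false") return false;
  throw new TypeError(
    `Expected a boolean or "true"/"false", got ${JSON.stringify(value)}`,
  );
}
```

Rejecting unknown strings, rather than treating any non-empty string as true, avoids the classic pitfall where "false" is truthy.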
v0.4.1
charlotte:click_at — Click at specific page coordinates for non-semantic elements (custom widgets, canvas, SVG).
CSS selector mode for charlotte:find — Query the DOM directly via selector parameter, returning elements with Charlotte IDs.
charlotte:evaluate no longer silently returns null for multi-statement code. It now uses CDP Runtime.evaluate, which yields the correct completion value.
v0.4.0
Tiered tool visibility — Startup profiles control which tools load into the agent's context. Six profiles: core (7), browse (22), interact (27), develop (30), audit (13), full (40). Granular group selection via --tools.
charlotte:tools meta-tool — Runtime tool group management. List, enable, and disable tool groups mid-session without restarting.
Profile benchmark suite — Four tests measuring tool definition overhead across full, browse, and core profiles.
charlotte:drag — Drag an element to another element using mouse primitives. Closes GAP-01.
Landmark IDs — Landmarks now have stable hash-based IDs (rgn-xxxx) for tool referencing.
charlotte:console — Retrieve console messages with level filtering and buffer clearing. Closes GAP-21.
charlotte:requests — Retrieve network request history with URL, resource type, and status filters. Closes GAP-22.
Modifier key clicks — charlotte:click now accepts ctrl, shift, alt, meta modifiers for all click types.
Pseudo-element content duplication — extractFullContent() no longer emits duplicate text from CSS ::before/::after pseudo-elements.
Default startup profile is now browse (22 tools) instead of loading all 40 tools. Use --profile=full for the previous behavior.
PageManager now captures all console messages and network responses (not just errors). Ring buffers capped at 1000 entries.
Static server binds to 127.0.0.1 instead of 0.0.0.0. Directory traversal prevention via allowedWorkspaceRoot.
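The capped ring buffers mentioned for PageManager in this release could look something like the following. This is a generic sketch of the technique under the stated cap of 1000 entries. The class RingBuffer and its methods are hypothetical and do not reflect Charlotte's internal types.

```typescript
// Fixed-capacity buffer that overwrites the oldest entry once full.
class RingBuffer<T> {
  private items: T[] = [];
  private start = 0; // index of the oldest entry once the buffer is full
  private readonly capacity: number;

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
  }

  push(item: T): void {
    if (this.items.length < this.capacity) {
      this.items.push(item);
    } else {
      // Overwrite the oldest slot and advance the start pointer.
      this.items[this.start] = item;
      this.start = (this.start + 1) % this.capacity;
    }
  }

  // Entries in insertion order, oldest first.
  toArray(): T[] {
    return [...this.items.slice(this.start), ...this.items.slice(0, this.start)];
  }

  get size(): number {
    return this.items.length;
  }
}

const consoleMessages = new RingBuffer<string>(1000);
for (let i = 0; i < 1005; i++) consoleMessages.push(`message ${i}`);
// The buffer now holds "message 5" through "message 1004".
```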
v0.3.0
charlotte:dialog — Accept or dismiss JavaScript dialogs (alert, confirm, prompt, beforeunload). Closes GAP-03.
Dialog-aware action racing — Clicks that trigger dialogs return immediately instead of hanging for 30s.
dialog_auto_dismiss configuration — Auto-handle dialogs via charlotte:configure. Options: none, accept_alerts, accept_all, dismiss_all.
Dialog-blocking stub responses — Minimal stub representation when a dialog is blocking, so agents always know a dialog needs handling.
PageManager now accepts CharlotteConfig in its constructor for dialog auto-dismiss configuration.
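The four dialog_auto_dismiss options listed above suggest a simple decision table. The sketch below encodes one plausible reading of the option names. The exact semantics are an assumption (in particular, that accept_alerts applies only to alert dialogs), and autoDialogAction is a hypothetical helper.

```typescript
type DialogType = "alert" | "confirm" | "prompt" | "beforeunload";
type AutoDismissMode = "none" | "accept_alerts" | "accept_all" | "dismiss_all";
type DialogAction = "accept" | "dismiss" | null; // null: leave the dialog for the agent

// Assumed semantics, inferred from the option names only.
function autoDialogAction(mode: AutoDismissMode, type: DialogType): DialogAction {
  if (mode === "accept_all") return "accept";
  if (mode === "dismiss_all") return "dismiss";
  if (mode === "accept_alerts") return type === "alert" ? "accept" : null;
  return null; // mode "none": nothing is handled automatically
}
```

Under this reading, a confirm dialog in accept_alerts mode is left pending, which is when the dialog-blocking stub response would tell the agent to call charlotte:dialog.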
v0.2.0
Compact response format — Responses are 50-99% smaller. Charlotte's navigate returns 336 chars for Hacker News vs Playwright MCP's 61,230.
Interactive summary for minimal detail — Element counts by landmark region instead of full element lists. Wikipedia dropped from 711K to 7.7K chars.
Default state stripping — Interactive elements omit redundant defaults (enabled: true, visible: true, focused: false).
Navigation defaults to minimal detail. Pass detail: "summary" or "full" for more context.
Removed unused alerts field from page representation.
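Default state stripping, as described in this release, can be sketched as a filter that keeps only the fields that differ from their defaults. The three defaults below come from the changelog entry. The ElementState type and the stripDefaultState function are hypothetical.

```typescript
interface ElementState {
  enabled: boolean;
  visible: boolean;
  focused: boolean;
}

// Defaults named in the changelog entry.
const DEFAULT_STATE: ElementState = { enabled: true, visible: true, focused: false };

// Keep only the fields whose value differs from the default.
function stripDefaultState(state: ElementState): Partial<ElementState> {
  const compact: Partial<ElementState> = {};
  for (const key of Object.keys(DEFAULT_STATE) as (keyof ElementState)[]) {
    if (state[key] !== DEFAULT_STATE[key]) compact[key] = state[key];
  }
  return compact;
}

const typical = stripDefaultState({ enabled: true, visible: true, focused: false });
const unusual = stripDefaultState({ enabled: false, visible: true, focused: true });
// typical is {}; unusual is { enabled: false, focused: true }
```

Because most elements on a page are enabled, visible, and unfocused, the common case serializes to an empty object.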
v0.1.3
Benchmark suite for comparing Charlotte against Playwright MCP across real websites.
v0.1.2
Added mcpName field for MCP registry publishing.
v0.1.1
get_cookies — Retrieve cookies for the active page with optional URL filtering.
clear_cookies — Clear cookies with optional name filtering.
Session integration tests now use HTTP URLs for cookie operations.
v0.1.0
Initial release. All six implementation phases complete: navigation, observation, interaction, session, development, and utility tools.
Renderer pipeline: accessibility tree + layout geometry + interactive element extraction.
Hash-based element IDs stable across re-renders.
Snapshot store with ring buffer and structural diffing.
222 tests across 19 test files (unit + integration).