Workflows
PEN now exposes a workflow-first entry point for the most common debugging
loops, and the lower-level tools still chain together naturally when you
need a more manual investigation. Start with pen_workflow when you
want guided verdicts and next steps, then fall back to the manual patterns below
when you need finer control.
Workflow-First Entry Points
slow-page-triage
Use this when a page feels slow and you need one pass across startup cost, render-blocking resources, Web Vitals, and CPU hotspots. The workflow returns a single verdict plus the most likely next step instead of leaving the LLM to infer the chain from raw outputs.
js-bloat-check
Use this when you suspect oversized bundles, weak code splitting, or scripts that load far more code than they use. When a URL is provided, the workflow starts coverage before navigation so the result reflects page-load JavaScript, not just idle runtime state.
accessibility-pass
Use this when you want a fast accessibility sweep for missing alt text, unlabeled controls, heading-order problems, and missing document language. It is intentionally lightweight and still points you back to Lighthouse and manual keyboard or screen-reader testing for deeper validation.
Manual Investigation Patterns
Memory Leak Investigation
- Force GC to get a clean baseline
- Take snapshot A
- Have the user reproduce the suspected leak (navigate, open/close a modal, etc.)
- Take snapshot B
- Diff the two snapshots — PEN shows new objects, grown objects, and total delta
The diff highlights retained objects that grew between the two snapshots — those are your leak suspects. PEN also includes a percentage change and a qualitative assessment (e.g. "minor growth", "significant growth", "critical growth") to help the LLM quickly gauge severity. From there, the LLM can reason about the object types and suggest what’s holding the reference.
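As a rough illustration of how a percentage change can be turned into one of those qualitative labels, here is a minimal sketch. The function name, the thresholds, and the "no growth" label are all hypothetical; the cutoffs PEN actually uses are not stated here and may differ.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; PEN's real cutoffs may differ.
SIGNIFICANT_PCT = 10.0
CRITICAL_PCT = 50.0


@dataclass
class HeapDelta:
    delta_bytes: int
    percent_change: float
    assessment: str


def assess_growth(before_bytes: int, after_bytes: int) -> HeapDelta:
    """Compare the total size of snapshot A and snapshot B and label the change."""
    if before_bytes <= 0:
        raise ValueError("before_bytes must be positive")
    delta = after_bytes - before_bytes
    pct = delta / before_bytes * 100
    if pct >= CRITICAL_PCT:
        label = "critical growth"
    elif pct >= SIGNIFICANT_PCT:
        label = "significant growth"
    elif pct > 0:
        label = "minor growth"
    else:
        label = "no growth"  # label invented for this sketch
    return HeapDelta(delta, pct, label)
```

Under these assumed thresholds, a heap that goes from 40 MB to 46 MB (a 15% increase) would be labelled "significant growth".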
Tip: If you'd rather track allocations over time instead of taking two manual snapshots, use pen_heap_track with action: "start", reproduce the problem, then call it again with action: "stop".
Page Load Optimization
- Navigate to the target page
- Capture a Chrome trace during load
- Analyze the trace for long tasks (>50ms), LCP, CLS, and slow resources
- Check the network waterfall for large assets, slow requests, and render-blocking resources
- Measure Core Web Vitals for the final score
That gives the LLM the full picture: trace-level timing, network bottlenecks, and Web Vitals scores in one pass.
Console Debugging
- Start console capture to wire up the CDP listener
- Have the user reproduce the issue
- Pull error messages. Filtering by level=error keeps noise down. You can also use textFilter to search for specific strings (case-insensitive substring match) across all message text.
Console entries include source URLs, line numbers, and stack traces for exceptions. The buffer holds 1,000 messages; when it fills up, the oldest 100 are evicted.
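The buffering and filtering behaviour described above can be sketched as follows. This is not PEN's implementation: the class and method names are hypothetical, and the sketch assumes eviction is triggered when a message arrives while the buffer is already full.

```python
from typing import List, Optional

MAX_MESSAGES = 1000  # buffer capacity stated in the text
EVICT_BATCH = 100    # oldest entries dropped in one go when the buffer is full


class ConsoleBuffer:
    """Hypothetical bounded console log with batch eviction."""

    def __init__(self) -> None:
        self._entries: List[dict] = []

    def __len__(self) -> int:
        return len(self._entries)

    def add(self, level: str, text: str) -> None:
        # Assumption: eviction runs just before appending to a full buffer.
        if len(self._entries) >= MAX_MESSAGES:
            del self._entries[:EVICT_BATCH]
        self._entries.append({"level": level, "text": text})

    def query(self, level: Optional[str] = None,
              text_filter: Optional[str] = None) -> List[dict]:
        # text_filter is a case-insensitive substring match on the message text.
        needle = text_filter.lower() if text_filter else None
        return [
            e for e in self._entries
            if (level is None or e["level"] == level)
            and (needle is None or needle in e["text"].lower())
        ]
```

Evicting in batches rather than one entry at a time means the list is trimmed only once every 100 messages after it first fills.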
Full Page Audit
- Navigate to the page
- Run Lighthouse for a high-level score (performance, accessibility, SEO, best practices)
- Capture a trace for the detailed timeline
- Analyze the trace to pinpoint exactly what Lighthouse flagged
Lighthouse tells you what's wrong. The trace tells you why and where in the timeline.
Bundle Audit
- Run pen_js_coverage with the navigate parameter set to the page URL; PEN starts coverage, navigates, collects, and stops internally
- Run pen_css_coverage the same way to see which CSS rules are unused
This surfaces dead code. The LLM can then recommend code splitting or tree-shaking based on what's unused. Both coverage tools are single-call — they handle start, navigate, collect, and stop in one invocation.
Multi-Tab Profiling
- List all browser tabs
- Switch to the target tab
- Profile CPU on that tab
- Grab performance metrics
Handy when your app spans multiple tabs or you need to compare performance across pages.
Trace-Driven Analysis
Record a raw trace, then feed it to pen_trace_insights for a structured
breakdown. No need to leave the conversation to analyze the file. You get:
- Long tasks (>50ms threshold)
- Layout shifts (CLS contributors)
- Largest Contentful Paint timing
- Slowest resources
- Frame timing and dropped frames (>33.3ms = below 30fps)
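To make the two thresholds in that list concrete, here is a minimal sketch that applies them to plain lists of durations. The function name and the input and output shapes are assumptions for illustration; they do not describe the actual output of pen_trace_insights.

```python
from typing import Dict, Iterable

LONG_TASK_MS = 50.0      # tasks longer than this count as long tasks
DROPPED_FRAME_MS = 33.3  # frames longer than this fall below 30fps (1000 / 30)


def summarize_durations(task_ms: Iterable[float],
                        frame_ms: Iterable[float]) -> Dict[str, float]:
    """Count long tasks and dropped frames from raw durations in milliseconds."""
    long_tasks = [d for d in task_ms if d > LONG_TASK_MS]
    dropped = [d for d in frame_ms if d > DROPPED_FRAME_MS]
    return {
        "long_task_count": len(long_tasks),
        "longest_task_ms": max(long_tasks, default=0.0),
        "dropped_frame_count": len(dropped),
    }
```

Both comparisons are strict, so a task of exactly 50ms or a frame of exactly 33.3ms is not flagged in this sketch.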
Network Performance
- Enable network capture (optionally disable cache)
- Interact with the page — navigate, click, scroll
- View the waterfall to spot slow requests, large assets, or 4xx/5xx errors. Use statusFilter (4xx, 5xx, error, or an exact status code) and urlFilter (case-insensitive substring) to narrow results.
- Drill into a specific request for full headers, timing, and body details
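The filter semantics in the steps above can be sketched like this. The request shape (a dict with "url" and "status") and the function names are hypothetical, and the sketch assumes that "error" means a request that failed without an HTTP response, represented here by a status of None.

```python
from typing import List, Optional


def matches_status(status: Optional[int], status_filter: str) -> bool:
    """Apply a statusFilter-style value: 4xx, 5xx, error, or an exact code."""
    f = status_filter.lower()
    if f == "error":
        # Assumption: "error" selects requests with no HTTP response at all.
        return status is None
    if status is None:
        return False
    if f == "4xx":
        return 400 <= status <= 499
    if f == "5xx":
        return 500 <= status <= 599
    return f.isdigit() and status == int(f)


def filter_requests(requests: List[dict],
                    status_filter: Optional[str] = None,
                    url_filter: Optional[str] = None) -> List[dict]:
    """Keep requests that match both filters; url_filter ignores case."""
    needle = url_filter.lower() if url_filter else None
    return [
        r for r in requests
        if (status_filter is None or matches_status(r["status"], status_filter))
        and (needle is None or needle in r["url"].lower())
    ]
```

Combining both filters narrows the waterfall quickly, for example a status filter of 4xx with a URL filter of api to find failing API calls.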
Device Simulation
- Set device emulation (e.g., iPhone 14 with 4G network + 4x CPU throttle)
- Navigate to the page
- Measure Web Vitals under throttled conditions
- Capture a trace to see what's slow on constrained hardware
Network presets: 3G (563ms latency, 188KB/s down), slow-3g (2000ms, 50KB/s), 4G (170ms, 500KB/s), WiFi (2ms, 3.75MB/s), and offline (all connectivity disabled). You can also set offline: true independently of any preset to simulate airplane mode.
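For a feel of what those presets mean in practice, the sketch below estimates how long a single asset takes to fetch under each one. The model is deliberately crude (one round of latency plus size divided by throughput) and ignores connection setup, TLS, and parallel requests. The dictionary layout and function name are hypothetical, and 3.75MB/s is taken as 3750KB/s, which assumes decimal units.

```python
# Latency and download throughput copied from the preset list above.
PRESETS = {
    "3G": {"latency_ms": 563, "down_kb_s": 188},
    "slow-3g": {"latency_ms": 2000, "down_kb_s": 50},
    "4G": {"latency_ms": 170, "down_kb_s": 500},
    "WiFi": {"latency_ms": 2, "down_kb_s": 3750},  # assumes 1MB = 1000KB
}


def estimate_fetch_ms(preset: str, size_kb: float) -> float:
    """Rough fetch time: one latency hit plus raw transfer time."""
    p = PRESETS[preset]
    return p["latency_ms"] + size_kb / p["down_kb_s"] * 1000
```

With this model a 500KB bundle takes about 1.2 seconds on 4G and about 12 seconds on slow-3g, which is why measuring under throttled conditions tends to expose problems that a fast connection hides.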
Tool ID Flow
Some tools produce IDs consumed by downstream tools:
| Producer | ID Type | Consumer |
|---|---|---|
| pen_heap_snapshot | snapshot ID | pen_heap_diff |
| pen_list_pages | target ID | pen_select_page |
| pen_network_waterfall | request ID | pen_network_request |
| pen_list_sources | script ID | pen_source_content, pen_search_source |
| pen_capture_trace | trace path | pen_trace_insights |
IDs are opaque strings (or file paths for traces). They stay valid until PEN restarts or the thing they reference goes away (tab closed, page navigated, etc.).
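The table above can also be expressed as a lookup, which makes it easy to sanity-check a planned sequence of tool calls. This is only a sketch: the ID_FLOW structure and the function name are hypothetical, and it assumes a consumer always needs its producer to have run earlier in the same plan.

```python
from typing import Dict, List, Tuple

# Producer -> (ID type, consumers), transcribed from the table above.
ID_FLOW: Dict[str, Tuple[str, List[str]]] = {
    "pen_heap_snapshot": ("snapshot ID", ["pen_heap_diff"]),
    "pen_list_pages": ("target ID", ["pen_select_page"]),
    "pen_network_waterfall": ("request ID", ["pen_network_request"]),
    "pen_list_sources": ("script ID", ["pen_source_content", "pen_search_source"]),
    "pen_capture_trace": ("trace path", ["pen_trace_insights"]),
}


def missing_producers(plan: List[str]) -> List[str]:
    """Return consumers that appear in the plan before any of their producers."""
    seen = set()
    problems = []
    for tool in plan:
        needed = [p for p, (_, consumers) in ID_FLOW.items() if tool in consumers]
        if needed and not any(p in seen for p in needed):
            problems.append(tool)
        seen.add(tool)
    return problems
```

A plan of pen_heap_snapshot, pen_heap_snapshot, pen_heap_diff passes this check, while a plan that starts with pen_network_request is flagged because no waterfall has produced a request ID yet.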