AI writes more and more of our code, so we need a way to be sure it works: both the code and the product. Today we make the agent prove it.
What to have ready.
Install Claude in Chrome.
This is a Chrome extension that lets the agent drive a real browser. It can open your app, click, type, and read the screen, the console, and the network. We use it this afternoon to test the product the way a user sees it.
Install the Chrome extension, then connect it to Claude Code. Do this before the afternoon lab.
.
Where we are in the workflow.
An agent works in a loop.
Who checks it works?
Two kinds of quality.
Every part can pass. The product can still break where the parts join. So we check both: the parts and the whole.
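A minimal sketch of how that can happen, assuming a made-up checkout feature. The function names and the cents/dollars mix-up are hypothetical, chosen only to illustrate the point: each part passes its own check, and the break only shows up where they join.

```python
# Hypothetical example: two parts that each pass their own check,
# but disagree about units at the point where they join.

def cart_total_cents(prices_cents):
    """Part A: sum item prices and return integer cents."""
    return sum(prices_cents)


def format_price(amount_dollars):
    """Part B: format a dollar amount for display."""
    return f"${amount_dollars:.2f}"


# Checks on the parts: each one is correct in isolation.
assert cart_total_cents([1999, 501]) == 2500
assert format_price(25.0) == "$25.00"


def checkout_label(prices_cents):
    """The join: passes cents into a function that expects dollars."""
    return format_price(cart_total_cents(prices_cents))


# Check on the whole: look at what the user would actually see.
label = checkout_label([1999, 501])
print(label)  # not "$25.00", even though both parts passed
```

Only a check on `checkout_label`, the thing the user actually sees, exposes the mismatch.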
.
Why test-first fits the loop.
Write the test first.
How we run TDD.
The agent can fake a passing test.
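One way a pass can be faked is by hardcoding the answer the test asks for. The sketch below assumes a hypothetical `slugify` feature, which is not from the workshop material. It shows a faked implementation that passes the first test and is caught as soon as a second case is added.

```python
# Hypothetical feature: turn a title into a URL slug.

def slugify_faked(title):
    # Faked: returns the one answer the first test expects,
    # regardless of the input.
    return "hello-world"


def slugify(title):
    # Real: lowercase, split on whitespace, join with hyphens.
    return "-".join(title.lower().split())


# Test cases written before the implementation.
CASES = [
    ("Hello World", "hello-world"),
    ("Test First Fits", "test-first-fits"),
]


def run_tests(fn):
    """Return one pass/fail flag per test case."""
    return [fn(title) == expected for title, expected in CASES]


print(run_tests(slugify_faked))  # the second case exposes the fake
print(run_tests(slugify))
```

A single test is easy to satisfy without solving the problem. A second case with a different input is often enough to tell a real implementation from a hardcoded one.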
Build it test-first.
Take your small feature. Load the skill. Let it grill you, then watch it work one slice at a time.
.
Why unit tests aren’t enough.
Code review tells you what changed. It cannot tell you what the page does when a real user loads it.
The agent drives a real browser.
First give it a rubric: which flow to try, what “passing” looks like, what counts as broken. Without that, the agent says “all fine” because it has no way to know what “not fine” is.
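A rubric like this can be written down as plain data before the agent opens the browser. The sketch below is one possible shape, not a format Claude Code or the extension requires. The `Rubric` class, its field names, and the signup flow are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Rubric:
    """Hypothetical rubric: what to try, what passes, what is broken."""
    flow: list[str]      # steps to try, in order
    passing: list[str]   # signals that must be observed
    broken: list[str]    # signals that count as a failure

    def to_prompt(self) -> str:
        """Render the rubric as text to hand to the agent."""
        lines = ["Flow to try:"]
        lines += [f"  {i}. {step}" for i, step in enumerate(self.flow, 1)]
        lines += ["Passing looks like:"] + [f"  - {p}" for p in self.passing]
        lines += ["Counts as broken:"] + [f"  - {b}" for b in self.broken]
        return "\n".join(lines)

    def verdict(self, observed: set[str]) -> str:
        """Judge a set of observed signals against the rubric."""
        if any(b in observed for b in self.broken):
            return "fail"
        if all(p in observed for p in self.passing):
            return "pass"
        return "unclear"  # nothing broken seen, but passing not confirmed


# Example rubric for an assumed signup flow.
signup = Rubric(
    flow=["Open the signup page", "Submit a valid email", "Read the confirmation"],
    passing=["confirmation message visible"],
    broken=["console error", "failed network request"],
)

print(signup.to_prompt())
```

The `unclear` verdict is the design choice worth noting: when the agent sees nothing broken but cannot confirm the passing signals either, the result is reported as unknown rather than "all fine".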
A bug only the browser catches.
Test it like a user.
Give an agent a browser. Let it click through the feature you just built, the way a real user would.
- Did it find anything that broke in the real UI?
- What could it not test on its own?
.
What we added today.
You have two habits the team owns now: the /tdd skill that makes the agent prove its code, and the browser check that proves the product. Both run inside the agent's loop, on every task.
The agent does the checking now. You still own the judgment: what to build, and whether it's right.