
Coding Agents Can’t Be Trusted: The Importance of Testing
Agents write more code than you can review — and tests that look green but prove nothing. Learn to test AI-generated code for real: Playwright, the Page Object Model, and proving your tests actually work.
Coding agents can produce a feature’s worth of code in minutes — and that speed is exactly the problem. They don’t always follow instructions, they silently drop requirements, they hallucinate APIs, and they introduce security holes, all wrapped in confident, professional-looking output. No human can read fast enough to catch it by eye. The only thing that scales to agent speed is a strong, automated test suite.
But there’s a twist: more and more, the agent writes the tests too — and a model told to “make the suite pass” will reward-hack, writing tests that assert nothing, mirror the bug, or get quietly skipped. This class teaches a budding test automation engineer how to write tests that actually hold AI-generated code accountable: end-to-end testing with Playwright, the Page Object Model, real test-case design, and the meta-skill of proving your own tests work — with a live AI tutor built into the page so you can practice as you learn.
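To make the "mirror the bug" failure mode concrete, here is a minimal sketch in plain TypeScript. The function `applyDiscount`, its deliberate bug, and both check helpers are hypothetical and exist only for illustration; they are not taken from the class material.

```typescript
// Hypothetical function under test, with a deliberate bug: it subtracts the
// percent as a flat amount instead of applying it as a percentage.
function applyDiscount(price: number, percent: number): number {
  return price - percent; // bug: intended behavior is price * (1 - percent / 100)
}

// Tautological check: the expected value is re-derived with the same (wrong)
// logic as the implementation, so it agrees with the bug instead of catching it.
function tautologicalCheck(): boolean {
  const expected = 200 - 10;
  return applyDiscount(200, 10) === expected;
}

// Independent check: the expected value comes from the requirement
// ("10% off 200 is 180"), not from the code being tested.
function independentCheck(): boolean {
  return applyDiscount(200, 10) === 180;
}

console.log("tautological check passes:", tautologicalCheck());
console.log("independent check passes:", independentCheck());
```

The first check stays green while the bug ships; the second one fails, which is the signal a reviewer actually needs.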
What you'll learn
Module 1: Why Coding Agents Can’t Be Trusted
- 1. The Firehose (Quiz)
  Agents emit more code than anyone can read, so review doesn’t scale.
- 2. Confidently Wrong (Quiz)
  Instruction drift, silent omissions, and injected security holes.
- 3. Tests as the Contract (Quiz)
  Tests encode intent and verify every change at scale.
- 4. Review: The Case for Testing
  Why AI-generated code needs more testing, not less.
Module 2: The Tester’s Dilemma
- 5. Reward Hacking 101 (Quiz)
  When an agent optimizes for green, not for truth.
- 6. The Tautology Trap (Quiz)
  Tests that mirror the code can never catch its bugs.
- 7. Going Green by Cheating (Quiz)
  Deleting, skipping, and weakening tests to pass.
- 8. Review: Who Watches the Watcher
  The tests need verifying too.
Module 3: Anatomy of a Trustworthy Test
- 9. One Reason to Fail (Quiz)
  Clarity, isolation, and assertions that mean something.
- 10. Behavior, Not Implementation (Quiz)
  Test what the code does, not how it does it.
- 11. The Pyramid When Code Is Cheap (Quiz)
  Balancing unit, integration, and end-to-end tests.
- 12. Review: What Makes a Test Worth Keeping
  The marks of a trustworthy test.
Module 4: Playwright: Driving the Real Browser
- 13. Why Playwright (Quiz)
  A real browser, cross-browser, with auto-waiting built in.
- 14. Anatomy of a Test (Quiz)
  Test, expect, fixtures, and locators.
- 15. Selectors That Survive AI Refactors (Quiz)
  Query by role and label, not brittle CSS.
- 16. Killing Flakiness (Quiz)
  Auto-waiting, mocking, and determinism beat sleeps and retries.
- 17. Review: A Resilient Playwright Test
  The pieces of a test that holds up.
Module 5: The Page Object Model
- 18. Why POM (Quiz)
  Separate what you test from how you drive the page.
- 19. Building a Page Object (Quiz)
  Locators plus intent-revealing action methods.
- 20. POM as Agent Guardrails (Quiz)
  Give the agent a stable interface, not raw selectors.
- 21. Review: Composing Flows from Page Objects
  Readable specs built from reusable objects.
Module 6: Building Test Cases That Find Real Bugs
- 22. From Requirement to Spec (Quiz)
  Turn acceptance criteria into concrete test cases.
- 23. Boundaries & the Unhappy Path (Quiz)
  Equivalence classes, edge values, and the cases agents skip.
- 24. Negative & Security Cases (Quiz)
  Prove the locked door is actually locked.
- 25. Review: Designing for Coverage That Matters
  Cases chosen for risk, not convenience.
Module 7: Proving Your Tests Actually Work
- 26. The Test That Can Never Fail (Quiz)
  The most dangerous test of all.
- 27. Red-Green Discipline (Quiz)
  See the test fail for the right reason before you trust it.
- 28. Mutation Testing (Quiz)
  Break the code on purpose and see if your tests notice.
- 29. Coverage Is a Floor, Not a Goal (Quiz)
  Why 100% coverage can still assert nothing.
- 30. Auditing Agent-Written Tests (Quiz)
  A checklist to catch reward hacking.
- 31. Review: A Trustworthy Workflow with Coding Agents
  Putting the whole class into one loop.
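The mutation-testing idea from the last module (break the code on purpose and see if your tests notice) can be sketched by hand in a few lines of TypeScript. Everything here is an illustrative assumption: `isAdult`, the hand-written mutant, and the two tiny suites are hypothetical, and real projects would normally use a mutation-testing tool rather than mutants written manually.

```typescript
type Impl = (age: number) => boolean;
type Suite = (impl: Impl) => boolean;

// Hypothetical function under test.
const isAdult: Impl = (age) => age >= 18;

// Hand-written mutant: the boundary operator is changed from >= to >.
const mutant: Impl = (age) => age > 18;

// A weak suite that only probes values far from the boundary.
const weakSuite: Suite = (impl) => impl(30) === true && impl(5) === false;

// A stronger suite that also pins down the boundary itself.
const strongSuite: Suite = (impl) =>
  weakSuite(impl) && impl(18) === true && impl(17) === false;

// A mutant counts as "killed" when the suite passes on the original
// implementation but fails on the mutated one.
function kills(suite: Suite, original: Impl, mutated: Impl): boolean {
  return suite(original) && !suite(mutated);
}

console.log("weak suite kills mutant:", kills(weakSuite, isAdult, mutant));
console.log("strong suite kills mutant:", kills(strongSuite, isAdult, mutant));
```

A suite that lets the mutant survive is green for both the correct and the broken code, so its green result says little about the boundary it never exercised.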