The Test Graveyard
Most stores run the same failed tests twice. You test adding trust badges to your product page. It loses. You remember nothing. Eighteen months later, a new designer joins your team. They pitch adding trust badges to the product page. You test it again. It loses again.
You paid twice for the same lesson. That's the cost of not keeping a test log.
A test log is your institutional memory. It prevents you from retesting what didn't work. It shows you patterns in what your audience actually responds to.
Why a Test Log Matters
Two reasons. First, you stop wasting traffic on tests you already know lose. Second, losing tests teach you what your audience doesn't care about.
Picture a team that has run 50 tests over a year without logging any of them. Half of them probably lost. Nobody can remember which ones. So they run them again.
A team that logs every test has a record. They know in five minutes that shortening the hero headline lost three times already. They skip it. They move to something new.
The second reason is subtler but more valuable. Losing tests tell you about your audience. Say you tested a "free gift with purchase" offer on the homepage and it didn't move conversions. That suggests your audience doesn't respond to bonus-product incentives. That insight changes the next five tests you design.
What to Log
Keep it simple. A Google Sheet with eight columns:
- Date started
- Page tested (homepage, product page, checkout, etc.)
- Hypothesis (one sentence)
- Variant (what changed)
- Sample size (visitors per variant)
- Result (won, lost, inconclusive)
- Learning (what this teaches you)
- Next action (what test comes next)
You don't need fancy software. A spreadsheet works. The point is writing it down.
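If you ever want the same log outside a spreadsheet, the eight columns above map onto a small record type. The following is a minimal sketch, not a required tool: the class name `LogEntry`, the helper `write_log`, and the date and sample size in the example row are all illustrative assumptions.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields

# The three outcomes named in the Result column.
RESULTS = {"won", "lost", "inconclusive"}


@dataclass
class LogEntry:
    """One row of the test log (hypothetical type; one field per column)."""
    date_started: str   # ISO date string, e.g. "2024-03-01"
    page: str           # homepage, product page, checkout, etc.
    hypothesis: str     # one sentence
    variant: str        # what changed
    sample_size: int    # visitors per variant
    result: str         # won, lost, or inconclusive
    learning: str       # what this teaches you
    next_action: str    # what test comes next

    def __post_init__(self):
        # Reject results outside the three allowed values.
        if self.result not in RESULTS:
            raise ValueError(f"result must be one of {sorted(RESULTS)}")


def write_log(entries, out):
    """Write entries as CSV, header row first, to any text file object."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(LogEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))


# Example row; the date and sample size are made-up placeholders.
entry = LogEntry(
    date_started="2024-03-01",
    page="product page",
    hypothesis="A shorter headline will lift conversion rate.",
    variant="Shortened headline",
    sample_size=5000,
    result="lost",
    learning="Audience seems to read the page carefully; length isn't the problem.",
    next_action="Test a more detailed product description.",
)

buf = io.StringIO()
write_log([entry], buf)
print(buf.getvalue())
```

The validation on `result` is the one design choice worth keeping in any format: a fixed vocabulary for outcomes is what makes the log countable later.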
The Learning Column Is Everything
This is the column most teams skip. They log the test result and move on. But the learning is where the value is.
Bad learning: "Shorter headline didn't work."
Good learning: "Shorter headline didn't lift conversion rate. Suggests the audience actually reads the product page carefully and values detailed product descriptions. Length isn't the problem."
The bad version tells you what didn't work. The good version tells you why it didn't work. That's the difference between a losing test and a lesson.
Another example. You test adding a "customers also bought" carousel to the product page. It loses. Bad learning: "Carousel test didn't convert." Good learning: "Adding related products in a carousel didn't increase average order value (AOV) or conversion rate (CR). This suggests product discovery happens during browsing, not at the final stage. Focus next tests on collection page and search experience."
Monthly Review: Finding Patterns
At the end of each month, read through your log. Look for patterns. What keeps losing? What keeps winning?
If you've run five tests on the product page, three of them changed the copy, and all three lost, that's a pattern. The product copy isn't your problem. Your audience isn't copy-sensitive. Stop testing copy.
If you've run three tests about checkout friction and two of them won, that's a pattern. Checkout friction is where your money is. Run more checkout tests next month.
A test log becomes a decision-making tool over time. It shows you where your leverage actually is.
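The monthly read-through can also be done as a simple tally. This is a rough sketch under stated assumptions: the function name `tally_results` is made up, and the `theme` field (copy, friction, and so on) is a hypothetical extra tag that is not one of the log's columns. The sample rows mirror the copy and checkout examples above; the third checkout result is a placeholder.

```python
from collections import Counter, defaultdict


def tally_results(log, key):
    """Group log rows by `key` and return {group: (wins, losses, total)}."""
    counts = defaultdict(Counter)
    for row in log:
        counts[row[key]][row["result"]] += 1
    return {
        group: (c["won"], c["lost"], sum(c.values()))
        for group, c in counts.items()
    }


# Illustrative rows only; "theme" is an assumed extra column.
log = [
    {"page": "product page", "theme": "copy", "result": "lost"},
    {"page": "product page", "theme": "copy", "result": "lost"},
    {"page": "product page", "theme": "copy", "result": "lost"},
    {"page": "checkout", "theme": "friction", "result": "won"},
    {"page": "checkout", "theme": "friction", "result": "won"},
    {"page": "checkout", "theme": "friction", "result": "lost"},
]

for theme, (wins, losses, total) in tally_results(log, "theme").items():
    print(f"{theme}: {wins} won, {losses} lost, {total} total")
```

Passing `"page"` instead of `"theme"` groups by the Page tested column, which needs no extra tagging.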
This Is Why Agencies Don't Keep Logs
Most agencies get paid by the hour or the test, not the result. If they run 50 tests and 40 of them lose, they still collect 50 payments. If they had a test log, they'd realize that after test 15, the patterns became clear. They'd run fewer tests and make more money for clients. But fewer tests means fewer billable hours.
Your test log is a competitive moat. An agency that doesn't track your patterns has no such record. You do. A year of logged tests tells you what your specific audience responds to. Nobody else has that data.
How to Use It
Before you design a new test, look at the learning column from the last five tests. Build off what you learned, not what looks intuitive. Every test should answer a question raised by the test before it.
That's how a hypothesis-first workflow compounds. You don't run random tests. You run tests that build on each other. Each one teaches you about your audience and points to the next one.
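The pre-test check described above, reading the learning column from the last five tests, could be sketched like this. The function name `recent_learnings` and the sample rows are illustrative assumptions, and the sketch assumes dates are stored as ISO strings (YYYY-MM-DD) so that they sort correctly as text.

```python
def recent_learnings(log, n=5):
    """Return the learning text of the n most recently started tests, newest first."""
    ordered = sorted(log, key=lambda row: row["date_started"], reverse=True)
    return [row["learning"] for row in ordered[:n]]


# Placeholder rows; only the two columns used here are shown.
log = [
    {"date_started": "2024-01-08", "learning": "Learning from test 1"},
    {"date_started": "2024-02-05", "learning": "Learning from test 2"},
    {"date_started": "2024-03-04", "learning": "Learning from test 3"},
    {"date_started": "2024-04-01", "learning": "Learning from test 4"},
    {"date_started": "2024-05-06", "learning": "Learning from test 5"},
    {"date_started": "2024-06-03", "learning": "Learning from test 6"},
]

for learning in recent_learnings(log):
    print("-", learning)
```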
A test log that's actually used turns testing from guessing into learning.