Flakiness is a sign of hidden dependencies. Diagnostic checklist:
1. Time-dependence: does the test depend on DateTime.now(), Date.today(), or relative dates?
- Symptom: passes during the day, fails at midnight.
- Fix: inject a TimeService; mock current time in test.
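A minimal sketch of the clock-injection idea, assuming a wrapper class (TimeService, frozenNow, and the test class name are illustrative, not from the original):

```apex
// Hypothetical clock wrapper: production code calls TimeService.now()
// instead of Datetime.now(), so a test can pin "now" to a fixed instant.
public class TimeService {
    @TestVisible private static Datetime frozenNow;
    public static Datetime now() {
        return frozenNow != null ? frozenNow : Datetime.now();
    }
}
```

```apex
@isTest
private class MidnightBoundaryTest {
    @isTest
    static void behavesTheSameJustBeforeMidnight() {
        // Freeze time one second before a date boundary; the result no
        // longer depends on when the test actually runs.
        TimeService.frozenNow = Datetime.newInstance(2024, 6, 30, 23, 59, 59);
        // ...exercise the code under test and assert deterministically.
    }
}
```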
2. Order dependence: does running the class's tests in a different order change pass/fail?
- Symptom: runAllTests passes, but running one test alone fails.
- Fix: each test must set up its own data; @TestSetup shouldn't accumulate state across tests.
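A sketch of a self-contained test (the class and record names are illustrative):

```apex
@isTest
private class AccountServiceTest {
    // Every test builds exactly the records it needs; nothing is inherited
    // from a previous test's side effects, so run order cannot matter.
    @isTest
    static void activatesAccount() {
        Account a = new Account(Name = 'Isolated Fixture');
        insert a;
        // ...call the code under test against a.Id and assert on the result.
    }
}
```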
3. Shared static state: Apex statics persist within a transaction. Test 1 sets a static; test 2 reads stale value.
- Symptom: order-dependent failures, transient.
- Fix: reset statics in @isTest setup; keep each test isolated.
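For example, a static trigger-bypass flag (TriggerControl is a hypothetical name) should be reset to a known state rather than trusted to be clean:

```apex
public class TriggerControl {
    // A static lives for the whole transaction: a helper that sets this
    // flag and forgets to unset it leaks into later logic in the same test.
    public static Boolean bypassValidation = false;
}
```

In each test method, start by assigning `TriggerControl.bypassValidation = false;` explicitly instead of assuming the default.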
4. Async resolution timing: Test.stopTest() forces queued async work (future, Queueable, Batch) to complete synchronously. If you query before stopTest, results are stale.
- Symptom: test fails because data isn't yet updated.
- Fix: query AFTER Test.stopTest().
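A sketch of the pattern (RecalcScoreJob is a hypothetical Queueable, not from the original):

```apex
@isTest
static void asyncResultIsVisibleAfterStopTest() {
    Account a = new Account(Name = 'Async Fixture');
    insert a;

    Test.startTest();
    System.enqueueJob(new RecalcScoreJob(a.Id)); // hypothetical Queueable
    Test.stopTest(); // forces the queued job to complete here

    // Query only AFTER stopTest; before it, the job hasn't run yet.
    a = [SELECT Name FROM Account WHERE Id = :a.Id];
    // ...assert on the fields the job updates.
}
```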
5. SOQL ordering: without an explicit ORDER BY clause, SOQL makes no guarantee about row order.
- Symptom: test asserts results[0].Name == 'A' but sometimes gets 'B'.
- Fix: add ORDER BY or assert on a Map keyed by Id.
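Both fixes sketched (field values are illustrative):

```apex
// Option 1: make the order deterministic.
List<Account> ordered = [SELECT Id, Name FROM Account ORDER BY Name ASC];
System.assertEquals('A', ordered[0].Name);

// Option 2: assert order-independently via a Map keyed by Id.
// (expectedAId would be captured when the test inserts the record.)
Map<Id, Account> byId = new Map<Id, Account>([SELECT Id, Name FROM Account]);
System.assertEquals('A', byId.get(expectedAId).Name);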
6. Limit-related timing: tests near governor limits may fail intermittently as platform-side factors shift.
- Symptom: passes locally, fails in CI under load.
- Fix: profile actual usage; reduce or split into multiple tests.
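The Limits class can report actual consumption, so the test asserts headroom instead of failing only when the limit is finally hit (RollupService is a hypothetical class under test):

```apex
Test.startTest();
new RollupService().run(); // hypothetical code under test
Integer used = Limits.getQueries();
System.debug('SOQL queries: ' + used + ' of ' + Limits.getLimitQueries());
// Fail loudly while there is still headroom, not intermittently at 100:
System.assert(used <= 80, 'Query usage too close to the limit: ' + used);
Test.stopTest();
```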
7. External dependencies leaking: real callouts can't happen in tests, but if test infrastructure is wrong, real DML or scheduled jobs may fire and interfere.
- Symptom: results vary based on org state.
- Fix: ensure mocks for all external calls; check @isTest(SeeAllData=false).
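A sketch of mocking a callout with the standard HttpCalloutMock interface (the client and mock names are illustrative):

```apex
@isTest
private class ExchangeRateClientTest { // client class name is hypothetical
    private class FixedRateMock implements HttpCalloutMock {
        public HttpResponse respond(HttpRequest req) {
            HttpResponse res = new HttpResponse();
            res.setStatusCode(200);
            res.setHeader('Content-Type', 'application/json');
            res.setBody('{"rate": 1.25}');
            return res;
        }
    }

    @isTest
    static void calloutNeverLeavesTheTest() {
        Test.setMock(HttpCalloutMock.class, new FixedRateMock());
        // ...call the code that performs the callout and assert on the
        // canned response, which is the same on every run.
    }
}
```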
8. Data dependencies: @isTest(SeeAllData=true) means tests see real org data, which changes.
- Symptom: passes on Monday, fails Friday.
- Fix: remove SeeAllData=true; build all data in the test.
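A sketch of the fixed shape (class and record names are illustrative):

```apex
@isTest // SeeAllData defaults to false; never opt back in
private class PipelineReportTest {
    @TestSetup
    static void makeData() {
        // Everything the tests rely on is created here, so results
        // cannot drift with the org's real records.
        insert new Account(Name = 'Fixture Account');
    }

    @isTest
    static void countsOnlyFixtureData() {
        Integer n = [SELECT COUNT() FROM Account WHERE Name = 'Fixture Account'];
        System.assertEquals(1, n);
    }
}
```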
9. Concurrent test runs: parallel test execution can cause data conflicts on shared accounts.
- Symptom: flaky in CI's parallel mode but passes single-threaded.
- Fix: use unique values per test (e.g., append a random suffix to names); avoid hard-coded values.
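One way to generate a per-run unique value is Crypto.getRandomInteger(); any source of uniqueness works:

```apex
// A random suffix keeps this test's records distinct from those created
// by a concurrent run of the same test.
String suffix = String.valueOf(Math.abs(Crypto.getRandomInteger()));
Account a = new Account(Name = 'ParallelSafe-' + suffix);
insert a;
// Later queries filter on the unique name, so parallel executions
// cannot collide on shared data.
```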
Diagnostic process:
- Run the failing test 100 times (in CI loop). Capture failure rate.
- Compare passing vs failing debug logs side-by-side.
- Look at the EXACT failure: which assertion? What was the actual vs expected?
- Form a hypothesis from the symptom; check categories above.
A flaky test is broken, not "mostly working". Fix it or quarantine it. Letting flaky tests linger erodes trust in the test suite.
