
Testing Practices

Testing philosophy, test types, per-language conventions, file organization, fixtures/mocks, CI strategy, and what NOT to test. Use when writing tests, reviewing test coverage, setting up test infrastructure, or deciding what to test.


Philosophy

Test your software, or your users will.

  • Test against contracts, not implementations — assert what it should do, not how it does it. Tests that break on every refactor are coupled to internals.
  • State coverage > line coverage — exercise meaningful paths and edge cases, not just lines. We intentionally use no coverage tools — percentage targets create false confidence.
  • Tests are the first users of your API — if tests are hard to write, the design is wrong. Refactor the interface, not the test.
  • Property-based testing finds edges you didn’t think of — complement example-based tests with fuzz and property tests where the input space is large.
  • Tests should be boring — a test that’s hard to read is a test nobody trusts. Inline data, obvious assertions, no clever abstractions.

See review-design for the underlying Pragmatic Programmer principles (design by contract, pragmatic paranoia).

Test Types

| Type | What It Verifies | When to Use | Codebase Example |
| --- | --- | --- | --- |
| Unit | Single function/module in isolation | Always. Every public function. | sr/crates/sr-core/src/version.rs — #[cfg(test)] mod tests |
| Integration | Multiple modules working together | Cross-layer interactions, real I/O | sr/crates/sr-git/tests/integration.rs — TempDir + real git CLI |
| Snapshot/Golden | Output hasn’t changed unexpectedly | Templates, code generation, formatters | incipit/generators/golden_test.go — -update flag to regenerate |
| Fuzz | No panics/crashes on arbitrary input | Parsers, deserializers, sanitizers | incipit/resume/adapter_fuzz_test.go — Go native testing.F |
| Property-based | Invariants hold for generated inputs | Mathematical properties, roundtrip encode/decode | Use proptest (Rust), testing/quick (Go), hypothesis (Python) |
| Benchmark | Performance characteristics | Hot paths, algorithms, throughput | linear-gp/crates/lgp/benches/ — criterion framework |
| Smoke | Basic environment sanity | CI gate, post-deploy check | linear-gp/crates/lgp/tests/smoke_tests.rs — 2 generations, no crash |
| E2E | Full system from user perspective | Critical user flows | teasr CI — real Chrome + xvfb-run dogfood |

Golden File Pattern (Go)

var update = flag.Bool("update", false, "update golden files")

func TestGolden(t *testing.T) {
    got := generate(input)
    golden := filepath.Join("testdata", "golden", name)
    if *update {
        if err := os.WriteFile(golden, got, 0644); err != nil {
            t.Fatal(err)
        }
        return
    }
    want, err := os.ReadFile(golden)
    if err != nil {
        t.Fatalf("read golden file: %v", err)
    }
    if diff := cmp.Diff(string(want), string(got)); diff != "" {
        t.Errorf("mismatch (-want +got):\n%s", diff)
    }
}

Run go test -update ./... to regenerate, then commit the diffs.

Fuzz Pattern (Go)

func FuzzParseInput(f *testing.F) {
    // Seed corpus: valid, empty, edge cases
    f.Add([]byte(`{"name": "Jane"}`))
    f.Add([]byte(`{}`))
    f.Add([]byte(``))

    f.Fuzz(func(t *testing.T, data []byte) {
        // Should never panic — errors are fine
        _, _ = ParseInput(data)
    })
}

CI: go test -fuzz=FuzzParseInput -fuzztime=10s -timeout=60s ./...

Benchmark Pattern (Rust)

use criterion::{criterion_group, criterion_main, Criterion};

fn bench_transform(c: &mut Criterion) {
    let input = load_fixture();
    c.bench_function("transform", |b| {
        b.iter(|| transform(&input))
    });
}

criterion_group!(benches, bench_transform);
criterion_main!(benches);

Place in benches/ directory. Run with cargo bench.

Per-Language Conventions

Rust

| Aspect | Convention |
| --- | --- |
| Framework | cargo test (built-in) |
| Unit tests | #[cfg(test)] mod tests inline with source |
| Integration | tests/*.rs (separate binary, full crate access) |
| Benchmarks | benches/*.rs with criterion crate |
| Assertions | assert_eq!, assert!(matches!(...)), assert!(result.is_err()) |
| Error testing | #[should_panic(expected = "message")] or match on Result::Err |
| CI command | cargo test --workspace |
| Async tests | #[tokio::test] attribute |

Fixtures: tempfile::TempDir for filesystem tests (drops on scope exit). include_str!() for static test data.

Go

| Aspect | Convention |
| --- | --- |
| Framework | go test (built-in) |
| Unit tests | *_test.go co-located with source |
| Table-driven | []struct{name string; input X; want Y} + t.Run(tt.name, ...) |
| Fuzz tests | Fuzz* functions with testing.F (Go 1.18+) |
| Benchmarks | Benchmark* functions with testing.B |
| Golden files | testdata/golden/ with -update flag |
| CI command | go test ./... |
| Parallel | t.Parallel() at top of each independent test |

Fixtures: t.TempDir() for temp directories (auto-cleanup). testdata/ for static files (ignored by Go toolchain).

Python

| Aspect | Convention |
| --- | --- |
| Framework | pytest |
| Test files | tests/test_*.py |
| Parametrize | @pytest.mark.parametrize("name", [...]) |
| Fixtures | @pytest.fixture in conftest.py |
| Assertions | Plain assert (pytest rewrites for readable diffs) |
| CI command | uv run pytest |
| Config | [tool.pytest.ini_options] in pyproject.toml |

Minimal pyproject.toml config:

[tool.pytest.ini_options]
testpaths = ["tests"]
pythonpath = ["src"]

TypeScript

| Aspect | Convention |
| --- | --- |
| Framework | vitest |
| Test files | *.test.ts co-located with source |
| Structure | describe() / it() / expect() |
| Mocks | vi.fn(), vi.mock(), mockResolvedValue() |
| CI command | npx vitest run |

File Organization

| Test Type | Location |
| --- | --- |
| Unit (Rust) | Inline #[cfg(test)] mod tests in source file |
| Unit (Go) | *_test.go in same package |
| Unit (Python) | tests/test_<module>.py |
| Unit (TS) | <module>.test.ts in same directory |
| Integration (Rust) | tests/*.rs at crate root |
| Integration (Go) | *_test.go with //go:build integration tag |
| Golden files | testdata/golden/ (Go), tests/fixtures/ (Rust/Python) |
| Benchmarks (Rust) | benches/*.rs |
| Fuzz corpus | testdata/fuzz/ (auto-managed by Go toolchain) |

Test helpers go in the test file, unexported. Do not create shared testutils/ packages — the duplication cost is lower than the coupling cost.

Fixtures & Mocks

Fixtures

| Language | Pattern | Example |
| --- | --- | --- |
| Rust | tempfile::TempDir | let dir = TempDir::new().unwrap(); |
| Rust | include_str!() | include_str!("fixtures/sample.yaml") |
| Go | t.TempDir() | dir := t.TempDir() (auto-cleanup) |
| Go | testdata/ | filepath.Join("testdata", "input.json") |
| Python | @pytest.fixture | Scoped setup/teardown in conftest.py |
| Python | tmp_path | Built-in pytest fixture for temp dirs |
| TS | Factory functions | createMockFetch(200, {...}) |

Mocking Rules

  • Prefer real implementations. Use TempDir and real git commands over git mocks. Use real HTTP servers over fetch mocks when practical.
  • Mock at boundaries. Only mock external services (APIs, databases) and only at the interface boundary.
  • Never mock what you own. If you need to mock your own code, the design needs refactoring — extract an interface.
  • Go: Use interfaces for test doubles. No mocking framework needed.
  • Rust: Use trait objects or generic type parameters for test substitution.
  • TypeScript: vi.fn() and vi.mock() for external dependencies only.

CI Strategy

| Test Type | CI Stage | Trigger | Time Budget |
| --- | --- | --- | --- |
| Unit + lint | ci.yml | Every PR | < 5 min |
| Integration | ci.yml | Every PR | < 10 min (cached) |
| Fuzz smoke | ci.yml | Every PR | 10-30s per target |
| Full fuzz | Scheduled | Nightly/weekly | 5-30 min |
| Benchmarks | Manual | Release prep | Varies |
| E2E / dogfood | release.yml | Post-release | Varies |

All test types run with just check locally. CI mirrors just check exactly — no CI-only test logic.

What NOT to Test

  • Third-party behavior — don’t test that serde serializes correctly or that os.MkdirAll creates directories
  • Private implementation details — if you need to export something just for testing, the boundary is wrong
  • Generated code — oag generates TypeScript clients; test the generator, not the output
  • Trivial accessors — a getter that returns a field does not need a test
  • Implementation mirrors — if your test duplicates the logic it tests, it proves nothing
  • Exact error messages — test error types or categories, not wording (it changes)

Gotchas

  • Float comparison: Never assert_eq!(f64, f64). Use an epsilon: assert!((a - b).abs() < 1e-10)
  • Go parallel + shared state: t.Parallel() runs subtests concurrently. Shared fixtures must be immutable or use sync.Mutex.
  • Python src/ discovery: Without pythonpath = ["src"] in pytest config, imports fail. Always configure this in pyproject.toml.
  • Rust integration tests: Each file in tests/ compiles as a separate binary. Group related tests in one file to reduce compile time.
  • Go golden file diffs: Use go-cmp for readable diffs instead of reflect.DeepEqual — the error messages are vastly better.
  • Flaky tests: If a test fails intermittently, it’s a design problem (shared state, timing, network). Fix the root cause; do not retry.