Are tests like double-entry bookkeeping?

Oct 23, 2023

It’s sad that test-driven development (TDD) is conflated with the general practice of writing tests. “I don’t do TDD” doesn’t mean “I don’t write tests”, it just means “I don’t follow the specific discipline of writing a failing test before writing some production code, and fixing the test before continuing”. I think John Ousterhout speaks for a lot of us when he says:

“Although I am a strong advocate of unit testing, I am not a fan of test-driven development. The problem with test-driven development is that it focuses attention on getting specific features working, rather than finding the best design.”

Unfortunately, Clean Code does not see the nuance here, likening tests to double-entry bookkeeping.

This is a harmful analogy for beginners. In double-entry bookkeeping, every entry has an equal and opposite entry in another account. The worst kinds of tests have an equal and opposite entry in production code.

Consider the following production code (adapted from this excellent article):

class PaymentProcessor(private val creditCardServer: CreditCardServer) {

    fun processPayment(creditCard: CreditCard, money: Money) {
        if (!creditCardServer.isServerAvailable()) {
            throw NotAvailableException()
        }

        val transaction = creditCardServer.beginTransaction().transaction
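        // Charge the requested amount rather than a hard-coded 500
        // (assumes Money exposes an integer `amount`).
        val payment = creditCardServer.pay(transaction, creditCard, money.amount)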

        if (payment.isOverMaxBalance()) {
            throw OverMaxBalanceException()
        }
    }
}

A naive attempt at a test looks like this:

fun testCreditCardIsCharged() {
    val paymentProcessor = PaymentProcessor(mockCreditCardServer)
    whenever(mockCreditCardServer.isServerAvailable()).thenReturn(true)
    whenever(mockCreditCardServer.beginTransaction()).thenReturn(mockTransactionManager)
    whenever(mockTransactionManager.getTransaction()).thenReturn(transaction)
    whenever(mockCreditCardServer.pay(transaction, creditCard, 500)).thenReturn(mockPayment)
    whenever(mockPayment.isOverMaxBalance()).thenReturn(false)
    
    paymentProcessor.processPayment(creditCard, Money.dollars(500))
    
    verify(mockCreditCardServer).pay(transaction, creditCard, 500)
}

Notice how the test mirrors the production code at the beginning, middle, and end. Both start with a check for server availability and conclude with a check for max balance, and in between the test rebuilds the transaction manager out of mocks and stubs. It’s well known that tests like this are brittle and tied to implementation details. Derided as “change detectors”, they make refactoring processPayment difficult, and because they break whenever the implementation changes, a passing run gives us little confidence that our code is correct.

[Image: a tax form with a calculator and pen on the left. Photo by Kelly Sikkema on Unsplash]

I’ve seen developers arrive at the “double-entry” thesis of tests in the following way. Consider the following production code:

class HappyBirthdayViewModel(
    private val userRepository: UserRepository,
    private val clock: Clock
) {
    init {
        val birthday = userRepository.user.getField("brthiday")
        if (birthday == clock.today()) {
            _viewState = _viewState.copy(party = true)
        }
    }
}

Notice the spelling mistake in the call to getField. When this bug is caught via an over-specified test:

fun testBirthday() {
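    // `User` is the assumed type of userRepository.user; `clock` is a test fixture like userRepository.
    val user = mock<User>()
    whenever(userRepository.user).thenReturn(user)
    val viewModel = HappyBirthdayViewModel(userRepository, clock)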
    verify(user).getField("birthday")
}

it’s easy to jump for joy and conclude that tests are there to become a second copy of the code (“write it out twice, little Johnny, to find mistakes”). In reality, an error like this is better caught by better design (in this case, avoiding lookup via strings), static analysis, or code review.
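To make the “better design” option concrete, here is a minimal sketch in which the birthday is a typed property instead of a string key, so a misspelling such as `brthiday` would fail to compile rather than wait for a test. The shapes of `User`, `UserRepository`, `Clock`, and `ViewState` are assumptions made up for illustration (the original app’s types are not shown), and the comparison by month and day is likewise a simplification of mine.

```kotlin
import java.time.LocalDate
import java.time.MonthDay

// Hypothetical types for illustration; the post does not show the real ones.
data class User(val name: String, val birthday: MonthDay)

interface UserRepository { val user: User }
interface Clock { fun today(): LocalDate }

data class ViewState(val party: Boolean = false)

class HappyBirthdayViewModel(
    userRepository: UserRepository,
    clock: Clock,
) {
    var viewState = ViewState()
        private set

    init {
        // `birthday` is a compiled property: writing `brthiday` would not build.
        val birthday = userRepository.user.birthday
        if (birthday == MonthDay.from(clock.today())) {
            viewState = viewState.copy(party = true)
        }
    }
}

// A behavior-level check: it reports the outcome, not which calls were made.
fun partyOn(today: LocalDate): Boolean {
    val repository = object : UserRepository {
        override val user = User("Ada", MonthDay.of(10, 23))
    }
    val clock = object : Clock {
        override fun today() = today
    }
    return HappyBirthdayViewModel(repository, clock).viewState.party
}
```

With this shape, a test only needs to ask whether the party flag is set on the right day; it never has to repeat the field name that the production code uses.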

The point here is that tests which rise above the implementation details and test behavior serve us better in the long run, even if they are harder to write. Whether these get called “integration tests” is moot; these looser kinds of tests make for a code base that is easier to refactor and more likely to be correct.
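As a rough illustration of what such a looser test might look like for the payment example, the sketch below swaps the mocks for a small in-memory fake and asserts only on the outcome. Everything here is a simplified, hypothetical stand-in: the `FakeCreditCardServer`, its `mostRecentCharge` helper, and the trimmed-down `CreditCardServer` interface are my assumptions, not the API from the original example.

```kotlin
// Simplified, hypothetical stand-ins for the types in the payment example.
data class CreditCard(val number: String)
data class Money(val dollars: Int)

class NotAvailableException : RuntimeException()

interface CreditCardServer {
    fun isServerAvailable(): Boolean
    fun pay(creditCard: CreditCard, dollars: Int)
}

// An in-memory fake: a small working implementation, not a scripted mock.
class FakeCreditCardServer(var available: Boolean = true) : CreditCardServer {
    private val charges = mutableMapOf<CreditCard, Int>()

    override fun isServerAvailable() = available

    override fun pay(creditCard: CreditCard, dollars: Int) {
        charges[creditCard] = dollars
    }

    // Test-only query for the last amount charged to a card.
    fun mostRecentCharge(creditCard: CreditCard): Int? = charges[creditCard]
}

class PaymentProcessor(private val creditCardServer: CreditCardServer) {
    fun processPayment(creditCard: CreditCard, money: Money) {
        if (!creditCardServer.isServerAvailable()) throw NotAvailableException()
        creditCardServer.pay(creditCard, money.dollars)
    }
}

// The test states the outcome ("the card was charged 500") and nothing else,
// so processPayment can be restructured without touching it.
fun testCreditCardIsCharged() {
    val server = FakeCreditCardServer()
    val card = CreditCard("4111-1111")

    PaymentProcessor(server).processPayment(card, Money(500))

    check(server.mostRecentCharge(card) == 500)
}
```

The trade-off is that someone has to write and maintain the fake, but the test no longer has an “equal and opposite entry” for each line of production code.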

The best tests are distant from the production code and impartial, like a wise judge on a raised bench. As far as analogies go, “tests are like double-entry bookkeeping” is what you would get if Newton’s third law and the reverse King Midas of software design had a baby.


Written by David Rawson
