Why TDD matters for product managers, product owners, and stakeholders

Co-authored by the badass Diana Villanueva

Test Driven Development (TDD) is a software development technique practiced by a portion of the developer community. In this article we're going to focus on the benefits TDD provides for product managers, product owners, and stakeholders. If you're managing a project that is built without TDD, you're missing out on many of the benefits we discuss below. And if you're unfamiliar with TDD, you may notice some of the common problems we highlight, but you won't be able to suggest TDD as a solution. So it's important to learn how TDD can benefit your project and why your development team should be using it.

What does TDD look like?

Imagine you want to deliver a new feature that allows existing users to log in. In a typical development approach, developers build the login functionality without any automated tests and report the feature as done. With a TDD approach, developers write a test first. The test would say something like: “When the user exists and they log in with the correct username and password, they’re taken to their dashboard.” This test would immediately fail because the login functionality has not been built yet. Then, the developer would build the functionality that satisfies the test and the test would pass.
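
To make this concrete, here's a sketch of that first test in JUnit (the class and method names are hypothetical; your team's stack will differ):

// Written before any login code exists, so at first it won't even compile.
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class LoginTest {

    @Test
    public void existingUserWithCorrectCredentialsSeesTheirDashboard() {
        LoginService login = new LoginService(); // doesn't exist yet

        String destination = login.logIn("diana", "correct-password");

        assertEquals("/dashboard", destination); // fails until the feature is built
    }
}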

What are the benefits?

Before we jump into TDD, let's talk about the benefits of automated tests in general. What's the difference? If the developer built the login feature the traditional way and then wrote a test afterward to check that it worked, that would be an automated test without TDD. It's only when you write your automated test before the feature that you're practicing TDD.

Automated tests help with:

Regression and Bugs

Imagine you spend the whole day doing chores around the house and at the end of the day you realize you don’t remember where you left your keys, but know you had them at the beginning of the day. Now, you have to think back through the whole day and look everywhere you’ve been. Now imagine a different scenario where you check for your keys every five minutes. In this scenario, when you lose your keys you only have to think back the last five minutes. You’re going to find them much faster.

Automated tests work the same way. If you discover a bug in your production environment, it's much easier to find and fix it if you had a way of knowing that everything worked five minutes ago: “Whatever we did in the last five minutes caused the problem.” Automated tests are fast and repeatable. You can build an automated test suite that tests your entire app every five minutes. Automated tests on new features can catch issues before they're released to production. And if a problem does reach production, automated tests reduce the surface area of the problem space: the more automated tests you have, the less untested code there is to comb through.

To summarize, automated tests can quickly and frequently check that features work. The more frequently you check, the sooner you find and fix problems. The faster you fix problems, the more your developers can focus on building new features that deliver new user value.

Confidence to change code and deploy

You want to focus on writing stories that deliver user value, regardless of what part of the application they touch. With the traditional method of development, a lot of developer behavior is motivated by fear. If a new feature requires too much change, there's concern it will break existing features. There may be nasty parts of the code base that nobody is willing to touch, because if they break, it would take a while to notice or have devastating effects. When fear interferes with quality, features take longer to finish. Developers will think of creative ways to avoid interacting with the scary part of the code. These creative ways can't take a direct path through the bad code, so they're harder for others to understand. This is why bad code tends to cause a downward spiral of quality unless addressed. The best way to address the problem is to surround it with enough automated tests that developers feel confident it can be changed without breaking any functionality.

If you've ever been in this situation, you may have felt pushback from developers and pressure to prioritize different features. But if there were a trusted test suite in place, this would not be a major concern. The automated tests would tell you and the developer that the existing features still work.

This also gives you an alternative to the dreaded “rewrite”. Rewrites are expensive and difficult to pull off. There’s no guarantee the rewrite will turn out to be better quality. You don’t need to rewrite the app if you can confidently change it to whatever you want.

In addition, when you trust your automated test suite, you have a lot of confidence that your features work because they've all been tested. This confidence allows you to deploy to production frequently without stress. No more all-nighters.

Documentation

Here is a problem we often encounter: You have documentation that says when you put in a coupon, you get a discount. If the application code changes this behavior, the documentation will have incorrect information unless someone actively combs through it to keep it up to date. Keeping the documentation in sync with reality is expensive and not a huge value add relative to new features. In this context, it can be pragmatic to keep the documentation out of date. But, when someone reads it later they’ll be misled by its information. This makes people question its accuracy and when people are skeptical of the documentation, they tend to not read it. In the end, a lot of the effort to write it in the first place is wasted.

Let's compare this to automated tests: if the coupon code stops working, the tests fail. Automated tests tell developers when they're incorrect; traditional documentation does not. Automated tests act as a form of developer documentation that is always up to date. This helps velocity stay consistent even when new members join the team.

TDD is automated tests on steroids

With TDD, you’ll get all the benefits mentioned above, plus these additional ones:

Better test coverage

When developers write automated tests after the feature works, it’s easy for them to forget to test some functionality because they have to think back and go through every possible path. When forgotten paths are discovered later, developers lose confidence in their entire test suite. They think, “What else did I miss?” This puts the team in a “worst of both worlds” situation where they put effort into automated tests but don’t trust them so they spend effort on manual testing, too. It’s a skill to write testable code and the most straightforward way to write a feature isn’t usually testable. Therefore when developers write the feature first, the code must often be modified so automated tests can be written against it. This creates a “measure once, cut twice” situation where the developer builds the feature, tries to test it, but then realizes they have to rewrite the feature in order to do so. Then time constraints cause developers to skip a lot of tests they should be backfilling.

When developers practice TDD, they write the test first, then write the minimal amount of code to make it pass. This bypasses all the issues above. They're sure that all their code is tested: it wouldn't be there unless there was a test saying it needed to be. When you don't practice TDD, you have to manually check the application to get the same confidence in it.

Deliver user value with simple solutions

By writing a test first, developers constrain themselves to write the minimal amount of code to make the test pass. This focus prevents the common problem of over-engineering: Writing overly complex solutions when a simple one would suffice. Simple, minimalistic solutions maintain high quality. They’re easier to understand and there’s less to read through. This means they’re easier to maintain and change than complex, ornate solutions.

Easier for other apps to integrate with yours

When other systems work with yours, the developers of the other teams will appreciate convenient APIs. This will make your product more appealing and people will be more willing to leverage it. Test driven code is usually more pleasant to use. That's because when the test is written, the developer has to think, “I wish I had some code that I could call like this, even though I don't.” That's exactly how anyone integrating with your APIs is going to think. In other words, TDD code is written the way people want to use it. The traditional or test-after approach doesn't usually work this way. Instead, the focus is on making the feature easiest to implement, and that's not always the most convenient way to use it. As a result, the design of the code becomes an afterthought.

PM and Dev agreement on Done

TDD encourages and documents conversations about the definition of “done”. If the feature says, “when the user enters the correct coupon code, the discount is 15%”, the developer can write a test that says exactly that. TDD makes this test the first step for the developer, so the conversation has to happen early. With traditional processes, there's a tendency for these conversations to happen after the feature is already written. It's very expensive to discover that the devs and PM are out of sync at that point, because the feature may have to be rewritten and retested.
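
As a sketch, that agreement could be captured in a test like this (Checkout and the coupon code are hypothetical):

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class CouponDiscountTest {

    @Test
    public void correctCouponCodeGivesAFifteenPercentDiscount() {
        Checkout checkout = new Checkout();           // hypothetical class
        checkout.addItemCosting(100.00);

        checkout.enterCouponCode("SAVE15");

        assertEquals(85.00, checkout.total(), 0.001); // 15% off
    }
}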


Hopefully these examples have shown you the value of TDD for your product. TDD is not only a developer tool; it is a methodology that provides better test coverage, which in turn increases confidence in your product and deployments. It focuses developers on the minimum amount of code required to deliver user value. And it keeps velocity consistent over time by eliminating unnecessary feature rewrites and reducing time spent on bugs.

I’m OK with some DRY violations. Here’s why

If you haven't heard of DRY, it stands for “Don't Repeat Yourself”, and it says: “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.”

I first learned about this principle in The Pragmatic Programmer and immediately connected with it. I think that's because it was so easy to apply compared to other software engineering principles. Most principles are very abstract, but this one is concrete: when you see two methods that are similar, combine them into one and have both places use the single method.

The benefit of fixing DRY violations: if code is copied and pasted in two places and you discover a bug in one of them, you may forget to fix it in the other. You spent all that time fixing it the first time, and now you're going to spend more time fixing it again. Bugs are very expensive, and the second fix could have been prevented by DRY'ing up the code. Even if you realize upfront that you need to fix the bug in both methods, you still have to put in the effort to fix it twice.

Also, we spend much more time reading code than writing it. If there is duplicate code, there is even more code to read to understand the code base.

But I rarely find two identical methods. Usually two methods are a close match rather than an exact match. To merge the two into one, I have to add a flag parameter that conditionally skips the parts that differ. Or I have to extract a piece of one method so that the extraction matches the other.
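
Here's a sketch of what I mean (the pricing rules are made up for illustration):

class Pricing {

    // Before: two near-duplicate methods.
    double priceForMember(double base) {
        double discounted = base * 0.90;        // members get 10% off
        return discounted + shipping(discounted);
    }

    double priceForGuest(double base) {
        return base + shipping(base);
    }

    // After DRY'ing them up: one method, with a flag to handle the difference.
    double price(double base, boolean isMember) {
        double subtotal = isMember ? base * 0.90 : base;
        return subtotal + shipping(subtotal);
    }

    private double shipping(double subtotal) {
        return subtotal >= 50 ? 0.0 : 5.0;      // made-up shipping rule
    }
}

The duplication is gone, but every caller now has to pass a flag, and the merged method has one more branch to read.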

Don't get me wrong. This often has a lot of benefits. But these little compromises make the code slightly more complicated. It's even worse when the functionality of the DRY'd method needs to diverge in different ways depending on who's calling it. Sometimes this works out, but other times you wish you were working with two different methods instead of one shared one. In my experience, it takes a long time to realize that a merged method should really be two separate methods. And until you figure this out, the code tends to get much more complex than it would have if you'd left the methods as duplicates.

Let's summarize so far: DRY violations are bad because they increase the size of your code base and can be a cause of bugs. But when you fix these DRY violations, your code can also become more complicated and therefore harder to read.

When I'm practicing TDD (Test Driven Development), DRY violations are low on my list of things to deal with. When you're practicing TDD by the book, these types of bugs are usually caught by a failing acceptance test. An acceptance test tells me if the feature works or fails. That gives me the luxury of keeping DRY violations around longer than I'd feel comfortable with in an untested code base, even if a bug is duplicated in two methods.

TDD also tends to increase code quality much more than fixing DRY violations does. It's a false dichotomy, but if I had to choose between TDD and fixing DRY violations, I'd choose TDD. This is why I'm comfortable leaving DRY violations around unless a test is driving me towards fixing them.

That’s why there’s a guideline we go by where I work. We avoid fixing questionable DRY violations until there are 3 duplicates. With a well tested code base, the advantages of fixing DRY violations don’t necessarily outweigh the disadvantages. That’s why we often wait and see.

Use Mocks Sparingly

Whether I write my tests in the classical style or the mockist style, I always find that my tests are higher quality when I avoid mocks. Some people must be thinking, “How can you use the mockist style without mocks?” Well, it turns out there's a very strict definition of what a mock is, and many developers use the word incorrectly. The best definitions I've found are from Martin Fowler's article, Mocks Aren't Stubs. To quote the article:

  • Dummy objects are passed around but never actually used. Usually they are just used to fill parameter lists.
  • Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production (an in memory database is a good example).
  • Stubs provide canned answers to calls made during the test, usually not responding at all to anything outside what’s programmed in for the test. Stubs may also record information about calls, such as an email gateway stub that remembers the messages it ‘sent’, or maybe only how many messages it ‘sent’.
  • Mocks are what we are talking about here: objects pre-programmed with expectations which form a specification of the calls they are expected to receive.

I’m using these definitions when I say that you should avoid mocks in your tests. When I mock objects in Java, I’m often using a library named Mockito. If you write a test like this, you know you’re using a mock:


// Test that subject.doStuffAndReturnList() clears the list.
List<String> mockedList = mock(List.class);
Subject subject = new Subject(mockedList);

subject.doStuffAndReturnList();

// Fails unless clear() was called on the mock.
verify(mockedList).clear();

The clear giveaway here is the call to verify. I avoid this whenever I can and consider it a smell. In this example, the verify checks that clear is called on the mockedList when you call the doStuffAndReturnList method on subject.

Here's why I don't like this: I do not get the confidence I want from this test. Let's say I want the mockedList to be empty after I call doStuffAndReturnList(). How do I know that something isn't added to the list after clear is called? Well, you can use mocks to verify this, too:

verify(mockedList, never()).add(anyObject());

All good? Well, the problem is there are many ways to put an element into a list. Maybe addAll was used instead, and your test isn't catching that. In other words, the way that the mockedList is used inside the subject should be an implementation detail. But once you use a mock (as opposed to a stub), these details are now exposed.

Here’s what I consider to be a superior test:


// Test that subject.doStuffAndReturnList() clears the list.
List<String> realList = new ArrayList<>();
realList.add("foo");
Subject subject = new Subject(realList);

List<String> result = subject.doStuffAndReturnList();

// Passes no matter how the subject empties the list.
assertThat(result, empty());

Rather than testing how the subject’s dependencies are implemented, I instead check that the subject works the way I expect. This gives me the freedom to populate the list any way I choose but know that it’s empty after doStuffAndReturnList() is called.

There are some exceptions to this rule that I practice. Once in a while I create a method that returns void, but I really want to make sure it's called in a test. An example is a validate method that throws an exception on bad input: there's no return value to assert on, so I verify the call. This is a rare occurrence.
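
Here's a sketch of that exception (Validator and Subject are made up for the example):

import static org.mockito.Mockito.*;
import org.junit.Test;

public class SubjectTest {

    interface Validator {
        void validate(String input); // throws on bad input, returns nothing
    }

    static class Subject {
        private final Validator validator;
        Subject(Validator validator) { this.validator = validator; }
        void save(String input) {
            validator.validate(input); // must run before persisting
            // ... persist ...
        }
    }

    @Test
    public void saveValidatesItsInput() {
        Validator validator = mock(Validator.class);
        Subject subject = new Subject(validator);

        subject.save("some input");

        // validate() returns void, so there's no result to assert on;
        // verifying the call is the rare case where I reach for verify().
        verify(validator).validate("some input");
    }
}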

What I like about avoiding mocks is my code ends up being more “functional” in style. I pass in parameters to a function and assert that the results look the way I expect. The implementation details are hidden from the test, even if I’m practicing mockist style TDD.

Testing On Rails: The prescriptive way I test a feature from start to finish

In this article I'm going to give a prescriptive way of practicing TDD. This is the protocol I currently use, and it's helped my development a lot precisely because it leaves so little to interpretation.

Through most of my career, I didn't have someone to pair with who was more experienced than me when it came to TDD, so I had to pick up a lot of things by doing them the wrong way first. That's not so bad, but what was really crippling was stalling whenever there was a fork in the road.

My goal for this article is to tell you whether to go left or to go right when you reach those forks. I want to remove the ambiguity that’s so common when you try to learn TDD one blog article at a time.

Most of these practices are not unique. In fact, here is a good video by Justin Searls that espouses this practice: https://www.youtube.com/watch?v=VD51AkG8EZw This talk is so aligned with the way I test that you can basically consider this article a summary of his video. But there are some places where I go into more depth, and some people may prefer an article to a video.

Here’s the example we’ll use for the article: When you visit the user list page, you should see a list of usernames. This list comes from a 3rd party service that is not a part of your application. Some other team builds and maintains it.

Step 1: Write an end to end test

The first thing we’re going to do is write a test that connects to a running version of our app, goes to the user page and checks that there is a list of usernames on the page. This is going to be the first test we write, but for this feature it will be the last test to pass.

It's not going to test edge cases like “What if there are no users?” or “What if a username has newlines in it?” Rather, this test is going to cover what we call “the happy path”. All of those edge cases will be tested in other kinds of tests, because end to end tests are relatively slow and the point here is to check that the parts of your code are integrated together correctly.

The user service your code will call out to is not part of your code. You don't want to be affected by it going down or by its data changing each run. For this reason, the code you test against will call a mocked out user service, not the real one. It will always return the same response when you visit the list page. Whether the real service works is a concern you should care about, just not in this test, because this test is relatively slow and you want your test suite as a whole to be very fast.
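
Here's a sketch of that end to end test, assuming Selenium WebDriver and JUnit (the URL and the page's markup are assumptions for this example):

import static org.junit.Assert.assertFalse;
import java.util.List;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class UserListEndToEndTest {

    @Test
    public void userListPageShowsUsernames() {
        WebDriver browser = new ChromeDriver();
        try {
            // Connect to a running version of the app...
            browser.get("http://localhost:8080/users");

            // ...and check that the page shows a list of usernames.
            List<WebElement> usernames = browser.findElements(By.className("username"));
            assertFalse(usernames.isEmpty());
        } finally {
            browser.quit();
        }
    }
}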

Step 2: Write mockist style tests all the way down

If you're unfamiliar with mockist style testing, here's how it works: You write a test for a class that depends on other code. Instead of trying to implement all the code at once, you write some high level code that delegates to lower level implementations. These lower level implementations are mocked out instead of implemented. Here's a sketch of the example test I would write first (the class and method names are illustrative):
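
import static java.util.Arrays.asList;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.contains;
import static org.mockito.Mockito.*;
import java.util.List;
import org.junit.Test;

public class UserListControllerTest {

    // The dependency is mocked out; only the controller is under test.
    UserRepository userRepository = mock(UserRepository.class);
    UserListController subject = new UserListController(userRepository);

    @Test
    public void showsTheUsernamesFromTheRepository() {
        when(userRepository.findAllUsernames()).thenReturn(asList("alice", "bob"));

        List<String> result = subject.userList();

        assertThat(result, contains("alice", "bob"));
    }
}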

This is what I would write soon after I wrote my end to end test. Actually, if I were using a single page app framework like Angular, I'd probably test the JavaScript first. But if I had no JavaScript relevant to the end to end test, I'd write the above test next.

What’s worth noting is I’d write this out and it wouldn’t compile. Once I got to my first compile error, I’d fix it and go back to writing the test. For example, UserListController wouldn’t exist. The test would fail to compile for that reason, then I’d create it. The same goes for the UserRepository.

In both of these cases, I'd do only as little as necessary to get the test to compile, then I'd go back to writing the test. In the case of the UserListController and UserRepository, that means I'd only type out the method signature and return null if necessary. I write the body of my test before I create the fields, because I don't really know which fields I'll need until I try to use them.

The reason I have a field named subject is it tells me which class I’m testing in this test. When I lose track of where I am, it’s a quick reminder and it’s more convenient than scrolling to the top of the file.

I write this test and think of all the edge cases I avoided in the earlier section. After I’ve done that, I follow this same process but with a new test named UserRepositoryTest. I always test outside in: I test the class that has dependencies before the class that is a dependency. This ensures I write the smallest amount of code necessary to get the tests to pass and ensures that I’ll use every class I create. When you test inside-out, your classes often fail to fit together and you end up creating things that you didn’t really need in the end.

Here’s a practice I often follow that is non-idiomatic in a lot of languages, but has worked out very well for me so far: I try to keep one method per class. I try to think of this as the Single Responsibility Principle to the extreme. The end result is a proliferation of little classes and a lot less mental overhead. It’s straightforward to recognize if a class is doing more than it’s supposed to. If you have some unrelated code you don’t have to look around for the class that it may belong in because that will almost always be a brand new class. And, it takes a lot of mutable state out of your system.

I know this sounds weird, but so far my code has been very maintainable from following this practice. I suggest you give it a try and I suspect you’ll find the same.

You’re going to go outside-in all the way down to the 3rd party system and stop there. When you reach that point, all your tests from Step 2 should pass all the way down, but your end to end test should still be failing because it’s connecting to the real 3rd party service and it returns data that you can’t control. The last step is to mock that out. But before that we need to test the way the 3rd party service works.

Step 3: Write tests for the real 3rd party service

Depending on how much uncertainty/risk is involved, this step may be swapped with step 2. For example, if you have no idea what parameters the API needs, you may want to write these tests before you write your unit tests all the way to the API. You could end up refactoring all the way back up if you realize you need to pass in a parameter you didn’t know about. But, if I can get away with it, I prefer to do this at step 3.

The API you need to call probably isn't exactly what you would like. Even if it is today, it may change tomorrow. So I like to wrap it with a thin layer of my own code that separates the 3rd party code from mine. Then I write my tests against this thin layer that uses the 3rd party code. I don't test the 3rd party code directly; I test it through my thin layer.
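
Here's a sketch of such a thin wrapper, assuming the service returns a JSON list of users over HTTP (the endpoint and field names are assumptions):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The only class that knows the 3rd party service's URL and payload shape.
public class UserServiceClient {

    private static final Pattern USERNAME = Pattern.compile("\"username\"\\s*:\\s*\"([^\"]+)\"");
    private final HttpClient http = HttpClient.newHttpClient();
    private final String baseUrl; // real service in production, fake server in end to end tests

    public UserServiceClient(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    // Callers get plain usernames and never see the service's JSON.
    public List<String> fetchUsernames() throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/users")).GET().build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        List<String> usernames = new ArrayList<>();
        Matcher matcher = USERNAME.matcher(response.body());
        while (matcher.find()) {
            usernames.add(matcher.group(1)); // a real implementation would use a JSON library
        }
        return usernames;
    }
}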

These tests are slow and unstable because I can't control these 3rd party services. They could be down, or their schema could change without warning. For that reason, I run these 3rd party tests in a different phase from my end to end tests and my mockist style tests. Just because one of these 3rd party tests fails doesn't mean there's an issue with my code.

When Steps 1-3 are complete, the end to end test could still be failing. There is one last step:

Step 4: Mock out the 3rd party service

Near the bottom of your tests from Step 2, you should reach a point where you need to call the 3rd party service. In Step 3, you tested the real 3rd party service. In this step, you're going to mock out the 3rd party service so that your end to end tests stay stable if and when the real service goes down.

The thin layer of code we wrote around the 3rd party service needs to be configurable so that it can either use the real service or the mock service that we are going to test in this step.

I like to configure this with a URL that points to a fake service when running my end to end tests. A tool to help with this (if you're using Java) is MockServer.
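
For example, here's a sketch of standing up a fake user service with MockServer (the port, path, and response body are assumptions):

import static org.mockserver.model.HttpRequest.request;
import static org.mockserver.model.HttpResponse.response;
import org.mockserver.integration.ClientAndServer;

public class FakeUserService {

    public static ClientAndServer start() {
        ClientAndServer fakeService = ClientAndServer.startClientAndServer(1080);

        // Always return the same canned users, so the end to end test
        // sees stable data no matter what the real service is doing.
        fakeService.when(request().withMethod("GET").withPath("/users"))
                   .respond(response().withStatusCode(200)
                                      .withBody("[{\"username\":\"alice\"},{\"username\":\"bob\"}]"));
        return fakeService;
    }
}

The end to end tests would then configure the thin wrapper's URL to point at http://localhost:1080.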

Alternatively, you could create a stub class that implements the thin wrapper from Step 3, but I prefer a URL because it can be valuable to test the network trip to the fake server.

Now, there is one situation left to deal with if it hasn't already been covered: if the 3rd party schema changes, we have to change both our 3rd party tests and our mock server requests/responses to reflect the change. Changing only one or the other is very dangerous, because your tests could be passing while the real code isn't integrating correctly.
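
One way to enforce this (a sketch, not the only way) is to keep the canned path and response in a single shared fixture that both the 3rd party tests and the fake server read from, so a schema change forces both sides to move together:

public class UserServiceContract {

    public static final String USERS_PATH = "/users";

    // The 3rd party tests assert the real service still matches this shape;
    // the fake server serves exactly this body.
    public static final String USERS_RESPONSE =
            "[{\"username\":\"alice\"},{\"username\":\"bob\"}]";
}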

But once you’ve figured out how to do that, your end to end test should be consistently passing, your feature should be complete, and you’re ready to repeat Steps 1-4 on your next feature.

11 things I learned about TDD by looking at the tests in the JUnit source

JUnit is a unit testing framework for the Java programming language. It has been important in the development of test-driven development and is one of a family of unit testing frameworks, collectively known as xUnit, that originated with SUnit.

JUnit was originally written by Erich Gamma and Kent Beck. The latter literally wrote the book on Test Driven Development. I thought it would be worthwhile to look at some of the tests in the library and draw attention to some interesting things about them, because it may change people's ideas of how to practice TDD.
