I’m OK with some DRY violations. Here’s why

If you haven’t heard of DRY, it stands for “Don’t Repeat Yourself,” and it says: “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.”

I first learned about this principle in The Pragmatic Programmer and immediately connected with it. I think that’s because it was so easy to apply compared to other software engineering principles. Most principles are very abstract, but this one is easy to follow: when you see two methods that are similar, combine them into one and have both call sites use the single method.

The benefit of fixing DRY violations is this: if code is copied and pasted in two places and you discover a bug in one of them, you may forget to fix it in the other. You spent all that time fixing it the first time, and now you’re going to spend it again when the bug resurfaces. Bugs are very expensive, and the second one could have been prevented if you had DRY’d up the code. Even if you realize up front that the bug needs fixing in both methods, you still have to put in the effort to fix it twice.

Also, we spend much more time reading code than writing it. If there is duplicate code, there is even more code to read to understand the code base.

But I rarely find two identical methods. Usually the two methods are a close match rather than an exact one. To merge them into one, I have to add a flag parameter that tells the shared method to conditionally skip the part that differs. Or I have to extract a part of one of the methods so that the extracted piece matches another method.
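Here is a minimal sketch of that flag-based merge in Python. The function names, the `include_total` flag, and the invoice/quote scenario are all hypothetical and chosen only for illustration.

```python
# Two hypothetical near-duplicates: identical except for the total line.
def invoice_summary(items):
    lines = [f"{name}: {price:.2f}" for name, price in items]
    lines.append(f"Total: {sum(price for _, price in items):.2f}")
    return "\n".join(lines)


def quote_summary(items):
    lines = [f"{name}: {price:.2f}" for name, price in items]
    return "\n".join(lines)


# The DRY'd version: one method, plus a flag that conditionally
# skips the part the two originals did not share.
def summary(items, include_total):
    lines = [f"{name}: {price:.2f}" for name, price in items]
    if include_total:
        lines.append(f"Total: {sum(price for _, price in items):.2f}")
    return "\n".join(lines)


items = [("Widget", 2.5), ("Gadget", 4.0)]
print(summary(items, include_total=True))
```

The duplication is gone, but every caller now has to know about the flag, and each new difference between invoices and quotes adds another branch to the shared method.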

Don’t get me wrong. This often has a lot of benefits. But these little compromises make the code slightly more complicated. It’s even worse when the functionality of the DRY’d method needs to diverge in different ways depending on who’s calling it. Sometimes this works out, but other times you wish you were working with two different methods instead of one shared one. In my experience, it takes a long time to realize that a merged method should really be two separate methods. And until you figure this out, the code tends to get much more complex than it ever would have if you had left the methods as duplicates.

Let’s summarize so far: DRY violations are bad because they increase the size of your code base and can be a cause of bugs. But when you fix these DRY violations, your code can also become more complicated and therefore more difficult to read.

When I’m practicing TDD (Test-Driven Development), DRY violations are low on my list of things to deal with. When you’re practicing TDD by the book, a bug that lives on in a forgotten duplicate is usually caught by a failing acceptance test. An acceptance test tells me whether the feature works or fails. That gives me the luxury of keeping DRY violations around for longer than I’d feel comfortable with in an untested code base, even if a bug is duplicated in two methods.
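As a rough illustration, here is a small Python sketch in which a bug was fixed in one copy of a duplicated method but forgotten in the other. The cart/wishlist functions and the `run_acceptance_checks` helper are hypothetical, and the plain boolean checks are only a stand-in for real feature-level acceptance tests.

```python
# Two hypothetical copy-pasted methods. A negative quantity should be
# treated as zero; the fix was applied to one copy only.
def cart_line_total(price, quantity):
    quantity = max(quantity, 0)  # bug fixed here
    return price * quantity


def wishlist_line_total(price, quantity):
    return price * quantity  # forgotten duplicate still has the bug


def run_acceptance_checks():
    """Return the names of the feature checks that fail."""
    checks = {
        "cart ignores negative quantity": cart_line_total(5, -1) == 0,
        "wishlist ignores negative quantity": wishlist_line_total(5, -1) == 0,
    }
    return [name for name, passed in checks.items() if not passed]


print(run_acceptance_checks())
```

As long as each feature has its own check, the forgotten copy shows up as a failure by name instead of slipping through silently.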

TDD also tends to increase code quality much more than fixing DRY violations does. It’s a false dichotomy, but if I had to choose between TDD and fixing DRY violations, I’d choose TDD. This is why I’m comfortable leaving DRY violations around unless a test is driving me towards fixing them.

That’s why there’s a guideline we go by where I work: we avoid fixing questionable DRY violations until there are three duplicates. With a well-tested code base, the advantages of fixing DRY violations don’t necessarily outweigh the disadvantages. That’s why we often wait and see.