Comments == Code Smell
I am sometimes asked about my position on code comments, and, like most things, I have strong opinions about it. Two kinds of comments exist:
- JavaDoc-style comments (which encompasses JavaDoc, XMLDoc, RDoc, etc), which are designed to produce developer documentation at a high level (class and method names and what they do)
- In-line comments, generally scattered around the code to indicate a note from developer to developer
Both kinds of comments represent different smells, each with different odors depending on the target.
What makes comments so smelly in general? In some ways, they represent a violation of the DRY (Don’t Repeat Yourself) principle, espoused by the Pragmatic Programmers. You’ve written the code, now you have to write about the code. In a perfect world, you’d never hove to write comments for this purpose: the code will be expressive enough that someone who reads it will understand it. Two things help achieve this result: expressive, readable languages and the composed method pattern.
The language makes a big difference as to the readability of the code. If you write in assembly language, you’re pretty much forced to write comments; no one on earth can read the code directly. As languages have matured, you can get much closer to the ideal of self-documenting code (which was explicitly attempted in Literate Programming). Especially in the modern wave of non-ceremonious languages (like Ruby, Groovy, Scala, etc), you can craft extremely readable code. To this end, ThoughtWorks projects generally avoid the more magical features of languages (like the implicit global variables in Ruby, for example) because it hurts readability. One of the side effects of dynamic typing is outstanding method names. The method name is the only vector of information about what the method does (you can’t crutch on return or parameter types), so methods tend to be named much better on the dynamic language projects upon which I’ve worked.
The second key to readable code is not to have too much of it, especially at the method level. In Smalltalk Best Practice Patterns, Kent Beck defines composed method (which I write about extensively in The Productive Programmer). Composed method encapsulates the idea that all your methods should do one and only one thing, making them as small as possible. The side effect of this discipline leads to public methods that mainly consist of calls to a large number of very small private methods, each of which do only one thing. Composed method has lots of beneficial side effects on your code: small methods are easier to test, you end up with really low cyclomatic complexity for your methods, you discover and harvest reusable code chunks more easily, and your code is readable. This last one brings us back to the topic of comments. If you use composed method, you’ll find much less need to have comments to delineate sections of code within methods (actual method calls do that), and you’ll find that you use better method names.
Now let’s talk about how this applies to the two types of comments. First, where are comments indeed useful (and less smelly)? If you are writing an API, you need some level of generated documentation so that people can use your API. JavaDoc style comments do this job well because they are generated from code and have a fighting chance of staying in sync with the actual code. However, tests make much better documentation than comments. Comments always lie (maybe not now, but on a long enough timeline, all comments will become outdated). Tests can’t lie or they fail. When I’m looking at work in progress on projects, I always go to the tests first. The comments may or may not be there, but the tests define what’s now done and working.
We were on a project where the client insisted on Javadoc comments for every public class and method. We started the project adding those comments, but eventually stopped. When doing agile development, you don’t want anything that hampers refactoring. Having comments in place caused a dilemma: do I refactor the method and change the comment (the most work), refactor the method and leave the old comment (with the theory being that I’ll change it again later, and would rather just have to update the comment once), or not refactor? No good options here. Having pervasive comments discourages refactoring because it adds significant extra friction. On this project, we abandoned commenting as we went along. The last week of the project we literally did nothing but go back and add comments to the code, which worked well because the code base had settled down by that point. But notice what’s lurking in wait: the client wanted all the comments there to make it easier to maintain in the future. But who’s to say that whoever maintains that code will keep the comments up to date? You have the same DRY violation as before. If they maintain the tests, they have to change as code changes because they are executable.
The new wave of Behavior Driven Development tools (like JBehave, RSpec, and easyb make this “executable specification” style of comment + test feasible. I expect to see the usage of these tools skyrocket because they give you documentation that has a fighting chance of staying up to date.
Inline comments are almost always a smell. The only legitimate use of inline comments is when you have some very complex algorithm that you need to have some thoughts about beside the code. Otherwise, the presence of inline comments indicates that you’ve written code that needs explanation, meaning that it cries out for refactoring. I frequently troll code bases upon which I’m working to look for inline comments so that I can refactor the code to eliminate the need for them.
Comments are a great example of something that seems like a Good Thing, but turn out to cause more harm than good. Fortunately, we’ve figured out how to achieve the same benefits that comments allegedly provide with tests, particularly BDD-style tests.