July 23, 2016

Don't Comment Bad Code - Rewrite It

Many preach that it's a good practice to comment your code so that your buddies will know what the heck that code does and why it was written. There are many reasons why developers write code comments. One or two sound legit. Most of them don’t. Sit with me, and I'll show you why code comments are bad and how to write self-explaining code.

Acceptable comments

Copyright or legal comments
JavaDoc | PyDoc | Godoc | <AnyDoc> for public API

Unacceptable comments

Clarification (explain what the code does)
Explanation of intent (explain design decisions)
Warning of consequences
Apology for an ugly hack
TODO comments
Amplification (yelling at your readers)
Mumbling (writing an essay or just talking to yourself)
Redundant comments (just repeating the code)
Scary noise (repeating the code but with copy-paste errors)
Mandated comments (by the dev process)
Journal comments (you have Git)
Closing brace comments (change your editor)
Attribution and bylines (you have git blame)
Commented-out code (abominations – kill them on sight)
Non-local information (talking about code that is somewhere else)
Position markers
HTML in comments

In rare situations, I will accept one or two of those that fall under the "unacceptable comments" category. But I steadfastly refuse to accept writing comments as "the default". I refuse to accept the most common excuse developers use to write a comment – to explain what that code does.

The real questions regarding comments are when and how much. Take Ruby, for example. Idiomatic Ruby code speaks for itself. Let it do its talking. If your code is obvious, then don't explain it. The purpose of a comment is to explain the purpose of code if the code cannot explain its purpose. Can code explain itself?

A comment is a failure to express yourself in code. – Uncle Bob

Comments that explain what the code does

The saying goes that people who advocate against comments are just lazy enough to comment their code. I claim the exact opposite – people that write comments to explain their code are lazy enough to think and refactor their code to be self-explaining.

Unless you are writing code in Assembler or Fortran, you shouldn't find a good reason to spend your time writing comments instead of spending your time writing readable code. Modern programming languages are quite expressive. Name your classes, methods, and variables well, and your code will read like prose.

If your feel your code is too complex to understand without comments, your code is probably just bad. Comments don't make up for bad code. Writing good comments can be more complicated than writing good code. We are not novelists. We are developers.

Compare the code snippets:

(1)

// Is user eligible for a discount?
if ((user.getVerifications() & PHONE_CONFIRMED) && (user.getOrderCount() > 10)) {
  ...
}

(2)

if (user.isEligibleForDiscount()) {
  ...
}

Which one reads better? It takes less than a minute to extract a code block into a method with a good name. Getting rid of a comment is simply creating a function that says the same thing as the comment.

Comments that explain the intent

People tend to accept comments that explain the author's intent or the rationale behind a technical decision. If you need to clarify why you wrote the code in a particular way, other media could keep that knowledge for future generations.

Take Git. Your commits and pull request descriptions are a great place to explain "why" you have written that code and "why" in that particular way. The written discussion in a pull request is the best FAQ. If you have to explain all the "why"s around a change in the code, then you should take a step back and revise your development process, mainly how your team writes commit messages and pull requests.

It would help to establish a culture where developers write good commit messages. Do small meaningful commits with good messages and explain everything you think a future maintainer would be grateful to understand. Write a bigger commit message than the commit itself. You may end up being the one that benefits the most from these descriptions. People tend to forget.

Later on, your colleagues can use git blame and read through the commits to see how the code evolved to its current state, what the coders were thinking, what their intent was when they wrote it, what technical decisions were considered, and why. It is by far better practice than polluting your code with comments.

If Git is not your tool of choice, then go for Architectural Decision Records. An Architectural Decision Record (ADR) captures a single architectural decision. A project's collection of ADRs created and maintained constitutes its decision log. Keep those ADs in your project Git repo – close to your code so that you remember to add the “why” as an ADR when the code changes.

Make the code express your intent. Put the effort into the code, not in the comment. Explain yourself in code, not in comments. The name of a variable or class should tell the reader why it exists, what it does, and how it is used. If a variable needs a comment, then the name is not intention-revealing.

int d; // elapsed time in days

The name d reveals nothing. Let's try and improve the naming and remove the comment.

int elapsedTimeInDays;
int daysSinceLastUpdate;
int daysSinceLastModification;
int recordAgeInDays;

Pick any of these, and you are better off. Next time you visit this code, you may think of an even better name. The new name reveals what's being measured and the unit of that measurement. The code is easy to understand.

Comments that warn about consequences

public static Result computeSomething() {
  // Remember: Result is NOT thread safe!1!
  // Make sure you do bla bla ...
  return new Result("something");
}

You can find questionable code even in the Java library surrounded by similar comments. If you don't know all the caveats around such code, you can spend your days (and nights) chasing nasty bugs. Well, the Java library has many years of legacy to deal with. Most of the time – you don't.

Legacy in your codebase is your responsibility. Pay your debt now or be forever in chains. The more commented hacks you leave in your code, the greater the chance that you'll be the one that shoots themselves in the foot next time. "Working Effectively with Legacy Code" – a book by Michael Feathers – is an excellent start.

Let's look at another example. A comment that warns the one who dares to change the code.

# Dear maintainer:
#
# Once you are done trying to 'optimize' this routine,
# and have realized what a terrible mistake that was,
# please increment the following counter as a warning
# to the next guy:
#
# total_hours_wasted_here = 42

We all have a good laugh when reading through warning comments in code. They can be both funny and scary. Write a test instead of a warning. Anyone can ignore a warning and change the code. No one can ignore a broken test when changing the code. Explain the weird use case in your test description. A good test has two duties – to explain and to guard.

Comments that lie

Comments silently rot. Maintaining them is expensive. The longer a comment lives, the further beyond any truth it is. Codebases are heavy on misleading comments. And that state is not reached overnight or intentionally. You change the code but forget to change the comment that lives ten lines up. At some point, the comments say things that have nothing to do with the code. And you don't know who is right and who is wrong. Is the comment right, or is the code right? Does the comment describe a valid or an obsolete use case?

Expecting even a small team to maintain each other's comments is unrealistic. Different people have different attitudes toward comments. Some people write frivolous comments, some write redundant noisy comments, and some do not write comments. When a team works together on a project, you can spot those differences in the code. Having the whole team write comments as one is a mission impossible. Having the entire team not write comments is easy.

Comments that yell

// HACK AHEAD!
//
// "toLowerCase()" fixes a crash in our production system.
// Do NOT remove it!
// It will take time to design the system right and that's why
// we are doing this nasty quick patch.
//
String someValue = someObject.produceSomeValue().toLowerCase();

The toLowerCase() method call is a trivial thing. Anyone could consider that it doesn't belong here and move it elsewhere. People rarely check why trivia exists. Without the yelling comment, who would know that it was this trivia that quelled a fire in production? But the one who stayed awake that night and fixed the problem doesn't seek another sleepless night. So he’s yelling in code.

I accept comments that yell temporarily until the root cause is found and resolved. I don't accept comments that yell permanently – as part of the design. When you need to do a quick and dirty hack to save your system in the middle of the night – do it. That's OK. But in the morning, return to that wretched code and refactor it. Make it right. Don't let it live there forever.

How do you make a comment yell if your code base is already abundant in comments? You can do caps lock and exclamation marks. And soon, everyone starts using caps lock for comments. Now, do you think you could make a comment yell? You can do ASCII art. If your codebase is loaded with comments, everyone ignores them. Yelling comments will be ignored along with the rest. They become just noise.

Commented-out code

We see commented-out code every day in every project. And every time we question ourselves – why is that code commented out? Is it still relevant? Leave it? Delete it? Would anyone miss it when it's gone? Eventually, everyone assumes that whoever commented that code might need it. So, that code stays there forever and becomes less and less relevant as the days pass.

That commented-out code may be part of an ancient feature long forgotten. It pollutes the code and wastes everyone's time. People trying to understand the bigger picture inevitably see the commented piece and challenge its role.

Commented-out code is evil. Kill it on sight. Whenever you see such code, get rid of it. Anyone who needs it would grep through Git commits history, find it, and use it. You don't need to worry about deleting code that may be needed in some distant future. You have your changes under source control.

Explain yourself in code

Every use of a comment represents a failure to express ourselves in code.

Any fool can write code that a computer can understand. Good programmers write code that humans can understand. – Martin Fowler

Modern programming languages are so expressive. You have module names, class names, method names, variable names, constant names – a plethora of nameholders waiting for you to put them together into a poem. But enough with the theoretical bitterness against code comments. Let's go through a simple example and see how all that guidance is applied in practice.

Bad code

class Calc
  def initialize(amt)
    @amt = amt
  end

  def exec
    t =
      if @amt <= 18_200
        0
      elsif @amt <= 37_000
        @amt * 0.19
      else
        @amt * 0.325
      end
    t / 12
  end
end

Could you figure out what this code does? Please don't cite the arithmetic calculations – they are apparent. What domain concept that abstraction implements?

Bad code explained with a comment

Bad code explained with a comment doesn't make it any better.

# Calculates the monthly income tax from the annual salary. Salaries fall into
# three ranges, each with a corresponding tax rate. A salary below the low bracket
# is non-taxable, and the tax rate is zero. If the salary is between the low and
# the high bracket, the tax is 19%. If the salary is above the high bracket, then
# the tax is 32.5%. The monthly income tax is calculated by dividing the income
# tax by the number of months.

class Calc
  def initialize(amt)
    @amt = amt
  end

  def exec
    t =
      if @amt <= 18_200
        0
      elsif @amt <= 37_000
        @amt * 0.19
      else
        @amt * 0.325
      end
    t / 12
  end
end

Self-explaining code

class MonthlyIncomeTaxCalculator
  LOW_INCOME = 18_200
  HIGH_INCOME = 37_000
  LOW_RATE = 0.19
  HIGH_RATE = 0.325
  NUMBER_OF_MONTHS = 12

  def initialize(annual_salary)
    @annual_salary = annual_salary
  end

  def calculate
    annual_income_tax / NUMBER_OF_MONTHS
  end

  private

  def annual_income_tax
    @annual_salary * tax_rate
  end

  def tax_rate
    if high_income?
      HIGH_RATE
    elsif low_income?
      LOW_RATE
    else
      0
    end
  end

  def high_income?
    @annual_salary >= HIGH_INCOME
  end

  def low_income?
    (LOW_INCOME...HIGH_INCOME).cover? @annual_salary
  end
end

Do you think that the reworked version needs a single comment? I don’t think so. The intent is clear. The logic reads naturally from top to bottom as a poem.

Ruby is an exceptionally expressive language. But even if you have to write the same version of the code in Java or C++, it’ll still be much more readable and understandable than using bad naming and trying to reveal the intent with comments. Let’s review the refactoring patterns used to make the code better.

Extract Variable

An explaining variable or constant is a named entity whose sole purpose is to describe the intent of that piece of code it was extracted from. Do not wait for an expression to become long and meaningless to extract it into a variable. Giving even a smaller expression a name makes the code more manageable and easier to understand.

Replace Temp with Query

Instead of extracting the expressions into local variables that make the methods longer, we used the Replace Temp with Query pattern to turn those temp locals into reusable methods. It's a win-win: we have named our expressions and the method that needs them is short and clean.

Extract Method

When there is an expression that can be explained with a descriptive and concise name, then extract the expression into a method. Do not write a comment that explains the expression. The method name will do that job.

"But the original code was smaller" – I hear that often. Yes, I have more code than when I started the refactoring. If all else is equal, more code is bad. But rarely is all else equal. The extra code breaks the logic into identifiable parts. This modularity makes it easier to understand parts of the code and how they fit together. I choose clarity over brevity.

Breaking a large function into smaller ones adds value only if you are good at naming things. With bad names, I'll still have to jump forth and back to each function to understand what it does. With good, intuitive, intent-revealing names, I don't have to dig inside the body of each function to see what it does. I can infer what it does from its name.

Readable code increases everyone's productivity. You spend much more time reading code than writing code. Given a new feature to implement you spend most of your time reading through the existing code, than actually writing the new code. It is worth spending one's time when writing the code to save the many's time when reading the code.

References

The Elements of Programming Style, 2nd. ed., by Kernighan and Plaugher
Clean Code: A Handbook of Agile Software Craftsmanship, 1st ed., by Robert C. Martin
Refactoring: Improving the Design of Existing Code, by Martin Fowler
Hunting for great names in programming by DHH