September 25, 2016

OOP Anti-Patterns: Utility or Helper Classes

I see utility classes in nearly every bit of Java code I review. Even core libraries like Guava and Apache Commons aggressively promote them. But Java is class-based. The language has no support for a function living outside a class. What force pulls developers toward creating utility classes? What problem do they solve?

If you follow Object-oriented Design principles (OOD), you shouldn't create utility classes. If you follow Functional Programming principles, you shouldn't create utility classes. If you follow Procedural Programming principles, you may create utility classes.

I will discuss utility classes in the context of object-oriented design. My humble advice goes to those with a classic OO mindset who long to create a beautiful garden of objects. Let me tell you why you want to get rid of the weeds in your backyard and how to do it.

What is a utility class?

A utility class (or helper class) is a bag full of static methods. It provides common functionality that is used everywhere and encapsulates no state. A utility method is used to isolate a "useful" code block.

The Java ecosystem abounds with examples of that style – Math from the JDK, StringUtils and FileUtils from Apache Commons, Iterators from Guava. But if the entire Java world has embraced that idea for over 20 years, how bad can it be?

What is wrong with utility classes?

They are just a pile of random stuff.
They tend to be huge.
They are just globals.
They have low cohesion.
They make testing hard.
They make extracting methods hard.
They are hard to read.
They are hard to maintain.
They are hard to refactor.
They are hard to decorate.
They are hard to extend.
They are hard to inject.
They hinder lazy execution.
They break the principles of OOD.
They are not SOLID.
They have many reasons to change.
They are responsible to many actors.
They result in dead code.

Looked at without bias, that list of cons describes static methods in general. Most of these bad traits hold true for any static method. Depending on which universe you come from, you may call them static methods (Java, Python) or class methods (Ruby). Still, the concept remains the same – a global method, namespaced within a class, living outside of all instances.

You may think of static methods as the supertype for utility methods. Given that, utility methods automatically inherit all of the bad manners of static methods, and on top of those, they add their own baggage. So, before we dive into utility methods, let's pay their parents – the static methods – a visit.

Static methods

Different people have different opinions about static methods – when they make sense and when to be avoided. A popular belief is that you should avoid putting domain logic in something that you cannot call new on (make an instance).

I tend to avoid static methods, but there are use cases where they fit just fine. An alternate constructor (or static factory method) is one valid use case. A method object is another. You may find that pattern under various names, such as Function object or Command object, but the concept remains the same: the static method creates an instance and then immediately calls an instance method on it.
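Both use cases can be sketched in a few lines. The names below (Temperature, WordCount) are hypothetical and chosen only for illustration. In each case the static method does nothing except create an instance, so the real work still happens on an object.

```java
// Hypothetical value object illustrating static factory methods.
final class Temperature {
    private final double celsius;

    // The constructor is private; callers go through the named factories.
    private Temperature(double celsius) {
        this.celsius = celsius;
    }

    // Alternate constructors: each static method only creates an instance.
    static Temperature ofCelsius(double celsius) {
        return new Temperature(celsius);
    }

    static Temperature ofFahrenheit(double fahrenheit) {
        return new Temperature((fahrenheit - 32) * 5 / 9);
    }

    double celsius() {
        return celsius;
    }
}

// Hypothetical method object: the static entry point builds an instance
// and immediately calls an instance method on it.
final class WordCount {
    private final String text;

    private WordCount(String text) {
        this.text = text;
    }

    static int of(String text) {
        return new WordCount(text).call();
    }

    private int call() {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }
}
```

A caller would write Temperature.ofFahrenheit(212) or WordCount.of("some text") and never touch shared state.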

I tend to avoid static methods when designing domain concepts as the whole idea of leaking global state and sharing it around could be very confusing. If you go heavily into class methods, everything becomes globally available. Having static methods mutate global state is a sure way to shoot yourself in the foot sooner rather than later.
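To make the risk concrete, here is a small sketch with hypothetical names (IdUtils, IdSequence). The static version keeps one counter for the whole program, while the instance version gives every owner its own private counter.

```java
// Hypothetical utility class: a static method mutating global state.
final class IdUtils {
    // One counter shared by every caller in the program.
    private static int lastId = 0;

    private IdUtils() {
    }

    static int nextId() {
        return ++lastId;
    }
}

// The same behavior with the state localized to an instance.
final class IdSequence {
    // Private to this object; other sequences are unaffected.
    private int lastId = 0;

    int next() {
        return ++lastId;
    }
}
```

Two IdSequence objects each start counting from 1 on their own, whereas every call to IdUtils.nextId() anywhere in the code base changes what the next caller sees.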

On the other hand, internal state is localized to a class instance. It is private. Nobody can see or touch it. An object can mutate its own state, but that tiny local change shouldn't cause any issues in other parts of the system as no one depends directly on the object's state.

Passing shared state around makes method signatures complex – many arguments are passed in. Abstractions become hard to interact with, and you feel the pains of bad design. How am I supposed to use this thing? How to build the whole context it needs? It's hard to wrap your head around concepts. It's hard to read. It's hard to extend. It's hard to refactor.

It's hard to break down a long static method into a group of smaller methods when you have to pass many arguments among them. These methods operate on the same context, but instead of sharing it privately through internal (hidden) state, they pass it around, sharing it publicly through a global state.
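The difference shows up even in a tiny example. The sketch below uses hypothetical names (InvoiceUtils, Invoice) and an invented pricing rule. In the static version every extracted helper needs the context passed in again, while in the object version the helpers share it through private fields.

```java
// Procedural style: each extracted helper receives the context as arguments.
final class InvoiceUtils {
    private InvoiceUtils() {
    }

    static long totalCents(long netCents, int taxPercent, long discountCents) {
        return discounted(netCents, discountCents)
                + tax(netCents, taxPercent, discountCents);
    }

    private static long discounted(long netCents, long discountCents) {
        return netCents - discountCents;
    }

    private static long tax(long netCents, int taxPercent, long discountCents) {
        return discounted(netCents, discountCents) * taxPercent / 100;
    }
}

// Object style: the context is hidden state, so helpers take no arguments.
final class Invoice {
    private final long netCents;
    private final int taxPercent;
    private final long discountCents;

    Invoice(long netCents, int taxPercent, long discountCents) {
        this.netCents = netCents;
        this.taxPercent = taxPercent;
        this.discountCents = discountCents;
    }

    long totalCents() {
        return discounted() + tax();
    }

    private long discounted() {
        return netCents - discountCents;
    }

    private long tax() {
        return discounted() * taxPercent / 100;
    }
}
```

Extracting one more step from Invoice.totalCents() costs nothing, while doing the same in InvoiceUtils means threading the arguments through yet another signature.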

You may find that these ideas contradict the Functional Programming (FP) paradigm. Passing data explicitly between pure functions is valued in FP, whereas in OOP we want to hide data and expose behavior. Don't get me wrong – Functional Programming is great, and I love writing functional code. But when using OO programming languages (Java, Python, Ruby), the idiomatic way to design your code is around objects (instances) that have state.

Utility methods and classes

Although utility methods inherit the disadvantages of static methods, they can be helpful when used in moderation at the edges of your design. Keep them small and away from your core domain concepts, mainly dealing with low-level details, and you shouldn't feel any pain. The utility methods become problematic when you go heavy on them, bundling them in large classes and trying to stitch business logic together using them in a hard-coded way.

Utility classes violate the principles that compose a good object-oriented design. A utility method represents a failure to find a good name and a good home for a piece of functionality. They are not proper objects. Those ancient artifacts from procedural programming have no place in an OO world. We are no longer in the chains of the functional decomposition paradigm.

Utility classes violate SOLID – breaking five out of five OOD principles. A utility class has many reasons to change. It's just a bag of unrelated methods. We cannot derive from it. Its interface is coarse-grained and unspecific to any client. You cannot extend it. You cannot inject it as a dependency.

Shaping application logic with static methods can result in test pains. We want to isolate an object from its collaborators. But when the object under test calls static methods from utility classes, it is non-trivial to mock these hardcoded dependencies. It is hard to replace them with test doubles. In Ruby, it's easy to stub static methods due to the language's dynamic nature, but what about stubbing static methods in Java? How do you test in isolation a method that invokes FileUtils.listFiles()?
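One common way out is to put the static call behind a small interface and inject it. The sketch below is an assumption about how that could look. FileSource and LogArchive are hypothetical names, and no real library call is shown.

```java
import java.util.List;

// Hypothetical seam: the collaborator is an interface, not a static call.
interface FileSource {
    List<String> fileNames();
}

// The object under test receives its collaborator through the constructor.
final class LogArchive {
    private final FileSource source;

    LogArchive(FileSource source) {
        this.source = source;
    }

    // Counts the entries whose name ends with ".log".
    long logFileCount() {
        return source.fileNames().stream()
                .filter(name -> name.endsWith(".log"))
                .count();
    }
}
```

In a unit test, the file system is replaced by a one-line test double such as a lambda returning a fixed list of names. A production implementation of FileSource could delegate to a utility method internally, which keeps the static call at the edge of the design.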

Code designed around utility methods is hard to read. You can get away with a few utils when working alone on a project. But when working in a team, the problem has a snowballing effect. Utility classes have a strong gravitational pull. They attract unwanted code and quickly turn into a huge junkyard.

Try to read the code of FileUtils. Do you consider it OOP? Do you see any separation of concerns? Utility classes strip away the beauty of OOD. They are just procedures packed together in a huge bag. They are hard to read and hard to maintain. Static helper methods belong to the procedural world, where developers follow the notion of functional decomposition. Imperative utility methods make our code procedural rather than object-oriented.

Why do people create utility classes?

Lack of domain knowledge.
Laziness.
Both.
DRY code duplication.
Namespacing that went out of control.
Trying to finish a task quickly.
There is no other way.

Creating utility classes is nearly always an example of laziness. Programmers avoid duplication by DRYing "useful" code out into utility methods. People often excuse themselves with "there is no other way" and "it's a weird use case".

In rare situations, there is a weird use case that doesn't fit the current design, and there is no other way. But having utility classes as part of your design is an anti-pattern. Those utils should be an exception, not the default. People tend to misplace responsibilities. They often miss opportunities to create new abstractions to fulfill new behavior.

How to get rid of utility classes?

Fill the lexical gap. Learn your business domain vocabulary.
Call it what it is. Naming is hard, but at least give it a try.
Talk to a teammate. Talk to your product owner. Solve the problem as a team.
Explain what the code does in simple words. Explain it to a child.
Create a simple abstraction with clear responsibility and a good name.

If you agree that helper classes are a terrible design idea and want to stop using them, let me show you a simple example of how we can replace them with proper objects. Here is a typical Java utility class holding dozens of static methods.

public class Utils {

  // ...

  // line 587 of the file
  public static int calculateRectanglePerimeter(int length, int width) {
    return 2 * (length + width);
  }

  // ... the class ends at line 3603
}

You can't make this up. Even the Java standard library and popular third-party libraries are loaded with utility classes of the same style. Look at Math from the JDK or StringUtils from Apache Commons – so many static methods bundled in a single class. Why do they live together? Are they related? How is finding a square root related to generating a random number?

Here is how you use the helper method to calculate the perimeter of a rectangle.

int perimeter = Utils.calculateRectanglePerimeter(1, 2);

Stop for a second and think about what's wrong with that code. Technically, there is nothing wrong – the code works. But it's not object-oriented programming. It's procedural. Static methods are simply Java statements grouped under a unique name.

In fact, a class made only of static methods is effectively a singleton, potentially with globally mutable state and all of the gotchas and warnings that entails. Compare it with an instance, say a user object: you need a reference to it, there can be many of them, and you call methods on each one. When everything happens at the class level, there is a single class in memory shared by anyone who wants to use it. It is globally accessible, and anyone can call methods on it. In many ways, it acts like a singleton.

Primitive Obsession

In our utility class, we rely on primitives instead of having an object that abstracts the rectangle concept. Two integers are a terrible representation of a rectangle. Reify your primitive obsession into an object. The static operation on that primitive becomes a public instance method on your new domain object following the Tell Don't Ask principle.

Data Clump

Everywhere we need the notion of a Rectangle in our code, the length and width will always come together as we don't have a proper object. We cannot hide data. We are passing the data all around our code. Imagine you see [10, 20] when debugging. Is it 10 dollars and 20 cents? Is it 10:20 AM?

We often consider a common functionality to be a utility method. Maybe because it’s the easiest thing to do, but that helper method could be a specific method that belongs in a domain class of its own. When following object-oriented design, we want to instantiate objects, encapsulate state and expose behavior. We want to compose with objects, not data structures. Let’s re-work our code to follow OOD.

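public class Rectangle {
  // final fields make the object immutable after construction
  private final int length;
  private final int width;

  public Rectangle(int length, int width) {
    this.length = length;
    this.width = width;
  }

  // The behavior lives next to the data it needs
  public int perimeter() {
    return 2 * (this.length + this.width);
  }
}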

I have created a simple abstraction around our rectangle concept. I have isolated the data inside and exposed the perimeter behavior. The object is immutable, as the values passed in the constructor can never be changed. The data is hidden, and the operations are exposed.

The class is small and is responsible for a single use case. It's easier to change, maintain, and unit-test. When a new requirement for calculating the area of a rectangle comes in, you'll already have a good home for it. Everybody on your team can use the Rectangle object intuitively.
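As a sketch of that area requirement, the new behavior could simply become one more instance method. The class is declared package-private here only so the snippet compiles in any file, and the area() name is an assumption.

```java
// Rectangle extended with the hypothetical area requirement.
final class Rectangle {
    private final int length;
    private final int width;

    Rectangle(int length, int width) {
        this.length = length;
        this.width = width;
    }

    int perimeter() {
        return 2 * (length + width);
    }

    // The new behavior sits next to the data it needs; no Utils class involved.
    int area() {
        return length * width;
    }
}
```

Callers keep working with the same object: new Rectangle(1, 2).perimeter() and new Rectangle(1, 2).area() both read as questions asked of a rectangle, not as procedures fed with two integers.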

The object-oriented mindset

We aim for modular architectures built around passing objects to other objects. Objects represent the world we are abstracting as code more understandably. I can easily wrap my head around domain objects and quickly see how they talk to each other. A classic OO mindset models the world using richer domain objects that execute tasks.

Data and behavior go together. Create abstractions around domain concepts hiding data and exposing behavior. Every SOLID piece of code is responsible for one thing. Call it what it is. Use a descriptive name. Find a good home for it in your beautiful garden of objects.

References

"Principles of Object-oriented Design" by Uncle Bob
"Elegant Objects" by Yegor Bugayenko
"Are Helper Classes Evil?" by Nick Malik
"Why helper, singletons and utility classes are mostly bad" by Simon Hart
"Avoiding Utility Classes" by Marshal Ward
"Kill That Util Class!" by Dhaval Dalal
"Helper Classes Are A Code Smell" by Rob Bagby
"Why Ruby Class Methods Resist Refactoring"