June 18, 2016

Ruby on Rails: Clean and Thin Controllers

I often advise Rails developers to pull domain concepts out of their controllers into plain old Ruby objects. But not everyone is convinced that a controller should be thin. If you fall into that group, stay with me while I reveal the problems of fat controllers. On the other hand, not everyone who believes in thin controllers succeeds in that crusade. If you fall into that group, stay with me while I share my thoughts on how I design my controllers.

How do we end up with complex controllers? No one sets out to write complex controllers. But still, it happens all the time. In the beginning, our controller actions are very simple, even boring. But over time, as feature requirements add up, we tend to put more and more logic into the controllers. We forget who is responsible for what.

What is my controller responsible for?

A well-designed Rails controller should be responsible only for the following:

Delegate to a domain object (plain old Ruby object).
Deal with HTTP: handle request params, return a response, or redirect.
Deal with horizontal aspects: user authorization, logging, etc.

Controllers should only retrieve data for the view layer. They shouldn't contain any business logic. Easier said than done. But where should I put all my domain concepts? After all, I have only three boxes: M, V, and C. Well, if you're building a simple web application – like a to-do list or a blog – three boxes are pretty sufficient. But monoliths have lots of business rules, and three boxes come short.

Good controllers are like good managers. They delegate work to someone else. They choose whom to delegate and what to do with the result once the work is done. The ideal controller action should look like this:

class PerfectController
  def action
    PlainOldRubyObject.do_work
  end
end

The PerfectController has one object per action. It could be any domain object, a form object, or a decorator/presenter for the UI-related logic. But enough with the empty postulations and abstract advice. Talk is cheap. Let me show you how I pull stuff out of a controller and tell you why along the way.

The example

I'll explain the concepts behind a well-crafted controller through a simple example. Enumerating vague principles and broad generalizations doesn't help anyone figure out how to apply them. An example makes things clear. The problem with refactoring examples is that if I pick a large controller, it will be too complicated to work you through its implementation details in less than a hundred pages. If I pick a small controller, refactoring may not look like it's worthwhile.

Still, I decided to use a small, comprehensible example. That way, I can explain the behavior briefly and then shift the focus to the code structure. Most people will label methods as clean and simple if they hold just a few lines of code. Let's verify that assumption. Is it enough for a controller action to be small to call it "clean and simple"?

"Subscribe to a discussion" is the new feature we must add to our system. Users can subscribe to discussions in forums to get notifications about recent posts. Users should receive a notification email when subscribing and unsubscribing. That’s the whole task. Can we mess up something so simple?

Our technical approach

The database schema is obvious – we associate users to discussions in a many-to-many relationship through a join table subscriptions. When defining the routes, a head-first approach would be to add two new custom actions (subscribe and unsubscribe) to our existing DiscussionsController. But since we all love REST and want to follow its principles, we'll go the extra mile and draw a new controller instead.

DHH explains the rationale behind making new controllers versus adding custom actions to existing controllers in his interview for the Full Stack Radio. As the radio cast is lengthy, I'll quote only the relevant part here:

What I’ve come to embrace is that being almost fundamentalistic about when I create a new controller to stay adherent to REST has served me better every single time. Every single time I’ve regretted the state of my controllers, it’s been because I’ve had too few of them. I’ve been trying to overload things too heavily. So, in Basecamp 3 we spin off controllers every single time there’s even sort of a subresource that makes sense. The heuristics I use to drive that is: whenever I have the inclination that I want to add a method on a controller that’s not part of the default five or whatever REST actions that we have by default, make a new controller! And just call it that.

Following DHH advice, we define a new resource subscriptions. We spin off a SubscriptionsController with two standard actions – create and destroy. The create action subscribes a user to a discussion, creating a subscription. The destroy action unsubscribes a user from a discussion, destroying a subscription.

Our routes are RESTful. Our resources are manipulated with the standard HTTP verbs. We haven't introduced any custom longhand controller mappings. We haven't polluted our DiscussionsController with subscriptions concepts. But is another controller worth it?

"That's not worth another controller" – I often hear that. The first time you have to add a custom action, maybe it's not. But not following that rule has snowballing effect on the complexity of your controllers over time as more and more people add more and more custom actions to existing controllers.

I am creating a new controller, not because DHH says so, but because it makes my code better and my job easier. Complex actions in your controller are hard to read, maintain and reuse. RPC-like endpoints don't follow conventions, break the noun-verb pattern, controller mappings become hairy, controllers become fat, and the routes file no longer clearly describes our domain objects.

We want small, comprehensible resources that represent our domain concepts. Not every controller should be mapped one-to-one to an ActiveRecord model. We don't want to expose the specifics of our database at the controllers layer. The resources in the routes file should clearly represent our application domain, not our database schema.

Our code

The code is intentionally concise and simple. I've omitted the irrelevant details to focus on the core concepts.

class Subscription < ApplicationRecord
  belongs_to :discussion
  belongs_to :user
end

class Discussion < ApplicationRecord
  has_many :subscriptions
  has_many :users, through: :subscriptions
end

class User < ApplicationRecord
  has_many :subscriptions
  has_many :discussions, through: :subscriptions
end

class SubscriptionsController < ApplicationController
  def create
    discussion = Discussion.find params[:discussion_id]
    current_user.subscriptions.create! discussion: discussion
    UserMailer.with(discussion: discussion, user: current_user).subscribed.deliver_later
    redirect_to discussion, notice: "Subscribed."
  end

  def destroy
    discussion = Discussion.find params[:discussion_id]
    current_user.discussions.delete discussion
    UserMailer.with(discussion: discussion, user: current_user).unsubscribed.deliver_later
    redirect_to discussion, notice: "Unsubscribed."
  end
end

A piece of cake. Despite the shroud of mist and fog surrounding controllers, we've built our new feature in no time. All the coding Gods are with us, and we can grab some beers.

But are we done?

The classes are small, and the methods are small (only four lines). Small methods are labeled as "good code" by many. Even DHH could say our code has “clarity,” so it’s okay. Our implementation will pass through code review in many Rails dev shops.

I don't know about you, but I wouldn't say I like our solution. I smell things. Stop for a moment and think about what is wrong with our code.

Don't Repeat Yourself

Our controller is not DRY. I put DRY as the first principle we broke not because it is the most important one but because it is the easiest to refactor – it's a low-hanging fruit to pick.

In general, I am never too concerned when code is not DRY. The repetitiveness here may not even classify as true code duplication you may argue that those pieces have different reasons to change and most of the repeating code is technically delegation, i.e. the controller is not doing stuff but it is asking someone else to do stuff on his behalf.

Let's not get stuck in this DRY analysis paralysis, but clean the code. I'll use the extract method refactoring pattern to move things around. I'll memoize the Discussion object to avoid hitting the Rails query cache.

class SubscriptionsController < ApplicationController
  def create
    current_user.subscriptions.create! discussion: discussion
    mailer.subscribed.deliver_later
    redirect_to discussion, notice: "Subscribed."
  end

  def destroy
    current_user.discussions.delete discussion
    mailer.unsubscribed.deliver_later
    redirect_to discussion, notice: "Unsubscribed."
  end

  private

  def discussion
    @_discussion ||= Discussion.find params[:discussion_id]
  end

  def mailer
    UserMailer.with discussion: discussion, user: current_user
  end
end

Law of Demeter

Breaking the Law of Demeter (LoD) principle is not the main problem of our controller. But it's yet another low-hanging fruit that we can pick early and get our code in a better shape for the next steps of our iterative refactoring. As you can see, I am applying a series of small behavior-preserving transformations. I am not changing the behavior. I am only improving the design. I can deploy with confidence to production after each iteration, as I am doing very small changes lowering the risk of introducing any bugs.

The Law of Demeter or principle of least knowledge is an object-oriented design guideline that is a specific case of loose coupling. It states that each object should only talk to its immediate collaborators and not talk to the collaborators of its collaborators. Said in common words – only talk to your friends, not to your friend's friends.

Reaching out to your collaborator's collaborators is, again, knowing too much about your collaborator's inner guts. You should strive for dumb abstractions. Your objects shouldn't be aware of the whole world around them. You can easily spot objects that know too much when you try to cover them with tests.

In our controller, the User object is our immediate collaborator. Following the Tell-Don't-Ask principle, we can tell it to subscribe to a discussion instead of asking for its subscriptions and then adding a new one to the list. Like so:

current_user.subscribe_to discussion

Our controller, however, knows how a discussion is linked to a user through a subscription. It knows what methods to call at the other end of the chain. We have coupled the controller to the entire relationship. If one object changes, we’ll have to adapt to that change, not only its immediate dependants but also find their distant dependants and change them. Breaking the LoD principle couples more than two objects tightly together and creates a dependency hell.

Let's refactor our code and fix that pain:

class User < ApplicationRecord
  def subscribe_to(discussion)
    subscriptions.create! discussion: discussion
  end

  def unsubscribe_from(discussion)
    discussions.delete discussion
  end
end

class SubscriptionsController < ApplicationController
  def create
    current_user.subscribe_to discussion
    mailer.subscribed.deliver_later
    redirect_to discussion, notice: "Subscribed."
  end

  def destroy
    current_user.unsubscribe_from discussion
    mailer.unsubscribed.deliver_later
    redirect_to discussion, notice: "Unsubscribed."
  end

  private

  def discussion
    @_discussion ||= Discussion.find params[:discussion_id]
  end

  def mailer
    UserMailer.with discussion: discussion, user: current_user
  end
end

Tight coupling and changeable code

Let's zoom out on the LoD principle. Many design principles sound vague and open to different interpretations. Look at SOLID. People don't know what these principles mean or how they make their code a better place. The problem design principles are trying to solve is tight coupling. Solving the tight coupling problem makes your code changeable.

The more coupled things are together, the harder it is for them to change. The only sure thing in this world is that requirements change. When that happens, we want to change the code without pain. Loosely coupled code is the path to that bliss. A change in one place doesn't inflict numerous changes in other places.

When abstractions are not intertwined into a big ball of mud, we can change them, reuse them, and test them. Every time we have to change a thing, that change in that "thing" will propagate changes in all other "things" that depend on it cascadingly.

When you refer to something, you depend on it. When the things you depend on change, you must change. – Sandy Metz.

That is abstract advice. Let's use an example of decoupling things and minimizing dependencies to show what it means. Take this line of code:

current_user.subscriptions.create! discussion: discussion

Our controller depends on the User model. It may not be obvious, but it also depends on the Subscription model because it invokes its public API to create a subscription. If the public API of the Subscription model changes, we'll have to go to all the objects that depend on it and change them.

We have coupled our controller to the internal guts of the user. The controller – or any other dependant on the user model – shouldn't know how subscriptions are stored and how to manipulate them. To remove that coupling, we need to hide the subscription details from the outside world and remove the Subscription model as a dependency.

current_user.subscribe_to discussion

Our controller depends only on the User model. It subscribes to a discussion through the User public API and knows nothing about subscriptions. Subscriptions can change, but that change won't propagate changes to other objects. There will be no shotgun surgeries.

No domain concepts in the controller

If you have to remember one thing from my preaching, remember – your controllers shouldn't be married to your domain logic. A well-designed controller should know what it needs but not how to do it. Domain concepts don't belong in the controller – get them out of there.

Our entire feature implementation is inside our controller. The controller knows how to perform every single action in the subscription process. All that knowledge doesn't belong there. The controller should make one single call to a domain object and receive one dumb value object from that call.

We want our imperative approach to be more declarative. The controller should declare what it needs instead of knowing how to accomplish it all the way through. It should stay focused on "what", not "how". It should depend on a domain object and delegate work.

When scrutinizing controllers, think about the reasons for them to change. A controller shouldn't change if the business logic changes. The only reason a controller should feel the need to change is if unimportant details have changed – the HTTP params, the response type, and the redirect URL.

How about good naming? By extracting domain concepts out of controllers, we give the new abstractions good names. They become well-known abstractions with clear public interfaces. We can reason about them independently from the delivery mechanisms.

What about reuse? We can reuse the domain objects in other parts of the system. For example, we can build a command line interface or REST API or receive messages through a message broker, and it won't matter. We can plug our PORO anywhere in our application.

Imagine the other way around – you have many controllers and your business logic is scattered among them. When you need to build something similar, instead of just reusing existing code, you'll have to go through your controllers, scan for similar code, and try to pull something out – or worse copy and paste.

Take AWS, for example. You can spin off a new EC2 instance using the command-line interface (CLI), the REST API, and the web UI. Imagine the reaction of the team doing the CLI when they need to reuse the logic around creating an EC2 instance if the team doing the web app has put that logic in the controller. The CLI team cannot reuse a thing.

The controller is a thin layer between the HTTP request/response and the application logic. It should handle the request parameters, invoke the service layer and build a response (redirect or render). It should deal with the transport-related stuff and delegate everything else. In our example, the whole subscription process should be refactored out of the controller.

MVC is not architecture

But where should our domain concepts live in a world ruled by MVC? Rails gives you three boxes (M, V, and C) and pushes you to pick one and put your application logic in there – usually the model. Or worse – scatter your logic across all three boxes.

When you decide on how to structure your application code, you have two main approaches: horizontal and vertical. With the horizontal approach, you'd divide your code into layers of technical concerns – model, view, controller. Separating those horizontal concerns doesn't grant you clean architecture out of the box.

With vertical slicing, you build your abstractions around domain concepts (or use cases). You put your features in separate boxes well-isolated from each other. The abstractions living in those boxes are plain old Ruby objects that know nothing of how data is delivered to them and what happens to the result they return – they know nothing neither of the controllers nor of the views.

MVC doesn't solve any problems with your domain architecture. When working on a large application with many business rules, multiple delivery mechanisms (HTTP, AMQP), and multiple data stores, you'll feel the pain of having domain logic infiltrate your MVC boxes. You'll experience the problems of having your code split among horizontal concerns – of not having good architecture.

Let me share my experience with a real project at work. We had a Rails app that was communicating with the rest of the world using a message broker over AMQP. We decided to rework that application as a web service that talks over HTTP. We succeeded to do that in just a couple of days given it was a relatively large application – the orchestrator for our automation platform.

How was that possible? Well, it was possible because our architecture was not based on MVC. We had invented a new box – the "services" box – where all our domain objects lived. Those objects were isolated entirely from the low-level details – the controllers and the views. Yes, Rails controllers are a low-level detail. They are not important. They are just a delivery mechanism – how your domain objects receive their input.

Your database – the one and only thing toward which your code points – is a low-level detail. It is not important. It is just a storage device. You can replace it if needed. Your views – totally not important. No business rule should care how its result is presented. These are pluggable moving parts of your architecture. The important thing is how you organize your business rules. That's your architecture. MVC is not your architecture.

Will you buy good architecture? Is the value of having the right architecture greater than the cost? Good architecture and good code both require effort. It takes time to do things right. You have to walk instead of sprint.

The effort pays off when designing large applications. Are you building a bank system or a to-do list? When building a to-do list, your entire application logic is to read from and write to a database. It will seem weird (or verbose) if you start isolating your models from your domain objects.

Let me bring another example from work. We have this data extraction pipeline that extracts financial information from receipts. The entire pipeline is decoupled from ActiveRecord – it doesn't know a thing about the application database. It receives a dumb value object as input and returns a dumb value object as output. At one point, we were able to reuse the entire pipeline in a different workflow – one without a database. Testing the pipeline was a breeze. We have to build one value object in the setup phase and expect one value object in the assert phase – no fixtures, no factories, no Rails, no ActiveRecord.

Slow test suites and big hairy specs

Bad tests ensue bad code. Try to cover our controller with specs and you'll have a complex test setup, test every use case through the controller, and have multiple expectations. Test smells unveil code smells. Well-factored code, on the other hand, is easy to test.

If testing seems hard, then there is a problem with your design. - Sandy Metz.

The Ruby community loves Test-driven Development (TDD). TDD shifts your focus from the implementation to the design. Using small tests, you'll design your core abstractions thinking about their public interface from the very start. Once the logic is implemented (and well-covered with tests), you can plug it anywhere.

It is hard to test our controller because it's doing too much. The object under test has many collaborators and responsibilities. We'll have to arrange the world to test our new feature through the controller (or worse – through the UI). A Rails advocate would say here – "That's fine, we have fixtures. We will fix the whole world exactly the way you need it".

It's not OK to test your system through the UI. It's not OK to turn the Test Pyramid upside down. Your test suites become flaky and slow. The failures are hard to troubleshoot.

You may go one step further and make your tests independent of your web framework. If your specs don't need to activate bundler and require Rails, they'll be extremely fast. When you need to run one spec or one file, you won't need to wait for several seconds of startup time. You'll get immediate feedback. But to get there, your domain objects shouldn't know anything about Rails.

Your domain objects should hold only computing logic. They should get dumb value objects as input and return dumb value objects as a result. The whole web framework should be isolated behind clear boundaries with arrows pointing from the low-level details to the high-level business rules and only dumb value objects should be allowed to cross those boundaries.

That goes in conflict with the Rails community where people test everything through the database. And maybe they are right – the effort is just not worth it for small web applications. But what about giant monoliths? Rails is a very heavy dependency. You'll get more than 10 seconds of boot time penalty before the first spec even runs.

But if the controller holds zero logic, should we write controller specs at all? Yes, we should. The value of the controller spec is not in validating behavior. The value of the controller spec is in showing that there is zero behavior inside the controller under test. Hairy specs expose a fat controller. Bad code results in bad tests.

A controller spec brings some benefit in validating low-level details – massaging of the HTTP parameters, expecting specific HTTP status codes, user authorization, logging, etc. You shouldn't re-test your domain concepts through the controller specs. Simply assert that the domain object is called correctly and that's it.

Our Refactoring

A good way to refactor our code would be to use the extract class pattern and extract the domain logic into new abstractions that have only one reason to change. When creating new abstractions focus on using clean interfaces and simple classes with obvious responsibilities.

class User < ApplicationRecord
  def subscribe_to(discussion)
    subscriptions.create! discussion: discussion
  end

  def unsubscribe_from(discussion)
    discussions.delete discussion
  end
end

class DiscussionSubscriber
  def initialize(user, discussion_id)
    @user = user
    @discussion_id = discussion_id
  end

  def subscribe
    @user.subscribe_to discussion
    mailer.subscribed.deliver_later
  end

  def unsubscribe
    @user.unsubscribe_from discussion
    mailer.unsubscribed.deliver_later
  end

  private

  def discussion
    @_discussion ||= Discussion.find @discussion_id
  end

  def mailer
    UserMailer.with discussion: discussion, user: @user
  end
end

class SubscriptionsController < ApplicationController
  def create
    DiscussionSubscriber.new(current_user, discussion_id).subscribe
    redirect_to discussion_path(discussion_id), notice: "Subscribed"
  end

  def destroy
    DiscussionSubscriber.new(current_user, discussion_id).unsubscribe
    redirect_to discussion_path(discussion_id), notice: "Unsubscribed"
  end

  private

  def discussion_id
    params[:discussion_id]
  end
end

We have factored out the domain logic into a separate abstraction – a plain old Ruby object – the DiscussionSubscriber. There are no business rules in the controller now. The controller just makes sense of the HTTP params, delegates the work, and redirects to a new location.

The domain object DiscussionSubscriber is a black box to our controller. The controller is completely ignorant of the its behavior and of the underlying database schema. It doesn't know how to do stuff – how to make or break a subscription. It only knows what needs to be done and where to go after it's done.

We have decoupled our application logic from the delivery mechanism. But not only that, we have decoupled all of our objects from each other and placed each responsibility into its own abstraction. Now all classes can change internally without affecting other pieces. You will rarely need to change the controller code because all of the business logic is isolated out of it.

"But the original code was smaller" – I hear that often. Yes, I have more code than when I started the refactoring. If all else is equal, more code is bad. But rarely is all else equal. The extra code breaks the logic into identifiable parts. This modularity makes it easier to understand parts of the code and how they fit together. I choose clarity over brevity.

Breaking a large function into smaller ones adds value only if you are good at naming things. With bad names, I'll still have to jump forth and back to each function to understand what it does. With good, intuitive, intent-revealing names, I don't have to dig inside the body of each function to see what it does. I can infer what it does from its name.

Rails, Django, Spring, any

The core concepts behind having a skinny controller are programming language and web framework agnostic. The principles described here relate to any web framework – not just Rails. You can apply the same refactoring patterns in Django, Flask, or Spring.

I used the same refactoring techniques back in the past when working on a large Java Spring application. The result was beautiful – we created a lovely garden of well-designed domain classes. Our test suites become stable and, as far as I remember, at least ten times faster. In the end, I was asked to go and teach other teams so they could get to where we were.

References

GORUCO 2009: SOLID Object-Oriented Design, by Sandi Metz
Extracting Domain Objects, by Gary Bernhardt
Ruby Midwest 2011 Keynote: Architecture the Lost Years, by Robert C. Martin