Thursday, March 17, 2011

How & Why I do code reviews

Code review is one of the programming practices that I advocate. However, as with any practice, I find understanding and communicating the values and driving forces behind it even more important than the actual practice itself (see: Cargo Cult). So, here is my take on code reviews.

Why?

I'm pretty much against introducing any practice just for the sake of introducing it. Different goals require different means, and without clear goals, it's rather difficult to evaluate whether a practice has helped the team or not. The reasons I practice code reviews are:

Knowledge sharing (note: none of the teams I've been part of practiced pair programming, which serves this purpose quite well). It's a way to break the silos that exist within the team, whether they are horizontal (UI, server side, data access, etc.) or vertical (this is Jane Doe's component, no one else may touch it, for there be dragons). I rather like that when I ask for a holiday, my manager doesn't have to wonder whether anyone else can work with my (recent) code.

Mentoring, though it could be considered part of knowledge sharing, deserves to stand on its own. I've found many concepts way too abstract when I first read about them in a book, but when they are explained in context (what you've just written there is known as the Factory pattern...) I gain a better, deeper understanding.
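
To make that concrete, here is a minimal, made-up sketch of the kind of moment I mean - all names here are invented for illustration, not taken from any real code base:

    // A reviewer can point out that centralizing construction like this
    // is what the books call the Factory pattern.
    interface Notifier {
        void send(String message);
    }

    class EmailNotifier implements Notifier {
        public void send(String message) { /* send an email */ }
    }

    class SmsNotifier implements Notifier {
        public void send(String message) { /* send a text message */ }
    }

    // The "factory": callers ask for a Notifier by kind instead of
    // new-ing up concrete classes all over the code base.
    class NotifierFactory {
        static Notifier create(String kind) {
            if ("email".equals(kind)) return new EmailNotifier();
            if ("sms".equals(kind)) return new SmsNotifier();
            throw new IllegalArgumentException("Unknown notifier kind: " + kind);
        }
    }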

Culture creation. Again, abstract vs. concrete, talking vs. doing. One can write any number of coding guidelines, best practices, and similar documents. However, these documents usually state high-level objectives, and almost any concrete implementation can be argued to fit the description. I much prefer people arguing over whether a new class should be extracted to a dry design guideline that sets a limit on the number of methods per class and is only checked formally (if at all).

Quality Assurance. Are there enough tests, and are they the appropriate ones? Do they make sense to the next person? (I routinely had to rename tests for my reviewers, because I was already too close to the code and didn't communicate intent well enough.) Are all tests actually passing? Haven't you accidentally reinvented the wheel? (I once failed to recognize that what I was working with was an existing domain concept - the beauty of working with legacy code - so I created another model class for it.)
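
The renaming point is easiest to show with a contrived before/after (JUnit-style; all names are invented):

    import org.junit.Test;
    import static org.junit.Assert.assertTrue;

    public class DiscountTest {

        // Minimal stand-ins so the example compiles; invented for illustration.
        static class Order {
            final int total;
            Order(int total) { this.total = total; }
        }

        static class Discount {
            boolean applies(Order order) { return order.total >= 100; }
        }

        // Before the review: the name was clear only to me, the author.
        @Test
        public void testCalc2() {
            assertTrue(new Discount().applies(new Order(100)));
        }

        // After the reviewer asked what the test actually verifies:
        @Test
        public void ordersAtOrAboveTheThresholdGetADiscount() {
            assertTrue(new Discount().applies(new Order(100)));
        }
    }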

Last, but not least, catching obvious bugs is a benefit not to be overlooked. Face it: we all make mistakes that are embarrassing in hindsight. Catching them during the development cycle keeps the embarrassment within a smaller circle :)

What?

Computers were invented to automate tiresome, repetitive, and boring tasks. Spending programmer time on checking tabs vs. spaces, curly brace placement, and other such easily automated checks (StyleCop, Checkstyle, FindBugs, etc.) is worse than waste. If such things are important to you, do a quick check that the build produces no warnings for them, and move on to the areas where a human adds value.
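
To illustrate the division of labor, here's a contrived fragment - a tool flags the first kind of issue mechanically, while only a human reviewer is likely to raise the second:

    class OrderProcessor
    {   // opening brace on its own line here...
        private final Validator validator = new Validator();

        void process(Order order) {   // ...but on the same line here.
            // A style checker flags this inconsistency automatically;
            // no reviewer time needed.
            validator.validate(order);
        }
    }

    // What no tool will ask, but a reviewer should:
    // does validation belong in the processor, or on Order itself?
    class Order { /* ... */ }
    class Validator {
        void validate(Order order) { /* ... */ }
    }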

When?

Code reviews should be concerned with small, easily digestible chunks, just like commits. That's why I prefer to review commits - before they are committed. I can only offer non-scientific justification (experience and anecdote), and no explanation, but my observation is that committing code makes it rigid, like clay baked in an oven - it becomes much more difficult to change. Renaming a method or variable is much easier while it's still in the staging area (maybe it's the burden of writing another commit message?). This also gives enough time to ensure that the tests are run before each commit (run the command line build while the review is ongoing).

Some suggest that developers should ask for a code review only when they feel they've made a riskier change. Since I'm really bad at predicting when I make mistakes (to be frank, I can't predict it at all), just to be on the safe side (and to ensure the flow of information), I advocate reviewing every commit.

How?

There are multiple ways to achieve the same thing; one that I've found to work is the following:

  1. the developer who has just completed a committable chunk of work signals the others (via a team chat window, though ACM ICPC-style balloons could work too) that her change is ready for review.
    1. someone from the team picks up the request within the next few minutes - this slack ensures that the reviewer is not interrupted mid-thought and dragged out of the flow.
    2. if no one picks it up within a few minutes, the developer stashes her work and continues with something else until someone becomes available, at which point they return to the original change set.
  2. the command line build is started on the developer's machine while the nature of the changes is described to the reviewer in a few sentences, setting the context
  3. the reviewer drives the order in which the changes are reviewed. She asks questions and requests clarifications, which usually result in rename refactorings.
    I tend to start by reviewing the tests - first checking that the names make sense and that no relevant corner cases are missed (see the sketch after this list), then the actual test code. Only once I'm happy with the tests do I turn to the production code.
    1. if a major flaw is discovered (a bug, a failing test, duplication - anything that cannot be fixed as quickly as it can be discussed), the code cannot be committed, and another review is needed (we tend to have the next reviewer be the same person as before, in order not to slow down the process).
    2. otherwise, the normal checkin dance begins, and the code is committed.
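
The tests-first pass mentioned above can be as simple as reading the test method names as a specification. A contrived sketch of what that catches (the names are invented):

    import org.junit.Test;

    // Reading just the names as a specification, the reviewer can spot a
    // missing corner case before looking at a single test body.
    public class ShoppingCartTest {

        @Test public void totalOfASingleItemIsItsPrice() { /* ... */ }
        @Test public void totalOfSeveralItemsIsTheSumOfTheirPrices() { /* ... */ }
        @Test public void removingAnItemDecreasesTheTotal() { /* ... */ }

        // Reviewer: "what is the total of an empty cart? I don't see that
        // specified anywhere" - a corner case caught from the names alone.
    }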

Cons, worries, and pitfalls (real and unreal ones)

Just like any other practice, this takes time to learn and to do. However, I'd much rather spend a few extra minutes discovering duplication than the days it'd take to remove it later. Exposing misunderstandings that wouldn't be discovered just by talking justifies it further. If a review takes longer than a few minutes, that signals a deeper issue which should be discussed - as agile practices promote, we've exposed a problem early. In summary: I think it's worth it.

Co-location, while it certainly makes the process easier, is not required. If you are working in a remote team, you probably have appropriate communication channels already. A voice call with a shared screen is all that's needed.

It takes effort and discipline. If the reviewer doesn't pay enough attention or isn't engaged, it's not going to work. Regularly discussing what kinds of problems slipped through past code reviews, and whether they could (or should) have been caught, could be part of retrospectives.

Just as it's possible to create horrible designs with TDD, it's just as easy to drift away from the architecture and design of the program in small, correct steps. Even when reviewing only small chunks, keep the big picture in mind.

Some teams do post-commit code reviews instead, arguing that doing it asynchronously means the work of others is not interrupted and the developer is not slowed down by waiting. I've seen teams practice post-commit reviews effectively (as they say, a team of good developers is likely to achieve good results regardless of the process and practices they follow), so I'm not saying it can't work, but I prefer to review before the commit. Beyond the clay analogy used above, synchronous communication is more effective than asynchronous (at least between humans; between machines it's another story). At the end of the day, developers will still need to talk to each other for clarification, so not much time is saved anyway.

Tooling. Most code review tools don't support pre-commit reviews; however, they have value for other purposes. I like them for starting discussions about things I see in the code that don't need immediate changes for the features we are working on now, but that should be discussed at some point - either to become a nice-to-have task or to be recognized as a must-have refactoring. I find these tools more effective than leaving TODO comments in the source code - threaded discussions are nice, and hard to have in code comments. Another area where they shine is architecture/prototype reviews. It's a great way to prepare for a design review meeting. And yes, I do prefer code-based architecture evaluation to UML diagrams (though the latter are useful when communicating about the architecture), because concrete coded scenarios can bring up more flaws than the whiteboard - see the Non Coding Architect.

And do I need to say it? Like any other practice, it's no silver bullet!


With regards to silver bullets - nothing is perfect, including this approach and this post itself. Please do leave a comment pointing out any improvements or problems!

Update: Code Review During Retrospective lays out a great approach for purely educational focused code reviews, I recommend reading that too!