Monday, August 13, 2012

Improving Code Quality - Scheduling Technical Debt and the Bucket Parable

Overview

Please read the previous post on improving code-quality using LCOM4 and Cyclomatic Complexity before reading this entry. Below I present two ways of thinking about technical debt in your organization. The first is the old-school approach, and the second is the approach that took hold once the sheer volume of cyber-fraud became apparent.

Old way of thinking

Before cyber-security became a large concern, software development companies thought about their applications differently than they do now. In those days, if you'd run Sonar against your codebase, identified some areas where your source-code was butt-hurt, and wanted to fix them, you had a problem. See, the viewpoint was that acknowledging flaws in your code was also acknowledging legal liability. So, companies would fix bugs in "maintenance releases" but would not acknowledge the specific security failures they were fixing. This environment of subtlety safeguarded application developers, but exposed end-users to security holes caused by those bugs.

The Reality

Currently, there are massive cyber attacks against companies around the world, making companies more appreciative of an increased level of due diligence with their applications. The mark of a successful company is not that it pretends problems don't exist in its code; it is that it acknowledges them, fixes them, and communicates them. To understand how to plan for technical debt resolution, you must first understand the Bucket Parable.

The Bucket Parable.. a digression (bear with me, it pertains)

Back in the olden-days, there was a small village that had a unique annual contest: the rocks-in-a-bucket contest. The goal was to see who among the competitors could get the most rocks in their bucket. Now, their buckets were all the same size; so, the only thing that differed was the size of the rocks. In the first year's contest, a strong man named "Hugo" won the prize because he had managed to stuff 2 massive rocks into his bucket. The next year, a strapping gentleman on a horse won with 3 rocks, each weighing five pounds. And the third year? Well, the next legendary player was Mike Van himself, and he was able to stuff over 400 rocks into his bucket. But how, you say? Well, while all the other folks were worrying about the big rocks, Mike Van paid attention to the little rocks too. So, whenever there was a gap between the big rocks, Mike Van filled it with little rocks. As he began to build up his bucket, he noticed that the massive number of little rocks also gave the bucket extra weight, making it more formidable. Well, Mike Van was proclaimed the winner, after which he went into retirement from rocks-in-a-bucket, moving into a cave to knit "delicates" for poodles. He apparently feels it is a growth investment zone.

The Bucket Parable and Scheduling Technical Debt

If you've ever been on a good development team, you know about feast and famine. There are times when you have so much work to do that you have to work stupid long hours, and there are other times when you play Call of Duty with your co-worker cuz there ain't squat. It is during the famine that the Bucket Parable comes into play.

If you are delivering product, meaning software, on a schedule, that means you have a specific period of time to deliver a specific set of functionality. Think of your timeline as the bucket in the Bucket Parable. Now, each of your major tasks is going to take up some of the time of one or more of your developers. These major tasks are your big rocks. However, what will your developers do when those tasks are complete?

After you've completed your Sonar analysis and have noted the weaker areas in your codebase, you should create a task for each item to fix. Then, in your tracking tool, allow your developers to choose the "little rocks" they can do while, or after, they do their big rocks. At the end of the development cycle, not only have you completed all of the intended features, you've also completed a large set of smaller tasks, including paying down your technical debt. Customers like this. It's good. Do it, then drink beer with friends.
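To make the analogy concrete, here is a minimal sketch of filling a release "bucket". It is only an illustration: the `Task` type, the `fill_bucket` function, the task names, and the hour estimates are all made up for this example, and in practice your developers would pick their own little rocks rather than have a function do it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    hours: int


def fill_bucket(capacity_hours, big_rocks, little_rocks):
    """Schedule every big rock first, then fit little rocks into the leftover time."""
    scheduled = list(big_rocks)
    remaining = capacity_hours - sum(task.hours for task in big_rocks)
    if remaining < 0:
        raise ValueError("the big rocks alone do not fit in the bucket")
    # Try the smallest debt tasks first so the gaps hold as many as possible.
    for task in sorted(little_rocks, key=lambda t: t.hours):
        if task.hours <= remaining:
            scheduled.append(task)
            remaining -= task.hours
    return scheduled, remaining


# Hypothetical release: 125 hours of capacity, two features, three debt items.
features = [Task("checkout flow", 60), Task("reporting page", 45)]
debt = [
    Task("split OrderManager", 16),
    Task("add missing unit tests", 6),
    Task("simplify price calculation", 10),
]

plan, slack = fill_bucket(125, features, debt)
print([task.name for task in plan], slack)
```

With these made-up numbers, both features fit, two of the three debt items slide into the gaps, and four hours are left over. The biggest debt item waits for the next bucket.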

Summary

Using Sonar together with the conceptual model of a "bucket-of-rocks", we are able to identify the most significant areas of risk associated with our technical debt and manage them with negligible impact on our customers' bottom line.

Improving Code Quality - LCOM4 and Cyclomatic Complexity

Overview

One of the reasons that open-source software is so solid is that we use some of the most cutting-edge open-source code analytic-tools available to ensure our software does what we intend in a bug-free manner.  In this post, I will talk about one tool we use, Sonar, and two specific metrics I've found useful in focusing the resolution of our technical debt.

Technical debt is defined as all the stuff you should have done that you didn't have time to do. For example, you may have left off a couple of unit-tests. Or, perhaps you decided not to refactor that 6,000 line Java class because it worked as a prototype and "if it ain't broke...". As your applications grow, the amount of technical debt will also grow. In many commercial and consulting settings, it may not be realistic to take a few months off from implementing new features to resolve the technical debt. In that same spirit of realism, the only time a team will realistically focus on resolving technical debt is when there is absolutely nothing else to do. You know, after all tasks are completed, the Kanban board's WIP column is clear, and your team is tired of playing Call of Duty.

This lack of priority and time to resolve technical debt creates a problem. Too much technical debt, and your codebase becomes unmaintainable. Your team is literally one production outage away from working 18-hour days, seven days a week, until the bug is found. You need a way to focus the resolution of your technical debt in order to reduce the risk of a production outage.

Sonar is a source-code analysis tool that has proven very useful in doing this. Specifically, there are two metrics that are very useful for targeting the work: Cyclomatic Complexity and LCOM4. Together, these two metrics give you an easy and inexpensive way to target your technical debt resolution.

Cyclomatic Complexity

The number of linearly independent paths through a method, class, or application is called "cyclomatic complexity". This metric was originally proposed by Thomas McCabe in 1976. It already has a write-up on Wikipedia that goes into great technical detail about what it is, how it is calculated, and even has some pretty pictures. Instead of completely describing it here, I'll hope you clicked the link and skimmed the wiki before continuing to read further.
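If you skipped the wiki, the short version is: start at one and add one for every decision point. Below is a rough sketch of that counting rule using Python's standard `ast` module. It is not how Sonar computes the metric; the set of node types I count, the function name `cyclomatic_complexity`, and the `shipping_cost` sample are assumptions made for illustration, and real tools differ in the details.

```python
import ast

# Node types treated as decision points in this sketch (an assumption;
# real analyzers make their own choices about what counts).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate complexity: 1 plus the number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds one extra branch per additional operand.
            complexity += len(node.values) - 1
    return complexity


# Hypothetical method to measure.
SAMPLE = '''
def shipping_cost(order):
    if order.total > 100 and order.domestic:
        return 0
    cost = 5
    for item in order.items:
        if item.heavy:
            cost += 2
    return cost
'''

print(cyclomatic_complexity(SAMPLE))  # prints 5
```

The sample scores 5: one for the method itself, plus the first `if`, the `and`, the `for`, and the inner `if`.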

Another way to think about cyclomatic complexity is as a measure of the difficulty a new developer will have understanding source-code. Usually, a cyclomatic complexity of 5 or less is considered good. Anything between 6 and 10 is considered moderately risky. And, any source-code with a complexity over 10 is considered poor.

In my experience, you should also take into consideration the difference between classes encapsulating your business algorithms, and those containing a large number of utility methods. A utility class may have 100 methods, each doing something very small. Your complexity for the utility class may be 100 or more, but when you look at the average complexity per method, it will be close to 1 because your methods will be very discrete. Now, compare that with a class containing methods which implement business logic. In the prototype phase of development, these will likely consist of multiple if statements, each with for loops and switches. This kind of class will have a complexity which grows with each "if" statement in your methods. While you can usually ignore large utility classes, you should absolutely refactor classes containing business logic with high complexity.
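The difference between the class total and the per-method average is easy to see with a little arithmetic. The sketch below uses invented numbers: the `summarize` helper, the method names, and the complexity values are all hypothetical, not output from Sonar.

```python
def summarize(method_complexities):
    """Return (class total, average per method) for a dict of method scores."""
    total = sum(method_complexities.values())
    average = total / len(method_complexities)
    return total, average


# Hypothetical utility class: 100 tiny methods with no branching at all.
utility = {f"helper_{i}": 1 for i in range(100)}

# Hypothetical business class: three methods stuffed with ifs and loops.
business = {"price_order": 14, "apply_discounts": 11, "validate": 5}

print(summarize(utility))   # prints (100, 1.0)
print(summarize(business))  # prints (30, 10.0)
```

Sorted by class total, the utility class looks three times worse. Sorted by average per method, the business class is clearly the one that needs attention.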

When using Cyclomatic Complexity to target technical debt resolution, identify those classes with the highest complexity that are not utility classes. Sonar provides this information in an easy format, and is fairly easy to set up and use!

LCOM4

LCOM stands for "Lack of Cohesion of Methods" and generically is a set of metrics that measure how the methods in a class interact with each other. The metric was revised a number of times until LCOM4 was introduced by Hitz & Montazeri. LCOM4 counts the connected components within a class. A "connected component" is a group of methods and class-scope attributes that are related to each other, either because one method calls another or because they use the same attribute. A score of 1 means everything in the class hangs together. LCOM4 suggests that only methods and attributes that rely on each other should be in a class.

If you think about it, from a maintainability standpoint, it is a lot easier to understand a class if all of the components of the class refer to each other. Think of this as a single unit of algorithmic activity. Consider if your hello world class contained methods and attributes that convert between Celsius and Fahrenheit in addition to methods to print out the words "Hello World". How much easier would it be for a new developer to maintain that code if the Celsius-to-Fahrenheit conversion code were in a different class than the hello-world code?
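Here is a small sketch of the counting behind LCOM4, applied to that hello-world example. It assumes you have already extracted, for each method, the set of attributes it touches and methods it calls; the `lcom4` function and the member names are hypothetical, and a real analyzer such as Sonar applies additional rules that are not reproduced here.

```python
def lcom4(members):
    """Count connected components among a class's methods.

    members maps each method name to the set of attribute names and
    method names it touches.
    """

    def linked(a, b):
        # Two methods are related if one calls the other or they share a member.
        return a in members[b] or b in members[a] or bool(members[a] & members[b])

    unseen = set(members)
    components = 0
    while unseen:
        components += 1
        stack = [unseen.pop()]
        # Walk everything reachable from this method before counting again.
        while stack:
            current = stack.pop()
            neighbours = {m for m in unseen if linked(current, m)}
            unseen -= neighbours
            stack.extend(neighbours)
    return components


# Hypothetical hello-world class that also converts temperatures.
hello_world = {
    "print_greeting": {"greeting"},
    "set_greeting": {"greeting"},
    "to_fahrenheit": {"scale", "offset"},
    "to_celsius": {"scale", "offset"},
}

# The same class after the conversion code moves out.
greeter_only = {
    "print_greeting": {"greeting"},
    "set_greeting": {"greeting"},
}

print(lcom4(hello_world))   # prints 2
print(lcom4(greeter_only))  # prints 1
```

The mixed class scores 2 because the greeting methods and the temperature methods never touch each other. Each component is a candidate for its own class.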

This may seem like a pretty simple example, but imagine a prototype composed of 500 classes each with 1000 lines or more, and with an average LCOM4 score over 5? Can you imagine being handed this codebase to maintain?  Better yet, can you imagine being asked, 4 years after you wrote the code, to come back and "upgrade" it to use the latest-and-greatest architecture?

Just as with complexity, you should also consider the difference between utility classes and classes containing business logic. The LCOM4 score of a utility class may be in the hundreds. While this is completely unacceptable for classes containing the implementation of business algorithms, it is completely acceptable for utility classes. When you use LCOM4 to identify classes to refactor, make sure that you don't focus on your utility classes.

Using Them Together

LCOM4 and Cyclomatic Complexity are related to each other, and by taking them both into account, you will be able to determine where to focus your technical debt resolution. The list below should help when you compare a given class's LCOM4 against its Cyclomatic Complexity. I am rating the refactoring on a scale of one to four, where one is the first priority for refactoring, and four is the lowest:
  • Complexity is high and LCOM4 is high: This class has a refactor priority of one. This class implements a number of business algorithms and the methods are very complex. Refactor each algorithm into its own class, then simplify the methods.
  • Complexity is high and LCOM4 is low: This class has a refactor priority of two. This class contains a low number of distinct business algorithms, but its methods are very complex.  First simplify the methods. Then, refactor the algorithms into their own classes. This order works here because the problem isn't the mixture of the algorithms but rather the complexity of the methods. By simplifying the methods, you will see that some of the methods were written to apply to both algorithms. Refactoring the methods first will result in an easier refactoring of algorithms.
  • Complexity is low and LCOM4 is high: This class has a refactor priority of three. This is a utility class.  There are a large number of methods which implement different business algorithms. There is no need to bother with this class.
  • Complexity is low and LCOM4 is low: This class has a refactor priority of four.  This is a well-written class. Don't touch it.  Consider giving the developer accolades, bonuses, or perhaps not stealing their lunch from the fridge on brown-bag Fridays, Marvin!!
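The four cases above boil down to a small lookup. The sketch below is one way to express it, under stated assumptions: "high" complexity means an average per method over 10 (the "poor" threshold mentioned earlier), "high" LCOM4 means anything over 1 (my own cutoff, not a rule from Sonar), and the class names and scores are invented.

```python
def refactor_priority(complexity, lcom4, complexity_limit=10, lcom4_limit=1):
    """Map a class's two scores to a refactor priority (1 = first, 4 = last)."""
    high_complexity = complexity > complexity_limit
    high_lcom4 = lcom4 > lcom4_limit
    if high_complexity and high_lcom4:
        return 1  # split the algorithms apart, then simplify the methods
    if high_complexity:
        return 2  # simplify the methods first, then split
    if high_lcom4:
        return 3  # most likely a utility class; leave it alone
    return 4      # well-written class; don't touch it


# Hypothetical classes: (average complexity per method, LCOM4).
classes = {
    "Greeter": (2, 1),
    "StringUtils": (3, 40),
    "OrderManager": (18, 6),
    "PriceCalculator": (25, 1),
}

ranked = sorted(classes, key=lambda name: refactor_priority(*classes[name]))
print(ranked)
# prints ['OrderManager', 'PriceCalculator', 'StringUtils', 'Greeter']
```

Tune the two limits to your own codebase. The point is the ordering, not the exact cutoffs.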

Summary

Technical debt represents the skeletons in a software development project's closet.  Tools like Sonar give you access to metrics like LCOM4 and cyclomatic complexity.  Don't be afraid though, the good thing about being able to see your skeletons is that you can fix them. In this case you can fix potential problems as they arise.