A Positive Root Cause Analysis

At the end of every development iteration we always do a retrospective and a root cause analysis meeting. In our retrospective we cover what went well, what did not go so well, and come up with action items for things to try in our next sprint. In our root cause analysis meetings we usually pick some problem that occurred, try to figure out why it happened, then understand how we could prevent this problem from happening again in the future. Our root cause analysis meetings are usually always on a negative topic. However, during our recent project we decided to change things up and do a root cause analysis of the following question:

Why did our last project go so well?

This may seem like a strange question to ask, but often in root cause analysis meetings we tend to focus only on the mistakes made during a project. When problems occur you need to identify the root causes of those problems to prevent them from happening again. While in a project that goes well it is just as important to recognize what we did right this time to ensure we understand “why”. What did we do different this time that worked? What were the crucial decisions made that kept the project on track? In our root cause analysis we found some key decisions, that in retrospect were important but at the time did not seem critical.

Failing fast when a decision starts to become problematic
For our project we thought we had a clear, straight forward design to work with from the beginning. However, after spending even just a day spiking some ideas our design immediately started to show cracks. Our design that on the surface looked simple, turned out to be far more complicated to implement than we had imagined. A large part of the reason for this problem was that our project was to make changes to an existing application about which no one on our project had any previous knowledge. We immediately had a team huddle, called a “Just In Time” design meeting and corrected our course. As a result, we lost a day, instead of a week or a month going down the wrong path.

Consulting experts early
When we started our project we knew of a few ways to accomplish our task, we had received suggestions that sounded fine, but we really were not sure if our approach was the best solution available. Fortunately we have some very experienced people in our company that have spent many years contracting and as a result have an incredible variety of experiences from which to draw upon. So we called a quick design meeting with one of these experts, showed them what we were thinking of doing and just picked their brain for ideas. It turned out our expert was able to come up with an approach to our problem that not only would allow us to complete the task within the time-line given to us by our business team, but at the same time would allow us to implement a cleaner solution.

Keeping code ownership high
We had no one person on the team that if they were sick for a day it would prevent a task from being completed. During our project we made sure every line of code was written with a pair (We always try to pair program every line of code) and switch pairs regularly. Because of this knowledge sharing we did not have any “Experts” on any one area of the application. We always had at least 2-3 members of the team who were knowledgeable enough on any given area of the application be able to bring another developer up to speed.

Break all stories into small tasks with a clear definition of “Done”
The stories we work on during a sprint always show the business value we are adding, but from a developers perspective there are usually multiple tasks required to complete each story. At the start of each sprint we held a task breakdown meeting for breaking each story down into a set of small tasks. Our team found that having a set of clearly defined tasks for each story was very important to keeping the project on track. With any story we receive from the business team their will be questions and as a result we found that doing this task breakdown meeting helped flush out many of those questions at the start of the sprint, as opposed to after development had already began, which in the past was usually what happened. It made it clear to our team lead and business analyst exactly what work was being done, who was doing it, what tasks had been completed, and what tasks had not yet been started. Also this gave our business analyst and team lead a better idea of when to expect demos since they could see how many tasks were remaining before a story would be completed.

Demo to business team often
We started our project doing a fairly poor job of demoing but this was corrected after one of our sprint retrospectives. Business analysts need to see the work being done and often will think of something that was missed, or see something that perhaps spawns another story. One of the easiest ways to ensure that what is being developed matches what the business team wants is to keep them in the loop and an excellent way to do that is through frequent demos.

Quick feedback from business team
During development there are always going to be cases in business logic spotted by developers that were missed during the initial planning phase. When a developer spots a missed case and brings it up to the business team, quick feedback from the business team can play a major role in keeping the project on schedule. In our last project this turn around time was often hours, if not shorter in most cases.

Keep the systems team involved from the start
The people who will be deploying the application and hosting it should be involved right from the start of the project. Your systems team has experienced many deployments and also know the pain of hosting a problematic application. Allowing the systems team, who will be responsible for the application after development has been completed, to be involved in key decisions can greatly improve the chances of a successful deployment and potentially reduce the cost of hosting the application.

Having your team do a positive root cause analysis can be very useful. It sometimes seems like after a problem in one sprint, we focus in the next sprint so much on improving in that one area that we sometimes slip in areas where we were previously doing well (for our team it was demos). On previous projects I have worked on, we definitely tried to follow each of these best practices outlined in our positive root cause analysis. However, since we moved to Agile almost two years ago, this was the first project where everything just “clicked”.