Refactoring to Patterns – Summary

During his talk at SD West, Joshua Kerievsky gave a brief overview of a number of refactoring patterns. If you want to read about these patterns in detail (and you should!), get a copy of the book Refactoring to Patterns. To help pique your interest in the subject, here is a quick overview of a few of the patterns discussed at the talk.

The Three Basic Strategies used in Refactoring

  • Extract method
  • Extract class
  • Move Method
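
As a rough illustration of the first strategy, Extract Method, here is a minimal Python sketch. The function names and the (name, price, quantity) tuples are invented for this example and do not come from the talk or the book.

```python
# Hypothetical example: names and data shapes are invented for illustration.

# Before: one function both calculates the total and formats the output.
def invoice_text_before(customer, lines):
    total = 0
    for _name, price, quantity in lines:
        total += price * quantity
    return f"Invoice for {customer}: {total:.2f}"


# After Extract Method: the calculation has its own name and can be
# tested and reused independently of the formatting.
def order_total(lines):
    return sum(price * quantity for _name, price, quantity in lines)


def invoice_text_after(customer, lines):
    return f"Invoice for {customer}: {order_total(lines):.2f}"
```

The behaviour is unchanged; only the structure differs, which is the point of a refactoring.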

Piecemeal Refactoring Pattern

  • Divide and Conquer
  • Split problem in half
  • Work at solving smaller problems

Narrowed Change Pattern

  • Extract methods to determine where the change needs to be
  • So refactor 3 methods, not 50
  • Change only that area, make the change isolated to only a few places
  • Narrow the change down to a smaller number of change points

Gradual Cutover Refactoring

  • Move from A to B gradually
  • Start with piece 1, then 2, then 3, and so on up to 30
  • The idea is that you can ship to production with 10 of 30 pieces of code switched over, while staying in a releasable state the entire time
  • This type of refactoring can be done alongside other work without affecting the release schedule for other development work

Parallel Refactoring

  • Write the refactored code alongside the existing code
    • THEN decide later on how and when to finally cut over to the new code
  • This is a very conservative safe approach to refactoring.
  • It also gives management the ability to make the decision whether or not to cut over to the new code in a release
  • So if it is not tested properly by the time of the release, just don’t switch over; there is no need to roll back changes or take any other steps
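
One way to picture Parallel Refactoring is a switch that chooses between the old and new implementations. The following is only a sketch under assumptions: the flag, the function names, and the pricing logic are all hypothetical and were not shown in the talk.

```python
# Hypothetical sketch: the flag and all function names are assumptions.

USE_NEW_PRICING = False  # flip to True only once the new code is trusted


def price_legacy(amount_cents, percent_off):
    # Existing code, left untouched while the new version is built.
    discount = 0
    if percent_off > 0:
        discount = amount_cents * percent_off // 100
    result = amount_cents - discount
    return result


def price_refactored(amount_cents, percent_off):
    # New code written alongside the old one.
    return amount_cents - amount_cents * percent_off // 100


def price(amount_cents, percent_off, use_new=USE_NEW_PRICING):
    # Callers go through this single entry point, so cutting over
    # (or deciding not to) is a one-line change.
    impl = price_refactored if use_new else price_legacy
    return impl(amount_cents, percent_off)
```

Because both versions exist side by side, they can be compared on the same inputs before anyone decides to cut over.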

Refactoring – Convincing Managers

While at SD WEST I was lucky enough to attend a talk by Joshua Kerievsky on refactoring. The following is just a summary of my notes from the talk.

The 5 stages of a software company (They all make the same mistakes)

  • V1.0 – Developers are happy. The team can develop software quickly. No legacy code to read or fix, no maintenance or scalability issues to deal with (yet). All work is on new features so managers see the value in all the work and are happy.
  • V2.0 – Still pumping out new features just as fast as in V1.0. Starting to notice some technical debt. However, at this point it is small enough to not slow development down or be that big of a concern.
  • V3.0 – More focus on just new features. Same focus as in V1.0 and V2.0. However, now the maintenance bugs are starting to flood in. Stability of the product is decreasing. Customers are getting frustrated. Management reacts by adding more support people and creates a team of maintenance developers.
  • V4.0 – The really great talent starts leaving the company out of frustration. Management starts to look at outsourcing support and maintenance tasks as a solution.
  • V5.0 – Finally, managers call in experts to help fix the issue, which means a great deal of refactoring work. This is paying a great deal of interest on technical debt that could have been avoided if that debt had been dealt with on a regular basis (like paying your credit card bill monthly, not once every year or two).

How to get Managers to Understand the Cost of Legacy Code

Explain it in terms of poorly written driving instructions:

  • You spend a lot of time reading the instructions trying to understand them, this is a waste of time
  • So you then try and ask other people if they understand the instructions, each having a different interpretation
  • Now you try and drive there, how many times do you get lost, how much longer does the drive take?
  • If only the instructions were clear and concise, imagine how much time could have been saved

Now imagine how hard it is for a new developer to understand an existing system that is in a similar state as these driving instructions. The root problem is many different styles from many different developers. As a result code gets more and more confusing over time. Continuous refactoring can help mitigate this problem. Refactoring is the ongoing process of making software simple and concise.

GUI Bloopers

Over the last few years I have spent a lot of time doing web development work. I am definitely no expert on UI design, so while at SD West I decided to attend a UI talk given by Jeff Johnson, the author of the book GUI Bloopers. I recommend Jeff’s GUI Bloopers series of books; they are an excellent reference to help you avoid making common UI mistakes. The following is a summary of the main GUI bloopers Jeff stressed in his talk and his book.

Grayed out input fields

Don’t use input fields that are grayed out as a way of displaying text. Always use a control that is never editable such as a label. If you use an editable control that is grayed out, you will just make the user think “How do I enable that functionality?”.

Negative Checkboxes

Never use a checkbox being checked as a way to say “Do not do the following”. You need to be consistent. Checking means do something, unchecking means do not.

Dynamic Menus

Do not have menus where the content of the drop-down menu changes based on the task the user is currently doing. If you want to make certain items inaccessible at certain times, gray them out so the user knows they are still there, just not accessible at this time or in this context. This matters because users tend to remember where something is, but not always what it is called. Users often scan menus and want to go back to where they remember seeing a feature before. So do not automatically add and remove options. Dynamic menus undermine a basic learning strategy.

Popup Not Identifying Itself

The function of the popup is not explained. The user may lose track of which functionality the popup relates to.

Pages Not Identifying Themselves

A common example is a menu that does not highlight the menu tab to show the user which page they are currently viewing. Pages without clear, easy-to-spot titles can also cause this problem.

Distracting Off-Path Links and Buttons

Don’t distract the user from their current task by sending them off in another direction mid stream. Use the pattern “Process Funnel”. This means that when a user tells you what they want to do, what their goal is, put them in a funnel with minimal distraction until they have accomplished their goal.

Too Many Levels of Dialogs and Menus

Look for areas in your interface with too much depth and see how it can be flattened. One approach is to optimize for the most commonly used paths.

Inconsistent Text

The same functionality is called different things in different areas. This makes it very difficult for a user to learn an application.

Rule:
Same Word, Same Thing
Different Word, Different Thing

Misleading Text (e.g., Erroneous Messages)

A common example is a misleading error message, such as an error message on the page telling the user the “Username entered is invalid” when it is actually the login server that is down and not responding. The problem here is the user will keep retrying because they have not been given a clear message.
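
To make the login example concrete, here is a small Python sketch. The exception class, the `authenticate` callable, and the message wording are all hypothetical; the only point is that a server outage and a bad username produce different messages.

```python
class AuthServerUnavailable(Exception):
    """Raised by the (hypothetical) auth client when the server is down."""


def login_message(authenticate, username, password):
    """Return the text to show the user after a login attempt."""
    try:
        ok = authenticate(username, password)
    except AuthServerUnavailable:
        # Say what actually went wrong so the user stops retrying.
        return "The login service is temporarily unavailable. Please try again later."
    if not ok:
        return "The username or password is incorrect."
    return "Welcome back!"
```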

Easily Missed Information

For example, an error message may be placed on the page far away from where the user entered the data that caused the error, which is where the user is currently focusing. Often users will not associate the two.

Burying Relevant Information

This makes it difficult for the user to understand what information is needed to complete their task amongst all of the information being shown. Also avoid too much repetition in naming options such as options that all start with the same name:

How to …
How to ….
How to ….

For Example:
In the Windows start menu under Microsoft Office, shorten the title of each app to be just the name of the application:
Microsoft Office Word 2007
Microsoft Office Excel 2007
Microsoft Office Outlook 2007
Microsoft Office PowerPoint 2007

to:
Word
Excel
Outlook
PowerPoint

Notice how much easier it is to scan the menu now. In an application where the user is seeing it for the first time, you want to make it as easy as possible for the user to quickly scan to find the option they want.

Colour Differences that are Too Subtle

Graphic designers often create this issue by opting for colour patterns that are more stylish but end up being too subtle for the user. If a colour difference is the only way to tell something apart, then the design of the UI element needs to be reconsidered. You must also take into consideration people who have colour blindness or even bad monitors. One good test is to print UIs in black and white. Colour should only be used as a redundant emphasis alongside another visual hint.

Instructions Disappear Too Soon

Giving instructions in a dialog that then must be followed after the dialog has been closed should always be avoided.

Dialog Boxes that Trap Users

For example, a dialog with only an “OK” button that informs you of a problem yet does not allow you to take a step back. It warns you that something you probably don’t want to happen is about to happen, but does not give you an option to not do it. The popup just tells you it is about to happen and waits for you to say “Sure, alright, I guess I have no choice”. The second way to trap a user in a dialog is to show a message that does not make it clear which options on the dialog mean what, so there is no clear relationship between the text and the options. There are “Yes” and “No” options, but the text does not clearly explain which option will do what you want.

So what to do?

Avoid anarchic development.

  • No Design
  • No UI Standards
  • No oversight

User centered design and Agile are definitely compatible.

Agile and Object Oriented Principles

I had the opportunity to attend a talk on Agile by Robert C. Martin while I was at SD West. The following is a quick summary of the main points stressed in his talk.

In order for a team to really adopt Agile, the project must be structured in a way that allows small, thin vertical slices of the application to be released independently of any other feature in the application. If we have a story for feature A, then we should be able to implement the story for feature A, test feature A, and release feature A without affecting feature B. If this is not possible in your current project, then you must work toward it. This division is extremely important in working towards an Agile system that has few bugs and is quick to QA.

Example Scenario

Suppose I make a change to feature A. It is not just feature A that QA needs to look at; feature B now needs a complete regression test as well. But if QA is unaware that feature B depends on code that was changed for feature A, that testing may get missed. Because we can’t release feature A independently of feature B, we have to release both together. The result is production bugs in feature B that are costly to fix, plus maintenance tasks that take up time in the next sprint. This is time that could have been spent on new development work that adds business value to the product. Those bug fixes in feature B will also require testing time in the next sprint, possibly including regression testing of feature A, and may in turn break feature C, and so on.

So, as was stressed in the talk, you should always be working towards small applications that can be independently released. If not, you may be stuck in this loop.

Testing Legacy Code

While at SD West I had the chance to attend a talk by Elliotte Rusty Harold who has written numerous books such as Refactoring HTML just to give an example. This talk was a tutorial on writing tests for existing production legacy code. Elliotte has a very practical approach to dealing with adding tests to existing legacy code.

The benefits of TDD are clear and 100% code coverage is always the goal. However, with existing legacy code you need to take a much different approach. When optimizing or refactoring existing legacy code, you can apply TDD and it works very well, but you must give up a few things:

  1. 100% Code Coverage
  2. Unit Tests

Focus on Broad Tests

Since you are writing tests for a system that has already been running in production for quite some time, you can assume that for the most part it is offering what the users want, or there would already have been complaints and issues logged (there will be bugs, but overall it is functioning). So don’t focus on 100% code coverage or unit tests. Instead, focus on writing broad tests that confirm that functionality the user relies on is working correctly.

Yes, when you make changes and a broad test fails, it will not immediately show you the cause of the failure the way a typical unit test will. However, it will still show you that you have broken existing functionality, and this is a big improvement over having no tests at all. With a small number of tests we are able to protect at least some existing functionality. The key is that, since we usually have a very limited amount of time to devote to writing tests for legacy code, it is more important to get broad coverage than targeted coverage.

Just like when using TDD with new functionality, when making any change at all in legacy code, ensure the tests are written BEFORE changing anything, but still in this case broad tests are the place to start. This is not advocating ignoring full coverage for your new code, only the legacy code. If you have time to write full unit tests for existing legacy code, great, but most developers do not. The key is to make sure your broad tests are in place FIRST.

When dealing with legacy code, the fastest gains in code coverage and also the biggest gains in reducing bugs come from starting your tests at the highest level possible. In writing these tests you do not think about the structure of the code, but rather just “What does this application do” and write tests to confirm that functionality and protect it.

Add a New Test and It Fails

In TDD, when writing new code, the moment a test fails you fix the code until your test passes. However, when writing tests for existing legacy code, if you add a new test to an area that is not currently covered by tests and it fails, but you are confident the test should pass, do not stop and fix the code to get the test to pass. This is a common mistake. You are far better off logging an issue for the bug and continuing to write more tests. At this stage you do not have enough tests in place to ensure your bug fix will not break another area of the application. So write more tests, leave this test failing, and log an issue for fixing this bug.
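
If a permanently red test would block the rest of the suite, one possible variation is to mark it as a known failure. This goes slightly beyond what was said in the talk ("leave this test failing"), so treat it as an assumption. The sketch uses Python's `unittest.expectedFailure`; the legacy function and the issue ID are placeholders.

```python
import unittest


def legacy_round_price(value):
    # Stand-in for legacy code with a suspected bug: it truncates
    # instead of rounding.
    return int(value)


class LegacyPriceTest(unittest.TestCase):
    @unittest.expectedFailure  # logged as an issue (placeholder ID: BUG-0000)
    def test_rounds_to_nearest(self):
        # Documents the bug without fixing it yet.
        self.assertEqual(legacy_round_price(2.6), 3)

    def test_whole_numbers_unchanged(self):
        self.assertEqual(legacy_round_price(2.0), 2)
```

Once enough tests are in place and the bug is fixed, the decorator is removed and the test becomes a normal regression test.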

Basic Method to Testing Legacy Code

So to get started with writing tests for legacy code you:

  1. Look at the existing application
  2. See what functionality it offers to the user
  3. Write tests to cover that high level functionality
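
The three steps above can be sketched as a single broad test in Python. The `legacy_report` function is a made-up stand-in for an existing entry point; in a real system you would call whatever the user actually triggers.

```python
import unittest


def legacy_report(orders):
    # Hypothetical stand-in for an existing entry point; imagine
    # tangled internals behind it.
    lines = []
    total = 0
    for customer, amount in orders:
        lines.append(f"{customer}: {amount}")
        total += amount
    lines.append(f"TOTAL: {total}")
    return "\n".join(lines)


class ReportBroadTest(unittest.TestCase):
    def test_report_lists_orders_and_total(self):
        # Assert only on what the user sees, not on internal structure.
        report = legacy_report([("Ann", 10), ("Bob", 5)])
        self.assertEqual(report, "Ann: 10\nBob: 5\nTOTAL: 15")
```

A failure here will not say which internal step broke, but it will say that something the user relies on has changed.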

Top Down Approach

When writing tests for legacy code, you want to go in the order of what will give you the most gain for the least amount of effort, so go in the following order:

  • Tests for each Package
  • Tests for each Class
  • Tests for each Method (with legacy code, will probably never get this far)
  • Tests for each Line (100% code coverage)

Once again the Michael Feathers book Working Effectively with Legacy Code was highly recommended. When I read this book I enjoyed it a great deal. At the conference this book was recommended at most talks I attended and is highly regarded as a reference for refactoring and getting code into a testable state.

How to be Agile with Fragile Legacy Code

I attended a talk by Vandana Shah while at SD West 2008 that had a different perspective on transitioning to Agile. Most discussions on Agile focus on teams in the initial stages of a project, not after several years of code has been written. In the short term the costs of adding legacy code appear cheap, but in the long term they become increasingly expensive in terms of maintenance and QA costs. The basic approach to reducing this cost later on is to invest in your code now: reduce complexity, increase test coverage, introduce best practices (coding standards for both code and unit tests), and as a result reduce the number of issues that make it to the QA cycle and production.

Unit Testing (Adding tests to legacy code)

So where do you begin? Start by adding tests to cover existing functionality in your product. The product is stable and in production, so start covering it with tests. There are two types of tests:

  1. High Level – these are the tests that ensure the system does what it is supposed to do and protect the business logic. These tests allow us to confirm that the functionality the user expects from the system has not been broken by any additional changes. These tests cover the bugs that are most likely to occur.
  2. Low Level – these are the tests that cover individual pieces of logic, the small units that make up the high level tests

You should start with high level tests, since these are the tests that offer the most business value. It is important to capture the overall functionality of the system with tests first when dealing with legacy code, since these are the tests that will hinder future refactoring the least.

Teams need training on Unit Testing and often they demand it (especially when it comes to refactoring existing legacy code to get it into a testable state). Document it on the wiki and update it regularly!!!!! Providing this training early on prevents a mess that needs to be cleaned up later.

Switching to Agile

Switching an existing team with a legacy application over to Agile requires a different approach than starting Agile with a new team working on a new application. It is a mistake to take the “All or nothing” approach to SCRUM, the “We must do it all now” approach. Agile in this type of environment should be introduced slowly and incrementally: each new process is brought in one at a time and refined in the retrospective, and as that process stabilizes another is introduced. So the cycle is “Start small, test, refine, add more”. This approach may feel like a slower change, but it can often make the transition significantly easier. Switching a team’s entire process all at once can lead to frustration amongst the team and loosely followed processes.

Problems with Large Projects (Code size)

Often developers will change their development process just based on the size of the existing application. For example, you have a very large project, it is very slow in your IDE tool, you want to use a refactoring tool, but are unable to because of the size of your project and how slow the tool runs on the project. So what happens? The team does not use a refactoring tool. This is a mistake that misses focusing on the core issue: The project is too big, it needs to be broken up. It is also a hindrance to TDD. Quick feedback is extremely important for TDD, long build times and slow running unit tests end up hurting this process. Developers do not want to wait for the tests to finish, so will write more and test later.

Regular Software Cleanup

Make time in each sprint to not just write unit tests, but to also remove dead code and duplicate code as part of the step of adding those unit tests.

Testing and QA

Allow automation to happen over time, DO NOT force it. Establish the right processes first. Every time those processes get ironed out, QA will see common areas that can be easily automated, and then you can move in that direction.

Piggy Backing

One approach discussed for introducing Agile methodologies into an existing team was called “Piggy Backing”. Do not introduce a new process all at once. Introduce new items to the existing process slowly over time, piggy backing on what the team is already doing. The team will evolve to where you want them to be and often get there more quickly, with far less pain and less resistance from the team.

Principles and Practices of SCRUM

Mark and I attended a talk called “Principles and Practices of SCRUM” this afternoon, presented by Rob Myers from NetObjectives.

When a team commits to completing a set of stories for a sprint, it is up to the Product Owners to keep a constant eye on the Burn Down or Burn Up charts from very early in the sprint and check the velocity. If the velocity shows that the team is not on track to finish all the stories that have been put in the sprint, it is extremely important to pull stories early in the sprint and not late. When a team feels they have more work than they can handle, they will rush to try and finish it all by the end of the sprint. This is bad because this is where many bugs get created. Pulling stories early on is far better for the team. It is better to undershoot and add extra items at the end of the sprint than to overshoot and not complete items or have to pull items at the end of the sprint.
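
As a back-of-the-envelope version of that check, the Python sketch below projects how many points will be left over if the team keeps its current pace. The linear projection and all names are assumptions made for illustration; the talk did not prescribe a formula.

```python
def projected_remaining(committed_points, completed_points, days_elapsed, sprint_days):
    """Points expected to be left unfinished at the current pace.

    Assumes (hypothetically) that the pace so far continues unchanged.
    """
    if days_elapsed <= 0:
        raise ValueError("need at least one elapsed day to measure pace")
    pace = completed_points / days_elapsed  # points per day so far
    projected_done = pace * sprint_days
    return max(0.0, committed_points - projected_done)
```

For example, with 40 points committed and 9 completed after 3 days of a 10-day sprint, the pace is 3 points per day, so about 10 points would be left over. That is the signal to pull stories now rather than in the last days of the sprint.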

I thought this discussion was interesting and made a lot of sense.

Recommended Refactoring Books

During the talks I have attended this week there have been some books that have been mentioned over and over again.

So far the number one recommended book, recommended even by presenters who have themselves published excellent books on design and refactoring, has been:

“Working Effectively With Legacy Code” by Michael Feathers

Also, other books mentioned repeatedly have been:

“Refactoring: Improving the Design of Existing Code” by Martin Fowler, Kent Beck, John Brant, and William Opdyke

“Refactoring to Patterns” by Joshua Kerievsky

Mastering Design Patterns

While attending the SD West 2008 conference I made sure not to miss any talks by Robert C. Martin. This talk was excellent. The room was packed, with standing room only available at the back. Robert C. Martin went through a number of common design patterns. There is no point in my trying to summarize them all; just get the book. In the talk we went over the following design patterns:

Design Patterns

  • Adapter
  • Strategy
  • Template
  • State (two level, and three level)
  • Observer
  • Model View Controller
  • Model Presenter
  • Command
  • Actor Model

This was an excellent talk.

http://www.objectmentor.com

TDD

During the talk on TDD (Test Driven Development) the loop of the TDD cycle was discussed. I realize some of you have already read about this topic and are familiar with this cycle, but I thought I would mention it anyway.

The TDD Loop

  1. Write a small test for new functionality
  2. Write just enough code for the new functionality to pass the test
  3. Confirm that the revised system passes all tests
  4. Refactor to remove “code smells”
  5. Confirm that the revised system still passes all tests

So the question came up during the talk: “How do you know how many tests to write?”. The answer to this question was “Write as many tests as you feel are necessary to feel confident that the code functions as it is supposed to”. Developers are very good at writing positive tests (tests that show the code does what it is supposed to). However, developers need to learn to get into the mindset of writing negative tests (tests that exercise conditions that should not happen but might).
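
To show the difference between positive and negative tests, here is a small Python sketch using `unittest`. The `parse_age` function and its 0 to 150 range are invented for the example.

```python
import unittest


def parse_age(text):
    """Return the age as an int; reject input that is not a sensible age."""
    value = int(text)  # raises ValueError for non-numeric text
    if value < 0 or value > 150:
        raise ValueError(f"age out of range: {value}")
    return value


class ParseAgeTest(unittest.TestCase):
    def test_valid_age(self):
        # Positive test: the code does what it is supposed to do.
        self.assertEqual(parse_age("42"), 42)

    def test_rejects_non_numeric(self):
        # Negative test: input that should not happen but might.
        with self.assertRaises(ValueError):
            parse_age("forty-two")

    def test_rejects_negative(self):
        # Negative test: numeric but outside the allowed range.
        with self.assertRaises(ValueError):
            parse_age("-1")
```

In the TDD loop, each of these tests would be written first, seen to fail, and then made to pass with just enough code.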

TDD is extremely useful and important. Part of the problem with adoption, as was pointed out, is that if developers are just told to write unit tests and use TDD but are never given a good explanation of “Why”, it can cause resistance to moving in this direction.