Thursday, July 22, 2010

Train Model of Software Development

Traditionally, software development has been considered an exercise in managing a classic three-way trade-off between features, schedule, and quality, which looks something like this:

[Diagram: the features/schedule/quality trade-off triangle]

The basic idea is that there is always a tension pulling in opposite directions in each of these dimensions. If you add more features you need to lower the quality or deliver later. If you compress the schedule you will need to reduce the quality or the number of features.

However, the relationship is not that simple. Because software development is an exercise in constant refinement of unknowns, the three-way trade-off ignores the fact that these decisions are made in the face of imprecise information. As a result there is a tendency to underestimate the effects of choosing one of these trade-offs. In particular, trading off quality almost always leads to schedule slip, due to the accumulation of technical debt.

Agile development works in part because there is a recognition that this three-way trade-off model is fundamentally flawed. In a classic Agile project (e.g. Scrum) you have a fixed schedule (the iteration), a fixed level of quality (passes the acceptance tests), and it is only the features that change.

In a pure Scrum project, you would end every sprint with a release candidate. However, in practice the length of iterations does not accurately reflect the ebb and flow of a traditional software lifecycle, and as a result there has been a move towards a higher-level release cycle that layers on top of this.

The analogy used is to think of software releases like a train. A train leaves the station at regular intervals following a timetable published in advance. If someone/something misses the train then another one will come along soon. Releases are planned at regular intervals (e.g. quarterly, every six months), and features are sized to fit that schedule. If a feature looks like it won't make it, it will be dropped and appear in the next release.

This model has been used very successfully by a number of projects (e.g. Eclipse, Ubuntu), and is well suited to software products. The customers and the Product Manager (PM) know that there is going to be a certain release at a certain time and can plan around it. If the PM has a high priority feature/bug fix, they can tell the customer with some degree of certainty when that will appear. Marketing knows in advance when a product will be released (although they don't necessarily know in advance exactly what will be in it). Developers have a clear goal on what to deliver, and performance can be easily measured on a regular basis.

Each release has a lifecycle that gets it out on the release train. While the specifics can vary, there are some basic phases, which I have outlined below:

  • Envisioning: This is the initial planning and R&D phase that begins each major release cycle. Features are planned for the release and broken down into user stories; UI mockups are developed and reviewed; any design or prototyping that needs to be done is covered.
  • Feature development: This is the standard sprint/iteration where features are developed and delivered. Care must be taken to ensure that this development is sustainable, meaning that the testing effort can keep up with the development pace. Features should not be considered finished until they pass all their User Acceptance Tests, which should be written at the time the story/feature is defined (and ideally automated).
  • Hardening: This is the end phase, when much more rigorous testing is performed, release engineering and packaging are done, and bugs are fixed to ready the product for release. If the feature development phase is done properly the bug fixing should be minimal; however, there will always be some, as testing becomes less confirming and more adversarial.

For example, suppose we want to develop a train schedule that allows for about two major releases a year, with a minor release in the off quarters if required. This would mean splitting the team into maintenance and research streams for about one quarter before merging development back together. The train would look something like this:

[Diagram: the quarterly release-train timeline]

At the end of every major release the team splits into two, with most people working on maintenance (bug fixes that didn't make the previous train, minor refactoring, automating tests, etc.), while a smaller group moves into the envisioning phase for the next major release. In the second sprint, the minor release team moves on to minor features before hardening for a dot release in the third sprint. Only a fairly modest amount of work can be done in that time, so the focus is on improving quality and delivering features that missed the last train.

Meanwhile, envisioning will take about 4-8 weeks, after which work on major features can begin. Depending on resource needs there can be some drift between the teams in this phase: if the needs of the minor release are modest, more work can be done on starting the major features; if the minor release needs more polish or testing, resources can move over to it. At the end of the quarter the minor release should go out. Alternatively, it may be decided that there is no pressing urgency to do a minor release, and these fixes can simply go out in the next major release.

Now the major feature development begins in earnest. It is critically important at this stage that major features are correctly prioritized and that development proceeds at a sustainable pace. The goal is to deliver only finished features, so that means not starting new features until others are finished, reducing the risk of an unmanageably large number of bugs in the hardening phase.

There is a fair degree of flexibility in this release process. For example, a minor release can be all maintenance with no additional features, or not released at all. Envisioning may take only a short amount of time, leaving more time for feature development, or it may take a full quarter, meaning only a minor amount of feature development can be done. I have used a quarter as the unit of release, but six months, or an odd interval (say seven or nine months), may be more appropriate.

The only things that need to stay consistent are the release points. The reason for this is that if the tidal flow of the project stays consistent, there will be much better predictability, a lot more happiness, and much better results for customers and the PM.

The train model has been well proven. I would argue that moving to a predictable process like this, rather than the chaos that attends most software development, is a sign of maturity in a team. PM needs to be prepared to accept that major releases will not have everything they want, but the train model allows them to get the things that are important out to customers sooner. I would argue that this delivers considerably more value to customers than shipping them features late that they may or may not need.

It also allows a much more agile response to the inevitable customer feedback you get as a result of releasing your product. If you can only do a major release once every 18 or 24 months then customers will die of boredom waiting for new features. If you can do it twice a year, with minor releases between for those things that are important but can't be delivered on time, your customers will love you.

Saturday, June 5, 2010

Preventing Internet Explorer from caching AJAX requests when using GWT RequestBuilder

I recently spent the best part of a few days trying to work out why some GWT HTTP code I had written was not working (well, specifically, why it was not working on Internet Explorer). The code was talking to an existing server API that used an HTTP GET for a handshake to establish a session-id with the server, and thereafter used HTTP POSTs to retrieve data. The problem occurred when the client attempted to re-handshake: it would always get back the same session-id, whereas the server expected to hand out a different one. It was all very confusing, until I did some HTTP-level tracing and noticed that after the first GET call, the server never saw another GET request.

It turns out that, for reasons best known to itself, Internet Explorer aggressively caches GET requests made through the XMLHttpRequest object. When writing an AJAX application this is particularly painful if you want your results to be dynamic.

If you can modify the server, the best approach is to ensure that the server correctly sets the Cache-Control: no-cache header, but this is not always possible if you are working with a legacy backend.
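
For example, a minimal sketch assuming a Java servlet backend (HandshakeServlet is a hypothetical name):

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class HandshakeServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        // Tell the browser (and any intermediaries) not to cache this response.
        resp.setHeader("Cache-Control", "no-cache, no-store");
        resp.setHeader("Pragma", "no-cache");   // for HTTP/1.0 caches
        resp.setDateHeader("Expires", 0);       // already expired
        resp.getWriter().write("...");          // the dynamic response body
    }
}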


Another solution is to use an HTTP POST instead, but again that's a problem if you are working with existing code, or if you are writing to a REST API and want to keep the GET = Read and POST = Create idiom.
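
In GWT that would look something like this (a sketch; uri, data and callback are stand-ins):

// The same request issued as a POST, which Internet Explorer will not
// serve from its cache.
RequestBuilder builder = new RequestBuilder(RequestBuilder.POST, uri);
builder.sendRequest(data, callback);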


If you have to fix this on the client side, your choices narrow.

A commonly suggested approach is to generate a random request parameter and add it to your request, either by appending ?requestId=r4nd0m-v4lu3 to the URI or, if your request already includes parameters, appending &requestId=r4nd0m-v4lu3. While this means you won't see the cached responses, the browser still puts these items in the cache, and if you are making a large number of these requests this will start to evict things that you actually want to be in there! It may also be a problem if your application enforces strict checking of the passed-in parameters.
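
The idiom looks something like this (a sketch; requestId is an arbitrary, hypothetical parameter name, and the current time is one easy source of a changing value):

String cacheBuster = "requestId=" + System.currentTimeMillis();
String noCacheUri = uri + (uri.contains("?") ? "&" : "?") + cacheBuster;
RequestBuilder builder = new RequestBuilder(RequestBuilder.GET, noCacheUri);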

The best approach I have found is to use the If-Modified-Since header with the date set to the Epoch (1970-01-01 00:00:00 GMT). The browser will not cache the result, and you will not have to worry about clashing with existing parameters. The nice thing about this approach is that it is a general solution: you can put it in a library and then not worry about problems with the backend.

To do this when using the Google Web Toolkit (GWT) RequestBuilder interface it is simply a matter of using the setHeader method as follows:

void nonCachingHttpGet(String uri, String data, RequestCallback callback)
        throws RequestException {
    RequestBuilder builder = new RequestBuilder(RequestBuilder.GET, uri);
    // This header is required to force Internet Explorer to not cache values from
    // the GET response.
    builder.setHeader("If-Modified-Since", "01 Jan 1970 00:00:00 GMT");
    builder.setHeader("Content-type", "application/x-www-form-urlencoded");
    builder.setRequestData(data);  // pass the request data through to send()
    builder.setCallback(callback);
    builder.send();
}

Monday, May 24, 2010

Writing unit tests when using GWT's Static String Internationalization (I18n) feature

The Google Web Toolkit (GWT) has a fairly simple infrastructure for managing internationalization. While there are a number of different options, the easiest one to use is called Static String Internationalization. The basic idea is that you create a properties file for each language, and GWT's deferred binding process creates an instance of an interface, selected according to the locale. It works something like this:

  1. Create a properties file (e.g. FooConstants.properties). This will contain definitions of the form bar=A string that I want to i18n.
  2. Use the i18nCreator script to generate the Interface definition: i18nCreator -eclipse Foo com.example.foo.client.FooConstants
  3. In your code call GWT.create() to create an instance, e.g.:
    public static class MessageDisplayer {
        public MessageDisplayer(Alerter alerter) {
            FooConstants constants = (FooConstants) GWT.create(FooConstants.class);
            alerter.alert(constants.bar());
        }
    }
    
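Putting steps 1 and 2 together, the properties file and the generated interface look something like this. This is a hand-written sketch: note that the ConstantsMocker further down reads the @DefaultMessage annotation, which GWT defines on Messages-style interfaces, so the sketch extends Messages; a Constants-style interface would carry @DefaultStringValue instead.

# FooConstants.properties
bar=A string that I want to i18n

// FooConstants.java (sketch)
import com.google.gwt.i18n.client.Messages;

public interface FooConstants extends Messages {
    @DefaultMessage("A string that I want to i18n")
    String bar();
}
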
The problem with this is that calling GWT.create from within your code makes it difficult to unit test. If your code calls it directly then you have to write your unit tests using GWT's JUnit 3 hack (GWTTestCase). Running unit tests this way is very slow, and I find it much better to factor out as much GWT-specific code as possible so that you can write normal, boring tests (for example using JUnit 4, or mocks, or whatever else GWT tests don't support that takes your fancy). The catch is that if any of your code relies on an instance of these Constants/Messages interfaces then you are screwed. The solution is to use a dynamic proxy (java.lang.reflect.Proxy) to generate an instance of the interface and then inject it into your code. First we need to refactor our class to have the constants interface injected, e.g.:
public static class MessageDisplayer {
    public MessageDisplayer(Alerter alerter, FooConstants constants) {
        alerter.alert(constants.bar());
    }
}
Next we need to write some code to generate the constants. When the i18nCreator generates the interface it helpfully annotates it with the default text it needs. We can exploit this to generate an instance for testing:
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

import com.google.gwt.i18n.client.Messages.DefaultMessage;

public class ConstantsMocker implements InvocationHandler {

    public static final String NO_DEFAULT_MESSAGE = "[No Default Message Defined]";

    @SuppressWarnings("unchecked")
    public static <T> T get(Class<? extends T> i18nInterface) {
        return (T) Proxy.newProxyInstance(i18nInterface.getClassLoader(),
                new Class<?>[] { i18nInterface },
                new ConstantsMocker());
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args)
            throws Throwable {
        // Answer every method call with the default text from the annotation.
        DefaultMessage message = method.getAnnotation(DefaultMessage.class);
        if (message == null) {
            return NO_DEFAULT_MESSAGE;
        }
        return message.value();
    }
}
Now when we write our test we can pass in an instance, e.g.:
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import java.util.concurrent.atomic.AtomicBoolean;

import org.junit.Test;

@Test
public void whenConstructorIsCalledAlerterAlertIsCalled() {
    final AtomicBoolean wasAlerted = new AtomicBoolean(false);
    FooConstants fooConstants = ConstantsMocker.get(FooConstants.class);
    final String expected = fooConstants.bar();
    Alerter alerter = new Alerter() {
        public void alert(String msg) {
            assertEquals(expected, msg);
            wasAlerted.set(true);
        }
    };
    new MessageDisplayer(alerter, fooConstants);
    assertTrue(wasAlerted.get());
}
Whilst this is a toy example, it can be a very useful technique if used carefully. Note that regular GWT code just injects the instance from GWT.create() as follows:
MessageDisplayer displayer =
        new MessageDisplayer(alerter, (FooConstants) GWT.create(FooConstants.class));

Sunday, May 16, 2010

Creating a bounded LRU Cache with LinkedHashMap

I recently had cause to implement a fixed-size cache in some code I was writing. I wanted a straightforward map, but with a fixed size and the ability to evict entries on a least-recently-used (LRU) basis. To get something up and running quickly, I thought I would use a LinkedHashMap and have my get and put methods use the ordering to implement the LRU policy.

Well, it turns out that LinkedHashMap already has this support built in, if you know what you're doing. There are two pieces to this jigsaw:

  1. Override the protected removeEldestEntry method, which returns true if the eldest entry should be removed. This method is called on every put with the element that is eldest according to the LRU policy.
  2. Call the LinkedHashMap(int capacity, float loadFactor, boolean accessOrder) constructor. Specifying true for accessOrder means that the elements will be kept in the order they were last accessed (from least recently used to most recently used).
Putting this together yields:
import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedLruCache<KEY, TYPE> extends LinkedHashMap<KEY, TYPE> {

    private static final int DEFAULT_INITIAL_CAPACITY = 100;

    private static final float DEFAULT_LOAD_FACTOR = 0.75f;

    private final int bound;

    public BoundedLruCache(final int bound) {
        super(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR, true);
        this.bound = bound;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<KEY, TYPE> eldest) {
        // Evict the least recently used entry whenever a put pushes us past the bound.
        return size() > bound;
    }
}
Which is remarkably compact. Once again the JRE library proves to have some nice bits and pieces if you know where to look.
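
To illustrate the eviction behaviour, here is a quick sketch (hypothetical keys and values):

BoundedLruCache<String, String> cache = new BoundedLruCache<String, String>(2);
cache.put("a", "1");
cache.put("b", "2");
cache.get("a");       // touch "a", making it the most recently used entry
cache.put("c", "3");  // pushes past the bound, evicting "b" (the least recently used)
// cache now contains "a" and "c", but not "b"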

Tuesday, April 20, 2010

Collections.addAll for adding an array of elements to a Collection

It just goes to show there is always more you can learn about the JRE. For ages I have been using the following idiom to add an array of items to a collection:

Collection<Foo> foos = ...;
Foo[] foosToAdd = ...;
foos.addAll(Arrays.asList(foosToAdd));

What I hadn't noticed is that the java.util.Collections class has a static method that does this:

Collection<Foo> foos = ...;
Foo[] foosToAdd = ...;
Collections.addAll(foos, foosToAdd);

This is not only a little bit simpler, but the Javadoc says it is also faster for most implementations of Collection.

Better yet, Collections.addAll takes a variable-length argument list rather than just a straight array, so you can skip the array declaration and just do:

Collection<Foo> foos = ...;
Collections.addAll(foos, foo1, foo2, foo3);

See: Collections.addAll Javadoc