Monday, September 17, 2012

Java Closures and The Start-End Problem

I use the term "Start-End Problem" to describe a resource that needs to be opened and closed, or a header and footer that need to be printed, in a way that would tempt you to make a class with start() and end() methods that are meant to be called as a pair with other code in between. Some time around 1997, my mentor Jack Follansbee told me to avoid this pattern whenever practical, because it's too easy to forget to call the end() method. Another problem arises when the code in between performs a return, continue, or break, throws an Exception, or otherwise avoids reaching the end() method.
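
As a minimal sketch of the failure mode, consider the following. The Report class and the unsafe()/safe() methods are hypothetical names invented for illustration. The first method skips end() on an early return; the second wraps the middle code in try/finally so end() runs on every exit path, but only because the author remembered to write it.

```java
import java.util.ArrayList;
import java.util.List;

public class StartEndPitfall {
    // Hypothetical class with the start()/end() pair described above.
    static class Report {
        final List<String> lines = new ArrayList<>();
        void start() { lines.add("Start"); }
        void end() { lines.add("End"); }
    }

    // The early return leaves the method before end() is reached.
    static List<String> unsafe(boolean bailOut) {
        Report r = new Report();
        r.start();
        if (bailOut) {
            return r.lines; // end() is never called on this path
        }
        r.lines.add("Middle");
        r.end();
        return r.lines;
    }

    // finally runs end() on every exit path, including the early return.
    static List<String> safe(boolean bailOut) {
        Report r = new Report();
        r.start();
        try {
            if (bailOut) {
                return r.lines;
            }
            r.lines.add("Middle");
            return r.lines;
        } finally {
            r.end();
        }
    }

    public static void main(String[] args) {
        System.out.println(unsafe(true)); // prints [Start]
        System.out.println(safe(true));   // prints [Start, End]
    }
}
```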

Java's try-catch-finally block solves the return and Exception problems, but does not ensure that you will remember to call the end() method in the finally block. There are three traditional Java approaches to this problem:

  1. Pre-evaluating the middle code somehow (e.g. with a buffer) and passing it to a procedure which inserts the Start before the buffer's contents and appends the End after them.
    public void doStartEnd(StringBuilder sB) {
        sB.insert(0, "Start");
        sB.append("End");
    }
    This has limited applications, but is nice when it works.
  2. Dependency Injection: Creating a special-purpose procedure with variables to adapt it to multiple related purposes. If the number of variables is small (or 0) this works very well. But sometimes many variables must be passed to this procedure, and a complex data structure must be created just to hold the updated values for the return type. It seems a waste of time and energy to load all the variables into the procedure arguments like passengers on a bus, then unload their updated values from the return data type one by one.
    public class ReturnObject {
        public int count;
        public boolean showedAnything;
    }
    public ReturnObject doStartEnd(int count, String middle, boolean showedAnything) {
        System.out.println("Start");
        if (!showedAnything) { System.out.println("New stuff"); }
        for (; count < 5; count++) {
           showedAnything = true;
           System.out.println(middle);
        }
        System.out.println("End");
        ReturnObject ro = new ReturnObject();
        ro.count = count;
        ro.showedAnything = showedAnything;
        return ro;
    }
    Real-world examples can get complicated very quickly. Someone on StackExchange: Programmers recently said they routinely found procedures with 10 or 20 parameters where they worked and that they "died a little bit inside" every time they found one. There are numerous ways to cause bugs with this approach: transposing values, confusing the way Java copies primitive arguments (so the caller never sees changes made to them) with the way a passed object reference lets the procedure mutate shared state... just to name a few. The coding effort involved can be impressive.
  3. Create an interface for the middle code being executed and have an object implement that interface. A well-designed object, possibly with a Builder pattern, can mitigate some of the Bus Station issues. But since this pattern is just the Dependency Injection pattern turned inside-out, it sometimes just pushes the dependency injection overhead of the above solution from a procedure to an object. This used to be the most general solution to the start-end problem. The first example below uses an abstract class and an anonymous implementation, but that is just a minor variation on this technique.
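
A rough sketch of the third approach, using an interface and a named implementing class, might look like the following. The names Middle, RepeatMiddle, and InterfaceStartEnd are hypothetical and chosen only for illustration. Note how the state the middle code needs (text, times, count) has to be carried into and out of the object through its fields.

```java
public class InterfaceStartEnd {
    // Hypothetical callback interface for the "middle" code.
    interface Middle {
        void run(StringBuilder out);
    }

    // The enclosing procedure owns the Start and End logic.
    static String doStartEnd(Middle middle) {
        StringBuilder out = new StringBuilder();
        out.append("Start\n");
        middle.run(out);
        out.append("End\n");
        return out.toString();
    }

    // A named implementation that carries its inputs and its result as fields.
    static class RepeatMiddle implements Middle {
        private final String text;
        private final int times;
        int count = 0; // updated value the caller reads back afterwards

        RepeatMiddle(String text, int times) {
            this.text = text;
            this.times = times;
        }

        @Override
        public void run(StringBuilder out) {
            for (; count < times; count++) {
                out.append(text).append('\n');
            }
        }
    }

    public static void main(String[] args) {
        RepeatMiddle middle = new RepeatMiddle("Middle", 2);
        System.out.print(doStartEnd(middle));
        System.out.println("Total Count " + middle.count);
    }
}
```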

A functional programmer would say that a lexical closure is the obvious solution to this problem. It allows an arbitrary amount of code (including the caller's local variables) to be executed in the middle of some other code. No pre-evaluation, no Bus Station, no extra objects or interfaces. The outer "enclosing" code can do the start() and end() logic with the "enclosed" code in a little magic closure envelope in the middle. Java (as of Java 7) doesn't have closures or function pointers, but an anonymous inner class looks a little like a closure, and the following code compiles, works, and even solves the Start-End problem for some special cases, though it might not win many beauty contests:

public class ClosureTest {
    private static abstract class StartEnd {
        public void doStartEnd() {
            System.out.println("Start");
            middle();
            System.out.println("End");
        }
        public abstract void middle();
    }

    public static void main(String[] args) {
        new StartEnd() {
            @Override
            public void middle() {
                System.out.println("Middle");
            }
        }.doStartEnd();
    }
}

Output:

Start
Middle
End

A sad limitation of this technique is that Java tries to prevent you from updating the main() method's local variables inside the anonymous inner class by forcing you to make them final (not reassignable). The following WILL NOT COMPILE:

public static void main(String[] args) {
    int count = 0;
    new StartEnd() {
        @Override
        public void middle() {
            // ERROR: local variable count is accessed from within
            // inner class; needs to be declared final
            for (; count < 5; count++) {
                System.out.println("Middle");
            }
        }
    }.doStartEnd();
    System.out.println("Total Count " + count);
}

A mutable wrapper class will work around this restriction. Uglier, but it works:

private static class MutableRef<T> {
    public T count;
}
public static void main(String[] args) {
    final MutableRef<Integer> mr = new MutableRef<Integer>();
    mr.count = 0;
    new StartEnd() {
        @Override
        public void middle() {
            for (; mr.count < 5; mr.count++) {
                System.out.println("Middle");
            }
        }
    }.doStartEnd();
    System.out.println("Total Count " + mr.count);
}

Output:

Start
Middle
Middle
Middle
Middle
Middle
End
Total Count 5
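
The same workaround is often written with a one-element array instead of a dedicated wrapper class. The sketch below assumes the same StartEnd abstract class as above; ArrayHolderTest and run() are hypothetical names. The array reference is final, but its single slot stays mutable.

```java
public class ArrayHolderTest {
    private static abstract class StartEnd {
        public void doStartEnd() {
            System.out.println("Start");
            middle();
            System.out.println("End");
        }
        public abstract void middle();
    }

    static int run() {
        // The reference is final; the element it holds can still change.
        final int[] count = {0};
        new StartEnd() {
            @Override
            public void middle() {
                for (; count[0] < 5; count[0]++) {
                    System.out.println("Middle");
                }
            }
        }.doStartEnd();
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println("Total Count " + run());
    }
}
```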

Java 7's try-with-resources feature provides my favorite solution to this particular issue. It is not a true general-purpose lexical closure, but it sure looks like one for solving this particular problem:

public class ClosureTest {
    private static class StartEnd implements AutoCloseable {
        public StartEnd() { System.out.println("Start"); }
        @Override
        public void close() { System.out.println("End"); }
    }

    public static void main(String[] args) {
        int count = 0;
        try (StartEnd se = new StartEnd()) {
            for (; count < 5; count++) {
                System.out.println("Middle");
            }
        } // end of StartEnd
        System.out.println("Total Count " + count);
    }
}

One more detail... If you compile with -Xlint it may complain, "warning: [try] auto-closeable resource se is never referenced in body of corresponding try statement." I have found that when using this pattern in my own code, I usually use the se variable within the corresponding try block. But for the times that I don't, a preventCompilerWarning() method eliminates the warning:

public class ClosureTest {
    private static class StartEnd implements AutoCloseable {
        public StartEnd() { System.out.println("Start"); }
        @Override
        public void close() { System.out.println("End"); }
        public void preventCompilerWarning() { }
    }

    public static void main(String[] args) {
        int count = 0;
        try (StartEnd se = new StartEnd()) {
            se.preventCompilerWarning();
            for (; count < 5; count++) {
                System.out.println("Middle");
            }
        }
        System.out.println("Total Count " + count);
    }
}
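
An alternative sketch, assuming the compiler is javac: the bracketed [try] in the warning text is a lint category, and javac accepts it as a @SuppressWarnings key, so the extra method can be avoided. Other compilers or tools may ignore this key. SuppressTryWarningTest and countToFive() are hypothetical names.

```java
public class SuppressTryWarningTest {
    private static class StartEnd implements AutoCloseable {
        public StartEnd() { System.out.println("Start"); }
        @Override
        public void close() { System.out.println("End"); }
    }

    // "try" is the lint category shown in brackets in the javac warning.
    @SuppressWarnings("try")
    static int countToFive() {
        int count = 0;
        try (StartEnd se = new StartEnd()) {
            for (; count < 5; count++) {
                System.out.println("Middle");
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Total Count " + countToFive());
    }
}
```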

Voilà - the Start-End problem solved! Unlike a lexical closure in a functional language, this technique executes the close() method even when a return statement is reached or an Exception is thrown. This may be good or bad depending on your situation.
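
To make the exit-path behavior concrete, here is a small sketch in which the try body either returns early or throws. CloseOnExitTest, LOG, earlyReturn(), and throwing() are hypothetical names; events are recorded in a list rather than printed so the ordering is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOnExitTest {
    // Records events in order so the sequence can be checked.
    static final List<String> LOG = new ArrayList<>();

    static class StartEnd implements AutoCloseable {
        StartEnd() { LOG.add("Start"); }
        @Override
        public void close() { LOG.add("End"); }
    }

    // close() runs before the method actually returns.
    static int earlyReturn() {
        try (StartEnd se = new StartEnd()) {
            return 42;
        }
    }

    // close() runs before the exception reaches the caller.
    static void throwing() {
        try (StartEnd se = new StartEnd()) {
            throw new IllegalStateException("boom");
        }
    }

    public static void main(String[] args) {
        earlyReturn();
        try {
            throwing();
        } catch (IllegalStateException e) {
            LOG.add("Caught " + e.getMessage());
        }
        System.out.println(LOG); // prints [Start, End, Start, End, Caught boom]
    }
}
```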

Notice that the local variable count is being used "inside" the StartEnd block without being visible to that code? No dependency injection, interfaces, objects, etc. This is the essence of a lexical closure - a little bubble of extra variable scope without otherwise violating the privacy of the enclosed code. Hopefully you can imagine some of the versatile coding paradigms which the new try-with-resources block makes available in Java 7.

For a more general solution to the problem of closures in Java before Java 8, check out this post.