Wednesday, August 31, 2011

Object Mutability

This post is part of a series on Comparing Objects.

I've read a lot lately about making objects immutable whenever possible. "Programming in Scala" lists Immutable Object Tradeoffs as follows:

Advantages of Immutable Objects

  1. Often easier to reason about because they do not have complex state.
  2. Can be passed freely (without making defensive copies) to things that might try to modify them.
  3. Impossible for two threads accessing the same immutable object to corrupt it.
  4. They make safe Hashtable keys (if you put a mutable object in a Hashtable, then change it in a way that changes its hashcode, the Hashtable will no longer be able to use it as a key because it will look for that object in the wrong bucket and not find it).

Disadvantages

  1. Sometimes require a large object graph to be copied (to create a new, modified version of the object). This can cause performance and garbage collection bottlenecks.

For most purposes, an object representing a month can be made immutable - February 2003 will never become anything other than what it is. But a User record is not immutable. People get married or change their name for other reasons. Phone numbers, addresses, hair color, height, weight, and virtually every other aspect of a person can change. Yet the person is still the same person. This is what surrogate keys model in a database - that everything about a record can change, yet it can still be meaningfully the same record.

In order to use an object in a hash-backed Collection (in Java), its hashCode must NOT change. The simplest way to accomplish this is to make the hashCode of a mutable persistent object its surrogate key and to use that key as the primary comparison in the equals method as well (see my older post on Implementing equals(), hashcode(), and compareTo()).

To make an immutable object, you sometimes need a mutable builder object, as with StringBuilder and String. StringBuilder lets you change your object as many times as you want, then produce an immutable version by calling toString(). This is clean and safe, but has some small costs in time and memory (copying the mutable StringBuilder's contents into a new immutable String object, then throwing the StringBuilder away). An alternative that I have not seen much is to create an immutable interface, extend a mutable interface from it, and then have your class implement that.
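The StringBuilder round trip looks like this (a minimal sketch):

```java
public class BuilderDemo {
    public static void main(String[] args) {
        // Mutate freely while building...
        StringBuilder sb = new StringBuilder();
        sb.append("Hello, ").append("world");
        // ...then freeze an immutable snapshot.
        String s = sb.toString();
        sb.append("!"); // further mutation does not affect s
        System.out.println(s); // prints "Hello, world"
    }
}
```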

Here's an example based on java.util.List. Pretend each interface or class is in its own file:

// All the immutable-friendly methods from java.util.List.
// Interfaces like these could easily be retrofitted into
// the existing Java collections framework
public interface ImmutableList<E> {
    int size();
    boolean isEmpty();
    boolean contains(Object o);
    Iterator<E> iterator();
    Object[] toArray();
    <T> T[] toArray(T[] a);
    boolean containsAll(Collection<?> c);
    boolean equals(Object o);
    int hashCode();
    E get(int index);
    int indexOf(Object o);
    int lastIndexOf(Object o);
    ListIterator<E> listIterator();
    ListIterator<E> listIterator(int index);
    List<E> subList(int fromIndex, int toIndex);
}

// This interface adds the mutators (imagine it is java.util.List)
public interface List<E> extends ImmutableList<E> {
    boolean add(E e);
    boolean remove(Object o);
    boolean addAll(Collection<? extends E> c);
    boolean addAll(int index, Collection<? extends E> c);
    boolean removeAll(Collection<?> c);
    boolean retainAll(Collection<?> c);
    void clear();
    E set(int index, E element);
    void add(int index, E element);
    E remove(int index);
}

// Imagine this is java.util.ArrayList
public class ArrayList<E> implements List<E> {
    // just as it is now...
}

public class MyClass {
    static void someMethod(ImmutableList<String> ils) {
        // can't change the list
    }

    public static void main(String[] args) {
        List<String> myStrings = new ArrayList<String>();
        myStrings.add("hello");
        myStrings.add("world");
        someMethod(myStrings);
        // Totally safe:
        System.out.println(myStrings.get(1));
    }
}

This doesn't solve the problem of passing a list to existing untrusted code that might try to change it. It also doesn't prevent the calling code from modifying myStrings from a separate thread while someMethod() is working on it. But it does provide a way (going forward) for a method like someMethod() to declare that it cannot modify the list. The programmer of someMethod() cannot compile her code if she tries to modify the list (well, short of using reflection).
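For comparison, the JDK already offers a runtime (though not compile-time) version of this guarantee: Collections.unmodifiableList wraps a list so that mutator calls throw at the caller instead of failing to compile. A minimal sketch:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class UnmodifiableDemo {
    public static void main(String[] args) {
        List<String> myStrings = new ArrayList<String>();
        myStrings.add("hello");
        // A read-only view: mutators exist but throw at runtime.
        List<String> readOnly = Collections.unmodifiableList(myStrings);
        try {
            readOnly.add("world"); // fails at runtime, not compile time
        } catch (UnsupportedOperationException e) {
            System.out.println("caught: cannot modify");
        }
        System.out.println(readOnly.get(0)); // reads still work
    }
}
```

The trade-off relative to an ImmutableList interface is exactly the one the post describes: the wrapper catches mistakes only when they execute, not when they compile.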

Guaranteed immutability can be critical in writing concurrent code and for keys in hashtables. Not all objects can be made immutable, but many of those objects have immutable surrogate keys that, if used properly, work around the pitfalls of mutability. Limiting mutability and avoiding common mutable object pitfalls can lead to fewer bugs, easier readability, and improved maintainability.

Thursday, August 25, 2011

Typesafe List in Java 5+ part two...

Looks like John's comment on yesterday's post had us both up half the night working on the same thing. I'm posting my version here because it has a parameter that lets the caller determine the trade-off between speed and safety. When debugging, you can use Check.ALL (or leave it null). In performance-critical code, you can later set it to NONE. Check.FIRST is for situations where you just want a sanity check - it's fast and better than nothing.

I've also included some test code and a routine that takes advantage of the String.valueOf() methods to coerce each element of the original list to a String, if necessary, much the way that many dynamically typed languages do. I wonder if it would be worthwhile to further optimize this method for long lists by dividing the original list into runs of items that do not need casting and copying those runs with stringList.addAll(list.subList(fromIdx, toIdx)) instead of copying one item at a time?
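Here is one way that run-based optimization might look. This is a hypothetical sketch, not benchmarked; the name coerceRuns and its treatment of nulls as part of a "safe" run are my assumptions:

```java
import java.util.ArrayList;
import java.util.List;

public class RunCoerce {
    // Copy maximal runs of items that are already Strings (or null)
    // with subList()/addAll(), converting only the offending items.
    @SuppressWarnings("unchecked")
    public static List<String> coerceRuns(List<?> list) {
        List<String> out = new ArrayList<String>(list.size());
        int runStart = 0;
        for (int i = 0; i < list.size(); i++) {
            Object item = list.get(i);
            if (item == null || item instanceof String) {
                continue; // still inside a safe run
            }
            // Flush the safe run before this item, then convert this item.
            out.addAll((List<String>) list.subList(runStart, i));
            out.add(String.valueOf(item));
            runStart = i + 1;
        }
        out.addAll((List<String>) list.subList(runStart, list.size()));
        return out;
    }

    public static void main(String[] args) {
        List<Object> l = new ArrayList<Object>();
        l.add("Hello");
        l.add(null);
        l.add(Integer.valueOf(3));
        l.add("world");
        System.out.println(coerceRuns(l)); // prints [Hello, null, 3, world]
    }
}
```

Whether the bulk addAll() actually beats item-by-item copying would depend on the list implementation and the length of the runs.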

John's solution uses B.class.isAssignableFrom(a.getClass()) while mine uses a instanceof B. I think either works in this case because a List can only hold objects, not primitives. If we were dealing with primitives, John's solution would handle more cases than mine. Not sure if there are performance considerations, but I doubt they would be significant.
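A quick check of the claim that the two tests agree for non-null objects. Null is where they differ: `null instanceof B` is simply false, while `null.getClass()` would throw a NullPointerException:

```java
public class CastChecks {
    public static void main(String[] args) {
        Object a = Integer.valueOf(3);
        // Both tests agree for any non-null object...
        System.out.println(a instanceof Number);                          // true
        System.out.println(Number.class.isAssignableFrom(a.getClass()));  // true
        // ...but only instanceof tolerates null.
        Object n = null;
        System.out.println(n instanceof Number); // false, no exception
    }
}
```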

import java.util.ArrayList;
import java.util.List;

public class TypeSafe {

    public enum Check {
        NONE,
        FIRST,
        ALL;
    }

    @SuppressWarnings({"unchecked"})
    public static <T> List<T> typeSafeList(List list,
                                           Class<T> clazz,
                                           Check safety) {
        if (clazz == null) {
            throw new IllegalArgumentException("typeSafeList() requires a non-null class parameter");
        }
        
        // Should we perform any checks?
        if (safety != Check.NONE) {
            if ( (list != null) && (list.size() > 0) ) {
                for (Object item : list) {
                    // Use the passed-in class, not a hardcoded type.
                    if ( (item != null) &&
                         !clazz.isInstance(item) ) {
                        throw new ClassCastException(
                                "List contained a(n) " +
                                item.getClass().getCanonicalName() +
                                " which is not a(n) " +
                                clazz.getCanonicalName());
                    }
                    // Should we stop on first success?
                    if (safety == Check.FIRST) {
                        break;
                    }
                    // Default (Check.ALL) checks every item in the list.
                }
            }
        } // end if perform any checks
        return (List<T>) list;
    } // end typeSafeList()

    public static List<String> coerceToStringList(List list) {
        if (list == null) {
            return null;
        }
        try {
            // Return the old list if it's already safe
            return typeSafeList(list, String.class, Check.ALL);
        } catch (ClassCastException cce) {
            // Old list is not safe.  Make new one.
            List<String> stringList = new ArrayList<String>();

            for (Object item : list) {
                if (item == null) {
                    stringList.add(null);
                } else if (item instanceof String) {
                    stringList.add((String) item);
                } else {
                    // Coerce anything else to its string form.
                    stringList.add(String.valueOf(item));
                }
            }
            return stringList;
        }
    } // end coerceToStringList()

    @SuppressWarnings({"unchecked"})
    private static List makeTestList() {
        List l = new ArrayList();
        l.add("Hello");
        l.add(null);
        l.add(new Integer(3));
        l.add("world");
        return l;
    } // end makeTestList()

    public static void main(String[] args) {
        List unsafeList = makeTestList();
        List<String> stringList = coerceToStringList(unsafeList);

        System.out.println("Coerced strings:");
        for (String s : stringList) {
            System.out.println(s);
        }

        // Check the coerced list (the original still contains an Integer
        // and would throw a ClassCastException here).
        List<String> safeList = typeSafeList(stringList,
                                             String.class,
                                             Check.ALL);

        System.out.println("Safe-casted list:");
        for (String s : safeList) {
            System.out.println(s);
        }
    } // end main()
}

Wednesday, August 24, 2011

Checking an Unchecked Cast in Java 5 or Later

If query.list() returns an untyped List, then the following code will produce an ugly warning: "Unchecked Cast: 'java.util.List' to 'java.util.List<String>'".
List<String> tags = null;
try {
    ...
    tags = (List<String>) query.list();
    ...
Unfortunately, you can only use @SuppressWarnings on a declaration. I don't want to add this annotation to my method declaration because my method does a lot more than this one cast and this is a useful warning to detect in the rest of the method. One solution is to add a new variable to make a declaration to use @SuppressWarnings on:
List<String> tags = null;
try {
    ...
    @SuppressWarnings("unchecked")
    List<String> tempTags = (List<String>) query.list();
    tags = tempTags;
    ...
Now it produces an inspection warning in IDEA 10.5, "Local variable tempTags is redundant." I should probably just ignore this warning or turn it off altogether, but I wasn't quite comfortable with that either. In situations like these, I naturally turn to Joshua Bloch for guidance and his item #77 - Eliminate Unchecked Warnings has this advice in bold:
If you can't eliminate a warning, and you can prove that the code that provoked the warning is typesafe, then (and only then) suppress the warning with an @SuppressWarnings("unchecked") annotation.
I could probably prove that my query is only going to return Strings, but it occurred to me that there is a more general way to satisfy all the above requirements:
@SuppressWarnings({"unchecked"})
private List<String> getListOfStringsFromList(List list) {
    if ( (list != null) && (list.size() > 0) ) {
        for (Object item : list) {
            if ( (item != null) && !(item instanceof String) ) {
                throw new ClassCastException("List contained a non-String element: " + item.getClass().getCanonicalName());
            }
        }
    }
    return (List<String>) list;
}
The method does nothing but make a typesafe cast on a list, so I can @SuppressWarnings on the whole thing. It throws an undeclared runtime ClassCastException, but only if the caller makes a programming error and this exception would get thrown anyway wherever the list was eventually used if I didn't throw it here. In short, this method takes a programming error that cannot be caught by the compiler and does the next best thing: fail-fast, making the error easy to find. Using the above solution hides all that complexity and preserves the intent of the original line of code:
List<String> tags = null;
try {
    ...
    tags = getListOfStringsFromList(query.list());
    ...
I thought this was interesting enough to post. Well, at least interesting enough for people who are considering a Java logo tattoo. If you liked this, you probably want to check out John Yeary's Response and my Updated Generic Version (part 2 of this post).

Wednesday, March 9, 2011

Question about MySQL Update speed

I originally posted this as a question, but Eric Wood helped me solve it in email, so I've added the solution below. The minimum time MySQL with InnoDB tables takes to do an update on a 3GHz Core 2 Duo running 64-bit Ubuntu 10.10 is somewhere around 0.06 seconds, though I wonder if hard drive speed could be the gating factor. My average time using Hibernate is 0.15 seconds; I think JDBC would approach 0.06 seconds. These timings were done using the mysql command-line client. Any thoughts would be appreciated, as would timings against other databases if you have a sense of them.
mysql> CREATE TABLE `test` (
    ->   `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
    ->   `last_click_date` datetime DEFAULT NULL,
    ->   `is_active` bit(1) NOT NULL DEFAULT b'0' COMMENT 'True if user is still logged in.',
    ->   PRIMARY KEY (`id`),
    ->   UNIQUE KEY `id` (`id`)
    -> ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.21 sec)

mysql> insert into test (last_click_date, is_active) values ('2011-03-07 14:18:16', b'1');
Query OK, 1 row affected (0.11 sec)

mysql> insert into test (last_click_date, is_active) values ('2011-03-07 14:18:16', b'1');
Query OK, 1 row affected (0.11 sec)

mysql> insert into test (last_click_date, is_active) values ('2011-03-07 14:18:16', b'1');
Query OK, 1 row affected (0.06 sec)

mysql> update test set last_click_date = '2011-03-07 14:18:16' where id = 1;
Query OK, 0 rows affected (0.07 sec)
Rows matched: 1  Changed: 0  Warnings: 0

mysql> update test set last_click_date = now() where id = 1;
Query OK, 1 row affected (0.06 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update test set is_active = b'0' where id = 1;
Query OK, 1 row affected (0.07 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update test set is_active = b'0' where id = 1;
Query OK, 0 rows affected (0.02 sec)
Rows matched: 1  Changed: 0  Warnings: 0

mysql> update test set is_active = b'0' where id = 1;
Query OK, 0 rows affected (0.10 sec)
Rows matched: 1  Changed: 0  Warnings: 0

mysql> update test set last_click_date = 20110307141816 where id = 1;
Query OK, 0 rows affected (0.06 sec)
Rows matched: 1  Changed: 0  Warnings: 0
It turns out that if I change the table to use the MyISAM engine, all the updates take 0.00 seconds. That points to the gating factor: by default, InnoDB flushes (fsyncs) its transaction log to disk on every commit for durability, so each update waits on the disk, while MyISAM performs no such per-commit flush.

Thursday, January 6, 2011

10 Most Important Password Manager Features

Maybe 1 in 60 of the sites where I have accounts reports a password breach every year. For every site that reports a break-in, a few others are probably broken into and either don't know it or don't report it. So if you have accounts at 30 different sites, you should probably assume that one of them is broken into every year. You can't stop people from discovering passwords this way, but if you use a unique, strong password for each site, you can contain the damage so that a hacker cannot leverage one stolen password to break into your other accounts.

I just watched How to choose a strong password and while that's good advice, most people can neither remember nor type a good password, or at least not more than one or two good passwords. The only practical way to use a unique, strong password for every site is to use a good password manager. As such, I'm proposing a Password Manager Feature Manifesto for people to use to compare password managers and decide which one is best for them.

Password Manager Feature Manifesto

A password manager needs to do certain things to be worthwhile:

1.) Store passwords securely, in one place, so you can find them, change them, and secure them as a unit. It has always seemed to me that storing your passwords in your browser is a bit like taping your wallet to the outside of your front door - you are putting your valuables in the most vulnerable place. KeePassX (without any plug-ins) is completely separate from your browser. Browser integration is not necessarily bad, but I think it loses some points from a security perspective. In any case, the passwords must be secured by a strong master password and encrypted on disk (and maybe in memory when possible too).

2.) Generate random passwords - people don't manually make strong passwords. Collecting entropy for the randomness is a huge plus. KeePassX and LastPass both generate strong passwords for you.

3.) Make it as easy to store your credit-card number or your Photoshop license key as it is to store a password to a web site.

4.) Must be backed up every time you make a change. LastPass has this built-in. KeePassX must be used with something like Dropbox or SpiderOak and set to save automatically after every change.

5.) To be shared between multiple computers, e.g. LastPass or KeePassX/Dropbox

6.) Needs to be relatively easy to use

7.) To work on all major operating systems (Windows, OS-X, Linux). I look for this every time I choose software. I hate being tied to one vendor's operating system or browser.

If a password manager doesn't do all of those things, I'm not really interested in it. One thing that's not important yet, but I bet it will become critical for most people in the next few years:

8.) To work on your phone or other mobile device. Here is where LastPass may move ahead of KeePassX.

9.) Popular OpenSource software is recommended for security

And finally, not critical, but the icing on the cake:

10.) It's free, or at least a reasonable price.

That leaves KeePassX the clear #1 for me and LastPass #2. LastPass could threaten KeePassX if they keep improving on #6 and #8 - specifically, it is very hard to log into sites with LastPass that have the user ID on one screen and password on the next.

Sadly, no password manager can remember your operating-system login when you boot up your computer, so you have to remember that password yourself, along with the master password for your password manager. But for most people that's just two passwords to remember and type, which is fairly do-able.

I have to thank DigitalMan for his contributions to this article by talking about this with me and sending articles about break-ins, security, and passwords for the past few years, and for encouraging me to improve my own password practices.

Update 2012-09-20: DigitalMan added the following excellent insights:

Re: Point 8: the iPhone now has an excellent, completely free, Open Source app which makes your KeePass database fully functional on the iPhone (and presumably the iPad as well): miniKeePass. To me, that further buries the case that Closed Source LastPass is a better option.

Lastly, Point 9 is crucial to me and not second tier. I'll leave you with my favorite Bruce Schneier quote:

As a cryptography and computer security expert, I have never understood the current fuss about the open source software movement. In the cryptography world, we consider open source necessary for good security; we have for decades. Public security is always more secure than proprietary security. It's true for cryptographic algorithms, security protocols, and security source code. For us, open source isn't just a business model; it's smart engineering practice.

Wednesday, December 29, 2010

Using Java Collections Effectively by Implementing equals() and hashCode()

IMPORTANT: The techniques in this post, while interesting, are outdated and sub-optimal. In short, follow standard equals() and hashCode() practice, but TEST your classes using something like TestUtils. I find a bug almost every time I use that.

This post is the first in a series on Comparing Objects.

These three methods must be implemented correctly for the Java collections to work properly. Even though popular IDEs automatically generate stubs of some of these methods, you should still understand how they work, particularly how the three methods work together, because I don't see many IDEs writing meaningful compareTo() methods yet. For much of what follows, I am indebted to Joshua Bloch and his book, Effective Java. Buy it, read it, live it.
  1. The behavior of equals(), hashCode(), and compareTo() must be consistent.
  2. You must base these methods on fields whose values do not change while they are in a collection.
If you store an object in a collection (e.g. as a key in a HashMap) and its hashCode changes, you generally won't be able to retrieve it from that hashtable anymore! Thanks to "Programming in Scala" Section 30.2, p. 657. See also my later post on Object Mutability. You can use collections effectively with mutable objects so long as those objects use surrogate keys. In these examples I store my surrogate key in a private long id field with public getId() and setId() methods, as many popular frameworks expect.
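The failure mode is easy to reproduce with a hypothetical class whose hashCode() depends on a mutable field:

```java
import java.util.HashSet;
import java.util.Set;

public class LostKey {
    // Hypothetical class: hashCode() depends on a mutable field.
    static class MutableKey {
        int value;
        MutableKey(int value) { this.value = value; }
        @Override public int hashCode() { return value; }
        @Override public boolean equals(Object o) {
            return (o instanceof MutableKey) && ((MutableKey) o).value == value;
        }
    }

    public static void main(String[] args) {
        Set<MutableKey> set = new HashSet<MutableKey>();
        MutableKey k = new MutableKey(1);
        set.add(k);
        System.out.println(set.contains(k)); // true
        k.value = 2; // hashCode changes while k is in the set...
        System.out.println(set.contains(k)); // false! Lookup checks the wrong bucket.
    }
}
```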

hashCode()

hashCode() is meant to provide a very cheap "can-equal" test. It allows the put() and contains() methods on hashtables to run blazingly fast. In small hashtables, the low bits from hashCode() determine which hash bucket an object belongs in. In larger hashtables, all the bits are used. The (presumably more expensive) equals() test is then applied against all the other objects already in that bucket. If you had all your objects return some number, say, 31 for their hashCode(), this would completely destroy the performance of any hashtable-based collection, since all objects would go in the same hash bucket and each object would have to be compared to all the others using equals().
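A constant hashCode() stays correct - it just degrades every hash-based operation into a linear scan of one giant bucket. A small demonstration (BadKey is a hypothetical class):

```java
import java.util.HashSet;
import java.util.Set;

public class ConstantHash {
    // Hypothetical worst case: every instance hashes to the same bucket.
    static class BadKey {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public int hashCode() { return 31; } // all in one bucket
        @Override public boolean equals(Object o) {
            return (o instanceof BadKey) && ((BadKey) o).id == id;
        }
    }

    public static void main(String[] args) {
        Set<BadKey> set = new HashSet<BadKey>();
        for (int i = 0; i < 1000; i++) {
            set.add(new BadKey(i));
        }
        // Still correct, but every add/contains had to run equals()
        // against the whole bucket: O(n) work instead of O(1).
        System.out.println(set.size());                    // 1000
        System.out.println(set.contains(new BadKey(500))); // true
    }
}
```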

Bloch's Item 9 states, "Always override hashCode() when you override equals()". The following are specifically required (see: Object.hashCode()):
  1. x.hashCode() must always equal y.hashCode() when x.equals(y).
  2. It's OK for x.hashCode() to equal y.hashCode() when x.equals(y) is false, but it's good to minimize this.
Truncating the database row number from a long to an int is an ideal way to ensure efficient, equal distribution of values. If you don't use surrogate keys, you need to construct an int from the "significant" fields (the ones that uniquely identify this object):

@Override
public int hashCode() {
    if (id == 0) {
        // No surrogate key yet: combine significant fields.
        // The 31 multiplier makes the hash sensitive to field order.
        int result = intField1;
        result = 31 * result + intField2;
        return 31 * result + objField3.hashCode();
    }
    // return (possibly truncated) surrogate key
    return (int) id;
}

If your object does not have a surrogate key, then the field-by-field hash in this solution is correct, though not quite as fast. If you like playing with bits, you can sometimes OR and shift various fields into your hashCode in a way that is very efficient and not too hard to read.

equals()

a.equals(b) should return true only when a and b represent the same object. Bloch (Item 8) says that the equals() method must be reflexive, symmetric, transitive and a few other things as well which I won't cover here. For any non-null value:
  • x.equals(x) must be true.
  • If x.equals(y) then y.equals(x) must be true.
  • If x.equals(y) and y.equals(z) then x.equals(z) must also be true.
The following should get you off to a good start in writing an equals method that is all of the above.  Checking the hashCode() should be cheap and guarantees that two objects can't equal each other if their hashCodes are different.

@Override
public boolean equals(Object other) {
    // Cheapest operation first...
    if (this == other) { return true; }

    if ( (other == null) ||
         !(other instanceof MyClass) ||
         (this.hashCode() != other.hashCode()) ) {
        return false;
    }
    // Details...
    final MyClass that = (MyClass) other;

    // If this is a database object and both have the same surrogate key (id),
    // they are the same.
    if ( (id != 0) && (that.getId() != 0) ) {
        return (id == that.getId());
    }

    // If this is not a database object, compare significant fields here.
    // Return true only if they are all sufficiently the same.
    if (!this.getParent().equals(that.getParent())) {
        return false;
    }

    if (description == null) {
        if (that.getDescription() != null) {
            return false;
        }
    } else if (that.getDescription() == null) {
        return false;
    } else {
        // Return false on the first field that differs.
        int ret = description.compareTo(that.getDescription());
        if (ret != 0) { return false; }
    }

    // Compare other fields
    // If all the same, return true
    return true;
}

Both objects must be valid before you compare them. Your equals() method should compare either significant fields OR surrogate keys - not both! The danger of providing a field-by-field equals comparison for a database object is that it will work with invalid objects in some cases, but not all. This is a case where it's much better to fail fast than to be scratching your head when an intermittent bug crops up in production. For database objects, using surrogate keys acknowledges that everything about an object can change over time, yet it is still essentially the same object (The Artist Formerly Known as Prince). For non-database objects (including those that just haven't been given a surrogate key yet), you must compare individual fields.

With care, you can ensure consistency of equals() and compareTo() by defining one in terms of the other, but be careful not to create an infinite loop by defining them both in terms of each other!
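One safe direction is to define equals() in terms of compareTo() while leaving compareTo() self-contained, sketched here with a hypothetical Version class:

```java
public class Version implements Comparable<Version> {
    // Hypothetical example: a small class with a natural ordering.
    private final int major;
    private final int minor;

    public Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    // compareTo() is self-contained: it compares fields directly.
    @Override public int compareTo(Version that) {
        if (major != that.major) { return major < that.major ? -1 : 1; }
        if (minor != that.minor) { return minor < that.minor ? -1 : 1; }
        return 0;
    }

    // equals() delegates to compareTo(): consistent by construction,
    // and no loop because compareTo() never calls equals().
    @Override public boolean equals(Object other) {
        return (other instanceof Version) && compareTo((Version) other) == 0;
    }

    @Override public int hashCode() { return 31 * major + minor; }

    public static void main(String[] args) {
        System.out.println(new Version(1, 2).equals(new Version(1, 2)));        // true
        System.out.println(new Version(1, 2).compareTo(new Version(1, 3)) < 0); // true
    }
}
```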

Persistence/Hibernate

Persistence and communication frameworks create temporary proxy objects in order to avoid fetching extra objects from the database before they are needed. Hibernate replaces a proxy with the actual object the first time a field other than id is accessed, or the first time any method other than a persistent field accessor is called. All of the above examples are designed to work with a persistence framework like Hibernate.

So your object can trust itself to be initialized inside equals(), hashCode(), and compareTo(), but it should NOT trust that the other object being compared is initialized! You can access the this.whatever fields directly, but always use that.getWhatever().

Scala's Case Classes

Declaring your class as a "case" class in Scala takes care of all the above items for you. It prevents inheritance, but for simple classes it saves a ton of thought and typing! For non-case classes, you must do more work in Scala than in Java to support meaningful equals comparisons with inheritance: you also have to implement a canEqual() method. The idea is that a parent class might consider itself "close enough" to a child class, while the child considers the two different (because it defines extra fields relevant to equals()). So the child implements canEqual() and the parent checks it, which blocks the parent from declaring them equal. I've never been bitten by this in Java, but I don't immediately see what prevents it.

Clojure

All Clojure's common built-in datatypes are immutable and implement the above methods for you, making them extremely easy to work with.

SerialVersionUID

I have not verified this, but it stands to reason that if you change hashCode() you probably need to update the serialVersionUID just as you would if you changed any persistent field. Otherwise, you may end up with two copies of the same object in a set (one with the old hashCode and one with the new). I'm not sure if this can happen in practice or not. Maybe someone will post test code in the comments that proves it one way or the other?

Sunday, December 12, 2010

Software Development has only One Metric that Matters

Having read and thoroughly enjoyed More Joel on Software I bought myself Joel on Software this week and find it to be similarly wonderful. Both books are basically just hard-copies of his blog and make for entertaining reading even though they are packed with knowledge from decades of successful software development.

Joel's Measurement article from 2002 is not his best article, but reading Joel helped me crystallize some vague notions that have been bumping around my head for years. The aspects of software that are easiest to measure are generally the least valuable measurements. For instance: lines of code. The more lines of code, generally the worse your software is; it's bloated and complicated. In general, the fewer lines of code for the same functionality, the better, though taken to an extreme, you can make something completely illegible and impossible to change without throwing it out and starting over. How many lines of code are appropriate for the problem you are trying to solve?

Similarly, increasing complexity usually makes a product buggy, unusable, or both. But decreasing complexity, taken to an extreme, can make a product useless (it doesn't do what it needs to). How do you measure the level of complexity that is "just right" for the problem you are trying to solve? Bug count is an interesting metric because even though fewer bugs is better, the Heisenberg principle comes into play: there's no way of measuring bugs without skewing your results. Scott Adams sums it up beautifully:
http://www.joeindie.com/images/dilbert-minivan.gif

But there is one metric that combines and trumps all others in just about every meaningful way: customer satisfaction. Does the software solve the real-world problem it was intended to solve for the people who need it most? That's the only thing that matters. Not cyclomatic complexity, efferent coupling, or any other measurement that a computer can make on the code directly. It has to meet someone's needs.

I recently saw Objectified, which was an interesting film. But I didn't know whether to laugh out loud or stare in horror at one artist who made a robot that required human attention to do what it needed. The clip shows a woman dressed like a flight attendant leaning down so that this thing could whisper in her ear, so that she would know to move it to the other side of the room. This is exactly how much of our software fails. The technology we create is supposed to make our lives easier, better, more enriching. Not make us its slave.

How often is the new version of a product a step backward from the old? I remember one person I worked with actually advertising that with the new version of his architectural component, what used to take you one click now takes you several and makes you wait longer. How is that supposed to be a good thing? It's slower and more difficult... why?

Rackspace is on to something, realizing that what you get from a hosting company is not servERS, but servICE. Anyone can set up a few servers. But the first time your server goes down and your hosting company doesn't or can't respond, you realize that the service is what counts. Maybe I'm pushing this a little too far here, but I think software development has more in common with a hosting company than a discount store. That meeting the customer's needs, providing excellent SERVICE is more important than the implementation details of the product. The software is more like an extension of that service (it serves the customer instead of a human serving the customer) than like a shrink-wrapped product.

Providing an effective autonomous electronic servant means understanding the customer's need and designing something that meets that need, then communicating that understanding to the people who actually have to build the software. Get them excited about, or at least involved in solving the customer's actual problem instead of just thinking about some architectural detail or slavishly following a spec.

Obviously, there are pitfalls. In The Iceberg Secret Revealed, Joel says that "Customers don't know what they want." And it's true. In Make Users Happy by Ignoring Requirements I discuss what I should have called the "Excel Syndrome" where users describe the problem as if Excel were the solution. It's not. If it were, they would have made a spreadsheet instead of hiring you.

One last thing... When I say, "customer" I don't mean just the people your company serves. I mean the target audience for your software, which may be inside your company instead of outside it. When I worked for Fidelity, I worked for a little group called FMTC (now Pyramis) that handled retirement plans for large organizations. I think the minimum amount to open an account was over a million dollars. After years of working on the "customer-facing web-site" I learned that the primary users of the site were a handful of customer service people within Fidelity. Customers would call them up, ask a question, and the internal rep would use the web site to find the answer. Had we known this up front, we might have designed it very differently. That was years ago and most people are comfortable logging in and accessing their own account nowadays, but if you are in charge of billions of dollars, you may still have your secretary call the investment company and hand you the phone to get the answer to your question verbally. No password, no logging in, just "Yes Mr. Big-Wig. It's at 42 billion and change Mr. Big-Wig. I'd be happy to explain that for you..."

In short:
1.) Find out the real need.
2.) Meet it.
3.) Measure your success by asking your customers.
4.) Do it better next time (PDCA).