Wednesday, December 12, 2007

Make Users Happy by Ignoring Requirements

People talk about the system they want you to build in the language of what they have now. It is natural to not want to give anything up. Writing web applications, I hear over and over that people want to be able to sort by every column, have column headings print out on every page, filter by any value... That's Microsoft Excel - and it is a truly incredible tool for doing all of those things.

On the other hand, most business systems (as opposed to software systems) are dependent on communication to make people with different skills work together. In general, if you can meet those needs, you can always add a CSV download later if people still want to "slice and dice" data in Excel. I have had consistently good results from ignoring the "Excel on the Web" requirements and asking my clients instead to describe the people, the tasks, and the communication involved in their processes.

I try to break a process into the sets of tasks that each person (or type of user) needs to perform. The usual goal is to show someone only the data they really need to perform their task(s), and give them the tools they need to accomplish that task on the same screen. Sometimes, similar tasks can share a screen or multiple screens might be required for a task, but the goal is to compartmentalize and specialize the business system into units that use common sets of data and functionality.

Example 1: Ignoring User Requirements

Below are the requirements I was given for a year-long project:
  1. A list of the documents
  2. Columns showing which ones are ready, which ones are late, and who is responsible for each
  3. Sort by any column
  4. Filter by any column
  5. Print out with column-headings on the top of every page...

Sounds like Excel, no? I only actually met requirement #1, yet the client was delighted. Why? Because the list of Excel features I was given tipped me off that the requirements were not well thought through.

Here is a list of real requirements that a team of us had to dig for:
  1. A draft document needs to be written by an analyst
  2. The document must be reviewed and approved
  3. Financial reports need to be produced by a system
  4. Financial reports need to be approved by a person
  5. The documents and financials need to be combined into various client reports
  6. The client reports need to be printed
  7. The client reports need to be mailed
  8. A manager needs to keep track of all of the above

Developers understand systems and the benefits and limitations of their technology. Businesspeople know what they need to accomplish, but are often only dimly aware of the business systems they use to achieve their goals. Bridging that communication gap is the hardest part of writing good software.

Example 2: Ignoring Systems Requirements

Stated requirements for a reporting web site that I worked on (on-and-off) for 6 years:
  1. High availability
  2. Load balancing
  3. Portal
  4. pluggable, snappable, replaceable software components
  5. Thin, rich client
  6. Fully denormalized reusable data source
  7. Dashboarding
  8. 3-tiered communication layer

Today, you could add to those requirements:
  1. Stateless (RESTful)
  2. SOA (Service-oriented architecture)
  3. AJAX
  4. Web 2.0 (or 3.0, or whatever)

What the business actually needed was:
  1. Show quarterly reports as soon as the data was approved to show
  2. Secure
  3. Support a maximum of 20 simultaneous users
  4. Available 8AM-8PM Eastern (US) time
  5. Show some other marketing information

One web server would have been more than adequate, but we had two. The slow part was actually some of the queries/reports in the database. The only parts of the site we ever plugged/snapped/reused were some hand-coded HTML pages and the login code. If we had known enough to discover the actual needs of the users, we could have saved at least 50% of the effort and used some of that time to further optimize the long-running queries, or even redesign the database to make the queries simple enough that they wouldn't need optimization.

Well, 20/20 hindsight is easy. That's how you learn...

Conclusion

When requirements read like a winning "Buzzword Bingo" card, it's almost a sure sign they weren't thought out very carefully. Time spent digging for real requirements pays off in both medium- and long-term system usefulness and cost savings.

Sunday, September 30, 2007

On Generating Random Keys for Use in Cryptography

Computers are deterministic state machines - totally incapable of producing anything random. They employ Pseudo-Random Number Generators: functions that appear to produce random output with respect to their input. Good Pseudo-Random Number Generators (PRNGs) can produce thousands of values a second with a surprising degree of entropy. But even the best PRNG is limited by the variability of its “seed” or input values, and none have perfectly even distribution or an absence of patterns in their output.

Bits of Entropy:
In cryptography, key size, measured in bits (or powers of 2), represents the number of keys which must be tested to break an encrypted message by brute force. But encryption algorithms are generally “lossy” because, being deterministic, they have some degree of predictability and/or collisions of different keys producing the same output values. Some algorithms are “lossier” than others and there are many statements in the literature to the effect that a 1024-bit key using one algorithm is just as secure as an 80-bit key using another. That means, bit for bit, one algorithm is 2^944 (about 1.49×10^284) times more secure than the other. Despite common misuse in advertising claims, bits are the "standard" unit of entropy. In this writing, they represent a mathematically derived number of equally likely possible combinations, expressed in powers of 2, not the computer space used to store a value generated with a PRNG.
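The size of that gap can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes "just as secure" means equal brute-force work, so the two key spaces are compared directly. The function name keyspace_ratio is mine, not from any library.

```python
import math

def keyspace_ratio(bits_a: int, bits_b: int) -> int:
    """Return how many times larger a bits_a-bit key space is than a bits_b-bit one."""
    return 2 ** (bits_a - bits_b)

# 1024-bit keys versus 80-bit keys: the ratio is 2**944, held as an exact integer.
ratio = keyspace_ratio(1024, 80)

# Convert to scientific notation for reading.
exponent = math.floor(math.log10(ratio))   # decimal order of magnitude
mantissa = ratio / 10 ** exponent          # leading digits

print(f"2^944 is about {mantissa:.2f} x 10^{exponent}")
```

Python integers are arbitrary precision, so the ratio is computed exactly and only rounded for display.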

Dirty Secrets of Pseudo-Random Number Generators:
Given the same initial input, a PRNG will not only produce the same output value, but will produce the same sequence of values every time, and that sequence is generally your automatically generated cryptographic key or password! The "moving parts" inside most PRNGs are 64-bit and 32-bit registers and this limits their entropy. But PRNGs are often more limited by the precision of their input or seed. Since the few somewhat random processes in the computer tend to produce clusters of values, they are unsuitable as random number seeds. The only part of the computer that produces an even dispersion of values is the system clock, which measures milliseconds.
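The repeatability is easy to see for yourself. The sketch below uses Python's standard random module (a Mersenne Twister, which is not meant for cryptography) purely as an illustration; the seed value and the helper name key_bytes are made up for the example.

```python
import random

def key_bytes(seed: int, length: int = 8) -> list[int]:
    """Generate `length` pseudo-random byte values from a PRNG seeded with `seed`."""
    rng = random.Random(seed)          # private generator, seeded explicitly
    return [rng.randrange(256) for _ in range(length)]

# A hypothetical clock reading in milliseconds, used as the seed.
seed_ms = 1_190_000_000_000

first = key_bytes(seed_ms)
second = key_bytes(seed_ms)

# Same seed in, same "random" key out, every time.
print(first)
print(second)
print("identical:", first == second)
```

Anyone who can guess the seed can regenerate the key, which is why the range of possible seeds matters more than the size of the generator.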

There are only 8.64×10^7 milliseconds in a day. That's about 26 bits of entropy. Using the 95 characters on a US-keyboard, a 5-character truly-random password is more secure! So why not use a longer time frame? There are 3.1×10^11 milliseconds in 10 years which is only 38 bits of entropy – and who has a password older than 10 years? Most people generate their keys at work, between the hours of 9-5 M-F which is only one fourth of the hours in the week, so you might be losing about 2 bits of entropy there.
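These figures are just base-2 logarithms of the number of equally likely outcomes. Here is a rough check in Python, assuming 365-day years and a 40-hour work week (both are my simplifications); the helper name bits is hypothetical.

```python
import math

def bits(outcomes: float) -> float:
    """Entropy in bits of a choice among `outcomes` equally likely values."""
    return math.log2(outcomes)

MS_PER_DAY = 24 * 60 * 60 * 1000          # 86,400,000
MS_PER_10_YEARS = MS_PER_DAY * 365 * 10   # ignoring leap days
KEYBOARD_CHARS = 95                       # printable characters on a US keyboard

day_bits = bits(MS_PER_DAY)
decade_bits = bits(MS_PER_10_YEARS)
password_bits = bits(KEYBOARD_CHARS ** 5)

# Only 40 of the 168 hours in a week are working hours.
work_hours_loss = bits(168 / 40)

print(f"one day of milliseconds:   {day_bits:.1f} bits")
print(f"ten years of milliseconds: {decade_bits:.1f} bits")
print(f"5 random characters:       {password_bits:.1f} bits")
print(f"lost to office hours:      {work_hours_loss:.1f} bits")
```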

So even if you have a 64-bit random number generator, unless you can get a seed with more than 36 bits of entropy, you only get 36 bits of entropy in your key. That’s less than a truly random 6-character key (95^6, or about 39 bits). How many bits of randomness does any computer-generated encryption key contain? No more than the bits of entropy in the seed used to create it, plus a few more for the number of PRNG algorithms that could have been used (each doubling of the candidate algorithms adds only one bit).
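To make the consequence concrete, here is a minimal sketch of how an attacker could recover a clock-seeded key by trying every millisecond in a time window. It assumes the attacker knows the generator and the key format. Python's random module stands in for "some PRNG", and the timestamps and function names are invented for the example. The window here is one minute so that it runs quickly; a day or a decade is the same loop with more iterations.

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits

def key_from_seed(seed_ms: int, length: int = 16) -> str:
    """Derive a key from a millisecond timestamp, the way a naive generator might."""
    rng = random.Random(seed_ms)
    return "".join(rng.choice(ALPHABET) for _ in range(length))

def recover_seed(key: str, start_ms: int, end_ms: int):
    """Try every millisecond in [start_ms, end_ms) until the key is reproduced."""
    for candidate in range(start_ms, end_ms):
        if key_from_seed(candidate, len(key)) == key:
            return candidate
    return None

# Hypothetical scenario: the key was generated somewhere in a known minute.
window_start = 1_190_000_000_000
secret_seed = window_start + 12_345
key = key_from_seed(secret_seed)

found = recover_seed(key, window_start, window_start + 60_000)
print("recovered the seed:", found == secret_seed)
```

The key is 16 characters long, but the search space is only the 60,000 possible seeds, not the 62^16 possible strings.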

Links:
  • How We Learned to Cheat at Online Poker: A Study in Software Security - If you liked my article here, you will probably like this article even more. One of my favorite reads on the web!

  • The Memorability and Security of Passwords - Some Empirical Results - Using initial letters and punctuation from pass-phrases is as easy to remember as the most commonly hacked English words, but is as hard to crack as truly random characters... until cracking tools learn what to look for. In the meantime, this is the best advice I've seen on telling an end-user how to construct a secure password.

  • Entropy Gathering Daemon - A great way to seed a random-number generator. The only complaint I've heard is that EGD can block, waiting for more entropy if you use it heavily in software such as a web server.

  • KeePass password store - has an entropy-gatherer built-in for generating random passwords... Nice!

  • RANDU - Wikipedia article about an infamous random number generator.

  • Humor from xkcd: Random Number, Code Talkers


Creative Commons License