Wednesday, September 14, 2011

Automatic POJO Generation From a Database

Like JPA (or presumably Spring), Hibernate "reverse engineering" tools can generate POJOs (Plain Old Java Objects) from database tables and vice-versa. Generating database tables from Java code is probably best used as a one-time short-cut, suitable for rapid prototyping. Because everything in an application is dependent on the database (and not vice-versa), future changes must be made in the database first (and any existing data migrated there first as well), then propagated to all affected parts of the application.

I have found that anything which the "database reverse engineering" process does not generate for me breaks, usually sooner rather than later. Also that the hardest part of managing maintenance of a large system is the constant refactoring. To that end, I have developed 2 goals for the database reverse-engineering process:

  1. It should keep my application in synch with any database changes, automatically updating as much of my application as is practical.
  2. Where #1 is not possible, it should cause any affected areas of my application to generate a compile-time error.

Imagine a database table:

CREATE TABLE `user` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `last_modifier_id` bigint(20) unsigned DEFAULT NULL,
  `company_id` bigint(20) unsigned NOT NULL,
  `identifier` varchar(64) NOT NULL COMMENT 'PIN.  Unique within a company',
  `first_name_c` varchar(40) DEFAULT NULL,
etc.

Database generation tools normally generate the following sort of fields:

public class User implements java.io.Serializable {

private long id;
private User lastModifier;
private Company company;
private String identifier;
private String firstNameC;
etc.

public long getId() { return id; }
public void setId(long x) { id = x; }

public User getLastModifier() { return lastModifier; }
public void setLastModifier(User x) { lastModifier = x; }

public Company getCompany() { return company; }
public void setCompany(Company x) { company = x; }

public String getIdentifier() { return identifier; }
public void setIdentifier(String x) { identifier = x; }

public String getFirstNameC() { return firstNameC; }
public void setFirstNameC(String x) { firstNameC = x; }
etc.

I have modified the generator to also generate the following fields for each column in the table (prefixed with SQL_) and for each field in the Java object (prefixed with HQL_):

public static final String SQL_user = "user"; // table name
public static final String SQL_id = "id";
public static final String SQL_last_modifier_id = "last_modifier_id";
public static final String SQL_company_id = "company_id";
public static final String SQL_identifier = "identifier";
public static final String SQL_first_name_c = "first_name_c";
etc.

public static final String HQL_User = "User"; // class name
public static final String HQL_id = "id";
public static final String HQL_lastModifier = "lastModifier";
public static final String HQL_company = "company";
public static final String HQL_identifier = "identifier";
public static final String HQL_firstNameC = "firstNameC";
etc.

This allows me to write code like the following:

Crit<User> dupIdentCrit = Crit.create(User.class);
dupIdentCrit.add(Restrictions.eq(User.HQL_company, this));
dupIdentCrit.add(Restrictions.eq(User.HQL_identifier, identifier));
List<User> dupIdentifierUsers = dupIdentCrit.list();

Here's an example of building a SQL statement for use with JDBC:

StringBuilder sqlB = new StringBuilder("select * from " + User.SQL_user +
                                       " where " + User.SQL_company_id +
                                       " = ?");
if (!includeDeleted) {
    sqlB.append(" and " + User.SQL_is_deleted + " is false");
}
sqlB.append(" order by " + User.SQL_last_name_c + ", " +
            User.SQL_first_name_c + ", " +
            User.SQL_middle_name_c);

When the database changes and I rebuild my objects, I get compile-time errors, showing me every piece of code I need to fix.

I am working on modifying the generator to also generate the following for any varchar or char fields because their maximum length is defined in the database columns:

public static final int MAX_identifier = 64;
public static final int MAX_firstNameC = 40;
etc.

I am considering parsing comments on each column to look for something like: 'min: 6' and generate the following from it:

public static final int MIN_identifier = 6;

That same technique could be used with int columns to define minimum and maximum values which could be used in validation, in the GUI, and in the help of an appliction. I'm also experimenting with using the HQL_ tokens as the names of the input fields on the GUI screens.

I'm not sure how or even if this would work with stored procedures.

Special thanks to J. Michael Greata, whose interest and suggestions over the past several years have encouraged and shaped the development of this project. Also to Arash Tavakoli whose post in the LinkedIn Java Architects group inspired me to organize my thoughts into this posting.

No comments: