Regarding Programming: January 2006

Inversion of Control seems to be one of the new rages (I say one of the new rages because it looks like the programming community, and Java in particular, is splintering into many different factions, each more rabid than the next). In the Java world, Spring is one of the best-known frameworks that make Inversion of Control (IoC) easy to use. The Spring framework website has a quick overview of IoC if you're not familiar with it.

I agree that IoC can be a very useful programming style, because my own experiences in trying to build truly reusable components have led me to construct my own code in a similar fashion. The idea is very simple: reusable components often rely on certain objects that they should not be in charge of instantiating, because the actual object to be used should be determined further down the line by other components. So instead of instantiating these objects in your reusable components, simply pass instances of those objects into the reusable components, and it's a win-win situation.

For example, let's pretend you are writing a data access layer for your enterprise application (yes I know, most of the time an O/R mapping tool should be used instead). Specifically, you need to write a data access object that will persist User objects, where User objects have the usual username and userid fields, among other fields. Your first pass at this might contain something like this:


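import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class UserDAO {
  private Connection conn;
  // The JDBC URL and credentials are fixed at compile time.
  public UserDAO () throws SQLException {
    conn = DriverManager.getConnection( "jdbc:some:driver", "username", "password" );
  }
  public User loadUser ( int userID ) throws SQLException {
    // use conn to access the database
    return null; // placeholder until the query is written
  }
}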

Of course, as many people have found out when they write code like this, you have now locked this code into one particular use. You have locked in the JDBC URL, which means you have forced any application that wants to use this DAO to connect to one particular database. What happens if you want the DAO to connect to the QA database instead of the DEV database? What about the production database?

To fix this, many people then wrote this version, as Sun tells us to in their "official" J2EE guides:


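import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

public class UserDAO {
  private Connection conn;
  // The DataSource is looked up by name in JNDI.
  public UserDAO () throws NamingException, SQLException {
    Context ctx = new InitialContext( );
    DataSource ds = (DataSource) ctx.lookup( "jdbc/MyDataSource" );
    conn = ds.getConnection( );
  }
  public User loadUser ( int userID ) throws SQLException {
    // use conn to access the database
    return null; // placeholder until the query is written
  }
}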

This appears to be better, because now the JDBC URL and username/password are configured by the app server and no longer hard-coded. Theoretically, when you want to reuse this component, all you have to do is make sure a DataSource is defined in JNDI with the given name. But that's the problem: the code is now restricted to environments that are JNDI-enabled. If you want to use this DAO in a stand-alone Java application, you will have to construct an entire JNDI environment and expose a DataSource in that environment, just so that this DAO will run properly. That's an awful lot to do just to reuse one component. You're likely to give up and rewrite UserDAO rather than figure out how to do all of that, especially if all you're trying to do is write a simple Java app to fetch a report from the database.

Even worse, this JNDI requirement does not appear anywhere in the declaration of the class. While you could note this requirement in the javadocs for this class, there is no way a compiler can enforce that the requirement is met. You won't find out that the component doesn't work in your application until runtime, after you've written reams of code that rely on it. This is part of a greater trend that has been a part of Java's philosophy from the beginning: Java is always willing to sacrifice compile-time ("static") safety for runtime ("dynamic") capabilities. It is difficult to write code that provides strong compile-time safety, while the most interesting technologies that have come about surrounding Java have usually harnessed the runtime capabilities of the VM.
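To see how late this failure surfaces, here is a minimal sketch that performs the same lookup from a plain JVM with no JNDI provider configured. The class JndiOutsideContainer and its lookupFails helper are made up for illustration; only the JNDI name comes from the example above. Everything compiles cleanly, and the missing environment only shows up when the lookup actually runs (typically as a NoInitialContextException, which is a kind of NamingException).

```java
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

// Hypothetical demo class, not part of the DAO itself.
class JndiOutsideContainer {

    // Returns true when the lookup fails because no usable JNDI entry exists.
    static boolean lookupFails() {
        try {
            Context ctx = new InitialContext();
            ctx.lookup("jdbc/MyDataSource");
            return false; // a container (or another JNDI provider) supplied the entry
        } catch (NamingException e) {
            // Nothing at compile time warned us about this.
            System.out.println("JNDI lookup failed: " + e.getClass().getName());
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("lookup fails: " + lookupFails());
    }
}
```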

A better way to write this code for reusability is to adopt the approach that if a component relies on another object, that object should be passed in by the caller:


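import java.sql.Connection;
import java.sql.SQLException;

public class UserDAO {
  private final Connection conn;
  // The caller decides where the Connection comes from.
  public UserDAO ( Connection cn ) {
    conn = cn;
  }
  public User loadUser ( int userID ) throws SQLException {
    // use conn to access the database
    return null; // placeholder until the query is written
  }
}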

Now, the UserDAO class makes no assumptions about how to obtain the database connection. A web-based application can obtain the Connection via JNDI while a standalone application can use the DriverManager to directly obtain a Connection. The UserDAO class is now more reusable than it was before. At this point you have effectively used Inversion of Control. There is only one more step to go.
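As a rough illustration of the caller being in charge, the sketch below hands the DAO a stand-in Connection built with a dynamic proxy, so it runs without any database. The isReady method, the CallerChoosesConnection class, and the stub itself are assumptions made up for this sketch; a real caller would pass a Connection obtained from DriverManager or from a JNDI DataSource instead.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;

// Same shape as the UserDAO above. isReady() is a made-up method so the
// sketch has something to call without a real database.
class UserDAO {
    private final Connection conn;

    UserDAO(Connection cn) {
        conn = cn;
    }

    boolean isReady() throws SQLException {
        return !conn.isClosed();
    }
}

// Hypothetical caller that decides which Connection the DAO gets.
class CallerChoosesConnection {

    // Stand-in Connection: answers isClosed() and rejects everything else.
    static Connection stubConnection() {
        return (Connection) Proxy.newProxyInstance(
                CallerChoosesConnection.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                (proxy, method, args) -> {
                    if (method.getName().equals("isClosed")) {
                        return Boolean.FALSE;
                    }
                    throw new UnsupportedOperationException(method.getName());
                });
    }

    public static void main(String[] args) throws SQLException {
        // The DAO neither knows nor cares where this Connection came from.
        UserDAO dao = new UserDAO(stubConnection());
        System.out.println("ready: " + dao.isReady());
    }
}
```

The DAO code is identical whichever Connection is supplied, which is the whole point of passing it in.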

As a side note, the DAO should really take a DataSource and not a Connection, but I'll leave my opinions on that for a separate post. As a second note, some practitioners of IoC would insist that setters be used to set the Connection object onto UserDAO instead of using the constructor, but I disagree. Again, a topic for a separate post.

This style of coding points to a more general coding paradigm. You could treat the methods in the UserDAO class as implementing algorithms which operate on generic interfaces that are passed in, without needing to know exactly how those interfaces are implemented. This is essentially polymorphism taken to the level of generic programming. You can focus on writing reusable algorithms by having the algorithm manipulate generic interfaces, and then somebody else is responsible for providing the actual implementations of those generic interfaces.

But when writing code in this manner, a central concern is deciding how to instantiate the concrete objects. If everyone is just dealing with interfaces, not caring what the concrete objects are, who actually creates the concrete object? This is why the GoF authors devote the first of their three pattern categories, the creational patterns, to ways of instantiating an object. As an architect, if you want to write reusable components that require instances of objects to be passed in, you will probably have to determine at what layer in the application those objects will be passed in, and you will probably have to use numerous patterns to enable new instances of those objects to be created when the reusable components don't know the actual objects they have references to.

This is the final step in Inversion of Control, and it is where the IoC containers come in handy. The IoC containers provide factories where you can configure, via XML files, what specific objects need to be passed into your reusable components. This enables you to push the actual instantiations of the objects all the way into configuration files. Theoretically this should enable your reusable components to be wired up with other components very quickly and easily, with none of them needing to know the actual objects they are using.
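To make the idea of pushing instantiation into configuration concrete, here is a minimal sketch of a factory that reads the implementation class name from configuration text and creates it reflectively. It is not Spring's API and it uses java.util.Properties rather than XML; ConfiguredFactory, DevUserDAO, QaUserDAO, the describe method, and the userDAO.class key are all names invented for this sketch.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Trimmed-down interface for the sketch; describe() is a made-up method.
interface IUserDAO {
    String describe();
}

class DevUserDAO implements IUserDAO {
    public String describe() { return "DAO for the DEV database"; }
}

class QaUserDAO implements IUserDAO {
    public String describe() { return "DAO for the QA database"; }
}

// Hypothetical factory: the concrete class is named in configuration, not in code.
class ConfiguredFactory {
    private final Properties config = new Properties();

    ConfiguredFactory(String configText) throws IOException {
        config.load(new StringReader(configText));
    }

    IUserDAO createUserDAO() throws ReflectiveOperationException {
        String className = config.getProperty("userDAO.class");
        Class<?> type = Class.forName(className);
        return (IUserDAO) type.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // In a real setup this text would live in a file; it is built inline
        // here so the sketch runs anywhere.
        String configText = "userDAO.class=" + QaUserDAO.class.getName() + "\n";
        IUserDAO dao = new ConfiguredFactory(configText).createUserDAO();
        System.out.println(dao.describe());
    }
}
```

Switching from QA to DEV is then a one-line change to the configuration, with no recompile of the code that uses IUserDAO.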

There's one big rub to using IoC containers though: you can't avoid the laws of object instantiation. At some point, somewhere in your code, you will have to hard-code how your objects are instantiated. In a web application, some piece of code (usually in the servlet) is going to instantiate a Spring BeanFactory or ApplicationContext and use that to instantiate other objects. For a standalone application, you might do this in main() or in some Factory object somewhere. There's no other way for objects to just magically appear in your code, even with IoC.

If you want to limit how much of your application relies on the IoC container to instantiate objects (and if you've made all this effort to hide how objects are instantiated so far, you're likely to want to), you need to carefully construct all your application layers so that only a few select layers (ideally the top-most layers) use the BeanFactory to instantiate objects. Everything else should be handled by the BeanFactory.

For example, let's say that the UserDAO is itself used by a business object UserBO that implements some business logic surrounding the manipulation of Users. If you want the UserBO object itself to be reusable across multiple applications, then UserBO shouldn't instantiate a UserDAO object directly, because if it did it would have to know what database Connection to use, and we'd have the same set of problems we had with reusing UserDAO. You might be tempted to have the UserBO grab a handle to a Spring BeanFactory and instantiate the UserDAO that way, but if you do that then you've tied UserBO to Spring, and that forces any application that wants to reuse UserBO to use Spring, which might not always be feasible. For maximum reusability of UserBO, it should either be passed a generic interface to some Factory class that will return objects, or it should be passed an instance of the actual UserDAO to use (in which case UserDAO should become an interface). There are numerous issues with attempting to pass in a generic Factory which are beyond the scope of this blog entry, so instead let's assume that you pass in an IUserDAO interface that represents the actual UserDAO for the UserBO to use.
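A minimal sketch of that arrangement might look like the following. The IUserDAO and UserBO names come from the discussion above; the User fields are trimmed to two, and InMemoryUserDAO, greetingFor, and UserBODemo are assumptions invented here so the example runs without a database.

```java
import java.util.HashMap;
import java.util.Map;

// Trimmed-down User with the two fields mentioned earlier.
class User {
    final int userID;
    final String username;

    User(int userID, String username) {
        this.userID = userID;
        this.username = username;
    }
}

// UserBO depends only on this interface, never on a concrete DAO.
interface IUserDAO {
    User loadUser(int userID);
}

// Hypothetical stand-in; a JDBC-backed DAO would implement the same interface.
class InMemoryUserDAO implements IUserDAO {
    private final Map<Integer, User> users = new HashMap<>();

    void add(User user) {
        users.put(user.userID, user);
    }

    @Override
    public User loadUser(int userID) {
        return users.get(userID);
    }
}

class UserBO {
    private final IUserDAO dao;

    // The DAO is passed in, so UserBO knows nothing about Spring, JNDI, or JDBC.
    UserBO(IUserDAO dao) {
        this.dao = dao;
    }

    // Made-up business rule, purely for illustration.
    String greetingFor(int userID) {
        User user = dao.loadUser(userID);
        return user == null ? "Unknown user" : "Hello, " + user.username;
    }
}

class UserBODemo {
    public static void main(String[] args) {
        InMemoryUserDAO dao = new InMemoryUserDAO();
        dao.add(new User(7, "alice"));
        UserBO bo = new UserBO(dao);
        System.out.println(bo.greetingFor(7));
        System.out.println(bo.greetingFor(8));
    }
}
```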

At this point we still need to figure out who will actually instantiate the IUserDAO object to pass into the UserBO, so logically we go up one more layer. If this is a three-tier application, then the next layer up is the presentation layer. So, what if we have the presentation layer use Spring's BeanFactory to instantiate the actual UserDAO, and then send it into the UserBO? That works ... except that conceptually it's pretty ugly. Why does the presentation layer know exactly what DAO the business layer is supposed to use? That doesn't make much sense.

The solution in Spring is to have the UserBO configured via IoC to have the appropriate DAO passed in, and then have the presentation layer use Spring's BeanFactory to instantiate the UserBO. That way the presentation layer only knows about the UserBO, and the rest is configured in Spring. Specifically, this means that: 1.) the UserBO is configured in XML so that the appropriate UserDAO is passed in, 2.) the UserDAO is configured in XML so that the appropriate DataSource is passed in, and 3.) all of them get automatically instantiated by Spring when somebody uses the BeanFactory to instantiate a UserBO. This sort of chaining of configured objects is apparently common in the Spring framework, where all the objects are set up like dominoes, so that one layer instantiates one object, which triggers a whole sequence of other objects to be instantiated as well.
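The domino effect is easier to see in a toy version. The sketch below is a hand-rolled registry, not Spring and not its API: TinyContainer, FakeDataSource, Wiring, and the bean names are all invented for illustration, and Java lambdas stand in for the XML definitions. Asking for the UserBO is the only request the top layer makes, yet it pulls in the UserDAO, which in turn pulls in the data source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Toy container: each bean name maps to a recipe that may ask for other beans.
class TinyContainer {
    private final Map<String, Function<TinyContainer, Object>> recipes = new HashMap<>();
    private final Map<String, Object> instances = new HashMap<>();
    final List<String> creationOrder = new ArrayList<>();

    void register(String name, Function<TinyContainer, Object> recipe) {
        recipes.put(name, recipe);
    }

    Object getBean(String name) {
        Object existing = instances.get(name);
        if (existing != null) {
            return existing;
        }
        Function<TinyContainer, Object> recipe = recipes.get(name);
        if (recipe == null) {
            throw new IllegalArgumentException("No bean named " + name);
        }
        Object created = recipe.apply(this); // may recursively create dependencies
        instances.put(name, created);
        creationOrder.add(name);
        return created;
    }
}

// Stand-ins for the real classes, reduced to their dependencies.
class FakeDataSource {
    final String url;
    FakeDataSource(String url) { this.url = url; }
}

class UserDAO {
    final FakeDataSource dataSource;
    UserDAO(FakeDataSource dataSource) { this.dataSource = dataSource; }
}

class UserBO {
    final UserDAO dao;
    UserBO(UserDAO dao) { this.dao = dao; }
}

class Wiring {
    // Plays the role of the XML file: who gets passed into whom.
    static TinyContainer configure() {
        TinyContainer c = new TinyContainer();
        c.register("dataSource", k -> new FakeDataSource("jdbc:some:driver"));
        c.register("userDAO", k -> new UserDAO((FakeDataSource) k.getBean("dataSource")));
        c.register("userBO", k -> new UserBO((UserDAO) k.getBean("userDAO")));
        return c;
    }

    public static void main(String[] args) {
        TinyContainer c = configure();
        UserBO bo = (UserBO) c.getBean("userBO"); // the only request the top layer makes
        System.out.println(c.creationOrder);      // prints [dataSource, userDAO, userBO]
        System.out.println(bo.dao.dataSource.url);
    }
}
```

Even in this toy form, the wiring lives in one place and every recipe has to agree with the others on names and types, which is the coordination cost discussed next.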

If this sounds like it might be a little bit complicated to understand and properly configure, I don't think you're alone. Anytime configuring things requires that much coordination, it's usually a warning sign that long-term maintainability is going to be difficult. But only time will tell whether this is true. I personally have not had to maintain a long-term project that used Spring in this manner, so I can't speak to whether it is maintainable or not. In any case, some could argue that this is the same as hard-coding all of these instantiations into the code itself, and thus no harder to maintain, and potentially easier to maintain because the XML files expose the interrelationships more clearly than code does.

In the meantime, one big question is whether all of this work is worth it. Is it truly that important that a DAO not hard-code a reference to a JNDI DataSource? Will your DAO ever need to be used outside of an application server container? On the projects I've been on, I've found that the answer is usually no. And if that's the case on your project, then you should carefully consider whether or not IoC makes sense.

The determining factor usually appears to be whether you expect your data access objects and/or business objects to be reused across multiple application environments (web-based and client/server, or lightweight appserver versus full J2EE server, etc). One design pattern that has gained a lot of traction in the past few years is for companies to expose their APIs via web services. Instead of exposing a business function via a class library that is used in several apps, that business function is exposed via a web service, and applications invoke the web service to obtain the functionality they need. In this way the reusable code is always running in just one place, and much of the reusability problem goes away. As always, the proper solution to your problem will be specific to the details of your project and your environment.

What have your experiences with IoC containers been? Where have you found the line at which the complexity becomes worth it?

Tuesday, January 03, 2006

Regarding Inversion of Control
