Integrating MongoDB with Spring Batch
Spring Batch is a superb batch framework from, well, Spring. It covers all the concepts of batch architecture and, generally, spares you from reinventing the wheel. It’s cool, really. If you have a batch-oriented application, you must go and take a look at Spring Batch. And if you don’t know what a batch-oriented application is, just think about reading, validating and saving to a database a zillion text files every night, unattended. Now that you know what a batch-oriented application is, go and look at Spring Batch.
Welcome back. As you’ve seen, Spring Batch constantly saves its state in order to be able to recover or restart exactly where it stopped. JobRepository is the bean in charge of saving the state, and its sole implementation uses a layer of data access objects, which currently has two implementations: in-memory maps and JDBC.
Of course, the maps are for testing; the JDBC implementation is the one to use in your production environment, since you have an RDBMS in your application anyway, right? Or not…
Today, when NoSQL is gaining momentum (justified, if you ask me), the assumption that “you always have an RDBMS in an enterprise application” is not true anymore. So, how can you work with Spring Batch now? Using in-memory DAOs? Not good enough. Installing, setting up, maintaining and baby-sitting an RDBMS only for Spring Batch meta-data? Hmm, you’d rather not. There is a great solution: just keep the meta-data in the NoSQL database you use for the application itself. Thanks to Spring, the Spring Batch architecture is modular and loosely coupled, and all you have to do to make it work is re-implement the four DAOs (JobInstanceDao, JobExecutionDao, StepExecutionDao and ExecutionContextDao).
So, here’s the plan:
- Implement each *Dao interface with a NoSqlDb*Dao class of your own
- Add them to the Spring application context
- Create a new SimpleJobRepository, injecting your new NoSqlDb DAOs into it
- Use it instead of the one you would create from JobRepositoryFactoryBean
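To make the plan a bit more concrete, here is a minimal, framework-free sketch of the idea. Everything in it is an assumption made for illustration: the two DAO interfaces are heavily simplified stand-ins for the real Spring Batch ones (which have many more methods), DocumentCollection is a hypothetical in-memory substitute for a NoSQL collection, and SketchJobRepository only plays the role that SimpleJobRepository plays in Spring Batch. The point is just that the repository talks to DAO interfaces, so the storage behind them can be swapped.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

class NoSqlJobRepositorySketch {

    /** Hypothetical in-memory stand-in for one collection of a document database. */
    static class DocumentCollection {
        private final Map<Long, Map<String, Object>> documents = new HashMap<>();
        private final AtomicLong nextId = new AtomicLong();

        /** Stores a copy of the document under a generated id and returns that id. */
        long insert(Map<String, Object> document) {
            long id = nextId.incrementAndGet();
            Map<String, Object> stored = new HashMap<>(document);
            stored.put("_id", id);
            documents.put(id, stored);
            return id;
        }

        Map<String, Object> findById(long id) {
            return documents.get(id);
        }
    }

    /** Simplified DAO contracts (assumption: not the real Spring Batch signatures). */
    interface JobInstanceDao {
        long createJobInstance(String jobName);
    }

    interface JobExecutionDao {
        long saveJobExecution(long jobInstanceId, String status);
        void updateStatus(long jobExecutionId, String status);
        String getStatus(long jobExecutionId);
    }

    /** Step 1 of the plan: a NoSqlDb*Dao implementation per DAO interface. */
    static class NoSqlDbJobInstanceDao implements JobInstanceDao {
        private final DocumentCollection collection;

        NoSqlDbJobInstanceDao(DocumentCollection collection) {
            this.collection = collection;
        }

        @Override
        public long createJobInstance(String jobName) {
            Map<String, Object> document = new HashMap<>();
            document.put("jobName", jobName);
            return collection.insert(document);
        }
    }

    static class NoSqlDbJobExecutionDao implements JobExecutionDao {
        private final DocumentCollection collection;

        NoSqlDbJobExecutionDao(DocumentCollection collection) {
            this.collection = collection;
        }

        @Override
        public long saveJobExecution(long jobInstanceId, String status) {
            Map<String, Object> document = new HashMap<>();
            document.put("jobInstanceId", jobInstanceId);
            document.put("status", status);
            return collection.insert(document);
        }

        @Override
        public void updateStatus(long jobExecutionId, String status) {
            collection.findById(jobExecutionId).put("status", status);
        }

        @Override
        public String getStatus(long jobExecutionId) {
            return (String) collection.findById(jobExecutionId).get("status");
        }
    }

    /** Steps 3 and 4: a repository that only knows the DAO interfaces. */
    static class SketchJobRepository {
        private final JobInstanceDao jobInstanceDao;
        private final JobExecutionDao jobExecutionDao;

        SketchJobRepository(JobInstanceDao jobInstanceDao, JobExecutionDao jobExecutionDao) {
            this.jobInstanceDao = jobInstanceDao;
            this.jobExecutionDao = jobExecutionDao;
        }

        long createJobExecution(String jobName) {
            long instanceId = jobInstanceDao.createJobInstance(jobName);
            return jobExecutionDao.saveJobExecution(instanceId, "STARTING");
        }

        void markCompleted(long jobExecutionId) {
            jobExecutionDao.updateStatus(jobExecutionId, "COMPLETED");
        }

        String statusOf(long jobExecutionId) {
            return jobExecutionDao.getStatus(jobExecutionId);
        }
    }

    public static void main(String[] args) {
        // Step 2 would normally be done by the Spring application context;
        // here the wiring is done by hand to keep the sketch self-contained.
        SketchJobRepository repository = new SketchJobRepository(
                new NoSqlDbJobInstanceDao(new DocumentCollection()),
                new NoSqlDbJobExecutionDao(new DocumentCollection()));
        long executionId = repository.createJobExecution("nightlyImport");
        repository.markCompleted(executionId);
        System.out.println(repository.statusOf(executionId));
    }
}
```

In a real setup the hand wiring in main would live in the Spring application context, and the DAO classes would implement the actual Spring Batch interfaces rather than these reduced ones.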
That was exactly what I did for our customer, implementing the DAOs using MongoDB. Guess what, you must go and take a look at MongoDB. It’s a lightning-fast, schema-less, document-oriented database that kicks ass. When you suddenly have a strange feeling that an RDBMS might not be the best solution for whatever you do, chances are you’d love MongoDB, as I do now. There are use cases in which you just can’t implement whatever you need to do with relational storage. Well, I lied. You can. It will take a year, it will look ugly and perform even worse. That’s my case, and I am just happy the year is 2010 and we know by now that one size doesn’t fit all.
I have to admit, implementing the Spring Batch DAOs with MongoDB was fun. Even the Spring Batch meta-data model, which was designed with relational storage in mind, persists nicely in MongoDB. Should I even mention that the code is cleaner compared to JDBC? Even on top of the JDBC template?
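To give a feel for why relational-style meta-data can still sit comfortably in a document store, here is a small sketch of one possible mapping. It is an assumption for illustration only, not necessarily the layout used in the implementation linked below: JobExecutionInfo is a made-up, minimal stand-in for job execution meta-data, the field names are invented, and plain maps stand in for database documents. The parameters and the execution context, which a relational schema would keep in separate tables, are simply nested inside the one document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class JobExecutionDocumentSketch {

    /** Hypothetical, minimal job execution meta-data (not the Spring Batch class). */
    static class JobExecutionInfo {
        final long id;
        final String jobName;
        final String status;
        final Map<String, Object> jobParameters;
        final Map<String, Object> executionContext;

        JobExecutionInfo(long id, String jobName, String status,
                         Map<String, Object> jobParameters,
                         Map<String, Object> executionContext) {
            this.id = id;
            this.jobName = jobName;
            this.status = status;
            this.jobParameters = jobParameters;
            this.executionContext = executionContext;
        }
    }

    /** Flattens the object into one nested document; no joins, no extra tables. */
    static Map<String, Object> toDocument(JobExecutionInfo execution) {
        Map<String, Object> document = new LinkedHashMap<>();
        document.put("_id", execution.id);
        document.put("jobName", execution.jobName);
        document.put("status", execution.status);
        document.put("jobParameters", new LinkedHashMap<>(execution.jobParameters));
        document.put("executionContext", new LinkedHashMap<>(execution.executionContext));
        return document;
    }

    /** Rebuilds the object from a document produced by toDocument. */
    @SuppressWarnings("unchecked")
    static JobExecutionInfo fromDocument(Map<String, Object> document) {
        return new JobExecutionInfo(
                (Long) document.get("_id"),
                (String) document.get("jobName"),
                (String) document.get("status"),
                (Map<String, Object>) document.get("jobParameters"),
                (Map<String, Object>) document.get("executionContext"));
    }

    public static void main(String[] args) {
        Map<String, Object> parameters = new LinkedHashMap<>();
        parameters.put("inputFile", "orders.txt");
        Map<String, Object> context = new LinkedHashMap<>();
        context.put("linesRead", 1200);

        JobExecutionInfo execution =
                new JobExecutionInfo(1L, "nightlyImport", "STARTED", parameters, context);
        System.out.println(toDocument(execution));
    }
}
```

With a real driver, the nested maps would become embedded documents, and restart data such as the execution context could be read back with a single lookup by id.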
Now go and grab the Spring Batch over MongoDB implementation and the reference configuration: http://github.com/jbaruch/springbatch-over-mongodb. I have used the samples and the tests from the original Spring Batch distribution, trying to make as few changes as necessary. You’ll need a MongoDB build for your platform and Gradle 0.9p1 to build and run. (Why Gradle? Because it is truly a better way to build.)
If you use MongoDB – enjoy the implementation as is. If you use some other document-oriented DB, the conversion should be straightforward. In any case, I’ll be glad to hear your feedback.