JBoss Cache

It’s been a busy week — much to do at the company and CAJ meetings on the Monday, Tuesday and Wednesday evenings. That’s why I had no time to do some blogging (among other things). But heeere comes the weekend!

Yesterday evening was occupied as well: I went to Stuttgart for a meeting of the local Java User Group. Its JBoss SIG (Special Interest Group) had organized a talk about JBoss Cache given by Bela Ban, project lead for JGroups and JBoss Cache.

Bela gave an interesting overview of how to replicate data across a cluster of application server instances using JBoss Cache in its two incarnations, the Tree Cache and the POJO Cache. The Tree Cache divides cacheable data into hierarchical nodes whose attributes can be replicated individually, which avoids replicating huge data sets for every small change. With the POJO Cache, once an object has been registered in the distributed cache, every subsequent change to that object gets replicated.
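
To illustrate the Tree Cache idea, here is a minimal sketch in plain Java. It does not use the actual JBoss Cache API: the class `ToyTreeCache`, its methods and the path strings are all made up for illustration, and the "cluster" is just a list of objects in the same JVM. It only shows the principle that a write ships a single (path, attribute, value) triple to the peers instead of the whole data set.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a tree-structured cache; NOT the JBoss Cache API.
public class ToyTreeCache {
    // Each node is addressed by a path such as "/users/42" and holds its own attributes.
    private final Map<String, Map<String, Object>> nodes = new HashMap<>();
    // Peers that receive a copy of every change (stand-in for cluster members).
    private final List<ToyTreeCache> peers = new ArrayList<>();
    // Counts how many single-attribute updates this instance has received.
    private int receivedUpdates = 0;

    public void addPeer(ToyTreeCache peer) {
        peers.add(peer);
    }

    // Local write: store the attribute, then send only (path, key, value) to the peers.
    public void put(String path, String key, Object value) {
        applyLocally(path, key, value);
        for (ToyTreeCache peer : peers) {
            peer.receive(path, key, value);
        }
    }

    public Object get(String path, String key) {
        Map<String, Object> attrs = nodes.get(path);
        return attrs == null ? null : attrs.get(key);
    }

    public int receivedUpdates() {
        return receivedUpdates;
    }

    private void receive(String path, String key, Object value) {
        receivedUpdates++;
        applyLocally(path, key, value);
    }

    private void applyLocally(String path, String key, Object value) {
        nodes.computeIfAbsent(path, p -> new HashMap<>()).put(key, value);
    }
}
```

Two writes to the same node result in two small updates on the peer, and nodes that were never touched are never transferred.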

Cache instances can be connected to form a distributed tree, much like an HTTP cache hierarchy can be built with Squid. When the replication of changes actually happens depends on whether the changes are made in a transaction context. If not, replication happens immediately. Inside a transaction, replication does not occur until the transaction is committed.
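
The difference between the two replication timings can be sketched in a few lines of plain Java. Again, this is a toy under stated assumptions and not JBoss Cache code: `ToyTxReplicator` and its methods are hypothetical, the "remote" side is just a second map, and rollback is left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of replication timing; NOT the JBoss Cache API.
public class ToyTxReplicator {
    private final Map<String, String> local = new HashMap<>();
    // Stand-in for the copy held by another cluster member.
    private final Map<String, String> remote = new HashMap<>();
    // Non-null while a transaction is open; collects changes to send at commit.
    private List<String[]> pending = null;

    public void begin() {
        pending = new ArrayList<>();
    }

    public void put(String key, String value) {
        local.put(key, value);
        if (pending == null) {
            // No transaction: replicate immediately.
            remote.put(key, value);
        } else {
            // Inside a transaction: remember the change, replicate later.
            pending.add(new String[] {key, value});
        }
    }

    public void commit() {
        if (pending == null) {
            throw new IllegalStateException("no transaction in progress");
        }
        // Replicate all buffered changes in their original order.
        for (String[] change : pending) {
            remote.put(change[0], change[1]);
        }
        pending = null;
    }

    public String remoteGet(String key) {
        return remote.get(key);
    }
}
```

A `put` outside a transaction is visible on the remote side right away, while a `put` between `begin()` and `commit()` only shows up there after the commit.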

Cache distribution can be extended by cache persistence where cache data is written to a filesystem or database. This provides the possibility of “swapping” data on a cache host or even between cache hosts.

Locking is a crucial point in distributed data storage, and JBoss Cache offers two opposite strategies: optimistic and pessimistic locking.
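
As a reminder of what the two strategies mean in general, here is a small sketch in plain Java. It says nothing about how JBoss Cache implements them internally; the classes `PessimisticCell` and `OptimisticCell` are hypothetical and work on a single value inside one JVM. The pessimistic variant takes a lock before touching the data, the optimistic variant lets everybody work freely and checks a version number when the change is written back.

```java
import java.util.concurrent.locks.ReentrantLock;

// Generic illustration of the two locking strategies; NOT JBoss Cache internals.
public class ToyLocking {

    // Pessimistic: acquire the lock first, so conflicting writers have to wait.
    public static class PessimisticCell {
        private final ReentrantLock lock = new ReentrantLock();
        private int value;

        public void add(int delta) {
            lock.lock();
            try {
                value += delta;
            } finally {
                lock.unlock();
            }
        }

        public int get() {
            lock.lock();
            try {
                return value;
            } finally {
                lock.unlock();
            }
        }
    }

    // Optimistic: remember the version you read, and only write if it is unchanged.
    public static class OptimisticCell {
        private int value;
        private long version;

        public synchronized long version() {
            return version;
        }

        public synchronized int get() {
            return value;
        }

        // Returns false if somebody else committed since expectedVersion was read.
        public synchronized boolean commit(long expectedVersion, int newValue) {
            if (version != expectedVersion) {
                return false;
            }
            value = newValue;
            version++;
            return true;
        }
    }
}
```

With the optimistic cell, a writer holding a stale version gets `false` back and has to re-read and retry; with the pessimistic cell, that conflict cannot happen because the second writer is blocked until the first has finished.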

When Bela showed a diagram depicting that JBoss HTTP sessions are based on JBoss Cache as well, I first concluded that this would allow using a simple load balancer that distributes HTTP requests randomly between JBoss instances. JBoss Cache should make sure that every instance can handle every current HTTP session, after all. But Bela pointed out that each HTTP session should be sticky to one host, because the cache data isn’t evenly distributed but gravitates to where it’s used the most.
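
What "sticky" means for the load balancer can be sketched like this. The example is an assumption on my part and not something from the talk: real load balancers usually pin a session via a cookie or a route suffix in the session ID, whereas this toy `StickyRouter` (a made-up class, with made-up host names in the usage) simply hashes the session ID to pick a host. The point is only that the same session always lands on the same instance.

```java
import java.util.ArrayList;
import java.util.List;

// Toy sticky routing by hashing the session ID; a stand-in for real load balancer logic.
public class StickyRouter {
    private final List<String> hosts;

    public StickyRouter(List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        // Defensive copy so later changes to the caller's list do not affect routing.
        this.hosts = new ArrayList<>(hosts);
    }

    // The same session ID always maps to the same host as long as the host list is unchanged.
    public String hostFor(String sessionId) {
        int index = Math.floorMod(sessionId.hashCode(), hosts.size());
        return hosts.get(index);
    }
}
```

With a router built from, say, three host names, calling `hostFor` repeatedly with one session ID returns the same host every time, so the session data cached there keeps being used where it already lives.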

It was an interesting talk supplemented by a small live demonstration. Bela certainly knows what he’s talking about. It seems to me that JBoss Cache is a well thought-out solution to distributed data storage.

Since Bela had already given this talk as a keynote at TheServerSide Java Symposium Europe, his slides (in PDF) are available for download on the conference website.

This SIG meeting was an evening well spent and I was even given a JBoss backpack for taking part in suggesting presentation topics for future meetings. Meetings some of which I’ll attend, I’m sure.