samedi, 14 novembre 2009

Clustering Jackrabbit 1.5.x with DataStore configuration

Jackrabbit is the reference implementation of JSR-170 : JCR, Java Content Repository.

One day we felt the needs to run a Jackrabbit repository in a clustered environment for reliability. So a JCR runs on two separated servers, backed by the an Oracle db and a NFS shared datastore.


It exists some difficulties to run such architecture, mainly because both JCR nodes (JCR1 and JCR2) do not know each others. So if we insert some content on one node, it takes some time before we can read it on the other node.

In Jackrabbit 1.5.x, Session.refresh do not work correctly, so we need to do some trick to enable cluster (synchronization) features.

First of all, we add a custom ServletFilter which handle the GET (read) request and transform it on a HEAD one (test if the content exists) using apache.commons.HttpClient.
If the return of the query says the content exists, then we run the normal request. If the content do not exists, we way for some arbitrary time and relaunch the HEAD request.

This filter will let enough time to the node to synchronize itself, and then perform the query in the right way.

Other possibilities like HEAD on the current node and then HEAD on a different node should also be possible, but this will require every nodes to know about every other nodes. No the best idea.

One another tricky point is to store all the content to the configured datastore. Be carefull to not rely on a JNDI persistence manager, but be sure to use a subclass of a bundle persistence manager, otherwise you will have some surprise with the size of the database.

Aucun commentaire: