Our customers regularly ask for disaster recovery options in combination with our JTA/XA implementation.

While looking for background information, we realised that very little has been published on this topic, so we decided to study it ourselves and share our findings with you. To get the discussion started, this is the first part in a series on disaster recovery: it introduces the problem of disaster recovery in XA environments. Later parts will discuss a number of possible solutions with varying degrees of recoverability.

The context

Let's start by setting the context. The situation we have in mind is one where requests are queued for processing by the application. In simplified form, the application does the following:
  • it takes a request message off the queue (a 'command' in domain-driven design terms)
  • it processes the message, saving the results in the database
  • it optionally publishes an event notification message (a 'domain event' in domain-driven design terms)
  • it does all of the above within the context of one atomic transaction
We're assuming that all of the resources involved (message brokers and database) are XA-capable. We're also assuming that the transaction manager keeps transaction log files for recovery.
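
The steps listed above can be sketched as a toy model. This is a minimal, self-contained Java sketch under stated assumptions, not the JTA/XA API and not our implementation: `Participant`, `InMemoryResource`, `ToyCoordinator` and `AtomicFlowDemo` are hypothetical names that stand in for XA-capable resources and the transaction manager. It only illustrates the all-or-nothing property, and the fact that the coordinator records its decision in a log before telling the resources.

```java
import java.util.ArrayList;
import java.util.List;

class AtomicFlowDemo {
    // The three steps of the list above, completed as one toy transaction.
    static boolean handle(String request, InMemoryResource queue,
                          InMemoryResource database, InMemoryResource events,
                          ToyCoordinator tm) {
        queue.write("consumed " + request);
        database.write("result of " + request);
        events.write("event for " + request);
        return tm.complete("tx-" + request, List.of(queue, database, events));
    }

    public static void main(String[] args) {
        ToyCoordinator tm = new ToyCoordinator();
        InMemoryResource queue = new InMemoryResource(true);
        InMemoryResource events = new InMemoryResource(true);
        InMemoryResource failingDb = new InMemoryResource(false); // votes "no"

        boolean ok = handle("r1", queue, failingDb, events, tm);
        System.out.println("committed: " + ok);
        System.out.println("queue changes: " + queue.committed);
        System.out.println("log: " + tm.log);
    }
}

// Hypothetical stand-in for an XA-capable resource (broker or database).
interface Participant {
    boolean prepare();  // phase 1: vote on whether the pending work can commit
    void commit();      // phase 2: make the pending work visible
    void rollback();    // phase 2 alternative: discard the pending work
}

// Toy resource that buffers changes until the coordinator decides.
class InMemoryResource implements Participant {
    final List<String> committed = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();
    private final boolean voteYes;

    InMemoryResource(boolean voteYes) { this.voteYes = voteYes; }

    void write(String change) { pending.add(change); }

    @Override public boolean prepare() { return voteYes; }
    @Override public void commit() { committed.addAll(pending); pending.clear(); }
    @Override public void rollback() { pending.clear(); }
}

// Toy coordinator that logs its decision before telling the resources.
class ToyCoordinator {
    final List<String> log = new ArrayList<>();

    boolean complete(String txId, List<? extends Participant> participants) {
        boolean allPrepared = true;
        for (Participant p : participants) {
            if (!p.prepare()) { allPrepared = false; break; }
        }
        log.add(txId + (allPrepared ? ":COMMITTING" : ":ABORTING"));
        for (Participant p : participants) {
            if (allPrepared) p.commit(); else p.rollback();
        }
        return allPrepared;
    }
}
```

In this sketch a single "no" vote from the database leaves the request on the queue and publishes no event. A real transaction manager additionally has to survive crashes between the two phases, which is what the transaction log files are for.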

In this setting, disaster recovery typically involves an active/passive pair of datacenters that are kept more or less in sync in one way or another.

The problem

The problem is simple to state: given that the two datacenters must be kept in sync, how do we do this? The naive answer would be database replication, with some vendor-specific replication mechanism that pushes updates from the active database to the passive one. However, in our context this is not sufficient: the database is not the only component maintaining application state. There are also the message broker(s) and the transaction log file to take into account. With database replication alone, the passive site would have a (possibly lagging) copy of the database state, but neither the queued requests in the broker nor the state of ongoing transactions.
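
To make the gap concrete, here is a minimal Java sketch of what the passive site ends up with when only the database is replicated. `SiteState`, its fields and `replicateDatabaseOnly` are hypothetical names chosen for illustration; the model ignores replication lag and treats each component as a simple in-memory collection.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DatabaseOnlyReplicationDemo {
    public static void main(String[] args) {
        SiteState active = new SiteState();
        active.queuedRequests.add("request-2");
        active.queuedRequests.add("request-3");
        active.databaseRows.put("request-1", "processed");
        active.transactionLog.add("tx-2:COMMITTING");

        SiteState passive = SiteState.replicateDatabaseOnly(active);
        System.out.println("rows at passive site: " + passive.databaseRows.size());
        System.out.println("queued requests at passive site: " + passive.queuedRequests.size());
        System.out.println("log entries at passive site: " + passive.transactionLog.size());
    }
}

// Hypothetical model of the application state held at one site.
class SiteState {
    final Deque<String> queuedRequests = new ArrayDeque<>();   // message broker
    final Map<String, String> databaseRows = new HashMap<>();  // database
    final List<String> transactionLog = new ArrayList<>();     // transaction manager

    // Copies the database only, as database-only replication would.
    static SiteState replicateDatabaseOnly(SiteState active) {
        SiteState passive = new SiteState();
        passive.databaseRows.putAll(active.databaseRows);
        return passive;
    }
}
```

After a failover in this model, the passive site knows about the processed request but has no pending requests and no record of the transaction that was in the middle of committing.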

The ideal solution

Ideally, you would want to have everything replicated synchronously: the database, the broker, the transaction logs and the ongoing XA sessions in each resource. That way, the passive site would be a complete mirror of the active one.

The real world

Unfortunately, the real world is far from ideal and the ideal solution is hard to obtain: you would need a perfectly replicated vendor setup for the broker, the database and the file system. Moreover, the replication in all these systems would have to work in a 'lock-step' way, so that the combination of replicated transaction state at the passive site is 'consistent' with the distributed transactions happening at the active site. This puts even more constraints on the system. And this is where it gets really difficult to implement: even the most sophisticated replication systems we know of do not offer replication of ongoing XA sessions, which makes it unrealistic to assume that this is ever going to be possible (and if it were possible, it would surely be the most expensive system configuration you can think of).
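
One way to read the 'lock-step' requirement is sketched below in Java. This is an illustration under assumptions that are ours, not a description of any vendor's replication: transaction n removes request n from the queue and stores result n in the database, and each replica has independently applied some prefix of those transactions. `ReplicationLag` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class ReplicationLagDemo {
    public static void main(String[] args) {
        // Broker replica has applied 3 transactions, database replica 5.
        System.out.println("still queued but stored: "
                + ReplicationLag.stillQueuedButStored(3, 5));
        // Broker replica has applied 5 transactions, database replica 3.
        System.out.println("dequeued but not stored: "
                + ReplicationLag.dequeuedButNotStored(5, 3));
    }
}

// Hypothetical model of two replicas that lag independently of each other.
class ReplicationLag {
    // Requests still queued at the passive site although their result is stored.
    static List<Integer> stillQueuedButStored(int brokerApplied, int databaseApplied) {
        return range(brokerApplied + 1, databaseApplied);
    }

    // Requests gone from the passive queue although their result is missing.
    static List<Integer> dequeuedButNotStored(int brokerApplied, int databaseApplied) {
        return range(databaseApplied + 1, brokerApplied);
    }

    // Inclusive range; empty when from > to.
    private static List<Integer> range(int from, int to) {
        List<Integer> result = new ArrayList<>();
        for (int n = from; n <= to; n++) {
            result.add(n);
        }
        return result;
    }
}
```

Only when both replicas have applied exactly the same transactions are both lists empty. In the first case the passive site would process some requests a second time after failover; in the second case some requests would be lost.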

So here we are: we've outlined the problem! Stay tuned for the sequel, where we'll discuss a first solution.

