You can now use Hazelcast 4 with our JTA/XA transactions, via an additional module "transactions-hazelcast4". This was required due to breaking API changes that were introduced in Hazelcast 4.
When using the prior Hazelcast integration (made for Hazelcast 3, not 4) you would get the following exception when trying to configure a JTA/XA enabled HazelcastInstance:
```
java.lang.UnsupportedOperationException: Client config object only supports adding new data structure configurations
    at com.hazelcast.client.impl.clientside.ClientDynamicClusterConfig.getLicenseKey(ClientDynamicClusterConfig.java:897)
    at com.hazelcast.config.ConfigXmlGenerator.generate(ConfigXmlGenerator.java:129)
    at com.atomikos.hazelcast.HazelcastTransactionalResource.<init>(HazelcastTransactionalResource.java:23)
    at com.atomikos.hazelcast.AtomikosHazelcastInstance.<init>(AtomikosHazelcastInstance.java:31)
    at com.atomikos.hazelcast.AtomikosHazelcastInstanceFactory.createAtomikosInstance(AtomikosHazelcastInstanceFactory.java:17)
    at ...
```
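To pick up the new module, add it to your build. Assuming it follows the same Maven coordinates (groupId `com.atomikos`) as the other transactions modules; the version property is a placeholder:

```xml
<dependency>
    <groupId>com.atomikos</groupId>
    <artifactId>transactions-hazelcast4</artifactId>
    <version>${atomikos.version}</version>
</dependency>
```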
Severity: | 4 |
---|---|
Affected version(s): | 5.0.104 |
When setting recycleActiveConnectionsInTransaction=true, you can now reuse connections more flexibly.
Consider the following use case with recycleActiveConnectionsInTransaction enabled:
With recycleActiveConnectionsInTransaction=true, c1 will be the same connection instance as c2.
So after method bar() closes c2, c1 is also closed, which caused errors like this in step 4 of method foo():
```
The underlying XA session is closed
    at com.atomikos.jdbc.internal.AtomikosSQLException.throwAtomikosSQLException(AtomikosSQLException.java:29)
    at com.atomikos.jdbc.internal.AtomikosJdbcConnectionProxy.enlist(AtomikosJdbcConnectionProxy.java:108)
    at com.atomikos.jdbc.internal.AtomikosJdbcConnectionProxy.updateTransactionContext(AtomikosJdbcConnectionProxy.java:61)
    at com.atomikos.jdbc.internal.AbstractJdbcConnectionProxy.prepareStatement(AbstractJdbcConnectionProxy.java:64)
    at sun.reflect.GeneratedMethodAccessor228.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.atomikos.util.DynamicProxySupport.callProxiedMethod(DynamicProxySupport.java:162)
    at com.atomikos.util.DynamicProxySupport.invoke(DynamicProxySupport.java:116)
    at com.sun.proxy.$Proxy801.prepareStatement(Unknown Source)
```
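The recycling semantics can be illustrated with a deliberately simplified, hypothetical pool (not the Atomikos implementation): within one transaction, nested calls to getConnection() hand back the same enlisted connection instance, which is why closing c2 in bar() used to close c1 in foo():

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical mini-pool sketching the recycling semantics described above.
class SketchPool {
    private final boolean recycleActiveConnectionsInTransaction;
    private Object activeConnection; // connection enlisted in the current transaction
    private final Deque<Object> idle = new ArrayDeque<>();

    SketchPool(boolean recycle) {
        this.recycleActiveConnectionsInTransaction = recycle;
    }

    Object getConnection() {
        if (recycleActiveConnectionsInTransaction && activeConnection != null) {
            // Reuse the connection already enlisted in this transaction:
            // c2 is the same instance as c1.
            return activeConnection;
        }
        // Otherwise hand out a distinct (pooled or new) physical connection.
        Object c = idle.isEmpty() ? new Object() : idle.pop();
        activeConnection = c;
        return c;
    }
}
```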
None.
You can now allow recycling of active JDBC/XA pooled connections within the same transaction, before they are "closed" by the application. This means that certain deadlock scenarios can be avoided.
Imagine the following use case:
Before this feature, step 4 would return a different physical connection from the pool. This would trigger a new XA branch, with unspecified isolation (locking) behaviour with respect to any updates performed via the connection in step 2. This could even cause deadlocks.
Therefore, people have asked us to allow step 4 to reuse the same connection, c1. You can now enable this new behaviour by calling setRecycleActiveConnectionsInTransaction(true) on the AtomikosDataSourceBean.
A new, optional setter on our datasource. The default is false for backward compatibility.
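In code, enabling the feature might look like this; the configuration around the new setter is illustrative only:

```java
import com.atomikos.jdbc.AtomikosDataSourceBean;

AtomikosDataSourceBean ds = new AtomikosDataSourceBean();
ds.setUniqueResourceName("myDataSource"); // illustrative name
ds.setRecycleActiveConnectionsInTransaction(true); // new setter, defaults to false
```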
Severity: | 2 |
---|---|
Affected version(s): | 5.0.x, 4.0.x |
You now no longer get "Log corrupted - restart JVM" exceptions after you interrupt a thread that is writing to the transaction log file, or after any other exception that makes a log checkpoint fail.
Previously, such failures were not handled correctly by the com.atomikos.recovery.fs.CachedRepository class, leaving the instance in an invalid state:
```
2021-03-01 16:15:56.662 ERROR 41669 --- [pool-1-thread-1] c.a.recovery.fs.FileSystemRepository : Failed to write checkpoint
java.nio.channels.ClosedByInterruptException: null
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) ~[na:1.8.0_192]
    at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:392) ~[na:1.8.0_192]
    at com.atomikos.recovery.fs.FileSystemRepository.writeCheckpoint(FileSystemRepository.java:196) ~[transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.recovery.fs.CachedRepository.performCheckpoint(CachedRepository.java:84) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.recovery.fs.CachedRepository.put(CachedRepository.java:77) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.recovery.fs.OltpLogImp.write(OltpLogImp.java:46) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.persistence.imp.StateRecoveryManagerImp.preEnter(StateRecoveryManagerImp.java:51) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.finitestates.FSMImp.notifyListeners(FSMImp.java:164) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.finitestates.FSMImp.setState(FSMImp.java:251) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.icatch.imp.CoordinatorImp.setState(CoordinatorImp.java:284) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.icatch.imp.CoordinatorStateHandler.commitFromWithinCallback(CoordinatorStateHandler.java:346) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.icatch.imp.ActiveStateHandler$6.doCommit(ActiveStateHandler.java:273) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.icatch.imp.CoordinatorStateHandler.commitWithAfterCompletionNotification(CoordinatorStateHandler.java:587) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.icatch.imp.ActiveStateHandler.commit(ActiveStateHandler.java:268) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.icatch.imp.CoordinatorImp.commit(CoordinatorImp.java:550) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.icatch.imp.CoordinatorImp.terminate(CoordinatorImp.java:682) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.icatch.imp.CompositeTransactionImp.commit(CompositeTransactionImp.java:279) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.icatch.jta.TransactionImp.commit(TransactionImp.java:168) [transactions-jta-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.icatch.jta.TransactionManagerImp.commit(TransactionManagerImp.java:428) [transactions-jta-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.icatch.jta.UserTransactionManager.commit(UserTransactionManager.java:160) [transactions-jta-5.0.9-SNAPSHOT.jar:na]
    at org.springframework.transaction.jta.JtaTransactionManager.doCommit(JtaTransactionManager.java:1035) [spring-tx-5.2.5.RELEASE.jar:5.2.5.RELEASE]
    at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:743) [spring-tx-5.2.5.RELEASE.jar:5.2.5.RELEASE]
    at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:711) [spring-tx-5.2.5.RELEASE.jar:5.2.5.RELEASE]
    at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:152) [spring-tx-5.2.5.RELEASE.jar:5.2.5.RELEASE]
    at com.example.atomikos.AtomikosApplicationTests.lambda$4(AtomikosApplicationTests.java:78) [test-classes/:na]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_192]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_192]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_192]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_192]
```
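The root ClosedByInterruptException is easy to reproduce outside Atomikos. The following self-contained sketch (not Atomikos code) interrupts a thread engaged in FileChannel I/O, which closes the channel and raises the same exception seen in the checkpoint code above:

```java
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class InterruptDemo {

    // Returns the simple name of the exception raised in the writer thread.
    static String provoke() throws Exception {
        Path tmp = Files.createTempFile("txlog", ".bin");
        final String[] result = new String[1];
        Thread writer = new Thread(() -> {
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                ByteBuffer buf = ByteBuffer.allocate(4096);
                while (true) {      // keep writing until interrupted
                    buf.rewind();
                    ch.write(buf);
                    ch.force(true); // same kind of call as the checkpoint write
                }
            } catch (ClosedByInterruptException e) {
                result[0] = "ClosedByInterruptException";
            } catch (Exception e) {
                result[0] = e.getClass().getSimpleName();
            }
        });
        writer.start();
        Thread.sleep(100);
        writer.interrupt(); // interrupting a thread doing interruptible-channel I/O closes the channel
        writer.join();
        Files.deleteIfExists(tmp);
        return result[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(provoke());
    }
}
```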
Later requests trying to read from the transaction logs would get systematic corruption errors like this:
```
com.atomikos.recovery.LogReadException: Log corrupted - restart JVM
    at com.atomikos.recovery.fs.CachedRepository.assertNotCorrupted(CachedRepository.java:137) ~[transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.recovery.fs.CachedRepository.findAllCommittingCoordinatorLogEntries(CachedRepository.java:145) ~[transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.recovery.fs.RecoveryLogImp.getExpiredPendingCommittingTransactionRecordsAt(RecoveryLogImp.java:52) ~[transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.icatch.imp.RecoveryDomainService.performRecovery(RecoveryDomainService.java:76) ~[transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.icatch.imp.RecoveryDomainService$1.alarm(RecoveryDomainService.java:55) [transactions-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.timing.PooledAlarmTimer.notifyListeners(PooledAlarmTimer.java:101) [atomikos-util-5.0.9-SNAPSHOT.jar:na]
    at com.atomikos.timing.PooledAlarmTimer.run(PooledAlarmTimer.java:88) [atomikos-util-5.0.9-SNAPSHOT.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_192]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_192]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_192]
```
This has now been fixed.
None.
Severity: | 3 |
---|---|
Affected version(s): | 5.0.x |
getActive() in the DataSourceBeanMetadata classes of module transactions-springboot2 now no longer returns the total number of open connections, but rather the number of connections currently in use by the application.
Due to a misunderstanding of Spring Boot's semantics, this method returned the wrong result: the total number of open connections in the pool, rather than the number of connections being used. This has now been fixed.
None.
Severity: | 3 |
---|---|
Affected version(s): | 5.0.x |
You now get correct DataSourcePoolMetadata in Spring Boot, even if one of our datasources is used in wrapped or proxied mode in your Spring Boot runtime.
We used to return metadata in the following style:
```java
if (dataSource instanceof AtomikosDataSourceBean) {
    return new AtomikosDataSourceBeanMetadata((AtomikosDataSourceBean) dataSource);
}
```

(and similar for our AtomikosNonXADataSourceBean class)
This would not work if the dataSource presented is wrapped or proxied, so we now use the built-in Spring Boot DataSourceUnwrapper.unwrap to handle those cases.
None.
Severity: | 2 |
---|---|
Affected version(s): | 5.0.x, 4.0.x, 3.9.x |
The XA implementation of PostgreSQL ignores the transaction timeout, which means that you may have long-lived orphaned SQL sessions in your database server.
The following workarounds are available:
Set the queryTimeout on your JDBC Statement objects, or try setting a server-level timeout like this:

```sql
SET SESSION idle_in_transaction_session_timeout = '5min';
```
If you have any other solution then please let us know - thanks!
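For the statement-level workaround, the standard JDBC call looks like this; the helper method and its parameters are illustrative, assuming you already have a Connection from our datasource:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class QueryTimeoutSketch {
    // Hypothetical helper: bounds how long a single statement may run,
    // so a hung XA branch cannot hold PostgreSQL locks indefinitely.
    static void executeWithTimeout(Connection c, String sql, int timeoutSeconds) throws SQLException {
        try (Statement s = c.createStatement()) {
            s.setQueryTimeout(timeoutSeconds); // standard JDBC; supported by the PostgreSQL driver
            s.execute(sql);
        }
    }
}
```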
You can now choose to disable retrying commit or rollback for heuristic hazard transactions.
Heuristic hazard transactions can arise from network connectivity issues during the commit phase: if a resource receives a prepare request and subsequently becomes unreachable during commit or rollback, the transaction goes into "heuristic hazard" mode. This essentially means that commit will be retried a number of times, even if com.atomikos.icatch.oltp_max_retries is set to zero. The rationale: it is better to terminate pending in-doubt transactions sooner rather than later, because of the locks they may still be holding.
If you don't want this behaviour then you can now disable this, and rely on the recovery process in the background to take care of it (which also works, but will happen only periodically). To disable, just set this new property to false:
```
com.atomikos.icatch.retry_on_heuristic_hazard=false
```
A new startup property that can optionally be set. If not present, it will default to true to preserve compatibility with existing behaviour.
You can now explicitly trigger recovery in your application, via our API.
```java
import com.atomikos.icatch.RecoveryService;
import com.atomikos.icatch.config.Configuration;

boolean lax = true; // false to force recovery, true to allow intelligent mode
RecoveryService rs = Configuration.getRecoveryService();
rs.performRecovery(lax);
```
In order for this to work, make sure to set (in jta.properties):
```
# set to Long.MAX_VALUE so background recovery is disabled
com.atomikos.icatch.recovery_delay=9223372036854775807L
```
We have added methods on an existing API interface, which does not break existing clients.
Severity: | 2 |
---|---|
Affected version(s): | 5.0.x |
We no longer call XAResource.recover() when failures happen during regular commit or rollback, so the overhead for the backend is reduced.
For historical reasons we used to call the XA recovery routine on the backend whenever commit or rollback failed. The most common cause of such failures is network glitches, meaning that big clusters with a short network problem would suddenly hit the backends with recovery for all active transactions. Since recovery can be an expensive operation, this resulted in needless load on the backends.
The rationale behind this was to avoid needless commit retries (based on the value of com.atomikos.icatch.oltp_max_retries), but the overhead does not justify the possible benefit.
From now on we no longer do this, since it is either the recovery process (in the background) or the application (via our API) that controls when recovery happens.
Worst case, this can lead to needless commit retries, in which case the backend should respond with error code XAER_NOTA and our code will handle this gracefully. However, we have historical records where some older version of ActiveMQ did not behave like this. This would result in errors in the ActiveMQ log files, in turn leading to alerts for the operations team.
If you experience issues with this, then it suffices to set com.atomikos.icatch.oltp_max_retries to zero. That will disable regular commit retries and delegate to the recovery background process.
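In jta.properties that would be:

```
# disable regular commit/rollback retries; background recovery terminates in-doubt transactions instead
com.atomikos.icatch.oltp_max_retries=0
```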
Severity: | 2 |
---|---|
Affected version(s): | 5.0.x |
For releases 5.0 or higher, the maximum timeout should not be set to 0 or recovery will interfere with regular application-level commits.
The 5.0 release has a new recovery workflow that is incompatible with com.atomikos.icatch.max_timeout being zero. That is because recovery depends on the maximum timeout to perform rollback of pending (orphaned) prepared transactions in the backends. If the maximum timeout is zero then recovery (in the background) will rollback prepared transactions that are concurrently being committed in your application. This will result in heuristic exceptions and inconsistent transaction outcomes.
Keep in mind that the maximum timeout is also indicative of the maximum lock duration in your databases, so choose it wisely! If you are (or were) depending on an unlimited maximum timeout, then you are also allowing unlimited lock times.
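For example, in jta.properties (the value is in milliseconds; five minutes is an illustrative choice, not a recommendation):

```
com.atomikos.icatch.max_timeout=300000
```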