This is our review of an excellent article published at InfoQ by some of our colleagues at Red Hat. See the references section at the end for the original publication, but this overview aims to give you all of the insights and more in just a few lines of text. Where applicable the content has been complemented with lessons learned from our own experience…
The bad guys are trying to get into your projects. What can you do to avoid pulling in bad code?
We now offer remoting support for gRPC so your transactions can span gRPC calls.
Sometimes you may want commit or rollback to extend across one or more outgoing gRPC calls. This is now possible.
See the example module examples-jta-grpc in the download for a working sample.
New interceptors for your gRPC architecture are included.
Generously donated by Pivotal's Spring Boot team, we were able to add the Spring Boot starter module to our code base.
Instead of adding Spring's starter module, you should now add "our" starter module to your Spring Boot project's pom.
In particular, instead of specifying:
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-jta-atomikos</artifactId>
</dependency>
You should now specify:
<dependency>
  <groupId>com.atomikos</groupId>
  <artifactId>transactions-spring-boot-starter</artifactId>
  <version>5.0.9</version> <!-- or any later atomikos release that contains our starter module -->
</dependency>
Additional details are here...
No real API changes, only pom dependency changes.
You can now explicitly trigger recovery in your application, via our API.
import com.atomikos.icatch.RecoveryService;
import com.atomikos.icatch.config.Configuration;

boolean lax = true; // false to force recovery, true to allow intelligent mode
RecoveryService rs = Configuration.getRecoveryService();
rs.performRecovery(lax);
In order for this to work, make sure to set (in jta.properties):
# set to Long.MAX_VALUE so background recovery is disabled
com.atomikos.icatch.recovery_delay=9223372036854775807L
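With background recovery disabled, the application itself has to trigger recovery often enough. A minimal sketch of one way to do that on a fixed schedule follows. The class ScheduledRecoveryTrigger and its RecoveryAction interface are hypothetical names used only for illustration; in a real application the action would wrap the performRecovery call on the RecoveryService.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ScheduledRecoveryTrigger {

    /** Hypothetical stand-in for a call such as rs.performRecovery(lax). */
    public interface RecoveryAction {
        void perform(boolean lax);
    }

    private final RecoveryAction action;
    private final boolean lax;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "recovery-trigger");
                t.setDaemon(true); // do not keep the JVM alive just for recovery
                return t;
            });

    public ScheduledRecoveryTrigger(RecoveryAction action, boolean lax) {
        this.action = action;
        this.lax = lax;
    }

    /** Runs one recovery pass on the calling thread. */
    public void triggerNow() {
        action.perform(lax);
    }

    /** Schedules repeated recovery passes with a fixed delay between them. */
    public void start(long periodSeconds) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                triggerNow();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so report and continue.
                System.err.println("Recovery pass failed: " + e);
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    public void stop() {
        scheduler.shutdownNow();
    }
}
```

Usage would look like `new ScheduledRecoveryTrigger(lax -> rs.performRecovery(lax), true).start(60)`, where the 60-second period is an arbitrary example value.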
We have added methods on an existing API interface, which does not break existing clients.
| Severity: | 2 |
|---|---|
| Affected version(s): | 5.0.x |
Timed out transactions (in particular heuristic hazard cases after connection issues) used to stay around in the JVM and kept on generating timeout events. This has now been fixed.
Abandoned transaction instances now also stop any pending threads that generate timeout warnings. This prevents timed out transactions from generating endless warnings that are no longer relevant.
None.
| Severity: | 2 |
|---|---|
| Affected version(s): | 5.0.x |
We optimised connection pool efficiency for non-JTA/XA aware use cases, so connections are reused more efficiently when waiting for busy connections.
Previously, waiting connection requests were not notified immediately when a busy connection was being closed (i.e., marked for reuse) by the application. This has now been fixed.
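The idea of waking up waiting requests as soon as a connection is returned can be sketched as follows. This is an illustration of the general technique with standard Java monitors, not the actual Atomikos pool code; the class name NotifyingPool is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.concurrent.TimeUnit;

public class NotifyingPool<T> {

    private final Deque<T> idle = new ArrayDeque<>();

    public NotifyingPool(Collection<T> connections) {
        idle.addAll(connections);
    }

    /** Returns an idle connection, or null if none became available within the timeout. */
    public synchronized T borrow(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (idle.isEmpty()) {
            long remainingMillis = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remainingMillis <= 0) {
                return null; // timed out while all connections were busy
            }
            try {
                wait(remainingMillis); // woken as soon as release() is called
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return idle.removeFirst();
    }

    /** Marks a connection for reuse and immediately notifies waiting borrowers. */
    public synchronized void release(T connection) {
        idle.addLast(connection);
        notifyAll();
    }
}
```

Without the notifyAll call in release, a waiting borrower would only notice the freed connection after its own wait expired, which is the kind of delay described above.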
None.
| Severity: | 2 |
|---|---|
| Affected version(s): | 5.0 |
In some rare cases the XAResource used for committing a transaction after prepare may become null (presumably due to connection errors, unconfirmed though). Instead of logging an error (like we used to), we now trust background-level recovery to handle this - for which it was designed in the first place. This should avoid the repeated errors logged in such a case.
This "bug" would lead to millisecond-level repeated errors similar to the following:
19/11/2020 15:20:43.467 [Atomikos:3321] ERROR com.atomikos.icatch.imp.CommitMessage - Unexpected error in commit
com.atomikos.icatch.HeurHazardException: XAResourceTransaction: 31302E3235342E3134362E3131302E746D313539353531353037383735393139373038:31302E3235342E3134362E3131302E746D373030333533: no XAResource to commit?
at com.atomikos.datasource.xa.XAResourceTransaction.commit(XAResourceTransaction.java:529)
at com.atomikos.icatch.imp.CommitMessage.send(CommitMessage.java:52)
at com.atomikos.icatch.imp.CommitMessage.send(CommitMessage.java:23)
at com.atomikos.icatch.imp.PropagationMessage.submit(PropagationMessage.java:67)
at com.atomikos.icatch.imp.Propagator$PropagatorThread.run(Propagator.java:63)
at com.atomikos.icatch.imp.Propagator.submitPropagationMessage(Propagator.java:42)
at com.atomikos.icatch.imp.HeurHazardStateHandler.onTimeout(HeurHazardStateHandler.java:71)
at com.atomikos.icatch.imp.CoordinatorImp.alarm(CoordinatorImp.java:650)
at com.atomikos.timing.PooledAlarmTimer.notifyListeners(PooledAlarmTimer.java:95)
at com.atomikos.timing.PooledAlarmTimer.run(PooledAlarmTimer.java:82)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
19/11/2020 15:20:53.468 [Atomikos:3321] ERROR com.atomikos.datasource.xa.XAResourceTransaction - XAResourceTransaction: 31302E3235342E3134362E3131302E746D313539353531353037383735393139373038:31302E3235342E3134362E3131302E746D373030333533: no XAResource to commit?
Now we no longer log these as errors, since we designed recovery to take care of exactly those kinds of exceptions.
None.
| Severity: | 4 |
|---|---|
| Affected version(s): | 5.0.x |
Fixed a bug that would happen in certain class loading environments and prevented CallableStatements from being created.
In those class loading environments, creating a CallableStatement would fail with errors like this:
java.lang.ClassCastException: com.sun.proxy.$Proxy364 cannot be cast to java.sql.CallableStatement
at com.atomikos.jdbc.internal.AbstractJdbcConnectionProxy.prepareCall(AbstractJdbcConnectionProxy.java:73)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.atomikos.util.DynamicProxySupport.callProxiedMethod(DynamicProxySupport.java:162)
at com.atomikos.util.DynamicProxySupport.invoke(DynamicProxySupport.java:116)
at com.sun.proxy.$Proxy64.prepareCall(Unknown Source)
None.
| Severity: | 2 |
|---|---|
| Affected version(s): | 5.0.x |
Remoting participants now also respect commit ordering so you can avoid issues with JMS notifications going out before the backend databases are up to date.
Consider the following scenario for your microservice:
1. receive a JMS message
2. do a remote call to a different microservice that in turn updates a DB
3. send out a JMS "ticket" message
Previously, the JMS message (3) would be committed first, while the DB of the remote call in (2) still needed to commit. Recipients of (3) would sometimes not see the updates in the DB when they received the message.
This has now been fixed, by incorporating the remote participant of step 2 in the commit order.
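The effect of such a commit order can be illustrated with a small sketch. This is not the Atomikos implementation: the class CommitOrderSketch, the Kind enum and the rule "messaging participants commit last" are assumptions made only to show why including the remote participant in the ordering matters.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class CommitOrderSketch {

    /** Hypothetical participant categories for this illustration. */
    public enum Kind { DATABASE, REMOTE, JMS }

    public static final class Participant {
        final String name;
        final Kind kind;

        public Participant(String name, Kind kind) {
            this.name = name;
            this.kind = kind;
        }
    }

    /** Returns participant names in commit order: JMS participants go last. */
    public static List<String> commitOrder(List<Participant> enlisted) {
        List<Participant> sorted = new ArrayList<>(enlisted);
        // List.sort is stable, so participants of equal rank keep their enlistment order.
        sorted.sort(Comparator.comparingInt((Participant p) -> p.kind == Kind.JMS ? 1 : 0));
        List<String> names = new ArrayList<>();
        for (Participant p : sorted) {
            names.add(p.name);
        }
        return names;
    }
}
```

In the scenario above, the remote call of step 2 is a REMOTE participant, so it commits before the JMS participants of steps 1 and 3 even though the incoming JMS message was enlisted first.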
None.
| Severity: | 4 |
|---|---|
| Affected version(s): | 5.0.x |
We now check for null when adding class loaders to create dynamic proxies.
At least one user has reported null issues when creating a dynamic proxy, like this:
java.lang.NullPointerException: null
at java.util.ArrayDeque.addLast(ArrayDeque.java:249) ~[?:1.8.0_251]
at java.util.ArrayDeque.add(ArrayDeque.java:423) ~[?:1.8.0_251]
at com.atomikos.util.DynamicProxySupport.getClassLoadersToTry(DynamicProxySupport.java:194) ~[?:?]
at com.atomikos.util.DynamicProxySupport.createDynamicProxy(DynamicProxySupport.java:189) ~[?:?]
at com.atomikos.jdbc.internal.AtomikosXAPooledConnection.doCreateConnectionProxy(AtomikosXAPooledConnection.java:119) ~[?:?]
at com.atomikos.jdbc.internal.AtomikosXAPooledConnection.doCreateConnectionProxy(AtomikosXAPooledConnection.java:31) ~[?:?]
at com.atomikos.datasource.pool.AbstractXPooledConnection.createConnectionProxy(AbstractXPooledConnection.java:86) ~[?:?]
at com.atomikos.datasource.pool.ConnectionPoolWithConcurrentValidation.concurrentlyTryToUse(ConnectionPoolWithConcurrentValidation.java:61) ~[?:?]
at com.atomikos.datasource.pool.ConnectionPoolWithConcurrentValidation.retrieveFirstAvailableConnection(ConnectionPoolWithConcurrentValidation.java:43) ~[?:?]
at com.atomikos.datasource.pool.ConnectionPool.retrieveFirstAvailableConnectionAndGrowPoolIfNecessary(ConnectionPool.java:140) ~[?:?]
at com.atomikos.datasource.pool.ConnectionPool.findOrWaitForAnAvailableConnection(ConnectionPool.java:128) ~[?:?]
at com.atomikos.datasource.pool.ConnectionPool.borrowConnection(ConnectionPool.java:119) ~[?:?]
at com.atomikos.jdbc.internal.AbstractDataSourceBean.getConnection(AbstractDataSourceBean.java:371) ~[?:?]
We have fixed this by checking that a class loader is not null before attempting to use it.
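The stack trace shows why a null class loader is fatal here: java.util.ArrayDeque does not permit null elements, and a class's getClassLoader() may legitimately return null (for classes loaded by the bootstrap loader). A minimal sketch of the guard, with the hypothetical class name ClassLoaderCandidates, could look like this:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ClassLoaderCandidates {

    /** Collects the class loaders to try, skipping null entries. */
    public static Deque<ClassLoader> collect(ClassLoader... candidates) {
        Deque<ClassLoader> result = new ArrayDeque<>();
        for (ClassLoader candidate : candidates) {
            // ArrayDeque.add(null) throws NullPointerException, so filter first.
            if (candidate != null) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

For example, `collect(String.class.getClassLoader(), ClassLoader.getSystemClassLoader())` yields a single entry, because the loader of String is reported as null.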
None.
| Severity: | 2 |
|---|---|
| Affected version(s): | 5.0.x |
We used to pass all remote participants as direct participants, meaning that two-phase commit would sometimes be repeated and fail. This would particularly be the case in "diamond" calls - or also in deeper call hierarchies of more than one level down.
Consider a remoting client service A calling another remoting service B.
For remoting we distinguish between direct participants (i.e., endpoints at B to be included for two-phase commit at A) and indirect participants (i.e., URIs added as metadata only for checking orphaned calls). The class DefaultImportingTransactionManager in module transactions-remoting used to pass all its remote participants as direct participants, i.e., participants to be included in the two-phase commit set. For diamond-shaped calls in particular (A calls B and C, and both B and C in turn call D), this would lead to repeated two-phase commit calls to D, because A would incorrectly receive D as a direct participant via the call hierarchies of both B and C.
We fixed this: in this specific case, A now receives only 2 direct participants, B and C.
This fix also resolves an issue with a simpler call stack: A → B → C, where C would also be called for two-phase commit by both A and B. This is no longer the case.
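The difference between the old and the fixed behaviour can be shown on a toy call graph. The sketch below is only an illustration: the class DiamondParticipants and the map-based call graph are hypothetical and do not reflect the internals of DefaultImportingTransactionManager.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class DiamondParticipants {

    /** Old behaviour as described above: every participant down the hierarchy is passed up. */
    public static List<String> flattened(Map<String, List<String>> calls, String service) {
        List<String> result = new ArrayList<>();
        for (String callee : calls.getOrDefault(service, Collections.emptyList())) {
            result.add(callee);
            result.addAll(flattened(calls, callee));
        }
        return result;
    }

    /** Fixed behaviour: only services called directly join the caller's two-phase commit. */
    public static List<String> direct(Map<String, List<String>> calls, String service) {
        return new ArrayList<>(calls.getOrDefault(service, Collections.emptyList()));
    }
}
```

For the diamond (A calls B and C, both call D), flattened returns B, D, C, D for A, so D would be asked to commit twice by A, while direct returns only B and C. Each of B and C remains responsible for its own direct participant D.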
None.
| Severity: | 3 |
|---|---|
| Affected version(s): | 4.0.x, 5.0.x |
Server-side (Tomcat) integration created threads for individual transactions (at web app level) with the class loader of the web application. This would cause (Tomcat) warnings when the web application is stopped because those threads will hang around in the thread pool.
Solution: we now start new threads with the server-level class loader.
If you have the transaction core at the server level (in the server's classpath) then individual transactions started in the web application will use the server-level thread pool of Atomikos (named com.atomikos.thread.TaskManager).
This would typically give warnings (in Tomcat) like this when the web application is being shut down:
WARNING [http-nio-3030-exec-6] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [mywebapp] appears to have started a thread named [Atomikos:2] but has failed to stop it. This is very likely to create a memory leak.
That's because the server-side thread pool threads used to be created with the web application's class loader.
This also happened for our built-in Tomcat integration (since it is configured at the server-level).
We now create those threads with the thread pool's classloader instead. When configured at the server level, this will no longer be the application's class loader and that seems to prevent this problem.
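A new thread inherits the context class loader of the thread that creates it, so a pool thread started on behalf of a web application request ends up tied to that application's class loader unless this is overridden. The following sketch shows the general technique with a standard ThreadFactory. The class name PoolLoaderThreadFactory is hypothetical, and this is an assumption about the approach rather than the actual Atomikos code.

```java
import java.util.concurrent.ThreadFactory;

public class PoolLoaderThreadFactory implements ThreadFactory {

    // The loader that loaded the pool classes: the server-level loader
    // when the library sits on the server's classpath.
    private final ClassLoader poolLoader = PoolLoaderThreadFactory.class.getClassLoader();

    @Override
    public Thread newThread(Runnable task) {
        Thread thread = new Thread(task, "pool-worker");
        thread.setDaemon(true);
        // Override the context class loader inherited from the creating thread.
        thread.setContextClassLoader(poolLoader);
        return thread;
    }
}
```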
None.
| Severity: | 2 |
|---|---|
| Affected version(s): | 4.0.x, 5.0.x |
You no longer get "Log corrupted - restart JVM" exceptions after you interrupt a thread that is writing to the transaction log file, or after any other exception that makes a log checkpoint fail.
Such a failure used to raise an exception in the com.atomikos.recovery.fs.CachedRepository class, leaving the instance in an invalid state:
2021-03-01 16:15:56.662 ERROR 41669 --- [pool-1-thread-1] c.a.recovery.fs.FileSystemRepository : Failed to write checkpoint
java.nio.channels.ClosedByInterruptException: null
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) ~[na:1.8.0_192]
at sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:392) ~[na:1.8.0_192]
at com.atomikos.recovery.fs.FileSystemRepository.writeCheckpoint(FileSystemRepository.java:196) ~[transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.recovery.fs.CachedRepository.performCheckpoint(CachedRepository.java:84) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.recovery.fs.CachedRepository.put(CachedRepository.java:77) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.recovery.fs.OltpLogImp.write(OltpLogImp.java:46) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.persistence.imp.StateRecoveryManagerImp.preEnter(StateRecoveryManagerImp.java:51) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.finitestates.FSMImp.notifyListeners(FSMImp.java:164) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.finitestates.FSMImp.setState(FSMImp.java:251) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.icatch.imp.CoordinatorImp.setState(CoordinatorImp.java:284) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.icatch.imp.CoordinatorStateHandler.commitFromWithinCallback(CoordinatorStateHandler.java:346) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.icatch.imp.ActiveStateHandler$6.doCommit(ActiveStateHandler.java:273) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.icatch.imp.CoordinatorStateHandler.commitWithAfterCompletionNotification(CoordinatorStateHandler.java:587) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.icatch.imp.ActiveStateHandler.commit(ActiveStateHandler.java:268) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.icatch.imp.CoordinatorImp.commit(CoordinatorImp.java:550) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.icatch.imp.CoordinatorImp.terminate(CoordinatorImp.java:682) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.icatch.imp.CompositeTransactionImp.commit(CompositeTransactionImp.java:279) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.icatch.jta.TransactionImp.commit(TransactionImp.java:168) [transactions-jta-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.icatch.jta.TransactionManagerImp.commit(TransactionManagerImp.java:428) [transactions-jta-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.icatch.jta.UserTransactionManager.commit(UserTransactionManager.java:160) [transactions-jta-5.0.9-SNAPSHOT.jar:na]
at org.springframework.transaction.jta.JtaTransactionManager.doCommit(JtaTransactionManager.java:1035) [spring-tx-5.2.5.RELEASE.jar:5.2.5.RELEASE]
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processCommit(AbstractPlatformTransactionManager.java:743) [spring-tx-5.2.5.RELEASE.jar:5.2.5.RELEASE]
at org.springframework.transaction.support.AbstractPlatformTransactionManager.commit(AbstractPlatformTransactionManager.java:711) [spring-tx-5.2.5.RELEASE.jar:5.2.5.RELEASE]
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:152) [spring-tx-5.2.5.RELEASE.jar:5.2.5.RELEASE]
at com.example.atomikos.AtomikosApplicationTests.lambda$4(AtomikosApplicationTests.java:78) [test-classes/:na]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_192]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_192]
Later requests trying to read from the transaction logs would get systematic corruption errors like this:
com.atomikos.recovery.LogReadException: Log corrupted - restart JVM
at com.atomikos.recovery.fs.CachedRepository.assertNotCorrupted(CachedRepository.java:137) ~[transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.recovery.fs.CachedRepository.findAllCommittingCoordinatorLogEntries(CachedRepository.java:145) ~[transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.recovery.fs.RecoveryLogImp.getExpiredPendingCommittingTransactionRecordsAt(RecoveryLogImp.java:52) ~[transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.icatch.imp.RecoveryDomainService.performRecovery(RecoveryDomainService.java:76) ~[transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.icatch.imp.RecoveryDomainService$1.alarm(RecoveryDomainService.java:55) [transactions-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.timing.PooledAlarmTimer.notifyListeners(PooledAlarmTimer.java:101) [atomikos-util-5.0.9-SNAPSHOT.jar:na]
at com.atomikos.timing.PooledAlarmTimer.run(PooledAlarmTimer.java:88) [atomikos-util-5.0.9-SNAPSHOT.jar:na]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_192]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_192]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_192]
This has now been fixed.
None.
We were informed about a potential security issue with Log4J:
Watch the explanation in this YouTube video.
Please adjust your Log4J dependency versions accordingly, to avoid any risk. No new Atomikos release needs to be installed: the issue is isolated to a 3rd party library dependency that is marked as optional, so it is not pulled in transitively. The API has been stable, so it works with the latest secure Log4J versions at the time of publishing.
You can now configure a networkTimeout parameter for the pool.
Network issues are a recurring problem for connection pools: a pool attempts to keep connections open, whereas intermediaries on the network tend to close them (silently). In addition, backend servers going down can also invalidate the pool's connections.
These conditions can easily lead to long block times on the pool and its connections and the application thus becomes unresponsive. By setting the new networkTimeout property on our datasource classes you can limit the time that applications can block on the network.
This new feature only works if the underlying driver supports it (leave the property unset if not). Also, any timeout value you configure must be higher than the typical duration of your SQL operations, so it must also be higher than the transaction timeout.
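Standard JDBC (since 4.1) exposes network timeouts through Connection.setNetworkTimeout, which a driver is free to reject. The sketch below illustrates the two constraints just mentioned using that standard call. It is an assumption that the pool property maps onto this JDBC method, and the class NetworkTimeoutSketch with its millisecond parameters is hypothetical.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.Executor;

public class NetworkTimeoutSketch {

    /**
     * Tries to apply a network timeout through the standard JDBC 4.1 call.
     * Returns false when the driver rejects or does not support it.
     */
    public static boolean tryApply(Connection connection, Executor executor,
                                   int networkTimeoutMillis, int transactionTimeoutMillis) {
        if (networkTimeoutMillis <= transactionTimeoutMillis) {
            throw new IllegalArgumentException(
                    "network timeout must exceed the transaction timeout");
        }
        try {
            connection.setNetworkTimeout(executor, networkTimeoutMillis);
            return true;
        } catch (SQLException | AbstractMethodError e) {
            // SQLFeatureNotSupportedException is a SQLException; drivers that
            // predate JDBC 4.1 may fail with AbstractMethodError instead.
            return false;
        }
    }
}
```

A false return value corresponds to the "leave the property unset" case: the driver cannot enforce the timeout, so the application should not rely on it.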
| Severity: | 3 |
|---|---|
| Affected version(s): | 5.0.107 |
We now log warnings for errors during the prepare phase.
When an error happens during prepare then we used to log debug information. Consequently, some useful information was hard to find, in particular failures due to deferred constraint violations. We now log as warnings instead.
None.
| Severity: | 4 |
|---|---|
| Affected version(s): | 5.0.107 |
You can now (again) access the javadoc in your IDE.
We encountered a release problem in the 5.0.107 release for which we had to disable the javadoc plugin. This meant that most of that release went undocumented. We have now fixed this.
None, except that the documentation is now included.