JBoss.org Community Documentation

Narayana Product Documentation

Mark Little

Jonathan Halliday

Andrew Dinn

Kevin Connor

Michael Musgrove

Gytis Trikleris

Abstract

The Transactions Overview Guide contains information on how to use Narayana to develop applications that use transaction technology to manage business processes.


Preface
1. Document Conventions
1.1. Typographic Conventions
1.2. Pull-quote Conventions
1.3. Notes and Warnings
2. We Need Feedback!
1. Transactions Overview
1.1. What is a transaction?
1.2. The Coordinator
1.3. The Transaction Context
1.4. Participants
1.5. Commit protocol
1.6. The Synchronization Protocol
1.7. Optimizations to the Protocol
1.8. Non-Atomic Transactions and Heuristic Outcomes
1.9. Interposition
1.10. A New Transaction Protocol
1.10.1. Addressing the Problems of Transactioning in Loosely Coupled Systems
2. Failure Recovery
2.1. Architecture of the Recovery Manager
2.1.1. Crash Recovery Overview
2.1.2. Recovery Manager
2.1.3. Recovery Modules
2.1.4. A Recovery Module for XA Resources
2.1.5. Recovering XAConnections
2.1.6. Alternative to XAResourceRecovery
2.1.7. Shipped XAResourceRecovery implementations
2.1.8. TransactionStatusConnectionManager
2.1.9. Expired Scanner Thread
2.1.10. Application Process
2.1.11. TransactionStatusManager
2.1.12. Object Store
2.1.13. Socket free operation
2.2. How Narayana manages the OTS Recovery Protocol
2.2.1. Recovery Protocol in OTS - Overview
2.2.2. RecoveryCoordinator in Narayana
2.2.3. The default RecoveryCoordinator in JacOrb
2.3. Configuration Options
2.3.1. Recovery Protocol in OTS - Overview
3. Development Guide
3.1. Transactions
3.1.1. The Java Transaction API (JTA)
3.1.2. Introducing the API
3.1.3. UserTransaction
3.1.4. TransactionManager
3.1.5. Suspending and resuming a transaction
3.1.6. The Transaction interface
3.1.7. Resource enlistment
3.1.8. Transaction synchronization
3.1.9. Transaction equality
3.1.10. TransactionSynchronizationRegistry
3.2. The Resource Manager
3.2.1. The XAResource interface
3.2.2. Opening a resource manager
3.2.3. Closing a resource manager
3.2.4. Thread of control
3.2.5. Transaction association
3.2.6. Externally controlled connections
3.2.7. Resource sharing
3.2.8. Local and global transactions
3.2.9. Transaction timeouts
3.2.10. Dynamic registration
3.3. General Transaction Issues
3.3.1. Advanced transaction issues with ArjunaCore
3.4. Tools
3.4.1. ObjectStore command-line browsers and editors
3.4.2. GUI Based Tools
3.4.3. View Transaction Statistics using an Application Server
3.5. Configuration options
3.5.1. Loading a configuration
3.5.2. ArjunaCore Options
3.5.3. Narayana JTA Configuration options
3.5.4. Narayana JTS Options
3.6. Important Log Messages
3.6.1. Transaction State Change
3.6.2. Multi cause log message
3.7. Troubleshooting
3.7.1. WS-BA Participant-Completion Race Condition
4. XTS Guide
4.1. Introduction
4.1.1. Managing service-Based Processes
4.1.2. Servlets
4.1.3. SOAP
4.1.4. Web Services Description Language (WSDL)
4.2. Getting Started
4.2.1. Enable XTS on WildFly Application Server
4.2.2. Working With WS-AT
4.2.3. Working With WS-BA
4.2.4. Configuration of The Transaction Context Propagation
4.2.5. Summary
4.3. The XTS API
4.3.1. Participants
4.3.2. API for the Atomic Transaction Protocol
4.3.3. API for the Business Activity Protocol
4.4. Stand-Alone Coordination
4.4.1. Introduction
4.4.2. Configuring the Activation Coordinator
4.5. Participant Crash Recovery
4.5.1. WS-AT Recovery
4.5.2. WS-BA Recovery
4.6. Web Service Transaction Service (XTS) Management
4.6.1. Transaction manager overview
4.6.2. Configuring the transaction manager
4.6.3. Deployment descriptors
4.7. Quickstarts Overview
4.7.1. WS-AT Multi-Service
4.7.2. WS-AT Multi-Hop
4.7.3. XTS with SSL
4.7.4. Raw XTS API Demo
4.7.5. Non-transactional Resource with Compensating Transactions API
4.7.6. Travel Agent with Compensating Transactions API
5. TXBridge Guide
5.1. Introduction
5.1.1. Contextual Overview
5.1.2. Transaction Bridging
5.2. Transaction Bridge Architecture
5.2.1. Overview
5.2.2. Shared Design Elements
5.2.3. Inbound Bridging
5.2.4. Outbound Bridging
5.2.5. Crash Recovery
5.3. Using the Transaction Bridge
5.3.1. Introduction
5.3.2. Enabling
5.3.3. Inbound Bridging
5.3.4. Outbound Bridging
5.3.5. Loops and Diamonds
5.3.6. Distributed JTA and the JTS
5.3.7. Logging
5.4. Known Limitations
5.5. Design Notes
5.5.1. General Points
5.5.2. Crash Recovery Considerations
5.5.3. Test framework

This manual uses several conventions to highlight certain words and phrases and draw attention to specific pieces of information.

In PDF and paper editions, this manual uses typefaces drawn from the Liberation Fonts set. The Liberation Fonts set is also used in HTML editions if the set is installed on your system. If not, alternative but equivalent typefaces are displayed. Note: Red Hat Enterprise Linux 5 and later includes the Liberation Fonts set by default.

Four typographic conventions are used to call attention to specific words and phrases. These conventions, and the circumstances they apply to, are as follows.

Mono-spaced Bold

Used to highlight system input, including shell commands, file names and paths. Also used to highlight keycaps and key combinations. For example:

The above includes a file name, a shell command and a keycap, all presented in mono-spaced bold and all distinguishable thanks to context.

Key combinations can be distinguished from keycaps by the hyphen connecting each part of a key combination. For example:

The first paragraph highlights the particular keycap to press. The second highlights two key combinations (each a set of three keycaps with each set pressed simultaneously).

If source code is discussed, class names, methods, functions, variable names and returned values mentioned within a paragraph will be presented as above, in mono-spaced bold. For example:

Proportional Bold

This denotes words or phrases encountered on a system, including application names; dialog box text; labeled buttons; check-box and radio button labels; menu titles and sub-menu titles. For example:

The above text includes application names; system-wide menu names and items; application-specific menu names; and buttons and text found within a GUI interface, all presented in proportional bold and all distinguishable by context.

Mono-spaced Bold Italic or Proportional Bold Italic

Whether mono-spaced bold or proportional bold, the addition of italics indicates replaceable or variable text. Italics denotes text you do not input literally or displayed text that changes depending on circumstance. For example:

Note the words in bold italics above — username, domain.name, file-system, package, version and release. Each word is a placeholder, either for text you enter when issuing a command or for text displayed by the system.

Aside from standard usage for presenting the title of a work, italics denotes the first use of a new and important term. For example:

Consider the following situation: a user wishes to purchase access to an on-line newspaper and needs to pay for this access from an account maintained by an on-line bank. Once the newspaper site has received the user’s credit from the bank, it will deliver an electronic token to the user granting access to the site. Ideally the user would like the debiting of the account and the delivery of the token to be “all or nothing” (atomic). However, hardware and software failures could prevent either event from occurring, and leave the system in an indeterminate state.

A transaction can be terminated in two ways: committed or aborted (rolled back). When a transaction is committed, all changes made within it are made durable (forced on to stable storage, e.g., disk). When a transaction is aborted, all of the changes are undone. Atomic actions can also be nested; the effects of a nested action are provisional upon the commit/abort of the outermost (top-level) atomic action.

Transactions have emerged as the dominant paradigm for coordinating interactions between parties in a (distributed) system, and in particular to manage applications that require concurrent access to shared data. A classic transaction is a unit of work that either completely succeeds, or fails with all partially completed work being undone. When a transaction is committed, all changes made by the associated requests are made durable, normally by committing the results of the work to a database. If a transaction should fail and is rolled back, all changes made by the associated work are undone. Transactions in distributed systems typically require the use of a transaction manager that is responsible for coordinating all of the participants that are part of the transaction.

A two-phase commit protocol is required to guarantee that all of the action participants either commit or abort any changes made. Figure 1.2, “Two-Phase Commit Overview” illustrates the main aspects of the commit protocol. During phase 1, the action coordinator, C, attempts to communicate with all of the action participants, A and B, to determine whether they will commit or abort. An abort reply from any participant acts as a veto, causing the entire action to abort. Based upon these responses, or the lack of them, the coordinator decides whether to commit or abort the action. If the action will commit, the coordinator records this decision on stable storage, and the protocol enters phase 2, where the coordinator forces the participants to carry out the decision. The coordinator also informs the participants if the action aborts.

When a participant receives the coordinator’s phase 1 message, it records sufficient information on stable storage to either commit or abort the changes made during the action. After returning the phase 1 response, each participant that returned a commit response must remain blocked until it has received the coordinator’s phase 2 message. Until it receives this message, the resources it holds are unavailable for use by other actions. If the coordinator fails before delivery of this message, these resources remain blocked. However, if crashed machines eventually recover, crash recovery mechanisms can be employed to unblock the protocol and terminate the action.
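The two phases described above can be sketched in a few lines of Java. This is a minimal illustration rather than Narayana's implementation: the `Participant` interface, the `Vote` enum, the `LoggingParticipant` class and the `stableStorage` list are all hypothetical names introduced here, and a thrown exception stands in for a participant that fails to reply.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative two-phase commit; every name here is hypothetical, not a Narayana API. */
final class TwoPhaseCommitSketch {

    enum Vote { COMMIT, ABORT }

    interface Participant {
        Vote prepare();   // phase 1: record enough state to go either way, then vote
        void commit();    // phase 2: make the changes durable
        void rollback();  // undo the changes
    }

    /** Participant that votes as configured and remembers what it was asked to do. */
    static final class LoggingParticipant implements Participant {
        final Vote vote;
        final List<String> log = new ArrayList<>();
        LoggingParticipant(Vote vote) { this.vote = vote; }
        @Override public Vote prepare() { log.add("prepare"); return vote; }
        @Override public void commit() { log.add("commit"); }
        @Override public void rollback() { log.add("rollback"); }
    }

    /** Drives the protocol; returns true if the action committed. */
    static boolean run(List<? extends Participant> participants, List<String> stableStorage) {
        boolean allVotedCommit = true;
        for (Participant p : participants) {
            Vote vote;
            try {
                vote = p.prepare();
            } catch (RuntimeException noReply) {
                vote = Vote.ABORT; // assumption: a missing response is treated like a veto
            }
            if (vote == Vote.ABORT) { allVotedCommit = false; break; }
        }
        if (!allVotedCommit) {
            // The coordinator also informs the participants when the action aborts.
            for (Participant p : participants) p.rollback();
            return false;
        }
        stableStorage.add("decision=commit"); // record the decision before phase 2
        for (Participant p : participants) p.commit();
        return true;
    }

    public static void main(String[] args) {
        List<String> storage = new ArrayList<>();
        LoggingParticipant a = new LoggingParticipant(Vote.COMMIT);
        LoggingParticipant b = new LoggingParticipant(Vote.COMMIT);
        System.out.println("committed=" + run(List.of(a, b), storage) + " log=" + a.log);
    }
}
```

The sketch keeps the coordinator's log in memory; a real coordinator would force the commit decision to durable storage before starting phase 2, which is what makes recovery after a crash possible.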


Note

During two-phase commit transactions, coordinators and resources keep track of activity in non-volatile data stores so that they can recover in the case of a failure.

Besides the two-phase commit protocol, traditional transaction processing systems employ an additional protocol, often referred to as the synchronization protocol. Of the original ACID properties, Durability is important when state changes need to be available despite failures. Applications interact with a persistence store of some kind, such as a database, and this interaction can impose a significant overhead, because disk access is much slower than access to main memory.

One solution to the problem of disk access time is to cache the state in main memory and only operate on the cache for the duration of a transaction. Unfortunately, this solution needs a way to flush the state back to the persistent store before the transaction terminates, or it risks losing the full ACID properties. This is what the synchronization protocol does, with Synchronization Participants.

Synchronizations are informed that a transaction is about to commit. At that point, they can flush cached state, which might be used to improve performance of an application, to a durable representation prior to the transaction committing. The synchronizations are then informed about when the transaction completes and its completion state.

The synchronization protocol does not have the same failure requirements as the traditional two-phase commit protocol. For example, Synchronization participants do not need the ability to recover in the event of failures, because any failure before the two-phase commit protocol completes causes the transaction to roll back, and failures after it completes have no effect on the data for which the Synchronization participants are responsible.
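The following sketch shows how a cache might take part in the synchronization protocol. It is an illustration under stated assumptions: the `Synchronization` interface is declared locally with the same two callbacks as its JTA counterpart (taking a boolean rather than a status code), and the `Store` class is a hypothetical stand-in for a transactional resource whose staged writes become visible only on commit.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SynchronizationSketch {

    /** Same two callbacks as a JTA Synchronization, declared locally to stay self-contained. */
    interface Synchronization {
        void beforeCompletion();
        void afterCompletion(boolean committed);
    }

    /** Hypothetical transactional store: staged writes become visible only on commit. */
    static final class Store {
        final Map<String, Integer> committed = new HashMap<>();
        private final Map<String, Integer> staged = new HashMap<>();
        void stage(Map<String, Integer> writes) { staged.putAll(writes); }
        void commit() { committed.putAll(staged); staged.clear(); }
        void rollback() { staged.clear(); }
    }

    /** In-memory cache that is flushed to the store just before the commit protocol runs. */
    static final class CachingSynchronization implements Synchronization {
        final Map<String, Integer> cache = new HashMap<>();
        final Store store;
        Boolean outcome; // null until afterCompletion is called
        CachingSynchronization(Store store) { this.store = store; }
        @Override public void beforeCompletion() { store.stage(cache); }
        @Override public void afterCompletion(boolean committed) { outcome = committed; cache.clear(); }
    }

    /** Runs the callbacks around a simulated commit; returns true if the transaction committed. */
    static boolean complete(List<? extends Synchronization> syncs, Store store, boolean commitSucceeds) {
        boolean committed = commitSucceeds;
        try {
            for (Synchronization s : syncs) s.beforeCompletion();
        } catch (RuntimeException flushFailed) {
            committed = false; // a failure before the commit protocol completes forces rollback
        }
        if (committed) store.commit(); else store.rollback(); // stands in for two-phase commit
        for (Synchronization s : syncs) s.afterCompletion(committed);
        return committed;
    }

    public static void main(String[] args) {
        Store store = new Store();
        CachingSynchronization sync = new CachingSynchronization(store);
        sync.cache.put("balance", 90);
        System.out.println(complete(List.of(sync), store, true) + " " + store.committed);
    }
}
```

Note that the synchronization itself keeps no recovery state: if the process fails before the commit completes, the staged writes are simply discarded along with the rest of the transaction.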

There are several variants to the standard two-phase commit protocol that are worth knowing about, because they can have an impact on performance and failure recovery. Table 1.1, “Variants to the Two-Phase Commit Protocol” gives more information about each one.

Table 1.1. Variants to the Two-Phase Commit Protocol

Variant

Description

Presumed Abort

If a transaction is going to roll back, the coordinator may record this information locally and tell all enlisted participants. Failure to contact a participant has no effect on the transaction outcome. The coordinator is informing participants only as a courtesy. Once all participants have been contacted, the information about the transaction can be removed. If a subsequent request for the status of the transaction occurs, no information will be available and the requester can assume that the transaction has aborted. This optimization has the benefit that no information about participants need be made persistent until the transaction has progressed to the end of the prepare phase and decided to commit, since any failure prior to this point is assumed to be an abort of the transaction.

One-Phase

If only a single participant is involved in the transaction, the coordinator does not need to drive it through the prepare phase. Thus, the participant is told to commit, and the coordinator does not need to record information about the decision, since the outcome of the transaction is the responsibility of the participant.

Read-Only

When a participant is asked to prepare, it can indicate to the coordinator that no information or data that it controls has been modified during the transaction. Such a participant does not need to be informed about the outcome of the transaction since the fate of the participant has no effect on the transaction. Therefore, a read-only participant can be omitted from the second phase of the commit protocol.
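The one-phase and read-only variants in Table 1.1 can be illustrated with a small coordinator. This is a sketch with hypothetical names (`Vote`, `Participant`, `Recording`, `complete`), not a Narayana API, and it does not model the logging side of presumed abort.

```java
import java.util.ArrayList;
import java.util.List;

final class CommitOptimizationsSketch {

    enum Vote { COMMIT, ROLLBACK, READ_ONLY }

    interface Participant {
        Vote prepare();
        void commit();
        void rollback();
        boolean commitOnePhase(); // the participant decides the outcome itself
    }

    /** Participant that votes as configured and records every call it receives. */
    static final class Recording implements Participant {
        final Vote vote;
        final List<String> calls = new ArrayList<>();
        Recording(Vote vote) { this.vote = vote; }
        @Override public Vote prepare() { calls.add("prepare"); return vote; }
        @Override public void commit() { calls.add("commit"); }
        @Override public void rollback() { calls.add("rollback"); }
        @Override public boolean commitOnePhase() {
            calls.add("commitOnePhase");
            return vote != Vote.ROLLBACK;
        }
    }

    /** Returns true if the transaction committed. */
    static boolean complete(List<? extends Participant> participants) {
        // One-phase: a single participant skips prepare and owns the outcome.
        if (participants.size() == 1) {
            return participants.get(0).commitOnePhase();
        }
        List<Participant> phaseTwo = new ArrayList<>();
        for (int i = 0; i < participants.size(); i++) {
            Vote vote = participants.get(i).prepare();
            if (vote == Vote.ROLLBACK) {
                // Veto: undo the prepared participants and those not yet asked.
                for (Participant p : phaseTwo) p.rollback();
                for (Participant p : participants.subList(i + 1, participants.size())) p.rollback();
                return false;
            }
            // Read-only: the participant is left out of the second phase.
            if (vote == Vote.COMMIT) phaseTwo.add(participants.get(i));
        }
        for (Participant p : phaseTwo) p.commit();
        return true;
    }

    public static void main(String[] args) {
        Recording writer = new Recording(Vote.COMMIT);
        Recording reader = new Recording(Vote.READ_ONLY);
        System.out.println(complete(List.of(writer, reader)) + " " + writer.calls + " " + reader.calls);
    }
}
```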


In order to guarantee atomicity, the two-phase commit protocol is blocking. As a result of failures, participants may remain blocked for an indefinite period of time, even if failure recovery mechanisms exist. Some applications and participants cannot tolerate this blocking.

To break this blocking nature, participants that are past the prepare phase are allowed to make autonomous decisions about whether to commit or roll back. Such a participant must record its decision, so that it can complete the original transaction if it eventually gets a request to do so. If the coordinator eventually informs the participant of the transaction outcome, and it is the same as the choice the participant made, no conflict exists. If the decisions of the participant and coordinator are different, the situation is referred to as a non-atomic outcome, and more specifically as a heuristic outcome.

Resolving and reporting heuristic outcomes to the application is usually the domain of complex, manually driven system administration tools, because attempting an automatic resolution requires semantic information about the nature of participants involved in the transactions.

Precisely when a participant makes a heuristic decision depends on the specific implementation. Likewise, the choice the participant makes about whether to commit or to roll back depends upon the implementation, and possibly the application and the environment in which it finds itself. The possible heuristic outcomes are discussed in Table 1.2, “Heuristic Outcomes”.


Heuristic decisions should be used with care and only in exceptional circumstances, since the decision may possibly differ from that determined by the transaction service. This type of difference can lead to a loss of integrity in the system. Try to avoid needing to perform resolution of heuristics, either by working with services and participants that do not cause heuristics, or by using a transaction service that provides assistance in the resolution process.

Interposition is a scoping mechanism which allows coordination of a transaction to be delegated across a hierarchy of coordinators. See Figure 1.3, “Interpositions” for a graphical representation of this concept.


Interposition is particularly useful for Web Services transactions, as a way of limiting the amount of network traffic required for coordination. For example, if communications between the top-level coordinator and a web service are slow because of network traffic or distance, the web service might benefit from executing in a subordinate transaction which employs a local coordinator service. In Figure 1.3, “Interpositions”, to prepare, the top-level coordinator only needs to send one prepare message to the subordinate coordinator, and receive one prepared or aborted reply. The subordinate coordinator forwards a prepare locally to each participant and combines the results to decide whether to send a single prepared or aborted reply.
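The message saving described above can be made concrete with a short sketch. All names (`Participant`, `SubordinateCoordinator`, `TopLevelCoordinator`) are hypothetical; the counters simply record how many prepare messages each coordinator sends.

```java
import java.util.List;

final class InterpositionSketch {

    /** true = prepared, false = aborted. */
    interface Participant { boolean prepare(); }

    /** Looks like a single participant to its parent, but fans out to local participants. */
    static final class SubordinateCoordinator implements Participant {
        private final List<? extends Participant> localParticipants;
        int localMessages = 0;
        SubordinateCoordinator(List<? extends Participant> localParticipants) {
            this.localParticipants = localParticipants;
        }
        @Override public boolean prepare() {
            boolean allPrepared = true;
            for (Participant p : localParticipants) {
                localMessages++;            // cheap, local prepare message
                allPrepared &= p.prepare(); // combine the local replies
            }
            return allPrepared;             // one combined reply goes back up
        }
    }

    static final class TopLevelCoordinator {
        int messagesSent = 0;
        boolean prepareAll(List<? extends Participant> enlisted) {
            boolean allPrepared = true;
            for (Participant p : enlisted) {
                messagesSent++;             // potentially slow, remote prepare message
                allPrepared &= p.prepare();
            }
            return allPrepared;
        }
    }

    public static void main(String[] args) {
        Participant ok = () -> true;
        SubordinateCoordinator sub = new SubordinateCoordinator(List.of(ok, ok, ok));
        TopLevelCoordinator top = new TopLevelCoordinator();
        boolean prepared = top.prepareAll(List.of(sub));
        System.out.println(prepared + " remote=" + top.messagesSent + " local=" + sub.localMessages);
    }
}
```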

Many component technologies offer mechanisms for coordinating ACID transactions based on two-phase commit semantics. Some of these are CORBA/OTS, JTS/JTA, and MTS/MSDTC. ACID transactions are not suitable for all Web Services transactions, as explained in Reasons ACID is Not Suitable for Web Services.

The main architectural components within Crash Recovery are illustrated in the diagram below:


The Recovery Manager is a daemon process responsible for performing crash recovery. Only one Recovery Manager runs per node. The Object Store provides persistent data storage for transactions to log data. During normal transaction processing each transaction will log persistent data needed for the commit phase to the Object Store. On successfully committing a transaction this data is removed; however, if the transaction fails, this data remains within the Object Store.

The Recovery Manager functions by:

  • Periodically scanning the Object Store for transactions that may have failed. Failed transactions are indicated by the presence of log data after a period of time that the transaction would have normally been expected to finish.

  • Checking with the application process which originated the transaction whether the transaction is still in progress or not.

  • Recovering the transaction by re-activating the transaction and then replaying phase two of the commit protocol.

The following sections describe the architectural components in more detail.

On initialization the Recovery Manager first loads in configuration information via a properties file. This configuration includes a number of recovery activators and recovery modules, which are then dynamically loaded.

The Recovery Manager is not specifically tied to an Object Request Broker (ORB). Hence, the OTS recovery protocol is not implicitly enabled. To enable such a protocol, we use the concept of a recovery activator, defined by the interface RecoveryActivator, which is used to instantiate a recovery class related to the underlying communication protocol. For instance, when used with OTS, the RecoveryActivator has the responsibility to create a RecoveryCoordinator object able to respond to the replay_completion operation.

All RecoveryActivator instances inherit the same interface. They are loaded via the following recovery extension property:

<entry key="RecoveryEnvironmentBean.recoveryActivators">
  list_of_class_names
</entry>

For instance, the RecoveryActivator provided in the JTS/OTS distribution, which is not described further here, is configured as follows:

<entry key="RecoveryEnvironmentBean.recoveryActivators">
      com.arjuna.ats.internal.jts.orbspecific.recovery.RecoveryEnablement
</entry>

When loaded, each RecoveryActivator instance provides the method startRCservice, which the Recovery Manager invokes to create the appropriate recovery component able to receive recovery requests according to a particular transaction protocol, for instance the RecoveryCoordinator defined by the OTS protocol.

Each recovery module is used to recover a different type of transaction/resource; however, each recovery module inherits the same basic behavior.

Recovery consists of two separate passes/phases separated by two timeout periods. The first pass examines the object store for potentially failed transactions; the second pass performs crash recovery on failed transactions. The timeout between the first and second pass is known as the backoff period. The timeout between the end of the second pass and the start of the first pass is the recovery period. The recovery period is larger than the backoff period.

The Recovery Manager invokes the first pass upon each recovery module, applies the backoff period timeout, invokes the second pass upon each recovery module and finally applies the recovery period timeout before restarting the first pass again.
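The scan cycle described above can be sketched as a simple loop. The `RecoveryModule` interface is declared locally here using the two method names mentioned in this chapter; `runCycles`, `recording` and the millisecond parameters are hypothetical names for illustration, not the Recovery Manager's actual code.

```java
import java.util.ArrayList;
import java.util.List;

final class RecoveryPassesSketch {

    /** Local declaration using the two pass methods named in the text. */
    interface RecoveryModule {
        void periodicWorkFirstPass();
        void periodicWorkSecondPass();
    }

    /** Runs a fixed number of scan cycles; a real manager would loop until shut down. */
    static void runCycles(List<? extends RecoveryModule> modules,
                          long backoffMillis, long recoveryMillis, int cycles) {
        try {
            for (int i = 0; i < cycles; i++) {
                for (RecoveryModule m : modules) m.periodicWorkFirstPass();
                Thread.sleep(backoffMillis);   // backoff period between the passes
                for (RecoveryModule m : modules) m.periodicWorkSecondPass();
                Thread.sleep(recoveryMillis);  // recovery period before the next cycle
            }
        } catch (InterruptedException stop) {
            Thread.currentThread().interrupt(); // preserve the interrupt and stop scanning
        }
    }

    /** Module that only records the order in which its passes are invoked. */
    static RecoveryModule recording(String name, List<String> events) {
        return new RecoveryModule() {
            @Override public void periodicWorkFirstPass() { events.add(name + ":first"); }
            @Override public void periodicWorkSecondPass() { events.add(name + ":second"); }
        };
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        runCycles(List.of(recording("xa", events), recording("tx", events)), 10, 20, 1);
        System.out.println(events);
    }
}
```

The first pass runs on every module before any module's second pass begins, so work that was still in flight during the first pass has the whole backoff period to finish.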

The recovery modules are loaded via the following recovery extension property:

<entry key="RecoveryEnvironmentBean.recoveryExtensions">
   list_of_class_names
</entry>

The backoff period and recovery period are set using the following properties:

<entry key="RecoveryEnvironmentBean.recoveryBackoffPeriod">

<entry key="RecoveryEnvironmentBean.periodicRecoveryPeriod">

The following Java classes are used to implement the Recovery Manager:

After failure it is sometimes desirable to recover on a different node from the one where the transaction manager failed. This kind of usage is only supported in JTA mode running inside an application server (with certain restrictions) and is not typical because of the consequences of incorrect configuration:

This is a long list of caveats and if it is not possible to simply restart the failed node then, in order to avoid the consequences of incorrect configuration, we advise that the application server on the recovering node uses the same configuration file as the failed node.

As stated before, each recovery module is used to recover a different type of transaction/resource, but every recovery module must implement the following RecoveryModule interface, which defines two methods, periodicWorkFirstPass and periodicWorkSecondPass, invoked by the Recovery Manager.


During recovery, the Transaction Manager needs to be able to communicate with all resource managers that are in use by the applications in the system. For each resource manager, the Transaction Manager uses the XAResource.recover method to retrieve the list of transactions that are currently in a prepared or heuristically completed state. Typically, the system administrator configures all transactional resource factories that are used by the applications deployed on the system. An example of such a resource factory is the JDBC XADataSource object, which is a factory for the JDBC XAConnection objects.

Because XAResource objects are not persistent across system failures, the Transaction Manager needs to have some way to acquire the XAResource objects that represent the resource managers which might have participated in the transactions prior to the system failure. For example, a Transaction Manager might, through the use of JNDI lookup mechanism, acquire a connection from each of the transactional resource factories, and then obtain the corresponding XAResource object for each connection. The Transaction Manager then invokes the XAResource.recover method to ask each resource manager to return the transactions that are currently in a prepared or heuristically completed state.

One of the following recovery mechanisms will be used:

As described earlier, the Recovery Manager triggers a recovery process by calling a set of recovery modules that implement the two methods defined by the RecoveryModule interface. To enable recovery of participants controlled via the XA interface, a specific recovery module named XARecoveryModule is provided. The XARecoveryModule, defined in the packages com.arjuna.ats.internal.jta.recovery.arjunacore and com.arjuna.ats.internal.jta.recovery.jts, handles recovery of XA resources (databases etc.) used in JTA.

Its behavior consists of two aspects: “transaction-initiated” and “resource-initiated” recovery. Transaction-initiated recovery is possible where the particular transaction branch had progressed far enough for a JTA Resource Record to be written in the ObjectStore.

A JTA Resource record contains the information needed to link the transaction, as known to the rest of Narayana, to the database. Resource-initiated recovery is necessary for branches where a failure occurred after the database had made a persistent record of the transaction, but before the JTA ResourceRecord was persisted. Resource-initiated recovery is also necessary for datasources for which it is not possible to hold information in the JTA Resource record that allows the recreation in the RecoveryManager of the XAConnection/XAResource that was used in the original application.

Transaction-initiated recovery is automatic. The XARecoveryModule finds the JTA Resource Records that need recovery, then uses the normal recovery mechanisms to find the status of the transaction each was involved in (i.e., it calls replay_completion on the RecoveryCoordinator for the transaction branch), (re)creates the appropriate XAResource and issues commit or rollback on it as appropriate. The XAResource creation will use the same information (database name, username, password, etc.) as the original application.

Resource-initiated recovery has to be specifically configured, by supplying the Recovery Manager with the appropriate information for it to interrogate all the databases (XADataSources) that have been accessed by any Narayana application. The access to each XADataSource is handled by a class that implements the com.arjuna.ats.jta.recovery.XAResourceRecovery interface, as illustrated in Figure 4. Instances of classes that implement the XAResourceRecovery interface are dynamically loaded, as controlled by properties with names beginning “com.arjuna.ats.jta.recovery.XAResourceRecovery”.


The XARecoveryModule will use the XAResourceRecovery implementation to get an XAResource to the target datasource. On each invocation of periodicWorkSecondPass, the recovery module will issue an XAResource.recover request. As described in the XA specification, this returns a list of the transaction identifiers (Xids) that are known to the datasource and are in an indeterminate (in-doubt) state. The lists of in-doubt Xids received on successive passes (i.e. successive calls to periodicWorkSecondPass) are compared. Any Xid that appears in both lists, and for which no JTA ResourceRecord was found by the intervening transaction-initiated recovery, is assumed to belong to a transaction that was involved in a crash before any JTA ResourceRecord was written, and a rollback is issued for that transaction on the XAResource.

This double-scan mechanism is used because it is possible the Xid was obtained from the datasource just as the original application process was about to create the corresponding JTA ResourceRecord. The interval between the scans should allow time for the record to be written unless the application crashes (and if it does, rollback is the right answer).
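The double-scan comparison amounts to simple set arithmetic. In the sketch below, Xids are represented as plain strings and `xidsToRollBack` is a hypothetical helper name; the real module works with Xid objects returned by XAResource.recover.

```java
import java.util.Set;
import java.util.TreeSet;

final class DoubleScanSketch {

    /**
     * Returns the Xids that were in doubt on two successive scans and for which
     * transaction-initiated recovery found no JTA ResourceRecord in between.
     */
    static Set<String> xidsToRollBack(Set<String> previousScan,
                                      Set<String> currentScan,
                                      Set<String> xidsWithResourceRecord) {
        Set<String> candidates = new TreeSet<>(currentScan);
        candidates.retainAll(previousScan);           // must appear in both scans
        candidates.removeAll(xidsWithResourceRecord); // owned by a logged transaction
        return candidates;
    }

    public static void main(String[] args) {
        Set<String> previous = Set.of("x1", "x2", "x3");
        Set<String> current = Set.of("x2", "x3", "x4");
        Set<String> logged = Set.of("x3");
        System.out.println(xidsToRollBack(previous, current, logged));
    }
}
```

In this example only `x2` qualifies: `x1` was resolved between the scans, `x3` has a record and is handled by transaction-initiated recovery, and `x4` has been seen only once so it gets another interval before any action is taken.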

An XAResourceRecovery implementation class can be written to contain all the information needed to perform recovery to some datasource. Alternatively, a single class can handle multiple datasources. The constructor of the implementation class must have an empty parameter list (because it is loaded dynamically), but the interface includes an initialise method which passes in further information as a string. The content of the string is taken from the property value that provides the class name: everything after the first semi-colon is passed as the value of the string. The use made of this string is determined by the XAResourceRecovery implementation class.

For further details on the way to implement a class that implements the interface XAResourceRecovery, read the JDBC chapter of the JTA Programming Guide. An implementation class is provided that supports resource-initiated recovery for any XADataSource. This class could be used as a template to build your own implementation class.

If a failure occurs in the transaction environment after the transaction coordinator had told the XAResource to commit but before the transaction log has been updated to remove the participant, then recovery will attempt to replay the commit. In the case of a Serialized XAResource, the response from the XAResource will enable the participant to be removed from the log, which will eventually be deleted when all participants have been committed. However, if the XAResource is not recoverable then it is extremely unlikely that any XAResourceRecovery instance will be able to provide the recovery sub-system with a fresh XAResource to use in order to attempt recovery; in which case recovery will continually fail and the log entry will never be removed.

There are two possible solutions to this problem:

When recovering from failures, Narayana requires the ability to reconnect to databases that were in use prior to the failures in order to resolve any outstanding transactions. Most connection information will be saved by the transaction service during its normal execution, and can be used during recovery to recreate the connection. However, it is possible that not all such information will have been saved prior to a failure (for example, a failure occurs before such information can be saved, but after the database connection is used). In order to recreate those connections it is necessary to provide implementations of the following Narayana interface com.arjuna.ats.jta.recovery.XAResourceRecovery, one for each database that may be used by an application.

To inform the recovery system about each of the XAResourceRecovery instances, it is necessary to specify their class names through the JTAEnvironmentBean.xaResourceRecoveryInstances property, whose value is a list of space-separated strings, each being a class name followed by optional configuration information.

JTAEnvironmentBean.xaResourceRecoveryInstances=com.foo.barRecovery

Additional information that will be passed to the instance when it is created may be specified after a semicolon:

JTAEnvironmentBean.xaResourceRecoveryInstances=com.foo.barRecovery;myData=hello

Any errors will be reported during recovery.
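To make the property format concrete, the sketch below splits such a value into class names and their optional configuration strings. It illustrates the format only and is not Narayana's parser; `parse` is a hypothetical helper, `com.foo.otherRecovery` is an invented class name, and representing a missing configuration as an empty string is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RecoveryInstancesParserSketch {

    /** Maps each class name to the text after its first semicolon ("" if there is none). */
    static Map<String, String> parse(String propertyValue) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String item : propertyValue.trim().split("\\s+")) {
            if (item.isEmpty()) continue; // blank property value
            int semi = item.indexOf(';');
            if (semi < 0) {
                result.put(item, "");
            } else {
                // Only the first semicolon separates; later ones belong to the configuration.
                result.put(item.substring(0, semi), item.substring(semi + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("com.foo.barRecovery;myData=hello com.foo.otherRecovery"));
    }
}
```

Because the result is keyed by class name, this sketch keeps only the last entry if the same class is listed twice; a list of pairs would be needed to preserve duplicates.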


Each method should return the following information:

  • initialise: once the instance has been created, any additional information that appeared in the property value (anything found after the first semi-colon) will be passed to the object. The object can then use this information in an implementation-specific manner, for example to initialise itself.

  • hasMoreResources: each XAResourceRecovery implementation may provide multiple XAResource instances. Before any call to getXAResource is made, hasMoreResources is called to determine whether there are any further connections to be obtained. If this returns false, getXAResource will not be called again during this recovery sweep and the instance will not be used further until the next recovery scan. It is up to the implementation to maintain the internal state backing this method and to reset the iteration as required. Failure to do so will mean that the second and subsequent recovery sweeps in the lifetime of the JVM do not attempt recovery.

  • getXAResource: returns an instance of the XAResource object. How this is created (and how the parameters to its constructors are obtained) is up to the XAResourceRecovery implementation. The parameters to the constructors of this class should be similar to those used when creating the initial driver or data source, and should obviously be sufficient to create new XAResources that can be used to drive recovery.

Note

If you want your XAResourceRecovery instance to be called during each sweep of the recovery manager then you should ensure that once hasMoreResources returns false to indicate the end of work for the current scan it then returns true for the next recovery scan.
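One way to satisfy the reset requirement is to rewind the iteration at the moment hasMoreResources reports the end of a sweep. The sketch below uses a locally declared interface with the same three method names, with a type parameter standing in for XAResource so that the example stays self-contained; `ListBacked` and `sweep` are hypothetical names, and the comma-separated configuration format is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ResettingRecoverySketch {

    /** Local stand-in with the three method names from the text; R stands in for XAResource. */
    interface ResourceRecovery<R> {
        boolean initialise(String config);
        boolean hasMoreResources();
        R getXAResource();
    }

    /** Hands out a fixed list of resources once per sweep, then rewinds. */
    static final class ListBacked implements ResourceRecovery<String> {
        private List<String> resources = new ArrayList<>();
        private int next = 0;

        @Override public boolean initialise(String config) {
            // Assumption: the configuration string is a comma-separated list of resource names.
            resources = new ArrayList<>(Arrays.asList(config.split(",")));
            next = 0;
            return true;
        }

        @Override public boolean hasMoreResources() {
            if (next < resources.size()) return true;
            next = 0;      // end of this sweep: rewind so the next sweep starts over
            return false;
        }

        @Override public String getXAResource() { return resources.get(next++); }
    }

    /** Mimics how one recovery sweep would drain the instance. */
    static List<String> sweep(ResourceRecovery<String> recovery) {
        List<String> seen = new ArrayList<>();
        while (recovery.hasMoreResources()) seen.add(recovery.getXAResource());
        return seen;
    }

    public static void main(String[] args) {
        ListBacked recovery = new ListBacked();
        recovery.initialise("db1,db2");
        System.out.println(sweep(recovery) + " then " + sweep(recovery));
    }
}
```

Without the rewind in hasMoreResources, the second call to `sweep` would return an empty list, which is the failure mode described above.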

The iterator based approach used by XAResourceRecovery leads to a requirement for implementations to manage state, which makes them more complex than necessary.

As an alternative, starting with Narayana 4.4, users may provide an implementation of the public interface XAResourceRecoveryHelper.


During each recovery sweep the getXAResources method will be called and recovery attempted on each element of the array. For the majority of resource managers it will be necessary to have only one XAResource in the array, as the recover() call on it can return multiple Xids.

Unlike XAResourceRecovery instances, which are configured via the XML properties file and instantiated by Narayana, instances of XAResourceRecoveryHelper are constructed by the application code and registered with Narayana by calling

XARecoveryModule.addXAResourceRecoveryHelper(...)

The initialize method is not called by Narayana in the current implementation, but is provided to allow for the addition of further configuration options in later releases.

XAResourceRecoveryHelper instances may be deregistered, after which they will no longer be called by the recovery manager. Deregistration may block for a time if a recovery scan is in progress.

XARecoveryModule.removeXAResourceRecoveryHelper(...)

The ability to dynamically add and remove instances of XAResourceRecoveryHelper whilst the system is running makes this approach an attractive option for environments in which e.g. datasources may be deployed or undeployed, such as application servers. Care should be taken with classloading behaviour in such cases.
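For contrast with the iterator style, a helper needs no state between sweeps. The sketch below assumes a local stand-in interface, RecoveryHelperSketch, with the two methods discussed above; the method spelling, the class name SingleResourceHelper and the Supplier used to create the resource are all assumptions made for illustration, and registration with XARecoveryModule is deliberately left out.

```java
import java.util.function.Supplier;
import javax.transaction.xa.XAResource;

// Local stand-in mirroring the two methods discussed above.
// It is NOT the Narayana XAResourceRecoveryHelper interface itself.
interface RecoveryHelperSketch {
    boolean initialise(String parameter) throws Exception;
    XAResource[] getXAResources() throws Exception;
}

// Stateless helper: every sweep simply receives a fresh one-element array.
class SingleResourceHelper implements RecoveryHelperSketch {
    private final Supplier<XAResource> factory; // assumed way to open a recovery connection

    SingleResourceHelper(Supplier<XAResource> factory) {
        this.factory = factory;
    }

    @Override
    public boolean initialise(String parameter) {
        return true; // nothing to configure in this sketch
    }

    @Override
    public XAResource[] getXAResources() {
        // One XAResource is usually enough, because recover() on it
        // can return many Xids.
        return new XAResource[] { factory.get() };
    }
}
```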

Recovery of XA datasources can sometimes be implementation dependent, requiring developers to provide their own XAResourceRecovery instances. However, Narayana ships with several out-of-the-box implementations that may be useful.

Because these classes are XAResourceRecovery instances they are passed any necessary initialization information via the initialise operation. In the case of BasicXARecovery and JDBCXARecovery this should be the location of a property file and is specified in the Narayana configuration file. For example:

com.arjuna.ats.jta.recovery.XAResourceRecoveryJDBC=com.arjuna.ats.internal.jdbc.recovery.JDBCXAResourceRecovery;thePropertyFile

When the Recovery Manager initialises, an expiry scanner thread, ExpiryEntryMonitor, is created to remove long-dead items from the ObjectStore. A number of scanner modules are dynamically loaded, each of which removes long-dead items of a particular type.

Scanner modules are loaded at initialisation and are specified by the property:

<entry key="RecoveryEnvironmentBean.expiryScanners"> 
  list of class names
</entry>

All the scanner modules are called periodically to scan for dead items by the ExpiryEntryMonitor thread. This period is set with the property:

<entry key="RecoveryEnvironmentBean.expiryScanInterval"> 
  number_of_hours
</entry>

All scanners inherit the same behaviour from the Java interface ExpiryScanner. This interface provides a scan method, which every scanner module implements and which the scanner thread calls.

The ExpiredTransactionStatusManagerScanner removes long dead TransactionStatusManagerItems from the Object Store. These items will remain in the Object Store for a period of time before they are deleted. This time is set by the property:

<entry key="RecoveryEnvironmentBean.transactionStatusManagerExpiryTime"> 
  number_of_hours
</entry> (default 12 hours)

The AtomicActionExpiryScanner moves transaction logs for AtomicActions that are assumed to have completed. For instance, if a failure occurs after a participant has been told to commit but before the transaction system can update the log, then upon recovery Narayana recovery will attempt to replay the commit request, which will obviously fail, thus preventing the log from being removed. This is also used when logs cannot be recovered automatically for other reasons, such as being corrupt or zero length. All logs are moved to a location based on the old location appended with /Expired.

The use of TCP/IP sockets for TransactionStatusManager and RecoveryManager provides for maximum flexibility in the deployment architecture. It is often desirable to run the RecoveryManager in a separate JVM from the Transaction manager(s) for increased reliability. In such deployments, TCP/IP provides for communication between the RecoveryManager and transaction manager(s), as detailed in the preceding sections. Specifically, each JVM hosting a TransactionManager will run a TransactionStatusManager listener, through which the RecoveryManager can contact it to determine if a transaction is still live or not. The RecoveryManager likewise listens on a socket, through which it can be contacted to perform recovery scans on demand. The presence of a recovery listener is also used as a safety check when starting a RecoveryManager, since at most one should be running for a given ObjectStore.

There are some deployment scenarios in which there is only a single TransactionManager accessing the ObjectStore and the RecoveryManager is co-located in the same JVM. For such cases the use of TCP/IP sockets for communication introduces unnecessary runtime overhead. Additionally, if several such distinct processes are needed for e.g. replication or clustering, management of the TCP/IP port allocation can become unwieldy. Therefore it may be desirable to configure for socketless recovery operation.

The property CoordinatorEnvironmentBean.transactionStatusManagerEnable can be set to a value of NO to disable the TransactionStatusManager for any given TransactionManager. Note that this must not be done if recovery runs in a separate process, as it may lead to incorrect recovery behavior in such cases. For an in-process recovery manager, the system will use direct access to the ActionStatusService instead.

The property RecoveryEnvironmentBean.recoveryListener can likewise be used to disable the TCP/IP socket listener used by the recovery manager. Care must be taken not to inadvertently start multiple recovery managers for the same ObjectStore, as this error, which may lead to significant crash recovery problems, cannot be automatically detected and prevented without the benefit of the socket listener.
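As a sketch, a single JVM that hosts both the transaction manager and an in-process recovery manager might combine the two settings as shown below. The value NO for recoveryListener is an assumption made by analogy with transactionStatusManagerEnable, so check it against the configuration reference for your release before relying on it.

```xml
<!-- Sketch only: socket-free operation for a co-located recovery manager. -->
<entry key="CoordinatorEnvironmentBean.transactionStatusManagerEnable">NO</entry>
<!-- Assumed value, by analogy with the entry above. -->
<entry key="RecoveryEnvironmentBean.recoveryListener">NO</entry>
```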

On each resource registration, a RecoveryCoordinator object is expected to be created and returned to the application that invoked the register_resource operation. Behind each CORBA object there should be an object implementation, or servant in POA terms, which performs the operations invoked on a RecoveryCoordinator object. Rather than creating a RecoveryCoordinator object with its associated servant on each register_resource, Narayana improves performance by avoiding the creation of servants: it relies on a default RecoveryCoordinator object, with its associated default servant, to manage all replay_completion invocations.

In the next sections we first give an overview of the Portable Object Adapter architecture, then we describe how this architecture is used to provide RecoveryCoordinator creation with optimization as explained above.

The Portable Object Adapter, or POA, is an object that intercepts a client request and identifies the object that satisfies it. That object is then invoked and the response is returned to the client.


The object that performs the client request is referred to as a servant, which provides the implementation of the CORBA object requested by the client. A servant provides the implementation for one or more CORBA object references. To retrieve a servant, each POA maintains an Active Object Map that maps all objects that have been activated in the POA to a servant. For each incoming request, the POA looks up the object reference in the Active Object Map and tries to find the responsible servant. If none is found, the request is either delegated to a default servant, or a servant manager is invoked to activate or locate an appropriate servant. In addition to the name space for the objects, which are identified by Object Ids, a POA also provides a name space for POAs. A POA is created as a child of an existing POA, which forms a hierarchy starting with the root POA.
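The lookup order described above can be modelled in a few lines. This is a toy model for illustration only: PoaDispatchSketch and its methods are hypothetical names, servants are reduced to functions of the Object Id and a request string, and servant managers are omitted. It does not use the CORBA API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Toy model of POA request dispatch: Active Object Map first,
// default servant second. Names are illustrative, not the CORBA API.
class PoaDispatchSketch {
    // A "servant" here is just a function of (objectId, request) -> reply.
    private final Map<String, BiFunction<String, String, String>> activeObjectMap = new HashMap<>();
    private BiFunction<String, String, String> defaultServant; // may be null

    void activate(String objectId, BiFunction<String, String, String> servant) {
        activeObjectMap.put(objectId, servant);
    }

    void setDefaultServant(BiFunction<String, String, String> servant) {
        this.defaultServant = servant;
    }

    String dispatch(String objectId, String request) {
        BiFunction<String, String, String> servant = activeObjectMap.get(objectId);
        if (servant == null) {
            servant = defaultServant; // no explicit activation: fall back
        }
        if (servant == null) {
            throw new IllegalStateException("no servant for object id " + objectId);
        }
        return servant.apply(objectId, request);
    }
}
```

Because the default servant receives the Object Id, a single servant can answer for any number of object references that were never individually activated.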

Each POA has a set of policies that define its characteristics. When creating a new POA, the default set of policies can be used or different values can be assigned that suit the application requirements. The POA specification defines:

  • Thread policy – Specifies the threading model to be used by the POA. Possible values are:

    • ORB_CTRL_MODEL – (default) The POA is responsible for assigning requests to threads.

    • SINGLE_THREAD_MODEL – the POA processes requests sequentially

  • Lifespan policy - specifies the lifespan of the objects implemented in the POA. The lifespan policy can have the following values:

    • TRANSIENT (Default) Objects implemented in the POA cannot outlive the process in which they are first created. Once the POA is deactivated, an OBJECT_NOT_EXIST exception occurs when attempting to use any object references generated by the POA.

    • PERSISTENT Objects implemented in the POA can outlive the process in which they are first created.

  • Object ID Uniqueness policy - allows a single servant to be shared by many abstract objects. The Object ID Uniqueness policy can have the following values:

    • UNIQUE_ID (Default) Activated servants support only one Object ID.

    • MULTIPLE_ID Activated servants can have one or more Object IDs. The Object ID must be determined within the method being invoked at run time.

  • ID Assignment policy - specifies whether object IDs are generated by server applications or by the POA. The ID Assignment policy can have the following values:

    • USER_ID is for persistent objects, and

    • SYSTEM_ID is for transient objects

  • Servant Retention policy - specifies whether the POA retains active servants in the Active Object Map. The Servant Retention policy can have the following values:

    • RETAIN (Default) The POA tracks object activations in the Active Object Map. RETAIN is usually used with ServantActivators or explicit activation methods on POA.

    • NON_RETAIN The POA does not retain active servants in the Active Object Map. NON_RETAIN is typically used with ServantLocators.

  • Request Processing policy - specifies how requests are processed by the POA.

    • USE_ACTIVE_OBJECT_MAP (Default) If the Object ID is not listed in the Active Object Map, an OBJECT_NOT_EXIST exception is returned. The POA must also use the RETAIN policy with this value.

    • USE_DEFAULT_SERVANT If the Object ID is not listed in the Active Object Map or the NON_RETAIN policy is set, the request is dispatched to the default servant. If no default servant has been registered, an OBJ_ADAPTER exception is returned. The POA must also use the MULTIPLE_ID policy with this value.

    • USE_SERVANT_MANAGER If the Object ID is not listed in the Active Object Map or the NON_RETAIN policy is set, the servant manager is used to obtain a servant.

  • Implicit Activation policy - specifies whether the POA supports implicit activation of servants. The Implicit Activation policy can have the following values:

    • IMPLICIT_ACTIVATION The POA supports implicit activation of servants. Servants can be activated by converting them to an object reference with org.omg.PortableServer.POA.servant_to_reference() or by invoking _this() on the servant. The POA must also use the SYSTEM_ID and RETAIN policies with this value.

    • NO_IMPLICIT_ACTIVATION (Default) The POA does not support implicit activation of servants.

To redirect replay_completion invocations to a default servant, we need to create a POA whose Request Processing policy is set to USE_DEFAULT_SERVANT. However, to reach that default servant we must first reach the POA that forwards the request to it. The ORB retrieves a POA using information contained in the object reference used by the client. This information includes the IP address and port number of the server, and the POA name. To perform replay_completion invocations, the solution adopted by Narayana is to provide one servant per machine, located in the RecoveryManager process, which is separate from the client and server application processes. The next section explains how the indirection to a default servant located in a separate process is provided for JacORB.

JacORB does not define additional policies to redirect any request on a RecoveryCoordinator object to a default servant located in the Recovery Manager process. However it provides a set of APIs that allows building object references with specific IP address, port number and POA name in order to reach the appropriate default servant.

When the Recovery Manager is launched, it looks in the configuration for the RecoveryActivator implementations that need to be loaded. It then invokes the startRCservice method of each loaded instance. As seen in the previous chapter (Recovery Manager), the class that implements the RecoveryActivator interface is RecoveryEnablement. This generic class, located in the package com.arjuna.ats.internal.jts.orbspecific.recovery, hides the nature of the ORB being used by the application (JacORB). The following figure illustrates the behavior of the RecoveryActivator that leads to the creation of the default servant that performs replay_completion invocations.

In addition to the creation of the default servant, an object reference to a RecoveryCoordinator object is created and stored in the ObjectStore. As we will see this object reference will be used to obtain its IP address, port number and POA name and assign them to any RecoveryCoordinator object reference created on register_resource.


When an application registers a resource with a transaction, a RecoveryCoordinator object reference is expected to be returned. To build that object reference, the Transaction Service uses the RecoveryCoordinator object reference created within the Recovery Manager as a template. The new object reference contains practically the same information to retrieve the default servant (IP address, port number, POA name, etc.), but the Object ID is changed; now, it contains the Transaction ID of the transaction in progress and also the Process ID of the process that is creating the new RecoveryCoordinator object reference, as illustrated in Figure 11.


Since a RecoveryCoordinator object reference returned to an application contains all the information needed to retrieve first the POA and then the default servant located in the Recovery Manager, all replay_completion invocations on a machine are forwarded to the same default RecoveryCoordinator. This servant retrieves the Object ID from the incoming request and extracts from it the transaction identifier and the process identifier needed to determine the status of the requested transaction.
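The idea of carrying both identifiers inside the Object ID can be illustrated with a simple encoding. The layout below, with a '*' separator and UTF-8 text, is purely an assumption chosen for readability. It is not Narayana's actual Object ID format, and RecoveryObjectIdSketch is a hypothetical class.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: the separator and layout are assumptions,
// not the format Narayana uses for RecoveryCoordinator Object IDs.
class RecoveryObjectIdSketch {
    private static final char SEPARATOR = '*';

    // Packs the two identifiers into the bytes of an Object ID.
    static byte[] encode(String transactionId, String processId) {
        return (transactionId + SEPARATOR + processId).getBytes(StandardCharsets.UTF_8);
    }

    // What a default servant would do with the Object ID of an incoming
    // request: returns { transactionId, processId }.
    static String[] decode(byte[] objectId) {
        String text = new String(objectId, StandardCharsets.UTF_8);
        int split = text.lastIndexOf(SEPARATOR);
        if (split < 0) {
            throw new IllegalArgumentException("not a recovery object id: " + text);
        }
        return new String[] { text.substring(0, split), text.substring(split + 1) };
    }
}
```

Since every reference shares the same host, port and POA name and differs only in these bytes, one servant can serve all of them without per-transaction state.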

Narayana is highly configurable. For full details of the configuration mechanism used, see the Programmer's Guide.

The following table shows the configuration features. The default value is listed first in the Possible Values column. More details about each option can be found in the relevant sections of this document.

Configuration Name                       Possible Values                Description

...periodicRecoveryPeriod                120 / any positive integer     Interval between recovery attempts, in seconds.

...recoveryBackoffPeriod                 10 / any positive integer      Interval between first and second recovery passes, in seconds.

...periodicRecoveryInitilizationOffset   0 / any non-negative integer   Interval before first recovery pass, in seconds.

...expiryScanInterval                    12 / any integer               Interval between expiry scans, in hours. 0 disables scanning. Negative values postpone the first run.

...transactionStatusManagerExpiryTime    12 / any positive integer      Interval after which a non-contactable process is considered dead, in hours. 0 = never.

A transaction is a unit of work that encapsulates multiple database actions such that either all the encapsulated actions fail or all succeed.

Transactions ensure data integrity when an application interacts with multiple datasources.

The interfaces specified by the many transaction standards tend to be too low-level for most application programmers. Therefore, Sun Microsystems created the Java Transaction API (JTA), which specifies higher-level interfaces to assist in the development of distributed transactional applications.

Note, these interfaces are still low-level. You still need to implement state management and concurrency for transactional applications. The interfaces are also optimized for applications which require XA resource integration capabilities, rather than the more general resources which other transactional APIs allow.

With reference to JTA 1.1 ( http://www.oracle.com/technetwork/java/javaee/tech/jta-138684.html ), distributed transaction services typically involve a number of participants:

application server

provides the infrastructure required to support the application run-time environment, including transaction state management. An EJB server is one example.

transaction manager

provides the services and management functions required to support transaction demarcation, transactional resource management, synchronization, and transaction context propagation.

resource manager

provides the application with access to resources, using a resource adapter. The resource manager participates in distributed transactions by implementing a transaction resource interface used by the transaction manager to communicate transaction association, transaction completion and recovery.

A resource adapter is used by an application server or client to connect to a Resource Manager. JDBC drivers which are used to connect to relational databases are examples of Resource Adapters.

communication resource manager

supports transaction context propagation and access to the transaction service for incoming and outgoing requests.

From the point of view of the transaction manager, the actual implementation of the transaction services does not need to be exposed. You only need to define high-level interfaces to allow transaction demarcation, resource enlistment, synchronization and recovery process to be driven from the users of the transaction services. The JTA is a high-level application interface that allows a transactional application to demarcate transaction boundaries, and also contains a mapping of the X/Open XA protocol.

Compatibility

The JTA support provided by Narayana is compliant with the 1.1 specification.

The TransactionManager interface allows the application server to control transaction boundaries on behalf of the application being managed.

To obtain a TransactionManager, invoke the static method com.arjuna.ats.jta.TransactionManager.transactionManager.

The TransactionManager maintains the transaction context association with threads as part of its internal data structure. A thread’s transaction context may be null or it may refer to a specific global transaction. Multiple threads may be associated with the same global transaction. As noted in Section 3.1.3, “UserTransaction”, nested transactions are not supported.

Each transaction context is encapsulated by a Transaction object, which can be used to perform operations which are specific to the target transaction, regardless of the calling thread’s transaction context.


In a multi-threaded environment, multiple threads may be active within the same transaction. If checked transaction semantics have been disabled, or the transaction times out, a transaction may be terminated by a thread other than the one that created it. In this case, the creator usually needs to be notified. Narayana notifies the creator during the commit or rollback operation by throwing an IllegalStateException.

The JTA supports the concept of a thread temporarily suspending and resuming transactions in order to perform non-transactional work. Call the suspend method to temporarily suspend the current transaction that is associated with the calling thread. The thread then operates outside of the scope of the transaction. If the thread is not associated with any transaction, a null object reference is returned. Otherwise, a valid Transaction object is returned. Pass the Transaction object to the resume method to reinstate the transaction context.

The resume method associates the specified transaction context with the calling thread. If the transaction specified is not a valid transaction, the thread is associated with no transaction. If resume is invoked when the calling thread is already associated with another transaction, an IllegalStateException is thrown.


Note

Narayana allows a suspended transaction to be resumed by a different thread. This feature is not required by JTA, but is an important feature.
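The thread association rules above can be illustrated with a toy model. This sketch is not the JTA API and not Narayana's implementation: ThreadAssociationSketch and Tx are hypothetical names, and the model only tracks which transaction object a thread is currently associated with.

```java
// Toy model of per-thread transaction association.
// Names are hypothetical; this is not the JTA or Narayana API.
class ThreadAssociationSketch {
    static final class Tx {
        final String name;
        Tx(String name) { this.name = name; }
    }

    private final ThreadLocal<Tx> association = new ThreadLocal<>();

    // Associates a new transaction with the calling thread (toy behaviour).
    void begin(String name) {
        if (association.get() != null) {
            throw new IllegalStateException("thread already has a transaction");
        }
        association.set(new Tx(name));
    }

    // Detaches and returns the current transaction, or null if there is none.
    Tx suspend() {
        Tx tx = association.get();
        association.remove();
        return tx;
    }

    // Re-attaches a previously suspended transaction to the calling thread.
    void resume(Tx tx) {
        if (association.get() != null) {
            throw new IllegalStateException("thread is already associated with a transaction");
        }
        if (tx != null) {
            association.set(tx);
        }
    }

    Tx current() {
        return association.get();
    }
}
```

Because the suspended transaction is handed back as an ordinary object, nothing in this model ties it to the thread that suspended it, which mirrors the cross-thread resume that Narayana permits.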

When a transaction is suspended, the application server must ensure that the resources in use by the application are no longer registered with the suspended transaction. When a resource is de-listed this triggers the Transaction Manager to inform the resource manager to disassociate the transaction from the specified resource object. When the application’s transaction context is resumed, the application server must ensure that the resources in use by the application are again enlisted with the transaction. Enlisting a resource as a result of resuming a transaction triggers the Transaction Manager to inform the resource manager to re-associate the resource object with the resumed transaction.

Typically, an application server manages transactional resources, such as database connections, in conjunction with some resource adapter and optionally with connection pooling optimization. For an external transaction manager to coordinate transactional work performed by the resource managers, the application server must enlist and de-list the resources used in the transaction. These resources, called participants, are enlisted with the transaction so that they can be informed when the transaction terminates, by being driven through the two-phase commit protocol.

As stated previously, the JTA is much more closely integrated with the XA concept of resources than with arbitrary objects. For each resource the application is using, the application server invokes the enlistResource method with an XAResource object which identifies the resource in use.

The enlistment request causes the transaction manager to inform the resource manager to start associating the transaction with the work performed through the corresponding resource. The transaction manager passes the appropriate flag in its XAResource.start method call to the resource manager.

The delistResource method disassociates the specified resource from the transaction context in the target object. The application server invokes the method with two parameters: the XAResource object that represents the resource, and a flag to indicate whether the operation is due to the transaction being suspended (TMSUSPEND), a portion of the work having failed (TMFAIL), or a normal resource release by the application (TMSUCCESS).

The de-list request causes the transaction manager to inform the resource manager to end the association of the transaction with the target XAResource. The flag value allows the application server to indicate whether it intends to come back to the same resource, in which case the resource state must be kept intact. The transaction manager passes the appropriate flag value in its XAResource.end method call to the underlying resource manager.
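The flag traffic described above can be observed with a resource that only records its start and end calls. The sketch below uses the standard javax.transaction.xa types. RecordingXAResource, SimpleXid and EnlistmentFlagsDemo are hypothetical names, and the call sequence in run() is one plausible sequence that a transaction manager might issue for an enlist, suspend, resume and release cycle. It is not a trace of Narayana itself.

```java
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Records association calls; performs no real transactional work.
class RecordingXAResource implements XAResource {
    final List<String> calls = new ArrayList<>();

    @Override public void start(Xid xid, int flags) { calls.add("start:" + name(flags)); }
    @Override public void end(Xid xid, int flags) { calls.add("end:" + name(flags)); }
    @Override public int prepare(Xid xid) { return XA_OK; }
    @Override public void commit(Xid xid, boolean onePhase) { }
    @Override public void rollback(Xid xid) { }
    @Override public void forget(Xid xid) { }
    @Override public Xid[] recover(int flag) { return new Xid[0]; }
    @Override public boolean isSameRM(XAResource other) { return other == this; }
    @Override public int getTransactionTimeout() { return 0; }
    @Override public boolean setTransactionTimeout(int seconds) { return false; }

    private static String name(int flags) {
        switch (flags) {
            case TMNOFLAGS: return "TMNOFLAGS";
            case TMJOIN:    return "TMJOIN";
            case TMRESUME:  return "TMRESUME";
            case TMSUSPEND: return "TMSUSPEND";
            case TMSUCCESS: return "TMSUCCESS";
            case TMFAIL:    return "TMFAIL";
            default:        return "0x" + Integer.toHexString(flags);
        }
    }
}

// Minimal Xid with fixed contents, sufficient for the demonstration.
class SimpleXid implements Xid {
    @Override public int getFormatId() { return 1; }
    @Override public byte[] getGlobalTransactionId() { return new byte[] { 1 }; }
    @Override public byte[] getBranchQualifier() { return new byte[] { 1 }; }
}

class EnlistmentFlagsDemo {
    static List<String> run() {
        RecordingXAResource resource = new RecordingXAResource();
        Xid xid = new SimpleXid();
        resource.start(xid, XAResource.TMNOFLAGS); // enlistResource: first association
        resource.end(xid, XAResource.TMSUSPEND);   // delistResource with TMSUSPEND
        resource.start(xid, XAResource.TMRESUME);  // re-enlistment after resume
        resource.end(xid, XAResource.TMSUCCESS);   // delistResource with TMSUCCESS
        return resource.calls;
    }
}
```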

Transaction synchronization allows the application server to be notified before and after the transaction completes. For each transaction started, the application server may optionally register a Synchronization callback object to be invoked by the transaction manager. The callback object provides the following methods:

beforeCompletion

Called before the start of the two-phase commit process. This call is executed in the transaction context of the caller who initiates TransactionManager.commit, or with no transaction context if Transaction.commit is used.

afterCompletion

Called after the transaction completes. The status of the transaction is supplied in the parameter. This method is executed without a transaction context.

NOTE: If a JTA XAResource throws a RuntimeException, this method will not be called as the transaction has not and cannot complete. Please see JBTM-2148 for more details.

The javax.transaction.TransactionSynchronizationRegistry interface, added to the JTA API in version 1.1, provides for registering Synchronizations with special ordering behavior, and for storing key-value pairs in a per-transaction Map. Full details are available from the JTA 1.1 API specification and javadoc. Here we focus on implementation specific behavior.


Accessing the TransactionSynchronizationRegistry via JNDI.  In application server environments, the standard JNDI name binding is java:comp/TransactionSynchronizationRegistry.

Ordering of interposed Synchronizations is relative to other local Synchronizations only. In cases where the transaction is distributed over multiple JVMs, global ordering is not guaranteed.

The per-transaction data storage provided by the TransactionSynchronizationRegistry methods getResource and putResource is non-persistent and thus not available in Transactions during crash recovery. When running integrated with an application server or other container, this storage may be used for system purposes. To avoid collisions, use an application-specific prefix on map keys, such as putResource("myapp_" + key, value). The behavior of the Map on Threads that have status NO_TRANSACTION, or where the transaction they are associated with has been rolled back by another Thread, such as in the case of a timeout, is undefined. A Transaction can be associated with multiple Threads. For such cases the Map is synchronized to provide thread safety.

Whereas some transaction specifications and systems define a generic resource which can be used to register arbitrary resources with a transaction, the JTA is much more XA-specific. Interface javax.transaction.xa.XAResource is a Java mapping of the XA interface. The XAResource interface defines the contract between a ResourceManager and a TransactionManager in a distributed transaction processing environment. A resource adapter for a ResourceManager implements the XAResource interface to support association of a top-level transaction to a resource such as a relational database.

The XAResource interface can be supported by any transactional resource adapter, such as one for a database management system, that is designed to be used in an environment where transactions are controlled by an external transaction manager. An application may access data through multiple database connections. Each database connection is associated with an XAResource object that serves as a proxy object to the underlying ResourceManager instance. The transaction manager obtains an XAResource for each ResourceManager participating in a top-level transaction. The start method associates the transaction with the resource, and the end method disassociates the transaction from the resource.

The ResourceManager associates the transaction with all work performed on its data between invocation of start and end methods. At transaction commit time, these transactional ResourceManagers are informed by the transaction manager to prepare, commit, or roll back the transaction according to the two-phase commit protocol.

For better Java integration, the XAResource differs from the standard XA interface in the following ways:

By default, whenever an XAResource object is registered with a JTA-compliant transaction service, there is no way to manipulate the order in which it is invoked during the two-phase commit protocol, with respect to other XAResource objects. Narayana, however, provides support for controlling the order via the two interfaces com.arjuna.ats.jta.resources.StartXAResource and com.arjuna.ats.jta.resources.EndXAResource. By implementing either of these interfaces in your XAResource class, you control whether an instance of your class is invoked first or last, respectively.

The ArjunaCore Development Guide discusses the Last Resource Commit optimization (LRCO), whereby a single resource that is only one-phase aware, and does not support the prepare phase, can be enlisted with a transaction that is manipulating two-phase aware participants. This optimization is also supported within Narayana.

In order to use the LRCO, your XAResource implementation must implement the com.arjuna.ats.jta.resources.LastResourceCommitOptimisation marker interface. A marker interface is an interface which provides no methods. When enlisting the resource via method Transaction.enlistResource, Narayana ensures that only a single instance of this type of participant is used within each transaction. Your resource is driven last in the commit protocol, and no invocation of method prepare occurs.
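The ordering that makes the LRCO work can be sketched with a toy coordinator: every two-phase participant prepares first, the one-phase resource then commits, and its outcome decides the fate of the prepared participants. The interfaces and classes below are hypothetical and exist only for this illustration. They are not Narayana types, and failure handling such as logging and recovery is omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative participant interfaces; not Narayana's.
interface TwoPhaseParticipant {
    boolean prepare(); // true = vote to commit
    void commit();
    void rollback();
}

interface OnePhaseParticipant {
    boolean commitOnePhase(); // true = committed, false = rolled back
    void rollback();
}

class LastResourceCoordinatorSketch {
    static boolean complete(List<? extends TwoPhaseParticipant> twoPhase, OnePhaseParticipant last) {
        List<TwoPhaseParticipant> prepared = new ArrayList<>();
        for (TwoPhaseParticipant p : twoPhase) {
            if (p.prepare()) {
                prepared.add(p);
            } else {
                // A participant that votes "no" is assumed to have rolled back
                // already; undo the ones that prepared and the last resource.
                prepared.forEach(TwoPhaseParticipant::rollback);
                last.rollback();
                return false;
            }
        }
        // Everyone else is prepared: the one-phase resource now decides the outcome.
        boolean committed = last.commitOnePhase();
        for (TwoPhaseParticipant p : prepared) {
            if (committed) { p.commit(); } else { p.rollback(); }
        }
        return committed;
    }
}

// Small recording implementations used to show the call order.
class LoggingParticipant implements TwoPhaseParticipant {
    private final String name; private final boolean vote; private final List<String> log;
    LoggingParticipant(String name, boolean vote, List<String> log) {
        this.name = name; this.vote = vote; this.log = log;
    }
    @Override public boolean prepare() { log.add(name + ".prepare"); return vote; }
    @Override public void commit() { log.add(name + ".commit"); }
    @Override public void rollback() { log.add(name + ".rollback"); }
}

class LoggingLastResource implements OnePhaseParticipant {
    private final String name; private final boolean outcome; private final List<String> log;
    LoggingLastResource(String name, boolean outcome, List<String> log) {
        this.name = name; this.outcome = outcome; this.log = log;
    }
    @Override public boolean commitOnePhase() { log.add(name + ".commitOnePhase"); return outcome; }
    @Override public void rollback() { log.add(name + ".rollback"); }
}
```

With two such one-phase resources there would be no single point at which the outcome is decided, which is why only one is normally allowed per transaction.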

By default, an attempt to enlist more than one instance of a LastResourceCommitOptimisation class fails, and Transaction.enlistResource returns false. This behavior can be overridden by setting the property com.arjuna.ats.jta.allowMultipleLastResources to true. However, before doing so you should read the section on enlisting multiple one-phase aware resources.

One-phase commit is used to process a single one-phase aware resource, which does not conform to the two-phase commit protocol. You can still achieve an atomic outcome across resources, by using the LRCO, as explained earlier.

Multiple one-phase-aware resources may be enlisted in the same transaction. One example is when a legacy database runs within the same transaction as a legacy JMS implementation. In such a situation, you cannot achieve atomicity of transaction outcome across multiple resources, because none of them enter the prepare state. They commit or roll back immediately when instructed by the transaction coordinator, without knowledge of other resource states and without a way to undo if subsequent resources make a different choice. This can result in data corruption or heuristic outcomes.

You can approach these situations in two different ways:

If neither of these options is viable, Narayana supports enlisting multiple one-phase aware resources within the same transaction, using LRCO, which is discussed in the ArjunaCore Development Guide in detail.

When the same transactional resource is used to interleave multiple transactions, the application server must ensure that only one transaction is enlisted with the resource at any given time. To initiate the transaction commit process, the transaction manager is allowed to use any of the resource objects connected to the same resource manager instance. The resource object used for the two-phase commit protocol does not need to have been involved with the transaction being completed.

The resource adapter must be able to handle multiple threads invoking the XAResource methods concurrently for transaction commit processing. This is illustrated in Example 3.4, “Resource sharing example”.


You can associate timeout values with transactions in order to control their lifetimes. If the timeout value elapses before a transaction terminates, by committing or rolling back, the transaction system rolls it back. The XAResource interface supports a setTransactionTimeout operation, which allows the timeout associated with the current transaction to be propagated to the resource manager and if supported, overrides any default timeout associated with the resource manager. Overriding the timeout can be useful when long-running transactions may have lifetimes that would exceed the default, and using the default timeout would cause the resource manager to roll back before the transaction terminates, and cause the transaction to roll back as well.

If you do not explicitly set a timeout value for a transaction, or you use a value of 0, an implementation-specific default value may be used. In Narayana, the CoordinatorEnvironmentBean.defaultTimeout property represents this implementation-specific default, in seconds. The default value is 60 seconds. A value of 0 disables default transaction timeouts.

Unfortunately, imposing the same timeout as the transaction on a resource manager is not always appropriate. One example is that your business rules may require you to have control over the lifetimes on resource managers without allowing that control to be passed to some external entity. Narayana supports an all-or-nothing approach to whether or not method setTransactionTimeout is called on XAResource instances.

If the JTAEnvironmentBean.xaTransactionTimeoutEnabled property is set to true, which is the default, setTransactionTimeout is called on all XAResource instances. Alternatively, use the setXATransactionTimeoutEnabled method of com.arjuna.ats.jta.common.Configuration to change this setting.
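The all-or-nothing propagation described above can be sketched with the standard javax.transaction.xa.XAResource interface. This is only an illustration, not Narayana's implementation: the RecordingResource stub, its 30-second default, and the propagate helper are hypothetical names introduced for this example.

```java
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

public class TimeoutPropagationSketch {

    /** Hypothetical stub resource that only records the timeout it is given. */
    public static class RecordingResource implements XAResource {
        private final int defaultSeconds = 30; // assumed resource manager default
        private int timeoutSeconds = defaultSeconds;

        @Override public boolean setTransactionTimeout(int seconds) {
            // Per the XAResource contract, 0 resets the timeout to the default.
            timeoutSeconds = (seconds == 0) ? defaultSeconds : seconds;
            return true; // true means the timeout was accepted
        }
        @Override public int getTransactionTimeout() { return timeoutSeconds; }
        @Override public void start(Xid xid, int flags) { }
        @Override public void end(Xid xid, int flags) { }
        @Override public int prepare(Xid xid) { return XA_OK; }
        @Override public void commit(Xid xid, boolean onePhase) { }
        @Override public void rollback(Xid xid) { }
        @Override public void forget(Xid xid) { }
        @Override public Xid[] recover(int flag) { return new Xid[0]; }
        @Override public boolean isSameRM(XAResource other) { return other == this; }
    }

    /**
     * Hypothetical helper: pushes the transaction timeout to every resource
     * when enabled, and to none otherwise. Returns how many accepted it.
     */
    public static int propagate(boolean enabled, int txTimeoutSeconds,
                                List<? extends XAResource> resources) {
        if (!enabled) {
            return 0;
        }
        int accepted = 0;
        for (XAResource resource : resources) {
            try {
                if (resource.setTransactionTimeout(txTimeoutSeconds)) {
                    accepted++;
                }
            } catch (XAException e) {
                // The resource refused the timeout; it keeps its own default.
            }
        }
        return accepted;
    }
}
```

With the flag disabled, the resource keeps its own default lifetime, which matches the case where business rules require the resource manager to retain control.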

Atomic actions (transactions) can be used by both application programmers and class developers. Thus entire operations (or parts of operations) can be made atomic as required by the semantics of a particular operation. This chapter will describe some of the more subtle issues involved with using transactions in general and ArjunaCore in particular.

Note: in the past ArjunaCore was also referred to as TxCore.

In a multi-threaded application, multiple threads may be associated with a transaction during its lifetime, sharing the context. In addition, it is possible that if one thread terminates a transaction, other threads may still be active within it. In a distributed environment, it can be difficult to guarantee that all threads have finished with a transaction when it is terminated. By default, ArjunaCore will issue a warning if a thread terminates a transaction when other threads are still active within it. However, it will allow the transaction termination to continue.

Other solutions to this problem are possible. One example would be to block the thread which is terminating the transaction until all other threads have disassociated themselves from the transaction context. Therefore, ArjunaCore provides the com.arjuna.ats.arjuna.coordinator.CheckedAction class, which allows the thread or transaction termination policy to be overridden. Each transaction has an instance of this class associated with it, and application programmers can provide their own implementations on a per transaction basis.


When a thread attempts to terminate the transaction and there are active threads within it, the system will invoke the check method on the transaction’s CheckedAction object. The parameters to the check method are:

isCommit

Indicates whether the transaction is in the process of committing or rolling back.

actUid

The transaction identifier.

list

A list of all of the threads currently marked as active within this transaction.

When check returns, the transaction termination will continue. Obviously the state of the transaction at this point may be different from that when check was called, e.g., the transaction may subsequently have been committed.

A CheckedAction instance is created for each transaction. As mentioned above, the default implementation simply issues warnings in the presence of multiple threads active on the transaction when it is terminated. However, a different instance can be provided to each transaction in one of the following ways:

  • Use the setCheckedAction method on the BasicAction instance.

  • Define an implementation of the CheckedActionFactory interface, which has a single method getCheckedAction(final Uid txId, final String actionType) that returns a CheckedAction. The factory class name can then be provided to the Transaction Service at runtime by setting the CoordinatorEnvironmentBean.checkedActionFactory property.
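As an illustration of the alternative policy mentioned earlier (blocking the terminating thread until the other threads are done), here is a minimal sketch. So that it compiles without Narayana on the classpath, it uses a hypothetical TerminationCheck interface that only mirrors the three parameters of the check method described above. It is not the real CheckedAction class, and it treats thread exit as a crude stand-in for a thread disassociating itself from the transaction.

```java
import java.util.Collection;
import java.util.concurrent.TimeUnit;

public class BlockingTerminationSketch {

    /** Hypothetical stand-in mirroring the parameters of the check method. */
    public interface TerminationCheck {
        void check(boolean isCommit, String actUid, Collection<Thread> activeThreads);
    }

    /** Waits, for a bounded time, until the other active threads have finished. */
    public static class WaitForThreads implements TerminationCheck {
        private final long maxWaitMillis;
        private int stillActive;

        public WaitForThreads(long maxWaitMillis) {
            this.maxWaitMillis = maxWaitMillis;
        }

        @Override
        public void check(boolean isCommit, String actUid, Collection<Thread> activeThreads) {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
            stillActive = 0;
            for (Thread thread : activeThreads) {
                if (thread == Thread.currentThread()) {
                    continue; // never wait for the terminating thread itself
                }
                long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                try {
                    if (remaining > 0) {
                        thread.join(remaining); // join(0) would wait forever, hence the guard
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                if (thread.isAlive()) {
                    stillActive++;
                }
            }
        }

        /** Number of threads that were still running when check returned. */
        public int stillActive() {
            return stillActive;
        }
    }
}
```

The bounded wait is a design choice for this sketch: an unbounded wait could hang termination indefinitely if a thread never finishes.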

By default, the Transaction Service does not maintain any history information about transactions. However, if you set the CoordinatorEnvironmentBean.enableStatistics property to YES, the Transaction Service maintains information about the number of transactions created and their outcomes. This information can be obtained during the execution of a transactional application via the com.arjuna.ats.arjuna.coordinator.TxStats class.


The class ActionManager gives further information about specific active transactions through the methods getTimeAdded, which returns the time (in milliseconds) when the transaction was created, and inflightTransactions, which returns the list of currently active transactions.

Narayana supports a number of different transaction log implementations. They are outlined below.

This chapter describes the various tools for managing transactions.

There are currently three command-line editors for manipulating the ObjectStore. These tools are used to manipulate the lists of heuristic participants maintained by a transaction log. They allow a heuristic participant to be moved from that list back to the list of prepared participants so that transaction recovery may attempt to resolve them automatically.

The WildFly Application Server provides a command-line Management CLI that supports browsing and manipulating transaction records. This functionality is provided by the interaction between the Transaction Manager (TM) and the Management API of the application server. To start the CLI on a non-Windows operating system, type the following command in the application server installation directory:

./bin/jboss-cli.sh --connect controller=IP_ADDRESS

On Windows platforms, use the jboss-cli.bat script.

The transaction manager stores information about each active transaction, and the participants involved in the transaction, in a persistent storage area called the object store. The Management API exposes the object store as a resource called the log-store. An API operation called probe reads the transaction logs and creates a node in the management model corresponding to each log. These nodes can be inspected using the CLI. Transaction logs are transient, so these nodes quickly become out of date, but you can call the probe command manually whenever you need to refresh the log-store.


Example 3.8. View All Prepared Transactions

To view all prepared transactions, first refresh the log store (see Example 3.7, “Refresh the Log Store” ), then run the following command, which functions similarly to a filesystem ls command.

ls /subsystem=transactions/log-store=log-store/transactions

Each transaction is shown, along with its unique identifier. Individual operations can be run against an individual transaction (see Manage a Transaction ).


Manage a Transaction

View a transaction's attributes.

To view information about a transaction, such as its JNDI name, EIS product name and version, or its status, use the :read-resource CLI command.

/subsystem=transactions/log-store=log-store/transactions=0\:ffff7f000001\:-b66efc2\:4f9e6f8f\:9:read-resource

View the participants of a transaction.

Each transaction log contains a child element called participants . Use the read-resource CLI command on this element to see the participants of the transaction. Participants are identified by their JNDI names (or some other unique identifier if the JNDI name is not available).

/subsystem=transactions/log-store=log-store/transactions=0\:ffff7f000001\:-b66efc2\:4f9e6f8f\:9/participants=java\:\/JmsXA:read-resource

The result may look similar to this:

{
   "outcome" => "success",
   "result" => {
       "eis-product-name" => "HornetQ",
       "eis-product-version" => "2.0",
       "jndi-name" => "java:/JmsXA",
       "status" => "HEURISTIC_HAZARD",
       "type" => "/StateManager/AbstractRecord/XAResourceRecord"
   }
}
                        

The participant shown here is in the HEURISTIC_HAZARD state and is eligible for recovery. Refer to “Recover a transaction participant” for more details.

Delete a transaction.

Each transaction log supports a :delete operation, to delete the transaction log representing the transaction.

/subsystem=transactions/log-store=log-store/transactions=0\:ffff7f000001\:-b66efc2\:4f9e6f8f\:9:delete

Warning

If failures occur, transaction logs may remain in the object store until crash recovery facilities have resolved the transactions they represent. Therefore, it is very important that the contents of the object store are not deleted inadvertently, as this will make it impossible to resolve in-doubt transactions. In addition, if multiple users share the same object store, they must understand that it is not an exclusive resource, and not delete transaction logs without careful consideration.

Delete a transaction participant.

Each transaction log participant supports a :delete operation which will delete the participant log that represents the participant:

/subsystem=transactions/log-store=log-store/transactions=0\:ffff7f000001\:-b66efc2\:4f9e6f8f\:9/participants=0\:ffff7f000001\:-f30b80c\:58480e0a\:2c:delete
                        

Warning

Normally you would leave participant log management to the transaction log that owns it or to the recovery system. However, this delete operation for participant logs is provided for those cases where you know it is safe to do so and, in the case of heuristically completed XA resources, you wish to trigger a forget call so that the XA resource vendors' logs are cleaned correctly. By default, if this forget call fails then the delete operation will still succeed. The system administrator may override this behaviour by setting a system property:

ObjectStoreEnvironmentBean.ignoreMBeanHeuristics

to the value false.

Recover a transaction participant.

Each transaction participant log may support recovery via the :recover CLI command if it is in a heuristic state.

Refresh the status of a transaction which needs recovery.

If a transaction needs recovery, you can use the :refresh CLI command to be sure it still requires recovery, before attempting the recovery.

/subsystem=transactions/log-store=log-store/transactions=0\:ffff7f000001\:-b66efc2\:4f9e6f8f\:9:refresh

Transaction logs may also be managed using JMX. Each transaction log record is instrumented as an MBean. Any JMX client may be used to manage logs using this mechanism.

The JMX MBean for the object store contains one operation and one attribute. The probe operation scans the object store, creating JMX MBeans for the various log records contained in the store. The default behaviour is to create MBeans only for particular record types. If you need to view everything in the store, set the ExposeAllRecordsAsMBeans attribute to true. Note that transaction logs are transient, so these beans quickly become out of date and are not refreshed automatically. Invoke the probe operation again to get the current list of MBeans.

MBeans can be queried using the standard JMX query mechanism. ObjectStore Object Names are in the format:

domain:key-property-list

where domain is jboss.jta and key-property-list is a comma separated list of key=value pairs.
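The following sketch shows how an object name pattern in this format behaves with the standard JMX query mechanism. It registers a hypothetical DummyRecord MBean in the platform MBean server under the jboss.jta domain purely for illustration. The name=example key and the DummyRecord class are assumptions made for this example and are not what Narayana registers.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectInstance;
import javax.management.ObjectName;

public class ObjectNameQuerySketch {

    /** Hypothetical management interface, used only to have something to register. */
    public interface DummyRecordMBean {
        String getStatus();
    }

    public static class DummyRecord implements DummyRecordMBean {
        @Override public String getStatus() { return "PREPARED"; }
    }

    /** Registers one dummy MBean, runs the pattern query, and cleans up again. */
    public static int countMatches(String pattern) {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        try {
            // domain:key-property-list, with an illustrative (assumed) second key
            ObjectName name = new ObjectName("jboss.jta:type=ObjectStore,name=example");
            mbs.registerMBean(new DummyRecord(), name);
            try {
                // A trailing ",*" matches any additional key=value pairs;
                // a null QueryExp applies no further filtering.
                Set<ObjectInstance> found = mbs.queryMBeans(new ObjectName(pattern), null);
                return found.size();
            } finally {
                mbs.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For instance, countMatches("jboss.jta:type=ObjectStore,*") finds the dummy bean, while a pattern with a different type value finds nothing.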




Manage a Transaction

View a transaction's attributes.

To view information about a transaction or a transaction participant, such as its JNDI name, EIS product name and version, or its status, use a JMX client or, alternatively, the JMX API:

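    // requires: import java.util.Set; import javax.management.*;

    // obtain connection to the MBean server
    MBeanServer mbs = ...

    // query all ObjectStore MBean instances (a null QueryExp matches everything)
    ObjectName on = new ObjectName("jboss.jta:type=ObjectStore,*");
    Set<ObjectInstance> transactions = mbs.queryMBeans(on, null);

    for (ObjectInstance oi : transactions) {
        // look up the attribute names of an ObjectInstance
        MBeanInfo info = mbs.getMBeanInfo(oi.getObjectName());
        MBeanAttributeInfo[] attributeArray = info.getAttributes();
        String[] attributeNames = new String[attributeArray.length];
        for (int i = 0; i < attributeArray.length; i++) {
            attributeNames[i] = attributeArray[i].getName();
        }

        // find the values of the attributes of an ObjectInstance
        AttributeList attributes = mbs.getAttributes(
                                oi.getObjectName(), attributeNames);
    }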
                        
View the participants of a transaction.

A transaction log may contain one or more participants which can be viewed as MBeans using a JMX client or programmatically as follows:

    ObjectInstance transaction = ... // an instance returned by the query above
    ObjectName on = transaction.getObjectName();
    String participantQuery = on + ",puid=*";
    Set<ObjectInstance> participants = mbs.queryMBeans(
                                new ObjectName(participantQuery), null);
                            

For example the attributes of an XAResource record might look similar to:

       "eis-product-name" => "HornetQ",
       "eis-product-version" => "2.0",
       "jndi-name" => "java:/JmsXA",
       "status" => "HEURISTIC_HAZARD",
       "type" => "/StateManager/AbstractRecord/XAResourceRecord"
                        

The status attribute shown in this example has the value HEURISTIC_HAZARD, so the participant is eligible for recovery. Refer to “Recover a transaction” for more details.

Delete a transaction or transaction participant.

MBeans for transaction logs and participants contain a remove operation. Invoke this MBean operation to remove the record from the ObjectStore.

Warning

If failures occur, transaction logs may remain in the object store until crash recovery facilities have resolved the transactions they represent. Therefore, it is very important that the contents of the object store are not deleted inadvertently, as this will make it impossible to resolve in-doubt transactions. In addition, if multiple users share the same object store, they must understand that it is not an exclusive resource, and not delete transaction logs without careful consideration.

Normally you would leave participant log management to the transaction log that owns it or to the recovery system. However, this remove operation for participant logs is provided for those cases where you know it is safe to do so and, in the case of heuristically completed XA resources, you wish to trigger a forget call so that the XA resource vendors' logs are cleaned correctly. By default, if this forget call fails then the delete operation will still succeed. The system administrator may override this behaviour by setting a system property:

ObjectStoreEnvironmentBean.ignoreMBeanHeuristics

to the value false.

Recover a transaction.

Transaction participants support recovery via the clearHeuristic operation.

If you are using the Transaction Manager (TM) inside the WildFly Application Server and the TM statistics are enabled, you can view statistics about the TM and transaction subsystem using tools provided by the application server.

You can view statistics either via the web-based Management Console or the command-line Management CLI. In the web-based Management Console, transaction statistics are available via Runtime > Subsystem Metrics > Transactions. Transaction statistics are also available for each server in a managed domain. You can specify the server in the Server selection box at the top left.

The following table shows each available statistic, its description, and the CLI command to view the statistic.

Table 3.5. Transaction Subsystem Statistics

Statistic: Total
Description: The total number of transactions processed by the TM on this server.
CLI Command: /subsystem=transactions/:read-attribute(name=number-of-transactions,include-defaults=true)

Statistic: Committed
Description: The number of committed transactions processed by the TM on this server.
CLI Command: /subsystem=transactions/:read-attribute(name=number-of-committed-transactions,include-defaults=true)

Statistic: Aborted
Description: The number of aborted transactions processed by the TM on this server.
CLI Command: /subsystem=transactions/:read-attribute(name=number-of-aborted-transactions,include-defaults=true)

Statistic: Timed Out
Description: The number of timed out transactions processed by the TM on this server.
CLI Command: /subsystem=transactions/:read-attribute(name=number-of-timed-out-transactions,include-defaults=true)

Statistic: Heuristics
Description: Not available in the Management Console. Number of transactions in a heuristic state.
CLI Command: /subsystem=transactions/:read-attribute(name=number-of-heuristics,include-defaults=true)

Statistic: In-Flight Transactions
Description: Not available in the Management Console. Number of transactions which have begun but not yet terminated.
CLI Command: /subsystem=transactions/:read-attribute(name=number-of-inflight-transactions,include-defaults=true)

Statistic: Failure Origin - Applications
Description: The number of failed transactions whose failure origin was an application.
CLI Command: /subsystem=transactions/:read-attribute(name=number-of-application-rollbacks,include-defaults=true)

Statistic: Failure Origin - Resources
Description: The number of failed transactions whose failure origin was a resource.
CLI Command: /subsystem=transactions/:read-attribute(name=number-of-resource-rollbacks,include-defaults=true)

Each module of the system contains a property manager class named after the module, which provides static getter methods for one or more EnvironmentBean classes. An example is com.arjuna.ats.arjuna.common.arjPropertyManager. These environment beans are standard JavaBeans containing properties for each configuration option in the system. Typical usage is of the form:

int defaultTimeout = 
    arjPropertyManager.getCoordinatorEnvironmentBean().getDefaultTimeout();

These beans are singletons, instantiated upon first access, using the following algorithm.

Procedure 3.2. Algorithm for environment bean instantiation

  1. The properties are loaded and populated from a properties file named and located as follows:

    1. If the properties file name property is set ( com.arjuna.ats.arjuna.common.propertiesFile ), its value is used as the file name.

    2. If not, the default file name is used.

  2. The file thus named is searched for by, in order

    1. absolute path

    2. user.dir

    3. user.home

    4. java.home

    5. directories contained on the classpath

    6. a default file embedded in the product .jar file.

  3. The file is treated as being of standard java.util.Properties XML format and loaded accordingly. The entry names are of the form EnvironmentBeanClass.propertyName: <entry key="CoordinatorEnvironmentBean.commitOnePhase">YES</entry> or EnvironmentBeanClass.<storeType>propertyName: <entry key="ObjectStoreEnvironmentBean.communicationStore.objectStoreType">com.arjuna.ats.internal.arjuna.objectstore.VolatileStore</entry> The second form is required if you want to set properties on configuration beans other than the default bean instances. Valid values for Boolean properties are case-insensitive, and may be one of:

    • NO

    • YES

    • FALSE

    • TRUE

    • OFF

    • ON

    In the case of properties that take multiple values, they are white-space-delimited.


  4. After the file is loaded, it is cached and is not re-read until the JVM is restarted. Changes to the properties file require a restart in order to take effect.

  5. After the properties are loaded, the EnvironmentBean is then inspected and, for each field, if the properties contains a matching key in the search order as follows, the setter method for that field is invoked with the value from the properties, or the system properties if different.

  6. The bean is then returned to the caller, which may further override values by calling setter methods.

The implementation reads most bean properties only once, as the consuming component or class is instantiated. This usually happens the first time a transaction is run. As a result, calling setter methods to change the value of bean properties while the system is running typically has no effect, unless it is done prior to any use of the transaction system. Altered bean properties are not persisted back to the properties file.
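The file format and Boolean values described in the procedure above can be sketched with the JDK alone. This is an illustration of the standard java.util.Properties XML format, not Narayana's loading code: the PropertiesFileSketch class and its parseBoolean helper are hypothetical, and the two entries are the ones quoted in the procedure.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Properties;

public class PropertiesFileSketch {

    /** A small properties file in the standard java.util.Properties XML format. */
    public static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n"
        + "<properties>\n"
        + "  <entry key=\"CoordinatorEnvironmentBean.commitOnePhase\">YES</entry>\n"
        + "  <entry key=\"ObjectStoreEnvironmentBean.communicationStore.objectStoreType\">"
        + "com.arjuna.ats.internal.arjuna.objectstore.VolatileStore</entry>\n"
        + "</properties>\n";

    /** Loads entries from XML text using the JDK's own parser for this format. */
    public static Properties load(String xml) {
        Properties properties = new Properties();
        try {
            properties.loadFromXML(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new IllegalArgumentException("not a valid properties XML document", e);
        }
        return properties;
    }

    /** Hypothetical helper accepting the case-insensitive Boolean values listed above. */
    public static boolean parseBoolean(String value) {
        switch (value.trim().toUpperCase(Locale.ROOT)) {
            case "YES": case "TRUE": case "ON":
                return true;
            case "NO": case "FALSE": case "OFF":
                return false;
            default:
                throw new IllegalArgumentException("not a Boolean value: " + value);
        }
    }
}
```

Rejecting unknown values, rather than silently treating them as false, is a choice made for this sketch. The behaviour of the real loader for invalid values is not stated here.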

You can configure the system using a bean wiring system such as JBoss Microcontainer or Spring. Take care when instantiating beans, to obtain the singleton via the static getter (factory) method on the module property manager. Using a new bean instantiated with the default constructor is ineffective, since it is not possible to pass this configured bean back to the property management system.

The transaction manager can generate a lot of logging information when configured to log at trace level. Here is a list of some of the log messages to check for.

Transaction Begin

When a transaction begins the following code is executed:

com.arjuna.ats.arjuna.coordinator.BasicAction::Begin:1342

tsLogger.logger.trace("BasicAction::Begin() for action-id "+ get_uid());

Transaction Commit

When a transaction commits the following code is executed:

com.arjuna.ats.arjuna.coordinator.BasicAction::End:1342

tsLogger.logger.trace("BasicAction::End() for action-id "+ get_uid());

Transaction Rollback

When a transaction rolls back the following code is executed:

com.arjuna.ats.arjuna.coordinator.BasicAction::Abort:1575

tsLogger.logger.trace("BasicAction::Abort() for action-id "+ get_uid());

Transaction Timeout

When a transaction times out the following code is executed:

com.arjuna.ats.arjuna.coordinator.TransactionReaper::doCancellations:349

tsLogger.logger.trace("Reaper Worker " + Thread.currentThread() + " attempting to cancel " + e._control.get_uid());

You will then see the same thread rolling back the transaction as shown above.

The following list shows some log messages that you may see, with an explanation of their possible causes.

INFO [com.arjuna.ats.arjuna] ObjectStore record was deleted during restoration, users should not deleted records manually

If you manually deleted a transaction log, this message applies to you. You deleted the log of a transaction that was in flight, which may have caused a data integrity issue: one of the resources may have committed and, without the log, you cannot determine this.

If a transaction is committed at the same time as a resource adapter or remote server attempts recovery then you may see the message in the log due to intentional but unavoidable interaction between distributed transaction managers and the local recovery manager.

The log message will indicate the path of the removed file something like: ***/ShadowNoFileLockStore/defaultStore/StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction/SubordinateAtomicAction/JCA/***: java.io.FileNotFoundException: ***/ShadowNoFileLockStore/defaultStore/StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction/SubordinateAtomicAction/JCA/*** (No such file or directory)

This chapter covers issues that you may hit when developing applications with Narayana.

The WS-BA participant-completion protocol has a benign race condition that, in unusual circumstances, can cause some Business Activities to be cancelled that would otherwise have been able to close. This is safe as no inconsistency arises, but it can be annoying for users. This section explains why this can happen, under what conditions, and what you can do to tolerate it.


The messages are numbered to indicate the order in which they are sent:

  • 1. request. This represents the application request made by the client.

  • 2. completed. After the participant has completed its work, it notifies the coordinator that it has completed.

  • 3. response. This represents the response to the client's application request.

  • 4. close. The client notifies the coordinator that it wishes to close the activity. It then waits for a 'closed' or failure response from the coordinator.

  • 5a. close/5b. closed. The coordinator has processed the '2.completed' message so can close the activity. It starts by sending the 'close' message to the participant and waits for the 'closed' response as confirmation. These two messages are asynchronous.

  • 6. closed. The coordinator now has all 'closed' acknowledgments so notifies the client that the activity successfully closed.

Messages '2.completed' and '4.close' are asynchronous (or 'one way' in Web services parlance) so effectively, there is a race condition with the following competing parties:

When running in the same VM, or on a low latency network, '3.response' will be sent very quickly. This is because it is simply travelling on the HTTP response over an already open socket. This just leaves messages '2.completed' and '4.close', which will take much longer relative to '3.response'. To understand this, let's take a look at what happens when an asynchronous Web service call is made:

The race condition occurs because steps 1-3 can happen relatively quickly in a single VM, and thus it's likely that both messages 2 and 4 will be waiting to be processed at the same time. The order in which they are processed depends on the implementation of the thread pool and is also at the mercy of thread scheduling in the VM, so either could be processed first.

This race condition is much less likely to happen in a distributed environment, as the network costs will be significantly higher. As a result, message '3.response' will take long enough to send to give message '2.completed' enough of a head start. The race is still possible, however, so the client application must be coded defensively to catch and handle a TransactionRollbackException. The client code ought to be doing this anyway to deal with server crashes.

The following diagram shows what messages are exchanged when the race condition occurs. Notice that the activity ends in a consistent state.


Messages 1-3 are omitted from the following explanation as they are the same as in the success case.

  • 4. close. This message is processed by the coordinator before message '2.completed'

  • 5a. cancel. The coordinator has not yet processed the '2.completed' message so cannot close the activity. The coordinator then sends a 'cancel' message to the participant as it thinks it has not yet completed. This message, and subsequent retries, are dropped by the participant as they are not valid for a completed participant.

  • 5b. compensate/5c. compensated. After one or more unacknowledged 'cancel' messages, the coordinator switches to sending 'compensate' messages which will cause the participant to compensate the work. The participant acknowledges with a 'compensated' reply.

  • 6. Transaction rolledback exception. The coordinator notifies the client that the activity failed to close.

As you can see from the steps above, when this race condition arises, any work done by participants is compensated and the client is notified of the outcome. Thus a consistent outcome is achieved.
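The two orderings can be captured in a deliberately simplified, single-threaded model. This sketch is not XTS code: the Coordinator class, its method names, and the Outcome enum are hypothetical, and the cancel retries are collapsed into the final compensate step.

```java
public class ParticipantCompletionRaceSketch {

    /** What the client observes: a normal close, or a rollback after compensation. */
    public enum Outcome { CLOSED, COMPENSATED }

    /** Hypothetical coordinator that only tracks whether 'completed' was processed. */
    public static class Coordinator {
        private boolean completedProcessed;

        /** Message 2: the participant reports that its work is complete. */
        public void onCompleted() {
            completedProcessed = true;
        }

        /** Message 4: the client asks the coordinator to close the activity. */
        public Outcome onClose() {
            // If 'completed' has not been processed yet, the coordinator cannot
            // close; it cancels and eventually compensates the participant.
            return completedProcessed ? Outcome.CLOSED : Outcome.COMPENSATED;
        }
    }

    /** Processes messages 2 and 4 in the given order and returns the client's outcome. */
    public static Outcome run(boolean completedFirst) {
        Coordinator coordinator = new Coordinator();
        if (completedFirst) {
            coordinator.onCompleted();
            return coordinator.onClose();
        }
        Outcome outcome = coordinator.onClose();
        coordinator.onCompleted(); // arrives too late to change the outcome
        return outcome;
    }
}
```

In both orderings the model ends in a consistent state. Only the outcome reported to the client differs, which is why the client must be prepared for the rollback case.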

The XML Transaction Service (XTS) component of Narayana supports the coordination of private and public Web Services in a business transaction. Therefore, to understand XTS, you must be familiar with Web Services, and also understand something about transactions. This chapter introduces XTS and provides a brief overview of the technologies that form the Web Services standard. Additionally, this chapter explores some of the fundamentals of transactioning technology and how it can be applied to Web Services. Much of the content presented in this chapter is detailed throughout this guide. However, only overview information about Web Services is provided. If you are new to creating Web services, please consult your Web Services platform documentation.

Narayana provides the XTS component as a transaction solution for Web Services. Using XTS, business partners can coordinate complex business transactions in a controlled and reliable manner. The XTS API supports a transactional coordination model based on the WS-Coordination , WS-Atomic Transaction , and WS-Business Activity specifications.

Narayana implements versions 1.1 and 1.2 of these three specifications. The specifications are available from http://www.oasis-open.org/specs/ .

Note

The 1.1 and 1.2 specifications differ only in a small number of details. The rest of this document employs version 1.1 of these specifications when providing explanations and example code. On the few occasions where the modifications required to adapt these to the 1.2 specifications are not obvious, an explanatory note is provided.

Web Services are modular, reusable software components that are created by exposing business functionality through a Web service interface. Web Services communicate directly with other Web Services using standards-based technologies such as SOAP and HTTP. These standards-based communication technologies enable customers, suppliers, and trading partners to access Web Services, independent of hardware, operating system, or programming environment. The result is a vastly improved collaboration environment as compared to today's EDI and business-to-business (B2B) solutions, an environment where businesses can expose their current and future business applications as Web Services that can be easily discovered and accessed by external partners.

Web Services, by themselves, are not fault-tolerant. In fact, some of the reasons that the Web Services model is an attractive development solution are also the same reasons that service-based applications may have drawbacks.

Properties of Web Services

  • Application components that are exposed as Web Services may be owned by third parties, which provides benefits in terms of cost of maintenance, but drawbacks in terms of having exclusive control over their behavior.

  • Web Services are usually remotely located, increasing risk of failure due to increased network travel for invocations.

Applications that have high dependability requirements need a method of minimizing the effects of errors that may occur when an application consumes Web Services. One method of safeguarding against such failures is to interact with an application’s Web Services within the context of a transaction . A transaction is a unit of work which is completed entirely, or in the case of failures is reversed to some agreed consistent state. The goal, in the event of a failure, is normally to appear as if the work had never occurred in the first place. With XTS, transactions can span multiple Web Services, meaning that work performed across multiple enterprises can be managed with transactional support.

XTS allows you to create transactions that drive complex business processes, spanning multiple Web Services. Current Web Services standards do not address the requirements for a high-level coordination of services. This is because in today’s Web Services applications, which use single request/response interactions, coordination is typically not a problem. However, for applications that engage multiple services among multiple business partners, coordinating and controlling the resulting interactions is essential. This becomes even more apparent when you realize that you generally have little in the way of formal guarantees when interacting with third-party Web Services.

XTS provides the infrastructure for coordinating services during a business process. By organizing processes as transactions, business partners can collaborate on complex business interactions in a reliable manner, ensuring the integrity of their data (usually represented by multiple changes to a database) but without the usual overheads and drawbacks of exposing traditional transaction-processing engines directly onto the web. An Evening On the Town demonstrates how an application may manage service-based processes as transactions:

An Evening On the Town.  The application in question allows a user to plan a social evening. This application is responsible for reserving a table at a restaurant, and reserving tickets to a show. Both activities are paid for using a credit card. In this example, each service represents exposed Web Services provided by different service providers. XTS is used to envelop the interactions between the theater and restaurant services into a single (potentially) long-running business transaction. The business transaction must ensure that seats are reserved both at the restaurant and the theater. If one event fails, the user has the ability to decline both events, thus returning both services back to their original state. If both events are successful, the user’s credit card is charged and both seats are booked. As you may expect, the interaction between the services must be controlled in a reliable manner over a period of time. In addition, management must span several third-party services that are remotely deployed.

Without the backing of a transaction, an undesirable outcome may occur. For example, the user's credit card may be charged, even if one or both of the bookings fail.

An Evening On the Town describes the situations where XTS excels at supporting business processes across multiple enterprises. This example is further refined throughout this guide, and appears as a standard demonstrator (including source code) with the XTS distribution.

Sometimes more control is needed over the client and the server applications, and JTA transactions are not always wanted in the application. In such cases it is possible to create client and service applications using the Raw XTS API.

The two parts to implementing a Web service using XTS are the transaction management and the business logic.

The bulk of the transaction management aspects are organized in a clear and easy-to-implement model by means of XTS’s Participant API, which provides a structured model for negotiation between the web service and the transaction coordinator. It allows the web service to manage its own local transactional data, in accordance with the needs of the business logic, while ensuring that its activities are in step with those of the client and other services involved in the transaction. Internally, this API uses SOAP to invoke operations on the various WS-C and WS-AT services, to drive the transaction to completion.

A participant is a software entity which is driven by the transaction manager on behalf of a Web service. When a web service wants to participate in a particular transaction, it must enroll a participant to act as a proxy for the service in subsequent negotiations with the coordinator. The participant implements an API appropriate to the type of transaction it is enrolled in, and the participant model selected when it is enrolled. For example, a Durable2PC participant, as part of a WS-Atomic Transaction, implements the Durable2PCParticipant interface. The use of participants allows the transactional control management aspects of the Web service to be factored into the participant implementation, while staying separate from the rest of the Web service's business logic and private transactional data management.

The creation of participants is not trivial, since they ultimately reflect the state of a Web service’s back-end processing facilities, an aspect normally associated with an enterprise’s own IT infrastructure. Implementations must use one of the following interfaces: com.arjuna.wst11.Durable2PCParticipant , com.arjuna.wst11.Volatile2PCParticipant .

A full description of XTS’s participant features is provided in Section 4.3, “The XTS API” .

There are two aspects to a client application using XTS: the transaction declaration aspects and the business logic. The business logic includes the invocation of Web Services.

Transaction declaration aspects are handled automatically with the XTS client API. This API provides simple transaction directives such as begin, close, and cancel, which the client application can use to initialize, manage, and terminate transactions. Internally, this API uses SOAP to invoke operations on WS-BA services, in order to create a coordinator and drive the transaction to completion.

The theory behind creating WS-BA web services is similar to that of the WS-AT Raw API (see Section 4.2.2.3.2, “Creating Transactional Web Services”). However, different participant classes are used: com.arjuna.wst11.BusinessAgreementWithParticipantCompletionParticipant or com.arjuna.wst11.BusinessAgreementWithCoordinatorCompletionParticipant.

A full description of XTS’s participant features is provided in Section 4.3, “The XTS API” .

You can enable transaction propagation for all Web service calls that are invoked within a JTA, WS-AT or WS-BA transaction. This is done with the 'default-context-propagation' property in the XTS subsystem config of the standalone-xts.xml.

As this is enabled by default (for standalone-xts.xml), calls to all Web services that support WS-AT or WS-BA will automatically receive the transaction context allowing them to participate in the distributed transaction.

The transaction context is simply ignored if the service does not support WS-AT or WS-BA. This is done by setting MustUnderstand="false" on the 'CoordinationContext' SOAP header. Unfortunately, this may cause issues when invoking WS-AT or WS-BA enabled Web services on other vendors’ application servers. This is because the WS-Coordination specification states that MustUnderstand must be set to true. If you are affected by this issue, you will need to explicitly enable the transaction propagation for every port.

The default context propagation policy can also be overridden on a per Web Service port basis. This allows the developer to easily state which Web Service clients must and must not propagate the transaction context. This is done through the standard JAX-WS WebServiceFeature facility. A JAX-WS WebServiceFeature allows meta-information to be added to a port that describes cross-cutting behaviour, such as logging, security or compression. In our case we use the Section 4.3.2.10, “JTAOverWSATFeature” and Section 4.3.2.9, “WSTXFeature” features.

Section 4.3.2.10, “JTAOverWSATFeature” states that any JTA, WS-AT, or WS-BA transactions should be distributed via calls on this client. Use this feature if you have a JTA transaction that should be propagated.

Section 4.3.2.9, “WSTXFeature” states that any WS-AT or WS-BA transaction should be distributed via calls on this client. Use this feature if you use the Raw XTS or WS-BA APIs.

Calls to the service will fail if the Web service does not support WS-AT or WS-BA (in this case, XTS sets MustUnderstand=true on the 'CoordinationContext' SOAP header as the developer has explicitly stated that it is required).

The developer may also state that the transaction must not be distributed over calls to this Web service. This is done by setting the Section 4.3.2.10, “JTAOverWSATFeature” or Section 4.3.2.9, “WSTXFeature” feature to disabled.

The use of Section 4.3.2.10, “JTAOverWSATFeature” and Section 4.3.2.9, “WSTXFeature” overrides whatever default context propagation is set to in the standalone-xts.xml.
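The interplay between the default setting and the per-port features can be summarised as a small decision function. The following is only a minimal sketch of the rules described above and does not use the XTS or JAX-WS APIs: the class, enum, and method names are hypothetical, and the case where default propagation is switched off and no feature is set is assumed to send no context.

```java
import java.util.Optional;

// Stand-alone model of the propagation rules described above; it does not use the XTS API.
class PropagationPolicyDemo {

    /** Hypothetical per-port setting: no feature set, feature enabled, or feature disabled. */
    enum PortFeature { UNSET, ENABLED, DISABLED }

    /**
     * Returns the MustUnderstand value for the CoordinationContext header,
     * or an empty Optional when no context is propagated at all.
     */
    static Optional<Boolean> mustUnderstand(boolean defaultContextPropagation, PortFeature feature) {
        switch (feature) {
            case ENABLED:
                // The developer explicitly required propagation, so the service must understand it.
                return Optional.of(true);
            case DISABLED:
                // The developer explicitly forbade propagation, so no header is sent.
                return Optional.empty();
            default:
                // Fall back to the subsystem default; the header is then optional for the service.
                // Assumption: with the default switched off, nothing is propagated.
                return defaultContextPropagation ? Optional.of(false) : Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(mustUnderstand(true, PortFeature.UNSET));
        System.out.println(mustUnderstand(true, PortFeature.ENABLED));
        System.out.println(mustUnderstand(true, PortFeature.DISABLED));
    }
}
```

Modelling the result as an Optional keeps "no header at all" distinct from "header sent with MustUnderstand set to false", which are the two cases that behave differently against other vendors' servers.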

This chapter discusses the XTS API. You can use this information to write client and server applications which consume transactional Web Services and coordinate back-end systems.

The participant is the entity that performs the work pertaining to transaction management on behalf of the business services involved in an application. The Web service (in the example code, a theater booking system) contains some business logic to reserve a seat and inquire about availability, but it needs to be supported by something that maintains information in a durable manner. Typically this is a database, but it could be a file system, NVRAM, or other storage mechanism.

Although the service may talk to the back-end database directly, it cannot commit or undo any changes, since committing and rolling back are ultimately under the control of a transaction. For the transaction to exercise this control, it must communicate with the database. In XTS, the participant does this communication, as shown in Figure 4.1, “Transactions, Participants, and Back-End Transaction Control”.


The participant provides the plumbing that drives the transactional aspects of the service. This section discusses the specifics of Participant programming and usage.

Transactional web services and transactional clients are regular Java EE applications and can be deployed into the application server in the same way as any other Java EE application. The XTS Subsystem exports all the client and web service API classes needed to manage transactions and enroll and manage participant web services. It provides implementations of all the WS-C and WS-T coordination services, not just the coordinator services. In particular, it exposes the client and web service participant endpoints which are needed to receive incoming messages originating from the coordinator.

Normally, a transactional application client and the transaction web service it invokes will be deployed in different application servers. As long as XTS is enabled on each of these containers, it will transparently route coordination messages from clients or web services to their coordinator and vice versa. When the client begins a transaction, by default it creates a context using the coordination services in its local container. The context holds a reference to the local Registration Service, which means that any web services enlisted in the transaction enroll with the coordination services in the same container.

The coordinator does not need to reside in the same container as the client application. By configuring the client deployment appropriately it is possible to use the coordinator services co-located with one of the web services or even to use services deployed in a separate, dedicated container. See Chapter 8 Stand-Alone Coordination for details of how to configure a coordinator located in a different container to the client.

All participants which support the Durable2PC protocol have to implement the com.arjuna.wst.Durable2PCParticipant interface.

Durable2PCParticipant Methods

prepare

The participant should perform any work necessary, so that it can either commit or roll back the work performed by the Web service under the scope of the transaction. The implementation is free to do whatever it needs to in order to fulfill the implicit contract between it and the coordinator.

The participant indicates whether it can prepare by returning an instance of Section 4.3.2.3, “Vote” .

commit

The participant should make its work permanent. How it accomplishes this depends upon its implementation. For instance, in the theater example, the reservation of the ticket is committed. If commit processing cannot complete, the participant should throw a SystemException error, potentially leading to a heuristic outcome for the transaction.

rollback

The participant should undo its work. If rollback processing cannot complete, the participant should throw a SystemException error, potentially leading to a heuristic outcome for the transaction.

unknown

This method has been deprecated and is slated to be removed from XTS in the future.

error

In rare cases when recovering from a system crash, it may be impossible to complete or roll back a previously prepared participant, causing the error operation to be invoked.
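To make the contract above more concrete, here is a minimal, self-contained sketch of a durable participant for the theater example. It deliberately does not import the XTS classes: the DurableParticipant interface, the Vote enum, and the TheaterStore class are simplified, hypothetical stand-ins that only mirror the method names described above (the deprecated unknown method is omitted), and the in-memory maps stand in for what would be durable storage in a real service.

```java
import java.util.HashMap;
import java.util.Map;

// Self-contained sketch: the nested types are simplified stand-ins, not the real XTS classes.
class DurableParticipantSketch {

    /** Stand-in for the XTS vote classes; only the two outcomes used in this sketch. */
    enum Vote { PREPARED, ABORTED }

    /** Stand-in that mirrors the method names of the Durable2PC participant contract. */
    interface DurableParticipant {
        Vote prepare();
        void commit();
        void rollback();
        void error();
    }

    /** Hypothetical theater back end; a real one would be a database or other durable store. */
    static class TheaterStore {
        int freeSeats = 10;
        final Map<String, Integer> held = new HashMap<>();    // provisional reservations
        final Map<String, Integer> booked = new HashMap<>();  // permanent reservations
    }

    static class SeatParticipant implements DurableParticipant {
        private final TheaterStore store;
        private final String txId;
        private final int seats;

        SeatParticipant(TheaterStore store, String txId, int seats) {
            this.store = store;
            this.txId = txId;
            this.seats = seats;
        }

        @Override
        public Vote prepare() {
            if (seats > store.freeSeats) {
                return Vote.ABORTED;              // the reservation cannot be honoured
            }
            store.freeSeats -= seats;
            store.held.put(txId, seats);          // record provisional state before voting
            return Vote.PREPARED;
        }

        @Override
        public void commit() {
            Integer n = store.held.remove(txId);  // no-op if commit is delivered twice
            if (n != null) {
                store.booked.put(txId, n);
            }
        }

        @Override
        public void rollback() {
            Integer n = store.held.remove(txId);  // no-op if nothing was prepared
            if (n != null) {
                store.freeSeats += n;
            }
        }

        @Override
        public void error() {
            System.err.println("transaction " + txId + " needs manual resolution");
        }
    }

    /** Plays the coordinator's role for a single participant. */
    static boolean runTwoPhase(DurableParticipant participant) {
        if (participant.prepare() == Vote.PREPARED) {
            participant.commit();
            return true;
        }
        participant.rollback();
        return false;
    }

    public static void main(String[] args) {
        TheaterStore store = new TheaterStore();
        System.out.println(runTwoPhase(new SeatParticipant(store, "tx-1", 4)));
        System.out.println(runTwoPhase(new SeatParticipant(store, "tx-2", 20)));
        System.out.println("free seats: " + store.freeSeats);
    }
}
```

In this sketch commit and rollback are written so that a repeated call has no further effect, which is one simple way to tolerate a coordinator that resends its decision after a crash.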

All participants which support the Volatile2PC protocol have to implement the com.arjuna.wst.Volatile2PCParticipant interface.

Volatile2PCParticipant Methods

prepare

The participant should perform any work necessary to flush any volatile data created by the Web service under the scope of the transaction, to the system store. The implementation is free to do whatever it needs to in order to fulfill the implicit contract between it and the coordinator.

The participant indicates whether it can prepare by returning an instance of Section 4.3.2.3, “Vote” .

commit

The participant should perform any cleanup activities required, in response to a successful transaction commit. These cleanup activities depend upon its implementation. For instance, it may flush cached backup copies of data modified during the transaction. In the unlikely event that commit processing cannot complete, the participant should throw a SystemException error. This will not affect the outcome of the transaction but will cause an error to be logged. This method may not be called if a crash occurs during commit processing.

rollback

The participant should perform any cleanup activities required, in response to a transaction abort. In the unlikely event that rollback processing cannot complete, the participant should throw a SystemException error. This will not affect the outcome of the transaction but will cause an error to be logged. This method may not be called if a crash occurs during commit processing.

unknown

This method is deprecated and will be removed in a future release of XTS.

error

This method should never be called, since volatile participants are not involved in recovery processing.
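A volatile participant is typically a cache or similar in-memory structure. The sketch below illustrates the idea under stated assumptions: the CachingParticipant class and the Vote enum are hypothetical stand-ins rather than the XTS types, and a plain Map plays the role of the system store.

```java
import java.util.HashMap;
import java.util.Map;

// Self-contained sketch; the types are simplified stand-ins, not the real XTS classes.
class VolatileParticipantSketch {

    enum Vote { PREPARED, ABORTED }

    /** Hypothetical participant that buffers writes in memory during the transaction. */
    static class CachingParticipant {
        private final Map<String, String> pending = new HashMap<>();  // volatile data
        private final Map<String, String> systemStore;                // stand-in for the system store

        CachingParticipant(Map<String, String> systemStore) {
            this.systemStore = systemStore;
        }

        /** Business-logic write made under the scope of the transaction. */
        void write(String key, String value) {
            pending.put(key, value);
        }

        /** Flush the volatile data to the system store and vote on the result. */
        Vote prepare() {
            try {
                systemStore.putAll(pending);
                return Vote.PREPARED;
            } catch (RuntimeException e) {
                return Vote.ABORTED;   // the flush failed, so this participant cannot prepare
            }
        }

        /** Cleanup after a successful commit: the cached copies are no longer needed. */
        void commit() {
            pending.clear();
        }

        /** Cleanup after an abort: discard the cached copies. */
        void rollback() {
            pending.clear();
        }

        int pendingCount() {
            return pending.size();
        }
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        CachingParticipant cache = new CachingParticipant(store);
        cache.write("seat-42", "reserved");
        System.out.println(cache.prepare() + " " + store);
        cache.commit();
        System.out.println("pending after commit: " + cache.pendingCount());
    }
}
```

Undoing the flushed data after an abort is left to whichever durable participant owns the store; the volatile participant in this sketch only cleans up its own memory.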

com.arjuna.mw.wst11.UserTransaction is the class that clients typically employ. Before a client can begin a new atomic transaction, it must first obtain a UserTransaction from the UserTransactionFactory . This class isolates the user from the underlying protocol-specific aspects of the XTS implementation. A UserTransaction does not represent a specific transaction. Instead, it provides access to an implicit per-thread transaction context, similar to the UserTransaction in the JTA specification. All of the UserTransaction methods implicitly act on the current thread of control.

TransactionManager defines the interaction between a transactional web service and the underlying transaction service implementation. A TransactionManager does not represent a specific transaction. Instead, it provides access to an implicit per-thread transaction context.

Methods

currentTransaction

Returns a TxContext for the current transaction, or null if there is no context. Use the currentTransaction method to determine whether a web service has been invoked from within an existing transaction. You can also use the returned value to enable multiple threads to execute within the scope of the same transaction. Calling the currentTransaction method does not disassociate the current thread from the transaction.

suspend

Dissociates a thread from any transaction. This enables a thread to do work that is not associated with a specific transaction.

The suspend method returns a TxContext instance, which is a handle on the transaction.

resume

Associates or re-associates a thread with a transaction, using its TxContext . Prior to association or re-association, the thread is disassociated from any transaction with which it may be currently associated. If the TxContext is null, then the thread is associated with no transaction. In this way, the result is the same as if the suspend method were used instead.

enlistForVolatileTwoPhase

Enroll the specified participant with the current transaction, causing it to participate in the Volatile2PC protocol. You must pass a unique identifier for the participant.

enlistForDurableTwoPhase

Enroll the specified participant with the current transaction, causing it to participate in the Durable2PC protocol. You must pass a unique identifier for the participant.
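The per-thread association described by currentTransaction, suspend, and resume can be modelled with a ThreadLocal. The following is a minimal sketch of those semantics only, not the XTS implementation: the Ctx class is a hypothetical stand-in for TxContext, and the static methods merely imitate the behaviour described above, including handing a context to a second thread.

```java
// Self-contained model of the per-thread context semantics; not the XTS implementation.
class ThreadContextSketch {

    /** Hypothetical stand-in for TxContext: an opaque handle on a transaction. */
    static final class Ctx {
        final String id;

        Ctx(String id) {
            this.id = id;
        }
    }

    private static final ThreadLocal<Ctx> CURRENT = new ThreadLocal<>();

    /** Returns the context of the calling thread, or null; the association is left intact. */
    static Ctx currentTransaction() {
        return CURRENT.get();
    }

    /** Dissociates the calling thread from its context and returns the handle. */
    static Ctx suspend() {
        Ctx ctx = CURRENT.get();
        CURRENT.remove();
        return ctx;
    }

    /** Replaces any existing association; a null context leaves the thread with no transaction. */
    static void resume(Ctx ctx) {
        if (ctx == null) {
            CURRENT.remove();
        } else {
            CURRENT.set(ctx);
        }
    }

    /** Runs a worker thread that resumes the given handle and reports the id it sees. */
    static String idSeenByWorker(Ctx handle) {
        String[] seen = new String[1];
        Thread worker = new Thread(() -> {
            resume(handle);
            Ctx ctx = currentTransaction();
            seen[0] = (ctx == null) ? null : ctx.id;
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        resume(new Ctx("tx-1"));
        Ctx handle = suspend();                        // this thread now has no transaction
        System.out.println(currentTransaction() == null);
        System.out.println(idSeenByWorker(handle));    // the worker runs in the same transaction
    }
}
```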

Participants which support the business agreement with coordinator completion protocol have to implement the com.arjuna.wst.BusinessAgreementWithCoordinatorCompletionParticipant interface.

In order for the Business Activity protocol to work correctly, the participants must be able to autonomously notify the coordinator about changes in their status. Unlike the Atomic Transaction protocol, where all interactions between the coordinator and participants are instigated by the coordinator when the transaction terminates, the BAParticipantManager interaction pattern requires the participant to be able to talk to the coordinator at any time during the lifetime of the business activity.

Whenever a participant is registered with a business activity, it receives a handle on the coordinator. This handle is an instance of interface com.arjuna.wst11.BAParticipantManager .

com.arjuna.wst11.UserBusinessActivity is the class that most clients employ. A client begins a new business activity by first obtaining a UserBusinessActivity from the UserBusinessActivityFactory . This class isolates them from the underlying protocol-specific aspects of the XTS implementation. A UserBusinessActivity does not represent a specific business activity. Instead, it provides access to an implicit per-thread activity. Therefore, all of the UserBusinessActivity methods implicitly act on the current thread of control.

Methods

begin

Begins a new activity, associating it with the invoking thread.

close

First, all Coordinator Completion participants enlisted in the activity are requested to complete the activity. Next all participants, whether they enlisted for Coordinator or Participant Completion, are requested to close the activity. If any of the Coordinator Completion participants fails to complete at the first stage then all completed participants are asked to compensate the activity while any remaining uncompleted participants are requested to cancel the activity.

cancel

Terminates the business activity. All Participant Completion participants enlisted in the activity which have already completed are requested to compensate the activity. All uncompleted Participant Completion participants and all Coordinator Completion participants are requested to cancel the activity.
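The ordering of messages during close and cancel is easier to follow in a small simulation. The sketch below is only a model of the rules described above, not the XTS coordinator: all class and method names are hypothetical, messages are represented as "name:action" strings, and it is assumed that the coordinator stops asking participants to complete as soon as one of them fails.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Self-contained simulation of the termination rules; it does not call the XTS API.
class ActivityTerminationSketch {

    enum Protocol { PARTICIPANT_COMPLETION, COORDINATOR_COMPLETION }

    /** Hypothetical participant record tracked by the simulated coordinator. */
    static class Participant {
        final String name;
        final Protocol protocol;
        final boolean failsToComplete;  // simulates a failure when asked to complete
        boolean completed;              // true once the participant has completed its work
        boolean failed;

        Participant(String name, Protocol protocol, boolean completed, boolean failsToComplete) {
            this.name = name;
            this.protocol = protocol;
            this.completed = completed;
            this.failsToComplete = failsToComplete;
        }
    }

    /** Returns the messages sent, in order, when the client calls close. */
    static List<String> close(List<Participant> participants) {
        List<String> sent = new ArrayList<>();
        boolean anyFailed = false;
        // Stage 1: ask every Coordinator Completion participant to complete.
        for (Participant p : participants) {
            if (p.protocol != Protocol.COORDINATOR_COMPLETION) {
                continue;
            }
            sent.add(p.name + ":complete");
            if (p.failsToComplete) {
                p.failed = true;
                anyFailed = true;
                break;                  // assumption: stop at the first failure
            }
            p.completed = true;
        }
        // Stage 2: close everyone, or undo the activity if stage 1 failed.
        for (Participant p : participants) {
            if (p.failed) {
                continue;
            }
            if (!anyFailed) {
                sent.add(p.name + ":close");
            } else {
                sent.add(p.name + (p.completed ? ":compensate" : ":cancel"));
            }
        }
        return sent;
    }

    /** Returns the messages sent, in order, when the client calls cancel. */
    static List<String> cancel(List<Participant> participants) {
        List<String> sent = new ArrayList<>();
        for (Participant p : participants) {
            boolean completedPc = p.protocol == Protocol.PARTICIPANT_COMPLETION && p.completed;
            sent.add(p.name + (completedPc ? ":compensate" : ":cancel"));
        }
        return sent;
    }

    public static void main(String[] args) {
        List<Participant> evening = Arrays.asList(
                new Participant("restaurant", Protocol.PARTICIPANT_COMPLETION, true, false),
                new Participant("theater", Protocol.COORDINATOR_COMPLETION, false, false));
        System.out.println(close(evening));
    }
}
```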

com.arjuna.mw.wst11.BusinessActivityManager is the class that web services typically employ. It defines how a web service interacts with the underlying business activity service implementation. A BusinessActivityManager does not represent a specific activity. Instead, it provides access to an implicit per-thread activity.

Methods

currentTransaction

Returns the TxContext for the current business activity, or null if there is no TxContext. The returned value can be used to enable multiple threads to execute within the scope of the same business activity. Calling the currentTransaction method does not dissociate the current thread from its activity.

suspend

Dissociates a thread from any current business activity, so that it can perform work not associated with a specific activity. The suspend method returns a TxContext instance, which is a handle on the activity. The thread is then no longer associated with any activity.

resume

Associates or re-associates a thread with a business activity, using its TxContext. Before associating or re-associating the thread, it is disassociated from any business activity with which it is currently associated. If the TxContext is null, the thread is disassociated from all business activities, as though the suspend method were called.

enlistForBusinessAgreementWithParticipantCompletion

Enroll the specified participant with current business activity, causing it to participate in the BusinessAgreementWithParticipantCompletion protocol. A unique identifier for the participant is also required.

The return value is an instance of BAParticipantManager which can be used to notify the coordinator of changes in the participant state. In particular, since the participant is enlisted for the Participant Completion protocol, it is expected to call the completed method of this returned instance when it has completed all the work it expects to do in this activity and has made all its changes permanent. Alternatively, if the participant does not need to perform any compensation actions should some other participant fail, it can leave the activity by calling the exit method of the returned BAParticipantManager instance.

enlistForBusinessAgreementWithCoordinatorCompletion

Enroll the specified participant with current activity, causing it to participate in the BusinessAgreementWithCoordinatorCompletion protocol. A unique identifier for the participant is also required.

The return value is an instance of BAParticipantManager which can be used to notify the coordinator of changes in the participant state. Note that in this case it is an error to call the completed method of this returned instance. With the Coordinator Completion protocol the participant is expected to wait until its completed method is called before it makes all its changes permanent. Alternatively, if the participant determines that it has no changes to make, it can leave the activity by calling the exit method of the returned BAParticipantManager instance.

The simplest way to configure a stand-alone coordinator is to provide a complete URL for the remote coordinator. This can be done by changing the 'url' property of the 'xts-environment' element of the XTS Subsystem configuration in the standalone-xts.xml . Example 4.2, “ Example standalone-xts.xml configuration settings ” shows the snippet of XML that you should change.


The XTS module (modules/system/layers/base/org/jboss/xts/main/jbossxts-${XTS_VERSION}.jar) in the WildFly Application Server includes a configuration file, xts-properties.xml, in the root of the jar. These properties can be edited and then re-packaged in the jar. The changes will take effect on the next boot of the WildFly Application Server. Example 4.3, “ Example xts-properties.xml configuration settings ” shows a fragment of this file which details the options for changing the coordinator URL.


You can also specify the individual elements of the URL using the properties coordinator.scheme , coordinator.address , and so forth. These values only apply when the coordinator.url is not set. The URL is constructed by combining the specified values with default values for any missing elements. This is particularly useful for two specific use cases.

  1. The first case is where the client is expected to use an XTS coordinator deployed in another WildFly Application Server. If, for example, this WildFly Application Server is bound to address 10.0.1.99, setting property coordinator.address to 10.0.1.99 is normally all that is required to configure the coordinator URL to identify the remote WildFly Application Server's coordination service. If the Web service port on the remote WildFly Application Server were reset to 9090, then it would also be necessary to set property coordinator.port to this value.

  2. The second common use case is where communications between client and coordinator, and between participant and coordinator, must use secure connections. If property coordinator.scheme is set to value https, the client's request to begin a transaction is sent to the coordinator service over a secure https connection. The XTS coordinator and participant services will ensure that all subsequent communications between coordinator and client or coordinator and web services also employ secure https connections. Note that this requires configuring the trust stores in the WildFly Application Server running the client, coordinator and participant web services with appropriate trust certificates.
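The way the coordinator URL is assembled from individual properties can be sketched as follows. This is a hedged illustration rather than the XTS implementation: the property keys are the abbreviated names used in the text, the coordinator.path key and all four default values are assumptions made for this example, and a real deployment using https would normally also need a different port.

```java
import java.util.Properties;

// Self-contained sketch of URL assembly; keys are abbreviated and defaults are assumed.
class CoordinatorUrlSketch {

    // Placeholder defaults chosen for this sketch; check xts-properties.xml for the real values.
    static final String DEFAULT_SCHEME = "http";
    static final String DEFAULT_ADDRESS = "localhost";
    static final String DEFAULT_PORT = "8080";
    static final String DEFAULT_PATH = "ws-c11/ActivationService";

    /** Returns coordinator.url when set, otherwise a URL built from the individual elements. */
    static String coordinatorUrl(Properties props) {
        String url = props.getProperty("coordinator.url");
        if (url != null && !url.isEmpty()) {
            return url;   // the individual elements only apply when no complete URL is given
        }
        String scheme = props.getProperty("coordinator.scheme", DEFAULT_SCHEME);
        String address = props.getProperty("coordinator.address", DEFAULT_ADDRESS);
        String port = props.getProperty("coordinator.port", DEFAULT_PORT);
        String path = props.getProperty("coordinator.path", DEFAULT_PATH);  // hypothetical key
        return scheme + "://" + address + ":" + port + "/" + path;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("coordinator.address", "10.0.1.99");
        System.out.println(coordinatorUrl(props));
        props.setProperty("coordinator.port", "9090");
        System.out.println(coordinatorUrl(props));
    }
}
```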

Note

The property names have been abbreviated in order to fit into the table. They should each start with prefix org.jboss.jbossts.xts11.coordinator .


A key requirement of a transaction service is to be resilient to a system crash by a host running a participant, as well as the host running the transaction coordination services. Crashes which happen before a transaction terminates or before a business activity completes are relatively easy to accommodate. The transaction service and participants can adopt a presumed abort policy.

Crash recovery is more complex if the crash happens during a transaction commit operation, or between completing and closing a business activity. The transaction service must ensure as far as possible that participants arrive at a consistent outcome for the transaction.

On the rare occasions where such a consensus cannot be reached, the transaction service must log and report transaction failures.

XTS includes support for automatic recovery of WS-AT and WS-BA transactions, if either or both of the coordinator and participant hosts crash. The XTS recovery manager begins execution on coordinator and participant hosts when the XTS service restarts. On a coordinator host, the recovery manager detects any WS-AT transactions which have prepared but not committed, as well as any WS-BA transactions which have completed but not yet closed. It ensures that all their participants are rolled forward in the first case, or closed in the second.

On a participant host, the recovery manager detects any prepared WS-AT participants which have not responded to a transaction rollback, and any completed WS-BA participants which have not yet responded to an activity cancel request, and ensures that the former are rolled back and the latter are compensated. The recovery service also allows for recovery of subordinate WS-AT transactions and their participants if a crash occurs on a host where an interposed WS-AT coordinator has been employed.

The WS-AT coordination service tracks the status of each participant in a transaction as the transaction progresses through its two-phase commit. When all participants have been sent a prepare message and have responded with a prepared message, the coordinator writes a log record storing each participant's details, indicating that the transaction is ready to complete. If the coordinator service crashes after this point has been reached, completion of the two-phase commit protocol is still guaranteed, by reading the log file after reboot and sending a commit message to each participant. Once all participants have responded to the commit with a committed message, the coordinator can safely delete the log entry.

Since the prepared messages returned by the participants imply that they are ready to commit their provisional changes and make them permanent, this type of recovery is safe. Additionally, the coordinator does not need to account for any commit messages which may have been sent before the crash, or resend messages if it crashes several times. The XTS participant implementation is resilient to redelivery of the commit messages. If the participant has implemented the recovery functions described in Section 4.5.1.2.1, “WS-AT Participant Crash Recovery APIs”, the coordinator can guarantee delivery of commit messages even if it crashes at the same time as one or more of the participant service hosts.

If the coordination service crashes before the prepare phase completes, the presumed abort protocol ensures that participants are rolled back. After system restart, the coordination service has the information about all the transactions which could have entered the commit phase before the reboot, since they have entries in the log. It also knows about any active transactions started after the reboot. If a participant is waiting for a response, after sending its prepared message, it automatically re-sends the prepared message at regular intervals. When the coordinator detects a transaction which is not active and has no entry in the log file after the reboot, it instructs the participant to abort, ensuring that the web service gets a chance to roll back any provisional state changes it made on behalf of the transaction.
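The coordinator's decision on receiving a re-sent prepared message can be expressed as a three-way check. The following is a minimal sketch of that reasoning only: the class, enum, and method names are hypothetical, and the two sets stand in for the coordinator's log and its table of active transactions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Self-contained model of the presumed abort decision; not the XTS coordinator code.
class PresumedAbortSketch {

    enum Reply { COMMIT, ROLLBACK, NONE }

    /**
     * Decides how a rebooted coordinator answers a re-sent prepared message.
     *
     * @param txId     the transaction the participant belongs to
     * @param loggedTx transactions with a log entry, i.e. those that reached the commit phase
     * @param activeTx transactions started after the reboot and still in progress
     */
    static Reply onPreparedResent(String txId, Set<String> loggedTx, Set<String> activeTx) {
        if (loggedTx.contains(txId)) {
            return Reply.COMMIT;    // all participants had prepared before the crash
        }
        if (activeTx.contains(txId)) {
            return Reply.NONE;      // still running; the normal protocol will answer later
        }
        return Reply.ROLLBACK;      // unknown transaction: presume it aborted
    }

    public static void main(String[] args) {
        Set<String> logged = new HashSet<>(Arrays.asList("tx-1"));
        Set<String> active = new HashSet<>(Arrays.asList("tx-2"));
        System.out.println(onPreparedResent("tx-1", logged, active));
        System.out.println(onPreparedResent("tx-2", logged, active));
        System.out.println(onPreparedResent("tx-3", logged, active));
    }
}
```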

A web service may decide to unilaterally commit or roll back provisional changes associated with a given participant, if configured to time out after a specified length of time without a response. In this situation, the web service should record this action and log a message to persistent storage. When the participant receives a request to commit or roll back, it should throw an exception if its unilateral decision does not match the requested action. The coordinator detects the exception and logs a message marking the outcome as heuristic. It also saves the state of the transaction permanently in the transaction log, to be inspected and reconciled by an administrator.
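The check a participant performs after a unilateral decision can be sketched as below. This is an illustrative model under stated assumptions: the exception type, the participant class, and the coordinate helper are hypothetical names, and a real service would record the unilateral decision in persistent storage rather than in a field.

```java
// Self-contained model of heuristic detection; the names are hypothetical, not XTS types.
class UnilateralDecisionSketch {

    enum Action { COMMIT, ROLLBACK }

    /** Hypothetical exception signalling that the requested outcome cannot be honoured. */
    static class OutcomeMismatchException extends RuntimeException {
        OutcomeMismatchException(String message) {
            super(message);
        }
    }

    static class TimedOutParticipant {
        private Action unilateralDecision;   // null until the participant times out

        /** Called when no response arrived in time; a real service must also log this durably. */
        void onTimeout(Action decision) {
            unilateralDecision = decision;
        }

        /** Called when the coordinator finally requests commit or rollback. */
        void onRequest(Action requested) {
            if (unilateralDecision != null && unilateralDecision != requested) {
                throw new OutcomeMismatchException(
                        "already decided " + unilateralDecision + ", asked to " + requested);
            }
        }
    }

    /** Plays the coordinator: reports whether the outcome must be marked as heuristic. */
    static boolean isHeuristic(TimedOutParticipant participant, Action requested) {
        try {
            participant.onRequest(requested);
            return false;
        } catch (OutcomeMismatchException e) {
            System.err.println("heuristic outcome: " + e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) {
        TimedOutParticipant participant = new TimedOutParticipant();
        participant.onTimeout(Action.ROLLBACK);
        System.out.println(isHeuristic(participant, Action.COMMIT));
        System.out.println(isHeuristic(participant, Action.ROLLBACK));
    }
}
```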

WS-AT participants associated with a transactional web service do not need to be involved in crash recovery if the Web service's host machine crashes before the participant is told to prepare. The coordinator will assume that the transaction has aborted, and the Web service can discard any information associated with unprepared transactions when it reboots.

When a participant is told to prepare , the Web service is expected to save to persistent storage the transactional state it needs to commit or roll back the transaction. The specific information it needs to save is dependent on the implementation and business logic of the Web Service. However, the participant must save this state before returning a Prepared vote from the prepare call. If the participant cannot save the required state, or there is some other problem servicing the request made by the client, it must return an Aborted vote.

The XTS participant services running on a Web Service's host machine cooperate with the Web service implementation to facilitate participant crash recovery. These participant services are responsible for calling the participant's prepare , commit , and rollback methods. The XTS implementation tracks the local state of every enlisted participant. If the prepare call returns a Prepared vote, the XTS implementation ensures that the participant state is logged to the local transaction log before forwarding a prepared message to the coordinator.

A participant log record contains information identifying the participant, its transaction, and its coordinator. This is enough information to allow the rebooted XTS implementation to reinstate the participant as active and to continue communication with the coordinator, as though the participant had been enlisted and driven to the prepared state. However, a participant instance is still necessary for the commit or rollback process to continue.

Full recovery requires the log record to contain information needed by the Web service which enlisted the participant. This information must allow it to recreate an equivalent participant instance, which can continue the commit process to completion, or roll it back if some other Web Service fails to prepare . This information might be as simple as a String key which the participant can use to locate the data it made persistent before returning its Prepared vote. It may be as complex as a serialized object tree containing the original participant instance and other objects created by the Web service.

If a participant instance implements the relevant interface, the XTS implementation will append this participant recovery state to its log record before writing it to persistent storage. In the event of a crash, the participant recovery state is retrieved from the log and passed to the Web Service which created it. The Web Service uses this state to create a new participant, which the XTS implementation uses to drive the transaction to completion. Log records are only deleted after the participant's commit or rollback method is called.

When a Business Activity participant web service completes its work, it may want to save the information which will be required later to close or compensate actions performed during the activity. The XTS implementation automatically acquires this information from the participant as part of the completion process and writes it to a participant log record. This ensures that the information can be restored and used to recreate a copy of the participant even if the web service container crashes between the complete and close or compensate operations.

For a Participant Completion participant, this information is acquired when the web service invokes the completed method of the BAParticipantManager instance returned from the call which enlisted the participant. For a Coordinator Completion participant this occurs immediately after the call to its completed method returns. This assumes that the completed method does not throw an exception or call the participant manager's cannotComplete or fail method.

A participant may signal that it is capable of performing recovery processing by implementing the java.lang.Serializable interface. An alternative is to implement the PersistableATParticipant interface shown in Example 4.4, “ PersistableATParticipant Interface ”.


If a participant implements the Serializable interface, the XTS participant services implementation uses the serialization API to create a version of the participant which can be appended to the participant log entry. If it implements the PersistableATParticipant interface, the XTS participant services implementation calls the getRecoveryState method to obtain the state to be appended to the participant log entry.

If neither of these APIs is implemented, the XTS implementation logs a warning message and proceeds without saving any recovery state. In the event of a crash on the host machine for the Web service during commit, the transaction cannot be recovered and a heuristic outcome may occur. This outcome is logged on the host running the coordinator services.
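The choice between the two mechanisms can be illustrated with a short, self-contained sketch. It does not use the XTS classes: PersistableParticipant is a hypothetical stand-in for the PersistableATParticipant interface, the two participant classes are invented for the example, and the order in which the two interfaces are checked is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

// Self-contained sketch of how recovery state might be obtained; not the XTS implementation.
class RecoveryStateSketch {

    /** Hypothetical stand-in for the PersistableATParticipant interface. */
    interface PersistableParticipant {
        byte[] getRecoveryState() throws Exception;
    }

    /** Participant that saves only a key locating the data it persisted at prepare time. */
    static class KeyedParticipant implements PersistableParticipant {
        private final String bookingKey;

        KeyedParticipant(String bookingKey) {
            this.bookingKey = bookingKey;
        }

        @Override
        public byte[] getRecoveryState() {
            return bookingKey.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Participant that is saved as a whole using Java serialization. */
    static class SerializableParticipant implements Serializable {
        private static final long serialVersionUID = 1L;
        final String bookingKey;

        SerializableParticipant(String bookingKey) {
            this.bookingKey = bookingKey;
        }
    }

    /** Returns the bytes to append to the participant log record, or null if there are none. */
    static byte[] recoveryState(Object participant) {
        try {
            if (participant instanceof PersistableParticipant) {
                return ((PersistableParticipant) participant).getRecoveryState();
            }
            if (participant instanceof Serializable) {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                    out.writeObject(participant);
                }
                return bytes.toByteArray();
            }
        } catch (Exception e) {
            throw new IllegalStateException("could not obtain recovery state", e);
        }
        // Neither mechanism is available: warn and carry on without recovery state.
        System.err.println("warning: participant provides no recovery state");
        return null;
    }

    public static void main(String[] args) {
        System.out.println(recoveryState(new KeyedParticipant("seat-42")).length);
        System.out.println(recoveryState(new SerializableParticipant("seat-42")).length > 0);
        System.out.println(recoveryState(new Object()) == null);
    }
}
```

Saving a short key, as KeyedParticipant does, keeps the log record small and leaves the bulk of the state in the service's own persistent store.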

A Web service must register with the XTS implementation when it is deployed, and unregister when it is undeployed, in order to participate in recovery processing. Registration is performed using class XTSATRecoveryManager defined in package org.jboss.jbossts.xts.recovery.participant.at .


The Web service must provide an implementation of interface XTSATRecoveryModule in package org.jboss.jbossts.xts.recovery.participant.at, as an argument to the register and unregister calls. This instance identifies saved participant recovery records and recreates new, recovered participant instances:


If a participant's recovery state was saved using serialization, the recovery module's deserialize method is called to recreate the participant. Normally, the recovery module is required to read, cast, and return an object from the supplied input stream. If a participant's recovery state was saved using the PersistableATParticipant interface, the recovery module's recreate method is called to recreate the participant from the byte array it provided when the state was saved.

The XTS implementation cannot identify which participants belong to which recovery modules. A module only needs to return a participant instance if the recovery state belongs to the module's Web service. If the participant was created by another Web service, the module should return null . The participant identifier, which is supplied as argument to the deserialize or recreate method, is the identifier used by the Web service when the original participant was enlisted in the transaction. Web Services participating in recovery processing should ensure that participant identifiers are unique per service. If a module recognizes that a participant identifier belongs to its Web service, but cannot recreate the participant, it should throw an exception. This situation might arise if the service cannot associate the participant with any transactional information which is specific to the business logic.

Even if a module relies on serialization to create the participant recovery state saved by the XTS implementation, it still must be registered by the application. The deserialization operation must employ a class loader capable of loading classes specific to the Web service. XTS fulfills this requirement by devolving responsibility for the deserialize operation to the recovery module.
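The contract described above (return null for another service's participant, return an instance for your own, throw if your own cannot be recreated) can be sketched as follows. This is an illustration using local stand-in types only; `ReservationRecoveryModuleSketch`, `RecoveredReservation`, and the identifier prefix are assumptions and are not part of the XTS API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical recovered participant; a real one would implement the XTS participant interface.
class RecoveredReservation {
    final String id;
    final String seat;

    RecoveredReservation(String id, String seat) {
        this.id = id;
        this.seat = seat;
    }
}

// Sketch of a recovery module's recreate logic, modelled without XTS types.
class ReservationRecoveryModuleSketch {
    // Assumed convention: every identifier this service enlists starts with a unique prefix.
    static final String ID_PREFIX = "reservation-service:";

    RecoveredReservation recreate(String id, byte[] recoveryState) {
        if (!id.startsWith(ID_PREFIX)) {
            return null; // created by some other Web service, so it is not ours to recover
        }
        if (recoveryState == null || recoveryState.length == 0) {
            // Recognized as ours, but the state cannot be turned back into a participant.
            throw new IllegalStateException("cannot recreate participant " + id);
        }
        return new RecoveredReservation(id, new String(recoveryState, StandardCharsets.UTF_8));
    }
}
```

A service-specific prefix is one simple way to keep identifiers unique per service, which is what lets each module decide quickly whether a record belongs to it.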

The WS-BA coordination service implementation tracks the status of each participant in an activity as the activity progresses through completion and closure. A transition point occurs during closure, once all CoordinatorCompletion participants receive a complete message and respond with a completed message. At this point, all ParticipantCompletion participants should have sent a completed message. The coordinator writes a log record storing the details of each participant, and indicating that the transaction is ready to close. If the coordinator service crashes after the log record is written, the close operation is still guaranteed to be successful. The coordinator checks the log after the system reboots and re-sends a close message to all participants. After all participants respond to the close with a closed message, the coordinator can safely delete the log entry.

The coordinator does not need to account for any close messages sent before the crash, nor resend messages if it crashes several times. The XTS participant implementation is resilient to redelivery of close messages. Assuming that the participant has implemented the recovery functions described below, the coordinator can even guarantee delivery of close messages if both it, and one or more of the participant service hosts, crash simultaneously.
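The interplay between the coordinator's log record and redelivery-tolerant participants can be sketched in a few lines. This is a simplified model, not the XTS coordinator: the in-memory map stands in for a durable log, and all class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical participant that tolerates redelivery of close messages.
class ClosableParticipant {
    private boolean closed;
    private int closeMessagesSeen;

    // Always answers "closed", whether or not this is the first close message.
    String close() {
        closeMessagesSeen++;
        closed = true;
        return "closed";
    }

    boolean isClosed() { return closed; }
    int closeMessagesSeen() { return closeMessagesSeen; }
}

// Stand-in for the coordinator's log; a real log must survive a crash.
class CloseLogSketch {
    private final Map<String, List<ClosableParticipant>> records = new LinkedHashMap<>();

    // Written once all participants have completed and the activity is ready to close.
    void writeReadyToClose(String activityId, List<ClosableParticipant> participants) {
        records.put(activityId, new ArrayList<>(participants));
    }

    // After a restart: re-send close to every logged participant, delete the record when all reply.
    void recover() {
        Iterator<List<ClosableParticipant>> it = records.values().iterator();
        while (it.hasNext()) {
            boolean allClosed = true;
            for (ClosableParticipant p : it.next()) {
                allClosed &= "closed".equals(p.close());
            }
            if (allClosed) {
                it.remove();
            }
        }
    }

    int pendingRecords() { return records.size(); }
}
```

Because the participant treats a repeated close as harmless, the coordinator does not need to remember which close messages were already sent before it crashed.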

If the coordination service crashes before it has written the log record, it does not need to explicitly compensate any completed participants. The presumed abort protocol ensures that all completed participants are eventually sent a compensate message. Recovery must be initiated from the participant side.

A log record does not need to be written when an activity is being canceled. If a participant does not respond to a cancel or compensate request, the coordinator logs a warning and continues. The combination of the presumed abort protocol and participant-led recovery ensures that all participants eventually get canceled or compensated, as appropriate, even if the participant host crashes.

If a completed participant does not detect a response from its coordinator after resending its completed response a suitable number of times, it switches to sending getstatus messages, to determine whether the coordinator still knows about it. If a crash occurs before writing the log record, the coordinator has no record of the participant when the coordinator restarts, and the getstatus request returns a fault. The participant recovery manager automatically compensates the participant in this situation, just as if the activity had been canceled by the client.

After a participant crash, the participant recovery manager detects the log entries for each completed participant. It sends getstatus messages to each participant's coordinator host, to determine whether the activity still exists. If the coordinator has not crashed and the activity is still running, the participant switches back to resending completed messages, and waits for a close or compensate response. If the coordinator has also crashed or the activity has been canceled, the participant is automatically compensated.
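The participant-side decisions described above reduce to two small rules, sketched here. The enum and class names are hypothetical, and the resend limit of 3 is an assumption (the text only says "a suitable number of times").

```java
// Possible outcomes of a getstatus request, as described in the text (names are hypothetical).
enum GetStatusReply { ACTIVITY_KNOWN, UNKNOWN_PARTICIPANT_FAULT }

// What the participant recovery manager does next.
enum RecoveryAction { RESEND_COMPLETED, COMPENSATE }

class ParticipantRecoverySketch {
    // Assumed limit on how many times "completed" is resent before probing with getstatus.
    static final int MAX_COMPLETED_RESENDS = 3;

    // Keep resending "completed" until the limit is reached, then switch to "getstatus".
    static String nextMessage(int completedResendsSoFar) {
        return completedResendsSoFar < MAX_COMPLETED_RESENDS ? "completed" : "getstatus";
    }

    // A known activity means the coordinator will still answer; a fault means it has no record.
    static RecoveryAction onGetStatusReply(GetStatusReply reply) {
        return reply == GetStatusReply.ACTIVITY_KNOWN
                ? RecoveryAction.RESEND_COMPLETED
                : RecoveryAction.COMPENSATE;
    }
}
```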

A Web service must register with the XTS implementation when it is deployed, and unregister when it is undeployed, so it can take part in recovery processing.

Registration is performed using the XTSBARecoveryManager , defined in the org.jboss.jbossts.xts.recovery.participant.ba package.


The Web service must provide an implementation of the XTSBARecoveryModule interface in the org.jboss.jbossts.xts.recovery.participant.ba package, as an argument to the register and unregister calls. This instance identifies saved participant recovery records and recreates new, recovered participant instances:


If a participant's recovery state was saved using serialization, one of the recovery module's deserialize methods is called, so that it can recreate the participant. Which method to use depends on whether the saved participant implemented the ParticipantCompletion protocol or the CoordinatorCompletion protocol. Normally, the recovery module reads, casts and returns an object from the supplied input stream. If a participant's recovery state was saved using the PersistableBAParticipant interface, one of the recovery module's recreate methods is called, so that it can recreate the participant from the byte array provided when the state was saved. The method to use depends on which protocol the saved participant implemented.

The XTS implementation does not track which participants belong to which recovery modules. A module is only expected to return a participant instance if it can identify that the recovery state belongs to its Web service. If the participant was created by some other Web service, the module should return null . The participant identifier supplied as an argument to the deserialize or recreate calls is the identifier used by the Web service when the original participant was enlisted in the transaction. Web Services which participate in recovery processing should ensure that the participant identifiers they employ are unique per service. If a module recognizes a participant identifier as belonging to its Web service, but cannot recreate the participant, it throws an exception. This situation might arise if the service cannot associate the participant with any transactional information specific to business logic.

A module must be registered by the application, even when it relies upon serialization to create the participant recovery state saved by the XTS implementation. The deserialization operation must employ a class loader capable of loading Web service-specific classes. The XTS implementation achieves this by delegating responsibility for the deserialize operation to the recovery module.

When a BA participant completes, it is expected to commit changes to the web service state made during the activity. The web service usually also needs to persist these changes to a local storage device. This leaves open a window where the persisted changes may not be guarded with the necessary compensation information. The web service container may crash after the changes to the service state have been written but before the XTS implementation is able to acquire the recovery state and write a recovery log record for the participant. Participants may close this window by employing a two phase update to the local store used to persist the web service state.

A participant which needs to persist changes to local web service state should implement interface ConfirmCompletedParticipant in package com.arjuna.wst11 . This signals to the XTS implementation that it expects confirmation after a successful write of the participant recovery record, allowing it to roll forward provisionally persisted changes to the web service state. Delivery of this confirmation can be guaranteed even if the web service container crashes after writing the participant log record. Conversely, if a recovery record cannot be written because of a fault or a crash prior to writing, the provisional changes can be guaranteed to be rolled back.


When the participant is ready to complete, it should prepare its persistent changes by temporarily locking access to the relevant state in the local store and writing the changed data to disk, retaining both the old and new versions of the service state. For a Participant Completion participant, this prepare operation should be done just before calling the participant manager's completed method. For a Coordinator Completion participant, it should be done just before returning from the call to the participant's completed method. After writing the participant log record, the XTS implementation calls the participant's confirmCompleted method, providing value true as the argument. The participant should respond by installing the provisional state changes and releasing any locks. If the log record cannot be written, the XTS implementation calls the participant's confirmCompleted method, providing value false as the argument. The participant should respond by restoring the original state values and releasing any locks.

If a crash occurs before the call to confirmCompleted , the application's recovery module can make sure that the provisional changes to the web service state are rolled forward or rolled back as appropriate. The web service must identify all provisional writes to persistent state before it starts serving new requests or processing recovered participants. It must reobtain any locks required to ensure that the state is not changed by new transactions. When the recovery module recovers a participant from the log, its compensation information is available. If the participant still has prepared changes, the recovery code must call confirmCompleted , passing value true. This allows the participant to finish the complete operation. The XTS implementation then forwards a completed message to the coordinator, ensuring that the participant is subsequently notified either to close or to compensate. At the end of the first recovery scan, the recovery module may find some prepared changes on disk which are still unaccounted for. This means that the participant recovery record is not available. The recovery module should restore the original state values and release any locks. The XTS implementation responds to coordinator requests regarding the participant with an unknown participant fault, forcing the activity as a whole to be rolled back.
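The two-phase update can be illustrated with a single integer as the service state. This is a minimal in-memory sketch under stated assumptions: `CounterStoreSketch` is hypothetical, it does not implement the real ConfirmCompletedParticipant interface, and a real service would write both the old and the provisional value to disk.

```java
// Hypothetical service state with a prepare step and a confirmCompleted-style callback.
class CounterStoreSketch {
    private int committed;        // value visible to other requests
    private Integer provisional;  // prepared value, or null when nothing is prepared
    private boolean locked;

    // Prepare: lock the state and keep both the old and the new value.
    void prepare(int newValue) {
        if (locked) {
            throw new IllegalStateException("state is locked by another prepared change");
        }
        locked = true;
        provisional = newValue;
    }

    // true: the recovery record was written, so install the change.
    // false: the record could not be written, so keep the original value.
    void confirmCompleted(boolean confirmed) {
        if (provisional == null) {
            return; // nothing prepared, for example a repeated call
        }
        if (confirmed) {
            committed = provisional;
        }
        provisional = null;
        locked = false;
    }

    int value() { return committed; }
    boolean isLocked() { return locked; }
}
```

For a Participant Completion participant, `prepare` would run just before calling the participant manager's completed method. After a crash, a recovery module would call `confirmCompleted(true)` for prepared changes that have a recovery record and `confirmCompleted(false)` for those that do not.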

The basic building blocks of a transactional Web Services application include the application itself, the Web services that the application consumes, the Transaction Manager, and the transaction participants which support those Web services. Although it is likely that different developers will be responsible for each piece, the concepts are presented here so that you can see the whole picture. Often, developers produce services, or applications that consume services, and system administrators run the transaction-management infrastructure.

There are multiple quickstarts provided in the Narayana GitHub repository which should give you a better understanding of how to use our software. This chapter gives you a brief overview of where to find them and what technologies they demonstrate.

Quickstart URL: https://github.com/jbosstm/quickstart/tree/5.5.2.Final/XTS/wsat-jta-multi_service

This quickstart uses JTA to manage WS-AT applications. The quickstart is composed of a client (the test) and two Web services (FirstServiceAT and SecondServiceAT). Both services are invoked by the test from within the same JTA transaction.

The Client begins a JTA transaction and then invokes an operation on each service. Transaction context propagation is enabled by default. Therefore XTS automatically bridges the JTA transaction to a WS-AT transaction before each invocation is made.

Each service uses JPA to persist its data (the value of a counter). Therefore, the service class is annotated with javax.ejb.TransactionAttribute, which tells XTS to automatically bridge the WS-AT transaction to JTA.

Quickstart URL: https://github.com/jbosstm/quickstart/tree/5.5.2.Final/XTS/wsat-jta-multi_hop

This quickstart uses JTA to manage WS-AT applications. The quickstart is composed of a client (the test) and two Web services (FirstServiceAT and SecondServiceAT).

The Client begins a JTA transaction and then invokes an operation on FirstServiceAT. Transaction context propagation is enabled by default. Therefore XTS automatically bridges the JTA transaction to a WS-AT transaction before the invocation is made.

FirstServiceAT uses JPA to persist its data. Therefore, the service class is annotated with javax.ejb.TransactionAttribute, which tells XTS to automatically bridge the WS-AT transaction to JTA. The FirstServiceAT Web service updates some local data and then invokes the SecondServiceAT Web service.

As when invoking FirstServiceAT, the JTA transaction is bridged to a WS-AT transaction when invoking SecondServiceAT. SecondServiceAT also uses JPA for persistence, so the incoming WS-AT transaction is again bridged to JTA.

Quickstart URL: https://github.com/jbosstm/quickstart/tree/5.5.2.Final/XTS/ssl

This example walks you through the steps required to set up two servers (client and server) that communicate via Web services over a secure connection. The example shows how this can be done for WS-Atomic Transaction, but the same applies for WS-Business Activity.

Quickstart URL: https://github.com/jbosstm/quickstart/tree/5.5.2.Final/XTS/raw-xts-api-demo

This example demonstrates the whole range of XTS possibilities, including WS-AT and WS-BA.

This example uses the Raw XTS API. It is only recommended for scenarios where the WS-AT to JTA integration is not appropriate; or where the Compensating Transactions API support for WS-BA is not appropriate.

Quickstart URL: https://github.com/jbosstm/quickstart/tree/5.5.2.Final/compensating-transactions/non-transactional_resource

This example demonstrates the simple use case of our API for developing applications that use Compensating Transactions. It shows how a non-transactional activity (such as sending an email, or printing a document) can be coordinated in a compensating transaction.

Quickstart URL: https://github.com/jbosstm/quickstart/tree/5.5.2.Final/compensating-transactions/travel-agent

This example demonstrates the more complex use case of our API for developing applications that use Compensating Transactions. It shows how a long running compensating transaction can be composed of a series of short-running ACID transactions. The example also involves multiple organisations and forms a distributed transaction over Web Services.

Transactions provide a structuring mechanism for business logic. Use of transactions allows for grouping of data manipulations into constructs with certain properties. Traditional ACID transactions provide for properties of Atomicity, Consistency, Isolation and Durability.

In JavaEE applications, transaction support is provided via the Java Transaction API (JTA). The classes and interfaces in the javax.transaction and javax.transaction.xa packages provide a means by which the programmer may manage transaction demarcation (begin, commit, rollback) and, where necessary, interact with the transaction management system (e.g. enlistResource). In many JavaEE applications, further abstractions are provided on top of the JTA. For example, EJB3 @TransactionAttribute annotations may be used for transaction boundary demarcation in preference to explicit calls to the JTA's UserTransaction interface.

In distributed applications, the JTA implementation may provide propagation of transaction context and transaction control calls between containers (JVMs) using either a proprietary transport or JTS, the Java mapping of the CORBA OTS standard on an RMI/IIOP transport. In Narayana, both local and distributed (JTS) implementations of the JTA are available.

In Web Services applications, ACID transaction management and interoperable context propagation is provided for by the WS-AT standard. Narayana XTS provides an implementation of both the 1.0 and 1.2 versions of this standard. Bridging is provided only on the more recent version. At the time of writing the standard covers only the web services API and protocol, not the Java API through which the protocol may be driven. Therefore, XTS provides a custom Java API to users, with characteristics broadly similar to the JTA.

For applications that combine traditional JavaEE transaction management and Web Service transaction management, it is often desirable to have some mechanism for linking these transaction types, such that a single transaction may span business logic written for either transaction type. Examples include exposing existing JavaEE transactional business logic (e.g. EJBs) as transactional Web Services, or allowing JavaEE transactional components to utilize transactional Web Services.

We use the term Transaction Bridging to describe the process of linking the JavaEE and Web Services transaction domains. The transaction bridge component (txbridge) of Narayana provides bi-directional linkage, such that either type of transaction may encompass business logic designed for use with the other type.

The technique used by the bridge is a combination of interposition and protocol mapping.

Interposition is used in transaction systems to allow a tree of transaction coordinators to be constructed, usually for performance reasons. Interposed coordinators function as transaction managers for nodes below them in the tree, whilst appearing as resources (participants in WS-AT terminology) to the node above them.

Within a single transaction domain, interposition may be used to allow remote nodes to minimize the number of network calls necessary at transaction termination. The top level node is known as the root coordinator, whilst interposed coordinators are termed subordinate. This name indicates that they are not autonomously responsible for determining the transaction outcome, but rather are driven by their parent coordinator. Therefore, whilst a top level coordinator exposes only the commit and rollback methods for transaction termination and handles the 2PC internally, the subordinates additionally expose the prepare method to their parent, behaving much like resources during the termination protocol.
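The relationship between a root coordinator and a subordinate can be sketched with a small two-phase commit model. This is a teaching sketch, not Narayana code: all types are hypothetical, and error handling, logging, and recovery are omitted.

```java
import java.util.ArrayList;
import java.util.List;

// What a coordinator sees of anything enlisted with it.
interface Resource2PC {
    boolean prepare(); // true = vote to commit
    void commit();
    void rollback();
}

// Leaf resource that records the outcome it was told.
class RecordingResource implements Resource2PC {
    private final boolean vote;
    String outcome = "none";

    RecordingResource(boolean vote) { this.vote = vote; }

    public boolean prepare() { return vote; }
    public void commit() { outcome = "committed"; }
    public void rollback() { outcome = "rolled back"; }
}

// Interposed coordinator: a resource to its parent, a coordinator to its children.
class SubordinateCoordinator implements Resource2PC {
    private final List<Resource2PC> children = new ArrayList<>();

    void enlist(Resource2PC child) { children.add(child); }

    // Exposed to the parent: collect the children's votes without deciding the outcome.
    public boolean prepare() {
        for (Resource2PC child : children) {
            if (!child.prepare()) {
                return false;
            }
        }
        return true;
    }

    public void commit() { children.forEach(Resource2PC::commit); }
    public void rollback() { children.forEach(Resource2PC::rollback); }
}

// Root coordinator: callers see only commit and rollback; 2PC is handled internally.
class RootCoordinator {
    private final List<Resource2PC> resources = new ArrayList<>();

    void enlist(Resource2PC resource) { resources.add(resource); }

    boolean commit() {
        for (Resource2PC resource : resources) {
            if (!resource.prepare()) {
                rollback();
                return false;
            }
        }
        resources.forEach(Resource2PC::commit);
        return true;
    }

    void rollback() { resources.forEach(Resource2PC::rollback); }
}
```

The root makes one prepare call and one commit call to the subordinate regardless of how many resources sit beneath it, which is where the saving in network calls comes from when the subordinate is remote.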


In the transaction bridge, an interposed coordinator is registered into the existing transaction and performs the additional task of protocol mapping. That is, it appears to its parent coordinator to be a resource of its native transaction type, whilst appearing to its children to be a coordinator of their native transaction type, even though these transaction types differ.


The interposed coordinator is responsible for performing mapping between the transaction protocols. There is a strong correspondence between the API and protocol used by the JTA and WS-AT transaction types, which is unsurprising given their common heritage and shared problem domain. However, method signatures, exception types and such do differ. The bridge provides an abstraction layer to mask these distinctions as far as possible.
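As one small illustration of what protocol mapping involves, the sketch below translates the integer result of an XA prepare call into a WS-AT style vote. The `XAResource.XA_OK` and `XAResource.XA_RDONLY` constants are standard JDK API; the `AtVote` enum, the class name, and the choice to treat any other code as an abort are assumptions made for this example and do not reflect the bridge's actual code.

```java
import javax.transaction.xa.XAResource;

// Hypothetical stand-in for the votes a WS-AT participant can return from prepare.
enum AtVote { PREPARED, READ_ONLY, ABORTED }

class VoteMappingSketch {
    // Map the int returned by XAResource.prepare to a WS-AT style vote.
    static AtVote fromXaPrepare(int xaResult) {
        if (xaResult == XAResource.XA_OK) {
            return AtVote.PREPARED;
        }
        if (xaResult == XAResource.XA_RDONLY) {
            return AtVote.READ_ONLY;
        }
        // Assumption: any other code is treated as a failure to prepare.
        return AtVote.ABORTED;
    }
}
```

A real mapping must also translate exceptions (for example, an XAException thrown from prepare) and the commit and rollback phases, which this sketch leaves out.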

The net result of this is that existing business logic perceives its expected transaction environment, even though the transaction in which it is executing may be subordinate to one of a different type. No changes are necessary to existing transactional applications to allow them to operate in the scope of foreign transactions. This facilitates reuse of existing business logic components in new environments and increases the possibilities for new architectures and interoperability.

The transaction bridge resides in the package org.jboss.jbossts.txbridge and its subpackages. It consists of two distinct sets of classes, one for bridging in each direction.

The process of inflowing a WS-AT transaction context on a Web Service call into the container and converting it to a local JTA transaction context such that existing transactional JavaEE code (e.g. EJBs) may be called within its scope, is termed Inbound Transaction Bridging. When using inbound bridging, a parent WS-AT transaction coordinator has a subordinate JTA coordinator interposed into it via the transaction bridge.

The process of outflowing a WS-AT transaction context on a call to a transactional Web Service from a business logic method operating in a JavaEE transaction scope, is termed Outbound Transaction Bridging. When using outbound bridging, a parent JTA transaction coordinator has a subordinate WS-AT coordinator interposed into it via the transaction bridge.

For the purpose of understanding this naming convention, it is simplest to view the JTA as being local to the container in which it operates, whilst the Web Service protocol provides for transaction context propagation between servers. This is an accurate representation of the situation that exists where the local JTA version of Narayana is being used alongside Narayana XTS in an application server. However, it is an oversimplification of the situation where the JTS option is used. We will return to this case later.


The process flow when using the inbound bridge is as follows:

  1. A remote client starts a WS-AT transaction and invokes a transactional Web Service in the scope of that transaction. The inbound WS invocation therefore has SOAP headers containing the WS-AT transaction context. The coordinator used for this transaction is the root coordinator. It may be remote from either or both of the client and the service it is invoking. The client needs access to a WS-AT implementation, but does not need a JTA implementation or the transaction bridge deployed.

  2. The call arrives at a web service container, which must have Narayana JTA or JTS, XTS and the transaction bridge deployed. The JAX-WS handler chain for the web service should have both the XTS WS-AT transaction header processor and the inbound bridge handler registered, such that they are invoked in that order.

  3. The transaction header processor takes the WS-AT transaction context from XML, creates a corresponding WS-AT TxContext and associates it to the Thread. The bridge handler calls the InboundBridgeManager to obtain an InboundBridge instance corresponding to the TxContext.

  4. As the BridgeManager is seeing the TxContext for the first time, it creates a new Bridge instance. It also creates a new Bridge VolatileParticipant and DurableParticipant and registers them with the WS-AT transaction coordinator. These Participants wrap a subordinate JTA transaction.

  5. The bridge header processor starts the bridge, which associates the JTA subordinate transaction context to the Thread. At this point the Thread has transaction contexts for both WS-AT and JTA.

  6. The JAX-WS pipeline processing continues, eventually calling whatever business logic is exposed. This may be e.g. an EJB using JSR-181 annotations. The business logic may use the JTA transaction in the normal manner e.g. enlisting Synchronizations and XAResources or performing other transactional activity either directly or through the usual JavaEE abstractions.

  7. On the return path, the bridge header processor disassociates the JTA transaction context from the Thread via the Bridge. The XTS context processor then does likewise for the WS-AT TxContext.

  8. On subsequent web services calls to the same or other web services from the same client, the process is repeated. However, the BridgeManager will, upon seeing the same WS-AT transaction context again, return the existing Bridge instance and not register further Participant instances. This allows substantially better performance than registering one Participant per web service invocation.

  9. Upon transaction termination by the client, the WS-AT transaction coordinator will drive the enlisted bridge Participants through the transaction termination protocol. The Participants map these calls down to the JTA subtransaction coordinator, which in turn passes them on to any Synchronizations or XAResources enlisted in the transaction. This process is not visible to the business logic, except in so far as it may have registered its own Synchronizations, XAResources or Participants with the transaction.
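The reuse of one bridge per transaction context, described in steps 4 and 8 above, amounts to a keyed cache. The following is a minimal sketch with hypothetical names (`BridgeManagerSketch`, `BridgeSketch`); it counts registrations instead of talking to a real coordinator.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical bridge: ties one WS-AT context to one subordinate JTA transaction.
class BridgeSketch {
    final String wsatContextId;

    BridgeSketch(String wsatContextId) { this.wsatContextId = wsatContextId; }
}

// Stand-in for a bridge manager that creates at most one bridge per context.
class BridgeManagerSketch {
    private final ConcurrentHashMap<String, BridgeSketch> bridges = new ConcurrentHashMap<>();
    private final AtomicInteger participantRegistrations = new AtomicInteger();

    BridgeSketch getBridge(String wsatContextId) {
        // The mapping function runs only the first time a context id is seen.
        return bridges.computeIfAbsent(wsatContextId, id -> {
            participantRegistrations.addAndGet(2); // one volatile and one durable participant
            return new BridgeSketch(id);
        });
    }

    // Called once the transaction has terminated, so the map does not grow without bound.
    void forget(String wsatContextId) { bridges.remove(wsatContextId); }

    int participantRegistrations() { return participantRegistrations.get(); }
}
```

With this shape, ten web service calls in the same transaction cost two participant registrations rather than twenty.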

The process flow when using the outbound bridge is as follows:

  1. A client starts a JTA transaction and invokes a remote transactional Web Service in the scope of that transaction. The client must have Narayana JTA (or JTS) and XTS deployed, as well as the transaction bridge. The coordinator used for the JTA transaction is the root coordinator. The server hosting the target web service needs a WS-AT transaction implementation but not a JTA or the transaction bridge.

  2. The outbound WS invocation flows through a handler chain that has the outbound transaction bridge handler and XTS header context processor registered, such that they are invoked in that order.

  3. The bridge handler calls the outbound bridge manager to obtain an outbound bridge instance corresponding to the JTA transaction context. As the BridgeManager is seeing the context for the first time, it creates a new Bridge instance. It also creates a Synchronization and XAResource instance to wrap the subordinate WS-AT transaction and registers these with the JTA transaction.

  4. The bridge handler starts the bridge, which associates the subordinate WS-AT transaction context to the Thread. The WS-AT header context processor then serializes this into XML in the headers of the outbound Web Services call.

  5. The receiving Web Service sees a WS-AT context and can work with it in the normal manner, without knowing it is a subordinate context.

  6. On the return path, the bridge handler disassociates the WS-AT TxContext from the Thread via the Bridge.

  7. On subsequent calls to the same or other transactional Web Services in the scope of the same JTA transaction, the process is repeated. However, the BridgeManager will, upon seeing the same JTA transaction context again, return the existing Bridge and not register another Synchronization or XAResource with the parent JTA transaction. This allows substantially better performance than registering once per web service invocation.

  8. Upon transaction termination by the client, the JTA transaction coordinator will drive the enlisted bridge Synchronization and XAResource through the transaction termination protocol. The XAResource maps these calls down to the WS-AT subtransaction coordinator, which in turn passes them on to any Volatile or Durable Participants enlisted in the transaction. This process is not visible to the business logic, except in so far as it may have registered its own Participants, XAResources or Synchronizations with the transaction.
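Steps 4 and 6 above describe associating a subordinate context with the calling thread for the duration of one invocation and removing it on the return path. A minimal sketch of that pattern follows; `ThreadContextSketch` is hypothetical, and a plain string stands in for the WS-AT TxContext.

```java
import java.util.function.Supplier;

// Hypothetical holder for the context associated with the current thread.
class ThreadContextSketch {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    // Outbound path: associate the subordinate context with the thread.
    static void start(String subordinateContextId) { CURRENT.set(subordinateContextId); }

    // Return path: disassociate the context from the thread.
    static void stop() { CURRENT.remove(); }

    static String current() { return CURRENT.get(); }

    // Wrap one invocation so that the context is removed even if the call fails.
    static <T> T invoke(String subordinateContextId, Supplier<T> call) {
        start(subordinateContextId);
        try {
            return call.get();
        } finally {
            stop();
        }
    }
}
```

The `finally` block matters: a context left on a pooled thread after a failed call could leak into an unrelated request that later runs on the same thread.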

The bridge includes independent crash recovery systems for the inbound and outbound sides. These are automatically installed and activated as part of the bridge deployment. They rely upon the recovery mechanisms in the JTA and XTS components, which are likewise deployed and activated by default as part of their respective components.

It is the responsibility of the application(s) to use suitable XAResources (inbound) or DurableParticipants (outbound). In general the former will be from XA datasources or messaging systems, whilst the latter will be custom implementations. In either case it is important to ensure recovery is correctly configured for the resource manager(s) before using them in production, via the bridge or otherwise. The Narayana documentation set details crash recovery configuration, as does the application server administration guide. For resource manager specific information e.g. Oracle db permissions settings for recovery connections, please consult the vendor's documentation.

A bridged transaction will involve several distinct log writes, potentially on multiple hosts. Resolving the transaction may require more than one crash recovery cycle, due to ordering constraints on the events taking place during recovery. If a transaction fails to recover after all servers have been restored to service for more than two recovery cycles' duration, the Narayana objectstore browser and server logs may be useful for diagnosing the issue. Where a transaction involves multiple bridges the number of recovery cycles required to resolve it may further increase. For systems requiring maximum availability it is therefore not recommended to span a transaction through more than one bridge.

Note that the 1PC commit optimization should not be used with outbound bridged transactions in which the subordinate may contain more than one Participant. Even where only one Participant is used, crash recovery logs may not correctly reflect the actual transaction outcome. The 1PC optimization is on by default and may be disabled by setting <property name="commitOnePhase">false</property> on CoordinatorEnvironmentBean.

See the 'Design Notes' appendix for detailed information on potential crash recovery scenarios and how each is handled.

In distributed environments that utilize transaction bridging, it is possible to construct arrangements of servers such that a transaction context passes through more than one interposition. These can give rise to some undesirable issues, including locking and performance problems.

A simple case would be a loop in which a JTA transaction context is bridged outbound to a WS-AT context, passed through one or more remote servers and inflowed back to the original server through an inbound bridge. This may result in a new subordinate JTA context, rather than reuse of the existing parent context in the original server.

This situation has two main observable effects. Firstly, the parent JTA transaction and indirectly subordinate JTA transaction are considered distinct and XAResources may not be shared between them. In most cases this will cause isolation between the transactions, such that they do not share locks or see each other's changes. This may cause deadlocks in the application. Secondly, performance will be poor relative to reuse of the original context, particularly if the interposition chain becomes long.

A similar problem exists where a transaction context is propagated from a single source to a single destination server via two or more separate routes, the abstract paths forming a diamond shape. In such case the intermediate nodes operate independently and will bridge the original context to two separate interposed contexts. To the destination server these will appear unrelated, rather than as representations of the same transaction. Thus instead of recombining into a single shared transaction context at the destination, they will behave as different transactions, giving rise once again to potential deadlock and performance issues.

These problems may be partially addressed by having a shared context mapping service available on the network, which each bridge consults when working with a previously unseen transaction context for the first time. Using such a mechanism, bridge instances may identify transactions for which an established mapping already exists and reuse that relationship rather than creating a new one.
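The lookup-or-create behavior described above can be sketched as follows. This is only an illustration: the class name, the string identifiers and the in-memory map are assumptions standing in for a networked service, which is not implemented in the transaction bridge.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for a shared context mapping service. */
public class ContextMappingSketch {
    // Key: id of the incoming (parent) context. Value: id of the interposed context.
    private final Map<String, String> mappings = new ConcurrentHashMap<>();

    /** Returns the established mapping, creating one only the first time a context is seen. */
    public String lookupOrCreate(String parentContextId) {
        return mappings.computeIfAbsent(parentContextId,
                id -> "subordinate-" + UUID.randomUUID());
    }

    public static void main(String[] args) {
        ContextMappingSketch service = new ContextMappingSketch();
        // Two bridges on different routes of a diamond consult the same service.
        String viaRouteA = service.lookupOrCreate("tx-1");
        String viaRouteB = service.lookupOrCreate("tx-1");
        System.out.println(viaRouteA.equals(viaRouteB)); // prints "true"
    }
}
```

Because both routes resolve to the same interposed context, the destination would see one transaction rather than two unrelated ones.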

This shared service model does however cause some issues of its own with regard to performance and availability. It is not currently implemented. Therefore, users are urged to be cautious when constructing distributed applications. Whilst location abstraction is sometimes desirable, it is important to maintain a clear understanding of the deployment relationships between transactional components in the system.

The JavaEE transaction engine in Narayana comes in two varieties. These are the local-only JTA, which does not support propagation of transaction context or transaction control calls between JVMs, and the JTAX, which provides the JTA API implemented by a JTS engine that does support distributed usage.

WildFly Application Server uses the local JTA implementation by default, but can be reconfigured to use the JTS via the JTA API, such that it supports distributed transactions without requiring any changes to business applications.

In environments requiring transaction propagation of JTA transactions, it is feasible to use either the JTS or an outbound and inbound bridge pair to achieve this. In the former case the transport is RMI/IIOP for the transaction control and RMI/IIOP or JRMP for the transactional business logic calls. In the latter case the transport is Web Services for both transaction control and business logic.

From a transaction management perspective the JTS solution is preferred, due to simplicity (no protocol mapping is needed), maturity (Narayana JTS was the world's first JTS implementation and has been extensively used and tested in production environments) and performance (binary vs. xml).

It is possible to use transactions that propagate context on some calls via JTS and on others via Web Services, such as a client invoking both EJBs via RMI/IIOP and Web services with WS-AT context. In such cases it's possible for a transaction to have multiple representations that the infrastructure cannot determine are related, even if they actually represent different contexts in the same interposition hierarchy. Care must therefore be taken to avoid the problems described previously in 'Loops and Diamonds'.

The current transaction bridge release has the following limitations:

This section records key design points relating to the bridge implementation. The target audience for this section is software engineers maintaining or extending the transaction bridge implementation. It is unlikely to contain material useful to users, except in so far as they wish to contribute to the project. An in-depth knowledge of Narayana internals may be required to make sense of some parts of this appendix.

The txbridge is written as far as possible as a user application layered on top of the JTA and XTS implementations. It accesses these underlying components through standard or supported APIs as far as possible. For example, XAResource is favored over AbstractRecord, the JCA standard XATerminator is used for driving subordinates and so on. This facilitates modularity and portability.

It follows that functionality required by the bridge should first be evaluated for inclusion in one of the underlying modules, as experience has shown it is often also useful for other user applications. For example, improvements to allow subordinate termination code portability between JTA and JTS, and support for subordinate crash recovery have benefited from this approach. The txbridge remains a thin layer on top of this functionality, containing only purpose specific code.

The 'loops and diamonds' problem boils down to providing deterministic, bi-directional 1:1 mapping between an Xid (which is fixed length) and a WS-AT context (which is unbounded length in the spec, although bounded for instances created by the XTS). Consistent hashing techniques get you so far with independent operation, but the only 100% solution is to have a shared service on the network providing the mapping lookup. Naturally this then becomes a single point of failure as well as a scalability issue. For some scenarios it may be possible to use interceptors to propagate the Xid on the web services call as extra data, instead of trying to reproduce the mapping at the other end. Unfortunately XA does not provide for this kind of extensibility, although CORBA does, leading to the possibility of solving the issue without a centralized approach in mixed JTS+WS-AT environments.
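To illustrate why hashing alone only goes so far, the sketch below derives a fixed length Xid from an arbitrary length context identifier using a plain SHA-256 digest. This is an assumption made for illustration, not the scheme used by the txbridge; the class name, format id and branch qualifier are hypothetical. The mapping is deterministic from context to Xid, so independent nodes agree on the result, but nothing recovers the context from the Xid, so it is not bi-directional.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import javax.transaction.xa.Xid;

public class HashedXidSketch {
    static final int SKETCH_FORMAT_ID = 0x5A5A; // hypothetical value, not a Narayana constant

    /** Derives a fixed length Xid from an unbounded length context identifier. */
    public static Xid xidFor(String wsatContextId) {
        final byte[] gtrid = sha256(wsatContextId); // 32 bytes, within Xid.MAXGTRIDSIZE (64)
        final byte[] bqual = {1};                   // arbitrary branch qualifier for the sketch
        return new Xid() {
            @Override public int getFormatId() { return SKETCH_FORMAT_ID; }
            @Override public byte[] getGlobalTransactionId() { return gtrid.clone(); }
            @Override public byte[] getBranchQualifier() { return bqual.clone(); }
        };
    }

    private static byte[] sha256(String s) {
        try {
            return MessageDigest.getInstance("SHA-256")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is mandatory on every Java platform
        }
    }

    public static void main(String[] args) {
        Xid first = xidFor("urn:example:context:1");
        Xid second = xidFor("urn:example:context:1");
        // Same input, same global transaction id, with no coordination between callers.
        System.out.println(Arrays.equals(first.getGlobalTransactionId(),
                second.getGlobalTransactionId())); // prints "true"
    }
}
```

The reverse lookup, from Xid back to the WS-AT context, is the part that still needs either a shared service or the context to be carried alongside the call.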

Requiring a tx context on all calls is a bit limiting, but JBossWS native lacks a WS-Policy implementation. Things may change with the move to CXF. This is really a wider issue with XTS, not just the bridge.

As usual with transactions, it's the crash recovery that provides for the most complexity. Recovery for the inbound and outbound sides is handled independently. Because of event ordering between recovery modules (JTA, XTS), it requires two complete cycles to resolve some of these crash recovery situations.

An inbound transaction involves at least four log writes. Top down (i.e. in reverse order of log creation) these are: the WS-AT coordinator log (assumed here to be XTS, but may be 3rd party), the XTS Participant log in the receiving server, the JCA Subordinate transaction log and at least one XA Resource Manager log (which are 3rd party e.g. Oracle).

There is no separate log created by the txbridge. The XTS Participant log inlines the Serializable BridgeDurableParticipant via its writeObject method. Recorded state includes its identity (the Xid) and the identity of the separately logged JTA subordinate tx (a Uid).

XTS is responsible for the top level coordinator log. Narayana is responsible for the JTA subordinate tx log and 3rd party RMs are each responsible for their own.

The following situations may exist at recovery time, according to the point in time at which the crash occurred:

RM log only: In this case, the InboundBridgeRecoveryManager's XAResourceOrphanFilter implementation will be invoked via Narayana XARecoveryModule, will recognize the orphaned Xids by their formatId (which they inherit from the JCA subordinate, which the txbridge previously created with a specially constructed inflowed Xid) and will vote to have the XARecoveryModule roll them back as no corresponding JCA subordinate log exists, so presumed abort applies.

RM log and JTA subordinate tx log: The InboundBridgeRecoveryManager's scan of indoubt subordinate JTA transactions identifies the JTA subordinate as being orphaned and rolls it back, which in turn causes the rollback of the RM's XAResource.

RM log, JTA subordinate log and XTS Participant log: XTS is responsible for detecting that the Participant is orphaned (by re-sending Prepared to the Coordinator and receiving 'unknown tx' back) and initiating rollback under the presumed abort convention.

WS-AT coordinator log and all downstream logs: The coordinator re-sends Commit to the Participant and the transaction completes.
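The first case above, in which orphaned Xids are recognized by their formatId, can be sketched as follows. This is a simplified model rather than the bridge code: the Vote enum and checkXid method are local stand-ins loosely modeled on the orphan filter idea, the format id value is hypothetical, and the set of known subordinate logs is an assumed substitute for a real object store scan.

```java
import java.util.Arrays;
import java.util.Set;
import javax.transaction.xa.Xid;

public class OrphanFilterSketch {
    public enum Vote { ABSTAIN, ROLLBACK, LEAVE_ALONE }

    public static final int BRIDGE_FORMAT_ID = 0x7B1D; // hypothetical value

    private final Set<String> subordinateLogKeys; // stand-in for the JCA subordinate logs

    public OrphanFilterSketch(Set<String> subordinateLogKeys) {
        this.subordinateLogKeys = subordinateLogKeys;
    }

    public static String key(Xid xid) {
        return Arrays.toString(xid.getGlobalTransactionId());
    }

    /** Votes on an Xid returned by a resource manager's recovery scan. */
    public Vote checkXid(Xid xid) {
        if (xid.getFormatId() != BRIDGE_FORMAT_ID) {
            return Vote.ABSTAIN;      // not created by the bridge, so no opinion
        }
        if (subordinateLogKeys.contains(key(xid))) {
            return Vote.LEAVE_ALONE;  // a subordinate log exists and will drive completion
        }
        return Vote.ROLLBACK;         // no subordinate log, so presumed abort applies
    }

    public static Xid xid(final int formatId, final byte... gtrid) {
        return new Xid() {
            @Override public int getFormatId() { return formatId; }
            @Override public byte[] getGlobalTransactionId() { return gtrid.clone(); }
            @Override public byte[] getBranchQualifier() { return new byte[0]; }
        };
    }

    public static void main(String[] args) {
        Xid logged = xid(BRIDGE_FORMAT_ID, (byte) 1);
        Xid orphan = xid(BRIDGE_FORMAT_ID, (byte) 2);
        Xid foreign = xid(42, (byte) 3);
        OrphanFilterSketch filter = new OrphanFilterSketch(Set.of(key(logged)));
        System.out.println(filter.checkXid(logged));  // LEAVE_ALONE
        System.out.println(filter.checkXid(orphan));  // ROLLBACK
        System.out.println(filter.checkXid(foreign)); // ABSTAIN
    }
}
```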

An outbound transaction involves log writes for the JTA parent transaction and the XTS BridgeWrapper coordinator. There is not a separate log created by the txbridge. The JTA tx log inlines the Serializable BridgeXAResource via its writeObject method. Recorded state includes the JTA tx id and bridgeWrapper id String. In addition a Web Service participating in the subordinate transaction will create a log. Assuming it's XTS, the participant side log will inline any Serializable Durable2PCParticipant, effectively forming the RM log.

The following situations may exist at recovery time, according to the point in time at which the crash occurred:

RM log (i.e. XTS Participant log, inlining Serializable Durable2PCParticipant) only: XTS is responsible for detecting that the Participant is orphaned (its direct parent, the subordinate coordinator, is missing) and rolling it back. The bridge recovery code is not involved – XTS recovery deserializes and drives any app DurableParticipants directly.

RM log and XTS subordinate log: The DurableParticipant(s) (i.e. client side) and XTS subordinate coordinator / BridgeWrapper (i.e. server side) are reinstantiated by XTS. The BridgeWrapper, being subordinate to a missing parent, must be identified and explicitly rolled back by the bridge recovery code. The bridge recovery manager is itself a RecoveryModule, thus invoked periodically to perform this task. It identifies its own BridgeWrapper instances from amongst all those awaiting recovery by means of an id prefix specific to the txbridge code. See JBTM-725 for further details.

RM log, XTS subordinate log and JTA parent log (with inlined BridgeXAResource): Top down recovery by the JTA recovery module drives the tx to completion, taking the normal JTA parent->BridgeXAResource->XTS subordinate->DurableParticipant path. Note that if the bridge is the only XAResource in the parent, the JTA must have 1PC commit optimization disabled or it won't write a log for recovery.
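The id prefix selection used in the second case above can be sketched as a simple filter. The prefix value and the use of plain strings for recovered subordinate ids are assumptions for illustration; the actual identifiers and scan API belong to XTS and the bridge recovery code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class BridgeWrapperScanSketch {
    static final String BRIDGE_ID_PREFIX = "txbridge_"; // hypothetical prefix value

    /** Selects, from all subordinates awaiting recovery, those created by the bridge. */
    public static List<String> ownedByBridge(List<String> awaitingRecovery) {
        return awaitingRecovery.stream()
                .filter(id -> id.startsWith(BRIDGE_ID_PREFIX))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> awaiting = Arrays.asList("txbridge_0:ffff:1", "other_0:ffff:2");
        // Only the bridge's own subordinates are candidates for explicit rollback.
        System.out.println(ownedByBridge(awaiting)); // prints "[txbridge_0:ffff:1]"
    }
}
```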

The test suite for the txbridge is split along two axes. Firstly, the inbound and outbound sides of the bridge have their own test suites in a parallel code package hierarchy. These are largely mirrors, containing tests which have matching intent but different implementation details. Secondly, the tests are split between those for normal execution and those for crash recovery.

The tests use a framework consisting of a basic servlet acting as client (the code pre-dates the availability of XTS lightweight client), a basic web service as server and a set of utility classes implementing the appropriate interfaces (Participant/Synchronization/XAResource). These classes contain the bare minimum of test logic. In order to make the tests as easy to understand and modify as possible, an attempt is made to capture the entirety of the test logic within the junit test function instead of splitting it over the framework classes. To facilitate this, extensive use is made of byteman and its associated dtest library, which provides basic distributed mock-like execution tracing and configuration. You probably need to take a detour and read the dtest docs before proceeding further.

The basic tests all follow the same pattern: make a call through the bridge, following different logic paths in each test, and verify that the test resources see the expected method calls. For example, in a test that runs a transaction successfully, expect to see commit called on enlisted resources and rollback not called. For a test that configures the prepare to fail, expect to see rollback called and commit not called. The tests verify behavior in the presence of 'expected' errors e.g. prepare failures, but generally don't cover unexpected failures e.g. exceptions thrown from commit.
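The verification pattern can be sketched without the real framework. The example below does not use byteman or dtest; it substitutes a hand written recording XAResource and a minimal driver standing in for the transaction manager, both of which are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

public class RecordingResourceSketch implements XAResource {
    public final List<String> calls = new ArrayList<>(); // method calls seen, in order
    private final boolean failPrepare;

    public RecordingResourceSketch(boolean failPrepare) { this.failPrepare = failPrepare; }

    @Override public int prepare(Xid xid) throws XAException {
        calls.add("prepare");
        if (failPrepare) throw new XAException(XAException.XA_RBROLLBACK);
        return XA_OK;
    }
    @Override public void commit(Xid xid, boolean onePhase) { calls.add("commit"); }
    @Override public void rollback(Xid xid) { calls.add("rollback"); }
    @Override public void start(Xid xid, int flags) { }
    @Override public void end(Xid xid, int flags) { }
    @Override public void forget(Xid xid) { }
    @Override public Xid[] recover(int flag) { return new Xid[0]; }
    @Override public boolean isSameRM(XAResource other) { return other == this; }
    @Override public int getTransactionTimeout() { return 0; }
    @Override public boolean setTransactionTimeout(int seconds) { return false; }

    /** Minimal stand-in for the transaction manager: prepare, then commit or roll back. */
    public static void complete(XAResource resource) {
        try {
            boolean prepared;
            try {
                resource.prepare(null); // the sketch ignores the Xid
                prepared = true;
            } catch (XAException expected) {
                prepared = false;       // an 'expected' error: prepare failure
            }
            if (prepared) resource.commit(null, false); else resource.rollback(null);
        } catch (XAException unexpected) {
            throw new IllegalStateException("failure outside prepare", unexpected);
        }
    }

    public static void main(String[] args) {
        RecordingResourceSketch ok = new RecordingResourceSketch(false);
        complete(ok);
        System.out.println(ok.calls);      // [prepare, commit]

        RecordingResourceSketch failing = new RecordingResourceSketch(true);
        complete(failing);
        System.out.println(failing.calls); // [prepare, rollback]
    }
}
```

A test then asserts on the recorded calls: commit present and rollback absent for the successful case, and the reverse when prepare is configured to fail.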

The normal execution test targets in tests/build.xml assume the server is started manually with byteman installed and has XTS, txbridge and the test artifacts deployed. Note that the build file also contains targets that may be called to achieve the last of these steps.

The crash recovery tests start (and subsequently restart) the server automatically, but assume that XTS, txbridge and the test artifacts are deployed. To manage the server they need to be provided with JBOSS_HOME and JAVA_HOME values in the build.xml.