The choreography engine can be thought of as a big distributed program accessing only external
APIs defined in some interface description language. Exactly how these interfaces are implemented
(bound) is a secondary issue.
Below is a diagram of how the choreography engine interacts with everything in TPTP. Note that
every link to and from the choreography engine in the centre is an interface with a specific
definition but no specific implementation (at the time of program creation).
The current choreography engine consists of two main parts. The first part performs
the translation from BPEL into Java source code and compiles the results into Java class
files. The second part is the actual engine implementation which runs these compiled class files.
As shown in the diagram above, the first step in executing the specified BPEL program is for the
choreography component to translate it to Java source code. This Java source code assumes the
existence of a choreography engine “Runner” interface. This “Runner” interface provides the
translated Java source code with facilities to send and receive messages on specified conversation
IDs, create and access shared memory constructs such as shared variables or maps, and to create
and access synchronisation constructs like semaphores, mutexes and barriers. The translated Java
source code uses these APIs to implement the necessary BPEL programming constructs.
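As an illustration only, the kind of facade described above might look like the following sketch. The names RunnerSketch and LocalRunnerSketch, and every method signature, are assumptions made for this example and are not taken from the actual “Runner” interface; the implementation is purely in-memory and exists only to show the three groups of facilities (conversation messaging, shared memory, synchronisation).

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;

/** Hypothetical facade; the real "Runner" interface may look quite different. */
interface RunnerSketch {
    void send(String conversationId, Object message);
    Object receive(String conversationId);
    Map<String, Object> sharedMap(String name);
    Semaphore semaphore(String name, int permits);
}

/** Purely local, in-memory implementation used only to illustrate the facade. */
final class LocalRunnerSketch implements RunnerSketch {
    private final Map<String, BlockingQueue<Object>> conversations = new ConcurrentHashMap<>();
    private final Map<String, Map<String, Object>> maps = new ConcurrentHashMap<>();
    private final Map<String, Semaphore> semaphores = new ConcurrentHashMap<>();

    // One queue per conversation ID, created on first use.
    private BlockingQueue<Object> queue(String conversationId) {
        return conversations.computeIfAbsent(conversationId, k -> new LinkedBlockingQueue<>());
    }

    @Override
    public void send(String conversationId, Object message) {
        queue(conversationId).add(message);
    }

    @Override
    public Object receive(String conversationId) {
        try {
            return queue(conversationId).take(); // blocks until a message arrives
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while receiving", e);
        }
    }

    @Override
    public Map<String, Object> sharedMap(String name) {
        return maps.computeIfAbsent(name, k -> new ConcurrentHashMap<>());
    }

    @Override
    public Semaphore semaphore(String name, int permits) {
        // The permit count only applies the first time a name is requested.
        return semaphores.computeIfAbsent(name, k -> new Semaphore(permits));
    }
}
```

Translated code written against such a facade would not need to know whether the queues, maps and semaphores are local objects or proxies for constructs held on another machine.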
The translated Java source code is then compiled into Java classes using the standard Eclipse
JDT Java compiler. This set of compiled classes is known as the engine “Program” (there is also
a Program class which serves as a container and classloader for these compiled classes).
Once the program is ready, an engine implementation is instantiated to run it. The
“EngineFactory” class instantiates the engine implementation; the program is then passed
to that implementation and launched via the standard Client API interfaces.
The engine implementation takes the program classes and launches the specified “Runners” to
execute the program. It passes in itself as an implementation of the “Runner” interfaces so
that the executing runners have access to the necessary distributed constructs.
The benefit of this architecture is that the engine implementation can easily be replaced without
affecting the BPEL translation.
This potentially allows engines with different strengths and weaknesses to all take advantage of
the BPEL translation just by implementing a set of well defined, simple interfaces.
The main Java engine implementation is a distributed implementation capable of running a
single BPEL process across multiple machines (the threads created in each BPEL flow are
distributed according to a deployment algorithm across the specified set of hosts).
This engine uses a star topology in which a “Controller” coordinates all distributed transactions
between a number of “SubControllers”. The “Controller” is a single point of synchronisation
for all distributed transactions and so manages synchronisation and shared memory across the
runners in all the “SubControllers”.
The “Controller” can be hosted on the same machine as a “SubController”. However, where there
are a large number of “Runners”, or a number of “Runners” that are draining resources because of the
operations they perform within WSDL bound BPEL invocations, hosting the
“Controller” on a separate machine is likely to yield better performance.
This engine is designed to run BPEL processes used for load testing.
A secondary Java engine implementation called the “Mini Engine” provides a more standard
BPEL engine and serves as a good basic example implementation of the choreography engine interfaces.
The Mini Engine is a completely local implementation and does not distribute the BPEL process
in any way. This implementation has certain benefits over the main engine implementation such
as a greatly reduced startup time and increased performance in certain conditions (e.g. message
passing, where this engine does not need to pass messages across a network).
The Main Engine transport layer is designed to be deadlock free.
The layer will be described in terms of two nodes (A and B) communicating point to point and
the connection between them.
Each connection between two nodes can be logically split into 4 parts:
- An up/down stream for outgoing (A-B-A) transactions
- An up/down stream for incoming (B-A-B) transactions
- A stream for A-B asynchronous messages
- A stream for B-A asynchronous messages
NOTE: The asynchronous message streams do not have buffer planes because, in the case of the
Choreography engine, the asynchronous messages are never rerouted; they are point to point messages.
NOTE: The transaction streams do not have ACK based finite buffering because the nature of the
transaction ensures the same thing (transaction response == ACK).
NOTE: The two multiplexers (x3 and x2) could potentially be merged into a single multiplexer (x4),
but the ack-finite-buffer streams are implemented as a generic feature over a single bidirectional stream
and therefore do their own multiplexing.
The nodes’ point to point connection (e.g. a TCP/IP socket) is first multiplexed into 3 separate
bi-directional streams.
One stream is used for outgoing transactions, another is for incoming transactions, and the last is
for asynchronous messaging.
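One common way to multiplex a single connection into several logical streams is to prefix each message with a stream identifier and a length. The sketch below illustrates that idea only; the StreamId and Frame names and the wire layout (1 byte stream id, 4 byte length, payload) are assumptions made for this example, not the engine's actual format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** The three logical streams, named from node A's point of view (hypothetical names). */
enum StreamId { OUTGOING_TRANSACTIONS, INCOMING_TRANSACTIONS, ASYNC_MESSAGES }

/** One multiplexed unit on the wire: [1 byte stream id][4 byte length][payload]. */
final class Frame {
    final StreamId stream;
    final byte[] payload;

    Frame(StreamId stream, byte[] payload) {
        this.stream = stream;
        this.payload = payload;
    }

    /** Multiplexing side: tag the payload with its stream and length. */
    byte[] encode() {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + payload.length);
        buf.put((byte) stream.ordinal());
        buf.putInt(payload.length);
        buf.put(payload);
        return buf.array();
    }

    /** Demultiplexing side: read one frame from the shared connection. */
    static Frame decode(ByteBuffer in) {
        StreamId stream = StreamId.values()[in.get()];
        byte[] payload = new byte[in.getInt()];
        in.get(payload);
        return new Frame(stream, payload);
    }

    String text() {
        return new String(payload, StandardCharsets.UTF_8);
    }
}
```

A reader loop on each node would decode frames one at a time and hand each payload to the handler for its stream, so that traffic on one logical stream is kept separate from traffic on the others.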
The engine contains a pluggable point to point transport layer which provides the point to point connection
in the multiplexed transport layer. Alternative transports can be added via the use of an Eclipse extension
point. The engine's dependency management facilities ensure that the transport implementation is then passed
to the rest of the engine without further installation (where possible).
As standard, the engine comes with a TCP/IP implementation of the point to point Session Transport extension point.
When sending a stream of messages asynchronously, it may be possible to deadlock the underlying transport
layer, depending on that layer’s buffering strategy.
To ensure that there are a finite (and known) number of messages in the underlying layer at any one time,
the outgoing stream initially sends at most N messages and then stops. The other end of the stream, reading the
messages, sends an acknowledgement every N/2 (or some other fraction of N) messages. When the outgoing stream
receives this acknowledgement it allows a further N/2 messages into the stream, thus guaranteeing that there will be
no more than N unacknowledged messages in the underlying transport layer at any time.
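The windowing scheme can be sketched with a counting semaphore that holds the sender's remaining credits. This is a minimal, single-threaded illustration and not the engine's implementation: the class name AckWindowSketch is hypothetical, an in-memory queue stands in for the underlying transport, and the acknowledgement is modelled as a direct release of credits rather than a message travelling back to the sender. In this sketch the number of unread messages in the stand-in transport never exceeds N.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Semaphore;

/** Hypothetical credit-based window: N credits, replenished N/2 at a time. */
final class AckWindowSketch {
    private final int ackEvery;                 // N/2
    private final Semaphore credits;            // messages the sender may still send
    private final Queue<String> transport = new ArrayDeque<>(); // stand-in for the wire
    private int readSinceAck = 0;

    AckWindowSketch(int n) {
        if (n < 2 || n % 2 != 0) {
            throw new IllegalArgumentException("n must be an even number >= 2");
        }
        this.ackEvery = n / 2;
        this.credits = new Semaphore(n);
    }

    /** Sender side: returns false (instead of blocking) when the window is full. */
    boolean trySend(String message) {
        if (!credits.tryAcquire()) {
            return false;
        }
        transport.add(message);
        return true;
    }

    /** Receiver side: reads one message; every N/2 reads it acknowledges. */
    String receive() {
        String message = transport.poll();
        if (message == null) {
            return null;
        }
        if (++readSinceAck == ackEvery) {
            readSinceAck = 0;
            credits.release(ackEvery); // models the ACK reaching the sender
        }
        return message;
    }

    /** Number of messages currently sitting unread in the stand-in transport. */
    int inFlight() {
        return transport.size();
    }
}
```

A real sender would block on the semaphore rather than return false; the non-blocking form is used here only so that the behaviour can be exercised from a single thread.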
The transaction streams in the Choreography engine are used for the majority of communication and
transactions can be rerouted.
The transaction streams do not need ACK based finite buffering because the nature of the transaction
provides this: once a send has occurred, another send cannot occur until the response to the previous send
has been received, thus guaranteeing at most 1 message in the underlying transport at any time. This feature
prevents deadlock of the underlying layer.
The transaction streams do, however, need buffer planes. Buffer planes mean that A can make a transaction
request on B, and then B can make a transaction request on A ad infinitum without deadlock occurring.
This means that however the application layer chains its transactions and whatever the dependencies are, deadlock will not occur.
The asynchronous streams in the Choreography engine are used only for point to point messaging.
The messages sent across these streams are never rerouted.
These streams need ACK based finite buffering for the reasons above.
Because these streams never reroute messages, they do not need buffer planes.