Author: Antony Miguel (antony.miguel@scapatech.com), Last Updated: 12th May 2005

Frequently Asked Questions (FAQ)

General FAQ
Architectural FAQ

Why BPEL / WSDL / XSD?

BPEL, WSDL and XSD are all commonly associated with web services, XML, SOAP, etc., but they have all been designed to be very generic.

BPEL can be seen as a distributed programming language that exercises WSDL endpoints (all I/O in the programming language is via WSDL described and bound endpoints).

The BPEL, WSDL, XSD and XPATH in the engine should not necessarily be associated with XML, SOAP or any of the common web services implementations. Instead, BPEL can be seen simply as a programming language, WSDL as a generic Interface Definition Language (IDL), XSD as a generic type definition language and XPATH as a generic expression evaluation language.

The combination of all four therefore makes up a full-featured programming language.


Is the engine a BPEL interpreter?

No. The engine is a BPEL engine and its purpose in life is to run BPEL processes, but it is not an interpreter. The engine has a compile step in which the BPEL process source is translated into Java classes, which are then run inside a Java engine framework. For more information on this process, see the choreography engine architecture document.


In a load testing BPEL behaviour, how is it possible to ensure each thread (simulated user) has a separate session?

(raised in 13th April choreography component meeting)

All the threads (simulated users) in the BPEL load testing behaviour can make invocations against the same partner link. The address of the endpoint they speak to can be set dynamically, which gives us the flexibility to choose at run time the system under test (SUT) they run against. If we want multiple parallel invocations to speak to the same address but over separate connections, we can use correlation sets. The invocations specify a correlation set, but each thread correlates on a unique ID and therefore gets a separate port instance to use (and therefore a separate, unique session to the SUT).
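
The idea of one session per correlation ID can be pictured with a minimal Java sketch. The names `PortInstance`, `SessionRegistry` and `portFor` are hypothetical and are not taken from the engine; the sketch only assumes that each (address, correlation ID) pair maps to exactly one port instance.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for one connection (session) to the SUT. */
final class PortInstance {
    private static final AtomicInteger NEXT_ID = new AtomicInteger();

    final int sessionId = NEXT_ID.incrementAndGet();
    final String address;

    PortInstance(String address) {
        this.address = address;
    }
}

/** Hypothetical registry: one port instance per (address, correlation ID) pair. */
final class SessionRegistry {
    private final ConcurrentMap<String, PortInstance> ports = new ConcurrentHashMap<>();

    /** Returns the port for this correlation ID, creating it on first use. */
    PortInstance portFor(String address, String correlationId) {
        String key = address + "#" + correlationId;
        return ports.computeIfAbsent(key, k -> new PortInstance(address));
    }
}
```

Two threads that correlate on different IDs receive different `PortInstance` objects even though both target the same address, while repeated calls with the same ID reuse the same instance.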


Won't all the XML processing be slow?

A common reaction to BPEL seems to be that there is so much XML around that parsing all this XML will cause serious performance issues for an engine which is intended for use as a generic but performant behaviour engine (given that it may need to run load tests).

The first point to make is that if the internal structure of the data passed through the APIs that the choreography engine exercises is not specified via XSD and the bindings, then the choreography engine will treat this data as opaque. No parsing will take place and the engine will efficiently pass the data around as a string.

Where the datatypes are mapped to XSD types, the choreography engine may parse the data. The engine can have whatever internal representation of XSD types it likes, as long as this representation can be converted to whatever form is necessary for any given binding. In reality, the engine will not use XML documents as its internal representation of XSD types; it will use Java objects which map to known types.

This means that in the case of the WSIF Java binding, the engine will simply assign data from one Java object into its own internal Java datatypes (so no parsing will take place). In the case of a SOAP/XML binding the engine will have to do some parsing, but only at the point where an endpoint is exercised (e.g. a BPEL activity such as invoke). All other operations, such as calculations (XPATH), will use the internal Java objects and therefore need no parsing. As a further performance improvement over standard parsing, each internal Java object representing an XSD type can have its own specific conversion methods compiled in, allowing optimisations not available with generic XML parsing.
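
As an illustration only, an internal type with its own conversion methods might look like the following Java sketch. The class `XsdInt` and its method names are hypothetical; the sketch assumes an `xsd:int` value is held as a plain Java `int` and is converted to or from its textual form only at a binding boundary.

```java
/** Hypothetical internal representation of an xsd:int value. */
final class XsdInt {
    final int value;

    XsdInt(int value) {
        this.value = value;
    }

    /** Java binding: copy straight from a Java object, no parsing involved. */
    static XsdInt fromJava(Integer v) {
        return new XsdInt(v.intValue());
    }

    /** SOAP/XML binding: parse the textual form once, when the endpoint is exercised. */
    static XsdInt fromLexical(String text) {
        return new XsdInt(Integer.parseInt(text.trim()));
    }

    /** SOAP/XML binding: produce the textual form when a message is sent. */
    String toLexical() {
        return Integer.toString(value);
    }

    /** Expression evaluation works on the Java value directly. */
    XsdInt plus(XsdInt other) {
        return new XsdInt(value + other.value);
    }
}
```

Because the conversion is specific to the type, no generic XML parser is involved in the arithmetic path at all.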


How does the engine distribute a single BPEL process?

(raised in 16th March choreography component meeting)

The choreography engine should be able to cope with executing any behaviour anyone would like to define in TPTP. This is a broad scope but the set of behaviours can be split into two sets: local behaviours and distributed behaviours. The engine must be able to support both local behaviours running on the current machine or in the current JVM and also behaviours such as distributed load tests which utilise many machines in a single behaviour.

To this end, the choreography engine has been designed so that the engine implementation is abstracted and the runners which implement the BPEL speak only to a standardised interface to pass messages etc. This is described in more detail in the Choreography Architecture document.

This also brings up the issue of distributing a BPEL process which represents a distributed load test behaviour. In the simple case, the choreography engine will be able to run a single BPEL process and distribute the parallel parts of that BPEL process (BPEL flows) across multiple machines.

It is desirable also that at some point the user should be able to specify the distribution and placement of the parallel parts of the BPEL process (runner deployment and load balancing).


How does the main engine deploy threads in a single BPEL process?

(raised in 16th March choreography component meeting)

The way distribution works in the engine is that if there is a single BPEL process that contains a flow, then all the activities in that flow are distributed evenly to run in parallel on all the hosts that make up the engine instance. Therefore to run a distributed load test it is only necessary to have and launch a single BPEL process which contains a flow with enough activities to run one activity on each of the hosts in the set of available hosts.
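
A minimal sketch of even distribution follows. It assumes a simple round-robin assignment, which is one way to spread activities evenly and is not necessarily what the engine does; the class `EvenDistribution` and the host and activity names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EvenDistribution {
    /** Assigns each flow activity to a host in round-robin order. */
    static Map<String, List<String>> assign(List<String> activities, List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("no hosts available");
        }
        Map<String, List<String>> placement = new LinkedHashMap<>();
        for (String host : hosts) {
            placement.put(host, new ArrayList<>());
        }
        for (int i = 0; i < activities.size(); i++) {
            String host = hosts.get(i % hosts.size());
            placement.get(host).add(activities.get(i));
        }
        return placement;
    }
}
```

With as many activities as hosts, each host receives exactly one activity, which matches the single-process load test described above.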

However, in BPEL as the specification currently stands, it is necessary to statically define each invocation under the flow. It is not possible to dynamically specify the number of copies of an activity that should run in parallel. This means that if a user wanted to run a load test with 100 users in parallel, they would have to fully specify 100 activities under the BPEL flow.

Issues 4 and 147 in the BPEL issues list are attempts to rectify this. Issue 147 specifies the concept of a parallel foreach, where it would be possible to specify the number of copies of an activity that should occur as an XPATH expression or as a variable. This would allow the user to specify, based on a variable, how many users the load test should contain.

At the moment the engine evenly distributes any parallel threads across whatever hosts are available to the program. Ultimately, the user will be able to specify whether they want activity A in flow F to run on host H1 or H2. When Issue 4/147 is fixed the user should also be able to specify how they want the dynamic number of users to be distributed (e.g. place 75% of users on host H1 and 25% of users on host H2).

To this end, there will be a BPEL extension which the distributed engine can pick up and other engines can ignore. The extension will work such that the activities under a flow can have an element which specifies weighting across machine distributions or even a specific host to run on. This should be an XPATH expression or variable to allow maximum flexibility.

This is a complex method of thread deployment though. For users who do not wish to have this level of complexity or do not wish to have deployment information contained within their BPEL process, users will also be able to specify weighting information in the Launch Configuration that runs their process.

Any thread deployment information present in the BPEL file will override information in the Launch Configuration.
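
The weighting and override rules can be sketched in Java as follows. Everything here is an assumption made for illustration: the class `WeightedPlacement`, integer weights per host, an all-or-nothing override by the BPEL file, and the choice to give any rounding remainder to the first host.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class WeightedPlacement {
    /** Weights from the BPEL file, when present, win over the Launch Configuration. */
    static Map<String, Integer> effectiveWeights(Map<String, Integer> fromBpel,
                                                 Map<String, Integer> fromLaunchConfig) {
        return (fromBpel != null && !fromBpel.isEmpty()) ? fromBpel : fromLaunchConfig;
    }

    /** Splits the given number of users across hosts in proportion to the weights. */
    static Map<String, Integer> split(int users, Map<String, Integer> weights) {
        int total = 0;
        for (int w : weights.values()) {
            total += w;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("weights must sum to a positive value");
        }
        Map<String, Integer> result = new LinkedHashMap<>();
        String first = null;
        int assigned = 0;
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            if (first == null) {
                first = e.getKey();
            }
            int share = (int) ((long) users * e.getValue() / total);
            result.put(e.getKey(), share);
            assigned += share;
        }
        // Rounding down can leave a few users unplaced; give them to the first host.
        result.put(first, result.get(first) + (users - assigned));
        return result;
    }
}
```

With weights of 75 for H1 and 25 for H2, 100 users split into 75 on H1 and 25 on H2.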


How does the main engine manage correlated transactions across deployed threads in a single BPEL process?

Correlations across transactions which involve flows may end up being distributed across multiple machines in the case where the BPEL process is run using the distributed engine.

In some cases one can imagine this might cause problems - e.g. in the case of the WSIF Java binding type where the machine may hold state information not readily accessible to other machines which make up the engine instance.

The solution to this is to categorise WSDL binding types as being either stateful or stateless.

In the case of a stateless binding type (e.g. SOAP) there is no problem in splitting these correlated transactions across multiple machines, as the binding type will contain no extra state and all necessary information will be contained within the WSDL message.

In the case of a stateful binding type (e.g. WSIF Java) the engine must simply restrict the deployment of the threads at the time that the BPEL is translated into Java. If a sequence performs an invocation I1 and then performs a flow in which more invocations are made that are correlated with I1, the child threads (flow elements) must be deployed on the same machine as the parent thread (sequence).
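
The placement rule can be expressed as a small Java sketch. The enum `BindingKind`, the class `CorrelatedPlacement` and the round-robin fallback for stateless bindings are hypothetical and serve only to illustrate the constraint.

```java
import java.util.List;

/** Hypothetical classification of WSDL binding types. */
enum BindingKind { STATELESS, STATEFUL }

final class CorrelatedPlacement {
    /**
     * Picks a host for a child thread (flow element) whose invocations are
     * correlated with an invocation already made by the parent thread.
     */
    static String hostForChild(BindingKind kind, String parentHost,
                               List<String> hosts, int childIndex) {
        if (kind == BindingKind.STATEFUL) {
            // The state lives on the parent's machine, so the child must stay there.
            return parentHost;
        }
        // Stateless: the message carries everything, so the child may run anywhere.
        return hosts.get(childIndex % hosts.size());
    }
}
```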


How does the engine manage extension point dependencies? (e.g. Java ports or WSDL binding type extensions)

Extension points which may require dependencies (e.g. engine point to point transports or WSDL binding types) can export dependencies on JARs within their plugins.

These JARs are broadly split into two categories:

Engine dependencies are JARs which the engine infrastructure depends on, such as JARs required for point to point transport extensions. The engine passes these across to Daemons as necessary so that the Daemon can instantiate a new engine instance including the specified dependencies.

Program dependencies are JARs which the compiled engine Program (Java classes) depends on. These are passed to the compiler so that the Program can be compiled properly, and they are passed around with the Program to remote parts of the engine so that the Program can include them in its classloader. When the Program runs, it can then load any dependent classes without problems.
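
One plausible way to make Program dependencies loadable is a standard `URLClassLoader` built over the dependency JARs. The sketch below assumes this approach; the class `ProgramClassLoaders` and its method are hypothetical, and the engine's actual classloading mechanism may differ.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.util.List;

final class ProgramClassLoaders {
    /** Builds a classloader that can see the Program's dependency JARs. */
    static URLClassLoader forProgram(List<Path> dependencyJars, ClassLoader parent) {
        URL[] urls = new URL[dependencyJars.size()];
        for (int i = 0; i < urls.length; i++) {
            try {
                urls[i] = dependencyJars.get(i).toUri().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("bad JAR path: " + dependencyJars.get(i), e);
            }
        }
        // Classes not found in the parent are looked up in the dependency JARs.
        return new URLClassLoader(urls, parent);
    }
}
```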


How does the engine manage backwards compatibility?

With each change to the distributed engine code, a version is updated in the code which maps roughly to the TPTP version (except finer grained). This is the version of a particular build of the engine.

Given that the engine may evolve in complex ways and older engines cannot be expected to support newer engine capabilities (forwards compatibility), the main point of compatibility is the engine Daemon. The engine Daemon should be viewed as an external API that should be kept as similar as possible between TPTP releases. The benefit of this is that, since the engine Daemon launches separate processes for each part of an engine instance, it can be updated as necessary with a given version JAR which it can then use to instantiate part of a distributed engine instance.

In plain terms, the engine daemon connector checks the supported versions of the daemon it is connected to. If the daemon is not the correct version then the daemon connector sends across an update JAR which the daemon caches and can use to instantiate parts of an engine instance with that version.
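
The version handshake can be sketched as follows. The classes `DaemonJarCache` and `DaemonConnector`, the string version format and the in-memory cache are all hypothetical; a real daemon would sit across a network connection and would probably cache JARs on disk.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical daemon side: caches engine JARs keyed by engine version. */
final class DaemonJarCache {
    private final Map<String, byte[]> jarsByVersion = new HashMap<>();

    boolean supports(String version) {
        return jarsByVersion.containsKey(version);
    }

    void store(String version, byte[] jar) {
        jarsByVersion.put(version, jar.clone());
    }
}

/** Hypothetical connector side: makes sure the daemon can run a given engine version. */
final class DaemonConnector {
    /** Returns true if an update JAR had to be sent to the daemon. */
    static boolean ensureVersion(DaemonJarCache daemon, String version, byte[] engineJar) {
        if (daemon.supports(version)) {
            return false; // the daemon already holds this version
        }
        daemon.store(version, engineJar); // send the update JAR; the daemon caches it
        return true;
    }
}
```

The first connection with a new engine version transfers the JAR; later connections with the same version find it already cached.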

The net effect of this is backwards and forwards compatibility. A user need never update their engine daemon unless a significant change occurs in the daemon code itself.


How can the engine access the workbench APIs such as TPTP model APIs?

The engine will access the workbench APIs in the same way as it would access any other API. The APIs will be wrapped up in WSDL as web service endpoints. The underlying binding which actually accesses these APIs is a separate matter and could be implemented in multiple ways.

The standard TPTP APIs, however, will likely be accessed via a SOAP server running inside the Eclipse Workbench. This SOAP server would be a client-side part of the choreography component which runs specifically to expose TPTP APIs to the choreography engine. WSDL operations which map roughly to the TPTP Java APIs would be implemented in the SOAP server, so that the operations could be used to create TPTP data models and make use of other TPTP APIs.


How can the workbench or user interact with a running process?

If a vendor wished to have interaction with a running BPEL process from the Eclipse workbench there would be two obvious ways to accomplish this.

The first would be for the BPEL process to make a request on the workbench as some kind of web service. One example would be that the workbench could run a SOAP server and the BPEL process could invoke operations on this SOAP server to get interaction from the workbench or user.

The second would be very similar but in the other direction. The BPEL process is capable of acting as a web service and so the BPEL process could have a thread listening as a web service for invocations from the workbench. This approach would potentially have more flexibility as the BPEL process could either block and wait for input from the user or a BPEL flow could be used to listen for input asynchronously.