Re: [ecf-dev] Race conditions with remote provider

Hi Alex,

On 6/6/2011 1:04 PM, Alex Blewitt wrote:
<stuff deleted>
Well, I'm writing a discovery container. So I'm using r-osgi and the hello consumer/producer examples, just with my container instead of ZeroConf. In turn, I'm instantiating the container and connecting to it (with connect(id,connectContext)) in the bundle activator's start method.

I see...that's interesting (using r-osgi to write a discovery API provider).

<stuff deleted>
So I had something like (excuse the exact method calls, this is from memory):

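class Activator implements BundleActivator {
   public void start(BundleContext context) {
     id = IDFactory.getDefault().createID(namespace, new String[] { "a", "b", "c" });
     myContainer = new MyContainer();
     context.registerService(new String[] { IDiscoveryListener.class.getName(), IDiscoveryAdvertiser.class.getName() }, myContainer, null);
   }
}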

The problem was that the ID factory was looking up the namespace (either from the plugin.xml, where I have some links, or from a service I may have registered prior to that call; I think the plugin.xml is how it's finding it) and then using bundle.loadClass() to load the namespace's provider class.

Since the start() method hasn't completed at that point, an external bundle doing bundle.loadClass(some.class.name) fails with some OSGi 'bundle not started' type exception.

I see. Markus Kuppe is the lead committer on the discovery API, so I'll defer to him...but one approach you might take is to register a ServiceFactory rather than registering the locator/advertiser service directly in the start method. This defers the container creation and the call to IDFactory.getDefault() until a consumer first requests the service. (The IDFactory...and the container factory, for that matter...defer the processing of the appropriate extensions until the *first* time that IDFactory.getDefault() is called.)

This (ServiceFactory) is the approach that the jmdns and jslp discovery providers use...as the discovery providers 'want' to make themselves available as soon as the ECF core itself is started...and this can (and does) get mixed up with timing issues (e.g. discovery container creation can sometimes initiate blocking I/O calls...as I expect your connect call probably does...and naturally one wouldn't want to put that into the classload/activator start).

So, for example, here's some code that the jmdns provider uses on start():

final Properties props = new Properties();
props.put(IDiscoveryService.CONTAINER_NAME, NAME);
props.put(Constants.SERVICE_RANKING, new Integer(750));
String[] clazzes = new String[] {IDiscoveryService.class.getName(), IDiscoveryLocator.class.getName(), IDiscoveryAdvertiser.class.getName()};
// this is the usage of ServiceFactory
serviceRegistration = context.registerService(clazzes, new ServiceFactory() {
    private volatile JMDNSDiscoveryContainer jdc;

    /* (non-Javadoc)
     * @see org.osgi.framework.ServiceFactory#getService(org.osgi.framework.Bundle, org.osgi.framework.ServiceRegistration)
     */
    public Object getService(final Bundle bundle, final ServiceRegistration registration) {
        if (jdc == null) {
            try {
                jdc = new JMDNSDiscoveryContainer(); // <-- the jmdns id is created in this constructor
                jdc.connect(null, null);
            } catch (final IDCreateException e) {
                // this catch is for the case when the JMDNSDiscoveryContainer ID creation fails
                Trace.catching(JMDNSPlugin.PLUGIN_ID, JMDNSDebugOptions.EXCEPTIONS_CATCHING, this.getClass(), "getService(Bundle, ServiceRegistration)", e); //$NON-NLS-1$ //$NON-NLS-2$
            } catch (final ContainerConnectException e) {
                Trace.catching(JMDNSPlugin.PLUGIN_ID, JMDNSDebugOptions.EXCEPTIONS_CATCHING, this.getClass(), "getService(Bundle, ServiceRegistration)", e); //$NON-NLS-1$ //$NON-NLS-2$
                // reset so that a later getService call tries again
                jdc = null;
            }
        }
        return jdc;
    }

    /* (non-Javadoc)
     * @see org.osgi.framework.ServiceFactory#ungetService(org.osgi.framework.Bundle, org.osgi.framework.ServiceRegistration, java.lang.Object)
     */
    public void ungetService(final Bundle bundle, final ServiceRegistration registration, final Object service) {
        //TODO-mkuppe we later might want to dispose jSLP when the last!!! consumer ungets the service
        //Though don't forget about the (ECF) Container which might still be in use
    }
}, props);

Hopefully this will help with your case. Discovery providers are naturally kind of tricky...since they typically 'want' to be started very early (upon start of ECF itself), while still having lazy processing of the ECF extensions (id factories and container factories). The ServiceFactory supports this without causing timing, classload, or bundle-state exceptions on another thread.
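
To make the timing concrete, here is a minimal plain-Java sketch of the same deferral idea. It is not ECF or OSGi code: LazyHolder and DemoContainer are hypothetical names invented for this illustration, and the synchronized get() is an assumption about how one might guard against concurrent first lookups. It only shows that nothing is constructed at "registration" time, that the first lookup triggers creation, and that a failed creation leaves the holder empty so a later lookup can retry.

```java
import java.util.concurrent.Callable;

/** Plain-Java stand-in for the ServiceFactory idea; no OSGi types are involved. */
class LazyRegistrationSketch {

    /** Hypothetical stand-in for a discovery container that is expensive to create. */
    static final class DemoContainer {
        static int created = 0;

        DemoContainer() {
            created++;
        }
    }

    /** Creates the wrapped object on the first get() call and reuses it afterwards. */
    static final class LazyHolder<T> {
        private final Callable<T> creator;
        private T instance;

        LazyHolder(Callable<T> creator) {
            this.creator = creator;
        }

        synchronized T get() {
            if (instance == null) {
                try {
                    instance = creator.call();
                } catch (Exception e) {
                    // Creation failed: instance stays null, so the next get() retries.
                    return null;
                }
            }
            return instance;
        }
    }

    public static void main(String[] args) {
        // "Registration": only the holder exists, no container has been built yet.
        LazyHolder<DemoContainer> holder = new LazyHolder<>(DemoContainer::new);
        System.out.println("created after registration: " + DemoContainer.created);

        // First lookup builds the container; the second lookup reuses it.
        DemoContainer first = holder.get();
        DemoContainer second = holder.get();
        System.out.println("created after two lookups: " + DemoContainer.created);
        System.out.println("same instance: " + (first == second));
    }
}
```

In the real provider the role of LazyHolder is played by the anonymous ServiceFactory, and the framework (rather than main) decides when getService is first called.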

I'm hoping/expecting that Markus will smack me down quick if this is all wrong though :).