Eclipse Community Forums
Race condition at virgo start [message #1037393] Tue, 09 April 2013 09:53
Christoph S.
Messages: 5
Registered: April 2013
Junior Member
Hi all

We run Virgo in a production cluster environment with 20 Virgo nodes.
We deploy a plan file with more than 200 modules. On server start, about 10-20% of all server nodes hang. It looks like a race condition in Spring DM service discovery.

OSGi package-level wiring works, but some Spring DM service dependencies remain unresolved. The behavior is non-deterministic.

Virgo Version: virgo-tomcat-server-3.6.0.RELEASE

thanks
Re: Race condition at virgo start [message #1038866 is a reply to message #1037393] Thu, 11 April 2013 08:55
Glyn Normington
Messages: 1222
Registered: July 2009
Senior Member
Sorry to hear you are having problems. Please check that you have read this FAQ; specifically, you need to frame a question very carefully. ;-)

[Updated on: Thu, 11 April 2013 09:01]

Re: Race condition at virgo start [message #1038927 is a reply to message #1038866] Thu, 11 April 2013 10:13
Christoph S.
I found a stack trace that narrows down the problem ;-). There is a ConcurrentModificationException in DependencyServiceManager:

ERROR region-dm-9 Exception during dependency processing for OsgiBundleXmlApplicationContext(bundle=[bundlename], config=osgibundle:/META-INF/spring/*.xml) 
java.util.ConcurrentModificationException: null
         at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(Unknown Source)
         at java.util.LinkedHashMap$KeyIterator.next(Unknown Source)
         at org.eclipse.gemini.blueprint.extender.internal.dependencies.startup.DependencyServiceManager.createDependencyFilter(DependencyServiceManager.java:407)
         at org.eclipse.gemini.blueprint.extender.internal.dependencies.startup.DependencyServiceManager.sendBootstrappingDependenciesEvent(DependencyServiceManager.java:482)
         at org.eclipse.gemini.blueprint.extender.internal.dependencies.startup.DependencyServiceManager.access$500(DependencyServiceManager.java:64)
         at org.eclipse.gemini.blueprint.extender.internal.dependencies.startup.DependencyServiceManager$DependencyServiceListener.updateDependencies(DependencyServiceManager.java:189)
         at org.eclipse.gemini.blueprint.extender.internal.dependencies.startup.DependencyServiceManager$DependencyServiceListener.serviceChanged(DependencyServiceManager.java:128)
         at org.eclipse.osgi.internal.serviceregistry.FilteredServiceListener.serviceChanged(FilteredServiceListener.java:107)
         at org.eclipse.osgi.framework.internal.core.BundleContextImpl.dispatchEvent(BundleContextImpl.java:861)
         at org.eclipse.osgi.framework.eventmgr.EventManager.dispatchEvent(EventManager.java:230)
         at org.eclipse.osgi.framework.eventmgr.ListenerQueue.dispatchEventSynchronous(ListenerQueue.java:148)
         at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.publishServiceEventPrivileged(ServiceRegistry.java:819)
         at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.publishServiceEvent(ServiceRegistry.java:771)
         at org.eclipse.osgi.internal.serviceregistry.ServiceRegistrationImpl.register(ServiceRegistrationImpl.java:130)
         at org.eclipse.osgi.internal.serviceregistry.ServiceRegistry.registerService(ServiceRegistry.java:214)
         at org.eclipse.osgi.framework.internal.core.BundleContextImpl.registerService(BundleContextImpl.java:433)
         at org.eclipse.gemini.blueprint.service.exporter.support.OsgiServiceFactoryBean.registerService(OsgiServiceFactoryBean.java:386)
         at org.eclipse.gemini.blueprint.service.exporter.support.OsgiServiceFactoryBean.registerService(OsgiServiceFactoryBean.java:344)
         at org.eclipse.gemini.blueprint.service.exporter.support.OsgiServiceFactoryBean$Executor.registerService(OsgiServiceFactoryBean.java:98)
         at org.eclipse.gemini.blueprint.service.exporter.support.internal.controller.ExporterController.registerService(ExporterController.java:38)
         at org.eclipse.gemini.blueprint.service.dependency.internal.DefaultMandatoryDependencyManager.startExporter(DefaultMandatoryDependencyManager.java:331)
         at org.eclipse.gemini.blueprint.service.dependency.internal.DefaultMandatoryDependencyManager.checkIfExporterShouldStart(DefaultMandatoryDependencyManager.java:269)
         at org.eclipse.gemini.blueprint.service.dependency.internal.DefaultMandatoryDependencyManager.discoverDependentImporterFor(DefaultMandatoryDependencyManager.java:260)
         at org.eclipse.gemini.blueprint.service.dependency.internal.DefaultMandatoryDependencyManager.addServiceExporter(DefaultMandatoryDependencyManager.java:190)
         at org.eclipse.gemini.blueprint.service.dependency.internal.MandatoryDependencyBeanPostProcessor.postProcessAfterInitialization(MandatoryDependencyBeanPostProcessor.java:43)
         at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.applyBeanPostProcessorsAfterInitialization(AbstractAutowireCapableBeanFactory.java:407)
         at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1426)
         at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:519)
         at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:456)
         at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:291)
         at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222)
         at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:288)
         at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:190)
         at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:563)
         at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:895)
         at org.eclipse.gemini.blueprint.context.support.AbstractDelegatedExecutionApplicationContext.access$1600(AbstractDelegatedExecutionApplicationContext.java:60)
         at org.eclipse.gemini.blueprint.context.support.AbstractDelegatedExecutionApplicationContext$4.run(AbstractDelegatedExecutionApplicationContext.java:325)
         at org.eclipse.gemini.blueprint.util.internal.PrivilegedUtils.executeWithCustomTCCL(PrivilegedUtils.java:85)
         at org.eclipse.gemini.blueprint.context.support.AbstractDelegatedExecutionApplicationContext.completeRefresh(AbstractDelegatedExecutionApplicationContext.java:290)
         at org.eclipse.gemini.blueprint.extender.internal.dependencies.startup.DependencyWaiterApplicationContextExecutor$CompleteRefreshTask.run(DependencyWaiterApplicationContextExecutor.java:137)
         at org.eclipse.virgo.kernel.agent.dm.ContextPropagatingTaskExecutor$2.run(ContextPropagatingTaskExecutor.java:95)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
         at java.lang.Thread.run(Unknown Source)



I cannot debug the problem directly, since we have to loop the startup sequence many times to reproduce it. It looks like a timing issue.

Thanks
Christoph
Re: Race condition at virgo start [message #1038947 is a reply to message #1038927] Thu, 11 April 2013 10:37
Glyn Normington
One theory is that two (or more) service dependencies of the same bundle were satisfied in close succession, and thread scheduling was such that the collection of unsatisfied dependencies being iterated on one thread was mutated by another thread. This is a bug, although, as you point out, it's going to be nasty to reproduce.

Please could you log a Bugzilla against Gemini Blueprint 1.0.2? If someone manages to fix it (and it's a big if given the current resource on Gemini Blueprint; patches gratefully accepted ;-) ), the earliest release it will go into is Gemini Blueprint 2.0, which is destined for Virgo 3.7.0.
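
For readers following the thread, the failure mode in this theory can be reproduced in miniature. The following is a minimal sketch, not Gemini Blueprint code: the class UnsatisfiedDependencies and its methods are hypothetical, and the mutation that would come from a second thread is simulated with a callback on the same thread so that the result is deterministic. It also shows one common remedy, iterating over a snapshot copied under a lock; whether the actual fix takes this form is not stated in this thread.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a tracker of unsatisfied service dependencies.
final class UnsatisfiedDependencies {
    private final Map<String, String> unsatisfied = new LinkedHashMap<>();
    private final Object lock = new Object();

    void add(String name, String filter) {
        synchronized (lock) {
            unsatisfied.put(name, filter);
        }
    }

    // Called when a dependency becomes available (in a real system, possibly on another thread).
    void satisfy(String name) {
        synchronized (lock) {
            unsatisfied.remove(name);
        }
    }

    // Unsafe: iterates the live key set, so a removal during the loop makes
    // the fail-fast iterator throw ConcurrentModificationException.
    String buildFilterUnsafe(Runnable interleavedMutation) {
        StringBuilder sb = new StringBuilder("(|");
        for (String name : unsatisfied.keySet()) {
            interleavedMutation.run(); // simulates the other thread
            sb.append(unsatisfied.get(name));
        }
        return sb.append(')').toString();
    }

    // Safer: copy the values while holding the lock, then iterate the copy.
    // The result may be slightly stale, but the iteration cannot be disturbed.
    String buildFilterSafe(Runnable interleavedMutation) {
        List<String> snapshot;
        synchronized (lock) {
            snapshot = new ArrayList<>(unsatisfied.values());
        }
        StringBuilder sb = new StringBuilder("(|");
        for (String filter : snapshot) {
            interleavedMutation.run(); // simulates the other thread
            sb.append(filter);
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        UnsatisfiedDependencies deps = new UnsatisfiedDependencies();
        deps.add("a", "(objectClass=A)");
        deps.add("b", "(objectClass=B)");
        try {
            deps.buildFilterUnsafe(() -> deps.satisfy("b"));
        } catch (ConcurrentModificationException e) {
            System.out.println("live iteration failed: " + e);
        }
        deps.add("b", "(objectClass=B)"); // restore the entry removed above
        System.out.println("snapshot iteration: " + deps.buildFilterSafe(() -> deps.satisfy("b")));
    }
}
```

In a genuinely multi-threaded run the exception only appears when the removal happens to land between two iterator steps, which is consistent with the intermittent behavior described above.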
Re: Race condition at virgo start [message #1039541 is a reply to message #1038947] Fri, 12 April 2013 05:19
Christoph S.
Thanks for the hint. Our application consists of infrastructure services that are referenced in many other modules, so your theory might be true in our case.

I will take a look at the code around this commit:

http://git.eclipse.org/c/gemini.blueprint/org.eclipse.gemini.blueprint.git/commit/extender/src/main/java/org/eclipse/gemini/blueprint/extender/internal/dependencies/startup/DependencyServiceManager.java?id=708b8669657659cc392e098dce046300a2cad747&ss=1

It could be that with the new code, the exception does not occur anymore, but the problem still remains.

Maybe I can analyze our problem in detail. I will give you feedback.

Many Thanks
Re: Race condition at virgo start [message #1039739 is a reply to message #1037393] Fri, 12 April 2013 10:24
Glyn Normington
Commit 708b86 was included as the fix to bug 384748 in Gemini Blueprint 1.0.2.RELEASE which is included in Virgo 3.6.x. So I'm disappointed that you are seeing that particular ConcurrentModificationException with Virgo 3.6.0.

One thing that puzzled me when I first looked at your stack trace was that the line numbers of DependencyServiceManager did not match those of the code at the 1.0.2.RELEASE tag. Please could you confirm that you are running Virgo 3.6.0.RELEASE, including the Gemini Blueprint 1.0.2.RELEASE JARs as shipped?

There may still be a race in the DependencyServiceManager code, so let's continue this thread to collaborate on diagnosis and a possible solution.
Re: Race condition at virgo start [message #1039781 is a reply to message #1039739] Fri, 12 April 2013 11:29
Christoph S.
Actually, I guess the stack trace is from an older Virgo version. We updated some time ago to solve the problem. There is no newer stack trace. However, the system now hangs silently :-(

We have now started a start/stop loop with debug logging enabled. I hope the problem still occurs.
Logging sometimes has the ugly side effect of synchronizing concurrent threads.
Re: Race condition at virgo start [message #1039800 is a reply to message #1039781] Fri, 12 April 2013 11:59
Glyn Normington
Ah, I see. The other approach is to stare at DependencyServiceManager for several hours in a darkened room. :-)
Re: Race condition at virgo start [message #1044127 is a reply to message #1039800] Thu, 18 April 2013 10:23
Christoph S.
I studied the code (in beautiful sunshine ;-) ).

In stageOne(), dl.findServiceDependencies() takes a snapshot of the dependencies.

A few lines later, a listener is registered to keep track of newly available services:

dependencyDetector.register();

What happens if a service starts in between?
Shouldn't the listener be registered before calling dl.findServiceDependencies()?


org.eclipse.gemini.blueprint.extender.internal.dependencies.startup.DependencyWaiterApplicationContextExecutor

	protected void stageOne() {

[...]
			DependencyServiceManager dl = createDependencyServiceListener(task);
			dl.findServiceDependencies();

			skipExceptionEvent = true;

			// all dependencies are met, just go with stageTwo
			if (dl.isSatisfied()) {
				log.info("No outstanding OSGi service dependencies, completing initialization for " + getDisplayName());
				stageTwo();
			} else {
				// there are dependencies not met
				// register a listener to look for them
				synchronized (monitor) {
					dependencyDetector = dl;
				}

				if (debug)
					log.debug("Registering service dependency dependencyDetector for " + getDisplayName());

				dependencyDetector.register();

				if (synchronousWait) {
					waitBarrier.increment();
					if (debug)
						log.debug("Synchronous wait-for-dependencies; waiting...");

					// if waiting times out...
					if (waitBarrier.waitForZero(timeout)) {
						timeout();
					} else
						stageTwo();
				} else {
					// start the watchdog (we're asynch)
					startWatchDog();
				}
			}
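
The ordering question raised in this post can be illustrated with a toy model. This is a sketch under stated assumptions: ToyRegistry and DependencyWaiter are hypothetical classes, not the OSGi or Gemini Blueprint APIs, and the service that "starts in between" is simulated with a callback. It does not establish whether DependencyServiceManager.register() is actually affected, since that depends on whether registration replays services that are already present.

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

final class ListenerOrderingDemo {

    // Hypothetical registry: remembers published services and notifies listeners.
    static final class ToyRegistry {
        private final Set<String> services = ConcurrentHashMap.newKeySet();
        private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

        void publish(String name) {
            services.add(name);
            for (Consumer<String> listener : listeners) {
                listener.accept(name);
            }
        }

        boolean isPresent(String name) {
            return services.contains(name);
        }

        void addListener(Consumer<String> listener) {
            listeners.add(listener);
        }
    }

    // Hypothetical waiter that tracks which required services are still missing.
    static final class DependencyWaiter {
        private final ToyRegistry registry;
        private final Set<String> missing = ConcurrentHashMap.newKeySet();

        DependencyWaiter(ToyRegistry registry, Set<String> required) {
            this.registry = registry;
            this.missing.addAll(required);
        }

        // Snapshot first, listener second: an arrival in the gap is never seen.
        void scanThenListen(Runnable between) {
            missing.removeIf(registry::isPresent);
            between.run();
            registry.addListener(missing::remove);
        }

        // Listener first, snapshot second: an arrival in the gap is caught by
        // the listener, and removing from a set tolerates duplicate notices.
        void listenThenScan(Runnable between) {
            registry.addListener(missing::remove);
            between.run();
            missing.removeIf(registry::isPresent);
        }

        boolean isSatisfied() {
            return missing.isEmpty();
        }
    }

    public static void main(String[] args) {
        ToyRegistry r1 = new ToyRegistry();
        DependencyWaiter w1 = new DependencyWaiter(r1, Collections.singleton("db"));
        w1.scanThenListen(() -> r1.publish("db"));
        System.out.println("scan then listen satisfied: " + w1.isSatisfied()); // false

        ToyRegistry r2 = new ToyRegistry();
        DependencyWaiter w2 = new DependencyWaiter(r2, Collections.singleton("db"));
        w2.listenThenScan(() -> r2.publish("db"));
        System.out.println("listen then scan satisfied: " + w2.isSatisfied()); // true
    }
}
```

In the first ordering the waiter stays unsatisfied with no exception at all, which would look like a silent hang; in the second, the same arrival is observed.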