Re: [osgi-users] Deadlock with getService/ungetService

Tom,

Unfortunately, that is as much of the stack trace as threadInfo will return, even if we ask for MAX. I would also like to get a full stack trace, as that would surely help.
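
One way to get complete traces from inside the JVM is to format the frames directly instead of printing the ThreadInfo object. This assumes the console's threadInfo output comes from ThreadInfo.toString(), which in OpenJDK prints only the first few frames and then "...". The following is a minimal sketch using only java.lang.management; the class and method names (FullThreadDump, format, dumpAll) are hypothetical and not part of Equinox or Felix.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MonitorInfo;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class FullThreadDump {

    /** Formats every frame of one thread, plus the monitors it holds (if they were requested). */
    static String format(ThreadInfo info) {
        StringBuilder sb = new StringBuilder();
        sb.append('"').append(info.getThreadName()).append("\" Id=").append(info.getThreadId())
          .append(' ').append(info.getThreadState());
        if (info.getLockName() != null) {
            sb.append(" on ").append(info.getLockName());
        }
        if (info.getLockOwnerName() != null) {
            sb.append(" owned by \"").append(info.getLockOwnerName())
              .append("\" Id=").append(info.getLockOwnerId());
        }
        sb.append('\n');
        // getStackTrace() returns all captured frames; nothing is cut off here.
        for (StackTraceElement frame : info.getStackTrace()) {
            sb.append("        at ").append(frame).append('\n');
        }
        // Empty unless locked monitors were requested in dumpAllThreads.
        for (MonitorInfo monitor : info.getLockedMonitors()) {
            sb.append("        - holds ").append(monitor).append('\n');
        }
        return sb.toString();
    }

    /** Dumps all live threads with complete stacks. */
    static String dumpAll() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Only ask for lock details when the JVM supports them, to avoid
        // an UnsupportedOperationException.
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : infos) {
            sb.append(format(info)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAll());
    }
}
```

If shell access to the host is available, the JDK tools jstack <pid> or jcmd <pid> Thread.print should give the same information without any code changes.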

As for why we call ungetService ourselves: in some cases we have to, because our prototype instances are instantiated from outside of SCR; in other cases we have components that get a stateful instance, run some process, and then unget it afterwards. But the most important reason goes back to a problem we found, which led to this issue: https://issues.apache.org/jira/browse/FELIX-5974. It might have been fixed by now, but our code was already in place by then.
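
The "get a stateful instance, run some process, unget it" case can be sketched as follows. This is only an illustration of the pattern, not our actual code: ServiceObjectsLike is a hypothetical stand-in that mirrors the getService()/ungetService(T) pair of org.osgi.service.component.ComponentServiceObjects so the example runs without an OSGi framework, and withService is a made-up helper name.

```java
import java.util.function.Function;

public class PrototypeScope {

    /** Hypothetical stand-in for the two ComponentServiceObjects methods used here. */
    interface ServiceObjectsLike<T> {
        T getService();
        void ungetService(T service);
    }

    /** Gets one instance, applies the work to it, and always ungets it exactly once. */
    static <T, R> R withService(ServiceObjectsLike<T> objects, Function<T, R> work) {
        T instance = objects.getService();
        if (instance == null) {
            // Assumption: null means the service is no longer available.
            throw new IllegalStateException("service is no longer available");
        }
        try {
            return work.apply(instance);
        } finally {
            // Released even if the work throws, so no instance is leaked.
            objects.ungetService(instance);
        }
    }

    public static void main(String[] args) {
        final int[] outstanding = {0};
        // Fake provider that counts instances handed out and not yet returned.
        ServiceObjectsLike<StringBuilder> fake = new ServiceObjectsLike<StringBuilder>() {
            public StringBuilder getService() { outstanding[0]++; return new StringBuilder(); }
            public void ungetService(StringBuilder s) { outstanding[0]--; }
        };
        String result = withService(fake, sb -> sb.append("processed").toString());
        System.out.println(result + ", outstanding instances: " + outstanding[0]);
        // prints "processed, outstanding instances: 0"
    }
}
```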

Cheers,
Alain




Alain Picard
Chief Strategy Officer
Castor Technologies Inc
o:514-360-7208
m:813-787-3424


On Thu, Sep 2, 2021 at 12:44 PM Thomas Watson <tjwatson@xxxxxxxxxx> wrote:
It is hard for me to tell what the cause is here without the full stack traces of the two threads involved in the deadlock. Overall, I am confused about why you have SCR components that invoke ungetService directly instead of having SCR do that for you. But perhaps I misunderstood what you meant by "we had made some changes where a number of ungetServices were moved outside of our components deactivate method."

Tom
 
 
 
----- Original message -----
From: "Alain Picard" <picard@xxxxxxxxxxxxxx>
Sent by: "osgi-users" <osgi-users-bounces@xxxxxxxxxxx>
To: "This is a community mail list for OSGi technology. Any OSGi technical discussion or questions are acceptable here." <osgi-users@xxxxxxxxxxx>
Cc:
Subject: [EXTERNAL] [osgi-users] Deadlock with getService/ungetService
Date: Thu, Sep 2, 2021 10:11 AM
 
Yesterday our application ended up in a deadlock state in production. With the help of HealthCheck we were able to identify that there was a thread deadlock and capture the thread info for those offending threads.
 
Thread 1
osgi> threadInfo "qtp2111139171-925"
Found 1 threads named qtp2111139171-925
Info:
"qtp2111139171-925" prio=5 Id=925 BLOCKED on org.eclipse.osgi.internal.serviceregistry.PrototypeServiceFactoryUse@19a0fb19 owned by "qtp2111139171-948" Id=948
        at org.eclipse.osgi.internal.serviceregistry.ServiceRegistrationImpl.ungetService(ServiceRegistrationImpl.java:614)
        -  blocked on org.eclipse.osgi.internal.serviceregistry.PrototypeServiceFactoryUse@19a0fb19
        at org.eclipse.osgi.internal.serviceregistry.ServiceObjectsImpl.ungetService(ServiceObjectsImpl.java:135)
        at org.apache.felix.scr.impl.helper.ComponentServiceObjectsHelper$ComponentServiceObjectsImpl.close(ComponentServiceObjectsHelper.java:142)
        at org.apache.felix.scr.impl.helper.ComponentServiceObjectsHelper.closeServiceObjects(ComponentServiceObjectsHelper.java:95)
        at org.apache.felix.scr.impl.manager.DependencyManager.invokeUnbindMethod(DependencyManager.java:1933)
        at org.apache.felix.scr.impl.manager.DependencyManager.close(DependencyManager.java:1682)
        at org.apache.felix.scr.impl.manager.SingleComponentManager.disposeImplementationObject(SingleComponentManager.java:417)
        at org.apache.felix.scr.impl.manager.ServiceFactoryComponentManager.ungetService(ServiceFactoryComponentManager.java:170)
        ...
Thread 2
osgi> threadInfo "qtp2111139171-948"
Found 1 threads named qtp2111139171-948
Info:
"qtp2111139171-948" prio=5 Id=948 BLOCKED on org.eclipse.osgi.internal.serviceregistry.PrototypeServiceFactoryUse@5a01ccb9 owned by "qtp2111139171-925" Id=925
        at org.eclipse.osgi.internal.serviceregistry.ServiceRegistrationImpl.getService(ServiceRegistrationImpl.java:521)
        -  blocked on org.eclipse.osgi.internal.serviceregistry.PrototypeServiceFactoryUse@5a01ccb9
        at org.eclipse.osgi.internal.serviceregistry.ServiceObjectsImpl.getService(ServiceObjectsImpl.java:92)
        at org.apache.felix.scr.impl.helper.ComponentServiceObjectsHelper$ComponentServiceObjectsImpl.getService(ComponentServiceObjectsHelper.java:166)
        at com.castortech.iris.ecp.view.spi.core.zk.BaseControlZKRendererImpl.activate(BaseControlZKRendererImpl.java:70)
        at jdk.internal.reflect.GeneratedMethodAccessor259.invoke(Unknown Source)
        at java.base@11.0.6/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base@11.0.6/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.felix.scr.impl.inject.methods.BaseMethod.invokeMethod(BaseMethod.java:228)
        ...
Here we can see one thread getting a service while the other is ungetting a service, each blocked on a PrototypeServiceFactoryUse monitor owned by the other.
 
In the last few days we had made some changes where a number of ungetServices were moved outside of our components deactivate method.
 
This is using org.eclipse.osgi v 3.14.0v20190517 and Felix SCR 2.1.14.v20190123, both of which don't seem to have changed much in the classes that are in the trace.
 
I have two questions. First, what could be causing this, and how can we avoid it? Second, if there is no mechanism to avoid deadlocks, shouldn't there at least be a timeout mechanism, so that one thread fails instead of the whole application being brought down and needing a restart?
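
On the timeout question, some background: a thread reported as BLOCKED is waiting to enter a synchronized monitor, and Java's intrinsic monitors have no timeout, so such a thread cannot give up on its own. What an application can do is detect the cycle and report it with full stacks. The following is a minimal sketch of such a check using ThreadMXBean; the class and method names (DeadlockWatchdog, findDeadlocked, awaitDeadlock, startDemoDeadlock) are hypothetical, and the demo deadlock is an artificial two-monitor cycle, not a reproduction of the Equinox/SCR code paths in the traces.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockWatchdog {

    /** Returns info (with full stacks) for threads in a lock cycle, or an empty array. */
    static ThreadInfo[] findDeadlocked() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()          // monitors and java.util.concurrent locks
                : mx.findMonitorDeadlockedThreads();  // monitors only
        if (ids == null) {
            return new ThreadInfo[0];                 // null means no deadlock was found
        }
        return mx.getThreadInfo(ids, Integer.MAX_VALUE);
    }

    /** Polls until a deadlock is seen or the timeout elapses. */
    static ThreadInfo[] awaitDeadlock(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        ThreadInfo[] found = findDeadlocked();
        while (found.length == 0 && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            found = findDeadlocked();
        }
        return found;
    }

    /** Demo only: two daemon threads take the same two monitors in opposite order. */
    static void startDemoDeadlock() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread t1 = new Thread(() -> lockBoth(a, b, bothHoldFirst), "demo-1");
        Thread t2 = new Thread(() -> lockBoth(b, a, bothHoldFirst), "demo-2");
        t1.setDaemon(true);   // daemon threads do not keep the JVM alive
        t2.setDaemon(true);
        t1.start();
        t2.start();
    }

    private static void lockBoth(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await();   // wait until each thread holds its first monitor
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // never reached in the demo: the other thread holds this monitor
            }
        }
    }

    public static void main(String[] args) {
        startDemoDeadlock();
        for (ThreadInfo info : awaitDeadlock(5000)) {
            if (info == null) {
                continue;   // the thread ended between detection and lookup
            }
            System.out.println(info.getThreadName() + " is waiting for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
    }
}
```

A check like this only reports the problem; it cannot release a thread that is already blocked on a monitor, so a restart would still be needed once the cycle has formed.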
 
Cheers,
Alain
_______________________________________________
osgi-users mailing list
osgi-users@xxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/osgi-users
 

