Eclipse Community Forums: EclipseLink » Deadlock in cache under load

Deadlock in cache under load [message #491205]

Tue, 13 October 2009 15:59

Adam Esterline

Messages: 2
Registered: July 2009

Junior Member

We are seeing deadlocks occur under load with EclipseLink 1.1.1 and 1.1.2. We started seeing this issue when we moved from the JBoss transaction manager to a Spring-based transaction manager. We took the Spring transaction manager for TopLink and changed the packages so that it works with EclipseLink. We are seeing this issue on Jetty 6.1.20 as well as on our previous version of JBoss (4.0.5) using the Spring-based transaction manager. The deadlock is occurring in the cache on reads.

I tried to attach the thread dumps (gzipped), but attachments aren't working on the forum. You can download the gzipped text file here: http://jefmsmit.googlepages.com/server-out.txt.gz

Thanks,

Adam

[Updated on: Tue, 13 October 2009 16:02]

Re: Deadlock in cache under load [message #492221 is a reply to message #491205]

Mon, 19 October 2009 14:05

James Sutherland

Messages: 1939
Registered: July 2009
Location: Ottawa, Canada

Senior Member

Deadlock issues are normally involved, so you may be best off contacting Oracle technical support about this if you have a support contract.

Looking at your thread dump, there seems to be a very large number of threads involved, so it is difficult to find the issue. If you could recreate the issue with fewer threads that would be helpful. I could not see any nested locks that would be required to cause a deadlock, but there were some deferred locks that may be related to the issue.
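When a dump has too many threads to read by eye, the JVM can be asked directly whether it sees a lock cycle. The following is a minimal sketch using the standard `java.lang.management` API; the class name `DeadlockCheck` is hypothetical and nothing here is EclipseLink-specific. Note that `findDeadlockedThreads()` only reports threads blocked acquiring monitors or ownable synchronizers, so threads parked in `Object.wait()` or spinning in a sleep loop would not be reported.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class DeadlockCheck {

    // Returns the ids of threads the JVM considers deadlocked, or an empty array if none.
    static long[] findDeadlocked() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no cycle is found
        return ids == null ? new long[0] : ids;
    }

    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = findDeadlocked();
        if (ids.length == 0) {
            System.out.println("No lock cycle detected (wait()/sleep loops are not covered).");
            return;
        }
        // Include locked monitors and synchronizers so the cycle can be read off the output.
        for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
            if (info != null) { // a thread may have exited in the meantime
                System.out.print(info);
            }
        }
    }
}
```

An empty result alongside a hung application would suggest the hang is a wait or polling condition rather than a classic nested-lock deadlock.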

Do you only get the issue with Spring, and not with normal JEE? This would seem to indicate a Spring-related issue. It could possibly be the beginEarlyTransaction call that the Spring integration may be making.

What are you doing to cause the issue? Are you using fetch-joining? Do you have non-lazy relationships?

Some things you can try:
- disabling the cache (shared=false) should be a workaround
- disabling deferred locks may resolve the issue (in code, set DeferredLockManager.SHOULD_USE_DEFERRED_LOCKS=false)
- ensure you are always using lazy relationships
- try the latest patch release or build, or 1.2
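
The first two suggestions could be wired up roughly as below. This is only a sketch: the property key `eclipselink.cache.shared.default` is the one quoted elsewhere in this thread, the class name `CacheWorkaroundProperties` and the unit name `my-unit` are hypothetical, and the JPA and EclipseLink calls are left as comments so the block compiles with the JDK alone.

```java
import java.util.HashMap;
import java.util.Map;

class CacheWorkaroundProperties {

    // Property key as quoted in this thread; "false" turns the shared cache off by default.
    static final String SHARED_CACHE_DEFAULT = "eclipselink.cache.shared.default";

    // Builds the override map that would be handed to the persistence provider.
    static Map<String, String> build() {
        Map<String, String> props = new HashMap<>();
        props.put(SHARED_CACHE_DEFAULT, "false");
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> props = build();
        // With JPA and EclipseLink on the classpath, the map would be passed as:
        //   Persistence.createEntityManagerFactory("my-unit", props); // "my-unit" is hypothetical
        // The deferred-lock switch is a static field, set once at startup:
        //   DeferredLockManager.SHOULD_USE_DEFERRED_LOCKS = false;
        System.out.println(props);
    }
}
```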

Re: Deadlock in cache under load [message #496677 is a reply to message #492221]

Tue, 10 November 2009 21:55

Andrei Ilitchev

Messages: 5
Registered: July 2009

Junior Member

There is one writing thread in the thread dump:
"RMI TCP Connection(922)-10.32.0.51" daemon prio=10 tid=0x000000004e6a2800 nid=0x3cf6 in Object.wait() [0x0000000049e37000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at org.eclipse.persistence.internal.helper.ConcurrencyManager.acquireReadLock(ConcurrencyManager.java:242)
- locked <0x00002aaacd23efe0> (a org.eclipse.persistence.internal.helper.ConcurrencyManager)
at org.eclipse.persistence.internal.helper.ConcurrencyManager.checkReadLock(ConcurrencyManager.java:230)
at org.eclipse.persistence.internal.identitymaps.CacheKey.checkReadLock(CacheKey.java:180)
at org.eclipse.persistence.internal.identitymaps.IdentityMapManager.getFromIdentityMap(IdentityMapManager.java:610)
at org.eclipse.persistence.internal.identitymaps.IdentityMapManager.getFromIdentityMap(IdentityMapManager.java:578)
at org.eclipse.persistence.internal.sessions.ObjectChangeSet.getTargetVersionOfSourceObject(ObjectChangeSet.java:352)
at org.eclipse.persistence.internal.sessions.ObjectChangeSet.getTargetVersionOfSourceObject(ObjectChangeSet.java:324)
at org.eclipse.persistence.internal.queries.ContainerPolicy.mergeChanges(ContainerPolicy.java:688)
- locked <0x00002aaacfaa29e8> (a java.util.Vector)
at org.eclipse.persistence.mappings.CollectionMapping.mergeChangesIntoObject(CollectionMapping.java:818)
at org.eclipse.persistence.internal.descriptors.ObjectBuilder.mergeChangesIntoObject(ObjectBuilder.java:2554)
at org.eclipse.persistence.internal.sessions.MergeManager.mergeChangesIntoDistributedCache(MergeManager.java:437)
at org.eclipse.persistence.internal.sessions.MergeManager.mergeChanges(MergeManager.java:265)
at org.eclipse.persistence.internal.sessions.MergeManager.mergeChangesFromChangeSet(MergeManager.java:362)
at org.eclipse.persistence.sessions.coordination.MergeChangeSetCommand.executeWithSession(MergeChangeSetCommand.java:82)
at org.eclipse.persistence.internal.sessions.AbstractSession.processCommand(AbstractSession.java:3220)
at org.eclipse.persistence.sessions.coordination.RemoteCommandManager.processCommandFromRemoteConnection(RemoteCommandManager.java:256)
at org.eclipse.persistence.internal.sessions.coordination.rmi.RMIRemoteCommandConnectionImpl.executeCommand(RMIRemoteCommandConnectionImpl.java:52)
at sun.reflect.GeneratedMethodAccessor552.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
at sun.rmi.transport.Transport$1.run(Transport.java:159)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)

To scope the problem, could you run the tests without cache coordination?
Is the deadlock still there?

Re: Deadlock in cache under load [message #504067 is a reply to message #496677]

Thu, 17 December 2009 01:04

John Mikula

Messages: 8
Registered: December 2009

Junior Member

I'm having the same problem.

I'm running an application on Sun GlassFish Enterprise Server v2.1 (9.1.1) (build b60e-fcs) with EclipseLink 1.1.2 as my JPA implementation. Our application has been getting a lot more load lately, and I've frequently encountered a situation in which all of my HTTP worker threads are locked waiting at
org.eclipse.persistence.internal.helper.ConcurrencyManager.acquire(ConcurrencyManager.java:89)
Except for one, which is stuck in a loop at
org.eclipse.persistence.internal.helper.ConcurrencyManager.releaseDeferredLock(ConcurrencyManager.java:454)

I can provide a full thread dump of the locked state.

I've followed some of the advice from this thread:
http://forums.oracle.com/forums/thread.jspa?threadID=851676

Particularly setting
DeferredLockManager.SHOULD_USE_DEFERRED_LOCKS = false;
didn't seem to help at all. And setting "eclipselink.cache.shared.default"="false" causes several of my unit tests to fail. From what I understand setting that option isolates the read cache from the write cache, which seems to be having adverse side effects on my application.

I am not using join fetching, but I am using QueryHint.REFRESH_CASCADE. Do these have similar behavior?

Moreover, how do I resolve my problem? I've looked at the code for ConcurrencyManager in the more recent releases and I see no changes in this part of the code, so if it's a known bug, it doesn't appear to have been addressed.

In my case, it's not a true deadlock, but an infinite loop. Here is a code snip from ConcurrencyManager.releaseDeferredLock()

        // Thread have three stages, one where they are doing work (i.e. building objects)
        // two where they are done their own work but may be waiting on other threads to finish their work,
        // and a third when they and all the threads they are waiting on are done.
        // This is essentially a busy wait to determine if all the other threads are done.
        while (true) {
            // 2612538 - the default size of Map (32) is appropriate
            Map recursiveSet = new IdentityHashMap();
            if (isBuildObjectOnThreadComplete(currentThread, recursiveSet)) {// Thread job done.
                lockManager.releaseActiveLocksOnThread();
                removeDeferredLockManager(currentThread);
                AbstractSessionLog.getLog().log(SessionLog.FINER, "deferred_locks_released", currentThread.getName());
                return;
            } else {// Not done yet, wait and check again.
                try {
                    Thread.sleep(1);
                } catch (InterruptedException ignoreAndContinue) {
                }
            }
        }

For some reason isBuildObjectOnThreadComplete(currentThread, recursiveSet) always returns false, so that thread stays in that loop, and the concurrency manager never issues a notify() to release the waiting threads.
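
To illustrate the difference between an unbounded busy wait like the one above and a loop that eventually gives up, here is a minimal sketch of a polling wait with a deadline. It is not EclipseLink code and not a proposed patch; `BoundedPoll`, `awaitCompletion`, and the timeout values are hypothetical names chosen for the example. The point is only that a bounded loop returns control, so the caller can log a thread dump instead of hanging with every other thread waiting behind it.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

class BoundedPoll {

    // Polls `done` until it returns true or the timeout elapses.
    // Returns true if the condition was met, false on timeout or interruption.
    static boolean awaitCompletion(BooleanSupplier done, long timeoutMillis, long sleepMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!done.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: the caller can now report diagnostics
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt instead of swallowing it
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A condition that never becomes true, standing in for a check that always returns false.
        boolean finished = awaitCompletion(() -> false, 100, 1);
        System.out.println(finished ? "completed" : "timed out; dump threads here");
    }
}
```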

Cheers

Re: Deadlock in cache under load [message #504180 is a reply to message #491205]

Thu, 17 December 2009 15:21

James Sutherland

Messages: 1939
Registered: July 2009
Location: Ottawa, Canada

Senior Member

Try the latest release; an issue was fixed there that may be related.

Refreshing should not cause an issue.

Ensure you are using LAZY on all relationships.

There was a bug in the DeferredLockManager.SHOULD_USE_DEFERRED_LOCKS = false backdoor, so avoid that (unless on latest release, I think it was fixed).

There are some debug methods on IdentityMapAccessor that may be of some use.

Include the thread dumps for the blocked threads.

If you have a support contract, contact technical support.

Re: Deadlock in cache under load [message #504206 is a reply to message #504180]

Thu, 17 December 2009 16:49

John Mikula

Messages: 8
Registered: December 2009

Junior Member

http://datasolutions.s3.amazonaws.com/public/threaddump.txt

I stripped out all threads that didn't have org.eclipse.persistence in their stack.
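
The same kind of filtering can be done from inside the JVM. A minimal sketch, assuming an in-process check is acceptable (the dump linked above came from the running server, and how it was captured is not stated); `ThreadDumpFilter` and `threadsTouching` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ThreadDumpFilter {

    // Returns only the live threads that have at least one frame in the given package.
    static Map<Thread, StackTraceElement[]> threadsTouching(String packagePrefix) {
        Map<Thread, StackTraceElement[]> result = new LinkedHashMap<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            for (StackTraceElement frame : e.getValue()) {
                if (frame.getClassName().startsWith(packagePrefix)) {
                    result.put(e.getKey(), e.getValue());
                    break; // one matching frame is enough to keep the thread
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Thread, StackTraceElement[]> hits = threadsTouching("org.eclipse.persistence.");
        System.out.println(hits.size() + " thread(s) with EclipseLink frames");
        for (Map.Entry<Thread, StackTraceElement[]> e : hits.entrySet()) {
            System.out.println("\"" + e.getKey().getName() + "\" " + e.getKey().getState());
            for (StackTraceElement frame : e.getValue()) {
                System.out.println("\tat " + frame);
            }
        }
    }
}
```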

Re: Deadlock in cache under load [message #504247 is a reply to message #504180]

Thu, 17 December 2009 20:50

John Mikula

Messages: 8
Registered: December 2009

Junior Member

By latest version, do you recommend I use 1.2, which is the latest 1.x, or 2.0, which was released last week?

Also, I looked at IdentityMapAccessor, and it seems that it only references locks in terms of optimistic locks that are versioned in cache, so I don't see how that would help for debugging.

Re: Deadlock in cache under load [message #504430 is a reply to message #504247]

Fri, 18 December 2009 17:23

John Mikula

Messages: 8
Registered: December 2009

Junior Member

My team spent some effort reproducing the problem in a test environment. We found that if we rapidly and repeatedly made API calls against our application (which in turn made API calls to EclipseLink JPA), we could bring GlassFish down within 10-20 minutes.
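
A reproduction along those lines can be driven by a small concurrent harness. The following is a sketch only: `LoadHarness`, `hammer`, and the thread and call counts are hypothetical, and the `Runnable` stands in for whatever application call ends up hitting the persistence layer. A completed count lower than `threads * callsPerThread` when the timeout expires is the signal that workers have stalled.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class LoadHarness {

    // Runs `call` callsPerThread times on each of `threads` workers, all released at once.
    // Returns how many calls completed before the timeout.
    static long hammer(Runnable call, int threads, int callsPerThread, long timeoutSeconds) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong completed = new AtomicLong();
        CountDownLatch start = new CountDownLatch(1);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    start.await(); // line all workers up so they contend from the first call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < callsPerThread && !Thread.currentThread().isInterrupted(); i++) {
                    call.run();
                    completed.incrementAndGet();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                pool.shutdownNow(); // workers are stuck: a good moment to capture a thread dump
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the application call under test.
        long done = hammer(() -> Math.sqrt(42.0), 8, 1000, 30);
        System.out.println(done + " of " + (8 * 1000) + " calls completed");
    }
}
```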

I was then able to experiment with a variety of options to help pinpoint the problem, including deploying historical versions of my application from our SCM.

As it turns out, the problem only manifested itself after I added QueryHint.REFRESH to all of my queries. I suspect there is a bug in EclipseLink that causes this, but for my part, the reason I put all the refresh hints in the code is that I am running on multiple servers and haven't gotten cache coordination working properly yet.

Hope this helps anyone who comes after.

Cheers

Re: Deadlock in cache under load [message #1843580 is a reply to message #504430]

Mon, 09 August 2021 12:25

Nainsy Gupta

Messages: 1
Registered: August 2021

Junior Member

Hello Everyone,

I know this is an old thread, but we recently faced this issue when the application load increased.

The traces in our thread dumps are similar to those shared by other contributors to this thread.

We have already raised a service request with the Oracle team, but they are asking for a reproducible test case.

We have been working on the test case, but have not yet succeeded in reproducing the issue.

It would be a great help if someone could offer a few suggestions on how to reproduce the issue.

Thanks in advance!

Best Regards,
Nainsy Gupta
