Eclipse Community Forums
Deadlock in cache under load [message #491205] Tue, 13 October 2009 11:59
Adam Esterline
Messages: 2
Registered: July 2009
Junior Member
We are seeing deadlocks under load with EclipseLink 1.1.1 and 1.1.2. We started seeing this issue when we moved from the JBoss transaction manager to a Spring-based transaction manager. We took the Spring transaction manager for TopLink and changed the packages so that it works with EclipseLink. We are seeing this issue on Jetty 6.1.20 as well as on our previous version of JBoss (4.0.5) using the Spring-based transaction manager. The deadlock occurs in the cache on reads.

I tried to attach the thread dumps (gzipped), but attachments aren't working on the forum. You can download the gzipped text file here: http://jefmsmit.googlepages.com/server-out.txt.gz

Thanks,

Adam

[Updated on: Tue, 13 October 2009 12:02]


Re: Deadlock in cache under load [message #492221 is a reply to message #491205] Mon, 19 October 2009 10:05
James Sutherland
Messages: 1939
Registered: July 2009
Location: Ottawa, Canada
Senior Member

Deadlock issues are normally involved, so you may be best off contacting Oracle technical support if you have a support contract.

Looking at your thread dump, there seems to be a very large number of threads involved, so it is difficult to find the issue. If you could recreate the issue with fewer threads that would be helpful. I could not see any nested locks that would be required to cause a deadlock, but there were some deferred locks that may be related to the issue.
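
For readers unfamiliar with the "nested locks" pattern mentioned above, here is a minimal JDK-only sketch. It is not EclipseLink code; the class and method names are made up for illustration. Two threads take two monitors in opposite order, and `ThreadMXBean.findMonitorDeadlockedThreads()` is used to confirm the cycle. Note that this check only reports threads blocked on monitor entry; threads parked in `Object.wait()`, like the ConcurrencyManager frames quoted in this thread, are generally not reported by it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; nothing here comes from EclipseLink.
final class NestedLockDeadlockDemo {

    /** Provokes a two-thread monitor deadlock and returns how many threads the JVM reports as deadlocked. */
    static int provokeAndCountDeadlockedThreads() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Daemon threads, so the stuck pair does not keep the JVM alive.
        startDaemon("demo-A-then-B", () -> lockInOrder(lockA, lockB, bothHoldFirst));
        startDaemon("demo-B-then-A", () -> lockInOrder(lockB, lockA, bothHoldFirst));

        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        long deadline = System.nanoTime() + 5_000_000_000L; // poll for at most 5 seconds
        while (System.nanoTime() - deadline < 0) {
            long[] ids = threadBean.findMonitorDeadlockedThreads(); // null when no cycle is found
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    private static void startDaemon(String name, Runnable body) {
        Thread t = new Thread(body, name);
        t.setDaemon(true);
        t.start();
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await(); // wait until each thread holds its first monitor
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // Not reached in this demo: the other thread holds `second` and waits for `first`.
            }
        }
    }
}
```

In a thread dump, this kind of cycle shows up as two threads in state BLOCKED, each holding the monitor the other is waiting to lock.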

Do you only get the issue with Spring, and not with normal JEE? That would seem to indicate a Spring-related issue. It could possibly be the beginEarlyTransaction call that the Spring integration may be making.

What are you doing to cause the issue? Are you using fetch-joining? Do you have non-lazy relationships?

Some things you can try:
- disable the cache (shared=false); this should be a workaround
- disable deferred locks, which may resolve the issue (in code, set DeferredLockManager.SHOULD_USE_DEFERRED_LOCKS=false)
- ensure you are always using lazy relationships
- try the latest patch release or build, or 1.2
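
As a rough sketch of the first workaround, the shared cache can be switched off through a persistence-unit property. The property name below is the one quoted later in this thread ("eclipselink.cache.shared.default"); the helper class and method names are hypothetical, and the snippet only builds the property map so that it runs without EclipseLink on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; only the property name comes from this thread.
final class SharedCacheWorkaround {

    static final String SHARED_CACHE_DEFAULT = "eclipselink.cache.shared.default";

    /** Returns a copy of the given persistence properties with the shared cache disabled by default. */
    static Map<String, String> withSharedCacheDisabled(Map<String, String> base) {
        Map<String, String> props = new HashMap<>(base);
        props.put(SHARED_CACHE_DEFAULT, "false");
        return props;
    }

    // The deferred-lock switch mentioned in the list above is a static field set in code,
    // exactly as written in this thread:
    //     DeferredLockManager.SHOULD_USE_DEFERRED_LOCKS = false;
    // It is left as a comment here because it needs EclipseLink on the classpath.
}
```

The resulting map would typically be passed to `Persistence.createEntityManagerFactory(unitName, props)`, where `unitName` stands for your own persistence unit; the same property can also be set in persistence.xml.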




Re: Deadlock in cache under load [message #496677 is a reply to message #492221] Tue, 10 November 2009 16:55
Andrei Ilitchev
Messages: 5
Registered: July 2009
Junior Member
There is one writing thread in the thread dump:
"RMI TCP Connection(922)-10.32.0.51" daemon prio=10 tid=0x000000004e6a2800 nid=0x3cf6 in Object.wait() [0x0000000049e37000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at org.eclipse.persistence.internal.helper.ConcurrencyManager.acquireReadLock(ConcurrencyManager.java:242)
- locked <0x00002aaacd23efe0> (a org.eclipse.persistence.internal.helper.ConcurrencyManager)
at org.eclipse.persistence.internal.helper.ConcurrencyManager.checkReadLock(ConcurrencyManager.java:230)
at org.eclipse.persistence.internal.identitymaps.CacheKey.checkReadLock(CacheKey.java:180)
at org.eclipse.persistence.internal.identitymaps.IdentityMapManager.getFromIdentityMap(IdentityMapManager.java:610)
at org.eclipse.persistence.internal.identitymaps.IdentityMapManager.getFromIdentityMap(IdentityMapManager.java:578)
at org.eclipse.persistence.internal.sessions.ObjectChangeSet.getTargetVersionOfSourceObject(ObjectChangeSet.java:352)
at org.eclipse.persistence.internal.sessions.ObjectChangeSet.getTargetVersionOfSourceObject(ObjectChangeSet.java:324)
at org.eclipse.persistence.internal.queries.ContainerPolicy.mergeChanges(ContainerPolicy.java:688)
- locked <0x00002aaacfaa29e8> (a java.util.Vector)
at org.eclipse.persistence.mappings.CollectionMapping.mergeChangesIntoObject(CollectionMapping.java:818)
at org.eclipse.persistence.internal.descriptors.ObjectBuilder.mergeChangesIntoObject(ObjectBuilder.java:2554)
at org.eclipse.persistence.internal.sessions.MergeManager.mergeChangesIntoDistributedCache(MergeManager.java:437)
at org.eclipse.persistence.internal.sessions.MergeManager.mergeChanges(MergeManager.java:265)
at org.eclipse.persistence.internal.sessions.MergeManager.mergeChangesFromChangeSet(MergeManager.java:362)
at org.eclipse.persistence.sessions.coordination.MergeChangeSetCommand.executeWithSession(MergeChangeSetCommand.java:82)
at org.eclipse.persistence.internal.sessions.AbstractSession.processCommand(AbstractSession.java:3220)
at org.eclipse.persistence.sessions.coordination.RemoteCommandManager.processCommandFromRemoteConnection(RemoteCommandManager.java:256)
at org.eclipse.persistence.internal.sessions.coordination.rmi.RMIRemoteCommandConnectionImpl.executeCommand(RMIRemoteCommandConnectionImpl.java:52)
at sun.reflect.GeneratedMethodAccessor552.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
at sun.rmi.transport.Transport$1.run(Transport.java:159)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)

To narrow down the problem, could you run the tests without cache coordination?
Is the deadlock still there?
Re: Deadlock in cache under load [message #504067 is a reply to message #496677] Wed, 16 December 2009 20:04
John Mikula
Messages: 8
Registered: December 2009
Junior Member
I'm having the same problem.

I'm running an application on Sun GlassFish Enterprise Server v2.1 (9.1.1) (build b60e-fcs) with EclipseLink 1.1.2 as my JPA implementation. Our application has been getting a lot more load lately, and I've frequently encountered a situation in which all of my HTTP worker threads are locked waiting at
org.eclipse.persistence.internal.helper.ConcurrencyManager.acquire(ConcurrencyManager.java:89)
Except for one, which is stuck in a loop at
org.eclipse.persistence.internal.helper.ConcurrencyManager.releaseDeferredLock(ConcurrencyManager.java:454)

I can provide a full thread dump of the locked state.

I've followed some of the advice from this thread:
http://forums.oracle.com/forums/thread.jspa?threadID=851676

In particular, setting
DeferredLockManager.SHOULD_USE_DEFERRED_LOCKS = false;
didn't seem to help at all, and setting "eclipselink.cache.shared.default"="false" causes several of my unit tests to fail. From what I understand, setting that option isolates the read cache from the write cache, which seems to have adverse side effects on my application.

I am not using join fetching, but I am using QueryHint.REFRESH_CASCADE. Do these have similar behavior?

Moreover, how do I resolve my problem? I've looked at the code for ConcurrencyManager in the more recent releases and I see no changes in this part of the code, so if it's a known bug, it doesn't appear to have been addressed.

In my case, it's not a true deadlock but an infinite loop. Here is a code snippet from ConcurrencyManager.releaseDeferredLock():

        // Thread have three stages, one where they are doing work (i.e. building objects)
        // two where they are done their own work but may be waiting on other threads to finish their work,
        // and a third when they and all the threads they are waiting on are done.
        // This is essentially a busy wait to determine if all the other threads are done.
        while (true) {
            // 2612538 - the default size of Map (32) is appropriate
            Map recursiveSet = new IdentityHashMap();
            if (isBuildObjectOnThreadComplete(currentThread, recursiveSet)) {// Thread job done.
                lockManager.releaseActiveLocksOnThread();
                removeDeferredLockManager(currentThread);
                AbstractSessionLog.getLog().log(SessionLog.FINER, "deferred_locks_released", currentThread.getName());
                return;
            } else {// Not done yet, wait and check again.
                try {
                    Thread.sleep(1);
                } catch (InterruptedException ignoreAndContinue) {
                }
            }
        }


For some reason isBuildObjectOnThreadComplete(currentThread, recursiveSet) always returns false, so that thread stays in that loop, and the concurrency manager never issues a notify() to release the waiting threads.
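
To make the failure mode concrete, here is a small standalone sketch of the same polling pattern with a deadline added. This is not EclipseLink's code and not a proposed patch; the class and method names are hypothetical. It only illustrates that an unbounded `while (true)` poll never exits when its condition never becomes true, whereas a bounded poll can at least report the stall.

```java
import java.util.function.BooleanSupplier;

// Hypothetical illustration of a bounded poll; not taken from ConcurrencyManager.
final class BoundedBusyWait {

    /**
     * Polls `done` roughly once per millisecond.
     * Returns true as soon as the condition holds, or false if the timeout
     * elapses (or the thread is interrupted) first.
     */
    static boolean awaitCondition(BooleanSupplier done, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!done.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // the condition never became true within the timeout
            }
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }
}
```

A caller that gets `false` back could log the state of the threads it is waiting on instead of spinning silently, which is the information missing when the loop above is stuck.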

Cheers
Re: Deadlock in cache under load [message #504180 is a reply to message #491205] Thu, 17 December 2009 10:21
James Sutherland
Messages: 1939
Registered: July 2009
Location: Ottawa, Canada
Senior Member

Try the latest release; an issue that may be related was fixed.

Refreshing should not cause an issue.

Ensure you are using LAZY on all relationships.

There was a bug in the DeferredLockManager.SHOULD_USE_DEFERRED_LOCKS = false backdoor, so avoid that unless you are on the latest release (I think it was fixed there).

There are some debug methods on IdentityMapAccessor that may be of some use.

Include the thread dumps for the blocked threads.

If you have a support contract, contact technical support.


Re: Deadlock in cache under load [message #504206 is a reply to message #504180] Thu, 17 December 2009 11:49
John Mikula
Messages: 8
Registered: December 2009
Junior Member
http://datasolutions.s3.amazonaws.com/public/threaddump.txt

I stripped out all threads that didn't have org.eclipse.persistence in their stack.
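
The same kind of filtering can be done from inside the JVM. The sketch below is an assumption about how one might do it, not the method used to produce the dump linked above; the class and method names are hypothetical. It keeps only threads that have at least one stack frame in a given package, using `Thread.getAllStackTraces()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for trimming an in-process thread dump by package.
final class ThreadDumpFilter {

    /** Returns one formatted entry per live thread that has a frame whose class name starts with the prefix. */
    static List<String> threadsWithFrameIn(String packagePrefix) {
        List<String> report = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            if (hasFrameIn(entry.getValue(), packagePrefix)) {
                report.add(format(entry.getKey(), entry.getValue()));
            }
        }
        return report;
    }

    /** True if any frame's declaring class name starts with the prefix. */
    static boolean hasFrameIn(StackTraceElement[] stack, String packagePrefix) {
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().startsWith(packagePrefix)) {
                return true;
            }
        }
        return false;
    }

    private static String format(Thread thread, StackTraceElement[] stack) {
        StringBuilder sb = new StringBuilder();
        sb.append('"').append(thread.getName()).append("\" ").append(thread.getState()).append('\n');
        for (StackTraceElement frame : stack) {
            sb.append("    at ").append(frame).append('\n');
        }
        return sb.toString();
    }
}
```

Calling `ThreadDumpFilter.threadsWithFrameIn("org.eclipse.persistence")` from a diagnostic endpoint or a scheduled task would give a dump limited to the threads relevant here.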
Re: Deadlock in cache under load [message #504247 is a reply to message #504180] Thu, 17 December 2009 15:50
John Mikula
Messages: 8
Registered: December 2009
Junior Member
By latest version, do you recommend I use 1.2, which is the latest 1.x, or 2.0, which was released last week?

Also, I looked at IdentityMapAccessor, and it seems that it only references locks in terms of optimistic locks that are versioned in cache, so I don't see how that would help for debugging.
Re: Deadlock in cache under load [message #504430 is a reply to message #504247] Fri, 18 December 2009 12:23
John Mikula
Messages: 8
Registered: December 2009
Junior Member
My team spent some effort reproducing the problem in a test environment. We found that if we rapidly and repeatedly made API calls against our application (which in turn made API calls to EclipseLink JPA), we could bring GlassFish down within 10-20 minutes.

I was then able to experiment with a variety of options to help pinpoint the problem, including deploying historical versions of my application from our SCM.

As it turns out, the problem only manifested itself after I added QueryHint.REFRESH to all of my queries. I suspect there is a bug in EclipseLink that causes this, but for my part, the reason I put all the refresh hints in the code is that I am running on multiple servers and haven't gotten cache coordination working properly yet.

Hope this helps anyone who comes after.

Cheers