Hanging and long delays obtaining read locks [message #1108983] Sat, 14 September 2013 14:31
Randy Tidd
Messages: 5
Registered: August 2013
Junior Member
We are using EclipseLink 2.4.2 in a J2EE stack with WebLogic 10.3.6. We have a relatively small system with about 5-10 users in testing, eventually about 30 users in production, and a database with about 100 tables. The primary table has about 500 rows, most other tables have 1000-5000 rows, and a couple of tables have 100,000-200,000 rows. I mention this only to show that there isn't much data and most operations are quick. For example, a "select all" from a table typically returns only a few thousand rows and should take around 100-500 msec.

We often have a lot of "stuck threads", caused by lines in EclipseLink that are getting read locks, such as:

org.eclipse.persistence.internal.helper.ConcurrencyManager.acquireDeferredLock(ConcurrencyManager.java:198)

org.eclipse.persistence.internal.helper.ConcurrencyManager.acquire(ConcurrencyManager.java:94)

Looking through the stack traces of the thread dumps, the deadlocks seem to occur only during reading. EclipseLink is building objects for the cache based on the results from the queries, and one thread is writing to the cache, which makes others wait. Somehow the waiting threads are kept waiting for more than 10 minutes (WebLogic's stuck thread timeout). I've attached a sample stack trace at the end of this message.

I have looked at the oft-cited FAQ about this:

http://wiki.eclipse.org/EclipseLink/FAQ/JPA#How_to_diagnose_and_resolve_hangs_and_deadlocks.3F

and have tried these suggestions but this has not helped. We are on a pretty recent version (2.4.2 is circa July 2013). Our queries are lazy, and we removed all join-fetches. We need the L2 cache (it is one of the main reasons we are using JPA).

I tried this:

setCacheTransactionIsolation(CONCURRENT_READ_WRITE)

which had no effect. Looking through the EclipseLink 2.4.2 source code, I don't see that this is even referenced anywhere? What behavior is it supposed to modify?

I wrote a standalone test which runs from the command line (outside of WebLogic) that simply performs a fetch of all 2000 rows in one table repeatedly in 20 threads. I then set breakpoints on all of the places in EclipseLink where it has been blocking on wait() calls. Usually within a few minutes the breakpoints are hit, indicating that there is thread contention when obtaining locks. My test does eventually finish but takes a long time, suggesting that the threads aren't completely hung, but are only significantly delayed for some reason.
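A standalone test of the shape described above could be sketched roughly as follows. This is not the original test, only a minimal harness under stated assumptions: `ReadContentionHarness`, `slowestCallMillis` and the in-memory `fetchAll` workload are hypothetical names, and `fetchAll` is a stand-in for the real JPA call (for example, something that opens an EntityManager and calls getResultList()). It reports the slowest single call, so lock delays show up as a number rather than as breakpoint hits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class ReadContentionHarness {

    /**
     * Runs the fetch `iterations` times on each of `threads` threads and
     * returns the slowest single call, in milliseconds, seen by any thread.
     */
    static long slowestCallMillis(Callable<?> fetch, int threads, int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<Long> worker = () -> {
                long worstNanos = 0;
                for (int i = 0; i < iterations; i++) {
                    long start = System.nanoTime();
                    fetch.call();
                    worstNanos = Math.max(worstNanos, System.nanoTime() - start);
                }
                return TimeUnit.NANOSECONDS.toMillis(worstNanos);
            };
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                results.add(pool.submit(worker));
            }
            long slowest = 0;
            for (Future<Long> result : results) {
                slowest = Math.max(slowest, result.get());
            }
            return slowest;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a fetch failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in workload: builds 2000 "rows" in memory.
        // Replace with the real query when running against EclipseLink.
        Callable<Integer> fetchAll = () -> {
            List<Integer> rows = new ArrayList<>();
            for (int i = 0; i < 2000; i++) {
                rows.add(i);
            }
            return rows.size();
        };
        System.out.println("slowest call: " + slowestCallMillis(fetchAll, 20, 50) + " ms");
    }
}
```

With the real query plugged in, a slowest call far above the typical 100-500 msec would point at waiting rather than at query cost.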

We have noticed that the deadlocks/delays often occur when one thread is doing a "get all", for example fetching all objects from a table. This is usually <= 2000 objects, and so should be quick, but I guess building an object for each of those results and then writing it to the cache causes a lot of contention.

Frankly we are completely stumped as to why we are having such problems with locks on such a small system, and are at a loss for how to debug it. We are taking steps to decrease the amount of database traffic our system produces, getting rid of "get all" and "select *" queries when possible, but believe that there shouldn't be enough database traffic to lock up EclipseLink so easily.

I have gone through the EclipseLink source code looking at the areas where it is hanging, and read the comments and referenced bug reports, but don't have any more insight into what could be happening. If anyone has any suggestions for debugging or troubleshooting this, or EclipseLink parameters that we can set that might either improve the behavior or add more diagnostics, I would be extremely grateful.

java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:485)
org.eclipse.persistence.internal.helper.ConcurrencyManager.acquireDeferredLock(ConcurrencyManager.java:198)
org.eclipse.persistence.internal.identitymaps.CacheKey.acquireDeferredLock(CacheKey.java:184)
org.eclipse.persistence.internal.identitymaps.AbstractIdentityMap.acquireDeferredLock(AbstractIdentityMap.java:98)
org.eclipse.persistence.internal.identitymaps.IdentityMapManager.acquireDeferredLock(IdentityMapManager.java:119)
org.eclipse.persistence.internal.sessions.IdentityMapAccessor.acquireDeferredLock(IdentityMapAccessor.java:75)
org.eclipse.persistence.internal.sessions.AbstractSession.retrieveCacheKey(AbstractSession.java:4810)
org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildObject(ObjectBuilder.java:782)
org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildWorkingCopyCloneNormally(ObjectBuilder.java:723)
org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildObjectInUnitOfWork(ObjectBuilder.java:676)
org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildObject(ObjectBuilder.java:609)
org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildObject(ObjectBuilder.java:564)
org.eclipse.persistence.queries.ObjectLevelReadQuery.buildObject(ObjectLevelReadQuery.java:777)
org.eclipse.persistence.queries.ReadAllQuery.registerResultInUnitOfWork(ReadAllQuery.java:797)
org.eclipse.persistence.queries.ReadAllQuery.executeObjectLevelReadQuery(ReadAllQuery.java:434)
org.eclipse.persistence.queries.ObjectLevelReadQuery.executeDatabaseQuery(ObjectLevelReadQuery.java:1150)
org.eclipse.persistence.queries.DatabaseQuery.execute(DatabaseQuery.java:852)
org.eclipse.persistence.queries.ObjectLevelReadQuery.execute(ObjectLevelReadQuery.java:1109)
org.eclipse.persistence.queries.ReadAllQuery.execute(ReadAllQuery.java:393)
org.eclipse.persistence.queries.ObjectLevelReadQuery.executeInUnitOfWork(ObjectLevelReadQuery.java:1197)
org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.internalExecuteQuery(UnitOfWorkImpl.java:2879)
org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1607)
org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1589)
org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1554)
org.eclipse.persistence.internal.jpa.QueryImpl.executeReadQuery(QueryImpl.java:231)
org.eclipse.persistence.internal.jpa.QueryImpl.getResultList(QueryImpl.java:411)
(service call to get all objects from one table)

Re: Hanging and long delays obtaining read locks [message #1110930 is a reply to message #1108983] Tue, 17 September 2013 10:18
James Sutherland
Messages: 1939
Registered: July 2009
Location: Ottawa, Canada
Senior Member

If it is using acquireDeferredLock instead of acquireLock, the class must have a non-LAZY relationship, or the query must be using a fetch-join, so double-check this.

We have not seen any deadlocks or significant waiting in any of our concurrency tests, which run with more than 20 threads, so what you are seeing is odd. For a deadlock to occur, there must be a cycle of locks, so a single thread dump is not enough to diagnose an issue.
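One way to gather more than a single dump is to sample the stacks of all threads several times and keep only the threads that are inside the lock code in every sample. The sketch below does that from inside the JVM using only Thread.getAllStackTraces(). `WaitSampler` and its methods are hypothetical helper names, it assumes thread names are unique, and it only shows which threads stay parked. It does not reconstruct the lock cycle, since the JVM knows nothing about locks that are implemented with wait().

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class WaitSampler {

    /** Names of threads that currently have a frame from the given class on their stack. */
    static Set<String> threadsInClass(String className) {
        Set<String> names = new TreeSet<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().equals(className)) {
                    names.add(entry.getKey().getName());
                    break;
                }
            }
        }
        return names;
    }

    /** Threads that were inside the given class in every one of the samples taken. */
    static Set<String> persistentlyInClass(String className, int samples, long intervalMillis) {
        Set<String> stuck = threadsInClass(className);
        for (int i = 1; i < samples; i++) {
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            // Keep only threads that are still inside the class in this sample.
            stuck.retainAll(threadsInClass(className));
        }
        return stuck;
    }

    public static void main(String[] args) {
        // Class name taken from the stack traces in this thread.
        String target = "org.eclipse.persistence.internal.helper.ConcurrencyManager";
        System.out.println(persistentlyInClass(target, 3, 1000L));
    }
}
```

A thread that appears in every sample is a candidate for the cycle. The full stacks of those threads, taken together, are what a bug report would need.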

If you can recreate something with a simplified test, your best option may be to log a bug.

Also, first try the latest 2.5.x release.



Re: Hanging and long delays obtaining read locks [message #1110974 is a reply to message #1110930] Tue, 17 September 2013 11:27
Randy Tidd
Thank you for your reply. Incidentally, I just found this message from the eclipselink users mailing list (Jan 2013) which looks very similar to our issue, though unfortunately there were no replies to the message. This was with EclipseLink 2.3.2 (we are on 2.4.2).

http://dev.eclipse.org/mhonarc/lists/eclipselink-users/msg07690.html

I actually spent some time trying to figure out why it was trying to obtain deferred locks. ClassDescriptor.postInitialize() is setting setShouldAcquireCascadedLocks(true) for the class descriptor used in the query based on this code:

            if (!shouldAcquireCascadedLocks()) {
                if (mapping.isForeignReferenceMapping()){
                    if (!((ForeignReferenceMapping)mapping).usesIndirection()){
                        setShouldAcquireCascadedLocks(true);
                    }
                    hasRelationships = true;
                }

I am not sure I totally understand this but it seems that if my entity is the destination of a ManyToOne relationship, this flag is set to true, and this flag is part of this line in ObjectLevelReadQuery:

setRequiresDeferredLocks(DeferredLockManager.SHOULD_USE_DEFERRED_LOCKS && (hasJoining() || (this.descriptor.shouldAcquireCascadedLocks())));

If there were a way for us to avoid deferred locks and get around that deadlock, I would be eager to hear more about it. We are not intentionally using non-lazy or join fetches, though I suspect that EclipseLink is using them behind the scenes for some reason.
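To make the two quoted conditions easier to reason about, here is a simplified model of them. It is a sketch, not EclipseLink code: `DeferredLockModel` and its `Mapping` class are hypothetical stand-ins, and the global SHOULD_USE_DEFERRED_LOCKS switch is assumed to be on. What it shows is that, as quoted, the flag is driven by the entity's own relationship mappings: one relationship without indirection is enough to require deferred locks, even with no join in the query.

```java
import java.util.Arrays;
import java.util.List;

class DeferredLockModel {

    /** Simplified stand-in for a mapping; not an EclipseLink type. */
    static final class Mapping {
        final String attribute;
        final boolean foreignReference; // a relationship to another entity
        final boolean usesIndirection;  // true when the relationship is loaded lazily

        Mapping(String attribute, boolean foreignReference, boolean usesIndirection) {
            this.attribute = attribute;
            this.foreignReference = foreignReference;
            this.usesIndirection = usesIndirection;
        }
    }

    /** Models the quoted postInitialize() check: any relationship without indirection sets the flag. */
    static boolean shouldAcquireCascadedLocks(List<Mapping> mappings) {
        for (Mapping mapping : mappings) {
            if (mapping.foreignReference && !mapping.usesIndirection) {
                return true;
            }
        }
        return false;
    }

    /** Models the quoted ObjectLevelReadQuery line, with deferred locks assumed enabled globally. */
    static boolean requiresDeferredLocks(List<Mapping> mappings, boolean hasJoining) {
        return hasJoining || shouldAcquireCascadedLocks(mappings);
    }

    public static void main(String[] args) {
        List<Mapping> deal = Arrays.asList(
                new Mapping("name", false, false),   // basic attribute
                new Mapping("owner", true, false));  // relationship without indirection
        System.out.println(requiresDeferredLocks(deal, false)); // prints true
    }
}
```

Because JPA defaults @ManyToOne and @OneToOne to EAGER, a relationship with no explicit fetch type would fall into the "no indirection" case in this model.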

All of our JPA classes have an embedded type; all of our database tables have audit columns to track creation/update times and this is implemented via @Embedded. Could this have anything to do with it?

We make use of inheritance, though we see deadlocks on fetches for entities both with and without inheritance.

I suspect that there must be something unusual about our data model (inheritance, relationships, etc.) that is tripping a concurrency bug outside the normal case. Either way, we must find an answer for this.

However, note that we also have deadlocks in the non-deferred case, i.e. ConcurrencyManager.acquire(). Below is one of the stacks that leads to that call. The source of this stack is a JPQL query like "select d from Deal d", which in this case is fetching only 75 objects.

I would really appreciate any tips or ideas that you (or anyone else) might have to help diagnose this issue. It is very widespread and is a show-stopper for putting our system in production because it can't run for more than a few hours without hanging. Thanks very much in advance.

Randy

java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:485)
weblogic.work.ExecuteThread.waitForRequest(ExecuteThread.java:205)
weblogic.work.ExecuteThread.run(ExecuteThread.java:226)
"[STUCK] ExecuteThread: '49' for queue: 'weblogic.kernel.Default (self-tuning)'" waiting for lock org.eclipse.persistence.internal.helper.ConcurrencyManager@26e574b2 WAITING
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:485)
org.eclipse.persistence.internal.helper.ConcurrencyManager.acquire(ConcurrencyManager.java:94)
org.eclipse.persistence.internal.identitymaps.CacheKey.acquire(CacheKey.java:133)
org.eclipse.persistence.internal.identitymaps.AbstractIdentityMap.acquireLock(AbstractIdentityMap.java:122)
org.eclipse.persistence.internal.identitymaps.IdentityMapManager.acquireLock(IdentityMapManager.java:150)
org.eclipse.persistence.internal.sessions.IdentityMapAccessor.acquireLock(IdentityMapAccessor.java:93)
org.eclipse.persistence.internal.sessions.IdentityMapAccessor.acquireLock(IdentityMapAccessor.java:84)
org.eclipse.persistence.internal.sessions.AbstractSession.retrieveCacheKey(AbstractSession.java:4834)
org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildObject(ObjectBuilder.java:782)
org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildWorkingCopyCloneNormally(ObjectBuilder.java:723)
org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildObjectInUnitOfWork(ObjectBuilder.java:676)
org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildObject(ObjectBuilder.java:609)
org.eclipse.persistence.internal.descriptors.ObjectBuilder.buildObject(ObjectBuilder.java:564)
org.eclipse.persistence.queries.ObjectLevelReadQuery.buildObject(ObjectLevelReadQuery.java:777)
org.eclipse.persistence.queries.ReadAllQuery.registerResultInUnitOfWork(ReadAllQuery.java:797)
org.eclipse.persistence.queries.ReadAllQuery.executeObjectLevelReadQuery(ReadAllQuery.java:434)
org.eclipse.persistence.queries.ObjectLevelReadQuery.executeDatabaseQuery(ObjectLevelReadQuery.java:1150)
org.eclipse.persistence.queries.DatabaseQuery.execute(DatabaseQuery.java:852)
org.eclipse.persistence.queries.ObjectLevelReadQuery.execute(ObjectLevelReadQuery.java:1109)
org.eclipse.persistence.queries.ReadAllQuery.execute(ReadAllQuery.java:393)
org.eclipse.persistence.queries.ObjectLevelReadQuery.executeInUnitOfWork(ObjectLevelReadQuery.java:1197)
org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.internalExecuteQuery(UnitOfWorkImpl.java:2879)
org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1607)
org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1589)
org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1554)
org.eclipse.persistence.internal.jpa.QueryImpl.executeReadQuery(QueryImpl.java:231)
org.eclipse.persistence.internal.jpa.QueryImpl.getResultList(QueryImpl.java:411)
Re: Hanging and long delays obtaining read locks [message #1111595 is a reply to message #1110974] Wed, 18 September 2013 08:37
James Sutherland

Ensure you are correctly using weaving. LAZY is not supported unless you are using weaving.

Re: Hanging and long delays obtaining read locks [message #1111670 is a reply to message #1111595] Wed, 18 September 2013 10:56
Randy Tidd
Thanks again James for your reply. We are using "internal" weaving:

<property name="eclipselink.weaving.internal" value="true"/>

To be honest we haven't given any consideration to weaving beyond this setting. Do you think that configured weaving, or static weaving, may have an impact on this cache behavior?

This is a J2EE setup with WebLogic 10.3.6 and EJB 3.0 and I have read that weaving is supported automatically.

My assumption about dynamic vs. static weaving is that static weaving could save some time at startup but wouldn't otherwise influence the performance of the system. We have about 120 JPA classes in the system, so more than a few but I wouldn't expect this to cause major problems. If I have that wrong, please let me know.

If there is any documentation about weaving that gives a good view of the pros and cons of different approaches, that'd be really helpful. I found this guide but it does not have a lot of detail:

http://www.eclipse.org/eclipselink/documentation/2.4/solutions/testingjpa004.htm
Re: Hanging and long delays obtaining read locks [message #1116669 is a reply to message #1111670] Wed, 25 September 2013 14:26
James Sutherland

If you are using a managed JPA context in JavaEE (such as injected into EJB) then weaving should be enabled by default.
If you are using application-managed persistence units, or Spring, then it may be possible that weaving is not enabled. It should be obvious if weaving is not enabled, as you would see all of your 1-1 relationships loading eagerly.

Also ensure you are using LAZY fetching in all of your 1-1 relationships.
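Besides watching for eager loading, a rough runtime check for weaving is to see whether a loaded entity class carries EclipseLink's weaving marker interface. The sketch below rests on an assumption: that woven classes implement `org.eclipse.persistence.internal.weaving.PersistenceWeaved`. This is an internal name and should be verified against the EclipseLink version in use. `WeavingCheck` and its methods are hypothetical helpers, and the lookup is done by name, so the code compiles without EclipseLink on the classpath.

```java
class WeavingCheck {

    // Assumption: EclipseLink adds this marker interface to woven entity classes.
    static final String WEAVED_MARKER =
            "org.eclipse.persistence.internal.weaving.PersistenceWeaved";

    /** True if the type, a superclass, or a super-interface implements an interface with this name. */
    static boolean implementsInterfaceNamed(Class<?> type, String interfaceName) {
        for (Class<?> current = type; current != null; current = current.getSuperclass()) {
            for (Class<?> iface : current.getInterfaces()) {
                if (iface.getName().equals(interfaceName)
                        || implementsInterfaceNamed(iface, interfaceName)) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Heuristic only: reports whether the class appears to have been woven. */
    static boolean looksWoven(Class<?> entityClass) {
        return implementsInterfaceNamed(entityClass, WEAVED_MARKER);
    }

    public static void main(String[] args) {
        // A plain JDK class is never woven, so this prints false.
        System.out.println(looksWoven(String.class));
    }
}
```

Calling looksWoven(entity.getClass()) on an object returned from a query, inside the running application, would give a quick yes/no for each entity type.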

