Eclipse Community Forums: EMF » (no subject)

(no subject) [message #687085]

Sat, 28 May 2011 12:12

John Smith

I took one heap dump when 50% of my model had been loaded from a file
into the CDO server, and another shortly before 100%. I compared the
two dumps with the Memory Analyzer and found two suspicious increases
in instance counts that account for the memory consumption:

Class Name                                                                               | Objects #1 | Objects #2 | Shallow Heap #1 | Shallow Heap #2
----------------------------------------------------------------------------------------------------------------------------------------------------
org.eclipse.emf.cdo.internal.common.revision.AbstractCDORevisionCache$CacheSoftReference |        642 |    +35.283 |          30.816 |      +1.693.584
org.eclipse.emf.cdo.internal.common.id.CDOIDObjectLongImpl                               |    117.549 |    +43.008 |       1.880.784 |        +688.128
----------------------------------------------------------------------------------------------------------------------------------------------------

So there are 35.283 more instances of CacheSoftReference in the second
heap dump (1.693.584 bytes more shallow heap).
Viewing the Paths To GC Roots, I found that they are referenced in
CDORevisionCacheNonAuditing by this hash map:

private Map<CDOID, Reference<InternalCDORevision>> revisions = new HashMap<CDOID, Reference<InternalCDORevision>>();

As I understand the cache concept, clearing the cache does not change
application behavior (apart from performance). So to solve the problem
of growing RAM usage, would a java.util.WeakHashMap not do a better job
here? Then the cache could be partially cleared when there is too
little RAM.

Regarding the growing number of CDOIDObjectLongImpl instances: they
serve as keys in the hash map mentioned above, but they are also used
as key and value in
org.eclipse.emf.cdo.server.internal.db.mapping.horizontal.ObjectTypeCache$MemoryCache:

private Map<CDOID, CDOID> memoryCache;

... which is assigned a LinkedHashMap<CDOID, CDOID> instance.

As it is a cache, shouldn't it also be a WeakHashMap? OK, the "Linked"
behavior would be lost, but I did not understand how it is used in this
code. As I read the documentation, LinkedHashMap can define an
iteration order in different ways (the iteration order of a plain
HashMap is unspecified), but I found no code that actually requests an
iterator of memoryCache.

To conclude, I found these two memory leaks in CDO (besides a memory
leak in my own code, which I was already aware of; maybe I will change
the responsible POJO structure to a CDO structure, which on the one
hand must be stored in the DB but on the other hand can then be garbage
collected on the client). As these two leaks are fortunately located in
cache functionality, they should be easy to solve by using weak
references.

(no subject) [message #687086 is a reply to message #687085]

Sat, 28 May 2011 12:41

Eike Stepper

On 28.05.2011 14:12, exquisitus wrote:
> I took one heap dump when 50% of my model had been loaded from a file into the CDO server, and another shortly before 100%. I compared the two dumps with the Memory Analyzer and found two suspicious increases in instance counts that account for the memory consumption:
>
> Class Name | Objects #1 | Objects #2 | Shallow Heap #1 | Shallow Heap #2
> ----------------------------------------------------------------------------------------------------------------------------------------------------
> org.eclipse.emf.cdo.internal.common.revision.AbstractCDORevisionCache$CacheSoftReference| 642 | +35.283 | 30.816 | +1.693.584
> org.eclipse.emf.cdo.internal.common.id.CDOIDObjectLongImpl | 117.549 | +43.008 | 1.880.784 | +688.128
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
>
> So there are 35.283 more instances of CacheSoftReference in the second heap dump (1.693.584 bytes more shallow heap).
> Viewing the Paths To GC Roots, I found that they are referenced in CDORevisionCacheNonAuditing by this hash map:
>
> private Map<CDOID, Reference<InternalCDORevision>> revisions = new HashMap<CDOID, Reference<InternalCDORevision>>();
>
>
> As I understand the cache concept, clearing the cache does not change application behavior (apart from performance). So to solve the problem of growing RAM usage, would a java.util.WeakHashMap not do a better job here? Then the cache could be partially cleared when there is too little RAM.
The Java spec does not guarantee any particular behaviour for weak or soft references. Usually a JVM with the -client option handles weak references as if they were soft references. A difference usually shows only in JVMs started with the -server option. Generally no one benefits from free memory as such, but I suspect that weakly referenced objects are garbage collected before softly referenced objects when memory runs low. A framework can hardly judge which objects are more expensive to recreate. I doubt that weak references are a good idea for caches whose objects are expensive to recreate.

>
> Regarding the growing number of CDOIDObjectLongImpl instances: they serve as keys in the hash map mentioned above, but they are also used as key and value in
> org.eclipse.emf.cdo.server.internal.db.mapping.horizontal.ObjectTypeCache$MemoryCache:
>
> private Map<CDOID, CDOID> memoryCache;
>
> .. which is assigned a LinkedHashMap<CDOID, CDOID> instance.
Note how MemoryCache cares for eviction of the eldest element if the configured capacity is exceeded.

>
> As it is a cache, shouldn't it also be a WeakHashMap?
No, it is basically a fixed-size LRU cache.

> OK, the "Linked" behavior would be lost, but I did not understand how it is used in this code.
To be able to remove the eldest (LRU) element quickly.

> As I read the documentation, LinkedHashMap can define an iteration order in different ways (the iteration order of a simple HashMap is more undefined), but I found no code where actually an iterator of memoryCache is queried.
The actual eviction logic is in LinkedHashMap.
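
The eviction hook mentioned above can be sketched as follows. This is a minimal illustration, not CDO's actual MemoryCache: the class names LruCache and LruCacheDemo and the capacity of 2 are made up for the example, and whether CDO's cache uses access order or insertion order is not shown in this thread. LinkedHashMap itself calls removeEldestEntry after each insertion, which is why no iterator over the map shows up in the calling code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Fixed-capacity LRU cache on top of LinkedHashMap (illustrative only). */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int capacity; // hypothetical configuration value

    LruCache(int capacity) {
        // accessOrder = true: iteration runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Invoked by LinkedHashMap after an insertion; true drops the eldest entry.
        return size() > capacity;
    }
}

final class LruCacheDemo {
    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // capacity exceeded, so "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

With the default constructor (insertion order) the same override would evict the oldest inserted entry instead of the least recently used one.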

>
> To conclude, I found these two memory leaks in CDO
That's wrong. Neither case causes a memory leak. In the case of the CDORevisionCache(s) you may indeed see memory consumption increase, but if memory runs low enough the softly reachable CDORevisions will be garbage collected and the respective map entries will be removed from the cache map. If that really does not happen (unlikely), then there is a bug in the cache implementation. In that case, please submit a Bugzilla.

Where is the stack trace of the OutOfMemoryError that could be an indication for a memory leak?

> (besides a memory leak in my own code, which I was already aware of; maybe I will change the responsible POJO structure to a CDO structure, which on the one hand must be stored in the DB but on the other hand can then be garbage collected on the client), but as these two leaks are fortunately located in cache functionality, they should be easy to solve by using weak references.
I doubt that, in general, changing soft references to weak references can cure an evident memory leak.

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper

(no subject) [message #687087 is a reply to message #687086]

Sun, 29 May 2011 02:53

John Smith

Hi Eike,

> I doubt that, in general, changing soft references to weak references
> can cure an evident memory leak.

I did not actually propose changing soft references to weak references.
I proposed changing strong references to weak references by using a
WeakHashMap instead of a HashMap. I don't care that the HashMap's
values are CacheSoftReference instances; the value type is irrelevant
for the memory problem. I care that the HashMap's keys are never
garbage collected. When memory is running low, a WeakHashMap can remove
those keys that are not referenced by the rest of the application (and
I guess this is the case here).

Another solution for this problem would be to use an LRU cache for
CDORevisionCacheNonAuditing#revisions, in the same way as was done for
ObjectTypeCache$MemoryCache. It would then be defined like this:

private Map<CDOID, Reference<InternalCDORevision>> revisions =
    new LinkedHashMap<CDOID, Reference<InternalCDORevision>>() {

  @Override
  protected boolean removeEldestEntry(
      java.util.Map.Entry<CDOID, Reference<InternalCDORevision>> eldest)
  {
    // Evict the eldest entry once the map holds more than 100000 entries.
    return size() > 100000;
  }
};

But I actually don't like this concept, since it is hard to find a good
limit like 100000 and hard to adapt it to the user's actually available
memory. Some have 2 GB, some have 64-bit machines with 16 GB of RAM.
Are there heuristics for adjusting this limit? SoftReference's
documentation says "Virtual machine implementations are, however,
encouraged to bias against clearing recently-created or recently-used
soft references." So since the documentation is from Sun, they should
also have implemented some LRU strategy for e.g. a WeakHashMap. The
user could then roughly manage this limit himself through the -Xmx
setting when WeakHashMaps are used directly.

> That's wrong. Both cases do not cause memory leaks. In the case of the
> CDORevisionCache(s) you may indeed encounter an increase in memory
> consumption but if memory goes low enough the softly reachable
> CDORevisions will be garbage collected and the respective map entries
> will be removed from the cache map. If that does really not happen
> (unlikely) then there is a bug in the cache implementation.

Ah, here is a misunderstanding, either on my side or yours. You wrote
"the respective map entries will be removed from the cache map", but
which documentation says that this is done? If the CDORevision is
softly referenced by a CacheSoftReference and is garbage collected, the
CacheSoftReference instance itself will NOT be garbage collected at all
and will remain as a value in the HashMap. But even if the
CacheSoftReference instance itself were garbage collected, I don't see
which part of the Java framework could remove this value and the
corresponding key from the HashMap. The documentation of e.g.
SoftReference says nothing about removing strong references.

> Where is the stack trace of the OutOfMemoryError that could be an
> indication for a memory leak?

This is a problem. I ran the test with a bigger model overnight to
provoke an out-of-memory error, restricting RAM to 60 MB and trying to
read a 114 MB model file into the database. I use the JVM acceptor, so
client and server have to share the 60 MB. I saved the partially built
model every 5000 new model elements (about 0.2% of the whole model), so
that the CLEAN CDOObjects can be garbage collected. However, at about
17.7% a timeout occurred. Since the save() of the previous 0.2% had
already taken nearly twice as long as normal, I guess the RAM limit was
nearly reached and much time was spent on releasing and allocating
memory. So I guess that behind the timeout exception there was an
out-of-memory error somewhere, which was caught but only resulted in a
non-responsive server. So the -XX:+HeapDumpOnOutOfMemoryError option
could not create a heap dump!

As I already said, I also have a memory leak problem in my own code:
while parsing the model file, I have to build up a list of unresolved
forward references, which can only be resolved when elements occurring
later in the file get parsed. (It's not an XMI file but an IFC file,
but the problem is nearly equivalent to that of XMI files.) I will
later resolve this problem by storing forward references as CDO objects
in the database. However, forward references are quite rare compared to
backward references in IFC files (I guess in XMI files the ratio is
50%/50%).

Here is the console output:

Saving at 16.498753336740343% #dirty=5000 %dirty=0.19566501204097264
Saving took 63938
Saving at 16.694418348781312% #dirty=5000 %dirty=0.1956650120409691
Saving took 63093
Saving at 16.897071396966602% #dirty=5000 %dirty=0.20265304818529017
Saving took 61688
Saving at 17.09273640900757% #dirty=5000 %dirty=0.1956650120409691
Saving took 60594
Saving at 17.29538945719286% #dirty=5000 %dirty=0.20265304818529017
Saving took 61890
Saving at 17.491054469233834% #dirty=5000 %dirty=0.19566501204097264
Saving took 98000
Saving at 17.69370751741912% #dirty=5000 %dirty=0.20265304818528662
[ERROR] org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
------------------------- END -------------------------

org.eclipse.net4j.util.transaction.TransactionException: org.eclipse.emf.cdo.util.CommitException: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.emf.cdo.eresource.impl.CDOResourceImpl.save(CDOResourceImpl.java:959)
at qut.part21.loader.Part21LoaderCDO.consume(Part21LoaderCDO.java:41)
at qut.part21.parser.ClearTextReader.jjtreeCloseNodeScope(ClearTextReader.java:18)
at qut.part21.parser.ClearTextReader.entity_instance(ClearTextReader.java:606)
at qut.part21.parser.ClearTextReader.entity_instance_list(ClearTextReader.java:534)
at qut.part21.parser.ClearTextReader.data_section(ClearTextReader.java:492)
at qut.part21.parser.ClearTextReader.exchange_file(ClearTextReader.java:81)
at qut.part21.parser.ClearTextReader.syntax(ClearTextReader.java:813)
at qut.part21.loader.Part21ResourceImpl.doLoad(Part21ResourceImpl.java:44)
at IFC2X3.util.IFC2X3ResourceImpl.doLoad(IFC2X3ResourceImpl.java:46)
at org.eclipse.emf.ecore.resource.impl.ResourceImpl.load(ResourceImpl.java:1511)
at org.eclipse.emf.ecore.resource.impl.ResourceImpl.load(ResourceImpl.java:1290)
at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.demandLoad(ResourceSetImpl.java:255)
at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.demandLoadHelper(ResourceSetImpl.java:270)
at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.getResource(ResourceSetImpl.java:397)
at org.eclipse.emf.cdo.internal.ui.actions.ImportRootsAction.getSourceResources(ImportRootsAction.java:111)
at qut.ifcm.cdo.editor.ImportIfcRootsAction.doRun(ImportIfcRootsAction.java:37)
at qut.ifcm.cdo.editor.ImportIfcRootsAction.testRun(ImportIfcRootsAction.java:63)
at qut.designview.test.ContainmentTest.testBasicContainment(ContainmentTest.java:105)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at org.eclipse.net4j.util.tests.AbstractOMTest.runBare(AbstractOMTest.java:214)
at org.eclipse.emf.cdo.tests.config.impl.ConfigTest.runBare(ConfigTest.java:504)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at org.eclipse.net4j.util.tests.AbstractOMTest.run(AbstractOMTest.java:260)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at org.eclipse.emf.cdo.tests.config.impl.ConfigTestSuite$TestWrapper.runTest(ConfigTestSuite.java:126)
at junit.framework.TestSuite.run(TestSuite.java:238)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run(TestSuite.java:238)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run(TestSuite.java:238)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: org.eclipse.emf.cdo.util.CommitException: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.emf.internal.cdo.transaction.CDOTransactionImpl.commit(CDOTransactionImpl.java:1072)
at org.eclipse.emf.cdo.eresource.impl.CDOResourceImpl.save(CDOResourceImpl.java:955)
... 44 more
Caused by: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.net4j.signal.RequestWithConfirmation.getRemoteException(RequestWithConfirmation.java:139)
at org.eclipse.net4j.signal.RequestWithConfirmation.setRemoteException(RequestWithConfirmation.java:128)
at org.eclipse.net4j.signal.SignalProtocol.handleRemoteException(SignalProtocol.java:423)
at org.eclipse.net4j.signal.RemoteExceptionIndication.indicating(RemoteExceptionIndication.java:63)
at org.eclipse.net4j.signal.Indication.doExtendedInput(Indication.java:55)
at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
at org.eclipse.net4j.signal.Indication.execute(Indication.java:49)
at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
... 5 more
Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)

(no subject) [message #687088 is a reply to message #687087]

Sun, 29 May 2011 06:57

Eike Stepper

On 29.05.2011 04:53, exquisitus wrote:
> Hi Eike,
>
>
> > I doubt that, in general, changing soft references to weak references
> > can cure an evident memory leak.
>
> I did not actually propose changing soft references to weak references. I proposed changing strong references to weak references by using a WeakHashMap instead of a HashMap.
WeakHashMaps are usually used for canonical mappings, i.e. for values that are needed no longer than the keys. That's not the case for the revision cache because revisions are used independently of their keys (CDOIDs) and because CDOIDs are not expected to be unique (e.g. two revisions that reference the same third revision are not required to use the same target CDOID *instance* ).

So, neither *weak* nor *key* makes sense in this type of cache.

> I don't care that the HashMap's values are CacheSoftReference instances; the value type is irrelevant for the memory problem.
What exactly is the memory problem? As I mentioned in my first reply, growing memory consumption is a good sign for a cache as long as it does not result in OutOfMemoryErrors. It's very unlikely that CDO swallows exceptions. To double check you can register your custom log and trace listeners: http://wiki.eclipse.org/FAQ_for_CDO_and_Net4j#How_can_I_enable_tracing.3F

> I care that the HashMap's keys are never garbage collected.
That's not true. The values (keyed soft refs) are registered with a ReferenceQueue where they are enqueued by the JVM automatically when the values are garbage collected. This queue is monitored by a worker thread that periodically polls the queue, takes the keys of the keyed soft refs and removes them (i.e. the cache map entries) from the cache map. This behaviour is configurable if you cast your cache to ReferenceQueueWorker:

org.eclipse.net4j.util.ref.ReferenceQueueWorker.setPollMillis(long)
org.eclipse.net4j.util.ref.ReferenceQueueWorker.setMaxWorkPerPoll(int)
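
A rough sketch of the mechanism described above, under stated assumptions: SoftCache, KeyedSoftReference, purge and expire are hypothetical names, not CDO or Net4j classes, and instead of a separate worker thread the sketch drains the ReferenceQueue on every access to stay short. The expire method only simulates what the garbage collector does (clear the referent, enqueue the reference) so that the cleanup can be observed deterministically.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Memory-sensitive cache sketch; every name in here is hypothetical. */
final class SoftCache<K, V> {
    /** Soft reference that remembers the key it is stored under. */
    private static final class KeyedSoftReference<K, V> extends SoftReference<V> {
        final K key;

        KeyedSoftReference(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, KeyedSoftReference<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    synchronized void put(K key, V value) {
        purge();
        map.put(key, new KeyedSoftReference<>(key, value, queue));
    }

    synchronized V get(K key) {
        purge();
        KeyedSoftReference<K, V> ref = map.get(key);
        return ref == null ? null : ref.get();
    }

    synchronized int size() {
        purge();
        return map.size();
    }

    /** Removes entries whose references were enqueued; returns how many were removed. */
    synchronized int purge() {
        int removed = 0;
        Reference<? extends V> ref;
        while ((ref = queue.poll()) != null) {
            @SuppressWarnings("unchecked")
            KeyedSoftReference<K, V> keyed = (KeyedSoftReference<K, V>) ref;
            // Only remove if the key still maps to this very reference.
            if (map.get(keyed.key) == keyed) {
                map.remove(keyed.key);
                removed++;
            }
        }
        return removed;
    }

    /** Demo hook: mimics a GC clearing and enqueuing the reference for a key. */
    synchronized void expire(K key) {
        KeyedSoftReference<K, V> ref = map.get(key);
        if (ref != null) {
            ref.clear();
            ref.enqueue();
        }
    }
}
```

The point of the keyed reference is that the queue hands back the reference object, not the referent, so the key has to travel with the reference for the map entry to be found and removed.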

> When memory is running low, a WeakHashMap can remove those keys, which are not referenced by the remainder application (and I guess this is the case here).
That's exactly the reason why it does not make sense. The cache content would likely disappear immediately after the first usage (remember what I said about the unspecified behaviour of weak refs).

>
> Another solution for this problem
So far I don't see a problem.

> would be to use a LRU cache for CDORevisionCacheNonAuditing#revisions , in the same way it was done for ObjectTypeCache$MemoryCache.
In the case of a revision cache it's not so easy because the lookup strategies are more complex: lookup by id,branch,time (range search) and lookup by id,branch,version. We used to ship an LRU cache but it turned out to be very hard to maintain and it offered almost no benefit compared to a purely memory-sensitive cache implementation (which would be needed anyway).

> So it should be defined that way:
>
> private Map<CDOID, Reference<InternalCDORevision>> revisions = new LinkedHashMap<CDOID, Reference<InternalCDORevision>>() {
>
> @Override
> protected boolean removeEldestEntry(java.util.Map.Entry<CDOID, Reference<InternalCDORevision>> eldest)
> {
> return size() > 100000;
> }
> }
>
> But I actually don't like this concept, since it is hard to find a good limit like 100000 and hard to adapt it to the user's actually available memory. Some have 2 GB, some have 64-bit machines with 16 GB of RAM. Are there heuristics for adjusting this limit?
The only advantage of LRU caches (in Java!) is that they can have a custom eviction policy applied, which is (in Java!) not possible for memory-sensitive caches. Some articles claim to have evidence that (in Java!) the best results can be achieved with two-level caches. Level1 is LRU that evicts by application policy to level2. Level2 is memory-sensitive. As I said, this turned out to be too complex for our non-standard access patterns.
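
The two-level idea can be sketched for the simple case of lookup by a single key. This is an illustration only, assuming hypothetical names (TwoLevelCache, isInLevel1) and ignoring the branch/time/version lookups that make it hard for revisions; stale level-2 entries are not purged here, which a real implementation would do with a ReferenceQueue.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Two-level cache sketch: strong LRU in front, soft references behind. */
final class TwoLevelCache<K, V> {
    private final Map<K, SoftReference<V>> level2 = new HashMap<>();
    private final LinkedHashMap<K, V> level1;

    TwoLevelCache(final int level1Capacity) {
        // Access-ordered map so that the eldest entry is the least recently used.
        level1 = new LinkedHashMap<K, V>(16, 0.75f, true) {
            private static final long serialVersionUID = 1L;

            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() <= level1Capacity) {
                    return false;
                }
                // Demote instead of discarding; the GC decides when level 2 shrinks.
                level2.put(eldest.getKey(), new SoftReference<>(eldest.getValue()));
                return true;
            }
        };
    }

    synchronized void put(K key, V value) {
        level2.remove(key);
        level1.put(key, value);
    }

    synchronized V get(K key) {
        V value = level1.get(key);
        if (value != null) {
            return value;
        }
        SoftReference<V> ref = level2.remove(key);
        value = ref == null ? null : ref.get();
        if (value != null) {
            level1.put(key, value); // promote; may demote another entry
        }
        return value;
    }

    synchronized boolean isInLevel1(K key) {
        return level1.containsKey(key);
    }
}
```

Level 1 applies the application's policy (here a plain capacity limit), while level 2 only keeps what the JVM is willing to keep.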

> SoftReference's documentation says "Virtual machine implementations are, however, encouraged to bias against clearing recently-created or recently-used soft references." So since the docu is from Sun, they should also have implemented some LRU strategy for e.g. a WeakHashMap. So depending on the -Xmx setting of the user, the user can roughly manage this limit himself when using WeakHashMaps directly in combination with the -Xmx setting.
>
> > That's wrong. Both cases do not cause memory leaks. In the case of the
> > CDORevisionCache(s) you may indeed encounter an increase in memory
> > consumption but if memory goes low enough the softly reachable
> > CDORevisions will be garbage collected and the respective map entries
> > will be removed from the cache map. If that does really not happen
> > (unlikely) then there is a bug in the cache implementation.
>
> Ah, here is a misunderstanding, either on my side or yours. You wrote "the respective map entries will be removed from the cache map", but which documentation says that this is done?
The code. We consider the cache mostly an implementation detail, but it strikes me that we should better expose the ReferenceQueueWorker configuration in CDORevisionCache.

> If the CDORevision is softly referenced by a CacheSoftReference and is garbage collected, the CacheSoftReference instance itself will NOT be garbage collected at all and will remain as a value in the HashMap. But even if the CacheSoftReference instance itself were garbage collected, I don't see which part of the Java framework could remove this value and the corresponding key from the HashMap. The documentation of e.g. SoftReference says nothing about removing strong references.
See above.

>
>
>
> > Where is the stack trace of the OutOfMemoryError that could be an
> > indication for a memory leak?
>
> This is a problem. I ran the test with a bigger model overnight to provoke an out-of-memory error, restricting RAM to 60 MB and trying to read a 114 MB model file into the database. I use the JVM acceptor, so client and server have to share the 60 MB.
If you want to find evidence of memory leaks I would not start with a heap max near the application minimum requirement under no load. If you want to test scalability you should find a heap max value that under normal load operates without any problems and then start to increase the load.

> I saved the partially built model every 5000 new model elements (about 0.2% of the whole model), so that the CLEAN CDOObjects can be garbage collected. However, at about 17.7% a timeout occurred,
Note that modify/commit operations do not have the same scalability characteristics as load/read operations. If you want to test the revision cache (in your scenario there are at least 2 such caches!), populate a huge repository in small steps (small enough that they have no problems with your heap size limit). Then restart the JVM and iterate over the entire model without holding on to any objects. You should then be able to see how the cache behaves.
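
The read-only pass could be sketched roughly like this. The Iterator stands in for whatever walks your model (for an EMF resource probably the iterator returned by getAllContents(), but that is an assumption about your setup), and TraversalProbe with its methods is a name made up for this example. The heap figure is only a coarse indication, since it also includes garbage that has not been collected yet.

```java
import java.util.Iterator;
import java.util.stream.IntStream;

/** Hypothetical probe: walks an iterator without retaining elements and logs heap usage. */
class TraversalProbe {

  /** Approximate heap currently in use, including not yet collected garbage. */
  static long usedHeapBytes() {
    Runtime rt = Runtime.getRuntime();
    return rt.totalMemory() - rt.freeMemory();
  }

  /** Visits every element, keeps no reference to it, and returns the number of elements seen. */
  static <T> long traverse(Iterator<T> contents, int reportEvery) {
    long count = 0;
    while (contents.hasNext()) {
      contents.next(); // touch the element but do not store it anywhere
      if (++count % reportEvery == 0) {
        System.out.printf("%d elements visited, used heap ~%d KB%n", count, usedHeapBytes() / 1024);
      }
    }
    return count;
  }

  public static void main(String[] args) {
    // Stand-in for a model traversal: 100000 small throwaway objects.
    Iterator<Object> standIn =
        IntStream.range(0, 100_000).mapToObj(i -> (Object) new byte[64]).iterator();
    System.out.println("visited " + traverse(standIn, 25_000) + " elements");
  }
}
```

If the used-heap figure climbs towards the limit and then drops again instead of ending in an OutOfMemoryError, the cache is releasing its content as intended.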

> but since the previous save() of the last 0.2% already took nearly twice as long as normal, I guess the RAM limit was nearly reached and much time was spent releasing and allocating memory.
Maybe, maybe not.

> So I guess that behind the timeout exception there was an out-of-memory exception somewhere, which was caught but only resulted in a non-responsive server. So the -XX:+HeapDumpOnOutOfMemoryError option could not create a heap dump!
Again, you are guessing.

>
> As I already said, I also have a memory leak problem in my code: while parsing the model file, I have to build up a list of unresolved forward references, which can only be resolved when elements occurring later in the file get parsed. (It's not an XMI file but actually an IFC file, but the problem is nearly equivalent to the one with XMI files.)
We seem to apply different semantics to the term "memory leak". I would not call the above scenario a memory leak. It's just a piece of transient information that *needs* to be kept in memory for an indeterminate, *but not indefinite* time.
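
To make the distinction concrete, here is a minimal sketch of such bookkeeping. It is not taken from your loader: ForwardReferenceResolver and its methods are hypothetical names, and the in-memory map of already defined elements is an assumption that a real importer would probably replace with a lookup in the repository. The pending entries exist only until their target has been parsed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical helper: keeps unresolved forward references only until their target shows up. */
class ForwardReferenceResolver<T> {
  // Assumption for this sketch: defined elements are kept in memory for backward references.
  private final Map<String, T> defined = new HashMap<>();
  private final Map<String, List<Consumer<T>>> pending = new HashMap<>();

  /** Called when a reference to targetId is parsed; setter stores the target in the referrer. */
  void reference(String targetId, Consumer<T> setter) {
    T target = defined.get(targetId);
    if (target != null) {
      setter.accept(target); // backward reference: resolve immediately
    } else {
      pending.computeIfAbsent(targetId, id -> new ArrayList<>()).add(setter); // forward reference
    }
  }

  /** Called when the element with the given id is parsed; resolves and forgets waiting references. */
  void define(String id, T element) {
    defined.put(id, element);
    List<Consumer<T>> waiting = pending.remove(id);
    if (waiting != null) {
      for (Consumer<T> setter : waiting) {
        setter.accept(element);
      }
    }
  }

  int pendingTargets() {
    return pending.size();
  }

  public static void main(String[] args) {
    ForwardReferenceResolver<String> resolver = new ForwardReferenceResolver<>();
    List<String> resolved = new ArrayList<>();
    resolver.reference("#2", resolved::add);
    System.out.println("pending after forward reference: " + resolver.pendingTargets());
    resolver.define("#2", "IfcWall");
    System.out.println("pending after definition: " + resolver.pendingTargets() + ", resolved=" + resolved);
  }
}
```

As long as every referenced element eventually appears in the file, the pending map shrinks back to empty by the end of the parse, which is what separates this from a leak.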

> I will later resolve this problem by storing forward references as CDO objects into the database, however forward references are quite rare in contrast to backward references in IFC files (I guess in XMI files, they are 50%/50%).
>
> I print out the console:
With only 60 MB for client and server together, this means almost nothing IMHO. Please try the approach that I outlined above.

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper

>
>
> Saving at 16.498753336740343% #dirty=5000 %dirty=0.19566501204097264
> Saving took 63938
> Saving at 16.694418348781312% #dirty=5000 %dirty=0.1956650120409691
> Saving took 63093
> Saving at 16.897071396966602% #dirty=5000 %dirty=0.20265304818529017
> Saving took 61688
> Saving at 17.09273640900757% #dirty=5000 %dirty=0.1956650120409691
> Saving took 60594
> Saving at 17.29538945719286% #dirty=5000 %dirty=0.20265304818529017
> Saving took 61890
> Saving at 17.491054469233834% #dirty=5000 %dirty=0.19566501204097264
> Saving took 98000
> Saving at 17.69370751741912% #dirty=5000 %dirty=0.20265304818528662
> [ERROR] org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
> at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
> at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
> at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
> at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
> at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
> at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
> at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
> at java.util.TimerThread.mainLoop(Timer.java:512)
> at java.util.TimerThread.run(Timer.java:462)
> [ERROR] org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
> at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
> at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
> at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
> at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
> at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
> at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
> at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
> at java.util.TimerThread.mainLoop(Timer.java:512)
> at java.util.TimerThread.run(Timer.java:462)
> ------------------------- END -------------------------
>
>
>
> org.eclipse.net4j.util.transaction.TransactionException: org.eclipse.emf.cdo.util.CommitException: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.emf.cdo.eresource.impl.CDOResourceImpl.save(CDOResourceImpl.java:959)
> at qut.part21.loader.Part21LoaderCDO.consume(Part21LoaderCDO.java:41)
> at qut.part21.parser.ClearTextReader.jjtreeCloseNodeScope(ClearTextReader.java:18)
> at qut.part21.parser.ClearTextReader.entity_instance(ClearTextReader.java:606)
> at qut.part21.parser.ClearTextReader.entity_instance_list(ClearTextReader.java:534)
> at qut.part21.parser.ClearTextReader.data_section(ClearTextReader.java:492)
> at qut.part21.parser.ClearTextReader.exchange_file(ClearTextReader.java:81)
> at qut.part21.parser.ClearTextReader.syntax(ClearTextReader.java:813)
> at qut.part21.loader.Part21ResourceImpl.doLoad(Part21ResourceImpl.java:44)
> at IFC2X3.util.IFC2X3ResourceImpl.doLoad(IFC2X3ResourceImpl.java:46)
> at org.eclipse.emf.ecore.resource.impl.ResourceImpl.load(ResourceImpl.java:1511)
> at org.eclipse.emf.ecore.resource.impl.ResourceImpl.load(ResourceImpl.java:1290)
> at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.demandLoad(ResourceSetImpl.java:255)
> at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.demandLoadHelper(ResourceSetImpl.java:270)
> at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.getResource(ResourceSetImpl.java:397)
> at org.eclipse.emf.cdo.internal.ui.actions.ImportRootsAction.getSourceResources(ImportRootsAction.java:111)
> at qut.ifcm.cdo.editor.ImportIfcRootsAction.doRun(ImportIfcRootsAction.java:37)
> at qut.ifcm.cdo.editor.ImportIfcRootsAction.testRun(ImportIfcRootsAction.java:63)
> at qut.designview.test.ContainmentTest.testBasicContainment(ContainmentTest.java:105)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at junit.framework.TestCase.runTest(TestCase.java:168)
> at org.eclipse.net4j.util.tests.AbstractOMTest.runBare(AbstractOMTest.java:214)
> at org.eclipse.emf.cdo.tests.config.impl.ConfigTest.runBare(ConfigTest.java:504)
> at junit.framework.TestResult$1.protect(TestResult.java:110)
> at junit.framework.TestResult.runProtected(TestResult.java:128)
> at junit.framework.TestResult.run(TestResult.java:113)
> at junit.framework.TestCase.run(TestCase.java:124)
> at org.eclipse.net4j.util.tests.AbstractOMTest.run(AbstractOMTest.java:260)
> at junit.framework.TestSuite.runTest(TestSuite.java:243)
> at org.eclipse.emf.cdo.tests.config.impl.ConfigTestSuite$TestWrapper.runTest(ConfigTestSuite.java:126)
> at junit.framework.TestSuite.run(TestSuite.java:238)
> at junit.framework.TestSuite.runTest(TestSuite.java:243)
> at junit.framework.TestSuite.run(TestSuite.java:238)
> at junit.framework.TestSuite.runTest(TestSuite.java:243)
> at junit.framework.TestSuite.run(TestSuite.java:238)
> at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
> at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
> at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
> Caused by: org.eclipse.emf.cdo.util.CommitException: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.emf.internal.cdo.transaction.CDOTransactionImpl.commit(CDOTransactionImpl.java:1072)
> at org.eclipse.emf.cdo.eresource.impl.CDOResourceImpl.save(CDOResourceImpl.java:955)
> ... 44 more
> Caused by: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.net4j.signal.RequestWithConfirmation.getRemoteException(RequestWithConfirmation.java:139)
> at org.eclipse.net4j.signal.RequestWithConfirmation.setRemoteException(RequestWithConfirmation.java:128)
> at org.eclipse.net4j.signal.SignalProtocol.handleRemoteException(SignalProtocol.java:423)
> at org.eclipse.net4j.signal.RemoteExceptionIndication.indicating(RemoteExceptionIndication.java:63)
> at org.eclipse.net4j.signal.Indication.doExtendedInput(Indication.java:55)
> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
> at org.eclipse.net4j.signal.Indication.execute(Indication.java:49)
> at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
> at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
> at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
> at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
> at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
> at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
> ... 5 more
> Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
> at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
> at java.util.TimerThread.mainLoop(Timer.java:512)
> at java.util.TimerThread.run(Timer.java:462)
>
>


(no subject) [message #687089 is a reply to message #687088]

Sun, 29 May 2011 07:00

Eike Stepper

One more note: Especially about the non-auditing revision cache I know that it is being used in production with huge model scenarios. It has undergone extensive testing and no memory leaks have been found.

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper

On 29.05.2011 08:57, Eike Stepper wrote:
> On 29.05.2011 04:53, exquisitus wrote:
>> Hi Eike,
>>
>>
>> > I doubt that, in general, changing soft references to weak references
>> > can cure an evident memory leak.
>>
>> I actually did not propose changing soft references to weak references; I proposed changing strong references to weak references by using a WeakHashMap instead of a HashMap.
> WeakHashMaps are usually used for canonical mappings, i.e. for values that are needed no longer than the keys. That's not the case for the revision cache because revisions are used independently of their keys (CDOIDs) and because CDOIDs are not expected to be unique (e.g. two revisions that reference the same third revision are not required to use the same target CDOID *instance*).
>
> So, neither *weak* nor *key* makes sense in this type of cache.
>
>> I don't care that the HashMap's values are CacheSoftReference instances; the data type is irrelevant for the memory problem.
> What exactly is the memory problem? As I mentioned in my first reply, growing memory consumption is a good sign for a cache as long as it does not result in OutOfMemoryErrors. It's very unlikely that CDO swallows exceptions. To double check you can register your custom log and trace listeners: http://wiki.eclipse.org/FAQ_for_CDO_and_Net4j#How_can_I_enable_tracing.3F
>
>> I care that the HashMap's keys are never garbage collected.
> That's not true. The values (keyed soft refs) are registered with a ReferenceQueue where they are enqueued by the JVM automatically when the values are garbage collected. This queue is monitored by a worker thread that periodically polls the queue, takes the keys of the keyed soft refs and removes them (i.e. the cache map entries) from the cache map. This behaviour is configurable if you cast your cache to ReferenceQueueWorker:
>
> org.eclipse.net4j.util.ref.ReferenceQueueWorker.setPollMillis(long)
> org.eclipse.net4j.util.ref.ReferenceQueueWorker.setMaxWorkPerPoll(int)
>
>> When memory is running low, a WeakHashMap can remove those keys that are not referenced by the remainder of the application (and I guess this is the case here).
> That's exactly the reason why it does not make sense. The cache content would be likely to disappear immediately after the first usage (remember what I said about unspec'ed behaviour of weak refs).
>
>>
>> Another solution for this problem
> So far I don't see a problem.
>
>> would be to use a LRU cache for CDORevisionCacheNonAuditing#revisions , in the same way it was done for ObjectTypeCache$MemoryCache.
> In the case of a revision cache it's not so easy because the lookup strategies are more complex: lookup by id,branch,time (range search) and lookup by id,branch,version. We used to ship an LRU cache but it turned out to be very hard to maintain and it offered almost no benefit compared to a purely memory-sensitive cache implementation (which would be needed anyway).
>
>> So it should be defined that way:
>>
>> // accessOrder = true makes the map evict the least recently used entry (the default is insertion order)
>> private Map<CDOID, Reference<InternalCDORevision>> revisions = new LinkedHashMap<CDOID, Reference<InternalCDORevision>>(16, 0.75f, true) {
>>
>>   @Override
>>   protected boolean removeEldestEntry(java.util.Map.Entry<CDOID, Reference<InternalCDORevision>> eldest)
>>   {
>>     return size() > 100000;
>>   }
>> };
>>
>> But I actually don't like this concept, since it is hard to find a good limit like 100000 and hard to adapt it to the memory actually available to the user: some have 2 GB, some have 64-bit machines with 16 GB of RAM. Are there heuristics for adjusting this limit?
> The only advantage of LRU caches (in Java!) is that they can have a custom eviction policy applied, which is (in Java!) not possible for memory-sensitive caches. Some articles claim to have evidence that (in Java!) the best results can be achieved with two-level caches. Level1 is LRU that evicts by application policy to level2. Level2 is memory-sensitive. As I said, this turned out to be too complex for our non-standard access patterns.
>
>> SoftReference's documentation says "Virtual machine implementations are, however, encouraged to bias against clearing recently-created or recently-used soft references." Since the documentation is from Sun, they should also have implemented some LRU strategy for, e.g., a WeakHashMap. So the user can roughly manage this limit himself by using WeakHashMaps directly in combination with the -Xmx setting.
>>
>> > That's wrong. Both cases do not cause memory leaks. In the case of the
>> > CDORevisionCache(s) you may indeed encounter an increase in memory
>> > consumption but if memory goes low enough the softly reachable
>> > CDORevisions will be garbage collected and the respective map entries
>> > will be removed from the cache map. If that does really not happen
>> > (unlikely) then there is a bug in the cache implementation.
>>
>> Ah, here is a misunderstanding, either on my side or yours. You wrote "the respective map entries will be removed from the cache map", but which documentation says that this is done?
> The code. We consider the cache mostly an implementation detail, but it strikes me that we should better expose the ReferenceQueueWorker configuration in CDORevisionCache.
>
>> If the CDORevision is softly referenced by a CacheSoftReference and is garbage collected, the CacheSoftReference instance itself will NOT be garbage collected at all and will remain as a value in the HashMap. But even if the CacheSoftReference instance itself were garbage collected, I don't see which part of the Java framework could remove this value and the corresponding key from the HashMap. The documentation of e.g. SoftReference says nothing about removing strong references.
> See above.
>
>>
>>
>>
>> > Where is the stack trace of the OutOfMemoryError that could be an
>> > indication for a memory leak?
>>
>> This is a problem. I ran the test with a bigger model overnight to provoke an out-of-memory exception, restricting RAM to 60 MB and trying to read a model file of 114 MB into the database. I use the JVM acceptor, so client and server have to share the 60 MB.
> If you want to find evidence of memory leaks I would not start with a heap max near the application minimum requirement under no load. If you want to test scalability you should find a heap max value that under normal load operates without any problems and then start to increase the load.
>
>> I saved the partially built model every 5000 new model elements (about 0.2% of the whole model), so that the CLEAN CDOObjects can be garbage collected. However, at about 17.7% a timeout occurred,
> Note that modify/commit operations do not have the same scalability characteristics as load/read operations. If you want to test the revision cache (in your scenario there are at least 2 such caches!), populate a huge repository in small steps (small enough that they have no problems with your heap size limit). Then restart the JVM and iterate over the entire model without holding on to any objects. You should then be able to see how the cache behaves.
>
>> but since the previous save() of the last 0.2% already took nearly twice as long as normal, I guess the RAM limit was nearly reached and much time was spent releasing and allocating memory.
> Maybe, maybe not.
>
>> So I guess that behind the timeout exception there was an out-of-memory exception somewhere, which was caught but only resulted in a non-responsive server. So the -XX:+HeapDumpOnOutOfMemoryError option could not create a heap dump!
> Again, you are guessing.
>
>>
>> As I already said, I also have a memory leak problem in my code: while parsing the model file, I have to build up a list of unresolved forward references, which can only be resolved when elements occurring later in the file get parsed. (It's not an XMI file but actually an IFC file, but the problem is nearly equivalent to the one with XMI files.)
> We seem to apply different semantics to the term "memory leak". I would not call the above scenario a memory leak. It's just a piece of transient information that *needs* to be kept in memory for an indeterminate, *but not indefinite* time.
>
>> I will later resolve this problem by storing forward references as CDO objects into the database, however forward references are quite rare in contrast to backward references in IFC files (I guess in XMI files, they are 50%/50%).
>>
>> I print out the console:
> With only 60 MB for client and server together, this means almost nothing IMHO. Please try the approach that I outlined above.
>
> Cheers
> /Eike
>
> ----
> http://www.esc-net.de
> http://thegordian.blogspot.com
> http://twitter.com/eikestepper
>
>
>>
>>
>> Saving at 16.498753336740343% #dirty=5000 %dirty=0.19566501204097264
>> Saving took 63938
>> Saving at 16.694418348781312% #dirty=5000 %dirty=0.1956650120409691
>> Saving took 63093
>> Saving at 16.897071396966602% #dirty=5000 %dirty=0.20265304818529017
>> Saving took 61688
>> Saving at 17.09273640900757% #dirty=5000 %dirty=0.1956650120409691
>> Saving took 60594
>> Saving at 17.29538945719286% #dirty=5000 %dirty=0.20265304818529017
>> Saving took 61890
>> Saving at 17.491054469233834% #dirty=5000 %dirty=0.19566501204097264
>> Saving took 98000
>> Saving at 17.69370751741912% #dirty=5000 %dirty=0.20265304818528662
>> [ERROR] org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
>> at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
>> at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
>> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
>> at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
>> at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
>> at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
>> at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> at java.lang.Thread.run(Thread.java:662)
>> Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
>> at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
>> at java.util.TimerThread.mainLoop(Timer.java:512)
>> at java.util.TimerThread.run(Timer.java:462)
>> [ERROR] org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
>> at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
>> at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
>> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
>> at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
>> at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
>> at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
>> at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> at java.lang.Thread.run(Thread.java:662)
>> Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
>> at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
>> at java.util.TimerThread.mainLoop(Timer.java:512)
>> at java.util.TimerThread.run(Timer.java:462)
>> ------------------------- END -------------------------
>>
>>
>>
>> org.eclipse.net4j.util.transaction.TransactionException: org.eclipse.emf.cdo.util.CommitException: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.emf.cdo.eresource.impl.CDOResourceImpl.save(CDOResourceImpl.java:959)
>> at qut.part21.loader.Part21LoaderCDO.consume(Part21LoaderCDO.java:41)
>> at qut.part21.parser.ClearTextReader.jjtreeCloseNodeScope(ClearTextReader.java:18)
>> at qut.part21.parser.ClearTextReader.entity_instance(ClearTextReader.java:606)
>> at qut.part21.parser.ClearTextReader.entity_instance_list(ClearTextReader.java:534)
>> at qut.part21.parser.ClearTextReader.data_section(ClearTextReader.java:492)
>> at qut.part21.parser.ClearTextReader.exchange_file(ClearTextReader.java:81)
>> at qut.part21.parser.ClearTextReader.syntax(ClearTextReader.java:813)
>> at qut.part21.loader.Part21ResourceImpl.doLoad(Part21ResourceImpl.java:44)
>> at IFC2X3.util.IFC2X3ResourceImpl.doLoad(IFC2X3ResourceImpl.java:46)
>> at org.eclipse.emf.ecore.resource.impl.ResourceImpl.load(ResourceImpl.java:1511)
>> at org.eclipse.emf.ecore.resource.impl.ResourceImpl.load(ResourceImpl.java:1290)
>> at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.demandLoad(ResourceSetImpl.java:255)
>> at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.demandLoadHelper(ResourceSetImpl.java:270)
>> at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.getResource(ResourceSetImpl.java:397)
>> at org.eclipse.emf.cdo.internal.ui.actions.ImportRootsAction.getSourceResources(ImportRootsAction.java:111)
>> at qut.ifcm.cdo.editor.ImportIfcRootsAction.doRun(ImportIfcRootsAction.java:37)
>> at qut.ifcm.cdo.editor.ImportIfcRootsAction.testRun(ImportIfcRootsAction.java:63)
>> at qut.designview.test.ContainmentTest.testBasicContainment(ContainmentTest.java:105)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> at java.lang.reflect.Method.invoke(Method.java:597)
>> at junit.framework.TestCase.runTest(TestCase.java:168)
>> at org.eclipse.net4j.util.tests.AbstractOMTest.runBare(AbstractOMTest.java:214)
>> at org.eclipse.emf.cdo.tests.config.impl.ConfigTest.runBare(ConfigTest.java:504)
>> at junit.framework.TestResult$1.protect(TestResult.java:110)
>> at junit.framework.TestResult.runProtected(TestResult.java:128)
>> at junit.framework.TestResult.run(TestResult.java:113)
>> at junit.framework.TestCase.run(TestCase.java:124)
>> at org.eclipse.net4j.util.tests.AbstractOMTest.run(AbstractOMTest.java:260)
>> at junit.framework.TestSuite.runTest(TestSuite.java:243)
>> at org.eclipse.emf.cdo.tests.config.impl.ConfigTestSuite$TestWrapper.runTest(ConfigTestSuite.java:126)
>> at junit.framework.TestSuite.run(TestSuite.java:238)
>> at junit.framework.TestSuite.runTest(TestSuite.java:243)
>> at junit.framework.TestSuite.run(TestSuite.java:238)
>> at junit.framework.TestSuite.runTest(TestSuite.java:243)
>> at junit.framework.TestSuite.run(TestSuite.java:238)
>> at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
>> at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
>> at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
>> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
>> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
>> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
>> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
>> Caused by: org.eclipse.emf.cdo.util.CommitException: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.emf.internal.cdo.transaction.CDOTransactionImpl.commit(CDOTransactionImpl.java:1072)
>> at org.eclipse.emf.cdo.eresource.impl.CDOResourceImpl.save(CDOResourceImpl.java:955)
>> ... 44 more
>> Caused by: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.net4j.signal.RequestWithConfirmation.getRemoteException(RequestWithConfirmation.java:139)
>> at org.eclipse.net4j.signal.RequestWithConfirmation.setRemoteException(RequestWithConfirmation.java:128)
>> at org.eclipse.net4j.signal.SignalProtocol.handleRemoteException(SignalProtocol.java:423)
>> at org.eclipse.net4j.signal.RemoteExceptionIndication.indicating(RemoteExceptionIndication.java:63)
>> at org.eclipse.net4j.signal.Indication.doExtendedInput(Indication.java:55)
>> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
>> at org.eclipse.net4j.signal.Indication.execute(Indication.java:49)
>> at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
>> at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> at java.lang.Thread.run(Thread.java:662)
>> Caused by: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
>> at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
>> at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
>> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
>> at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
>> at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
>> ... 5 more
>> Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
>> at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
>> at java.util.TimerThread.mainLoop(Timer.java:512)
>> at java.util.TimerThread.run(Timer.java:462)
>>
>>

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper


(no subject) [message #687090 is a reply to message #687088]

Sun, 29 May 2011 12:35

John Smith is currently offline

Friend

Messages: 137
Registered: July 2009

Senior Member

> That's not true. The values (keyed soft refs) are registered with a
> ReferenceQueue where they are enqueued by the JVM automatically when the
> values are garbage collected. This queue is monitored by a worker thread
> that periodically polls the queue, takes the keys of the keyed soft refs
> and removes them (i.e. the cache map entries) from the cache map. This
> behaviour is configurable if you cast your cache to ReferenceQueueWorker:
>
> org.eclipse.net4j.util.ref.ReferenceQueueWorker.setPollMillis(long)
> org.eclipse.net4j.util.ref.ReferenceQueueWorker.setMaxWorkPerPoll(int)

I was not aware of how ReferenceQueue works. In that case it is indeed
not a leak.
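
For illustration, the mechanism quoted above (keyed soft references that
are registered with a ReferenceQueue and later purged from the map) can be
sketched in plain Java. This is a minimal sketch under assumptions, not
CDO's or Net4j's actual code: the names SoftValueCache, KeyedSoftReference,
reference() and purge() are hypothetical, and purge() is called by hand
here, whereas the quoted description uses a worker thread that polls
periodically.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical soft-value cache; an illustration only, not CDO code. */
class SoftValueCache<K, V> {
    /** A soft reference that remembers the key of its map entry. */
    static final class KeyedSoftReference<K, V> extends SoftReference<V> {
        final K key;

        KeyedSoftReference(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue); // the JVM enqueues this ref after clearing it
            this.key = key;
        }
    }

    private final Map<K, KeyedSoftReference<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    synchronized void put(K key, V value) {
        map.put(key, new KeyedSoftReference<>(key, value, queue));
    }

    synchronized V get(K key) {
        KeyedSoftReference<K, V> ref = map.get(key);
        return ref == null ? null : ref.get();
    }

    /** Exposed only so that a demonstration can enqueue a reference by hand. */
    synchronized KeyedSoftReference<K, V> reference(K key) {
        return map.get(key);
    }

    synchronized int size() {
        return map.size();
    }

    /** Drains the queue and removes the stale map entries; returns how many were removed. */
    synchronized int purge() {
        int removed = 0;
        Reference<? extends V> ref;
        while ((ref = queue.poll()) != null) {
            @SuppressWarnings("unchecked")
            KeyedSoftReference<K, V> keyed = (KeyedSoftReference<K, V>) ref;
            // Remove the entry only if it still maps to this very reference.
            if (map.remove(keyed.key, keyed)) {
                removed++;
            }
        }
        return removed;
    }
}
```

The point of the keyed reference is that the key, too, becomes
unreachable from the map once purge() has run, so neither the values nor
the keys accumulate indefinitely.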

>> As I already said, I also have a memory leak problem in my code, since
>> while parsing the model file, I have to build up a list of unresolved
>> forward-references, which can be resolved when elements occuring later
>> in the file get parsed. (Its not an XMI file, actually an IFC file,
>> but the problem is nearly equivalent to XMI files).
> We seem to apply different semantics to the term "memory leak". I would
> not call the above scenario a memory leak. It's just a piece of
> transient information that *needs* to be kept in memory for an
> indeterministic, *but not indefinite* time.

OK, that was a little imprecise; I am used to seeing it from a tester's
point of view.

I have now figured out that a larger IFC model I tested was probably
generated by a different CAD vendor, since it contains almost no backward
references, only forward references. They seem to have used a completely
different approach to serialization. So first I have to store the forward
references in the DB before I can continue my leak search :) I plan to
use an int array; I think CDO supports this, right? As CDO currently has
no good solution for managing large containment lists (even if the list
only holds "smart pointers"), I have split the containment list of the
root element, which contains the whole model, into a binary tree. In
effect, about one or two containment helper elements are stored in
addition to every model element, each holding up to 3 references (one
for the 0 branch, one for the 1 branch, and one for the model element
itself), which I think is quite acceptable. Now I additionally want to
store an int array in the containment helper element to keep track of
the unresolved forward references of the element currently being parsed.
After the whole IFC file has been parsed, I will iterate over the whole
binary tree (using eAllContents()), resolve the forward references from
the int arrays, and perhaps reset the int arrays to [] since they are no
longer needed.
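
A minimal plain-Java sketch of such a binary containment tree follows. It
is an illustration under assumptions, not the actual EMF/CDO model: the
classes TreeNode and BinaryContainmentTree are hypothetical, elements are
addressed by a positive integer index, and the path is taken from the
bits of that index below its highest set bit. With this particular
numbering every node can hold an element, so the helper-per-element
ratio differs from the "one or two" mentioned above.

```java
/** Hypothetical helper node: a 0 branch, a 1 branch and the element itself. */
class TreeNode<E> {
    TreeNode<E> zero;   // child followed for a 0 bit
    TreeNode<E> one;    // child followed for a 1 bit
    E element;          // the model element stored at this node, if any
}

/** Sketch of a containment list split into a binary tree. */
class BinaryContainmentTree<E> {
    private final TreeNode<E> root = new TreeNode<>();

    /** Stores the element under a positive index; returns the depth of its node. */
    int put(int index, E element) {
        if (index < 1) {
            throw new IllegalArgumentException("index must be >= 1");
        }
        TreeNode<E> node = root;
        int depth = 0;
        // Walk the bits below the highest set bit, most significant first.
        for (int bit = 30 - Integer.numberOfLeadingZeros(index); bit >= 0; bit--) {
            boolean isOne = ((index >>> bit) & 1) == 1;
            TreeNode<E> child = isOne ? node.one : node.zero;
            if (child == null) {
                child = new TreeNode<>();
                if (isOne) {
                    node.one = child;
                } else {
                    node.zero = child;
                }
            }
            node = child;
            depth++;
        }
        node.element = element;
        return depth;
    }

    /** Returns the element stored under the index, or null if there is none. */
    E get(int index) {
        if (index < 1) {
            return null;
        }
        TreeNode<E> node = root;
        for (int bit = 30 - Integer.numberOfLeadingZeros(index); bit >= 0 && node != null; bit--) {
            node = ((index >>> bit) & 1) == 1 ? node.one : node.zero;
        }
        return node == null ? null : node.element;
    }
}
```

Only the nodes on the path from the root to one element have to be
touched per access, which is where the logarithmic bound comes from.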

I plan to contribute this solution to CDO since it seems very useful:
(a) On the client side, only O(log(modelsize)) RAM is needed to parse big
XMI files (the path from a leaf of the binary tree to the root). The
binary tree can be used as a map for backward references and can also
store forward references in an int array.
(b) It solves the problem of inefficient large lists when the model is
loaded from the CDO database.
(c) Serialization, which has the problem of remembering (XMI) IDs for
serialized elements, can also be done in O(log(modelsize)) time, since
the IDs are already stored in the binary tree.

Another thought of mine was to write to the DB directly, without CDO,
which should be faster. But I don't want to deal with SQL, and I want to
use CDO's garbage-collection-friendly model handling for the binary tree
itself on the client side. Nevertheless, it would be nice if the domain
metamodel and the domain editors did not need to know the EMF metamodel
of the binary tree. So the root element's containment list, say its
feature myContainedElements, should use the binary tree automatically. I
have partially achieved this by deriving the root element's metaclass
from the binary tree's root node metaclass. A further step would be to
mark myContainedElements as derived. Any idea for hiding the binary tree
better is welcome!


(no subject) [message #687099 is a reply to message #687090]

Mon, 30 May 2011 10:20

Stefan Winkler is currently offline

Friend

Messages: 307
Registered: July 2009
Location: Germany

Senior Member

Hi,

you wrote
> As CDO currently has no good solution to manage large containment lists

What exactly do you mean by a "good" solution?
good == performance (speed) or good == memory footprint?

Please be aware that
a) EMF (and thus, CDO) does not support int[] out of the box, you'd have
to write an EDataType along with a proper mapping (including a good CDO
TypeMapping)
b) The int[] would be kept as an atomic value. Each time you commit it,
it would be transferred and written into the DB as a whole (it would not
be delta-aware)

So, could you please elaborate a bit more
- why wouldn't you use an EAttribute of type EInt with isMany == true?
- why do you want to store the int[] in the DB in the first place? If I
understand correctly, it is only a helper field, and as such it would be
of transient nature, right? So you could store it locally (in memory or
in a temp file), couldn't you?

Cheers,
Stefan


(no subject) [message #687105 is a reply to message #687099]

Tue, 31 May 2011 04:25

John Smith is currently offline

Friend

Messages: 137
Registered: July 2009

Senior Member

> you wrote
>> As CDO currently has no good solution to manage large containment lists
>
> what exactly do you mean by "good" solution?
> good == performance (speed) or good == memory footprint

good == memory footprint. As I understand the CDO code, it always creates
a list of full size, which has a large memory footprint even if the items
are only CDOElementProxyImpl instances; see CDOListWithElementProxiesImpl.
Maybe a better solution would be to derive not from ArrayList but from
HashMap<Integer, Object>, mapping the index to the currently held value
(a CDOElementProxy, a CDOID or the model element itself). Then, if
listAsMap.get(i) returned null, the implementation could return
CDOElementProxyImpl(i), created on demand.

> - why wouldn't you use an EAttribute of type EInt with isMany == true?
I actually modeled it that way, sorry for any confusion.

> - why do you want to store the int[] in the DB in the first place? If I
> understand correctly, it is only a helper field, and as such it would be
> of transient nature, right? So you could store it locally (in memory or
> in a temp file), couldn't you?

Storing this transient information in memory is not an option, because it
would cause an OutOfMemoryError.
Storing it in a temp file would be an option, but that requires some
smart logic to write entries and find them again later in a big file,
which is exactly what a DB does, so I would rather use CDO.

I have now successfully tried this approach (storing forward references
as an int array with the model itself) for a small 6 MB IFC model.
For a larger 120 MB model, I got an OutOfMemoryError. I think the reason
was that ObjectTypeCache.DEFAULT_CACHE_CAPACITY was 100000, which was too
much as I only allowed 60 MB of RAM for server and client. I will run
another test with 90 MB (I want to keep this limit low in order to
provoke OutOfMemoryErrors and thereby find memory leaks).


(no subject) [message #687106 is a reply to message #687105]

Tue, 31 May 2011 08:47

Stefan Winkler is currently offline

Friend

Messages: 307
Registered: July 2009
Location: Germany

Senior Member

Hi,

On 31.05.11 06:25, exquisitus wrote:
>>> As CDO currently has no good solution to manage large containment lists
>>
>> what exactly do you mean by "good" solution?
>> good == performance (speed) or good == memory footprint
>
> good == memory footprint. As I understand the code of CDO, it creates a
> list of always full size, which has large memory footprint, even if the
> items are only CDOElementProxyImpl instances, see
> CDOListWithElementProxiesImpl. Maybe a better solution would be not to
> derive from ArrayList but from HashMap<Integer, Object>: thus mapping
> the index to the currently hold value (a CDOElementProxy, a CDOID or the
> model element itself). Then, if listAsMap.get(i) would return null, the
> implementation could actually return CDOElementProxyImpl(i), so
> demand-created.

Well, implementing a list as a hash map might work for the get(i)
method. But inserting, removing, and moving elements would be really
costly, as you'd have to change all subsequent indices (= hash keys!).
At the same time, we don't gain anything by this, because the reason for
the element proxies is to know that there is an element there, which
could be loaded on demand. You'd have to distinguish "no element"
(because the list isn't large enough) from "proxy element" (can be
loaded on demand) anyway. So the hash map does not yield any advantage.

> I now successful tried this approach (storing forward references as
> int-array with the model itself) for a small 6mb IFC model.
> For a larger 120mb model, I got an outofmem errpr. I think the reason
> was that ObjectTypeCache.DEFAULT_CACHE_CAPACITY was 100000, and this was
> to much as I only permitted 60mb RAM for server and client. I will do
> another test with 90MB (i want to keep this limit down, to produce
> outofmems, in order to find memory leaks).
I don't know whether you are aware that dirty objects (before committing
them) do not get GC'd. So, the transaction size (amount of
dirty/uncommitted objects) also is a source of OOMEs. If this is the
problem, you should commit more often...
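
The idea of bounding the number of dirty objects by committing in batches
can be sketched as follows. This is only an illustration: the interface
BatchTransaction and the class BatchingLoader are hypothetical stand-ins
and not part of the CDO API; in a real importer the commit would go
through the CDO transaction.

```java
/** Hypothetical stand-in for a transaction; not the CDO API. */
interface BatchTransaction<E> {
    void add(E element);

    void commit();
}

/** Commits after every batchSize added elements so that dirty objects stay bounded. */
class BatchingLoader<E> {
    private final BatchTransaction<E> tx;
    private final int batchSize;
    private int dirty; // elements added since the last commit

    BatchingLoader(BatchTransaction<E> tx, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.tx = tx;
        this.batchSize = batchSize;
    }

    /** Adds one parsed element and commits once the batch is full. */
    void consume(E element) {
        tx.add(element);
        if (++dirty >= batchSize) {
            flush();
        }
    }

    /** Commits whatever is still uncommitted, e.g. at the end of the file. */
    void flush() {
        if (dirty > 0) {
            tx.commit();
            dirty = 0;
        }
    }
}
```

With a batch size of n, at most n uncommitted elements are held strongly
at any time, apart from whatever the application itself still references.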

Cheers,
Stefan


(no subject) [message #687107 is a reply to message #687106]

Tue, 31 May 2011 09:49

John Smith is currently offline

Friend

Messages: 137
Registered: July 2009

Senior Member

> Well, implementing a list as a hash-map would maybe work for the get(i)
> method. But inserting, removing, and moving elements would be really
> costly, as you'd have to change all subsequent indices (= hash keys!).
The same effort is needed for lists! If you insert an element at
position i in a list, all following elements with an index greater than
i must be shifted. In a hash map, all keys greater than i must be adapted
accordingly. OK, this takes O(n) (iterating over all keys), in contrast
to O(n/2) for lists in the average case. But if the map contains only,
say, half of the elements, the effort is already the same. The fewer
elements the map contains, the better it performs compared to a list.
For large lists of which you only want to handle a few elements at a
time, a map performs much better.

By the way, adding an element at the end would be as fast as for a list.
In 99% of all cases you add at the end. Normally the above comparison is
only relevant when you want to delete elements from the list.

> At the same time, we don't win anything by this, because the reason for
> the element proxies is to know that there is an element there, which
> could be loaded on demand. You'd have to differentiate between no
> element (because list isn't large enough) from proxy element (can be
> loaded on demand) anyways. So the hashmap does not yield any advantage.

The list.size() method can be used to determine how many elements are
available. However, because it derives from ArrayList, the current
implementation has to add dummies, namely CDOElementProxyImpl instances,
to get list.size() "to the right number".

However, if the List interface were re-implemented so that its methods
used a HashMap, and if size() returned a separately managed counter of
how many elements are available, the HashMap approach could unfold its
full power.
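
To make the proposal concrete, here is a minimal sketch of such a
map-backed list. It is not CDO code: SparseList and ElementProxy are
hypothetical names, null elements are not supported, and only get, set
and add are implemented. The insert in the middle shows the re-keying
cost discussed above.

```java
import java.util.AbstractList;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical placeholder for an element that is not loaded yet. */
final class ElementProxy {
    final int index;

    ElementProxy(int index) {
        this.index = index;
    }
}

/** Sketch of a list that stores only the elements that are actually loaded. */
class SparseList extends AbstractList<Object> {
    private final Map<Integer, Object> loaded = new HashMap<>();
    private int size; // managed separately from the map

    SparseList(int initialSize) {
        this.size = initialSize;
    }

    @Override
    public int size() {
        return size;
    }

    @Override
    public Object get(int index) {
        checkIndex(index, size);
        Object value = loaded.get(index);
        // Nothing stored for this index: create the proxy on demand.
        return value != null ? value : new ElementProxy(index);
    }

    @Override
    public Object set(int index, Object element) {
        checkIndex(index, size);
        Object old = get(index);
        loaded.put(index, element);
        return old;
    }

    @Override
    public void add(int index, Object element) {
        checkIndex(index, size + 1);
        if (index < size) {
            // Inserting before the end: every loaded key >= index moves up by one.
            Map<Integer, Object> shifted = new HashMap<>();
            for (Map.Entry<Integer, Object> entry : loaded.entrySet()) {
                int key = entry.getKey();
                shifted.put(key >= index ? key + 1 : key, entry.getValue());
            }
            loaded.clear();
            loaded.putAll(shifted);
        }
        loaded.put(index, element);
        size++;
        modCount++;
    }

    /** Number of entries that actually occupy memory. */
    int loadedCount() {
        return loaded.size();
    }

    private static void checkIndex(int index, int bound) {
        if (index < 0 || index >= bound) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
    }
}
```

Appending touches a single map entry, while an insert in the middle
visits every loaded entry but none of the unloaded positions.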

>
>> I now successful tried this approach (storing forward references as
>> int-array with the model itself) for a small 6mb IFC model.
>> For a larger 120mb model, I got an outofmem errpr. I think the reason
>> was that ObjectTypeCache.DEFAULT_CACHE_CAPACITY was 100000, and this was
>> to much as I only permitted 60mb RAM for server and client. I will do
>> another test with 90MB (i want to keep this limit down, to produce
>> outofmems, in order to find memory leaks).
> I don't know whether you are aware that dirty objects (before committing
> them) do not get GC'd. So, the transaction size (amount of
> dirty/uncommitted objects) also is a source of OOMEs. If this is the
> problem, you should commit more often...
I commit every 5000 "dirty" elements, so this cannot be a source of OOME.


(no subject) [message #687108 is a reply to message #687107]

Tue, 31 May 2011 11:22

Stefan Winkler is currently offline

Friend

Messages: 307
Registered: July 2009
Location: Germany

Senior Member

Hi,

OK, let's step away from the technical details (list, hash map, etc.).
What you would really like to have is a more (memory-)efficient
implementation of partially loaded lists, right?

I am not sure whether this can be done (mainly because I think the
standard behavior should be kept and an optimization like the one you
propose should be configurable; also, I am currently unsure how much of
the structure/implementation is dictated by and dependent on EMF itself).

Technically, an implementation like a ChunkedList (a collection of
chunks of ArrayLists) would be preferable to a HashMap.

And finally, one would have to think about eviction strategies, because
currently, lists cannot be "unloaded" partially.
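
As a rough illustration of what a chunked, partially loaded list with
eviction could look like, here is a read-only sketch. Nothing of this
exists in CDO as far as this thread shows: the class ChunkedList, the
chunkLoader callback and unloadChunk() are assumptions made for the
example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical read-only list whose elements are loaded and evicted chunk by chunk. */
class ChunkedList<E> {
    private final int size;
    private final int chunkSize;
    private final IntFunction<List<E>> chunkLoader; // loads chunk number c on demand
    private final List<List<E>> chunks;             // a null slot means "not loaded"

    ChunkedList(int size, int chunkSize, IntFunction<List<E>> chunkLoader) {
        if (size < 0 || chunkSize < 1) {
            throw new IllegalArgumentException("size >= 0 and chunkSize >= 1 required");
        }
        this.size = size;
        this.chunkSize = chunkSize;
        this.chunkLoader = chunkLoader;
        int chunkCount = (size + chunkSize - 1) / chunkSize;
        this.chunks = new ArrayList<>(Collections.<List<E>>nCopies(chunkCount, null));
    }

    int size() {
        return size;
    }

    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        int c = index / chunkSize;
        List<E> chunk = chunks.get(c);
        if (chunk == null) {
            // Load only the chunk that contains the requested element.
            chunk = chunkLoader.apply(c);
            chunks.set(c, chunk);
        }
        return chunk.get(index % chunkSize);
    }

    /** Evicts one chunk; it is reloaded on the next access. */
    void unloadChunk(int c) {
        chunks.set(c, null);
    }

    int loadedChunks() {
        int count = 0;
        for (List<E> chunk : chunks) {
            if (chunk != null) {
                count++;
            }
        }
        return count;
    }
}
```

Unloaded regions cost one null slot per chunk instead of one proxy per
element, and an eviction strategy only has to decide which chunk to drop.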

Cheers,
Stefan

PS: I usually like to know who I'm talking to. So if you don't mind,
please put some name in your messages rather than "exquisitus". :-P


(no subject) [message #687302 is a reply to message #687085]

Sat, 28 May 2011 12:41

Eike Stepper is currently offline

Friend

Messages: 6682
Registered: July 2009

Senior Member

On 28.05.2011 14:12, exquisitus wrote:
> I made a heap dump while 50% of my model is loaded from a file to the CDO server, and another one shortly before 100%. I compared the resulting heap dumps using MemoryAnalyser, and found two suspect instance increasements responsible for memory consumption:
>
> Class Name | Objects #1 | Objects #2 | Shallow Heap #1 | Shallow Heap #2
> ----------------------------------------------------------------------------------------------------------------------------------------------------
> org.eclipse.emf.cdo.internal.common.revision.AbstractCDORevisionCache$CacheSoftReference| 642 | +35.283 | 30.816 | +1.693.584
> org.eclipse.emf.cdo.internal.common.id.CDOIDObjectLongImpl | 117.549 | +43.008 | 1.880.784 | +688.128
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
>
> So there are 1.693.584 more instances of CacheSoftReference in the second heap dump.
> Viewing the Paths To GC Roots, I found out that they are referenced in CDORevisionCacheNonAuditing by this hash map:
>
> private Map<CDOID, Reference<InternalCDORevision>> revisions = new HashMap<CDOID, Reference<InternalCDORevision>>();
>
>
> As I understand the cache concept, the application behavior will not be changed if the cached is cleared (except for performance). So to solve the problem of growing RAM usage, would a java.util.WeakHashMap not do a better job here? So the cache can partially be cleared if there is to less RAM.
The Java spec does not guarantee any particular behaviour for weak or soft references. Usually a JVM with the -client option handles weak references as if they were soft references. A difference usually exists only in JVMs started with the -server option. Generally no one benefits from free memory as such, but I suspect that weakly referenced objects are garbage collected before softly referenced objects when memory runs low. A framework can hardly judge which objects are more expensive to recreate. I doubt that weak references are a good idea for caches with objects that are expensive to recreate.

>
> Regarding the increasing size of instances of CDOIDObjectLongImpl, there play the key role in the above mentioned hashmap, but are also used as key and value at
> org.eclipse.emf.cdo.server.internal.db.mapping.horizontal.ObjectTypeCache$MemoryCache:
>
> private Map<CDOID, CDOID> memoryCache;
>
> .. which is assigned a LinkedHashMap<CDOID, CDOID> instance.
Note how MemoryCache cares for eviction of the eldest element if the configured capacity is exceeded.

>
> As it is a cache concept, shouldnt it also be a WeakHashMap?
No, it is basically a fixed size LRU cache.

> Ok, the "Linked" behavior is lost, but I not understood how it is used in this code.
To quickly be able to remove the eldest (LRU) element.

> As I read the documentation, LinkedHashMap can define an iteration order in different ways (the iteration order of a simple HashMap is more undefined), but I found no code where actually an iterator of memoryCache is queried.
The actual eviction logic is in LinkedHashMap.
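
For readers unfamiliar with this idiom, a fixed-size LRU cache built on
LinkedHashMap can be sketched like this. The class name LruCache is
hypothetical, and whether CDO's MemoryCache uses access order or
insertion order is not stated in this thread; access order is an
assumption of the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical fixed-size LRU cache; eviction is driven by LinkedHashMap itself. */
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently used
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; true evicts the eldest entry.
        return size() > capacity;
    }
}
```

No iterator is ever requested by the caller: LinkedHashMap keeps its
internal linked list up to date and asks removeEldestEntry() after each
put whether the head of that list should be dropped.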

>
> To conclude, I found these two memory leaks in CDO
That's wrong. Neither case causes a memory leak. In the case of the CDORevisionCache(s) you may indeed see an increase in memory consumption, but if memory gets low enough the softly reachable CDORevisions will be garbage collected and the respective map entries will be removed from the cache map. If that really does not happen (unlikely), then there is a bug in the cache implementation. Please submit a Bugzilla report in that case.

Where is the stack trace of the OutOfMemoryError that could be an indication for a memory leak?

> (beside a memory leak in my own code, but I was aware of this before and maybe I will change the responsible POJO structure to a CDO structure, which on the one side must be stored in the DB but on the other side can then be garbage collected on the client), but as these two leaks are fortunately located in cache functionality, this should be easily solved by using weak references.
I doubt that, in general, changing soft references to weak references can cure an evident memory leak.

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper


(no subject) [message #687303 is a reply to message #687086]

Sun, 29 May 2011 02:53

John Smith

Hi Eike,

> I doubt that, in general, changing soft references to weak references
> can cure an evident memory leak.

I did not actually propose changing soft references to weak references;
I proposed changing strong references to weak references by using a
WeakHashMap instead of a HashMap. I don't care that the HashMap's values
are CacheSoftReference instances; the data type is irrelevant to the
memory problem. I care that the HashMap's keys are never garbage
collected. When memory is running low, a WeakHashMap can remove those
keys that are not referenced by the rest of the application (and I guess
this is the case here).

Another solution for this problem would be to use an LRU cache for
CDORevisionCacheNonAuditing#revisions, in the same way it was done for
ObjectTypeCache$MemoryCache. It would then be defined like this:

private Map<CDOID, Reference<InternalCDORevision>> revisions =
    new LinkedHashMap<CDOID, Reference<InternalCDORevision>>() {

  // Note: the default constructor keeps insertion order; the
  // (initialCapacity, loadFactor, true) constructor gives access (LRU) order.
  @Override
  protected boolean removeEldestEntry(java.util.Map.Entry<CDOID,
      Reference<InternalCDORevision>> eldest)
  {
    // Evict the eldest entry once the map holds more than 100000 entries
    return size() > 100000;
  }
};

But I actually don't like this concept, since it is hard to find a good
limit like 100000, and hard to adapt it to the memory the user actually
has available: some have 2 GB, some have 64-bit machines with 16 GB of
RAM. Are there heuristics so that this limit is adjusted? SoftReference's
documentation says "Virtual machine implementations are, however,
encouraged to bias against clearing recently-created or recently-used
soft references." So since the documentation is from Sun, they should
also have implemented some LRU strategy for e.g. a WeakHashMap. So,
depending on the -Xmx setting, the user can roughly manage this limit
himself when using WeakHashMaps.

> That's wrong. Both cases do not cause memory leaks. In the case of the
> CDORevisionCache(s) you may indeed encounter an increase in memory
> consumption but if memory goes low enough the softly reachable
> CDORevisions will be garbage collected and the respective map entries
> will be removed from the cache map. If that does really not happen
> (unlikely) then there is a bug in the cache implementation.

Ah, here is a misunderstanding, either mine or yours. You wrote "the
respective map entries will be removed from the cache map", but which
documentation says that this is done? If the CDORevision is softly
referenced by a CacheSoftReference and is garbage collected, the
CacheSoftReference instance itself will NOT be garbage collected at all
and will remain as a value in the HashMap. But even if the
CacheSoftReference instance itself were garbage collected, I don't
see which part of the Java framework could remove this value and the
corresponding key from the HashMap. The documentation of e.g.
SoftReference says nothing about removing strong references.

> Where is the stack trace of the OutOfMemoryError that could be an
> indication for a memory leak?

This is a problem. I ran the test with a bigger model overnight to
produce an OutOfMemoryError, restricting memory to 60 MB and trying to
read a model file of 114 MB into the database. I use the JVM acceptor,
with client and server having to share the 60 MB. I saved the partially
built model every 5000 new model elements (about 0.2% of the whole
model), so that the CLEAN CDOObjects can be garbage collected. However,
at about 17.7% a timeout occurred, but since the previous save() had
already taken nearly twice as long as normal, I guess the memory limit
was nearly reached and much time was spent releasing and allocating
memory. So I guess that behind the timeout exception there was an
OutOfMemoryError somewhere, which was caught but only resulted in a
non-responsive server. So the -XX:+HeapDumpOnOutOfMemoryError option
could not create a heap dump!

As I already said, I also have a memory leak problem in my own code:
while parsing the model file, I have to build up a list of unresolved
forward references, which can be resolved when elements occurring later
in the file get parsed. (It's not an XMI file but an IFC file, but
the problem is nearly equivalent to that of XMI files.) I will later
resolve this problem by storing forward references as CDO objects in the
database; however, forward references are quite rare compared to
backward references in IFC files (I guess in XMI files they are 50%/50%).

Here is the console output:

Saving at 16.498753336740343% #dirty=5000 %dirty=0.19566501204097264
Saving took 63938
Saving at 16.694418348781312% #dirty=5000 %dirty=0.1956650120409691
Saving took 63093
Saving at 16.897071396966602% #dirty=5000 %dirty=0.20265304818529017
Saving took 61688
Saving at 17.09273640900757% #dirty=5000 %dirty=0.1956650120409691
Saving took 60594
Saving at 17.29538945719286% #dirty=5000 %dirty=0.20265304818529017
Saving took 61890
Saving at 17.491054469233834% #dirty=5000 %dirty=0.19566501204097264
Saving took 98000
Saving at 17.69370751741912% #dirty=5000 %dirty=0.20265304818528662
[ERROR] org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
[ERROR] org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
------------------------- END -------------------------

org.eclipse.net4j.util.transaction.TransactionException: org.eclipse.emf.cdo.util.CommitException: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.emf.cdo.eresource.impl.CDOResourceImpl.save(CDOResourceImpl.java:959)
at qut.part21.loader.Part21LoaderCDO.consume(Part21LoaderCDO.java:41)
at qut.part21.parser.ClearTextReader.jjtreeCloseNodeScope(ClearTextReader.java:18)
at qut.part21.parser.ClearTextReader.entity_instance(ClearTextReader.java:606)
at qut.part21.parser.ClearTextReader.entity_instance_list(ClearTextReader.java:534)
at qut.part21.parser.ClearTextReader.data_section(ClearTextReader.java:492)
at qut.part21.parser.ClearTextReader.exchange_file(ClearTextReader.java:81)
at qut.part21.parser.ClearTextReader.syntax(ClearTextReader.java:813)
at qut.part21.loader.Part21ResourceImpl.doLoad(Part21ResourceImpl.java:44)
at IFC2X3.util.IFC2X3ResourceImpl.doLoad(IFC2X3ResourceImpl.java:46)
at org.eclipse.emf.ecore.resource.impl.ResourceImpl.load(ResourceImpl.java:1511)
at org.eclipse.emf.ecore.resource.impl.ResourceImpl.load(ResourceImpl.java:1290)
at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.demandLoad(ResourceSetImpl.java:255)
at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.demandLoadHelper(ResourceSetImpl.java:270)
at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.getResource(ResourceSetImpl.java:397)
at org.eclipse.emf.cdo.internal.ui.actions.ImportRootsAction.getSourceResources(ImportRootsAction.java:111)
at qut.ifcm.cdo.editor.ImportIfcRootsAction.doRun(ImportIfcRootsAction.java:37)
at qut.ifcm.cdo.editor.ImportIfcRootsAction.testRun(ImportIfcRootsAction.java:63)
at qut.designview.test.ContainmentTest.testBasicContainment(ContainmentTest.java:105)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at org.eclipse.net4j.util.tests.AbstractOMTest.runBare(AbstractOMTest.java:214)
at org.eclipse.emf.cdo.tests.config.impl.ConfigTest.runBare(ConfigTest.java:504)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at org.eclipse.net4j.util.tests.AbstractOMTest.run(AbstractOMTest.java:260)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at org.eclipse.emf.cdo.tests.config.impl.ConfigTestSuite$TestWrapper.runTest(ConfigTestSuite.java:126)
at junit.framework.TestSuite.run(TestSuite.java:238)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run(TestSuite.java:238)
at junit.framework.TestSuite.runTest(TestSuite.java:243)
at junit.framework.TestSuite.run(TestSuite.java:238)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: org.eclipse.emf.cdo.util.CommitException: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.emf.internal.cdo.transaction.CDOTransactionImpl.commit(CDOTransactionImpl.java:1072)
at org.eclipse.emf.cdo.eresource.impl.CDOResourceImpl.save(CDOResourceImpl.java:955)
... 44 more
Caused by: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.net4j.signal.RequestWithConfirmation.getRemoteException(RequestWithConfirmation.java:139)
at org.eclipse.net4j.signal.RequestWithConfirmation.setRemoteException(RequestWithConfirmation.java:128)
at org.eclipse.net4j.signal.SignalProtocol.handleRemoteException(SignalProtocol.java:423)
at org.eclipse.net4j.signal.RemoteExceptionIndication.indicating(RemoteExceptionIndication.java:63)
at org.eclipse.net4j.signal.Indication.doExtendedInput(Indication.java:55)
at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
at org.eclipse.net4j.signal.Indication.execute(Indication.java:49)
at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
... 5 more
Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)


(no subject) [message #687304 is a reply to message #687087]

Sun, 29 May 2011 06:57

Eike Stepper

On 29.05.2011 04:53, exquisitus wrote:
> Hi Eike,
>
>
> > I doubt that, in general, changing soft references to weak references
> > can cure an evident memory leak.
>
> I did not actually propose changing soft references to weak references; I proposed changing strong references to weak references by using a WeakHashMap instead of a HashMap.
WeakHashMaps are usually used for canonical mappings, i.e. for values that are needed no longer than the keys. That's not the case for the revision cache because revisions are used independently of their keys (CDOIDs) and because CDOIDs are not expected to be unique (e.g. two revisions that reference the same third revision are not required to use the same target CDOID *instance* ).

So, neither *weak* nor *key* makes sense in this type of cache.

> I don't care that the HashMap's values are CacheSoftReference instances; the data type is irrelevant to the memory problem.
What exactly is the memory problem? As I mentioned in my first reply, growing memory consumption is a good sign for a cache as long as it does not result in OutOfMemoryErrors. It's very unlikely that CDO swallows exceptions. To double check you can register your custom log and trace listeners: http://wiki.eclipse.org/FAQ_for_CDO_and_Net4j#How_can_I_enable_tracing.3F

> I care that the HashMap's keys are never garbage collected.
That's not true. The values (keyed soft refs) are registered with a ReferenceQueue where they are enqueued by the JVM automatically when the values are garbage collected. This queue is monitored by a worker thread that periodically polls the queue, takes the keys of the keyed soft refs and removes them (i.e. the cache map entries) from the cache map. This behaviour is configurable if you cast your cache to ReferenceQueueWorker:

org.eclipse.net4j.util.ref.ReferenceQueueWorker.setPollMillis(long)
org.eclipse.net4j.util.ref.ReferenceQueueWorker.setMaxWorkPerPoll(int)
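
The mechanism described above can be sketched with plain JDK classes. This is only an illustration under stated assumptions, not the CDO or Net4j source: the names SoftCache, KeyedSoftReference, purge and startWorker are hypothetical, and the parameters pollMillis and maxWorkPerPoll merely mirror the two setters listed above. The JDK parts it relies on (SoftReference registered with a ReferenceQueue, ReferenceQueue.poll(), ConcurrentHashMap.remove(key, value)) are standard.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical memory-sensitive cache; a sketch, not CDO's implementation. */
class SoftCache<K, V> {
    /** A soft reference that remembers the key it is stored under. */
    static final class KeyedSoftReference<K, V> extends SoftReference<V> {
        final K key;

        KeyedSoftReference(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue); // the GC enqueues this reference once the value is cleared
            this.key = key;
        }
    }

    private final Map<K, KeyedSoftReference<K, V>> map = new ConcurrentHashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    void put(K key, V value) {
        map.put(key, new KeyedSoftReference<>(key, value, queue));
    }

    V get(K key) {
        Reference<V> ref = map.get(key);
        return ref == null ? null : ref.get(); // null if absent or already cleared
    }

    int size() {
        return map.size();
    }

    /** Polls cleared references and removes their map entries; returns the number removed. */
    int purge(int maxWork) {
        int removed = 0;
        Reference<? extends V> ref;
        while (removed < maxWork && (ref = queue.poll()) != null) {
            @SuppressWarnings("unchecked")
            KeyedSoftReference<K, V> keyed = (KeyedSoftReference<K, V>) ref;
            // Only remove the entry if the key still maps to this stale reference
            if (map.remove(keyed.key, keyed)) {
                removed++;
            }
        }
        return removed;
    }

    /** Starts a daemon thread that calls purge() every pollMillis milliseconds. */
    Thread startWorker(final long pollMillis, final int maxWorkPerPoll) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    purge(maxWorkPerPoll);
                    Thread.sleep(pollMillis);
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt(); // stop quietly when interrupted
            }
        }, "soft-cache-purger");
        worker.setDaemon(true);
        worker.start();
        return worker;
    }
}
```

The point of the sketch is that neither the SoftReference nor the HashMap removes anything by itself; the cache owner has to drain the queue and remove the entries, which is what the worker thread does.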

> When memory is running low, a WeakHashMap can remove those keys that are not referenced by the rest of the application (and I guess this is the case here).
That's exactly the reason why it does not make sense. The cache content would be likely to disappear immediately after the first usage (remember what I said about the unspecified behaviour of weak refs).

>
> Another solution for this problem
So far I don't see a problem.

> would be to use a LRU cache for CDORevisionCacheNonAuditing#revisions , in the same way it was done for ObjectTypeCache$MemoryCache.
In the case of a revision cache it's not so easy because the lookup strategies are more complex: lookup by id,branch,time (range search) and lookup by id,branch,version. We used to ship an LRU cache but it turned out to be very hard to maintain and it offered almost no benefit compared to a purely memory-sensitive cache implementation (which would be needed anyway).

> So it should be defined that way:
>
> private Map<CDOID, Reference<InternalCDORevision>> revisions = new LinkedHashMap<CDOID, Reference<InternalCDORevision>>() {
>
> @Override
> protected boolean removeEldestEntry(java.util.Map.Entry<CDOID, Reference<InternalCDORevision>> eldest)
> {
> return size() > 100000;
> }
> }
>
> But I actually don't like this concept, since it is hard to find a good limit like 100000, and hard to adapt it to the memory the user actually has available: some have 2 GB, some have 64-bit machines with 16 GB of RAM. Are there heuristics so that this limit is adjusted?
The only advantage of LRU caches (in Java!) is that they can have a custom eviction policy applied, which is (in Java!) not possible for memory-sensitive caches. Some articles claim to have evidence that (in Java!) the best results can be achieved with two-level caches. Level1 is LRU that evicts by application policy to level2. Level2 is memory-sensitive. As I said, this turned out to be too complex for our non-standard access patterns.
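
For readers unfamiliar with the two-level idea, the following is a rough sketch of it using only JDK classes. It is not code that CDO ships or shipped; the class TwoLevelCache and its methods are hypothetical, and the sketch is single-threaded and supports only lookup by key, so it ignores the more complex lookup strategies mentioned above. Level 1 is an access-ordered LinkedHashMap whose eldest entry is demoted into level 2, where values are held by SoftReferences.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical two-level cache: LRU level 1 spilling into soft-referenced level 2. */
class TwoLevelCache<K, V> {
    private final Map<K, SoftReference<V>> level2 = new HashMap<>();
    private final LinkedHashMap<K, V> level1;

    TwoLevelCache(final int level1Capacity) {
        level1 = new LinkedHashMap<K, V>(16, 0.75f, true) {
            private static final long serialVersionUID = 1L;

            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() <= level1Capacity) {
                    return false;
                }
                // Application policy evicts from level 1; the value stays softly reachable in level 2
                level2.put(eldest.getKey(), new SoftReference<>(eldest.getValue()));
                return true;
            }
        };
    }

    void put(K key, V value) {
        level2.remove(key); // avoid keeping an outdated value in level 2
        level1.put(key, value);
    }

    V get(K key) {
        V value = level1.get(key);
        if (value != null) {
            return value;
        }
        SoftReference<V> ref = level2.remove(key);
        value = ref == null ? null : ref.get(); // null if never cached or cleared by the GC
        if (value != null) {
            level1.put(key, value); // promote back to level 1
        }
        return value;
    }

    int level1Size() {
        return level1.size();
    }

    int level2Size() {
        return level2.size();
    }
}
```

Cleared references that are never looked up again stay in level 2 in this sketch; a real implementation would also need a ReferenceQueue to purge them.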

> SoftReference's documentation says "Virtual machine implementations are, however, encouraged to bias against clearing recently-created or recently-used soft references." So since the documentation is from Sun, they should also have implemented some LRU strategy for e.g. a WeakHashMap. So, depending on the -Xmx setting, the user can roughly manage this limit himself when using WeakHashMaps.
>
> > That's wrong. Both cases do not cause memory leaks. In the case of the
> > CDORevisionCache(s) you may indeed encounter an increase in memory
> > consumption but if memory goes low enough the softly reachable
> > CDORevisions will be garbage collected and the respective map entries
> > will be removed from the cache map. If that does really not happen
> > (unlikely) then there is a bug in the cache implementation.
>
> Ah, here is a misunderstanding, either mine or yours. You wrote "the respective map entries will be removed from the cache map", but which documentation says that this is done?
The code. We consider the cache mostly an implementation detail, but it strikes me that we should better expose the ReferenceQueueWorker configuration in CDORevisionCache.

> If the CDORevision is softly referenced by a CacheSoftReference and is garbage collected, the CacheSoftReference instance itself will NOT be garbage collected at all and will remain as a value in the HashMap. But even if the CacheSoftReference instance itself were garbage collected, I don't see which part of the Java framework could remove this value and the corresponding key from the HashMap. The documentation of e.g. SoftReference says nothing about removing strong references.
See above.

>
>
>
> > Where is the stack trace of the OutOfMemoryError that could be an
> > indication for a memory leak?
>
> This is a problem. I ran the test with a bigger model overnight to produce an OutOfMemoryError, restricting memory to 60 MB and trying to read a model file of 114 MB into the database. I use the JVM acceptor, with client and server having to share the 60 MB.
If you want to find evidence of memory leaks I would not start with a heap max near the application minimum requirement under no load. If you want to test scalability you should find a heap max value that under normal load operates without any problems and then start to increase the load.

> I saved the partially built model every 5000 new model elements (about 0.2% of the whole model), so that the CLEAN CDOObjects can be garbage collected. However, at about 17.7% a timeout occurred,
Note that modify/commit operations do not have the same scalability characteristics as load/read operations. If you want to test the revision cache (in your scenario there are at least 2 such caches!), populate a huge repository in small steps (small enough that they have no problems with your heap size limit). Restart the JVM and iterate over the entire model without holding on to any objects. Then you should be able to see how the cache behaves.

> but since the previous save() had already taken nearly twice as long as normal, I guess the memory limit was nearly reached and much time was spent releasing and allocating memory.
Maybe, maybe not.

> So I guess that behind the timeout exception there was an OutOfMemoryError somewhere, which was caught but only resulted in a non-responsive server. So the -XX:+HeapDumpOnOutOfMemoryError option could not create a heap dump!
Again, you are guessing.

>
> As I already said, I also have a memory leak problem in my own code: while parsing the model file, I have to build up a list of unresolved forward references, which can be resolved when elements occurring later in the file get parsed. (It's not an XMI file but an IFC file, but the problem is nearly equivalent to that of XMI files.)
We seem to apply different semantics to the term "memory leak". I would not call the above scenario a memory leak. It's just a piece of transient information that *needs* to be kept in memory for an indeterminate, *but not indefinite* time.

> I will later resolve this problem by storing forward references as CDO objects in the database; however, forward references are quite rare compared to backward references in IFC files (I guess in XMI files they are 50%/50%).
>
> Here is the console output:
With only 60mb for client and server this means almost nothing IMHO. Please try the approach that I outlined above.

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper

>
>
> Saving at 16.498753336740343% #dirty=5000 %dirty=0.19566501204097264
> Saving took 63938
> Saving at 16.694418348781312% #dirty=5000 %dirty=0.1956650120409691
> Saving took 63093
> Saving at 16.897071396966602% #dirty=5000 %dirty=0.20265304818529017
> Saving took 61688
> Saving at 17.09273640900757% #dirty=5000 %dirty=0.1956650120409691
> Saving took 60594
> Saving at 17.29538945719286% #dirty=5000 %dirty=0.20265304818529017
> Saving took 61890
> Saving at 17.491054469233834% #dirty=5000 %dirty=0.19566501204097264
> Saving took 98000
> Saving at 17.69370751741912% #dirty=5000 %dirty=0.20265304818528662
> [ERROR] org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
> at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
> at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
> at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
> at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
> at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
> at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
> at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
> at java.util.TimerThread.mainLoop(Timer.java:512)
> at java.util.TimerThread.run(Timer.java:462)
> [ERROR] org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
> at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
> at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
> at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
> at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
> at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
> at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
> at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
> at java.util.TimerThread.mainLoop(Timer.java:512)
> at java.util.TimerThread.run(Timer.java:462)
> ------------------------- END -------------------------
>
>
>
> org.eclipse.net4j.util.transaction.TransactionException: org.eclipse.emf.cdo.util.CommitException: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.emf.cdo.eresource.impl.CDOResourceImpl.save(CDOResourceImpl.java:959)
> at qut.part21.loader.Part21LoaderCDO.consume(Part21LoaderCDO.java:41)
> at qut.part21.parser.ClearTextReader.jjtreeCloseNodeScope(ClearTextReader.java:18)
> at qut.part21.parser.ClearTextReader.entity_instance(ClearTextReader.java:606)
> at qut.part21.parser.ClearTextReader.entity_instance_list(ClearTextReader.java:534)
> at qut.part21.parser.ClearTextReader.data_section(ClearTextReader.java:492)
> at qut.part21.parser.ClearTextReader.exchange_file(ClearTextReader.java:81)
> at qut.part21.parser.ClearTextReader.syntax(ClearTextReader.java:813)
> at qut.part21.loader.Part21ResourceImpl.doLoad(Part21ResourceImpl.java:44)
> at IFC2X3.util.IFC2X3ResourceImpl.doLoad(IFC2X3ResourceImpl.java:46)
> at org.eclipse.emf.ecore.resource.impl.ResourceImpl.load(ResourceImpl.java:1511)
> at org.eclipse.emf.ecore.resource.impl.ResourceImpl.load(ResourceImpl.java:1290)
> at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.demandLoad(ResourceSetImpl.java:255)
> at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.demandLoadHelper(ResourceSetImpl.java:270)
> at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.getResource(ResourceSetImpl.java:397)
> at org.eclipse.emf.cdo.internal.ui.actions.ImportRootsAction.getSourceResources(ImportRootsAction.java:111)
> at qut.ifcm.cdo.editor.ImportIfcRootsAction.doRun(ImportIfcRootsAction.java:37)
> at qut.ifcm.cdo.editor.ImportIfcRootsAction.testRun(ImportIfcRootsAction.java:63)
> at qut.designview.test.ContainmentTest.testBasicContainment(ContainmentTest.java:105)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at junit.framework.TestCase.runTest(TestCase.java:168)
> at org.eclipse.net4j.util.tests.AbstractOMTest.runBare(AbstractOMTest.java:214)
> at org.eclipse.emf.cdo.tests.config.impl.ConfigTest.runBare(ConfigTest.java:504)
> at junit.framework.TestResult$1.protect(TestResult.java:110)
> at junit.framework.TestResult.runProtected(TestResult.java:128)
> at junit.framework.TestResult.run(TestResult.java:113)
> at junit.framework.TestCase.run(TestCase.java:124)
> at org.eclipse.net4j.util.tests.AbstractOMTest.run(AbstractOMTest.java:260)
> at junit.framework.TestSuite.runTest(TestSuite.java:243)
> at org.eclipse.emf.cdo.tests.config.impl.ConfigTestSuite$TestWrapper.runTest(ConfigTestSuite.java:126)
> at junit.framework.TestSuite.run(TestSuite.java:238)
> at junit.framework.TestSuite.runTest(TestSuite.java:243)
> at junit.framework.TestSuite.run(TestSuite.java:238)
> at junit.framework.TestSuite.runTest(TestSuite.java:243)
> at junit.framework.TestSuite.run(TestSuite.java:238)
> at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
> at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
> at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
> Caused by: org.eclipse.emf.cdo.util.CommitException: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.emf.internal.cdo.transaction.CDOTransactionImpl.commit(CDOTransactionImpl.java:1072)
> at org.eclipse.emf.cdo.eresource.impl.CDOResourceImpl.save(CDOResourceImpl.java:955)
> ... 44 more
> Caused by: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.net4j.signal.RequestWithConfirmation.getRemoteException(RequestWithConfirmation.java:139)
> at org.eclipse.net4j.signal.RequestWithConfirmation.setRemoteException(RequestWithConfirmation.java:128)
> at org.eclipse.net4j.signal.SignalProtocol.handleRemoteException(SignalProtocol.java:423)
> at org.eclipse.net4j.signal.RemoteExceptionIndication.indicating(RemoteExceptionIndication.java:63)
> at org.eclipse.net4j.signal.Indication.doExtendedInput(Indication.java:55)
> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
> at org.eclipse.net4j.signal.Indication.execute(Indication.java:49)
> at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
> at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
> at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
> at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
> at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
> at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
> at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
> ... 5 more
> Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
> at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
> at java.util.TimerThread.mainLoop(Timer.java:512)
> at java.util.TimerThread.run(Timer.java:462)
>
>

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper

(no subject) [message #687305 is a reply to message #687088]

Sun, 29 May 2011 07:00

Eike Stepper

One more note: I know that the non-auditing revision cache in particular is being used in production with huge model scenarios. It has undergone extensive testing, and no memory leaks have been found.

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper

On 29.05.2011 08:57, Eike Stepper wrote:
> On 29.05.2011 04:53, exquisitus wrote:
>> Hi Eike,
>>
>>
>> > I doubt that, in general, changing soft references to weak references
>> > can cure an evident memory leak.
>>
>> I did not actually propose changing soft references to weak references; I proposed changing strong references to weak references by using a WeakHashMap instead of a HashMap.
> WeakHashMaps are usually used for canonical mappings, i.e. for values that are needed no longer than the keys. That's not the case for the revision cache because revisions are used independently of their keys (CDOIDs) and because CDOIDs are not expected to be unique (e.g. two revisions that reference the same third revision are not required to use the same target CDOID *instance* ).
>
> So, neither *weak* nor *key* makes sense in this type of cache.
>
>> I don't care that the HashMap's values are CacheSoftReference instances; the data type is irrelevant to the memory problem.
> What exactly is the memory problem? As I mentioned in my first reply, growing memory consumption is a good sign for a cache as long as it does not result in OutOfMemoryErrors. It's very unlikely that CDO swallows exceptions. To double check you can register your custom log and trace listeners: http://wiki.eclipse.org/FAQ_for_CDO_and_Net4j#How_can_I_enable_tracing.3F
>
>> I care that the HashMap's keys are never garbage collected.
> That's not true. The values (keyed soft refs) are registered with a ReferenceQueue where they are enqueued by the JVM automatically when the values are garbage collected. This queue is monitored by a worker thread that periodically polls the queue, takes the keys of the keyed soft refs and removes them (i.e. the cache map entries) from the cache map. This behaviour is configurable if you cast your cache to ReferenceQueueWorker:
>
> org.eclipse.net4j.util.ref.ReferenceQueueWorker.setPollMillis(long)
> org.eclipse.net4j.util.ref.ReferenceQueueWorker.setMaxWorkPerPoll(int)
>
>> When memory is running low, a WeakHashMap can remove those keys that are not referenced by the rest of the application (and I guess this is the case here).
> That is exactly why it does not make sense. The cache content would be likely to disappear immediately after the first use (remember what I said about unspec'ed behaviour of weak refs).
>
>>
>> Another solution for this problem
> So far I don't see a problem.
>
>> would be to use an LRU cache for CDORevisionCacheNonAuditing#revisions, in the same way it was done for ObjectTypeCache$MemoryCache.
> In the case of a revision cache it's not so easy because the lookup strategies are more complex: lookup by id,branch,time (range search) and lookup by id,branch,version. We used to ship an LRU cache but it turned out to be very hard to maintain and it offered almost no benefit compared to a purely memory-sensitive cache implementation (which would be needed anyway).
>
>> So it should be defined that way:
>>
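>> private Map<CDOID, Reference<InternalCDORevision>> revisions =
>>     // accessOrder = true makes eviction least-recently-used rather than insertion-ordered
>>     new LinkedHashMap<CDOID, Reference<InternalCDORevision>>(16, 0.75f, true) {
>>
>>   @Override
>>   protected boolean removeEldestEntry(java.util.Map.Entry<CDOID, Reference<InternalCDORevision>> eldest)
>>   {
>>     // Evict the eldest entry once the map holds more than 100000 entries
>>     return size() > 100000;
>>   }
>> };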
>>
>> But I actually don't like this concept, since it is hard to find a good limit like 100000, and hard to adapt it to the memory the user actually has available. Some have 2 GB, some have 64-bit systems with 16 GB of RAM. Are there heuristics for adjusting this limit?
> The only advantage of LRU caches (in Java!) is that they can have a custom eviction policy applied, which is (in Java!) not possible for memory-sensitive caches. Some articles claim to have evidence that (in Java!) the best results can be achieved with two-level caches. Level1 is LRU that evicts by application policy to level2. Level2 is memory-sensitive. As I said, this turned out to be too complex for our non-standard access patterns.
>
>> SoftReference's documentation says "Virtual machine implementations are, however, encouraged to bias against clearing recently-created or recently-used soft references." Since the documentation is from Sun, they should also have implemented some LRU strategy for, e.g., a WeakHashMap. So the user can roughly manage this limit himself through the -Xmx setting when WeakHashMaps are used directly.
>>
>> > That's wrong. Both cases do not cause memory leaks. In the case of the
>> > CDORevisionCache(s) you may indeed encounter an increase in memory
>> > consumption but if memory goes low enough the softly reachable
>> > CDORevisions will be garbage collected and the respective map entries
>> > will be removed from the cache map. If that does really not happen
>> > (unlikely) then there is a bug in the cache implementation.
>>
>> Ah, here is a misunderstanding, either mine or yours. You wrote "the respective map entries will be removed from the cache map", but which documentation says this will be done?
> The code. We consider the cache mostly an implementation detail, but it strikes me that we should better expose the ReferenceQueueWorker configuration in CDORevisionCache.
>
>> If the CDORevision is softly referenced by a CacheSoftReference and is garbage collected, the CacheSoftReference instance itself will NOT be garbage collected at all and will remain as a value in the HashMap. But even if the CacheSoftReference instance itself were garbage collected, I don't see which part of the Java framework could remove this value and the corresponding key from the HashMap. The documentation of e.g. SoftReference says nothing about removing strong references.
> See above.
>
>>
>>
>>
>> > Where is the stack trace of the OutOfMemoryError that could be an
>> > indication for a memory leak?
>>
>> This is a problem. I ran the test with a bigger model overnight to produce an out-of-memory exception, restricting RAM to 60 MB and trying to read a 114 MB model file into the database. I use the JVM acceptor, with client and server having to share the 60 MB.
> If you want to find evidence of memory leaks I would not start with a heap max near the application minimum requirement under no load. If you want to test scalability you should find a heap max value that under normal load operates without any problems and then start to increase the load.
>
>> I saved the partially built model every 5000 new model elements (about 0.2% of the whole model), so that the CLEAN CDOObjects can be garbage collected. However, at about 17.7% a timeout occurred,
> Note that modify/commit operations do not have the same scalability characteristics as load/read operations. If you want to test the revision cache (in your scenario there are at least 2 such caches!) populate a huge repository in small steps (small enough that they have no problems with your heap size limit). Restart the JVM and iterate over the entire model without holding on to any objects. Then you should be able to see how the cache behaves.
>
>> but since the save() of the previous 0.2% already took nearly twice as long as normal, I guess the heap was nearly full and much time was spent releasing and allocating memory.
> Maybe, maybe not.
>
>> So I guess that behind the timeout exception there was an out-of-memory exception somewhere, which was caught but only resulted in a non-responsive server. So the -XX:+HeapDumpOnOutOfMemoryError option could not create a heap dump!
> Again, you are guessing.
>
>>
>> As I already said, I also have a memory leak problem in my code, since while parsing the model file, I have to build up a list of unresolved forward-references, which can be resolved when elements occurring later in the file get parsed. (It's not an XMI file, actually an IFC file, but the problem is nearly equivalent to XMI files).
> We seem to apply different semantics to the term "memory leak". I would not call the above scenario a memory leak. It's just a piece of transient information that *needs* to be kept in memory for an indeterminate, *but not indefinite* time.
>
>> I will later resolve this problem by storing forward references as CDO objects into the database, however forward references are quite rare in contrast to backward references in IFC files (I guess in XMI files, they are 50%/50%).
>>
>> Here is the console output:
> With only 60mb for client and server this means almost nothing IMHO. Please try the approach that I outlined above.
>
> Cheers
> /Eike
>
> ----
> http://www.esc-net.de
> http://thegordian.blogspot.com
> http://twitter.com/eikestepper
>
>
>>
>>
>> Saving at 16.498753336740343% #dirty=5000 %dirty=0.19566501204097264
>> Saving took 63938
>> Saving at 16.694418348781312% #dirty=5000 %dirty=0.1956650120409691
>> Saving took 63093
>> Saving at 16.897071396966602% #dirty=5000 %dirty=0.20265304818529017
>> Saving took 61688
>> Saving at 17.09273640900757% #dirty=5000 %dirty=0.1956650120409691
>> Saving took 60594
>> Saving at 17.29538945719286% #dirty=5000 %dirty=0.20265304818529017
>> Saving took 61890
>> Saving at 17.491054469233834% #dirty=5000 %dirty=0.19566501204097264
>> Saving took 98000
>> Saving at 17.69370751741912% #dirty=5000 %dirty=0.20265304818528662
>> [ERROR] org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
>> at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
>> at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
>> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
>> at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
>> at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
>> at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
>> at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> at java.lang.Thread.run(Thread.java:662)
>> Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
>> at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
>> at java.util.TimerThread.mainLoop(Timer.java:512)
>> at java.util.TimerThread.run(Timer.java:462)
>> [ERROR] org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
>> at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
>> at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
>> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
>> at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
>> at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
>> at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
>> at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> at java.lang.Thread.run(Thread.java:662)
>> Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
>> at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
>> at java.util.TimerThread.mainLoop(Timer.java:512)
>> at java.util.TimerThread.run(Timer.java:462)
>> ------------------------- END -------------------------
>>
>>
>>
>> org.eclipse.net4j.util.transaction.TransactionException: org.eclipse.emf.cdo.util.CommitException: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.emf.cdo.eresource.impl.CDOResourceImpl.save(CDOResourceImpl.java:959)
>> at qut.part21.loader.Part21LoaderCDO.consume(Part21LoaderCDO.java:41)
>> at qut.part21.parser.ClearTextReader.jjtreeCloseNodeScope(ClearTextReader.java:18)
>> at qut.part21.parser.ClearTextReader.entity_instance(ClearTextReader.java:606)
>> at qut.part21.parser.ClearTextReader.entity_instance_list(ClearTextReader.java:534)
>> at qut.part21.parser.ClearTextReader.data_section(ClearTextReader.java:492)
>> at qut.part21.parser.ClearTextReader.exchange_file(ClearTextReader.java:81)
>> at qut.part21.parser.ClearTextReader.syntax(ClearTextReader.java:813)
>> at qut.part21.loader.Part21ResourceImpl.doLoad(Part21ResourceImpl.java:44)
>> at IFC2X3.util.IFC2X3ResourceImpl.doLoad(IFC2X3ResourceImpl.java:46)
>> at org.eclipse.emf.ecore.resource.impl.ResourceImpl.load(ResourceImpl.java:1511)
>> at org.eclipse.emf.ecore.resource.impl.ResourceImpl.load(ResourceImpl.java:1290)
>> at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.demandLoad(ResourceSetImpl.java:255)
>> at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.demandLoadHelper(ResourceSetImpl.java:270)
>> at org.eclipse.emf.ecore.resource.impl.ResourceSetImpl.getResource(ResourceSetImpl.java:397)
>> at org.eclipse.emf.cdo.internal.ui.actions.ImportRootsAction.getSourceResources(ImportRootsAction.java:111)
>> at qut.ifcm.cdo.editor.ImportIfcRootsAction.doRun(ImportIfcRootsAction.java:37)
>> at qut.ifcm.cdo.editor.ImportIfcRootsAction.testRun(ImportIfcRootsAction.java:63)
>> at qut.designview.test.ContainmentTest.testBasicContainment(ContainmentTest.java:105)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> at java.lang.reflect.Method.invoke(Method.java:597)
>> at junit.framework.TestCase.runTest(TestCase.java:168)
>> at org.eclipse.net4j.util.tests.AbstractOMTest.runBare(AbstractOMTest.java:214)
>> at org.eclipse.emf.cdo.tests.config.impl.ConfigTest.runBare(ConfigTest.java:504)
>> at junit.framework.TestResult$1.protect(TestResult.java:110)
>> at junit.framework.TestResult.runProtected(TestResult.java:128)
>> at junit.framework.TestResult.run(TestResult.java:113)
>> at junit.framework.TestCase.run(TestCase.java:124)
>> at org.eclipse.net4j.util.tests.AbstractOMTest.run(AbstractOMTest.java:260)
>> at junit.framework.TestSuite.runTest(TestSuite.java:243)
>> at org.eclipse.emf.cdo.tests.config.impl.ConfigTestSuite$TestWrapper.runTest(ConfigTestSuite.java:126)
>> at junit.framework.TestSuite.run(TestSuite.java:238)
>> at junit.framework.TestSuite.runTest(TestSuite.java:243)
>> at junit.framework.TestSuite.run(TestSuite.java:238)
>> at junit.framework.TestSuite.runTest(TestSuite.java:243)
>> at junit.framework.TestSuite.run(TestSuite.java:238)
>> at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
>> at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
>> at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
>> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
>> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
>> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
>> at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
>> Caused by: org.eclipse.emf.cdo.util.CommitException: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.emf.internal.cdo.transaction.CDOTransactionImpl.commit(CDOTransactionImpl.java:1072)
>> at org.eclipse.emf.cdo.eresource.impl.CDOResourceImpl.save(CDOResourceImpl.java:955)
>> ... 44 more
>> Caused by: org.eclipse.net4j.signal.RemoteException: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.net4j.signal.RequestWithConfirmation.getRemoteException(RequestWithConfirmation.java:139)
>> at org.eclipse.net4j.signal.RequestWithConfirmation.setRemoteException(RequestWithConfirmation.java:128)
>> at org.eclipse.net4j.signal.SignalProtocol.handleRemoteException(SignalProtocol.java:423)
>> at org.eclipse.net4j.signal.RemoteExceptionIndication.indicating(RemoteExceptionIndication.java:63)
>> at org.eclipse.net4j.signal.Indication.doExtendedInput(Indication.java:55)
>> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
>> at org.eclipse.net4j.signal.Indication.execute(Indication.java:49)
>> at org.eclipse.net4j.signal.Signal.runSync(Signal.java:251)
>> at org.eclipse.net4j.signal.Signal.run(Signal.java:147)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> at java.lang.Thread.run(Thread.java:662)
>> Caused by: org.eclipse.net4j.util.om.monitor.MonitorCanceledException: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.net4j.util.om.monitor.Monitor.checkCanceled(Monitor.java:56)
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.checkCanceled(TimeoutMonitor.java:116)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.checkCanceled(NestedMonitor.java:55)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.hasBegun(AbstractMonitor.java:36)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.checkBegun(AbstractMonitor.java:141)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:54)
>> at org.eclipse.net4j.util.om.monitor.NestedMonitor.worked(NestedMonitor.java:71)
>> at org.eclipse.net4j.util.om.monitor.AbstractMonitor.worked(AbstractMonitor.java:60)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicatingCommit(CommitTransactionIndication.java:176)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CommitTransactionIndication.indicating(CommitTransactionIndication.java:91)
>> at org.eclipse.emf.cdo.server.internal.net4j.protocol.CDOServerIndicationWithMonitoring.indicating(CDOServerIndicationWithMonitoring.java:109)
>> at org.eclipse.net4j.signal.IndicationWithMonitoring.indicating(IndicationWithMonitoring.java:84)
>> at org.eclipse.net4j.signal.IndicationWithResponse.doExtendedInput(IndicationWithResponse.java:90)
>> at org.eclipse.net4j.signal.Signal.doInput(Signal.java:326)
>> at org.eclipse.net4j.signal.IndicationWithResponse.execute(IndicationWithResponse.java:63)
>> at org.eclipse.net4j.signal.IndicationWithMonitoring.execute(IndicationWithMonitoring.java:63)
>> ... 5 more
>> Caused by: org.eclipse.net4j.util.concurrent.TimeoutRuntimeException: Timeout after 10078 millis
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor.handleTimeout(TimeoutMonitor.java:121)
>> at org.eclipse.net4j.util.om.monitor.TimeoutMonitor$1.handleTimeout(TimeoutMonitor.java:61)
>> at org.eclipse.net4j.util.concurrent.Timeouter$1.run(Timeouter.java:87)
>> at java.util.TimerThread.mainLoop(Timer.java:512)
>> at java.util.TimerThread.run(Timer.java:462)
>>
>>

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper


(no subject) [message #687306 is a reply to message #687088]

Sun, 29 May 2011 12:35

John Smith is currently offline

Friend

Messages: 137
Registered: July 2009

Senior Member

> That's not true. The values (keyed soft refs) are registered with a
> ReferenceQueue where they are enqueued by the JVM automatically when the
> values are garbage collected. This queue is monitored by a worker thread
> that periodically polls the queue, takes the keys of the keyed soft refs
> and removes them (i.e. the cache map entries) from the cache map. This
> behaviour is configurable if you cast your cache to ReferenceQueueWorker:
>
> org.eclipse.net4j.util.ref.ReferenceQueueWorker.setPollMillis(long)
> org.eclipse.net4j.util.ref.ReferenceQueueWorker.setMaxWorkPerPoll(int)

I was not aware of how ReferenceQueue works. Yes, then it's
not a leak.
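
For anyone else who did not know the mechanism, here is a minimal sketch of the idea quoted above, using only java.lang.ref. It is not CDO's or Net4j's actual code: the class names KeyedSoftCache and KeyedSoftRef are made up, and instead of a worker thread the queue is simply drained on each access.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the CDO revision cache.
class KeyedSoftCache<K, V> {
    // A soft reference that remembers its map key, so the stale
    // map entry can be removed once the value has been collected.
    private static final class KeyedSoftRef<K, V> extends SoftReference<V> {
        final K key;

        KeyedSoftRef(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, KeyedSoftRef<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    synchronized void put(K key, V value) {
        purge();
        map.put(key, new KeyedSoftRef<>(key, value, queue));
    }

    synchronized V get(K key) {
        purge();
        KeyedSoftRef<K, V> ref = map.get(key);
        return ref == null ? null : ref.get();
    }

    synchronized int size() {
        purge();
        return map.size();
    }

    // Drains the queue: the JVM enqueues each reference after clearing it,
    // and we then drop the corresponding entry from the map.
    synchronized int purge() {
        int removed = 0;
        Reference<? extends V> ref;
        while ((ref = queue.poll()) != null) {
            @SuppressWarnings("unchecked")
            KeyedSoftRef<K, V> keyed = (KeyedSoftRef<K, V>) ref;
            // Remove only if the entry still holds this exact reference
            // (the key may have been re-mapped in the meantime).
            if (map.remove(keyed.key, keyed)) {
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        KeyedSoftCache<String, byte[]> cache = new KeyedSoftCache<>();
        byte[] revision = new byte[1024];
        cache.put("rev-1", revision);
        System.out.println("cached: " + (cache.get("rev-1") == revision));
    }
}
```

So the map entries do go away, but only after the garbage collector has decided to clear the soft references and the queue has been processed.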

>> As I already said, I also have a memory leak problem in my code, since
>> while parsing the model file, I have to build up a list of unresolved
>> forward-references, which can be resolved when elements occurring later
>> in the file get parsed. (It's not an XMI file, actually an IFC file,
>> but the problem is nearly equivalent to XMI files).
> We seem to apply different semantics to the term "memory leak". I would
> not call the above scenario a memory leak. It's just a piece of
> transient information that *needs* to be kept in memory for an
> indeterministic, *but not indefinite* time.

OK, that was a little imprecise; I am used to seeing it from a tester's
point of view.

I have now figured out that a larger IFC model I tested was probably
generated by a different CAD vendor, since it contains almost no backward
references, only forward references. It seems they used a totally
different approach to serialization. So first I have to store the forward
references in the DB before I can continue my leak search :) I plan to use
an int array; I think this is supported by CDO, right? As CDO currently
has no good solution for managing large containment lists (even if only
"smart pointers" are used in the list), I have now split the containment
list of the root element, which contains the whole model, into a binary
tree. This effectively means that for every stored model element, about
one or two containment helper elements are stored in addition, each
holding up to 3 references (one for the 0 branch, one for the 1 branch,
and one for the model element itself), so this is quite acceptable, I
think. Now I additionally want to store an int array in the containment
helper element to keep information about the unresolved forward
references of the currently parsed element. After the whole IFC file is
parsed, I will iterate over the whole binary tree (using eAllContents())
and resolve the forward references using the int arrays, and maybe set
the int arrays to [] afterwards since they are no longer needed.
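
To make the binary-tree split more concrete, here is a rough sketch in plain Java. It is only an illustration under assumptions: the names BinaryContainmentTree and Node are hypothetical, plain objects stand in for the EMF/CDO containment helper elements, and placing the elements in heap order (position k has children 2k and 2k+1) is just one possible way to choose the branches.

```java
// Hypothetical sketch; plain Java objects stand in for CDO model elements.
class BinaryContainmentTree<E> {
    // One containment helper: a 0 branch, a 1 branch and the element itself.
    static final class Node<E> {
        Node<E> zero;
        Node<E> one;
        E element;
    }

    private final Node<E> root = new Node<>();
    private int size;

    // Appends an element; it lands at heap position size + 1.
    void add(E element) {
        nodeAt(size + 1, true).element = element;
        size++;
    }

    // Returns the element at the given 0-based index.
    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return nodeAt(index + 1, false).element;
    }

    int size() {
        return size;
    }

    // Number of branches to follow from the root to reach an index.
    static int depthOf(int index) {
        return 31 - Integer.numberOfLeadingZeros(index + 1);
    }

    // Walks the bits of the position below its highest set bit,
    // most significant first: 0 selects the 0 branch, 1 the 1 branch.
    private Node<E> nodeAt(int position, boolean create) {
        Node<E> node = root;
        int highest = 31 - Integer.numberOfLeadingZeros(position);
        for (int bit = highest - 1; bit >= 0; bit--) {
            boolean goOne = ((position >>> bit) & 1) == 1;
            Node<E> next = goOne ? node.one : node.zero;
            if (next == null) {
                if (!create) {
                    throw new IllegalStateException("missing node for position " + position);
                }
                next = new Node<>();
                if (goOne) {
                    node.one = next;
                } else {
                    node.zero = next;
                }
            }
            node = next;
        }
        return node;
    }

    public static void main(String[] args) {
        BinaryContainmentTree<String> tree = new BinaryContainmentTree<>();
        for (int i = 0; i < 1000; i++) {
            tree.add("element-" + i);
        }
        System.out.println(tree.get(999) + " at depth " + depthOf(999));
    }
}
```

With this layout only the nodes on the path from the root to the current element have to be held at any time, which is where the logarithmic bound comes from.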

I plan to contribute this solution to CDO since it seems to be very useful:
(a) On the client side, only O(log(modelsize)) RAM is needed for parsing
big XMI files (the path from a leaf of the binary tree to the root). The
binary tree can be used as a map for backward references, and can also
store forward references in an int array.
(b) It solves the problem of inefficient large lists when the model is
loaded from the CDO database.
(c) Serialization, which has the problem of remembering (XMI) IDs for
serialized elements, can also be done in O(log(modelsize)) time, since
the IDs are already stored in the binary tree.

Another thought of mine was to write to the DB directly, without using
CDO; this should be faster. But I don't want to deal with SQL, and I want
to use CDO's garbage-collection-optimized model handling for the binary
tree itself on the client side. Nevertheless, it would be nice if the
domain metamodel and the domain editors did not need to know the EMF
metamodel for the binary tree. So the root element's containment list,
say its feature myContainedElements, should use the binary tree
automatically. I have partially achieved this by deriving the root
element's metaclass from the binary tree's root node metaclass. A further
step would be to mark myContainedElements as derived. Any idea for hiding
the binary tree better is welcome!


(no subject) [message #687322 is a reply to message #687090]

Mon, 30 May 2011 10:20

Stefan Winkler is currently offline

Friend

Messages: 307
Registered: July 2009
Location: Germany

Senior Member

Hi,

you wrote
> As CDO currently has no good solution to manage large containment lists

what exactly do you mean by "good" solution?
good == performance (speed) or good == memory footprint

Please be aware that
a) EMF (and thus, CDO) does not support int[] out of the box, you'd have
to write an EDataType along with a proper mapping (including a good CDO
TypeMapping)
b) The int[] would be kept as an atomic value. Each time you commit it,
it would be transferred and written into the DB as a whole (it would not
be delta-aware)

So, could you please elaborate a bit more
- why wouldn't you use an EAttribute of type EInt with isMany == true?
- why do you want to store the int[] in the DB in the first place? If I
understand correctly, it is only a helper field, and as such it would be
of transient nature, right? So you could store it locally (in memory or
in a temp file), couldn't you?

Cheers,
Stefan



(no subject) [message #687330 is a reply to message #687099]

Tue, 31 May 2011 04:25

John Smith is currently offline

Friend

Messages: 137
Registered: July 2009

Senior Member

> you wrote
>> As CDO currently has no good solution to manage large containment lists
>
> what exactly do you mean by "good" solution?
> good == performance (speed) or good == memory footprint

good == memory footprint. As I understand the CDO code, it always creates
a list of full size, which has a large memory footprint even if the
items are only CDOElementProxyImpl instances; see
CDOListWithElementProxiesImpl. Maybe a better solution would be to
derive not from ArrayList but from HashMap<Integer, Object>, thus mapping
the index to the currently held value (a CDOElementProxy, a CDOID or the
model element itself). Then, if listAsMap.get(i) returned null, the
implementation could return a CDOElementProxyImpl(i) that is created
on demand.

> - why wouldn't you use an EAttribute of type EInt with isMany == true?
I actually modeled it that way, sorry for any confusion.

> - why do you want to store the int[] in the DB in the first place? If I
> understand correctly, it is only a helper field, and as such it would be
> of transient nature, right? So you could store it locally (in memory or
> in a temp file), couldn't you?

Storing this transient information in memory is not an option, because it
will cause an out-of-memory error.
Storing it in a temp file would be an option, but that requires some smart
logic to write entries and find them again later in a big file. That is
exactly what a DB does, so I might as well use CDO.

I have now successfully tried this approach (storing forward references as
an int array with the model itself) for a small 6 MB IFC model.
For a larger 120 MB model, I got an out-of-memory error. I think the reason
was that ObjectTypeCache.DEFAULT_CACHE_CAPACITY was 100000, and this was
too much as I only allowed 60 MB of RAM for server and client. I will do
another test with 90 MB (I want to keep this limit low to provoke
out-of-memory errors, in order to find memory leaks).


(no subject) [message #687332 is a reply to message #687105]

Tue, 31 May 2011 08:47

Stefan Winkler is currently offline

Friend

Messages: 307
Registered: July 2009
Location: Germany

Senior Member

Hi,

On 31.05.11 06:25, exquisitus wrote:
>>> As CDO currently has no good solution to manage large containment lists
>>
>> what exactly do you mean by "good" solution?
>> good == performance (speed) or good == memory footprint
>
> good == memory footprint. As I understand the CDO code, it always creates
> a list of full size, which has a large memory footprint even if the
> items are only CDOElementProxyImpl instances; see
> CDOListWithElementProxiesImpl. Maybe a better solution would be to
> derive not from ArrayList but from HashMap<Integer, Object>, thus mapping
> the index to the currently held value (a CDOElementProxy, a CDOID or the
> model element itself). Then, if listAsMap.get(i) returned null, the
> implementation could return a CDOElementProxyImpl(i) that is created
> on demand.

Well, implementing a list as a hash map would maybe work for the get(i)
method. But inserting, removing, and moving elements would be really
costly, as you'd have to change all subsequent indices (= hash keys!).
At the same time, we don't win anything by this, because the reason for
the element proxies is to know that there is an element there which
could be loaded on demand. You'd have to differentiate between no
element (because the list isn't large enough) and a proxy element (which
can be loaded on demand) anyway. So the hash map does not yield any
advantage.

> I have now successfully tried this approach (storing forward references as
> an int array with the model itself) for a small 6 MB IFC model.
> For a larger 120 MB model, I got an out-of-memory error. I think the reason
> was that ObjectTypeCache.DEFAULT_CACHE_CAPACITY was 100000, and this was
> too much as I only allowed 60 MB of RAM for server and client. I will do
> another test with 90 MB (I want to keep this limit low to provoke
> out-of-memory errors, in order to find memory leaks).
I don't know whether you are aware that dirty objects (before committing
them) do not get GC'd. So, the transaction size (amount of
dirty/uncommitted objects) also is a source of OOMEs. If this is the
problem, you should commit more often...

Cheers,
Stefan


(no subject) [message #687333 is a reply to message #687106]

Tue, 31 May 2011 09:49

John Smith is currently offline

Friend

Messages: 137
Registered: July 2009

Senior Member

> Well, implementing a list as a hash-map would maybe work for the get(i)
> method. But inserting, removing, and moving elements would be really
> costly, as you'd have to change all subsequent indices (= hash keys!).
The same effort is needed for lists! If you insert an element at
position i in a list, all following elements with an index greater than i
must be shifted. In a hash map, all keys greater than i must be adjusted
accordingly. OK, this takes O(n) (iterating over all keys), in contrast
to O(n/2) for lists in the average case. But if the map contains only,
e.g., half of the elements, then we already have the same effort.
The fewer elements there are in the map, the better the map performs
compared to a list. For large lists where you only want to handle a few
elements at a time, a map performs much better.

By the way, adding an element to the end would be as fast as for a list.
In 99% of all cases you add to the end. Normally the above comparison is
relevant only when you want to delete elements from the list.

> At the same time, we don't win anything by this, because the reason for
> the element proxies is to know that there is an element there, which
> could be loaded on demand. You'd have to differentiate between no
> element (because list isn't large enough) from proxy element (can be
> loaded on demand) anyways. So the hashmap does not yield any advantage.

The list.size() method can be used to determine how many elements are
available. However, by deriving from ArrayList, the current
implementation has to add dummies, namely CDOElementProxyImpl instances,
to get list.size() "to the right number".

However, if the List interface were re-implemented so that its methods
used a HashMap, and if the size() method returned a separately managed
counter of how many elements are available, then the HashMap approach
could unfold its power.
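
A minimal sketch of what I mean, assuming nothing about CDO internals: SparseList and ElementProxy are made-up names, and ElementProxy only stands in for something like CDOElementProxyImpl. The size is a plain counter, proxies are created on demand in get(), and an insert in the middle has to re-key the loaded entries (the cost discussed above).

```java
import java.util.AbstractList;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not CDOListWithElementProxiesImpl.
class SparseList extends AbstractList<Object> {
    // Placeholder for an element that has not been loaded yet.
    static final class ElementProxy {
        final int index;

        ElementProxy(int index) {
            this.index = index;
        }
    }

    // Only loaded elements are stored; everything else is implied by size.
    private final Map<Integer, Object> loaded = new HashMap<>();
    private int size;

    SparseList(int initialSize) {
        this.size = initialSize;
    }

    @Override
    public int size() {
        return size;
    }

    @Override
    public Object get(int index) {
        checkIndex(index, size);
        Object value = loaded.get(index);
        // Create the proxy on demand instead of storing one per slot.
        return value != null ? value : new ElementProxy(index);
    }

    @Override
    public Object set(int index, Object value) {
        checkIndex(index, size);
        if (value == null) {
            throw new NullPointerException("null elements are not supported");
        }
        Object old = get(index);
        loaded.put(index, value);
        return old;
    }

    @Override
    public void add(int index, Object value) {
        checkIndex(index, size + 1);
        if (value == null) {
            throw new NullPointerException("null elements are not supported");
        }
        if (index < size) {
            // Insert in the middle: every loaded key >= index moves up by one.
            Map<Integer, Object> shifted = new HashMap<>();
            for (Map.Entry<Integer, Object> entry : loaded.entrySet()) {
                int key = entry.getKey();
                shifted.put(key >= index ? key + 1 : key, entry.getValue());
            }
            loaded.clear();
            loaded.putAll(shifted);
        }
        loaded.put(index, value);
        size++;
        modCount++;
    }

    int loadedCount() {
        return loaded.size();
    }

    private static void checkIndex(int index, int bound) {
        if (index < 0 || index >= bound) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }

    public static void main(String[] args) {
        SparseList list = new SparseList(1_000_000);
        list.set(5, "loaded element");
        System.out.println(list.size() + " elements, " + list.loadedCount() + " loaded");
    }
}
```

A real implementation would also have to keep track of which stored element an unloaded position refers to after inserts and removals; the sketch ignores that.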

>
>> I have now successfully tried this approach (storing forward references as
>> an int array with the model itself) for a small 6 MB IFC model.
>> For a larger 120 MB model, I got an out-of-memory error. I think the reason
>> was that ObjectTypeCache.DEFAULT_CACHE_CAPACITY was 100000, and this was
>> too much as I only allowed 60 MB of RAM for server and client. I will do
>> another test with 90 MB (I want to keep this limit low to provoke
>> out-of-memory errors, in order to find memory leaks).
> I don't know whether you are aware that dirty objects (before committing
> them) do not get GC'd. So, the transaction size (amount of
> dirty/uncommitted objects) also is a source of OOMEs. If this is the
> problem, you should commit more often...
I commit every 5000 "dirty" elements, so this cannot be a source of OOME.


(no subject) [message #687334 is a reply to message #687107]

Tue, 31 May 2011 11:22

Stefan Winkler is currently offline

Friend

Messages: 307
Registered: July 2009
Location: Germany

Senior Member

Hi,

OK, let's step back from the technical details (list, hash map, etc.).
What you really would like to have is a more (memory-)efficient
implementation of partially loaded lists, right?

I am not sure whether this can be done (mainly because I think the
standard behavior should be kept and an optimization like the one you
propose should be configurable; also, I am currently unsure how much the
structure/implementation is dictated by and dependent on EMF itself).

Technically, an implementation like a ChunkedList (a collection of
chunks of ArrayLists) would be preferable to a HashMap.
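
To sketch what I have in mind (an illustration only, nothing that exists in CDO; the class and method names are made up): the list knows its total size, the chunks are allocated only when they are loaded, and a chunk can be dropped again.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a partially loaded list made of fixed-size chunks.
class ChunkedList<E> {
    private final int size;
    private final int chunkSize;
    // One slot per chunk; null means "this chunk is not loaded".
    private final List<List<E>> chunks;

    ChunkedList(int size, int chunkSize) {
        if (size < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("invalid size or chunk size");
        }
        this.size = size;
        this.chunkSize = chunkSize;
        int chunkCount = (size + chunkSize - 1) / chunkSize;
        this.chunks = new ArrayList<>(chunkCount);
        for (int i = 0; i < chunkCount; i++) {
            chunks.add(null);
        }
    }

    int size() {
        return size;
    }

    boolean isLoaded(int index) {
        checkIndex(index);
        return chunks.get(index / chunkSize) != null;
    }

    // Returns the element, or null if its chunk has not been loaded.
    E get(int index) {
        checkIndex(index);
        List<E> chunk = chunks.get(index / chunkSize);
        return chunk == null ? null : chunk.get(index % chunkSize);
    }

    // Stores the elements of one chunk, e.g. after fetching them on demand.
    void loadChunk(int chunkIndex, List<E> elements) {
        int expected = Math.min(chunkSize, size - chunkIndex * chunkSize);
        if (elements.size() != expected) {
            throw new IllegalArgumentException("chunk must contain " + expected + " elements");
        }
        chunks.set(chunkIndex, new ArrayList<>(elements));
    }

    // Drops a chunk again; this is where an eviction strategy would hook in.
    void unloadChunk(int chunkIndex) {
        chunks.set(chunkIndex, null);
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }

    public static void main(String[] args) {
        ChunkedList<String> list = new ChunkedList<>(10, 4);
        list.loadChunk(1, java.util.Arrays.asList("e", "f", "g", "h"));
        System.out.println("index 5 loaded: " + list.isLoaded(5) + ", index 0 loaded: " + list.isLoaded(0));
    }
}
```

The memory cost for an unloaded region is then one null slot per chunk instead of one proxy object per element, and index arithmetic stays trivial as long as the list is not modified in the middle.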

And finally, one would have to think about eviction strategies, because
currently, lists cannot be "unloaded" partially.

Cheers,
Stefan

PS: I usually like to know who I'm talking to. So if you don't mind,
please put some name in your messages rather than "exquisitus". :-P

