Eclipse Community Forums
[CDO] Store multiple large files concurrently [message #1829495] Sun, 05 July 2020 11:29
Ewoud Werkman
Messages: 28
Registered: January 2018
Junior Member
Hello!

I'm creating a REST API for accessing a CDO repository. Multiple users can connect at the same time to store and retrieve models from the repo. I started out by caching a SessionConfiguration (which also caches an open session) for each user's HTTP session to speed up CDO access.

I'm now trying to store a large set of files (>4MB XMI each, ~80MB in total) in CDO in parallel (in the same cached CDO user session), and I'm running into a problem.

I lock the folder in which all files will be stored by using folder.lockOption().lock() with a timeout of 10 minutes, but I get a TimeoutRuntimeException while the next file is waiting for the lock during the commit of the current file in the repo. I assume this has something to do with the fact that this lock is held in the current CDOSession. How can I improve this? Do I need to queue these files first?

Additionally, storing a single file in the repository takes quite some time (most of it in the commit itself). Are there ways to speed this up?

Thanks in advance!

Ewoud
Re: [CDO] Store multiple large files concurrently [message #1829518 is a reply to message #1829495] Mon, 06 July 2020 05:59
Eike Stepper
Messages: 6682
Registered: July 2009
Senior Member
Quote:
I'm creating a REST API for accessing a CDO repository. Multiple users can connect at the same time to store and retrieve models from the repo. I started out by caching a SessionConfiguration (which also caches an open session) for each user's HTTP session to speed up CDO access.


Hi Ewoud,

A CDOSession, through its CDORevisionManager, caches CDORevisions, which contain all the data of the objects. These revision caches can become big, so using multiple client sessions on the server and keeping them open can become a resource problem. As an alternative you could open just one session and a separate transaction for each user of that session. Unfortunately the user name is currently associated with the session (if at all) and not with a transaction or commit. In theory this could be enhanced if you are interested in sponsoring such an effort.
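
For illustration, a minimal sketch of that single-session approach (assuming a cached CDOSession called session; the class and method names are only examples, not an existing API):

import org.eclipse.emf.cdo.eresource.CDOResource;
import org.eclipse.emf.cdo.session.CDOSession;
import org.eclipse.emf.cdo.transaction.CDOTransaction;
import org.eclipse.emf.ecore.EObject;

public class SharedSessionStore {
	private final CDOSession session; // opened once and shared by all requests

	public SharedSessionStore(CDOSession session) {
		this.session = session;
	}

	public void store(String resourcePath, EObject root) throws Exception {
		// A transaction is much cheaper than a session and can be opened per request.
		CDOTransaction transaction = session.openTransaction();
		try {
			CDOResource resource = transaction.getOrCreateResource(resourcePath);
			resource.getContents().clear();
			resource.getContents().add(root);
			transaction.commit();
		} finally {
			transaction.close(); // always release the transaction
		}
	}
}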

Quote:
I'm now trying to store a large set of files (>4MB XMI each, ~80MB in total) in CDO in parallel (in the same cached CDO user session), and I'm running into a problem.


Does that mean you open one transaction on that session and have multiple threads calling commit on that transaction? Note that all methods of CDOTransaction and CDOView, as well as all model getters and setters, synchronize on the view/transaction monitor lock. They will not commit in parallel.

Quote:
I lock the folder in which all files will be stored by using folder.lockOption().lock() with a timeout of 10 minutes, but I get a TimeoutRuntimeException while the next file is waiting for the lock during the commit of the current file in the repo. I assume this has something to do with the fact that this lock is held in the current CDOSession. How can I improve this? Do I need to queue these files first?


I guess you mean folder.cdoWriteOption().lock()? That does not mean that you own a write lock; it means that you own the exclusive right to acquire a write lock. With this type of lock (i.e., a lock option) you prevent others from acquiring a write lock, but you don't prevent others from acquiring read locks.
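
To illustrate the difference, a small sketch (folder is assumed to be a CDOResourceFolder, and the 10-minute timeout is arbitrary):

// A write LOCK excludes concurrent commits and other write locks on the object:
CDOLock writeLock = folder.cdoWriteLock();
writeLock.lock(10, TimeUnit.MINUTES); // waits up to 10 minutes for the lock
try {
	// ... modify the folder contents and commit ...
} finally {
	writeLock.unlock();
}

// A write OPTION only reserves the right to acquire the write lock later;
// it keeps other views from acquiring the write lock, but does not block reads or read locks:
CDOLock writeOption = folder.cdoWriteOption();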

Quote:
Additionally, storing a single file in the repository takes quite some time (most of it in the commit itself). Are there ways to speed this up?


It's not entirely clear to me what you're doing and how. Could you show the relevant code and the full stack traces that you receive?


Re: [CDO] Store multiple large files concurrently [message #1829624 is a reply to message #1829518] Tue, 07 July 2020 19:11
Ewoud Werkman
Messages: 28
Registered: January 2018
Junior Member
Eike Stepper wrote on Mon, 06 July 2020 05:59

Quote:
I'm creating a REST API for accessing a CDO repository. Multiple users can connect at the same time to store and retrieve models from the repo. I started out by caching a SessionConfiguration (which also caches an open session) for each user's HTTP session to speed up CDO access.


Hi Ewoud,

A CDOSession, through its CDORevisionManager, caches CDORevisions, which contain all the data of the objects. These revision caches can become big, so using multiple client sessions on the server and keeping them open can become a resource problem. As an alternative you could open just one session and a separate transaction for each user of that session. Unfortunately the user name is currently associated with the session (if at all) and not with a transaction or commit. In theory this could be enhanced if you are interested in sponsoring such an effort.


Thanks for your quick reply!
Yes, that is spot on. I first tried using one SessionConfiguration and a single session, but I could not set the user name, and that information is necessary for our use case (keeping a history of commits by each user on the same model). The user name is set in the SessionConfiguration and not when opening a new session or transaction (that produces a "Session is already open" error). Unfortunately, sponsoring is not an option for me.

Quote:
Quote:
I'm now trying to store a large set of files (>4MB XMI each, ~80MB in total) in CDO in parallel (in the same cached CDO user session), and I'm running into a problem.


Does that mean you open one transaction on that session and have multiple threads calling commit on that transaction? Note that all methods of CDOTransaction and CDOView, as well as all model getters and setters, synchronize on the view/transaction monitor lock. They will not commit in parallel.

Before committing the transaction, I first lock the folder in which the resource will be placed. I do the following:

CDOTransaction transaction = session.openTransaction();
CDOResourceFolder resourceFolder = transaction.getResourceFolder(folder);
CDOLock cdoWriteOption = resourceFolder.cdoWriteOption();
try {
	System.out.print("Acquiring write lock for folder " + resourceFolder.getPath() + "...");
	// wait for the lock option on the folder, up to CONCURRENTWRITE_TIMEOUT_MINUTES
	cdoWriteOption.lock(CONCURRENTWRITE_TIMEOUT_MINUTES, TimeUnit.MINUTES);
	System.out.println("done");
	CDOResource resource = transaction.getOrCreateResource(resourceName);
	if (resource.getContents().size() == 1) {
		// overwrite the existing root object
		resource.getContents().set(0, rootObject);
	} else {
		resource.getContents().add(rootObject);
	}
	System.out.println("Committing transaction");
	CDOCommitInfo commitInfo = transaction.commit();
} catch (Exception e) {
	System.out.println("Acquiring write lock: timed out");
	throw e;
} finally {
	cdoWriteOption.unlock();
}

The web browser (via Ajax calls) opens an HTTP connection to the CDO REST endpoint for each file that gets uploaded, so if the user uploads 5 files, it makes five parallel connections and the code above is called five times concurrently.
The principle I thought would work is to take a write option on the folder (the folder resource itself also changes when a resource in it is added or updated), so other clients can still read from the folder in their sessions. The first connection gets the lock and the others wait at most CONCURRENTWRITE_TIMEOUT_MINUTES minutes before timing out. Unfortunately this doesn't work, as I get the following TimeoutException:
[INFO] [ERROR   ] Error: org.eclipse.net4j.util.WrappedException: java.util.concurrent.TimeoutException
[INFO] [err] org.eclipse.net4j.util.WrappedException: java.util.concurrent.TimeoutException
[INFO] [err]    at org.eclipse.net4j.util.WrappedException.wrap(WrappedException.java:54)
[INFO] [err]    at org.eclipse.emf.cdo.internal.net4j.protocol.CDOClientProtocol.lockObjects2(CDOClientProtocol.java:337)
[INFO] [err]    at org.eclipse.emf.internal.cdo.view.CDOViewImpl.lockObjects(CDOViewImpl.java:405)
[INFO] [err]    at org.eclipse.emf.internal.cdo.view.CDOViewImpl.lockObjects(CDOViewImpl.java:353)
[INFO] [err]    at org.eclipse.emf.internal.cdo.object.CDOLockImpl.lock(CDOLockImpl.java:84)
[INFO] [err]    at nl.tno.esdl.hub.cdo.CDOManager.storeResource(CDOManager.java:200)
[INFO] [err]    at nl.tno.esdl.hub.CDOServerResource.putResource(CDOServerResource.java:186)
[INFO] [err]    at nl.tno.esdl.hub.CDOServerResource$Proxy$_$$_WeldClientProxy.putResource(Unknown Source)
[INFO] [err]    at sun.reflect.GeneratedMethodAccessor955.invoke(Unknown Source)
[INFO] [err]    at java.lang.reflect.Method.invoke(Method.java:498)
[INFO] [err]    at com.ibm.ws.jaxrs20.cdi.component.JaxRsFactoryImplicitBeanCDICustomizer.serviceInvoke(JaxRsFactoryImplicitBeanCDICustomizer.java:339)
[INFO] [err]    at [internal classes]
[INFO] [err]    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[INFO] [err]    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[INFO] [err]    at java.lang.Thread.run(Thread.java:748)
[INFO] [err] Caused by: java.util.concurrent.TimeoutException
[INFO] [err]    at org.eclipse.net4j.util.io.IOTimeoutException.createTimeoutException(IOTimeoutException.java:46)
[INFO] [err]    at org.eclipse.net4j.signal.Signal.runSync(Signal.java:301)
[INFO] [err]    at org.eclipse.net4j.signal.SignalProtocol.startSignal(SignalProtocol.java:527)
[INFO] [err]    at org.eclipse.net4j.signal.RequestWithConfirmation.doSend(RequestWithConfirmation.java:90)
[INFO] [err]    at org.eclipse.net4j.signal.RequestWithConfirmation.send(RequestWithConfirmation.java:76)
[INFO] [err]    at org.eclipse.emf.cdo.internal.net4j.protocol.CDOClientProtocol.lockObjects2(CDOClientProtocol.java:318)
[INFO] [err]    ... 57 more
[INFO] [err] Caused by: org.eclipse.net4j.util.io.IOTimeoutException
[INFO] [err]    at org.eclipse.net4j.buffer.BufferInputStream.computeTimeout(BufferInputStream.java:354)
[INFO] [err]    at org.eclipse.net4j.buffer.BufferInputStream.ensureBuffer(BufferInputStream.java:310)
[INFO] [err]    at org.eclipse.net4j.buffer.BufferInputStream.read(BufferInputStream.java:135)
[INFO] [err]    at java.io.DataInputStream.readBoolean(DataInputStream.java:242)
[INFO] [err]    at org.eclipse.net4j.util.io.ExtendedDataInput$Delegating.readBoolean(ExtendedDataInput.java:79)
[INFO] [err]    at org.eclipse.emf.cdo.internal.net4j.protocol.LockObjectsRequest.confirming(LockObjectsRequest.java:78)
[INFO] [err]    at org.eclipse.emf.cdo.internal.net4j.protocol.LockObjectsRequest.confirming(LockObjectsRequest.java:1)
[INFO] [err]    at org.eclipse.emf.cdo.internal.net4j.protocol.CDOClientRequest.confirming(CDOClientRequest.java:104)
[INFO] [err]    at org.eclipse.net4j.signal.RequestWithConfirmation.doExtendedInput(RequestWithConfirmation.java:126)
[INFO] [err]    at org.eclipse.net4j.signal.Signal.doInput(Signal.java:391)
[INFO] [err]    at org.eclipse.net4j.signal.RequestWithConfirmation.doExecute(RequestWithConfirmation.java:106)
[INFO] [err]    at org.eclipse.net4j.signal.SignalActor.execute(SignalActor.java:53)
[INFO] [err]    at org.eclipse.net4j.signal.Signal.runSync(Signal.java:297)
[INFO] [err]    ... 61 more


Quote:
Quote:
I lock the folder in which all files will be stored by using folder.lockOption().lock() with a timeout of 10 minutes, but I get a TimeoutRuntimeException while the next file is waiting for the lock during the commit of the current file in the repo. I assume this has something to do with the fact that this lock is held in the current CDOSession. How can I improve this? Do I need to queue these files first?


I guess you mean folder.cdoWriteOption().lock()? That does not mean that you own a write lock; it means that you own the exclusive right to acquire a write lock. With this type of lock (i.e., a lock option) you prevent others from acquiring a write lock, but you don't prevent others from acquiring read locks.


Ah, I misunderstood that, but folder.cdoWriteLock().lock() gives me the same issue (with the same user session and CDOSession). It seems that locking always produces the exception above, even with a single file. I seem to misunderstand the use of locks. Is locking on a ResourceFolder not a good idea?

Quote:
Quote:
Additionally, storing a single file in the repository takes quite some time (most of it in the commit itself). Are there ways to speed this up?


It's not entirely clear to me what you're doing and how. Could you show the relevant code and the full stack traces that you receive?


I hope I've clarified it a bit. What I meant was that storing a single 4 MB XMI takes significant time, and I was wondering how I could speed this up. The database is PostgreSQL 12, and both the DB and the CDO server run on an 8-core, 32 GB machine.

[Updated on: Tue, 07 July 2020 19:23]


Re: [CDO] Store multiple large files concurrently [message #1829634 is a reply to message #1829624] Wed, 08 July 2020 05:48
Eike Stepper
Messages: 6682
Registered: July 2009
Senior Member
A few comments:

1. You need a cdoWriteLock() to exclude concurrent commits and other write locks.

2. A write lock does not exclude other reads, just other read locks.

3. The timeout on the lock() call should be smaller than Net4j's protocol timeout.
You can increase the protocol timeout like so:
((org.eclipse.emf.cdo.net4j.CDONet4jSession)session).options().getNet4jProtocol().setTimeout(lockTimeout + 10000);
This should fix your exception.

4. If all your model access goes through your REST endpoint, then you could perhaps do the thread synchronization through an in-memory mechanism of your own (a minimal sketch follows below this list). Then you could keep the protocol timeout small.

5. Regarding the time needed for committing a large CLOB: many things happen during a commit,
including network transport to the server, (implicit) locking, pre-commit checks, and the actual persisting to the DB.
Can you use a profiler to measure the time spent in TransactionCommitContext.writeAccessor()?
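
Regarding point 4, a minimal sketch of such an in-memory mechanism (assuming all writes go through this one JVM; the class and names are only illustrative):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class FolderWriteGuard {
	// one lock per folder path, created on demand
	private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

	public void withFolderLock(String folderPath, Runnable storeAction) {
		ReentrantLock lock = locks.computeIfAbsent(folderPath, p -> new ReentrantLock());
		lock.lock(); // concurrent uploads to the same folder queue here, inside the JVM
		try {
			storeAction.run(); // open the transaction, add the resource, commit
		} finally {
			lock.unlock();
		}
	}
}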


Re: [CDO] Store multiple large files concurrently [message #1829699 is a reply to message #1829634] Thu, 09 July 2020 07:08
Ewoud Werkman
Messages: 28
Registered: January 2018
Junior Member
Hi Eike,

Thanks for clarifying.
I messed up quite badly here: the above-mentioned code is annotated with @Singleton and is injected into the running (JAX-RS) request. So it is single-threaded and therefore always already owns the write lock. As a result the locks created no barrier at all and only produced the timeouts.

I have to redesign this code to allow for concurrent writes in a proper way.

Regarding performance, it seems that my setup is the problem here. CPU and memory usage of the CDO server are quite low. I run everything in Docker containers on Windows Subsystem for Linux (WSL), which seems to add quite some overhead for this use case. Time spent in TransactionCommitContext.writeAccessor() is under half a second for a 4 MB XMI.

Thanks,
Ewoud
Re: [CDO] Store multiple large files concurrently [message #1829702 is a reply to message #1829699] Thu, 09 July 2020 08:12
Eike Stepper
Messages: 6682
Registered: July 2009
Senior Member
So it sounds as if CDO is working and performing as expected. Good ;-)

Good luck with your refactorings!

