Eclipse Community Forums
[CDO] Performance Evaluation [message #902925] Tue, 21 August 2012 08:33
Benjamin Kiel
Messages: 2
Registered: August 2012
Junior Member
Hello,

We have a requirement to create and persist large models (> 1,000,000 objects) as fast as possible, so we wanted to find out whether CDO can do the job. We wrote a simple application (see below) that creates 10,000 objects 100 times and stores them in the model repository. Unfortunately, the time required per commit increases linearly:

1K objects: 39948ms
2K objects: 23735ms
3K objects: 20148ms
4K objects: 26071ms
5K objects: 31677ms
6K objects: 38203ms
7K objects: 42505ms
8K objects: 47674ms
9K objects: 54031ms
10K objects: 60786ms
11K objects: 71306ms
12K objects: 78143ms
13K objects: 81833ms
14K objects: 83154ms
15K objects: 89243ms
16K objects: 93089ms
17K objects: 96053ms
18K objects: 103253ms
19K objects: 107409ms
20K objects: 111533ms
21K objects: 120376ms
22K objects: 123256ms
23K objects: 127350ms
24K objects: 132841ms
25K objects: 139857ms
26K objects: 143583ms
27K objects: 149437ms
28K objects: 154525ms
29K objects: 160366ms
30K objects: 165254ms
31K objects: 170953ms
32K objects: 178189ms
33K objects: 183107ms
34K objects: 189509ms
35K objects: 195502ms
36K objects: 198536ms
37K objects: 206793ms
38K objects: 213348ms
39K objects: 220337ms
40K objects: 228114ms
41K objects: 232621ms
42K objects: 239319ms
43K objects: 242981ms
44K objects: 246999ms
45K objects: 254089ms
46K objects: 256820ms
47K objects: 266246ms
48K objects: 269477ms
49K objects: 276253ms
50K objects: 278943ms
51K objects: 296901ms
52K objects: 315233ms
53K objects: 329660ms
54K objects: 331005ms
55K objects: 346839ms
56K objects: 341569ms

Does anybody have experience with large models? Is it possible to persist objects in constant time? Maybe there is a switch we have not found yet.


Our test environment:
- Intel Core 2 Duo at 2 GHz
- 3 GB RAM, Ubuntu
- Eclipse Juno, CDO 4.1
- MySQL 5.1

cdo-server.xml:
<?xml version="1.0" encoding="UTF-8"?>
<cdoServer>
	<acceptor type="tcp" listenAddr="0.0.0.0" port="2036"></acceptor>

	<repository name="repo1">
		<property name="overrideUUID" value=""/>
		<property name="supportingAudits" value="false"/>
		<property name="supportingBranches" value="false"/>
		<property name="serializeCommits" value="false" />
		<property name="verifyingRevisions" value="false"/>
		<property name="currentLRUCapacity" value="10000"/>
		<property name="revisedLRUCapacity" value="10000"/>

		<store type="hibernate">
			<mappingProvider type="teneo">
				<property name="teneo.mapping.cascade_policy_on_non_containment" value="PERSIST,MERGE"/>
				<property name="teneo.mapping.persistence_xml" value="/META-INF/company_model_teneo_annotations.xml"/>
				<property name="teneo.mapping.inheritance" value="SINGLE_TABLE"/>
				<property name="teneo.mapping.add_index_for_fk" value="true" />
				<property name="teneo.mapping.fetch_one_to_many_extra_lazy" value="true" />
			</mappingProvider>

			<property name="hibernate.hbm2ddl.auto" value="create"/>
			<property name="hibernate.show_sql" value="false"/>
			<property name="hibernate.connection.pool_size" value="10"/>
			<property name="hibernate.cache.provider_class" value="org.hibernate.cache.HashtableCacheProvider"/>

 			<property name="hibernate.dialect" value="org.hibernate.dialect.MySQL5InnoDBDialect"/> 
			<property name="hibernate.connection.driver_class" value="com.mysql.jdbc.Driver"/>
			<property name="hibernate.connection.url" value="jdbc:mysql://localhost:3306/cdohibernate"/>
			<property name="hibernate.connection.username" value="root"/>
			<property name="hibernate.connection.password" value="root"/>
		</store>
	</repository>
</cdoServer>


MyTest.java (based on org.eclipse.emf.cdo.examples.hibernate.client.HibernateQueryTest by Martin Taal):
package de.benjaminkiel.org.eclipse.cdo.test;

import java.util.LinkedList;
import java.util.List;
import org.eclipse.emf.cdo.examples.company.CompanyFactory;
import org.eclipse.emf.cdo.examples.company.CompanyPackage;
import org.eclipse.emf.cdo.examples.company.Customer;
import org.eclipse.emf.cdo.net4j.CDONet4jSession;
import org.eclipse.emf.cdo.net4j.CDONet4jSessionConfiguration;
import org.eclipse.emf.cdo.net4j.CDONet4jUtil;
import org.eclipse.emf.cdo.session.CDOCollectionLoadingPolicy;
import org.eclipse.emf.cdo.session.CDOSession;
import org.eclipse.emf.cdo.transaction.CDOTransaction;
import org.eclipse.emf.cdo.util.CDOUtil;
import org.eclipse.net4j.Net4jUtil;
import org.eclipse.net4j.connector.IConnector;
import org.eclipse.net4j.tcp.TCPUtil;
import org.eclipse.net4j.util.container.ContainerUtil;
import org.eclipse.net4j.util.container.IManagedContainer;

public class MyTest {

	private static final int ROUNDS = 100;
	private static final int NUM_OF_CUSTOMERS = 10000;
	private static final String REPO_NAME = "repo1";
	private static final String CONNECTION_ADDRESS = "localhost:2036";
	private static CDONet4jSessionConfiguration sessionConfiguration = null;

	public static void main(final String[] args) throws Exception {
		final CDOSession session = openSession();
		final CDOTransaction transaction = session.openTransaction();

		long last = System.currentTimeMillis();
		final CDOCollectionLoadingPolicy policy = CDOUtil
				.createCollectionLoadingPolicy(0, 100);
		session.options().setCollectionLoadingPolicy(policy);

		for (int i = 0; i < ROUNDS; i++) {
			transaction.getOrCreateResource("/test1").getContents()
					.addAll(fillResource());
			transaction.commit();
			System.out.println("round " + (i + 1) + ": "
					+ (System.currentTimeMillis() - last) + "ms");
			last = System.currentTimeMillis();
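		}

		// Release client-side resources once all rounds have been committed.
		transaction.close();
		session.close();
	}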

	private static List<Customer> fillResource() {
		final List<Customer> customers = new LinkedList<Customer>();
		for (int i = 0; i < NUM_OF_CUSTOMERS; i++) {
			final Customer customer = CompanyFactory.eINSTANCE.createCustomer();
			customer.setCity("City " + i);
			customer.setName(i + "");
			customer.setStreet("Street " + i);
			customers.add(customer);
		}
		return customers;
	}

	private static CDOSession openSession() {
		if (sessionConfiguration == null) {
			initialize();
		}

		final CDONet4jSession cdoSession = sessionConfiguration
				.openNet4jSession();
		cdoSession.getPackageRegistry().putEPackage(CompanyPackage.eINSTANCE);
		return cdoSession;
	}

	private static void initialize() {
		final IManagedContainer container = ContainerUtil.createContainer();
		Net4jUtil.prepareContainer(container);
		TCPUtil.prepareContainer(container);
		CDONet4jUtil.prepareContainer(container);
		container.activate();

		final IConnector connector = TCPUtil.getConnector(container,
				CONNECTION_ADDRESS);

		sessionConfiguration = CDONet4jUtil.createNet4jSessionConfiguration();
		sessionConfiguration.setConnector(connector);
		sessionConfiguration.setRepositoryName(REPO_NAME);
	}
}
Re: [CDO] Performance Evaluation [message #902932 is a reply to message #902925] Tue, 21 August 2012 09:11
Eike Stepper
Messages: 5545
Registered: July 2009
Senior Member
Hi Benjamin,

You should add the keyword Hibernate or Teneo to the subject of posts to make Martin aware of them. I've cc'ed him now.

More comments below...



On 21.08.2012 10:33, Benjamin Kiel wrote:
> Hello,
>
> we have got the requirement to create and persist large (> 1,000,000 objects) models as fast as possible. So, we
> wanted to figure out if CDO could do this job. We wrote a simple application (see below) which creates 100 times 10000
> objects
So, 10K objects per commit.

> and stores them in the model repository. Unfortunately, the required time increases linearly:
>
> 1K objects: 39948ms
Do you really mean 10K, 20K, ... ?

> 2K objects: 23735ms
> 3K objects: 20148ms
> 4K objects: 26071ms
> 5K objects: 31677ms
> 6K objects: 38203ms
> 7K objects: 42505ms
> 8K objects: 47674ms
> 9K objects: 54031ms
> 10K objects: 60786ms
> 11K objects: 71306ms
> 12K objects: 78143ms
> 13K objects: 81833ms
> 14K objects: 83154ms
> 15K objects: 89243ms
> 16K objects: 93089ms
> 17K objects: 96053ms
> 18K objects: 103253ms
> 19K objects: 107409ms
> 20K objects: 111533ms
> 21K objects: 120376ms
> 22K objects: 123256ms
> 23K objects: 127350ms
> 24K objects: 132841ms
> 25K objects: 139857ms
> 26K objects: 143583ms
> 27K objects: 149437ms
> 28K objects: 154525ms
> 29K objects: 160366ms
> 30K objects: 165254ms
> 31K objects: 170953ms
> 32K objects: 178189ms
> 33K objects: 183107ms
> 34K objects: 189509ms
> 35K objects: 195502ms
> 36K objects: 198536ms
> 37K objects: 206793ms
> 38K objects: 213348ms
> 39K objects: 220337ms
> 40K objects: 228114ms
> 41K objects: 232621ms
> 42K objects: 239319ms
> 43K objects: 242981ms
> 44K objects: 246999ms
> 45K objects: 254089ms
> 46K objects: 256820ms
> 47K objects: 266246ms
> 48K objects: 269477ms
> 49K objects: 276253ms
> 50K objects: 278943ms
> 51K objects: 296901ms
> 52K objects: 315233ms
> 53K objects: 329660ms
> 54K objects: 331005ms
> 55K objects: 346839ms
> 56K objects: 341569ms
Am I interpreting this correctly, with each new 10K objects the insert time increases by ~5 seconds?
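
The ~5 seconds estimate can be checked with a small least-squares fit. The sketch below is only an illustration: it assumes that line n of the list is the n-th commit of 10K objects, it uses five sample points copied from the list, and the class and method names are made up for this example. With these points the fitted slope is roughly 5.5 s per additional commit. If that trend held for all 100 commits (the last few measurements suggest it may get steeper), the whole run would take on the order of 7.8 hours.

```java
public class CommitTimeTrend {
    // Sample points copied from the timings in this thread:
    // commit number -> duration of that commit in ms.
    static final double[] COMMIT = {10, 20, 30, 40, 50};
    static final double[] MILLIS = {60786, 111533, 165254, 228114, 278943};

    static double mean(double[] v) {
        double sum = 0;
        for (double d : v) {
            sum += d;
        }
        return sum / v.length;
    }

    /** Least-squares slope: extra milliseconds per additional commit. */
    static double slope(double[] x, double[] y) {
        double mx = mean(x);
        double my = mean(y);
        double sxy = 0;
        double sxx = 0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - mx) * (y[i] - my);
            sxx += (x[i] - mx) * (x[i] - mx);
        }
        return sxy / sxx;
    }

    /** Least-squares intercept of the fitted line. */
    static double intercept(double[] x, double[] y) {
        return mean(y) - slope(x, y) * mean(x);
    }

    /** Sum of (intercept + slope * n) for n = 1..commits. */
    static double projectedTotalMillis(double slope, double intercept, int commits) {
        return intercept * commits + slope * commits * (commits + 1) / 2.0;
    }

    public static void main(String[] args) {
        double s = slope(COMMIT, MILLIS);
        double c = intercept(COMMIT, MILLIS);
        double total = projectedTotalMillis(s, c, 100);
        System.out.printf("slope: %.0f ms per commit%n", s);
        System.out.printf("projected total for 100 commits: %.1f h%n", total / 3_600_000.0);
    }
}
```

Because each commit costs more than the one before, the total grows quadratically with the number of commits, which is why the extrapolated figure is so large.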
>
> Does anybody have experience with large models? Is it possible to persist objects in constant time? Maybe there is a
> switch we have not found yet.
I have absolutely no Hibernate experience, so I hope that Martin Taal has an idea.

My feeling is that updating the DB indexes must incur some kind of penalty that depends on the number of existing objects.

Coincidentally, I've started to work on a completely new store with a major focus on commit times, the LissomeStore. It's
not fully finished, but it passes almost all tests in our compliance suite. Both the store and the tests are already
available in Git. The store is based on journaling and writes read-optimization data from a background thread. That
means that a commit takes just as long as appending it to a buffered random access file. I'd be happy if you would like
to give it an early try. Please let me know if you need more pointers to get started.
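
To illustrate the general idea of journaling with a background indexer, here is a minimal sketch. It is not the LissomeStore code and makes no claim about how that store is implemented. The class JournalSketch and all of its methods are hypothetical, it uses a plain (unbuffered) RandomAccessFile for brevity, and it only uses the Java standard library. A commit appends one length-prefixed record to the end of the file, and the lookup structure for reads is updated on a separate thread.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class JournalSketch implements AutoCloseable {
    private final RandomAccessFile journal;
    // Read-optimization data: commit id -> file offset of its record.
    private final Map<Integer, Long> offsetByCommit = new ConcurrentHashMap<>();
    private final ExecutorService indexer = Executors.newSingleThreadExecutor();
    private int nextCommitId;

    public JournalSketch(File file) throws IOException {
        journal = new RandomAccessFile(file, "rw");
    }

    /** Appends one record; the work does not depend on how many records already exist. */
    public synchronized int commit(byte[] payload) throws IOException {
        final int id = nextCommitId++;
        final long offset = journal.length();
        journal.seek(offset);
        journal.writeInt(payload.length);
        journal.write(payload);
        // Index maintenance is handed to the background thread.
        indexer.execute(() -> offsetByCommit.put(id, offset));
        return id;
    }

    /** Blocks until all index updates submitted so far have been applied. */
    public void awaitIndex() throws Exception {
        indexer.submit(() -> { }).get();
    }

    /** Reads a record through the index; returns null if it is not indexed yet. */
    public synchronized byte[] read(int commitId) throws IOException {
        Long offset = offsetByCommit.get(commitId);
        if (offset == null) {
            return null;
        }
        journal.seek(offset);
        byte[] payload = new byte[journal.readInt()];
        journal.readFully(payload);
        return payload;
    }

    @Override
    public void close() throws Exception {
        indexer.shutdown();
        indexer.awaitTermination(10, TimeUnit.SECONDS);
        journal.close();
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("journal", ".bin");
        file.deleteOnExit();
        try (JournalSketch store = new JournalSketch(file)) {
            store.commit("first".getBytes(StandardCharsets.UTF_8));
            int second = store.commit("second".getBytes(StandardCharsets.UTF_8));
            store.awaitIndex();
            System.out.println(new String(store.read(second), StandardCharsets.UTF_8));
        }
    }
}
```

The trade-off in a design like this is that a record may not be readable through the index immediately after the commit returns, which is why the sketch has an awaitIndex() helper.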

Cheers
/Eike

----
http://www.esc-net.de
http://thegordian.blogspot.com
http://twitter.com/eikestepper
Re: [CDO] Performance Evaluation [message #902965 is a reply to message #902932] Tue, 21 August 2012 12:37
Martin Taal
Messages: 5339
Registered: July 2009
Senior Member
Hi Benjamin,
It is possible that times increase as more data is in the system. To get the best performance, it can make sense to tune the
database. As Eike mentions, time is spent on index updates. What database are you using?

Also, what is really time-consuming is adding records to a list like the resource contents. This is because the list index
needs to be updated. Also, with Hibernate the complete content of the list is sent to the client; there is no paged reading of
list contents. A chunked list reader is on the to-do list for the HB store.

gr. Martin



--

With Regards, Martin Taal

Springsite/Elver.org
Office: Hardwareweg 4, 3821 BV Amersfoort
Postal: Nassaulaan 7, 3941 EC Doorn
The Netherlands
Cell: +31 (0)6 288 48 943
Tel: +31 (0)84 420 2397
Fax: +31 (0)84 225 9307
Mail: mtaal@xxxxxxxx - mtaal@xxxxxxxx
Web: www.springsite.com - www.elver.org
Re: [CDO] Performance Evaluation [message #903159 is a reply to message #902932] Wed, 22 August 2012 10:11
Benjamin Kiel
Messages: 2
Registered: August 2012
Junior Member
Hi Eike and Martin,
Thank you for your replies:

> It is possible that times increase if more data is in the system. To get best performance it can make sense to tune the
> database. As Eike mentions, time is spent on index updates. What database are you using?
We are using MySQL 5.1.
We also tried DBStore in combination with H2 or PostgreSQL and observed the same behavior.

>
> Also, what is really time-consuming is adding records to a list like the resource contents. This is because the list index
> needs to be updated. Also, with Hibernate the complete content of the list is sent to the client; there is no paged reading of
> list contents. A chunked list reader is on the to-do list for the HB store.

>
> gr. Martin
>
> On 08/21/2012 11:11 AM, Eike Stepper wrote:
>> Hi Benjamin,
>
>> You should add the keyword Hibernate or Teneo to the subject of posts to make Martin aware of them. I've cc'ed him now.
>>
>> More comments below...
>>
>>
>>
>> On 21.08.2012 10:33, Benjamin Kiel wrote:
>>> Hello,
>>>
>>> we have got the requirement to create and persist large (> 1,000,000 objects) models as fast as possible. So, we
>>> wanted to figure out if CDO could do this job. We wrote a simple application (see below) which creates 100 times 10000
>>> objects
>> So, 10K objects per commit.
That's right.

>>
>>> and stores them in the model repository. Unfortunately, the required time increases linearly:
>>>
>>> 1K objects: 39948ms
>> Do you really mean 10K, 20K, ... ?
This was my mistake. I actually meant:
10K objects: 39948ms

>>
>>> 2K objects: 23735ms
>>> 10K objects: 60786ms
>>> 15K objects: 89243ms
>>> 20K objects: 111533ms
>>> 25K objects: 139857ms
>>> 30K objects: 165254ms
>>> 35K objects: 195502ms
>>> 40K objects: 228114ms
>>> 45K objects: 254089ms
>>> 50K objects: 278943ms
>>> 55K objects: 346839ms
>>> 56K objects: 341569ms
>> Am I interpreting this correctly, with each new 10K objects the insert time increases by ~5 seconds?
Exactly.

>>>
>>> Does anybody have experience with large models? Is it possible to persist objects in constant time? Maybe there is a
>>> switch we have not found yet.
>> I have absolutely no Hibernate experience, so I hope that Martin Taal has an idea.
>>
>> My feeling is that updating the DB indexes must incur some kind of penalty that depends on the number of existing objects.
>>
>> Coincidentally, I've started to work on a completely new store with a major focus on commit times, the LissomeStore. It's
>> not fully finished, but it passes almost all tests in our compliance suite. Both the store and the tests are already
>> available in Git. The store is based on journaling and writes read-optimization data from a background thread. That
>> means that a commit takes just as long as appending it to a buffered random access file. I'd be happy if you would like
>> to give it an early try. Please let me know if you need more pointers to get started.
Sounds nice! I will try this and let you know how it goes.

>>
>> Cheers
>> /Eike