Eclipse Community Forums

[Teneo] Performance creating a DB [message #83513] Wed, 16 May 2007 05:53
Eclipse User
Originally posted by: irbull.cs.uvic.ca

I was wondering if there are any tips or pointers regarding performance
and Teneo. I have a fairly large model (about 60 MB as XML) and I just
loaded it into MySQL using Teneo. Basically, I loaded the root
contents from the resource and saved them to a Hibernate session.

I was wondering if there are any obvious ways to speed this up (I just
realized I missed something in my model so I will be running it again
tomorrow). I didn't know if there were a bunch of extra calculations
(notifications) being done that I could safely turn off during this
operation. So far all I do is:


Session session = ...
Transaction tx = ....

session.save(rootNode);

tx.commit();
session.close();

cheers,
ian
Re: [Teneo] Performance creating a DB [message #83530 is a reply to message #83513] Wed, 16 May 2007 06:03
Eclipse User
Originally posted by: irbull.cs.uvic.ca

I should mention that it took about 9 hours to complete.

cheers,
ian

Re: [Teneo] Performance creating a DB [message #83544 is a reply to message #83530] Wed, 16 May 2007 06:40
Martin Taal
Hmm, this is too long for the amount of data. I am doing a project where I insert about 200,000 objects
in different tables, and this takes about 15 minutes on MySQL (also using Teneo/Hibernate).
One thing you can do, if the data tree has independent structures, is to save these independent trees
first (with a commit after each save) and then save the rootNode as the last one. My experience with
inserting/updating large datasets is that frequent commits can speed things up (by a factor of 10 or more).
Of course, this only works if you can drop and re-create the DB if the import fails.
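
A minimal sketch of the commit-per-subtree idea above. The Store interface and RecordingStore class are hypothetical stand-ins for the begin/save/commit calls you would make on a Hibernate Session and Transaction, so the example runs without Hibernate or MySQL on the classpath. How the model is split into independent subtrees is also an assumption; it depends on the data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the begin/save/commit calls made on a Hibernate Session and Transaction. */
interface Store {
    void begin();
    void save(Object entity);
    void commit();
}

/** In-memory Store that only records the calls it receives, so the sketch runs without a database. */
class RecordingStore implements Store {
    final List<String> log = new ArrayList<>();

    public void begin() { log.add("begin"); }
    public void save(Object entity) { log.add("save " + entity); }
    public void commit() { log.add("commit"); }
}

class SubtreeImport {
    /** Saves each independent subtree in its own transaction, then saves the root last. */
    static void importTree(Store store, List<?> independentSubtrees, Object rootNode) {
        for (Object subtree : independentSubtrees) {
            store.begin();
            store.save(subtree);
            store.commit(); // one commit per subtree keeps each unit of work small
        }
        store.begin();
        store.save(rootNode); // the root goes in last, in its own transaction
        store.commit();
    }

    public static void main(String[] args) {
        RecordingStore store = new RecordingStore();
        importTree(store, Arrays.asList("subtreeA", "subtreeB"), "rootNode");
        for (String entry : store.log) {
            System.out.println(entry);
        }
    }
}
```

With a real session, each begin/commit pair would correspond to starting and committing a transaction. Whether this gives anything like the factor of 10 mentioned above will depend on the database setup and the shape of the model.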

Btw, the runtime part of Teneo should add very little overhead, so the performance
'issue' should be somewhere in Hibernate or MySQL. Which uses the most CPU: Java or MySQL?

gr. Martin



--

With Regards, Martin Taal

Springsite/Elver.org
Office: Hardwareweg 4, 3821 BV Amersfoort
Postal: Nassaulaan 7, 3941 EC Doorn
The Netherlands
Tel: +31 (0)84 420 2397
Fax: +31 (0)84 225 9307
Mail: mtaal@springsite.com - mtaal@elver.org
Web: www.springsite.com - www.elver.org
Re: [Teneo] Performance creating a DB [message #83746 is a reply to message #83544] Wed, 16 May 2007 15:17
Eclipse User
Originally posted by: irbull.cs.uvic.ca

Thanks for the quick response Martin,

I watched top for a while and Java seemed to have most of the CPU, but
MySQL was definitely using some as well.

The problem is, I am basically saving a tree structure with lots of
cross relationships between the different tree nodes (something like an
abstract syntax tree with all the semantic relationships included). I
assume that each time I do a save, a lot of other queries are done to
connect all the nodes, and this is where the performance hit comes.

Once it is in MySQL, Teneo / hibernate works great to get all the data back.

Since I have to re-run my database build today, I will try and save
parts of the tree in sequence and try and isolate the performance
bottleneck.

cheers,
ian



Re: [Teneo] Performance creating a DB [message #83761 is a reply to message #83746] Thu, 17 May 2007 00:28
Eclipse User
Originally posted by: irbull.cs.uvic.ca

I have spent most of the day playing with this and trying different
options. I stopped the cascading, so when one EObject is added it
doesn't add all the related ones, and this speeds up the addition of
each object, but I haven't determined if this actually makes the entire
insert faster.

What I did notice (which is probably not Teneo, more likely Hibernate,
MySQL or the JDBC driver) is that as the tables grow in size, each
insert becomes "noticeably" slower. My basic model consists of nodes
(35,000), each with about 8-10 attributes (name, value pairs). This means
that I will have 35,000 nodes and almost 300,000 attributes. The first
100 nodes (and 1,000 attributes) are inserted in about a second, but
after the tables grow (500 nodes, 50,000 attribute rows) everything is
much slower. 100 nodes take a few seconds to insert, and it keeps
getting slower. Does this make sense?

I know this is not a Teneo issue, but if anyone has advice on how to
configure Teneo for batch insertion, it would be greatly appreciated. I
have tried setting hibernate.jdbc.batch_size to 30, but each save still
seems to result in an insert (instead of 1 insert for 30 nodes). I have
also tried the stateless session, but this threw an exception when I
tried to insert a node.
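
For reference, the usual pattern for bulk inserts with a regular session is to flush and clear it at regular intervals, ideally at the same size as the configured hibernate.jdbc.batch_size. Below is a sketch under stated assumptions: FlushableSession and CountingSession are hypothetical stand-ins for the save/flush/clear methods of a Hibernate Session, so it runs without a database. It says nothing about whether the JDBC driver actually batches the statements, and clearing the session detaches already-saved nodes, which may matter for a model with many cross references.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in mirroring the save/flush/clear methods of a Hibernate Session. */
interface FlushableSession {
    void save(Object entity);
    void flush(); // push pending inserts to the database
    void clear(); // drop already-flushed objects from the session cache
}

/** Counts the calls it receives, so the sketch can run and be checked without a database. */
class CountingSession implements FlushableSession {
    int saves;
    int flushes;
    int clears;

    public void save(Object entity) { saves++; }
    public void flush() { flushes++; }
    public void clear() { clears++; }
}

class BatchInsertSketch {
    /** Saves all nodes, flushing and clearing the session after every batchSize saves. */
    static void saveInBatches(FlushableSession session, List<?> nodes, int batchSize) {
        int pending = 0;
        for (Object node : nodes) {
            session.save(node);
            pending++;
            if (pending == batchSize) {
                session.flush();
                session.clear();
                pending = 0;
            }
        }
        if (pending > 0) { // flush the final, partially filled batch
            session.flush();
            session.clear();
        }
    }

    public static void main(String[] args) {
        List<Integer> nodes = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            nodes.add(i);
        }
        CountingSession session = new CountingSession();
        saveInBatches(session, nodes, 30); // 30 matches the batch size tried above
        System.out.println(session.saves + " saves, " + session.flushes + " flushes");
    }
}
```

Keeping the session small this way also limits the work done per flush, since the session no longer has to track every node saved so far. Whether that explains the slowdown described above is a guess that would need profiling to confirm.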

Cheers,
Ian


Re: [Teneo] Performance creating a DB [message #83792 is a reply to message #83761] Thu, 17 May 2007 05:38
Martin Taal
As far as I can see, this is really something which needs to be solved on the Hibernate side (so it is not a Teneo
issue).
I am not sure why the JDBC batch size is ignored; Teneo does not influence/change that setting.
The Hibernate docs say that insert batching is disabled when using an identity
identifier generator.

One thing you can try is to add an interceptor or event listener to Hibernate and use this to
forcefully commit the transaction or call session.flush to speed things up. Interceptors can be passed
to the session when opening one, so it won't influence your standard operation. It is not the nicest
solution, but it may speed things up.

gr. Martin

Re: [Teneo] Performance creating a DB [message #92343 is a reply to message #83761] Wed, 01 August 2007 17:24
Alain Picard
Ian,

After posting a similar topic, I just noticed this earlier posting. We are in a similar situation as we are saving tree structures
(annotated concrete syntax tree) much like you and it is taking a very long time.

Were you able to find some sort of solution to this problem?

Thanks
Alain


Re: [Teneo] Performance creating a DB [message #606699 is a reply to message #83513] Wed, 16 May 2007 06:03 Go to previous message
Ian Bull is currently offline Ian BullFriend
Messages: 145
Registered: July 2009
Senior Member
I should mention that it took about 9 hours to complete.

cheers,
ian

Ian Bull wrote:
> I was wondering if there are any tips or pointers regarding performance
> and Teneo. I have a fairly large model (about 60M in XML) and I just
> loaded it into mysql using Teneo. Basically I just loaded the root
> contents from the resource and saved it to a hibernate session.
>
> I was wondering if there are any obvious ways to speed this up (I just
> realized I missed something in my model so I will be running it again
> tomorrow). I didn't know if there were a bunch of extra calculations
> (notifications) being done that I could safely turn off during this
> operation. So far all I do is:
>
>
> Session session = ...
> Transaction tx = ....
>
> session.save(rootNode);
>
> tx.commit();
> session.close();
>
> cheers,
> ian
Re: [Teneo] Performance creating a DB [message #606700 is a reply to message #83530] Wed, 16 May 2007 06:40 Go to previous message
Martin Taal is currently offline Martin TaalFriend
Messages: 5468
Registered: July 2009
Senior Member
Hmm this is too long for the amount of data. I am doing a project were I insert about 200000 objects
in different tables and this takes about 15 minutes on mysql (also using teneo/hibernate).
One thing you can do is if the data tree has independent structures to save these independent trees
first (with a commit after each save) and then save the rootNode as the last one. My experience with
inserting/updating large datasets is that frequent commits can speed up things (factor 10 or more).
Ofcourse this only works if you can drop-create the db if the import fails.

Btw, the runtime part of Teneo should only cost very little extra performance. So the performance
'issue' should be somewhere in hibernate or mysql. Which uses the most cpu: java or mysql?

gr. Martin

Ian Bull wrote:
> I should mention that it took about 9 hours to complete.
>
> cheers,
> ian
>
> Ian Bull wrote:
>> I was wondering if there are any tips or pointers regarding
>> performance and Teneo. I have a fairly large model (about 60M in XML)
>> and I just loaded it into mysql using Teneo. Basically I just loaded
>> the root contents from the resource and saved it to a hibernate session.
>>
>> I was wondering if there are any obvious ways to speed this up (I just
>> realized I missed something in my model so I will be running it again
>> tomorrow). I didn't know if there were a bunch of extra calculations
>> (notifications) being done that I could safely turn off during this
>> operation. So far all I do is:
>>
>>
>> Session session = ...
>> Transaction tx = ....
>>
>> session.save(rootNode);
>>
>> tx.commit();
>> session.close();
>>
>> cheers,
>> ian


--

With Regards, Martin Taal

Springsite/Elver.org
Office: Hardwareweg 4, 3821 BV Amersfoort
Postal: Nassaulaan 7, 3941 EC Doorn
The Netherlands
Tel: +31 (0)84 420 2397
Fax: +31 (0)84 225 9307
Mail: mtaal@springsite.com - mtaal@elver.org
Web: www.springsite.com - www.elver.org
Re: [Teneo] Performance creating a DB [message #606713 is a reply to message #83544] Wed, 16 May 2007 15:17 Go to previous message
Ian Bull is currently offline Ian BullFriend
Messages: 145
Registered: July 2009
Senior Member
Thanks for the quick response Martin,

I watched Top for a while and Java seemed to have most of the CPU, but
MySQL was definitely there.

The problem is, I am basically saving a tree structure with lots of
cross relationships between the different tree nodes (something like an
abstract syntax tree with all the semantic relationships included). I
assume that each time I do a save, a lot of other queries are done to
connect all the nodes, and this is where the performance hit comes.

Once it is in MySQL, Teneo / hibernate works great to get all the data back.

Since I have to re-run my database build today, I will try and save
parts of the tree in sequence and try and isolate the performance
bottleneck.

cheers,
ian



Martin Taal wrote:
> Hmm this is too long for the amount of data. I am doing a project were I
> insert about 200000 objects in different tables and this takes about 15
> minutes on mysql (also using teneo/hibernate).
> One thing you can do is if the data tree has independent structures to
> save these independent trees first (with a commit after each save) and
> then save the rootNode as the last one. My experience with
> inserting/updating large datasets is that frequent commits can speed up
> things (factor 10 or more). Ofcourse this only works if you can
> drop-create the db if the import fails.
>
> Btw, the runtime part of Teneo should only cost very little extra
> performance. So the performance 'issue' should be somewhere in hibernate
> or mysql. Which uses the most cpu: java or mysql?
>
> gr. Martin
>
> Ian Bull wrote:
>> I should mention that it took about 9 hours to complete.
>>
>> cheers,
>> ian
>>
>> Ian Bull wrote:
>>> I was wondering if there are any tips or pointers regarding
>>> performance and Teneo. I have a fairly large model (about 60M in XML)
>>> and I just loaded it into mysql using Teneo. Basically I just loaded
>>> the root contents from the resource and saved it to a hibernate session.
>>>
>>> I was wondering if there are any obvious ways to speed this up (I
>>> just realized I missed something in my model so I will be running it
>>> again tomorrow). I didn't know if there were a bunch of extra
>>> calculations (notifications) being done that I could safely turn off
>>> during this operation. So far all I do is:
>>>
>>>
>>> Session session = ...
>>> Transaction tx = ....
>>>
>>> session.save(rootNode);
>>>
>>> tx.commit();
>>> session.close();
>>>
>>> cheers,
>>> ian
>
>
Re: [Teneo] Performance creating a DB [message #606714 is a reply to message #83746] Thu, 17 May 2007 00:28 Go to previous message
Ian Bull is currently offline Ian BullFriend
Messages: 145
Registered: July 2009
Senior Member
I have spent most the day playing with this and trying different
options. I stopped the cascading, so when one EObject is added it
doesn't add all the related ones, and this speeds up the addition of
each object, but I haven't determined if this actually makes the entire
insert faster.

What I did notice (which is probably not Teneo, more likely Hibernate,
MySQL or the JDBC driver) is that as the tables grow in size, each
insert becomes "noticeably" slower. My basic model consists of nodes
(35,000) each with about 8-10 attributes (name, value pairs). This means
that I will have 35,000 nodes and almost 300,000 attributes. The first
100 nodes (and 1,000) attributes are inserted in about a second, but
after the tables grow (500 nodes, 50,000 attribute rows) everything is
much slower. 100 nodes takes a few seconds to insert and it keeps
getting slower. Does this make sense?

I know this is not a Teneo issue, but anyone has advice on how to
configure teneo for batch insertion, it would be greatly appreciated. I
have tried to set hibernate.jdbc.batch_size 30, but each save still
seems to result in an insert (instead of 1 insert for 30 nodes). I have
also tried the stateless session, but this threw an exception when I
tried to insert a node.

Cheers,
Ian


Ian Bull wrote:
> Thanks for the quick response Martin,
>
> I watched Top for a while and Java seemed to have most of the CPU, but
> MySQL was definitely there.
>
> The problem is, I am basically saving a tree structure with lots of
> cross relationships between the different tree nodes (something like an
> abstract syntax tree with all the semantic relationships included). I
> assume that each time I do a save, a lot of other queries are done to
> connect all the nodes, and this is where the performance hit comes.
>
> Once it is in MySQL, Teneo / hibernate works great to get all the data
> back.
>
> Since I have to re-run my database build today, I will try and save
> parts of the tree in sequence and try and isolate the performance
> bottleneck.
>
> cheers,
> ian
>
>
>
> Martin Taal wrote:
>> Hmm this is too long for the amount of data. I am doing a project were
>> I insert about 200000 objects in different tables and this takes about
>> 15 minutes on mysql (also using teneo/hibernate).
>> One thing you can do is if the data tree has independent structures to
>> save these independent trees first (with a commit after each save) and
>> then save the rootNode as the last one. My experience with
>> inserting/updating large datasets is that frequent commits can speed
>> up things (factor 10 or more). Ofcourse this only works if you can
>> drop-create the db if the import fails.
>>
>> Btw, the runtime part of Teneo should only cost very little extra
>> performance. So the performance 'issue' should be somewhere in
>> hibernate or mysql. Which uses the most cpu: java or mysql?
>>
>> gr. Martin
>>
>> Ian Bull wrote:
>>> I should mention that it took about 9 hours to complete.
>>>
>>> cheers,
>>> ian
>>>
>>> Ian Bull wrote:
>>>> I was wondering if there are any tips or pointers regarding
>>>> performance and Teneo. I have a fairly large model (about 60M in
>>>> XML) and I just loaded it into mysql using Teneo. Basically I just
>>>> loaded the root contents from the resource and saved it to a
>>>> hibernate session.
>>>>
>>>> I was wondering if there are any obvious ways to speed this up (I
>>>> just realized I missed something in my model so I will be running it
>>>> again tomorrow). I didn't know if there were a bunch of extra
>>>> calculations (notifications) being done that I could safely turn off
>>>> during this operation. So far all I do is:
>>>>
>>>>
>>>> Session session = ...
>>>> Transaction tx = ....
>>>>
>>>> session.save(rootNode);
>>>>
>>>> tx.commit();
>>>> session.close();
>>>>
>>>> cheers,
>>>> ian
>>
>>
Re: [Teneo] Performance creating a DB [message #606716 is a reply to message #83761] Thu, 17 May 2007 05:38 Go to previous message
Martin Taal is currently offline Martin TaalFriend
Messages: 5468
Registered: July 2009
Senior Member
Afaics it is really something which needs to be solved on the Hibernate side (so it is not a Teneo
issue).
I am not sure why the jdbc batch size is ignored, Teneo does not influence/change that setting.
Reading the docs hibernate it says that insert batching is disabled when using an identity
identifier generator.

One thing you can try is to add an interceptor or event listener to hibernate and use this to
forcefully call the session.commit or session.flush to speed things up. Interceptors can be passed
to the session when opening one so it won't influence your standard operation. It is not the nicest
solution but it may speed things up.

gr. Martin

Ian Bull wrote:
> I have spent most the day playing with this and trying different
> options. I stopped the cascading, so when one EObject is added it
> doesn't add all the related ones, and this speeds up the addition of
> each object, but I haven't determined if this actually makes the entire
> insert faster.
>
> What I did notice (which is probably not Teneo, more likely Hibernate,
> MySQL or the JDBC driver) is that as the tables grow in size, each
> insert becomes "noticeably" slower. My basic model consists of nodes
> (35,000) each with about 8-10 attributes (name, value pairs). This means
> that I will have 35,000 nodes and almost 300,000 attributes. The first
> 100 nodes (and 1,000 attributes) are inserted in about a second, but
> after the tables grow (500 nodes, 50,000 attribute rows) everything is
> much slower. 100 nodes takes a few seconds to insert and it keeps
> getting slower. Does this make sense?
>
> I know this is not a Teneo issue, but if anyone has advice on how to
> configure Teneo for batch insertion, it would be greatly appreciated. I
> have tried to set hibernate.jdbc.batch_size 30, but each save still
> seems to result in an insert (instead of 1 insert for 30 nodes). I have
> also tried the stateless session, but this threw an exception when I
> tried to insert a node.
>
> Cheers,
> Ian
>
>
> Ian Bull wrote:
>> Thanks for the quick response Martin,
>>
>> I watched Top for a while and Java seemed to have most of the CPU, but
>> MySQL was definitely there.
>>
>> The problem is, I am basically saving a tree structure with lots of
>> cross relationships between the different tree nodes (something like
>> an abstract syntax tree with all the semantic relationships
>> included). I assume that each time I do a save, a lot of other
>> queries are done to connect all the nodes, and this is where the
>> performance hit comes.
>>
>> Once it is in MySQL, Teneo / hibernate works great to get all the data
>> back.
>>
>> Since I have to re-run my database build today, I will try and save
>> parts of the tree in sequence and try and isolate the performance
>> bottleneck.
>>
>> cheers,
>> ian
>>
>>
>>
>> Martin Taal wrote:
>>> Hmm this is too long for the amount of data. I am doing a project
>>> where I insert about 200,000 objects in different tables and this takes
>>> about 15 minutes on MySQL (also using Teneo/Hibernate).
>>> One thing you can do, if the data tree has independent structures,
>>> is to save these independent trees first (with a commit after each save)
>>> and then save the rootNode as the last one. My experience with
>>> inserting/updating large datasets is that frequent commits can speed
>>> things up (factor 10 or more). Of course this only works if you can
>>> drop and re-create the db if the import fails.
>>>
>>> Btw, the runtime part of Teneo should only cost very little extra
>>> performance. So the performance 'issue' should be somewhere in
>>> hibernate or mysql. Which uses the most cpu: java or mysql?
>>>
>>> gr. Martin
>>>
>>> Ian Bull wrote:
>>>> I should mention that it took about 9 hours to complete.
>>>>
>>>> cheers,
>>>> ian
>>>>
>>>> Ian Bull wrote:
>>>>> I was wondering if there are any tips or pointers regarding
>>>>> performance and Teneo. I have a fairly large model (about 60M in
>>>>> XML) and I just loaded it into mysql using Teneo. Basically I just
>>>>> loaded the root contents from the resource and saved it to a
>>>>> hibernate session.
>>>>>
>>>>> I was wondering if there are any obvious ways to speed this up (I
>>>>> just realized I missed something in my model so I will be running
>>>>> it again tomorrow). I didn't know if there were a bunch of extra
>>>>> calculations (notifications) being done that I could safely turn
>>>>> off during this operation. So far all I do is:
>>>>>
>>>>>
>>>>> Session session = ...
>>>>> Transaction tx = ....
>>>>>
>>>>> session.save(rootNode);
>>>>>
>>>>> tx.commit();
>>>>> session.close();
>>>>>
>>>>> cheers,
>>>>> ian
>>>
>>>
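
The approach from the quoted advice above (save independent subtrees first, with a commit after
each, then the root last) could look roughly like the sketch below. ImportSession is a
hypothetical stand-in for a Hibernate Session together with its Transaction, and SubtreeImporter
and LoggingSession are made-up names; how to split a model into independent subtrees is not
shown and depends on the model.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Hibernate Session plus its Transaction.
interface ImportSession {
    void begin();
    void save(Object entity);
    void commit();
}

final class SubtreeImporter {
    // Saves each independent subtree in its own transaction, then the root
    // in a final transaction. Returns the number of commits issued.
    static int importModel(ImportSession session, List<?> subtrees, Object root) {
        int commits = 0;
        for (Object subtree : subtrees) {
            session.begin();
            session.save(subtree);
            session.commit();
            commits++;
        }
        session.begin();
        session.save(root);
        session.commit();
        return commits + 1;
    }
}

// Fake session that records the call order, for trying the sketch without a database.
final class LoggingSession implements ImportSession {
    final List<String> log = new ArrayList<>();
    public void begin() { log.add("begin"); }
    public void save(Object entity) { log.add("save " + entity); }
    public void commit() { log.add("commit"); }
}
```

Because every subtree is committed on its own, a failure halfway leaves a partially filled
database, which is why this only works if the db can be dropped and re-created when the import
fails.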


Re: [Teneo] Performance creating a DB [message #609407 is a reply to message #83761] Wed, 01 August 2007 17:24
Alain Picard
Ian,

After posting a similar topic, I just noticed this earlier posting. We are in a similar situation as we are saving tree structures
(annotated concrete syntax tree) much like you and it is taking a very long time.

Were you able to find some sort of solution to this problem?

Thanks
Alain

