| [Teneo] Performance creating a DB [message #83513] | 
Wed, 16 May 2007 01:53   | 
 
Eclipse User  | 
 | 
 | 
   | 
 
Originally posted by: irbull.cs.uvic.ca 
 
I was wondering if there are any tips or pointers regarding performance  
and Teneo. I have a fairly large model (about 60M in XML) and I just  
loaded it into mysql using Teneo.  Basically I just loaded the root  
contents from the resource and saved it to a hibernate session. 
 
I was wondering if there are any obvious ways to speed this up (I just  
realized I missed something in my model so I will be running it again  
tomorrow).  I didn't know if there were a bunch of extra calculations  
(notifications) being done that I could safely turn off during this  
operation.  So far all I do is: 
 
 
Session session = ... 
Transaction tx = .... 
 
session.save(rootNode); 
 
tx.commit(); 
session.close(); 
 
cheers, 
ian
 |  
 |  
  |   |   |  
| Re: [Teneo] Performance creating a DB [message #83746 is a reply to message #83544] | 
Wed, 16 May 2007 11:17    | 
 
Eclipse User  | 
 | 
 | 
   | 
 
Originally posted by: irbull.cs.uvic.ca 
 
Thanks for the quick response Martin, 
 
I watched top for a while and Java seemed to have most of the CPU, but
MySQL was definitely there. 
 
The problem is, I am basically saving a tree structure with lots of  
cross relationships between the different tree nodes (something like an  
abstract syntax tree with all the semantic relationships included).  I  
assume that each time I do a save, a lot of other queries are done to  
connect all the nodes, and this is where the performance hit comes. 
 
Once it is in MySQL, Teneo / hibernate works great to get all the data back. 
 
Since I have to re-run my database build today, I will try and save  
parts of the tree in sequence and try and isolate the performance  
bottleneck. 
 
cheers, 
ian 
 
 
 
Martin Taal wrote: 
> Hmm this is too long for the amount of data. I am doing a project where I
> insert about 200,000 objects in different tables and this takes about 15
> minutes on MySQL (also using Teneo/Hibernate).
> One thing you can do, if the data tree has independent structures, is to
> save these independent trees first (with a commit after each save) and
> then save the rootNode as the last one. My experience with
> inserting/updating large datasets is that frequent commits can speed
> things up (factor 10 or more). Of course this only works if you can
> drop-create the db if the import fails.
>  
> Btw, the runtime part of Teneo should only cost very little extra  
> performance. So the performance 'issue' should be somewhere in hibernate  
> or mysql. Which uses the most cpu: java or mysql? 
>  
> gr. Martin 
>  
> Ian Bull wrote: 
>> I should mention that it took about 9 hours to complete. 
>> 
>> cheers, 
>> ian 
 |  
 |  
  |  
| Re: [Teneo] Performance creating a DB [message #83761 is a reply to message #83746] | 
Wed, 16 May 2007 20:28    | 
 
Eclipse User  | 
 | 
 | 
   | 
 
Originally posted by: irbull.cs.uvic.ca 
 
I have spent most of the day playing with this and trying different
options. I stopped the cascading, so when one EObject is added it  
doesn't add all the related ones, and this speeds up the addition of  
each object, but I haven't determined if this actually makes the entire  
insert faster. 
 
What I did notice (which is probably not Teneo, more likely Hibernate,  
MySQL or the JDBC driver) is that as the tables grow in size, each  
insert becomes "noticeably" slower.  My basic model consists of nodes  
(35,000) each with about 8-10 attributes (name, value pairs). This means  
that I will have 35,000 nodes and almost 300,000 attributes.  The first  
100 nodes (and 1,000 attributes) are inserted in about a second, but
after the tables grow (500 nodes, 50,000 attribute rows) everything is  
much slower.  100 nodes takes a few seconds to insert and it keeps  
getting slower.  Does this make sense? 
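
To put the numbers in this thread side by side, here is a small back-of-the-envelope sketch. It assumes the 9-hour run wrote roughly the 35,000 node rows plus the ~300,000 attribute rows described above; the thread does not confirm that exactly, and the class and method names are made up for illustration.

```java
// Rough throughput comparison using figures quoted in this thread.
final class ImportThroughput {
    /** Rows (or objects) written per second. */
    static double perSecond(long rows, long seconds) {
        return (double) rows / seconds;
    }

    public static void main(String[] args) {
        // Assumption: about 35,000 node rows plus about 300,000 attribute rows in 9 hours.
        double thisImport = perSecond(35_000L + 300_000L, 9 * 3600L);
        // Reference point from Martin's reply: about 200,000 objects in about 15 minutes.
        double reference = perSecond(200_000L, 15 * 60L);
        System.out.printf("this import: %.1f rows/s, reference: %.1f objects/s%n",
                thisImport, reference);
    }
}
```

Under those assumptions the import runs at roughly 10 rows per second against a reference of a little over 200 objects per second, which fits the observation that per-insert cost is growing as the tables fill.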
 
I know this is not a Teneo issue, but if anyone has advice on how to
configure Teneo for batch insertion, it would be greatly appreciated. I
have tried setting hibernate.jdbc.batch_size to 30, but each save still
seems to result in its own insert (instead of 1 insert for 30 nodes).  I have
also tried the stateless session, but this threw an exception when I  
tried to insert a node. 
 
Cheers, 
Ian 
 
 
 |  
 |  
  |  
| Re: [Teneo] Performance creating a DB [message #83792 is a reply to message #83761] | 
Thu, 17 May 2007 01:38    | 
 
Eclipse User  | 
 | 
 | 
   | 
 
AFAICS this really needs to be solved on the Hibernate side (so it is not a Teneo
issue).
I am not sure why the JDBC batch size is ignored; Teneo does not influence or change that setting.
The Hibernate docs say that insert batching is disabled when using an identity
identifier generator.
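
For reference, a minimal sketch of the Hibernate settings usually involved in batch inserts, built as a plain java.util.Properties object. The property names are Hibernate's own; the class and method names are hypothetical, and how the properties reach Hibernate in a Teneo setup is not covered in this thread. As noted above, these settings will not produce batching for entities whose ids come from an identity generator.

```java
import java.util.Properties;

final class BatchInsertSettings {
    /** Builds Hibernate properties commonly used for bulk inserts. */
    static Properties forBulkImport(int batchSize) {
        Properties props = new Properties();
        // Group up to this many INSERT statements into one JDBC batch.
        props.setProperty("hibernate.jdbc.batch_size", Integer.toString(batchSize));
        // Order inserts by entity type so consecutive statements can share a batch
        // (only in Hibernate versions that support this property).
        props.setProperty("hibernate.order_inserts", "true");
        // The second-level cache only adds overhead during a one-off import.
        props.setProperty("hibernate.cache.use_second_level_cache", "false");
        return props;
    }
}
```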
 
One thing you can try is to add an interceptor or event listener to Hibernate and use it to
force a transaction commit or a session.flush at intervals to speed things up. Interceptors can be passed
to the session when opening one, so it won't influence your standard operation. It is not the nicest
solution but it may speed things up.
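
A simpler variant of the same idea, sketched without an interceptor: count the saves in the import loop and flush and clear after every chunk. PersistenceSession below is a hypothetical stand-in that only mirrors the save, flush and clear operations of a Hibernate Session, so the sketch compiles on its own; it is not Teneo or Hibernate code.

```java
import java.util.List;

/** Hypothetical stand-in for the few session operations this sketch needs. */
interface PersistenceSession {
    void save(Object entity);
    void flush();  // push pending inserts to the database
    void clear();  // drop already-flushed objects from the in-memory cache
}

final class PeriodicFlush {
    /** Saves every item, flushing and clearing after each chunk; returns the number of flushes. */
    static int saveAll(PersistenceSession session, List<?> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int flushes = 0;
        int pending = 0;
        for (Object item : items) {
            session.save(item);
            pending++;
            if (pending == chunkSize) {
                session.flush();
                session.clear();
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) { // flush the final partial chunk
            session.flush();
            session.clear();
            flushes++;
        }
        return flushes;
    }
}
```

Choosing a chunk size equal to the JDBC batch size is a common starting point. Whether clear() is safe depends on how later nodes reference earlier ones, which this thread does not settle.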
 
gr. Martin 
 
 
--  
 
With Regards, Martin Taal 
 
Springsite/Elver.org 
Office: Hardwareweg 4, 3821 BV Amersfoort 
Postal: Nassaulaan 7, 3941 EC Doorn 
The Netherlands 
Tel: +31 (0)84 420 2397 
Fax: +31 (0)84 225 9307 
Mail: mtaal@springsite.com - mtaal@elver.org 
Web: www.springsite.com - www.elver.org
 |  
 |  
  |  
| Re: [Teneo] Performance creating a DB [message #92343 is a reply to message #83761] | 
Wed, 01 August 2007 13:24   | 
 
Eclipse User  | 
 | 
 | 
   | 
 
Ian, 
 
After posting a similar topic, I just noticed this earlier posting. We are in a similar situation as we are saving tree structures  
(annotated concrete syntax tree) much like you and it is taking a very long time. 
 
Were you able to find some sort of solution to this problem? 
 
Thanks 
Alain 
 
 
 |  
 |  
  |  
| Re: [Teneo] Performance creating a DB [message #606699 is a reply to message #83513] | 
Wed, 16 May 2007 02:03   | 
 
Eclipse User  | 
 | 
 | 
   | 
 
I should mention that it took about 9 hours to complete. 
 
cheers, 
ian 
 
 |  
 |  
  |  
| Re: [Teneo] Performance creating a DB [message #606700 is a reply to message #83530] | 
Wed, 16 May 2007 02:40   | 
 
Eclipse User  | 
 | 
 | 
   | 
 
Hmm this is too long for the amount of data. I am doing a project where I insert about 200,000 objects
in different tables and this takes about 15 minutes on MySQL (also using Teneo/Hibernate).
One thing you can do, if the data tree has independent structures, is to save these independent trees
first (with a commit after each save) and then save the rootNode as the last one. My experience with
inserting/updating large datasets is that frequent commits can speed things up (factor 10 or more).
Of course this only works if you can drop-create the db if the import fails.
 
Btw, the runtime part of Teneo should only cost very little extra performance. So the performance  
'issue' should be somewhere in hibernate or mysql. Which uses the most cpu: java or mysql? 
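
The commit-per-subtree idea above can be sketched roughly as follows. ImportSession is a hypothetical stand-in for a session plus its transaction handling, so the example is self-contained; with Hibernate the three calls would correspond to beginning a transaction, session.save and committing that transaction. How to find the independent subtrees in a given model is left open.

```java
import java.util.List;

/** Hypothetical stand-in for a session together with its transaction handling. */
interface ImportSession {
    void begin();
    void save(Object root);
    void commit();
}

final class SubtreeImport {
    /**
     * Saves each independent subtree in its own transaction, then the root node last.
     * Returns the number of commits performed.
     */
    static int importAll(ImportSession session, List<?> independentSubtrees, Object rootNode) {
        int commits = 0;
        for (Object subtree : independentSubtrees) {
            session.begin();
            session.save(subtree);
            session.commit(); // small transactions instead of one huge one
            commits++;
        }
        session.begin();
        session.save(rootNode); // the root goes in last, once its subtrees are stored
        session.commit();
        return commits + 1;
    }
}
```

Because each subtree is committed separately, a failure partway through leaves a partially filled database, which is why this only suits imports where the database can be dropped and recreated.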
 
gr. Martin 
 
 |  
 |  
  |  
| Re: [Teneo] Performance creating a DB [message #606713 is a reply to message #83544] | 
Wed, 16 May 2007 11:17   | 
 
Eclipse User  | 
 | 
 | 
   | 
 
Thanks for the quick response Martin, 
 
I watched Top for a while and Java seemed to have most of the CPU, but  
MySQL was definitely there. 
 
The problem is, I am basically saving a tree structure with lots of  
cross relationships between the different tree nodes (something like an  
abstract syntax tree with all the semantic relationships included).  I  
assume that each time I do a save, a lot of other queries are done to  
connect all the nodes, and this is where the performance hit comes. 
 
Once it is in MySQL, Teneo / hibernate works great to get all the data back. 
 
Since I have to re-run my database build today, I will try and save  
parts of the tree in sequence and try and isolate the performance  
bottleneck. 
 
cheers, 
ian 
 
 
 
Martin Taal wrote: 
> Hmm this is too long for the amount of data. I am doing a project were I  
> insert about 200000 objects in different tables and this takes about 15  
> minutes on mysql (also using teneo/hibernate). 
> One thing you can do is if the data tree has independent structures to  
> save these independent trees first (with a commit after each save) and  
> then save the rootNode as the last one. My experience with  
> inserting/updating large datasets is that frequent commits can speed up  
> things (factor 10 or more). Ofcourse this only works if you can  
> drop-create the db if the import fails. 
>  
> Btw, the runtime part of Teneo should only cost very little extra  
> performance. So the performance 'issue' should be somewhere in hibernate  
> or mysql. Which uses the most cpu: java or mysql? 
>  
> gr. Martin 
>  
> Ian Bull wrote: 
>> I should mention that it took about 9 hours to complete. 
>> 
>> cheers, 
>> ian 
>> 
>> Ian Bull wrote: 
>>> I was wondering if there are any tips or pointers regarding  
>>> performance and Teneo. I have a fairly large model (about 60M in XML)  
>>> and I just loaded it into mysql using Teneo.  Basically I just loaded  
>>> the root contents from the resource and saved it to a hibernate session. 
>>> 
>>> I was wondering if there are any obvious ways to speed this up (I  
>>> just realized I missed something in my model so I will be running it  
>>> again tomorrow).  I didn't know if there were a bunch of extra  
>>> calculations (notifications) being done that I could safely turn off  
>>> during this operation.  So far all I do is: 
>>> 
>>> 
>>> Session session = ... 
>>> Transaction tx = .... 
>>> 
>>> session.save(rootNode); 
>>> 
>>> tx.commit(); 
>>> session.close(); 
>>> 
>>> cheers, 
>>> ian 
>  
>
 |  
 |  
  |  
| Re: [Teneo] Performance creating a DB [message #606714 is a reply to message #83746] | 
Wed, 16 May 2007 20:28   | 
 
Eclipse User  | 
 | 
 | 
   | 
 
I have spent most the day playing with this and trying different  
options. I stopped the cascading, so when one EObject is added it  
doesn't add all the related ones, and this speeds up the addition of  
each object, but I haven't determined if this actually makes the entire  
insert faster. 
 
What I did notice (which is probably not Teneo, more likely Hibernate,  
MySQL or the JDBC driver) is that as the tables grow in size, each  
insert becomes "noticeably" slower.  My basic model consists of nodes  
(35,000) each with about 8-10 attributes (name, value pairs). This means  
that I will have 35,000 nodes and almost 300,000 attributes.  The first  
100 nodes (and 1,000) attributes are inserted in about a second, but  
after the tables grow (500 nodes, 50,000 attribute rows) everything is  
much slower.  100 nodes takes a few seconds to insert and it keeps  
getting slower.  Does this make sense? 
 
I know this is not a Teneo issue, but anyone has advice on how to  
configure teneo for batch insertion, it would be greatly appreciated. I  
have tried to set hibernate.jdbc.batch_size 30, but each save still  
seems to result in an insert (instead of 1 insert for 30 nodes).  I have  
also tried the stateless session, but this threw an exception when I  
tried to insert a node. 
 
Cheers, 
Ian 
 
 
Ian Bull wrote: 
> Thanks for the quick response Martin, 
>  
> I watched Top for a while and Java seemed to have most of the CPU, but  
> MySQL was definitely there. 
>  
> The problem is, I am basically saving a tree structure with lots of  
> cross relationships between the different tree nodes (something like an  
> abstract syntax tree with all the semantic relationships included).  I  
> assume that each time I do a save, a lot of other queries are done to  
> connect all the nodes, and this is where the performance hit comes. 
>  
> Once it is in MySQL, Teneo / hibernate works great to get all the data  
> back. 
>  
> Since I have to re-run my database build today, I will try and save  
> parts of the tree in sequence and try and isolate the performance  
> bottleneck. 
>  
> cheers, 
> ian 
>  
>  
>  
> Martin Taal wrote: 
>> Hmm this is too long for the amount of data. I am doing a project were  
>> I insert about 200000 objects in different tables and this takes about  
>> 15 minutes on mysql (also using teneo/hibernate). 
>> One thing you can do is if the data tree has independent structures to  
>> save these independent trees first (with a commit after each save) and  
>> then save the rootNode as the last one. My experience with  
>> inserting/updating large datasets is that frequent commits can speed  
>> up things (factor 10 or more). Ofcourse this only works if you can  
>> drop-create the db if the import fails. 
>> 
>> Btw, the runtime part of Teneo should only cost very little extra  
>> performance. So the performance 'issue' should be somewhere in  
>> hibernate or mysql. Which uses the most cpu: java or mysql? 
>> 
>> gr. Martin 
>> 
>> Ian Bull wrote: 
>>> I should mention that it took about 9 hours to complete. 
>>> 
>>> cheers, 
>>> ian 
>>> 
>>> Ian Bull wrote: 
>>>> I was wondering if there are any tips or pointers regarding  
>>>> performance and Teneo. I have a fairly large model (about 60M in  
>>>> XML) and I just loaded it into mysql using Teneo.  Basically I just  
>>>> loaded the root contents from the resource and saved it to a  
>>>> hibernate session. 
>>>> 
>>>> I was wondering if there are any obvious ways to speed this up (I  
>>>> just realized I missed something in my model so I will be running it  
>>>> again tomorrow).  I didn't know if there were a bunch of extra  
>>>> calculations (notifications) being done that I could safely turn off  
>>>> during this operation.  So far all I do is: 
>>>> 
>>>> 
>>>> Session session = ... 
>>>> Transaction tx = .... 
>>>> 
>>>> session.save(rootNode); 
>>>> 
>>>> tx.commit(); 
>>>> session.close(); 
>>>> 
>>>> cheers, 
>>>> ian 
>> 
>>
 |  
 |  
  |  
| Re: [Teneo] Performance creating a DB [message #606716 is a reply to message #83761] | 
Thu, 17 May 2007 01:38   | 
 
Eclipse User  | 
 | 
 | 
   | 
 
Afaics it is really something which needs to be solved on the Hibernate side (so it is not a Teneo  
issue). 
I am not sure why the jdbc batch size is ignored, Teneo does not influence/change that setting.  
Reading the docs hibernate it says that insert batching is disabled when using an identity  
identifier generator. 
 
One thing you can try is to add an interceptor or event listener to hibernate and use this to  
forcefully call the session.commit or session.flush to speed things up. Interceptors can be passed  
to the session when opening one so it won't influence your standard operation. It is not the nicest  
solution but it may speed things up. 
 
gr. Martin 
 
Ian Bull wrote: 
> I have spent most the day playing with this and trying different  
> options. I stopped the cascading, so when one EObject is added it  
> doesn't add all the related ones, and this speeds up the addition of  
> each object, but I haven't determined if this actually makes the entire  
> insert faster. 
>  
> What I did notice (which is probably not Teneo, more likely Hibernate,  
> MySQL or the JDBC driver) is that as the tables grow in size, each  
> insert becomes "noticeably" slower.  My basic model consists of nodes  
> (35,000) each with about 8-10 attributes (name, value pairs). This means  
> that I will have 35,000 nodes and almost 300,000 attributes.  The first  
> 100 nodes (and 1,000) attributes are inserted in about a second, but  
> after the tables grow (500 nodes, 50,000 attribute rows) everything is  
> much slower.  100 nodes takes a few seconds to insert and it keeps  
> getting slower.  Does this make sense? 
>  
> I know this is not a Teneo issue, but anyone has advice on how to  
> configure teneo for batch insertion, it would be greatly appreciated. I  
> have tried to set hibernate.jdbc.batch_size 30, but each save still  
> seems to result in an insert (instead of 1 insert for 30 nodes).  I have  
> also tried the stateless session, but this threw an exception when I  
> tried to insert a node. 
>  
> Cheers, 
> Ian 
>  
>  
> Ian Bull wrote: 
>> Thanks for the quick response Martin, 
>> 
>> I watched Top for a while and Java seemed to have most of the CPU, but  
>> MySQL was definitely there. 
>> 
>> The problem is, I am basically saving a tree structure with lots of  
>> cross relationships between the different tree nodes (something like  
>> an abstract syntax tree with all the semantic relationships  
>> included).  I assume that each time I do a save, a lot of other  
>> queries are done to connect all the nodes, and this is where the  
>> performance hit comes. 
>> 
>> Once it is in MySQL, Teneo / hibernate works great to get all the data  
>> back. 
>> 
>> Since I have to re-run my database build today, I will try and save  
>> parts of the tree in sequence and try and isolate the performance  
>> bottleneck. 
>> 
>> cheers, 
>> ian 
>> 
>> 
>> 
>> Martin Taal wrote: 
>>> Hmm this is too long for the amount of data. I am doing a project  
>>> were I insert about 200000 objects in different tables and this takes  
>>> about 15 minutes on mysql (also using teneo/hibernate). 
>>> One thing you can do is if the data tree has independent structures  
>>> to save these independent trees first (with a commit after each save)  
>>> and then save the rootNode as the last one. My experience with  
>>> inserting/updating large datasets is that frequent commits can speed  
>>> up things (factor 10 or more). Ofcourse this only works if you can  
>>> drop-create the db if the import fails. 
>>> 
>>> Btw, the runtime part of Teneo should only cost very little extra  
>>> performance. So the performance 'issue' should be somewhere in  
>>> hibernate or mysql. Which uses the most cpu: java or mysql? 
>>> 
>>> gr. Martin 
>>> 
>>> Ian Bull wrote: 
>>>> I should mention that it took about 9 hours to complete. 
>>>> 
>>>> cheers, 
>>>> ian 
>>>> 
>>>> Ian Bull wrote: 
>>>>> I was wondering if there are any tips or pointers regarding  
>>>>> performance and Teneo. I have a fairly large model (about 60M in  
>>>>> XML) and I just loaded it into mysql using Teneo.  Basically I just  
>>>>> loaded the root contents from the resource and saved it to a  
>>>>> hibernate session. 
>>>>> 
>>>>> I was wondering if there are any obvious ways to speed this up (I  
>>>>> just realized I missed something in my model so I will be running  
>>>>> it again tomorrow).  I didn't know if there were a bunch of extra  
>>>>> calculations (notifications) being done that I could safely turn  
>>>>> off during this operation.  So far all I do is: 
>>>>> 
>>>>> 
>>>>> Session session = ... 
>>>>> Transaction tx = .... 
>>>>> 
>>>>> session.save(rootNode); 
>>>>> 
>>>>> tx.commit(); 
>>>>> session.close(); 
>>>>> 
>>>>> cheers, 
>>>>> ian 
>>> 
>>> 
 
 
--  
 
With Regards, Martin Taal 
 
Springsite/Elver.org 
Office: Hardwareweg 4, 3821 BV Amersfoort 
Postal: Nassaulaan 7, 3941 EC Doorn 
The Netherlands 
Tel: +31 (0)84 420 2397 
Fax: +31 (0)84 225 9307 
Mail: mtaal@springsite.com - mtaal@elver.org 
Web: www.springsite.com - www.elver.org
 |  
 |  
  |  
| Re: [Teneo] Performance creating a DB [message #609407 is a reply to message #83761] | 
Wed, 01 August 2007 13:24   | 
 
Eclipse User  | 
 | 
 | 
   | 
 
Ian, 
 
After posting a similar topic, I just noticed this earlier posting. We are in a similar situation: we are saving tree structures
(annotated concrete syntax trees), much like you, and it is taking a very long time.
 
Were you able to find some sort of solution to this problem? 
 
Thanks 
Alain 
 
 
Ian Bull wrote: 
> I have spent most of the day playing with this and trying different
> options. I stopped the cascading, so when one EObject is added it  
> doesn't add all the related ones, and this speeds up the addition of  
> each object, but I haven't determined if this actually makes the entire  
> insert faster. 
>  
> What I did notice (which is probably not Teneo, more likely Hibernate,  
> MySQL or the JDBC driver) is that as the tables grow in size, each  
> insert becomes "noticeably" slower.  My basic model consists of nodes  
> (35,000) each with about 8-10 attributes (name, value pairs). This means  
> that I will have 35,000 nodes and almost 300,000 attributes.  The first
> 100 nodes (and 1,000 attributes) are inserted in about a second, but
> after the tables grow (500 nodes, 50,000 attribute rows) everything is  
> much slower.  100 nodes takes a few seconds to insert and it keeps  
> getting slower.  Does this make sense? 
>  
> I know this is not a Teneo issue, but if anyone has advice on how to
> configure Teneo for batch insertion, it would be greatly appreciated. I
> have tried to set hibernate.jdbc.batch_size to 30, but each save still
> seems to result in an insert (instead of 1 insert for 30 nodes).  I have  
> also tried the stateless session, but this threw an exception when I  
> tried to insert a node. 
>  
> Cheers, 
> Ian 
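
For anyone landing on this thread with the same problem, here is a minimal sketch of the "frequent commits" idea suggested above: save in fixed-size chunks, flushing and committing after each chunk instead of once at the end. It is not Teneo or Hibernate code. The `Store` interface, `RecordingStore` and `saveInChunks` are hypothetical names made up for this illustration; the comments only indicate which `org.hibernate.Session` / `Transaction` calls each method would presumably wrap.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: commit every chunkSize saves instead of once at the end. */
public class ChunkedImport {

    /** Hypothetical stand-in for the session/transaction calls the loop needs. */
    public interface Store<T> {
        void begin();         // would wrap session.beginTransaction()
        void save(T item);    // would wrap session.save(item)
        void flushAndClear(); // would wrap session.flush() then session.clear()
        void commit();        // would wrap tx.commit()
    }

    /** Saves all items in chunks and returns the number of commits issued. */
    public static <T> int saveInChunks(List<T> items, Store<T> store, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int commits = 0;
        int pending = 0;
        boolean open = false;
        for (T item : items) {
            if (!open) {
                store.begin();
                open = true;
            }
            store.save(item);
            pending++;
            if (pending == chunkSize) {
                // Push the chunk to the database and release first-level cache entries.
                store.flushAndClear();
                store.commit();
                commits++;
                pending = 0;
                open = false;
            }
        }
        if (open) {
            // Commit the final, partially filled chunk.
            store.flushAndClear();
            store.commit();
            commits++;
        }
        return commits;
    }

    /** In-memory Store that just records what happened; handy for trying the loop. */
    public static class RecordingStore<T> implements Store<T> {
        public final List<T> saved = new ArrayList<>();
        public int begins = 0;
        public int flushes = 0;
        public int commits = 0;

        @Override public void begin() { begins++; }
        @Override public void save(T item) { saved.add(item); }
        @Override public void flushAndClear() { flushes++; }
        @Override public void commit() { commits++; }
    }
}
```

With a real session, the chunk boundaries need some care for a model like the one described here: after `session.clear()` the previously saved objects are detached, so cross references from later chunks back to earlier nodes may need to be re-attached or saved in a later pass. On the batch size observation quoted above, the Hibernate documentation notes that JDBC insert batching is silently disabled when an identity identifier generator is used, so `hibernate.jdbc.batch_size` may have no visible effect if the generated mapping relies on MySQL auto-increment ids. The MySQL Connector/J property `rewriteBatchedStatements=true` may also be worth trying once batching is actually active.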
 |  
 |  
  |   