[Teneo] Performance creating a DB [message #83513] |
Wed, 16 May 2007 05:53 |
Eclipse User |
Originally posted by: irbull.cs.uvic.ca
I was wondering if there are any tips or pointers regarding performance
and Teneo. I have a fairly large model (about 60 MB as XML) and I just
loaded it into MySQL using Teneo. Basically I just loaded the root
contents from the resource and saved them to a Hibernate session.
I was wondering if there are any obvious ways to speed this up (I just
realized I missed something in my model, so I will be running it again
tomorrow). I didn't know if there were a bunch of extra calculations
(notifications) being done that I could safely turn off during this
operation. So far all I do is:
Session session = ...
Transaction tx = ....
session.save(rootNode);
tx.commit();
session.close();
cheers,
ian
Re: [Teneo] Performance creating a DB [message #83530 is a reply to message #83513] |
Wed, 16 May 2007 06:03 |
Eclipse User |
Originally posted by: irbull.cs.uvic.ca
I should mention that it took about 9 hours to complete.
cheers,
ian
Re: [Teneo] Performance creating a DB [message #83746 is a reply to message #83544] |
Wed, 16 May 2007 15:17 |
Eclipse User |
Originally posted by: irbull.cs.uvic.ca
Thanks for the quick response, Martin.
I watched top for a while and Java seemed to have most of the CPU, but
MySQL was definitely there.
The problem is, I am basically saving a tree structure with lots of
cross relationships between the different tree nodes (something like an
abstract syntax tree with all the semantic relationships included). I
assume that each time I do a save, a lot of other queries are run to
connect all the nodes, and this is where the performance hit comes from.
Once it is in MySQL, Teneo/Hibernate works great to get all the data back.
Since I have to re-run my database build today, I will try to save
parts of the tree in sequence and isolate the performance
bottleneck.
cheers,
ian
Martin Taal wrote:
> Hmm, this is too long for the amount of data. I am doing a project where I
> insert about 200,000 objects into different tables and this takes about 15
> minutes on MySQL (also using Teneo/Hibernate).
> One thing you can do, if the data tree has independent structures, is to
> save these independent trees first (with a commit after each save) and
> then save the rootNode last. My experience with
> inserting/updating large datasets is that frequent commits can speed
> things up (by a factor of 10 or more). Of course, this only works if you can
> drop and re-create the DB if the import fails.
>
> Btw, the runtime part of Teneo should cost very little extra
> performance. So the performance 'issue' should be somewhere in Hibernate
> or MySQL. Which uses the most CPU: Java or MySQL?
>
> gr. Martin
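A minimal sketch of the approach Martin describes (independent subtrees first, one commit per save, rootNode last). The `UnitOfWork` interface and `RecordingUnitOfWork` class are hypothetical stand-ins so the example runs without a database; with Hibernate, `save` and `commit` would presumably map to `session.save(...)` and a commit on a fresh transaction. How the model splits into independent subtrees is also an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch of "save independent subtrees first, commit after each, root last". */
class ChunkedSaveSketch {

    /** Hypothetical stand-in for a Hibernate Session plus Transaction. */
    interface UnitOfWork {
        void save(Object node);
        void commit();
    }

    /** In-memory implementation that records what each commit contained. */
    static class RecordingUnitOfWork implements UnitOfWork {
        final List<Object> pending = new ArrayList<>();
        final List<List<Object>> committed = new ArrayList<>();

        @Override public void save(Object node) {
            pending.add(node);
        }

        @Override public void commit() {
            committed.add(new ArrayList<>(pending));
            pending.clear();
        }
    }

    /** Saves each independent subtree in its own commit, then the root node. */
    static void saveInChunks(UnitOfWork work, List<?> independentSubtrees, Object rootNode) {
        for (Object subtree : independentSubtrees) {
            work.save(subtree);
            work.commit(); // one short transaction per subtree
        }
        work.save(rootNode); // the root goes last, once its subtrees are stored
        work.commit();
    }

    public static void main(String[] args) {
        RecordingUnitOfWork work = new RecordingUnitOfWork();
        saveInChunks(work, Arrays.asList("subtreeA", "subtreeB"), "rootNode");
        System.out.println("commits: " + work.committed.size());
    }
}
```

As Martin notes, a failed import leaves the earlier commits in place, so this only suits a database that can be dropped and rebuilt.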
Re: [Teneo] Performance creating a DB [message #83761 is a reply to message #83746] |
Thu, 17 May 2007 00:28 |
Eclipse User |
Originally posted by: irbull.cs.uvic.ca
I have spent most of the day playing with this and trying different
options. I stopped the cascading, so when one EObject is added it
doesn't add all the related ones, and this speeds up the addition of
each object, but I haven't determined whether it actually makes the entire
insert faster.
What I did notice (which is probably not Teneo, more likely Hibernate,
MySQL or the JDBC driver) is that as the tables grow in size, each
insert becomes "noticeably" slower. My basic model consists of nodes
(35,000), each with about 8-10 attributes (name, value pairs). This means
that I will have 35,000 nodes and almost 300,000 attributes. The first
100 nodes (and 1,000 attributes) are inserted in about a second, but
after the tables grow (500 nodes, 50,000 attribute rows) everything is
much slower: 100 nodes take a few seconds to insert, and it keeps
getting slower. Does this make sense?
I know this is not a Teneo issue, but if anyone has advice on how to
configure Teneo for batch insertion, it would be greatly appreciated. I
have tried setting hibernate.jdbc.batch_size to 30, but each save still
seems to result in an insert (instead of 1 insert for 30 nodes). I have
also tried the stateless session, but this threw an exception when I
tried to insert a node.
Cheers,
Ian
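One possible explanation for inserts slowing down as the import proceeds is that the Hibernate session keeps every saved object in its first-level cache, so each flush has more to check. The usual Hibernate batch-insert idiom is to call `flush()` and `clear()` every N saves. The sketch below shows only the loop structure; `SessionLike` and `CountingSession` are hypothetical stand-ins for `org.hibernate.Session` so it runs without a database, and whether this is the actual cause here is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the flush-and-clear batching idiom; no database required. */
class BatchFlushSketch {

    /** Hypothetical stand-in for the org.hibernate.Session methods the idiom uses. */
    interface SessionLike {
        void save(Object entity);
        void flush();
        void clear();
    }

    /** Fake session that tracks how many objects sit in its cache. */
    static class CountingSession implements SessionLike {
        int cached = 0;     // objects currently held by the session
        int maxCached = 0;  // high-water mark of the cache
        int flushes = 0;    // number of flush() calls

        @Override public void save(Object entity) {
            cached++;
            maxCached = Math.max(maxCached, cached);
        }

        @Override public void flush() {
            flushes++;
        }

        @Override public void clear() {
            cached = 0;
        }
    }

    /** Saves all nodes, flushing and clearing the session every batchSize saves. */
    static void saveAll(SessionLike session, List<?> nodes, int batchSize) {
        int count = 0;
        for (Object node : nodes) {
            session.save(node);
            if (++count % batchSize == 0) {
                session.flush(); // push the pending inserts to the database
                session.clear(); // drop saved objects from the session cache
            }
        }
        session.flush(); // write whatever is left after the last full batch
        session.clear();
    }

    public static void main(String[] args) {
        List<Integer> nodes = new ArrayList<>();
        for (int i = 0; i < 95; i++) {
            nodes.add(i);
        }
        CountingSession session = new CountingSession();
        saveAll(session, nodes, 30);
        System.out.println("flushes=" + session.flushes + ", maxCached=" + session.maxCached);
    }
}
```

With a real session, `clear()` detaches everything saved so far, so a tree with many cross references may need care about which objects are still referenced by later saves.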
Re: [Teneo] Performance creating a DB [message #83792 is a reply to message #83761] |
Thu, 17 May 2007 05:38 |
Martin Taal Messages: 5468 Registered: July 2009 |
Senior Member |
As far as I can see, this really needs to be solved on the Hibernate side (so it is not a Teneo
issue).
I am not sure why the JDBC batch size is ignored; Teneo does not influence or change that setting.
The Hibernate docs say that insert batching is disabled when using an identity
identifier generator.
One thing you can try is to add an interceptor or event listener to Hibernate and use it to
force a transaction commit or a session.flush() at regular intervals to speed things up. Interceptors can be passed
to the session when opening one, so this won't influence your standard operation. It is not the nicest
solution but it may speed things up.
gr. Martin
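For reference, a sketch of the Hibernate settings that usually matter for JDBC batching. The values are illustrative, the host and schema name in the URL are placeholders, and the use of MySQL Connector/J is an assumption. As noted above, these settings have no effect on inserts while an identity id generator is in use; how to select a different generator through Teneo is not shown here.

```properties
# Number of statements Hibernate groups into one JDBC batch (the value Ian tried).
hibernate.jdbc.batch_size=30
# Order inserts by entity type so consecutive statements can share a batch
# (availability depends on the Hibernate version).
hibernate.order_inserts=true
# Assumption: MySQL Connector/J. rewriteBatchedStatements lets the driver
# combine a batch into multi-row inserts. "localhost" and "modeldb" are placeholders.
hibernate.connection.url=jdbc:mysql://localhost:3306/modeldb?rewriteBatchedStatements=true
```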
--
With Regards, Martin Taal
Springsite/Elver.org
Office: Hardwareweg 4, 3821 BV Amersfoort
Postal: Nassaulaan 7, 3941 EC Doorn
The Netherlands
Tel: +31 (0)84 420 2397
Fax: +31 (0)84 225 9307
Mail: mtaal@springsite.com - mtaal@elver.org
Web: www.springsite.com - www.elver.org
Re: [Teneo] Performance creating a DB [message #92343 is a reply to message #83761] |
Wed, 01 August 2007 17:24 |
Alain Picard Messages: 266 Registered: July 2009 |
Senior Member |
Ian,
After posting a similar topic, I just noticed this earlier thread. We are in a similar situation: we are saving tree structures
(annotated concrete syntax trees), much like you, and it is taking a very long time.
Were you able to find some sort of solution to this problem?
Thanks
Alain