Eclipse Community Forums
[CDO] scalability of to-many references [message #500670] Fri, 27 November 2009 12:37
Lothar Werzinger
Hi,

I am currently running some scalability tests, and the way to-many
references are currently mapped to the database won't scale to references
that contain large numbers of objects.

This is because, when auditing is used (mandatory for my application),
every time one or more object references are added to a to-many reference,
the whole list of object references is written again (with a new CDO
version) to the table containing the to-many references. So if the to-many
reference contains one million object references and you add one more, one
million and one rows need to be inserted into the database.
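
To make the cost concrete, here is a back-of-the-envelope sketch in Python.
It assumes the behaviour described above (each commit re-inserts the full
list under a new version) and, for the second function, that every commit
appends exactly one reference to an initially empty list. The function names
are made up for illustration and are not part of CDO.

```python
def rows_inserted_full_copy(current_size: int, added: int) -> int:
    """Rows written by ONE commit when the whole list is re-inserted
    under a new version (assumed behaviour of the current mapping)."""
    return current_size + added


def total_rows_full_copy(commits: int) -> int:
    """Rows stored in total after `commits` commits that each append one
    reference to an initially empty list: 1 + 2 + ... + commits."""
    return sum(range(1, commits + 1))


# One more reference on top of a million existing ones.
print(rows_inserted_full_copy(1_000_000, 1))
# Growth over a handful of single-append commits.
print(total_rows_full_copy(4))
```

Under these assumptions the stored row count grows quadratically with the
number of single-append commits, whereas one row per added reference would
grow linearly.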

I propose a modified mapping that keeps auditability without having to
replicate all the old object references.

As to-many references basically have two operations (add an object and
remove an object), I propose to split the 'cdo_version' column into
two: 'cdo_version_added' and 'cdo_version_removed'.
If a new object gets added to the to-many reference, its
'cdo_version_added' gets set. If it is removed, its
'cdo_version_removed' gets set accordingly.

With this enhanced mapping, only one row is used for each inserted object
unless the identical object is reinserted after it was removed. Only in
that case do 'duplicate' rows (with different 'cdo_version_added'
and 'cdo_version_removed') need to be created.
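
The write side of this idea can be sketched in Python as a small in-memory
model. This is only an illustration under assumptions: the class and field
names (RefRow, ToManyReference, target) are hypothetical, nothing here is
CDO code, and the sketch does not model the position of an element within
the list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RefRow:
    """One row of the proposed table (hypothetical layout)."""
    target: str                            # the referenced object
    version_added: int                     # 'cdo_version_added'
    version_removed: Optional[int] = None  # 'cdo_version_removed', None = still present


class ToManyReference:
    """In-memory stand-in for the rows of one to-many reference."""

    def __init__(self) -> None:
        self.rows: List[RefRow] = []

    def add(self, target: str, version: int) -> None:
        # Adding writes exactly one new row, regardless of list size.
        self.rows.append(RefRow(target, version))

    def remove(self, target: str, version: int) -> None:
        # Removing updates the row that is still open; nothing is copied.
        for row in self.rows:
            if row.target == target and row.version_removed is None:
                row.version_removed = version
                return
        raise KeyError(f"{target!r} is not currently in the reference")


ref = ToManyReference()
ref.add("a", version=1)
ref.add("b", version=2)
ref.remove("a", version=3)
ref.add("a", version=4)  # re-insert: the only case that yields a second row for "a"
print(len(ref.rows))
```

Four changes produce three rows here: one for "b" and two for "a", because
"a" was removed and then added again.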

The new mapping will be almost as efficient when retrieving the to-many
reference for a given CDO version.
The old (current) query would look like this:
SELECT * FROM X WHERE cdo_version = ?
and the new (proposed) query would be:
SELECT * FROM X WHERE cdo_version_added <= ? AND (cdo_version_removed IS NULL OR cdo_version_removed > ?)
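
As a rough check of the proposed query, here is a minimal sketch using
Python's built-in sqlite3 module and an in-memory database. The table name
x comes from the queries above; the cdo_target column, the sample rows, and
the targets_at helper are assumptions made for the example and do not
reflect the real CDO schema. Both placeholders are bound to the same
requested version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE x (
        cdo_target          TEXT    NOT NULL,  -- hypothetical column
        cdo_version_added   INTEGER NOT NULL,
        cdo_version_removed INTEGER            -- NULL while still present
    )
    """
)
# 'a' added in v1 and removed in v3, 'b' added in v2, 'a' re-added in v4.
conn.executemany(
    "INSERT INTO x VALUES (?, ?, ?)",
    [("a", 1, 3), ("b", 2, None), ("a", 4, None)],
)


def targets_at(connection: sqlite3.Connection, version: int) -> list:
    """Return the targets that are part of the reference at `version`."""
    cursor = connection.execute(
        "SELECT cdo_target FROM x "
        "WHERE cdo_version_added <= ? "
        "AND (cdo_version_removed IS NULL OR cdo_version_removed > ?) "
        "ORDER BY cdo_version_added",
        (version, version),
    )
    return [row[0] for row in cursor]


for v in (1, 2, 3, 4):
    print(v, targets_at(conn, v))
```

A row counts as present from the version in which it was added up to, but
not including, the version in which it was removed, which is why the second
comparison is a strict ">".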

Please let me know your thoughts.

P.S.
I am currently unfamiliar with the code that does the DB mapping, but I
would be willing to help with the new mapping if I can get the necessary
pointers where I need to focus my reading to modify and test the proposed
mapping.

Lothar
Re: [CDO] scalability of to-many references [message #500908 is a reply to message #500670] Mon, 30 November 2009 09:20
Eike Stepper
Hi Lothar,

It seems reasonable to optimize DB access for large collections in
auditing mode. I suggest that you file a bugzilla to discuss possible
solutions and attach patches. Stefan will want to add his opinion as well.

Cheers
/Eike

----
http://thegordian.blogspot.com
http://twitter.com/eikestepper
Re: [CDO] scalability of to-many references [message #500914 is a reply to message #500908] Mon, 30 November 2009 09:53
Lothar Werzinger
Eike Stepper wrote:

> Hi Lothar,
>
> It seems reasonable to optimize DB access for large collections in
> auditing mode. I suggest that you file a bugzilla to discuss possible
> solutions and attach patches. Stefan will want to add his opinion as well.

Done. https://bugs.eclipse.org/bugs/show_bug.cgi?id=296440

> Cheers
> /Eike

Lothar