Eclipse Community Forums
(no subject) [message #686717] Thu, 12 May 2011 04:50
John Smith
When an XMI file is loaded, a parsed reference from an element A to an
element B may be recorded as a so-called "forward reference" if B has not
yet been parsed (because it is located further down the file). However, if
the reference has an eOpposite, it is not always necessary to record that
forward reference, since the reference will be set when the opposite
reference is parsed (together with element B). The local Java variable
"mustAdd" in XMLHandler determines whether a forward reference must be
recorded or not.
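
To illustrate the mechanism described above, here is a minimal sketch in plain Java. It is not EMF code: the `Element` and `Pending` types and the `collectForwardReferences` method are hypothetical names, and the sketch only models "queue a reference when its target has not been parsed yet".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: these types are hypothetical and are not EMF classes.
final class ForwardRefSketch {
    /** One parsed element: its id and the ids it references, in document order. */
    static final class Element {
        final String id;
        final List<String> targets;

        Element(String id, List<String> targets) {
            this.id = id;
            this.targets = targets;
        }
    }

    /** A reference whose target had not been parsed when the reference was read. */
    static final class Pending {
        final String sourceId;
        final String targetId;

        Pending(String sourceId, String targetId) {
            this.sourceId = sourceId;
            this.targetId = targetId;
        }
    }

    /** Returns the references that had to be queued because the target comes later in the input. */
    static List<Pending> collectForwardReferences(List<Element> document) {
        Set<String> seen = new HashSet<>();
        List<Pending> pending = new ArrayList<>();
        for (Element element : document) {
            seen.add(element.id);
            for (String target : element.targets) {
                if (!seen.contains(target)) {
                    // Target not parsed yet: remember the reference for later resolution.
                    pending.add(new Pending(element.id, target));
                }
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        List<Element> document = Arrays.asList(
            new Element("A", Arrays.asList("B")),   // B is further down the file
            new Element("B", Arrays.asList("A")));  // A has already been parsed
        for (Pending p : collectForwardReferences(document)) {
            System.out.println(p.sourceId + " -> " + p.targetId);
        }
    }
}
```

In this toy input only the reference from A to B is queued. The pending list grows with every reference that points further down the file, which is the memory concern raised later in this post.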

So I do not understand the statement

mustAdd = eOpposite.isTransient() || eReference.isMany();

Of course, if the eOpposite reference is transient, then the opposite
reference will never be parsed (together with element B) and a forward
reference must be added. However, why must a forward reference be added
if it's a many-valued reference? My only idea why this could be wanted is
to preserve the order of the references. But then this statement would
make more sense:

mustAdd = eOpposite.isTransient() || eReference.isMany() &&
eReference.isOrdered();

This statement would mean that for metamodels using bidirectional
references where the order of references is not important, no forward
references would need to be recorded at all.
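
As a side note on how the proposed expression is evaluated: in Java, `&&` binds more tightly than `||`, so it reads as `a || (b && c)`. The sketch below compares the two conditions over all inputs. It uses plain booleans instead of the real EReference calls, and the class and method names are hypothetical.

```java
// Hypothetical stand-ins for the two conditions; plain booleans replace the EMF calls.
final class MustAddSketch {
    /** The existing condition as quoted in this post. */
    static boolean current(boolean oppositeTransient, boolean many, boolean ordered) {
        return oppositeTransient || many;
    }

    /** The proposed condition; && binds tighter than ||, so this is t || (m && o). */
    static boolean proposed(boolean oppositeTransient, boolean many, boolean ordered) {
        return oppositeTransient || many && ordered;
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        for (boolean t : values) {
            for (boolean m : values) {
                for (boolean o : values) {
                    // Print only the input combinations where the two conditions disagree.
                    if (current(t, m, o) != proposed(t, m, o)) {
                        System.out.println("transient=" + t + " many=" + m + " ordered=" + o);
                    }
                }
            }
        }
    }
}
```

The two conditions disagree only for a non-transient opposite combined with a many-valued, unordered reference, which is exactly the case the proposal targets.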

Since references are ordered by default, the new implementation would
affect few projects, but it would speed up loading for projects that use
ordered=false. My goal is to write a CDO importer for XMI files several
gigabytes in size, flushing already-read elements to the CDO database at
defined intervals, since CLEAN elements can be garbage collected by CDO
(it doesn't matter if they become DIRTY later, when the opposite
reference is read). I fear the forward reference list could get very
large.
(no subject) [message #686721 is a reply to message #686717] Thu, 12 May 2011 13:12
Ed Merks
Comments below.

exquisitus wrote:
>
> When a XMI file is loaded, a parsed reference from an element A to an
> element B may be recorded as a so called "forward reference" if B is
> not yet parsed (since located further down the file). However if the
> reference has an eOpposite, its not always required to record that
> forward reference, since the reference will be set when the opposite
> reference is parsed (together with element B).
Yes, but will that properly preserve the order recorded in the XMI?
> The local Java variable "mustAdd" in XMLHandler determines this
> necessity whether a forward reference must be recorded or not.
>
> So I do not understand the statement
>
> mustAdd = eOpposite.isTransient() || eReference.isMany();
>
> Of course, if the eOpposite reference is transient, then the opposite
> reference will never be parsed (together with element B) and a forward
> reference must be added. However why must a forward reference be added
> if its a many-valued reference?
To preserve order.
> My only idea why this could be wanted is to preserve the order of the
> references. But then this statement would be more senseful:
>
> mustAdd = eOpposite.isTransient() || eReference.isMany()
> && eReference.isOrdered();
Why store it though if that stored order isn't important? It seems bad
to me that writing something and then reading it back in, doesn't
produce something that's EcoreUtil.equals.
>
> This statement would mean, that for metamodels using bidirectional
> references where order of references is not important no forward
> references would need to be recorded at all.
>
> Since reference are by default ordered, the new implementation would
> affect few projects, however would speed up for projects using the
> ordered=false feature. My goal is to write a CDO importer for XMI
> files of several gigs size, flushing already read elements at defined
> time intervals to the CDO database since CLEAN elements can be
> CDO-garbage collected (it doesnt matter if they get DIRTY later, when
> the opposite reference is read). And I fear the forward reference list
> could get very large.
I'm curious though why you need to persist it if order isn't important.
I guess that's more an issue of how CDO populates that feature's value?


Ed Merks
Professional Support: https://www.macromodeling.com/
(no subject) [message #686739 is a reply to message #686721] Fri, 13 May 2011 13:33
John Smith
>> When a XMI file is loaded, a parsed reference from an element A to an
>> element B may be recorded as a so called "forward reference" if B is
>> not yet parsed (since located further down the file). However if the
>> reference has an eOpposite, its not always required to record that
>> forward reference, since the reference will be set when the opposite
>> reference is parsed (together with element B).
> Yes, but will that properly preserve the order recorded in the XMI?
>> The local Java variable "mustAdd" in XMLHandler determines this
>> necessity whether a forward reference must be recorded or not.
>>
>> So I do not understand the statement
>>
>> mustAdd = eOpposite.isTransient() || eReference.isMany();
>>
>> Of course, if the eOpposite reference is transient, then the opposite
>> reference will never be parsed (together with element B) and a forward
>> reference must be added. However why must a forward reference be added
>> if its a many-valued reference?
> To preserve order.
>> My only idea why this could be wanted is to preserve the order of the
>> references. But then this statement would be more senseful:
>>
>> mustAdd = eOpposite.isTransient() || eReference.isMany() &&
>> eReference.isOrdered();
> Why store it though if that stored order isn't important? It seems bad
> to me that writing something and then reading it back in, doesn't
> produce something that's EcoreUtil.equals.

OK, then does the isOrdered property have any relevance?

>>
>> This statement would mean, that for metamodels using bidirectional
>> references where order of references is not important no forward
>> references would need to be recorded at all.
>>
>> Since reference are by default ordered, the new implementation would
>> affect few projects, however would speed up for projects using the
>> ordered=false feature. My goal is to write a CDO importer for XMI
>> files of several gigs size, flushing already read elements at defined
>> time intervals to the CDO database since CLEAN elements can be
>> CDO-garbage collected (it doesnt matter if they get DIRTY later, when
>> the opposite reference is read). And I fear the forward reference list
>> could get very large.
> I'm curious though why you need to persist it if order isn't important.
> I guess that's more an issue how how CDO populates that feature's value?

No, I was just wondering whether my speculation was right that the
statement has to do with ordering, since it is not documented in the code.
So if I were to write an OR mapping, would I also be required to preserve
order in the isOrdered=false case in order not to break EcoreUtil.equals?
I already wrote an OR mapping where I stored references as a Postgres
bitmap, which by nature cannot store order (however, that implementation
is already history, like so much that never gets open-sourced :-).

Hm, just thinking: with my proposed implementation, if the XMI file is
read and written several times (without changing anything), would
(1) some references go in circles,
(2) nothing change, or
(3) only the first write change something, and the second no longer?

If (3) is the case, would my proposal be interesting for you again? (I
do not know which case is actually true.)

My proposal also follows the philosophy: cause as much chaos as you can,
to find bugs early. Maybe this is not the best option for important code
like EMF...

>>
> I'm curious though why you need to persist it if order isn't important.
> I guess that's more an issue how how CDO populates that feature's value?

I need to persist the order only if isOrdered=true; if isOrdered=false,
then order is not relevant to me. I only wanted to say that my proposed
implementation would not break most existing EMF applications, since I
guess 99.9% of them use isOrdered=true, as it is the default.

It all comes down to deciding whether two ELists with ordered=false
shall be equal even if they have a different order. This would be my
intuitive understanding of the ordered property, and if that is so, then
the EcoreUtil.equals implementation has a severe bug.
(no subject) [message #686740 is a reply to message #686739] Fri, 13 May 2011 14:01
Ed Merks
Comments below.

exquisitus wrote:
>
> >> When a XMI file is loaded, a parsed reference from an element A to an
> >> element B may be recorded as a so called "forward reference" if B is
> >> not yet parsed (since located further down the file). However if the
> >> reference has an eOpposite, its not always required to record that
> >> forward reference, since the reference will be set when the opposite
> >> reference is parsed (together with element B).
> > Yes, but will that properly preserve the order recorded in the XMI?
> >> The local Java variable "mustAdd" in XMLHandler determines this
> >> necessity whether a forward reference must be recorded or not.
> >>
> >> So I do not understand the statement
> >>
> >> mustAdd = eOpposite.isTransient() || eReference.isMany();
> >>
> >> Of course, if the eOpposite reference is transient, then the opposite
> >> reference will never be parsed (together with element B) and a forward
> >> reference must be added. However why must a forward reference be added
> >> if its a many-valued reference?
> > To preserve order.
> >> My only idea why this could be wanted is to preserve the order of the
> >> references. But then this statement would be more senseful:
> >>
> >> mustAdd = eOpposite.isTransient() || eReference.isMany() &&
> >> eReference.isOrdered();
> > Why store it though if that stored order isn't important? It seems bad
> > to me that writing something and then reading it back in, doesn't
> > produce something that's EcoreUtil.equals.
>
> Ok, then has the isOrdered property any relevance?
Not a lot, no. I doubt I'd have put it there if it weren't in MOF when
we started...
>
> >>
> >> This statement would mean, that for metamodels using bidirectional
> >> references where order of references is not important no forward
> >> references would need to be recorded at all.
> >>
> >> Since reference are by default ordered, the new implementation would
> >> affect few projects, however would speed up for projects using the
> >> ordered=false feature. My goal is to write a CDO importer for XMI
> >> files of several gigs size, flushing already read elements at defined
> >> time intervals to the CDO database since CLEAN elements can be
> >> CDO-garbage collected (it doesnt matter if they get DIRTY later, when
> >> the opposite reference is read). And I fear the forward reference list
> >> could get very large.
> > I'm curious though why you need to persist it if order isn't important.
> > I guess that's more an issue how how CDO populates that feature's
> value?
>
> No I just was wondering if my speculation was right, that the
> statement has to do with ordering since it was not documented in the
> code. So if I would write an OR mapping, then I am also required to
> preserve order for the isOrdred=False case in order not to break with
> EcoreUtil.equals?
Yes, if it has a different order, it won't compare as equals. That's
not necessarily a problem. What do you need equals for anyway?
> I already did write an OR mapping where I stored references as a
> Postgres bit map, which cannot store order by nature (however this
> implementation is already history, like so much which not gets open
> source :-).
>
> Hm, just thinking: with my proposed implementation, doing a multiple
> read/write of the XMI file (without changing anything), would
> (1) some references go in circles
> (2) nothing would change
> (3) only the first write would change, the second no longer
>
> if (3) is the case, would my proposal be interesting for you again (i
> not know what case actually is true)?
After reading, all the references would end up in the order in which
their opposites are processed; if you saved and loaded that, it would
remain the same. Of course if you changed the structure in a way that
changed the order of the opposites, you'd be back at square one.
>
> my proposal also aims at the philosophy: cause as much chaos as you
> can, to find bugs early. maybe this is not the best option for
> important code like emf...
It's always good to explore the dark corners.
>
> >>
> > I'm curious though why you need to persist it if order isn't important.
> > I guess that's more an issue how how CDO populates that feature's
> value?
>
> I need to persist the order only if isOrdered=true, if
> isOrdered=false, then order is not relevant to me. I only wanted to
> say, that my proposed implementation would not break most of existing
> EMF applications since I guess 99.9% of them use isOrdered=true, as it
> is the default.
Yes, I imagine that's true.
>
> It all comes down to the point whether to decide, that two ELists with
> ordered=false shall be equal even if they have a different order.
No, not so much. EList is a List, and List has a very well defined
definition for equals that can't be violated; even the implementation
for hashCode is carefully spelled out.
> This would be my intuive understanding of the ordered-property, and if
> that is so, then the EcoreUtil.equals implementation has a severe bug.
Trying to do an unordered comparison in EcoreUtil.equals would turn this
into a problem like EMF compare. Establishing a two-way mapping would
become extremely difficult and expensive. I'd never attempt that...
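
A small plain-Java illustration of the List contract mentioned above: `java.util.List.equals` is order-sensitive, while comparing the same elements as sets is not. The class and method names are hypothetical. The set-based shortcut works here only because `String` implements `equals` and `hashCode`; it ignores duplicates and says nothing about deep structural comparison of model objects.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Hypothetical helper names; only java.util collections are used.
final class ListEqualsSketch {
    /** List.equals requires the same elements in the same order. */
    static boolean sameOrder(List<String> a, List<String> b) {
        return a.equals(b);
    }

    /** Set-based comparison ignores order (and duplicates); relies on String.equals/hashCode. */
    static boolean sameElements(List<String> a, List<String> b) {
        return new HashSet<>(a).equals(new HashSet<>(b));
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("x", "y", "z");
        List<String> second = Arrays.asList("z", "y", "x");
        System.out.println("List.equals: " + sameOrder(first, second));
        System.out.println("as sets:     " + sameElements(first, second));
    }
}
```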


(no subject) [message #686741 is a reply to message #686739] Fri, 13 May 2011 14:36
John Smith
>Your idea of ignoring forward references when order isn't significant
>(isOrdered == false) isn't a bad one, but it begs the question of why
>have them (in the XMI/XML) in the first place if it's not significant.

My proposed implementation is meant as a speed optimization; it is not
meant to change anything about serialization, e.g. by omitting forward
references when writing. That would also be a possible optimization, but
one that changes too much, in my opinion.

> And, as I point out, to process a serialization that has
>enough information to preserve order (even when not significant) but
>then doesn't use that information begs more questions.
I would agree if the overhead of preserving the (maybe not
domain-significant) order were not high. I have given you the use case
where I plan to read very large XMI files, and a bottleneck may be the
unnecessary storing of the forward reference list. It can be hundreds of
megabytes large, since the XMI files store architectural information of
very big models (10 gigabytes or so is not rare). A SAX-style lightweight
parser will be bottlenecked by a growing forward reference list.

>As for doing equality comparison, if you look closely at how that's
>implemented, establishing a two-way mapping between the equal things
>is key in the processing. If we are to compare two lists, but have
>no idea how to go about establishing that mapping, i.e, the first
>element in the first list must be deep structurally compared to all
>elements in the other list, we aren't going to be able to produce such
>a map. Remember, we're not comparing things that are Java .equals so
>we can simply implement the comparison like we would we do compare two
>Sets for equality, i.e., check that each element of each set is
>contained in the other.

I am aware that comparing graphs is a complicated topic. As I remember,
graph isomorphism is one of the very few problems in NP for which it is
not known whether it is in P or NP-complete: clever algorithms seem to
take polynomial time in practice, but no one knows whether a
polynomial-time algorithm really exists.

However, this is no problem at all for EcoreUtil.equals(), as it
treats sets as lists :). It should stay as it is now; users with special
comparison needs such as set comparison should use EMF Compare, I guess,
since otherwise EcoreUtil.equals() would take days for graphs of no more
than 50 nodes.

It would, however, be nice if switching to my implementation could be
controlled by a flag, or if this single line were extracted into a
method that I could easily override:

mustAdd = eOpposite.isTransient() ||
requireForwardReferenceForNonTransientBidirectionalReference( eReference);

Or maybe I will just copy the whole method, if everything is hopefully
protected.
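
The extraction proposed above could look roughly like the following sketch. It is an assumption about the shape of such a hook, not existing EMF API: `HandlerSketch` is a hypothetical stand-in for the handler, `requireForwardReference` is a shortened hypothetical name for the proposed method, and plain booleans replace the EReference queries.

```java
// Hypothetical stand-in for the handler; this is not XMLHandler and not EMF API.
class HandlerSketch {
    /** Mirrors the quoted statement, with the many-valued check moved into an overridable hook. */
    boolean mustAdd(boolean oppositeTransient, boolean many, boolean ordered) {
        return oppositeTransient || requireForwardReference(many, ordered);
    }

    /** Default behaviour: every many-valued reference gets a forward reference. */
    protected boolean requireForwardReference(boolean many, boolean ordered) {
        return many;
    }

    public static void main(String[] args) {
        HandlerSketch standard = new HandlerSketch();
        HandlerSketch tuned = new UnorderedAwareHandlerSketch();
        // Non-transient opposite, many-valued, unordered reference.
        System.out.println("default:  " + standard.mustAdd(false, true, false));
        System.out.println("override: " + tuned.mustAdd(false, true, false));
    }
}

/** Subclass that skips forward references for unordered many-valued references. */
class UnorderedAwareHandlerSketch extends HandlerSketch {
    @Override
    protected boolean requireForwardReference(boolean many, boolean ordered) {
        return many && ordered;
    }
}
```

Keeping the transient check outside the hook means an override cannot accidentally drop forward references that are needed because the opposite is never serialized.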

By the way, I saw that comments are very rare, also in the JDT code
(which is not your topic). This is hard to understand, since EMF and JDT
are such important parts of Eclipse. Is there any plan to improve this
situation, or are there not enough resources? This is a very general
question, but I just have to ask it here, since it was one reason why I
started this thread: just to find out what this line does.
(no subject) [message #686742 is a reply to message #686741] Fri, 13 May 2011 15:22
Ed Merks
Comments below.

exquisitus wrote:
>
> >Your idea of ignoring forward references when order isn't significant
> >(isOrdered == false) isn't a bad one, but it begs the question of why
> >have them (in the XMI/XML) in the first place if it's not significant.
>
> My proposed implementation is thought to be a speed tuning, it shall
> not change any serialization issues, so e.g. omitting writing out
> forward references. Though this would be also a possible optimization,
> but an optimization changing too much in my opinion.
>
> > And, as I point out, to process a serialization that has
> >enough information to preserve order (even when not significant) but
> >then doesn't use that information begs more questions.
> I would agree if the overhead to preserve (maybe not
> domain-significant) order is not high. I have given you the usecase
> where I plan to read very large XMI files, and a bottle neck may be
> the unnecessary storing of the forward list.
The thing that wrote the XMI was able to keep all things in memory, I
assume, right?
> It can be hundreds of megabyte large since the xmi files store
> architectural information of very big models (10 gigs or so not
> seldom). A SAX-style light-weight parsing will be bottlenecked by an
> increasing forward list.
Best to worry about performance problems as they become apparent. More
often than not, the problems we think will be problems turn out to be
irrelevant and things we never imagined to be problematic turn out to be
the bottleneck...
>
> >As for doing equality comparison, if you look closely at how that's
> >implemented, establishing a two-way mapping between the equal things
> >is key in the processing. If we are to compare two lists, but have
> >no idea how to go about establishing that mapping, i.e, the first
> >element in the first list must be deep structurally compared to all
> >elements in the other list, we aren't going to be able to produce such
> >a map. Remember, we're not comparing things that are Java .equals so
> >we can simply implement the comparison like we would we do compare two
> >Sets for equality, i.e., check that each element of each set is
> >contained in the other.
>
> I am aware that comparing graphs is a complicated topic - as I
> remember it is one of the very few problems where the mathematicians
> really not know if its NP or P, since all clever algorithms seem to
> take polynomial time however no one knows if there really exists an
> algorithm of polynomial time.
>
> However this fact is exactly no problem for EcoreUtil.equals() as it
> treats sets as lists :).
Yep.
> It should be as it is now, users with special compare requests like
> set-comparison should use EMF Compare I guess, since otherwise
> EcoreUtil.equals() would take days for graphs of no more of 50 nodes.
>
> It would however be nice if switching to my implementation could be
> either controled using a flag mechanism, or maybe this single line
> should be out-sourced in a method which could be overwritten easily by
> me:
>
> mustAdd = eOpposite.isTransient() ||
> requireForwardReferenceForNonTransientBidirectionalReference(
> eReference);
>
> Or maybe I will jsut copy the whole method if everything is hopefully
> protected.
There is very little privacy in EMF...
>
> By the way, I saw that comments are very rare, also in the JDT code
> (which is not your topic).
It depends a little on which classes. I think many are quite nicely
documented. The XMI code definitely isn't in that category though.
There have been many authors and none were inclined to document much...
> This is hard to understand since EMF and JDT are so important parts of
> Eclipse. Is there any future plan to improve this situation or are
> there not enough resources?
No, it's unlikely...
> This is a very general question but I just have to ask it here, since
> this was one reason why I started this thread: just to know what this
> line does.


(no subject) [message #686878 is a reply to message #686717] Thu, 12 May 2011 13:12 Go to previous message
Ed Merks is currently offline Ed MerksFriend
Messages: 33136
Registered: July 2009
Senior Member
Comments below.

exquisitus wrote:
>
> When a XMI file is loaded, a parsed reference from an element A to an
> element B may be recorded as a so called "forward reference" if B is
> not yet parsed (since located further down the file). However if the
> reference has an eOpposite, its not always required to record that
> forward reference, since the reference will be set when the opposite
> reference is parsed (together with element B).
Yes, but will that properly preserve the order recorded in the XMI?
> The local Java variable "mustAdd" in XMLHandler determines this
> necessity whether a forward reference must be recorded or not.
>
> So I do not understand the statement
>
> mustAdd = eOpposite.isTransient() || eReference.isMany();
>
> Of course, if the eOpposite reference is transient, then the opposite
> reference will never be parsed (together with element B) and a forward
> reference must be added. However why must a forward reference be added
> if its a many-valued reference?
To preserve order.
> My only idea why this could be wanted is to preserve the order of the
> references. But then this statement would be more senseful:
>
> mustAdd = eOpposite.isTransient() || eReference.isMany()
> && eReference.isOrdered();
Why store it though if that stored order isn't important? It seems bad
to me that writing something and then reading it back in, doesn't
produce something that's EcoreUtil.equals.
>
> This statement would mean, that for metamodels using bidirectional
> references where order of references is not important no forward
> references would need to be recorded at all.
>
> Since reference are by default ordered, the new implementation would
> affect few projects, however would speed up for projects using the
> ordered=false feature. My goal is to write a CDO importer for XMI
> files of several gigs size, flushing already read elements at defined
> time intervals to the CDO database since CLEAN elements can be
> CDO-garbage collected (it doesnt matter if they get DIRTY later, when
> the opposite reference is read). And I fear the forward reference list
> could get very large.
I'm curious though why you need to persist it if order isn't important.
I guess that's more an issue how how CDO populates that feature's value?


Ed Merks
Professional Support: https://www.macromodeling.com/
(no subject) [message #686897 is a reply to message #686721] Fri, 13 May 2011 13:33 Go to previous message
John Smith is currently offline John SmithFriend
Messages: 137
Registered: July 2009
Senior Member
>> When a XMI file is loaded, a parsed reference from an element A to an
>> element B may be recorded as a so called "forward reference" if B is
>> not yet parsed (since located further down the file). However if the
>> reference has an eOpposite, its not always required to record that
>> forward reference, since the reference will be set when the opposite
>> reference is parsed (together with element B).
> Yes, but will that properly preserve the order recorded in the XMI?
>> The local Java variable "mustAdd" in XMLHandler determines this
>> necessity whether a forward reference must be recorded or not.
>>
>> So I do not understand the statement
>>
>> mustAdd = eOpposite.isTransient() || eReference.isMany();
>>
>> Of course, if the eOpposite reference is transient, then the opposite
>> reference will never be parsed (together with element B) and a forward
>> reference must be added. However why must a forward reference be added
>> if its a many-valued reference?
> To preserve order.
>> My only idea why this could be wanted is to preserve the order of the
>> references. But then this statement would be more senseful:
>>
>> mustAdd = eOpposite.isTransient() || eReference.isMany() &&
>> eReference.isOrdered();
> Why store it though if that stored order isn't important? It seems bad
> to me that writing something and then reading it back in, doesn't
> produce something that's EcoreUtil.equals.

Ok, then has the isOrdered property any relevance?

>>
>> This statement would mean, that for metamodels using bidirectional
>> references where order of references is not important no forward
>> references would need to be recorded at all.
>>
>> Since reference are by default ordered, the new implementation would
>> affect few projects, however would speed up for projects using the
>> ordered=false feature. My goal is to write a CDO importer for XMI
>> files of several gigs size, flushing already read elements at defined
>> time intervals to the CDO database since CLEAN elements can be
>> CDO-garbage collected (it doesnt matter if they get DIRTY later, when
>> the opposite reference is read). And I fear the forward reference list
>> could get very large.
> I'm curious though why you need to persist it if order isn't important.
> I guess that's more an issue how how CDO populates that feature's value?

No I just was wondering if my speculation was right, that the statement
has to do with ordering since it was not documented in the code. So if I
would write an OR mapping, then I am also required to preserve order for
the isOrdred=False case in order not to break with EcoreUtil.equals? I
already did write an OR mapping where I stored references as a Postgres
bit map, which cannot store order by nature (however this implementation
is already history, like so much which not gets open source :-).

Hm, just thinking: with my proposed implementation, doing a multiple
read/write of the XMI file (without changing anything), would
(1) some references go in circles
(2) nothing would change
(3) only the first write would change, the second no longer

if (3) is the case, would my proposal be interesting for you again (i
not know what case actually is true)?

my proposal also aims at the philosophy: cause as much chaos as you can,
to find bugs early. maybe this is not the best option for important code
like emf...

>>
> I'm curious though why you need to persist it if order isn't important.
> I guess that's more an issue how how CDO populates that feature's value?

I need to persist the order only if isOrdered=true, if isOrdered=false,
then order is not relevant to me. I only wanted to say, that my proposed
implementation would not break most of existing EMF applications since I
guess 99.9% of them use isOrdered=true, as it is the default.

It all comes down to the point whether to decide, that two ELists with
ordered=false shall be equal even if they have a different order. This
would be my intuive understanding of the ordered-property, and if that
is so, then the EcoreUtil.equals implementation has a severe bug.
(no subject) [message #686898 is a reply to message #686739] Fri, 13 May 2011 14:01 Go to previous message
Ed Merks is currently offline Ed MerksFriend
Messages: 33136
Registered: July 2009
Senior Member
Comments below.

exquisitus wrote:
>
> >> When a XMI file is loaded, a parsed reference from an element A to an
> >> element B may be recorded as a so called "forward reference" if B is
> >> not yet parsed (since located further down the file). However if the
> >> reference has an eOpposite, its not always required to record that
> >> forward reference, since the reference will be set when the opposite
> >> reference is parsed (together with element B).
> > Yes, but will that properly preserve the order recorded in the XMI?
> >> The local Java variable "mustAdd" in XMLHandler determines this
> >> necessity whether a forward reference must be recorded or not.
> >>
> >> So I do not understand the statement
> >>
> >> mustAdd = eOpposite.isTransient() || eReference.isMany();
> >>
> >> Of course, if the eOpposite reference is transient, then the opposite
> >> reference will never be parsed (together with element B) and a forward
> >> reference must be added. However why must a forward reference be added
> >> if its a many-valued reference?
> > To preserve order.
> >> My only idea why this could be wanted is to preserve the order of the
> >> references. But then this statement would be more senseful:
> >>
> >> mustAdd = eOpposite.isTransient() || eReference.isMany() &&
> >> eReference.isOrdered();
> > Why store it though if that stored order isn't important? It seems bad
> > to me that writing something and then reading it back in, doesn't
> > produce something that's EcoreUtil.equals.
>
> Ok, then has the isOrdered property any relevance?
Not a lot, no. I doubt I'd have put it there if it weren't in MOF when
we started...
>
> >>
> >> This statement would mean, that for metamodels using bidirectional
> >> references where order of references is not important no forward
> >> references would need to be recorded at all.
> >>
> >> Since reference are by default ordered, the new implementation would
> >> affect few projects, however would speed up for projects using the
> >> ordered=false feature. My goal is to write a CDO importer for XMI
> >> files of several gigs size, flushing already read elements at defined
> >> time intervals to the CDO database since CLEAN elements can be
> >> CDO-garbage collected (it doesnt matter if they get DIRTY later, when
> >> the opposite reference is read). And I fear the forward reference list
> >> could get very large.
> > I'm curious though why you need to persist it if order isn't important.
> > I guess that's more an issue how how CDO populates that feature's
> value?
>
> No I just was wondering if my speculation was right, that the
> statement has to do with ordering since it was not documented in the
> code. So if I would write an OR mapping, then I am also required to
> preserve order for the isOrdred=False case in order not to break with
> EcoreUtil.equals?
Yes, if it has a different order, it won't compare as equals. That's
not necessarily a problem. What do you need equals for anyway?
> I already wrote an OR mapping where I stored references as a
> Postgres bit map, which by nature cannot store order (however this
> implementation is already history, like so much that never gets open
> sourced :-).
>
> Hm, just thinking: with my proposed implementation, doing a multiple
> read/write of the XMI file (without changing anything), would
> (1) some references go in circles
> (2) nothing would change
> (3) only the first write would change, the second no longer
>
> If (3) is the case, would my proposal be interesting for you again (I
> do not know which case is actually true)?
After reading, all the references would end up in the order in which
their opposites are processed; if you saved and loaded that, it would
remain the same. Of course if you changed the structure in a way that
changed the order of the opposites, you'd be back at square one.
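
To make the round-trip argument concrete, here is a toy simulation, not EMF code. It assumes a single owner whose many-valued reference is rebuilt purely from the opposites, so each target is appended when its own element is parsed. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class RoundTripOrder {
    /**
     * Toy loader (hypothetical, not EMF): the owner's serialized order is
     * ignored, and each target is appended when its opposite reference is
     * parsed, i.e. in the order the targets appear in the file.
     */
    static List<String> loadViaOpposites(List<String> ownerOrderInFile,
                                         List<String> targetsInFileOrder) {
        List<String> loaded = new ArrayList<>();
        for (String target : targetsInFileOrder) {
            if (ownerOrderInFile.contains(target)) {
                loaded.add(target);
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        List<String> targetsInFileOrder = List.of("B1", "B2", "B3");
        List<String> original = List.of("B3", "B1", "B2");

        // First load reorders the list to follow the targets' file order.
        List<String> firstLoad = loadViaOpposites(original, targetsInFileOrder);
        // Saving writes firstLoad as the owner's order; the targets stay put.
        List<String> secondLoad = loadViaOpposites(firstLoad, targetsInFileOrder);

        System.out.println(firstLoad);                    // [B1, B2, B3]
        System.out.println(firstLoad.equals(secondLoad)); // true
    }
}
```

In this model only the first read changes the order, which corresponds to case (3) in the question, as long as the targets themselves do not move in the file.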
>
> My proposal also follows the philosophy: cause as much chaos as you
> can, to find bugs early. Maybe this is not the best option for
> important code like EMF...
It's always good to explore the dark corners.
>
> >>
> > I'm curious though why you need to persist it if order isn't important.
> > I guess that's more an issue of how CDO populates that feature's
> > value?
>
> I need to persist the order only if isOrdered=true; if
> isOrdered=false, then order is not relevant to me. I only wanted to
> say that my proposed implementation would not break most existing
> EMF applications, since I guess 99.9% of them use isOrdered=true, as it
> is the default.
Yes, I imagine that's true.
>
> It all comes down to deciding whether two ELists with
> ordered=false shall be equal even if they have a different order.
No, not so much. EList is a List, and List has a very well defined
definition for equals that can't be violated; even the implementation
for hashCode is carefully spelled out.
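
As a small illustration of that contract, using only `java.util` and nothing from EMF: two lists with the same elements in a different order are not equal, while the corresponding sets are. The class and helper names are made up for this sketch.

```java
import java.util.HashSet;
import java.util.List;

class ListEqualsContract {
    // List.equals is positional: same size and pairwise-equal elements.
    static boolean sameAsList(List<String> a, List<String> b) {
        return a.equals(b);
    }

    // Set.equals ignores order (and duplicates).
    static boolean sameAsSet(List<String> a, List<String> b) {
        return new HashSet<>(a).equals(new HashSet<>(b));
    }

    public static void main(String[] args) {
        List<String> ab = List.of("a", "b");
        List<String> ba = List.of("b", "a");

        System.out.println(sameAsList(ab, ba)); // false
        System.out.println(sameAsSet(ab, ba));  // true
        // List.hashCode is order-sensitive as well, so these two differ.
        System.out.println(ab.hashCode() == ba.hashCode()); // false
    }
}
```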
> This would be my intuitive understanding of the ordered property, and if
> that is so, then the EcoreUtil.equals implementation has a severe bug.
Trying to do an unordered comparison in EcoreUtil.equals would turn this
into a problem like EMF compare. Establishing a two-way mapping would
become extremely difficult and expensive. I'd never attempt that...
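
A rough sketch of why the unordered variant is more expensive, using a made-up `Node` tree type rather than EObjects, and not reflecting how EcoreUtil.equals is actually implemented. The ordered comparison pairs children by index; the unordered one has to search for a structural match for every child. Greedy matching is only sufficient here because these are plain trees; with cross references a mapping would have to be established and possibly revised.

```java
import java.util.ArrayList;
import java.util.List;

class StructuralCompare {
    /** Minimal tree node (hypothetical); deliberately does not override equals(). */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            children.addAll(List.of(kids));
        }
    }

    /** Pairs children by position: one recursive comparison per child. */
    static boolean orderedEquals(Node a, Node b) {
        if (!a.name.equals(b.name) || a.children.size() != b.children.size()) {
            return false;
        }
        for (int i = 0; i < a.children.size(); i++) {
            if (!orderedEquals(a.children.get(i), b.children.get(i))) {
                return false;
            }
        }
        return true;
    }

    /** Searches for a match per child: a nested loop of deep comparisons at every level. */
    static boolean unorderedEquals(Node a, Node b) {
        if (!a.name.equals(b.name) || a.children.size() != b.children.size()) {
            return false;
        }
        List<Node> unmatched = new ArrayList<>(b.children);
        for (Node child : a.children) {
            Node match = null;
            for (Node candidate : unmatched) {
                if (unorderedEquals(child, candidate)) {
                    match = candidate;
                    break;
                }
            }
            if (match == null) {
                return false;
            }
            unmatched.remove(match); // identity-based, since Node keeps Object.equals
        }
        return true;
    }

    public static void main(String[] args) {
        Node x = new Node("root", new Node("a"), new Node("b"));
        Node y = new Node("root", new Node("b"), new Node("a"));
        System.out.println(orderedEquals(x, y));   // false
        System.out.println(unorderedEquals(x, y)); // true
    }
}
```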


Ed Merks
Professional Support: https://www.macromodeling.com/
(no subject) [message #686899 is a reply to message #686739] Fri, 13 May 2011 14:36
John Smith
>Your idea of ignoring forward references when order isn't significant
>(isOrdered == false) isn't a bad one, but it begs the question of why
>have them (in the XMI/XML) in the first place if it's not significant.

My proposed implementation is meant purely as a speed optimization. It
should not change serialization in any way, for example by omitting
forward references when writing. That would also be a possible
optimization, but one that changes too much in my opinion.

> And, as I point out, to process a serialization that has
>enough information to preserve order (even when not significant) but
>then doesn't use that information begs more questions.
I would agree if the overhead of preserving order (which may not be
significant in the domain) were low. I have given you the use case: I
plan to read very large XMI files, and a bottleneck may be the
unnecessary storing of the forward reference list. It can grow to
hundreds of megabytes, since the XMI files store architectural
information of very big models (10 GB or so is not unusual). A
lightweight SAX-style parse will be bottlenecked by an ever-growing
forward reference list.
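
To illustrate the concern, here is a minimal sketch of forward reference bookkeeping in a streaming reader. It is not the XMLHandler implementation; the class, its methods and the `Node` type are made up. The point is only that every reference to a not-yet-parsed target stays in a list until the end of the document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ForwardReferenceSketch {
    static final class Node {
        final String id;
        final List<Node> refs = new ArrayList<>();
        Node(String id) { this.id = id; }
    }

    /** A reference whose target had not been parsed when the source was read. */
    private static final class Pending {
        final Node source;
        final int index;
        final String targetId;
        Pending(Node source, int index, String targetId) {
            this.source = source;
            this.index = index;
            this.targetId = targetId;
        }
    }

    private final Map<String, Node> byId = new HashMap<>();
    private final List<Pending> forwardReferences = new ArrayList<>();

    /** Called once per element, in file order. */
    void element(String id, List<String> referencedIds) {
        Node node = new Node(id);
        byId.put(id, node);
        for (String targetId : referencedIds) {
            Node target = byId.get(targetId);
            if (target == null) {
                // Target is further down the file: remember the slot to fill later.
                forwardReferences.add(new Pending(node, node.refs.size(), targetId));
            }
            node.refs.add(target); // null acts as a placeholder that keeps the position
        }
    }

    int pendingCount() {
        return forwardReferences.size();
    }

    /** Resolves all recorded forward references, preserving their positions. */
    void endDocument() {
        for (Pending p : forwardReferences) {
            Node target = byId.get(p.targetId);
            if (target == null) {
                throw new IllegalStateException("Unresolved reference to " + p.targetId);
            }
            p.source.refs.set(p.index, target);
        }
        forwardReferences.clear();
    }

    List<String> targetIds(String id) {
        List<String> ids = new ArrayList<>();
        for (Node n : byId.get(id).refs) {
            ids.add(n.id);
        }
        return ids;
    }
}
```

In this sketch the pending list only shrinks in `endDocument()`, so its size is bounded by the number of forward references in the whole file, not by the elements currently being processed.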

>As for doing equality comparison, if you look closely at how that's
>implemented, establishing a two-way mapping between the equal things
>is key in the processing. If we are to compare two lists, but have
>no idea how to go about establishing that mapping, i.e, the first
>element in the first list must be deep structurally compared to all
>elements in the other list, we aren't going to be able to produce such
>a map. Remember, we're not comparing things that are Java .equals so
>we can't simply implement the comparison as we would to compare two
>Sets for equality, i.e., check that each element of each set is
>contained in the other.

I am aware that comparing graphs is a complicated topic. As I remember,
graph isomorphism is one of the very few problems in NP for which it is
not known whether it is in P or NP-complete: the clever algorithms take
polynomial time on most inputs, yet no one knows whether a
polynomial-time algorithm exists for every case.

However, this is no problem at all for EcoreUtil.equals(), as it
treats sets as lists :). It should stay as it is now; users with special
comparison needs such as set comparison should use EMF Compare, I guess,
since otherwise EcoreUtil.equals() could take days for graphs of no more
than 50 nodes.

It would however be nice if switching to my implementation could
either be controlled by a flag, or if this single line could be
extracted into a method that I could easily override:

mustAdd = eOpposite.isTransient() ||
requireForwardReferenceForNonTransientBidirectionalReference( eReference);

Or maybe I will just copy the whole method, provided everything it uses
is protected (hopefully).
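
A sketch of what such a hook could look like. The `Ref` interface below is a stand-in that carries only the three properties discussed here; it is not the EMF `EReference` interface, and neither policy class exists in EMF. Only the method name is taken from the proposal above.

```java
class ForwardReferencePolicy {
    /** Stand-in for the reference properties used in the statement (hypothetical). */
    interface Ref {
        boolean isTransient();
        boolean isMany();
        boolean isOrdered();
    }

    /** Mirrors the existing statement: every many-valued reference is recorded. */
    static class DefaultPolicy {
        boolean mustAdd(Ref eOpposite, Ref eReference) {
            return eOpposite.isTransient()
                || requireForwardReferenceForNonTransientBidirectionalReference(eReference);
        }

        protected boolean requireForwardReferenceForNonTransientBidirectionalReference(Ref eReference) {
            return eReference.isMany();
        }
    }

    /** Proposed variant: skip many-valued references that are not ordered. */
    static class UnorderedAwarePolicy extends DefaultPolicy {
        @Override
        protected boolean requireForwardReferenceForNonTransientBidirectionalReference(Ref eReference) {
            return eReference.isMany() && eReference.isOrdered();
        }
    }

    /** Small factory for trying the policies out. */
    static Ref ref(boolean isTransient, boolean many, boolean ordered) {
        return new Ref() {
            public boolean isTransient() { return isTransient; }
            public boolean isMany() { return many; }
            public boolean isOrdered() { return ordered; }
        };
    }

    public static void main(String[] args) {
        Ref opposite = ref(false, false, true);
        Ref unorderedMany = ref(false, true, false);
        System.out.println(new DefaultPolicy().mustAdd(opposite, unorderedMany));        // true
        System.out.println(new UnorderedAwarePolicy().mustAdd(opposite, unorderedMany)); // false
    }
}
```

Since `&&` binds tighter than `||` in Java, the overriding variant computes the same value as the one-line statement proposed at the start of the thread.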

By the way, I saw that comments are very rare, also in the JDT code
(which is not your topic). This is hard to understand since EMF and JDT
are such important parts of Eclipse. Is there any plan to improve
this situation, or are there not enough resources? This is a very general
question but I just have to ask it here, since this was one reason why I
started this thread: just to know what this line does.
(no subject) [message #686900 is a reply to message #686741] Fri, 13 May 2011 15:22
Ed Merks
Comments below.

exquisitus wrote:
>
> >Your idea of ignoring forward references when order isn't significant
> >(isOrdered == false) isn't a bad one, but it begs the question of why
> >have them (in the XMI/XML) in the first place if it's not significant.
>
> My proposed implementation is meant purely as a speed optimization. It
> should not change serialization in any way, for example by omitting
> forward references when writing. That would also be a possible
> optimization, but one that changes too much in my opinion.
>
> > And, as I point out, to process a serialization that has
> >enough information to preserve order (even when not significant) but
> >then doesn't use that information begs more questions.
> I would agree if the overhead of preserving order (which may not be
> significant in the domain) were low. I have given you the use case: I
> plan to read very large XMI files, and a bottleneck may be the
> unnecessary storing of the forward reference list.
The thing that wrote the XMI was able to keep all things in memory, I
assume, right?
> It can grow to hundreds of megabytes, since the XMI files store
> architectural information of very big models (10 GB or so is not
> unusual). A lightweight SAX-style parse will be bottlenecked by an
> ever-growing forward reference list.
Best to worry about performance problems as they become apparent. More
often than not, the problems we think will be problems turn out to be
irrelevant and things we never imagined to be problematic turn out to be
the bottleneck...
>
> >As for doing equality comparison, if you look closely at how that's
> >implemented, establishing a two-way mapping between the equal things
> >is key in the processing. If we are to compare two lists, but have
> >no idea how to go about establishing that mapping, i.e., the first
> >element in the first list must be deep structurally compared to all
> >elements in the other list, we aren't going to be able to produce
> >such a map. Remember, we're not comparing things that are Java
> >.equals so we can't simply implement the comparison as we would to
> >compare two Sets for equality, i.e., check that each element of each
> >set is contained in the other.
>
> I am aware that comparing graphs is a complicated topic. As I
> remember, graph isomorphism is one of the very few problems in NP for
> which it is not known whether it is in P or NP-complete: the clever
> algorithms take polynomial time on most inputs, yet no one knows
> whether a polynomial-time algorithm exists for every case.
>
> However, this is no problem at all for EcoreUtil.equals(), as it
> treats sets as lists :).
Yep.
> It should stay as it is now; users with special comparison needs such
> as set comparison should use EMF Compare, I guess, since otherwise
> EcoreUtil.equals() could take days for graphs of no more than 50 nodes.
>
> It would however be nice if switching to my implementation could
> either be controlled by a flag, or if this single line could be
> extracted into a method that I could easily override:
>
> mustAdd = eOpposite.isTransient() ||
> requireForwardReferenceForNonTransientBidirectionalReference(
> eReference);
>
> Or maybe I will just copy the whole method, provided everything it uses
> is protected (hopefully).
There is very little privacy in EMF...
>
> By the way, I saw that comments are very rare, also in the JDT code
> (which is not your topic).
It depends a little on which classes. I think many are quite nicely
documented. The XMI code definitely isn't in that category though.
There have been many authors and none were inclined to document much...
> This is hard to understand since EMF and JDT are such important parts of
> Eclipse. Is there any plan to improve this situation, or are
> there not enough resources?
No, it's unlikely...
> This is a very general question but I just have to ask it here, since
> this was one reason why I started this thread: just to know what this
> line does.


Ed Merks
Professional Support: https://www.macromodeling.com/