Eclipse Community Forums: EMF » Validation of invalid EMF ID's by default?

Validation of invalid EMF ID's by default? [message #429773]

Thu, 30 April 2009 23:39

Matt Seashore

Messages: 58
Registered: July 2009

Member

Earlier I was playing around with the reflective editor, getting ready
to show off some of the advantages of EMF over editing raw XML/JAXB to
my team. I created an example Ecore model with some devices identified
by MAC address and other objects with references to them. For the MAC
Address ID's, I used something like '01:23:45:67:89:ab'. It took me a
while to realize that the ':' in the ID above was invalid. The
reflective editor validated, saved, and appeared to load properly, but
upon closer inspection I realized all references to the MAC Address ID's
were being silently ignored upon load!

It makes sense that ID's have reserved characters, but perhaps there
should be some validation around these reserved characters (unless I'm
just missing it)? I would think that if I can validate and save a model
with the reflective editor, I should also be able to load it in the same
state (or at least get warnings/errors if something goes wrong). This
doesn't seem to be the case when referencing objects with invalid
characters in their ID's.

Thoughts on if this is a problem or the best way to fix? I know the
generated code could be tweaked to fix this (or it's easy enough to
avoid invalid chars), but having things just work out of the box conveys
all the right messages to potential users of EMF.

Keep up the great work!

-Matt

Re: Validation of invalid EMF ID's by default? [message #429775 is a reply to message #429773]

Thu, 30 April 2009 23:52

Ed Merks

Messages: 33140
Registered: July 2009

Senior Member

Matt,

Comments below.

Matt Seashore wrote:
> Earlier I was playing around with the reflective editor, getting ready
> to show off some of the advantages of EMF over editing raw XML/JAXB to
> my team.
:-)
> I created an example Ecore model with some devices identified by MAC
> address and other objects with references to them. For the MAC
> Address ID's, I used something like '01:23:45:67:89:ab'. It took me a
> while to realize that the ':' in the ID above was invalid.
Yes, in XML Schema an ID must be an NCName....
> The reflective editor validated, saved, and appeared to load properly,
> but upon closer inspection I realized all references to the MAC
> Address ID's were being silently ignored upon load!
Depending on the resource implementation, any particular form of ID
might or might not work well. For XMLResources, things like ":" and "#"
end up acting as "hard" indicators that the string represents a QName or
an anyURI....
>
> It makes sense that ID's have reserved characters, but perhaps there
> should be some validation around these reserved characters (unless I'm
> just missing it)?
Yes, though it's really the behavior of the derived ResourceImpl (i.e.,
XMLResourceImpl) that determines if something that could work will
actually work. For example, using a number will work fine, though it's
not a valid XML ID...
> I would think that if I can validate and save a model with the
> reflective editor, I should also be able to load it in the same state
> (or at least get warnings/errors if something goes wrong).
It's not a perfect world.
> This doesn't seem to be the case when referencing objects with invalid
> characters in their ID's.
Definitely ":" and "#" have a significant impact when serializing with
XMLResource, but other resource implementations won't necessarily have a
problem, e.g., BinaryResourceImpl.
>
> Thoughts on if this is a problem or the best way to fix? I know the
> generated code could be tweaked to fix this (or it's easy enough to
> avoid invalid chars), but having things just work out of the box
> conveys all the right messages to potential users of EMF.
As I said, it's just not a perfect world. Some people are quite happy
that their ill-formed IDs generally work with XML/XMIResourceImpl...
>
> Keep up the great work!
Flattery will get you everywhere. :-P
>
> -Matt

Ed Merks
Professional Support: https://www.macromodeling.com/
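
To make the effect concrete, here is a minimal Java sketch of the kind of decision described in this reply: a reference string containing "#" is taken for a URI, and one containing ":" is taken for a QName, so an ID like '01:23:45:67:89:ab' is never looked up as a plain ID. This only illustrates the behavior as described in the thread. It is not the actual XMLResource code, the order of the two checks is an assumption, and the class, enum, and method names are hypothetical.

```java
// Hypothetical illustration only; not taken from the EMF sources.
final class ReferenceTokenSketch {

    /** How a reference string might be interpreted on load. */
    enum Kind { URI, QNAME, PLAIN_ID }

    private ReferenceTokenSketch() {}

    /**
     * Classifies a reference string by the "hard" indicator characters:
     * '#' suggests a URI, ':' suggests a QName, anything else is treated
     * as a plain ID.
     */
    static Kind classify(String token) {
        if (token.indexOf('#') >= 0) {
            return Kind.URI;
        }
        if (token.indexOf(':') >= 0) {
            return Kind.QNAME;
        }
        return Kind.PLAIN_ID;
    }
}
```

Under this sketch, classify("01:23:45:67:89:ab") yields QNAME rather than PLAIN_ID, which matches the symptom of references to MAC-address IDs not resolving after a reload.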

Re: Validation of invalid EMF ID's by default? [message #429797 is a reply to message #429775]

Fri, 01 May 2009 17:34

Matt Seashore

Hey Ed,

Thanks for the informative reply. A few comments/questions below.

Ed Merks wrote:
> Matt,
>
> Comments below.
>
> Matt Seashore wrote:
>> Earlier I was playing around with the reflective editor, getting ready
>> to show off some of the advantages of EMF over editing raw XML/JAXB to
>> my team.
> :-)
>> I created an example Ecore model with some devices identified by MAC
>> address and other objects with references to them. For the MAC
>> Address ID's, I used something like '01:23:45:67:89:ab'. It took me a
>> while to realize that the ':' in the ID above was invalid.
> Yes, in XML Schema an ID must be an NCName....
>> The reflective editor validated, saved, and appeared to load properly,
>> but upon closer inspection I realized all references to the MAC
>> Address ID's were being silently ignored upon load!
> Depending on the resource implementation, any particular form of ID
> might or might not work well. For XMLResources, things like ":" and "#"
> end up acting as "hard" indicators that the string represents a QName or
> an anyURI....
So, is the problem that ':' and '#' aren't truly invalid in all cases
and so the XMLResource can't figure out if these references are
'invalid' on either load or save? From a simplistic view, it would seem
XMLResource/XMLSaveImpl would have enough information on save to
determine if a referenced ID was invalid and log/throw an exception.

>>
>> It makes sense that ID's have reserved characters, but perhaps there
>> should be some validation around these reserved characters (unless I'm
>> just missing it)?
> Yes, though it's really the behavior of the derived ResourceImpl (i.e.,
> XMLResourceImpl) that determines if something that could work will
> actually work. For example, using a number will work fine, though it's
> not a valid XML ID...
>> I would think that if I can validate and save a model with the
>> reflective editor, I should also be able to load it in the same state
>> (or at least get warnings/errors if something goes wrong).
> It's not a perfect world.
:-)
>> This doesn't seem to be the case when referencing objects with invalid
>> characters in their ID's.
> Definitely ":" and "#" have a significant impact when serializing with
> XMLResource, but other resource implementations won't necessarily have a
> problem, e.g., BinaryResourceImpl.
I guess I was considering just the use case of using the default Sample
Reflective Editor bundled with EMF+Eclipse which (I think?) is only
usable with XMI/XML. I suppose that's not the only way it's used and
thus customizing the Sample Editor to make any assumptions about the
Resource implementation would be a bad idea:)

>> Thoughts on if this is a problem or the best way to fix? I know the
>> generated code could be tweaked to fix this (or it's easy enough to
>> avoid invalid chars), but having things just work out of the box
>> conveys all the right messages to potential users of EMF.
> As I said, it's just not a perfect world. Some people are quite happy
> that their ill-formed IDs generally work with XML/XMIResourceImpl...
I would imagine users are happy they can store their crazy ID's with
XMI/XML...I don't think they'll be as excited when they try to reference
their invalid IDs, only to have the references disappear silently upon
reload with XMI/XML. No one's happy about data loss :)

But, as you say, it's not a perfect world and we have to pick our
battles. If Ed Merks says that the cost of this minor usability
enhancement is too high or conflicts with other
functionality/priorities...I'll believe you. Thanks for your time!

-Matt
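
A minimal Java sketch of the kind of pre-save check suggested above, assuming the only characters of concern are the ":" and "#" discussed in this thread. The class and method names are hypothetical, and how the ID strings are collected is left out; in an EMF application they could, for instance, be read with EcoreUtil.getID(EObject) while walking the resource contents.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of EMF.
final class IdPreSaveCheck {

    private IdPreSaveCheck() {}

    /**
     * Returns the IDs that contain ':' or '#', in their original order.
     * Null entries (objects without an ID) are skipped.
     */
    static List<String> findRiskyIds(Iterable<String> ids) {
        List<String> risky = new ArrayList<>();
        for (String id : ids) {
            if (id != null && (id.indexOf(':') >= 0 || id.indexOf('#') >= 0)) {
                risky.add(id);
            }
        }
        return risky;
    }
}
```

A caller could log a warning or refuse to save when the returned list is not empty. Note that the check scans every character of every ID, so it adds work to each save.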

Re: Validation of invalid EMF ID's by default? [message #429803 is a reply to message #429797]

Fri, 01 May 2009 18:56

Ed Merks

Matt,

Comments below.

Matt Seashore wrote:
> Hey Ed,
>
> Thanks for the informative reply. A few comments/questions below.
>
> Ed Merks wrote:
>> Matt,
>>
>> Comments below.
>>
>> Matt Seashore wrote:
>>> Earlier I was playing around with the reflective editor, getting
>>> ready to show off some of the advantages of EMF over editing raw
>>> XML/JAXB to my team.
>> :-)
>>> I created an example Ecore model with some devices identified by
>>> MAC address and other objects with references to them. For the MAC
>>> Address ID's, I used something like '01:23:45:67:89:ab'. It took me
>>> a while to realize that the ':' in the ID above was invalid.
>> Yes, in XML Schema an ID must be an NCName....
>>> The reflective editor validated, saved, and appeared to load
>>> properly, but upon closer inspection I realized all references to
>>> the MAC Address ID's were being silently ignored upon load!
>> Depending on the resource implementation, any particular form of ID
>> might or might not work well. For XMLResources, things like ":" and
>> "#" end up acting as "hard" indicators that the string represents a
>> QName or an anyURI....
> So, is the problem that ':' and '#' aren't truly invalid in all cases
> and so the XMLResource can't figure out if these references are
> 'invalid' on either load or save?
No, the problem is that XMLResource specifically, if it sees a #, says, "I
know, that must be a URI, not an ID"; if it sees no # but a :, it says,
"hey, this must be a QName indicating a type"...
> From a simplistic view, it would seem XMLResource/XMLSaveImpl would
> have enough information on save to determine if a referenced ID was
> invalid and log/throw an exception.
It could, but pawing through all the characters is not free. Better that
the model use a data type that converts bad characters to good ones...
>
>>>
>>> It makes sense that ID's have reserved characters, but perhaps there
>>> should be some validation around these reserved characters (unless
>>> I'm just missing it)?
>> Yes, though it's really the behavior of the derived ResourceImpl
>> (i.e., XMLResourceImpl) that determines if something that could work
>> will actually work. For example, using a number will work fine,
>> though it's not a valid XML ID...
>>> I would think that if I can validate and save a model with the
>>> reflective editor, I should also be able to load it in the same
>>> state (or at least get warnings/errors if something goes wrong).
>> It's not a perfect world.
> :-)
>>> This doesn't seem to be the case when referencing objects with invalid
>>> characters in their ID's.
>> Definitely ":" and "#" have a significant impact when serializing
>> with XMLResource, but other resource implementations won't
>> necessarily have a problem, e.g., BinaryResourceImpl.
> I guess I was considering just the use case of using the default
> Sample Reflective Editor bundled with EMF+Eclipse which (I think?) is
> only usable with XMI/XML. I suppose that's not the only way it's used
> and thus customizing the Sample Editor to make any assumptions about
> the Resource implementation would be a bad idea:)
Yep.
>
>>> Thoughts on if this is a problem or the best way to fix? I know the
>>> generated code could be tweaked to fix this (or it's easy enough to
>>> avoid invalid chars), but having things just work out of the box
>>> conveys all the right messages to potential users of EMF.
>> As I said, it's just not a perfect world. Some people are quite
>> happy that their ill-formed IDs generally work with
>> XML/XMIResourceImpl...
> I would imagine users are happy they can store their crazy ID's with
> XMI/XML...I don't think they'll be as excited when they try to
> reference their invalid IDs, only to have the references disappear
> silently upon reload with XMI/XML. No one's happy about data loss :)
It's hard to make everyone happy!
>
> But, as you say, it's not a perfect world and we have to pick our
> battles. If Ed Merks says that the cost of this minor usability
> enhancement is too high or conflicts with other
> functionality/priorities...I'll believe you. Thanks for your time!
Unfortunately, character-at-a-time processing is generally expensive
already, and additional passes just make it worse. If you define your
own data type that encodes and decodes characters like # and :, then you
could still have those in the UI but have them serialized differently.

Ed Merks
Professional Support: https://www.macromodeling.com/
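
As a rough illustration of the encode/decode idea suggested above, here is a minimal Java sketch of a reversible conversion that keeps ":" and "#" out of the serialized form. The escape scheme ('_' followed by two hex digits) and all names are assumptions made for this example, not something EMF prescribes. In a generated model, conversions like these would typically be called from the factory's create...FromString and convert...ToString methods for a custom data type; that wiring is not shown here. The sketch also does not try to produce a fully valid NCName (a leading digit is left as it is).

```java
// Hypothetical codec; the '_' + two-hex-digit scheme is an assumption.
final class IdCodec {

    private static final char ESCAPE = '_';

    private IdCodec() {}

    /** Replaces '_', ':' and '#' with '_' followed by two uppercase hex digits. */
    static String encode(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 8);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == ESCAPE || c == ':' || c == '#') {
                out.append(ESCAPE).append(String.format("%02X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Reverses encode(); throws IllegalArgumentException on a malformed escape. */
    static String decode(String encoded) {
        StringBuilder out = new StringBuilder(encoded.length());
        int i = 0;
        while (i < encoded.length()) {
            char c = encoded.charAt(i);
            if (c == ESCAPE) {
                if (i + 3 > encoded.length()) {
                    throw new IllegalArgumentException("Truncated escape at index " + i);
                }
                // Two hex digits after the escape character give the original code point.
                out.append((char) Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 3;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }
}
```

With this scheme, '01:23:45:67:89:ab' would be stored as '01_3A23_3A45_3A67_3A89_3Aab' and decoded back to the original on load. The escape character itself is also escaped so that IDs already containing '_' survive the round trip.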
