Generating EMF from advanced XSDs [message #522520] Mon, 22 March 2010 21:28
Igor Jacy Lino Campista
Messages: 34
Registered: July 2009
Member
Hello EMF experts,

A challenging one.

I have spent quite a lot of time trying to generate a high-quality Ecore file from the DITA technical documentation schemas (XSD -> Ecore).


The DITA XSD set comes in two flavors: one with relative path dependencies, and the other simplified by using the Apache catalog resolver (the first time I have heard of it).


A zip file containing the XSD definitions is here:

http://www.oasis-open.org/committees/download.php/36335/xsd1.2-20100209.zip


Some questions arise:

How can I make the Ecore generator resolve the Apache catalog based XSDs?

What is the best and simplest strategy for generating a good ecore file from the XSDs when there are this many dependencies? (Doing XSD -> Ecore from the set of XSDs that have relative dependencies is not working.)



For the set of XSDs using Apache catalog URNs, here is a snippet from the file technicalContent\xsd\topic.xsd

<xs:include schemaLocation="urn:oasis:names:tc:dita:xsd:metaDeclMod.xsd:1.2"/>

The last question: Is there a way to generate the XSD to Ecore mapping from such an XSD structure?


Thanks in advance,
Igor
Re: Generating EMF from advanced XSDs [message #522525 is a reply to message #522520] Mon, 22 March 2010 21:52
Igor Jacy Lino Campista

I forgot to mention that I want to be able to work with instances of the model described by those XSDs. Those XSDs may also be derived or specialized.

I was thinking EMF can help me there, though I'm not sure at what level of abstraction. Would it make more sense to work at the level of the XSD meta-model instance rather than the XSD meta-model's model instance?

There is a nice article about the Apache catalog resolver and about why those URN namespaces exist to simplify XSDs:

http://xml.apache.org/commons/components/resolver/resolver-article.html


Thanks,
Igor
Re: Generating EMF from advanced XSDs [message #522661 is a reply to message #522520] Tue, 23 March 2010 14:08
Ed Merks
Messages: 33142
Registered: July 2009
Senior Member
Igor,

Comments below.

Igor Jacy Lino Campista wrote:
> Hello EMF experts,
>
> A challenging one.
>
> I have spent quite a lot of time trying to generate a high quality
> ECORE file out of the technical documentation DITA schemas
> (XSDs). XSD -> ECore
>
>
> The DITA XSD set comes in two flavors: one with relative path
> dependencies, and the other simplified by using Apache Catalog
> Resolver (1st time I heard of it).
>
>
> A zip file containing the XSD definitions is here:
>
> http://www.oasis-open.org/committees/download.php/36335/xsd1.2-20100209.zip
>
>
>
> Some questions arise:
>
> How can I make the apache catalog based XSDs be resolved by the Ecore
> generator ?
>
> What best and simple strategy can I use to generate a good ecore file
> from the XSDs when there are too many dependencies? (Trying to do
> XSD -> Ecore from the set of XSDs that have relative dependencies is
> not working)
>
>
>
> For the set of XSDs using apache catalog URLs, here is a snippet from
> the file technicalContent\xsd\topic.xsd
>
> <xs:include
> schemaLocation="urn:oasis:names:tc:dita:xsd:metaDeclMod.xsd:1.2"/>
>
> The last question: Is there a way to generate the XSD mapping to
> Ecore from such an XSD structure?
Why aren't you using the schemas with relative references? Those should
just work.
>
>
> Thanks in advance,
> Igor
>
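
For readers who only have the catalog-based schema set, one possible workaround (not something proposed in this thread) is to rewrite the URN-style schemaLocation values into relative file names before running the importer. The sketch below is a minimal illustration under stated assumptions: the CATALOG dictionary and the target file name "metaDeclMod.xsd" are hypothetical, and real entries would have to be taken from the catalog file shipped with the XSDs.

```python
import re

# Hypothetical mapping: URN used in schemaLocation -> relative file name.
# Real pairs would come from the catalog file distributed with the schemas.
CATALOG = {
    "urn:oasis:names:tc:dita:xsd:metaDeclMod.xsd:1.2": "metaDeclMod.xsd",
}

_SCHEMA_LOCATION = re.compile(r'schemaLocation="([^"]*)"')


def rewrite_schema_locations(xsd_text, catalog):
    """Replace catalog URNs in schemaLocation attributes with file names.

    Locations that are not listed in the catalog are left as they are
    (apart from surrounding whitespace being trimmed).
    """
    def substitute(match):
        location = match.group(1).strip()  # tolerate stray whitespace
        return 'schemaLocation="%s"' % catalog.get(location, location)

    return _SCHEMA_LOCATION.sub(substitute, xsd_text)
```

A plain text substitution is used here so that the rest of the schema file is left byte-for-byte untouched; a DOM-based rewrite would work as well but may reformat the document.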


Ed Merks
Professional Support: https://www.macromodeling.com/
Re: Generating EMF from advanced XSDs [message #522691 is a reply to message #522661] Tue, 23 March 2010 15:43
Igor Jacy Lino Campista

Ed,

In the past I have used XSD -> Ecore quite nicely (with mappings enabled in every case).

Focusing on the relative set of XSDs, I can generate an ecore if I don't enable the schema to ecore mappings.

If I enable the schema to ecore mappings, it goes into an infinite loop, or so it seems. I tried on a quad-core machine with 8 GB of RAM, and after an hour it still had not finished.

(With mappings disabled I get a result quickly.)
Re: Generating EMF from advanced XSDs [message #522765 is a reply to message #522691] Tue, 23 March 2010 20:54
Igor Jacy Lino Campista

The XSDs of this specification represent a very particular and challenging scenario.

Take concept.xsd as an example.

In DITA terminology, Concept redefines Topic.

concept.xsd has some xs:import elements that reference other namespaces and are suitable for referenced genmodels. But it also has a great many direct xs:include elements that simply belong to the same namespace.
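
To see how a given schema splits between xs:import, xs:include and xs:redefine, a small script can tally its top-level dependency elements. This is only a sketch using Python's standard library; the function name and the sample file names in the usage note are made up for illustration and are not taken from the DITA distribution.

```python
import xml.etree.ElementTree as ET

# Namespace of XML Schema itself, in ElementTree's "{uri}local" notation.
XS = "{http://www.w3.org/2001/XMLSchema}"


def schema_dependencies(xsd_text):
    """Return the schemaLocation values of the top-level import, include
    and redefine elements of one schema document, grouped by kind."""
    root = ET.fromstring(xsd_text)
    found = {"import": [], "include": [], "redefine": []}
    for child in root:
        if not child.tag.startswith(XS):
            continue  # not an XML Schema element
        kind = child.tag[len(XS):]
        if kind in found:
            # xs:import may legally omit schemaLocation; that yields None.
            found[kind].append(child.get("schemaLocation"))
    return found
```

For example, a schema containing one xs:import of "xml.xsd", one xs:include of "a.xsd" and one xs:redefine of "b.xsd" yields {'import': ['xml.xsd'], 'include': ['a.xsd'], 'redefine': ['b.xsd']}. Any non-empty 'redefine' list flags a schema that relies on xsd:redefine.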

Like "concept", there are also "task" and "reference".

The DITA model is meant to be specialized. That means other specializations are expected to appear later, for example an "article" that specializes topic.

Specializing means refining some elements and adding new ones that extend base elements (as "concept" does).

At the end of the day, they are all models with some constraints.

My intentions:
1) I want to work with instances of those models programmatically, while also being able to handle "specializations" that do not yet exist when the set of ecores is created.

2) Even if I create an ecore for "task", "concept", "topic" and "reference", each package needs its own ecore containing largely the same classes, because the schemas use includes within the same namespace (they are essentially derivatives of "topic").

BTW: topic.ecore is about 5.5 MB and reference.ecore is about 6.9 MB.

3) I want to create UI bindings to the elements (targeting different client technologies).

Now I have the feeling that simply doing XSD -> Ecore is not what EMF is meant for, and that I should just work in the XML and XSD domain.

Perhaps the only benefit would be customizing the ecore templates so that I could generate the UI bindings (using e4 concepts).

Any comments, suggestions or ideas?
Re: Generating EMF from advanced XSDs [message #522934 is a reply to message #522691] Wed, 24 March 2010 10:04
Ed Merks
Igor,

The code to create the mapping file has some nasty combinatorial
behavior that makes it unusable for large examples so don't use it.
It's only produced for browsing the mapping and is not otherwise needed.


Igor Jacy Lino Campista wrote:
>
> Ed,
>
> In the past I have used the XSD -> Ecore quite nicely. (With mappings
> enabled in every case)
>
> Focusing on the relative set of XSDs, I can generate an ecore if I
> don't enable the schema to ecore mappings.
>
> If I enable the schema to ecore mappings, it goes into an infinite
> loop, or so it seems. I tried on a quad-core machine with 8 GB of
> RAM, and after an hour it still had not finished.
>
> (if mappings are disabled I got a fast result)
>


Ed Merks
Professional Support: https://www.macromodeling.com/
Re: Generating EMF from advanced XSDs [message #522936 is a reply to message #522765] Wed, 24 March 2010 15:01
Ed Merks
Igor,

comment below.

Igor Jacy Lino Campista wrote:
>
> The XSDs of this specification represent a very particular and
> challenging scenario.
> Taking the concept.xsd as an example..
>
> In DITA terminology Concept redefines Topic.
If you really mean xsd:redefine, unfortunately that's an abomination
which is effectively unimplementable. It's been deprecated in the
XML Schema 1.1 specification and I've given up trying to implement it.
>
> concept.xsd has some xs:imports that are namespaced and are suitable
> for referenced genmodels. BUT it has many many direct xs:includes that
> are just part of the same namespace.
>
> Similar to "concept", there is also, "task", "reference".
> The DITA model is meant to be specialized. That means at some other
> point in time its expected that other specializations will appear, so
> for example "article" that specializes topic.
>
> Specializes means it refines some elements, and add some elements
> extending some base ones. (i.e. as "concept" does)
Extension should work fine.
>
> At the end of the day, they are all models with some constraints.
>
> My intentions:
> 1) I want to work with instances of those models programmatically,
> while at the same time being able to handle those "specializations"
> that basically does not exist when creating a set of ecores.
That sounds a bit like this:
http://ed-merks.blogspot.com/2008/01/creating-children-you-didnt-know.html
>
> 2) Even if I create an ecore for "task", "concept", "topic",
> "reference". Since its using includes in the same namespace, each
> package needs an ecore that has more or less lots of same classes.
> (because essentially they are derivates mostly of "topic")
>
> BTW: the topic.ecore is about 5.5 MB big. reference.ecore is about
> 6.9MB
I'm never quite sure how standards organizations can create things that
exceed the capacity of the human mind.
>
> 3) I want to create UI bindings to the elements (targeting different
> client technologies).
>
> Now, I have the feeling if I'm not using EMF for what is meant in the
> case of simply making the XSD->Ecore.
It all sounds fine except redefine is just not supportable and has no
analog in Java.
> And what I should do is simply work in the XML and XSD domain.
Yes, but XML is just so much syntactic noise, so how to work with it?
>
> Perhaps the only benefit could be by customizing the ecore templates
> so that I could generate the UI bindings (using e4 concepts).
I expect that in the next release we'll provide specialized support for e4.
> Any comments, suggestions or ideas?
I'm not sure exactly what the problem is. You should be able to
generate the models for all the schemas. For any given namespace, you'd
want to be sure to specify all the schemas that contribute to that
namespace so you end up with one EPackage per namespace.
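
As a rough aid for the advice above, the schema files can be grouped by their declared targetNamespace so that every file contributing to one namespace is selected together in the importer. This is a sketch only, using Python's standard library; the function name is invented here. One caveat it does not handle: a schema without a targetNamespace that is pulled in via xs:include takes on the namespace of the including schema, so the "" group may need to be redistributed by hand.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def group_by_namespace(schemas):
    """Group schema documents by their declared targetNamespace.

    `schemas` is an iterable of (name, xsd_text) pairs. Schemas that
    declare no targetNamespace are collected under the empty string.
    """
    groups = defaultdict(list)
    for name, xsd_text in schemas:
        root = ET.fromstring(xsd_text)
        groups[root.get("targetNamespace", "")].append(name)
    return dict(groups)


# Possible usage against a directory of schema files:
#   from pathlib import Path
#   pairs = ((p.name, p.read_bytes()) for p in Path("xsd").rglob("*.xsd"))
#   for namespace, files in group_by_namespace(pairs).items():
#       print(namespace or "(no namespace)", len(files))
```

Each resulting group corresponds to the set of files that would feed a single EPackage.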


Ed Merks
Professional Support: https://www.macromodeling.com/
Re: Generating EMF from advanced XSDs [message #522964 is a reply to message #522936] Wed, 24 March 2010 16:10
Igor Jacy Lino Campista
Thanks Ed for all the responses. I appreciate them a lot.

Yes, it's the infamous xsd:redefine indeed.

The OASIS DITA spec is indeed very big. I assume topics are redefined within the same namespace because that is the trick that makes specialization work in XML editors.

My problem is simple: do much more with less.

My main goal is to create open source tooling to work with DITA and perhaps DocBook.

I'm investigating the following (note that I don't expect any answers; it's just for the sake of completeness):
1) How best to work with the DITA model (considering its derivative nature).
2) Enabling graphical specializations.
3) How to deal with UI bindings for a WYSIWYG-like editor.
4) How to adapt e4 concepts into the UI mix.
5) Besides the WYSIWYG-like editor tab, there will be a tab with an enhanced XML editor.


I will most likely also have a fancy tree view that binds UI units/elements that subsequently bind to XML elements, or blocks of XML elements. XML comments have to be preserved in the serialization. Unknown XML elements will be kept but marked as errors (they won't affect the overall usability of the editor).
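
The two requirements in the paragraph above (comments survive a round trip, unknown elements are kept but flagged) can be illustrated outside Eclipse with a few lines of Python. This is only a sketch of the idea, not the planned editor: it assumes Python 3.8 or later for the insert_comments option, and the set of known element names is a stand-in for whatever the DITA schemas would actually allow.

```python
import xml.etree.ElementTree as ET


def parse_keeping_comments(xml_text):
    """Parse XML so that comments are kept as nodes in the tree
    (by default ElementTree silently drops them)."""
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    return ET.fromstring(xml_text, parser=parser)


def unknown_elements(root, known_tags):
    """Return the tags of elements not in known_tags, in document order.

    The elements stay in the tree; they are only reported, so a caller
    can mark them as errors without losing content.
    """
    return [
        element.tag
        for element in root.iter()
        # Comment nodes have a non-string tag, so they are skipped here.
        if isinstance(element.tag, str) and element.tag not in known_tags
    ]
```

Serializing the parsed tree again with ET.tostring(root, encoding="unicode") writes the comments back out, and the elements reported by unknown_elements are still present in the output.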


The DITA spec has big models and, as you may infer, they are not of the best quality. I'm almost convinced that working with many ecore files larger than 5 MB each does not make sense. Therefore my idea is to work directly with the XSD instances, reusing the Eclipse XML editor functionality, which should keep the solution practical.


Re: Generating EMF from advanced XSDs [message #523015 is a reply to message #522964] Wed, 24 March 2010 18:45
Ed Merks
Igor,

comments below.

Igor Jacy Lino Campista wrote:
> Thanks Ed for all the responses. I appreciate them a lot.
>
> Yes, its the infamous XSD:redefine indeed. :?
> OASIS DITA spec is indeed very big. I assume that the reason why
> topics are being redefined within the same namespace is the trick that
> makes specialization work in XML editors.
>
> My problem is simple: make much more with less. :twisted:
> My main goal is to create an open source tooling to work with DITA and
> perhaps DocBook. :x
> I'm investigating (note that don't expect to get any answer, its just
> for the sake of completeness):
> 1)How to best work with the DITA model. (considering its derivative
> nature)
> 2)Enabling graphical specializations.
> 3)How to deal with UI bindings for a WYSIWYG-like editor
> 4)How to adapt e4 concepts into the UI mix.
> 5)Besides the WYSIWYG-like editor tab, there will be a tab with an
> enhanced XML editor.
You might be better in this case to use WTP's structured editing framework.
>
>
> I will most likely have also a fancy tree view that binds UI
> units/elements that subsequently bind to XML elements, or blocks of
> XML elements. XML comments have to be respected in the serialization.
> Unknown XML elements will be kept but marked as errors (They won't
> affect the overall usability of the editor). 8o
>
> DITA spec has big models and as you may infer they are not of the best
> quality. I'm almost convinced that working with several/many 5MB>
> ecore files does not make sense. -> Therefore my idea to work directly
> with the XSD instances reusing the Eclipse XML editor functionality
> and will keep the solution practical.
Yes, that might well be better for such massive schemas and where you
want to push the XML in the user's face rather than hide it from them.
>
>
>


Ed Merks
Professional Support: https://www.macromodeling.com/