Eclipse Community Forums: EMF » Reading CDATA content into an EMF model

Reading CDATA content into an EMF model [message #425897]

Thu, 11 December 2008 06:26

Eclipse User

Hello EMF community,

we're using EMF to read XML files (provided by some other black box
software) into a model. The XML files look something like this one:
<?xml version="1.0" encoding="UTF-8"?>
<rn:structure xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI"
    xmlns:rn="http://www.foo.bar/rn">
  <node name="Foo">
    <documentation><![CDATA[Some text with lots of special chars.]]></documentation>
  </node>
</rn:structure>

Now, I'm having some trouble getting the CDATA value into the model's
documentation feature. The relevant Ecore model part is as follows:
<eClassifiers xsi:type="ecore:EClass" name="Node">
  <eStructuralFeatures xsi:type="ecore:EAttribute" name="documentation"
      upperBound="-1"
      eType="ecore:EDataType http://www.eclipse.org/emf/2002/Ecore#//EString"/>
</eClassifiers>

With the above, I think EMF doesn't behave correctly or I did something
wrong in the Ecore-file. When opening the above XML model with my
generated Editor, EMF reads in the value of the CDATA as desired. In the
properties view I see "Some text with lots of special chars." as value of
the documentation feature.

Strange behavior #1:
When I alter the loaded model using the EMF generated editor and save it,
the <![CDATA[]]> has disappeared from the XML file.

Strange behavior #2:
Using an XML or text editor, I remove the <documentation> tags and then
reopen the XML file in my EMF model editor. There I set the
documentation feature's value to "<![CDATA[Some <really> fönny and wíred
text with lots of special chars.]]>" and save the file. Now the '<'
characters are transformed to "&lt;", but the '>' characters are not.
Additionally, the special characters are transformed to some UTF-8
encoding. What's definitely no good is that the file I just saved can no
longer be opened with the EMF editor.

Can anybody give me a hint on how to read the CDATA content into my
model's documentation feature without losing the "<![CDATA[" and "]]>"
markers?

Any help is appreciated.

I'm using Eclipse Version: 3.4.1 with EMF Ecore 2.4.1.

Thx for reading,
Rob

Re: Reading CDATA content into an EMF model [message #425905 is a reply to message #425897]

Thu, 11 December 2008 09:51

Eclipse User

Robert,

Long time no see...

Comments below.

Robert Wloch wrote:
> Hello EMF community,
>
> we're using EMF to read XML files (provided by some other black box
> software) into a model. The XML files look something like this one:
> <?xml version="1.0" encoding="UTF-8"?>
> <rn:structure xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI"
> xmlns:rn="http://www.foo.bar/rn">
> <node name="Foo">
> <documentation><![CDATA[Some text with lots of special
> chars.]]></documentation>
> </node>
> </rn:structure>
>
> Now, I'm having some trouble getting the CDATA value into the model's
> documentation feature. The relevant Ecore model part is as follows:
> <eClassifiers xsi:type="ecore:EClass" name="Node">
> <eStructuralFeatures xsi:type="ecore:EAttribute"
> name="documentation" upperBound="-1"
> eType="ecore:EDataType
> http://www.eclipse.org/emf/2002/Ecore#//EString"/>
> </eClassifiers>
>
>
> With the above, I think EMF doesn't behave correctly or I did
> something wrong in the Ecore-file. When opening the above XML model
> with my generated Editor, EMF reads in the value of the CDATA as
> desired. In the properties view I see "Some text with lots of special
> chars." as value of the documentation feature.
>
> Strange behavior #1:
> When a alter the read in model using the EMF generated editor and save
> it, the <![CDATA[]]> disappeared in the XML file.
Yes, the final EMF model has no idea whether the data was specified
using CDATA or entities.
>
> Strange behavior #2:
> Using an XML or text editor I remove the <documentation>-tags and then
> reopen the XML file using my EMF model editor. There I set the
> documentation feature's value to "<![CDATA[Some <really> f

Re: Reading CDATA content into an EMF model [message #425930 is a reply to message #425905]

Fri, 12 December 2008 04:33

Eclipse User

Ed,

Yes, long time no see. But I've been following your "Ed-around-the-world"
tour on your blog. ;-)

Answers and questions below.

Ed Merks wrote:

> Robert,

> Long time no see...

> Comments below.

> Robert Wloch wrote:
>> Hello EMF community,
>>
>> we're using EMF to read XML files (provided by some other black box
>> software) into a model. The XML files look something like this one:
>> <?xml version="1.0" encoding="UTF-8"?>
>> <rn:structure xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI"
>> xmlns:rn="http://www.foo.bar/rn">
>> <node name="Foo">
>> <documentation><![CDATA[Some text with lots of special
>> chars.]]></documentation>
>> </node>
>> </rn:structure>
>>
>> Now, I'm having some trouble getting the CDATA value into the model's
>> documentation feature. The relevant Ecore model part is as follows:
>> <eClassifiers xsi:type="ecore:EClass" name="Node">
>> <eStructuralFeatures xsi:type="ecore:EAttribute"
>> name="documentation" upperBound="-1"
>> eType="ecore:EDataType
>> http://www.eclipse.org/emf/2002/Ecore#//EString"/>
>> </eClassifiers>
>>
>>
>> With the above, I think EMF doesn't behave correctly or I did
>> something wrong in the Ecore-file. When opening the above XML model
>> with my generated Editor, EMF reads in the value of the CDATA as
>> desired. In the properties view I see "Some text with lots of special
>> chars." as value of the documentation feature.
>>
>> Strange behavior #1:
>> When a alter the read in model using the EMF generated editor and save
>> it, the <![CDATA[]]> disappeared in the XML file.
> Yes, the final EMF model has no idea whether the data was specified
> using CDATA or entities.
>>
>> Strange behavior #2:
>> Using an XML or text editor I remove the <documentation>-tags and then
>> reopen the XML file using my EMF model editor. There I set the
>> documentation feature's value to "<![CDATA[Some <really> fönny and
>> wíred text with lots of special chars.]]>" and save the file. Now, the
>> '<' get transformed to "%lt;", however, not so the '>' symbols.
> Yes, it's not actually necessary to escape the closing '>' in an
> attribute value.
Confirmed.
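
This is easy to check with a plain JDK parser. The following is a minimal sketch, not EMF code; the class name `AttributeEscaping` and the attribute name `v` are made up for illustration. An unescaped '>' inside an attribute value parses fine, while an unescaped '<' makes the document not well-formed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo class: which characters must be escaped in an attribute value. */
class AttributeEscaping {

    /** Returns the value of attribute "v" on the root element, or null if the XML is not well-formed. */
    static String attributeValue(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            return builder.parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getAttribute("v");
        } catch (SAXException e) {
            return null; // the parser rejected the document
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(attributeValue("<node v=\"a &lt; b > c\"/>")); // prints: a < b > c
        System.out.println(attributeValue("<node v=\"a < b\"/>"));        // prints: null
    }
}
```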

> Doesn't EMF write out a value as an attribute by default.
Is this actually a question? If so: in my case I did the trick by
specifying upper bound = -1, which results in the attribute being written
out as a child element instead of an attribute of the parent element.

>> Additionally, the special characters are transformed to some UTF-8
>> encoding. What's definitely no good, is that the just saved file
>> cannot be opened with the EMF editor anymore.
> There are some characters that simply aren't allowed in XML at all.
> Like the zero byte ASCII character.
Okay, I'll pass this on to the black box developers just to make sure.

>>
>> Can anybody give me a hint on how to read in the CDATA stuff into my
>> model's documentation feature without loosing the "<![CDATA[" and
>> "]]>" characters?
> This option might produce what you want:

> /**
> * Serialized element content that needs escaping and doesn't
> contain <code>"]]>"</code>, will be escaped using CDATA.
> * The default value is false.
> * @since {@link #OPTION_SKIP_ESCAPE}
> * @since 2.4
> */
> String OPTION_ESCAPE_USING_CDATA = "ESCAPE_USING_CDATA";
I read about this option somewhere else yesterday. However, I couldn't
find the right place to set it. Can I do this in the genmodel, or do I
need to modify a generated class or something else?

> Is the actual model derived from a schema and you're using the generated
> resource for load and save?
I designed an Ecore file via reverse engineering since there's no schema
available for the model. But we do use the generated resource for loading,
in combination with oaW to generate an Xtext editor.
Since I'm able to read in the CDATA content, everything's okay for now. I
was just evaluating how to save the resource, as that's a very likely
requirement that our customer may raise in the near future.

Re: Reading CDATA content into an EMF model [message #425942 is a reply to message #425930]

Fri, 12 December 2008 16:49

Eclipse User

Robert,

Comments below.

Robert Wloch wrote:
> Ed,
>
> Yes, long time no see. But I've been following your
> "Ed-around-the-world" tour on your blog. ;-)
>
> Answers and questions below.
>
>
> Ed Merks wrote:
>
>> Robert,
>
>> Long time no see...
>
>> Comments below.
>
>
>> Robert Wloch wrote:
>>> Hello EMF community,
>>>
>>> we're using EMF to read XML files (provided by some other black box
>>> software) into a model. The XML files look something like this one:
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <rn:structure xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI"
>>> xmlns:rn="http://www.foo.bar/rn">
>>> <node name="Foo">
>>> <documentation><![CDATA[Some text with lots of special
>>> chars.]]></documentation>
>>> </node>
>>> </rn:structure>
>>>
>>> Now, I'm having some trouble getting the CDATA value into the
>>> model's documentation feature. The relevant Ecore model part is as
>>> follows:
>>> <eClassifiers xsi:type="ecore:EClass" name="Node">
>>> <eStructuralFeatures xsi:type="ecore:EAttribute"
>>> name="documentation" upperBound="-1"
>>> eType="ecore:EDataType
>>> http://www.eclipse.org/emf/2002/Ecore#//EString"/>
>>> </eClassifiers>
>>>
>>>
>>> With the above, I think EMF doesn't behave correctly or I did
>>> something wrong in the Ecore-file. When opening the above XML model
>>> with my generated Editor, EMF reads in the value of the CDATA as
>>> desired. In the properties view I see "Some text with lots of
>>> special chars." as value of the documentation feature.
>>>
>>> Strange behavior #1:
>>> When a alter the read in model using the EMF generated editor and
>>> save it, the <![CDATA[]]> disappeared in the XML file.
>> Yes, the final EMF model has no idea whether the data was specified
>> using CDATA or entities.
>>>
>>> Strange behavior #2:
>>> Using an XML or text editor I remove the <documentation>-tags and
>>> then reopen the XML file using my EMF model editor. There I set the
>>> documentation feature's value to "<![CDATA[Some <really> fönny and
>>> wíred text with lots of special chars.]]>" and save the file. Now,
>>> the '<' get transformed to "%lt;", however, not so the '>' symbols.
>> Yes, it's not actually necessary to escape the closing '>' in an
>> attribute value.
> Confirmed.
>
>> Doesn't EMF write out a value as an attribute by default.
> Is this actually a question? If so: In my case I did the trick by
> specifying upper bound = -1 resulting the attribute being written out
> as a child tag instead of an attribute of the parent tag.
Yes, multi-valued features will by default be elements...
>
>>> Additionally, the special characters are transformed to some UTF-8
>>> encoding. What's definitely no good, is that the just saved file
>>> cannot be opened with the EMF editor anymore.
>> There are some characters that simply aren't allowed in XML at all.
>> Like the zero byte ASCII character.
> Okay, I'll provide this to the black box developers just to make sure.
>
>>>
>>> Can anybody give me a hint on how to read in the CDATA stuff into my
>>> model's documentation feature without loosing the "<![CDATA[" and
>>> "]]>" characters?
>> This option might produce what you want:
>
>> /**
>> * Serialized element content that needs escaping and doesn't
>> contain <code>"]]>"</code>, will be escaped using CDATA.
>> * The default value is false.
>> * @since {@link #OPTION_SKIP_ESCAPE}
>> * @since 2.4
>> */
>> String OPTION_ESCAPE_USING_CDATA = "ESCAPE_USING_CDATA";
> I've read about this option somewhere else yesterday. However, I
> didn't find the right place to set it. Can I do this in the genmodel,
> or do I need to modify a generated class or else?
You'd generally modify the generated XyzResourceFactoryImpl. If you
don't have one, you'd change the GenPackage's Resource Type, delete
the plugin.xml so it's regenerated, and regenerate so you do have one.
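
To make the rule in the quoted Javadoc concrete, here is a minimal sketch in plain Java. It is not EMF's implementation: `ElementTextEscaper` and `escape` are hypothetical names, and the sketch ignores characters that XML forbids outright. Text that needs escaping and does not contain "]]>" is wrapped in a CDATA section; otherwise entity references are used.

```java
/** Hypothetical helper, not EMF code: chooses between CDATA and entity escaping. */
class ElementTextEscaper {

    /** Escapes text for use as element content, preferring a CDATA section when possible. */
    static String escape(String text) {
        boolean containsCdataEnd = text.contains("]]>");
        boolean needsEscaping =
                text.indexOf('<') >= 0 || text.indexOf('&') >= 0 || containsCdataEnd;
        if (!needsEscaping) {
            return text; // nothing to do
        }
        if (!containsCdataEnd) {
            // Safe to wrap: the text cannot terminate the CDATA section early.
            return "<![CDATA[" + text + "]]>";
        }
        // Fall back to entity references; '&' must be replaced first.
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace("]]>", "]]&gt;");
    }

    public static void main(String[] args) {
        System.out.println(escape("plain text"));        // prints: plain text
        System.out.println(escape("Some <really> text")); // prints: <![CDATA[Some <really> text]]>
        System.out.println(escape("x ]]> y & z"));        // prints: x ]]&gt; y &amp; z
    }
}
```

In EMF the option itself would presumably go into the options that the generated resource factory configures for saving; the sketch only shows the kind of output to expect.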
>
>> Is the actual model derived from a schema and you're using the
>> generated resource for load and save?
> I designed an ecore file via reverse engineering since there's no
> schema available for the model. But we do use the generated resource
> for load in combination with oaW to generate an Xtext editor.
> Since I'm able to read in the CDATA stuff everything's okay for now. I
> was just evaluating how to save the resource as that's a very likely
> requirement that may come up by our customer in the near future.
No application should ever care whether CDATA or entities were used. Users
might be bothered, but applications shouldn't care...
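
The point can be illustrated with a plain JDK DOM parser. This is a minimal sketch, not EMF code, and `CdataVsEntities` is a made-up class name. Both spellings of the documentation element yield the same string once parsed, which is all an application normally sees.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical demo class: CDATA and entity escaping are two spellings of the same text. */
class CdataVsEntities {

    /** Parses a small XML document and returns the text content of its root element. */
    static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String viaCdata = rootText(
                "<documentation><![CDATA[Some <really> special text]]></documentation>");
        String viaEntities = rootText(
                "<documentation>Some &lt;really&gt; special text</documentation>");
        System.out.println(viaCdata.equals(viaEntities)); // prints: true
    }
}
```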

Re: Reading CDATA content into an EMF model [message #425976 is a reply to message #425942]

Mon, 15 December 2008 03:17

Eclipse User

Thanks for your help and clarification, Ed!

Re: Reading CDATA content into an EMF model [message #426049 is a reply to message #425976]

Mon, 15 December 2008 07:34

Eclipse User

Hi

I have the same problem with comments in an XML file. When I read an
XML file using the EMF generated editor and save it, all the comments
have disappeared from the saved file. Do you know how I could keep the
comments?

Thanks a lot

Mabu

Re: Reading CDATA content into an EMF model [message #426053 is a reply to message #426049]

Mon, 15 December 2008 07:52

Eclipse User

Mabu,

Comments will only be preserved within complex types with mixed content
and only then within the mixed content itself.
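
As a rough illustration of what mixed content looks like to a parser, here is a minimal sketch using only the JDK's DOM API. It is not EMF code, and `MixedContentNodes` is a made-up name. In a mixed-content element, text, comments, child elements and CDATA sections are interleaved siblings, so a model has to keep that ordered sequence if comments are to survive a load and save.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical demo class: lists the kinds of child nodes inside a mixed-content element. */
class MixedContentNodes {

    /** Returns the kind of each child of the root element, in document order. */
    static List<String> childKinds(String xml) {
        try {
            NodeList children = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getChildNodes();
            List<String> kinds = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                switch (child.getNodeType()) {
                    case Node.TEXT_NODE: kinds.add("text"); break;
                    case Node.CDATA_SECTION_NODE: kinds.add("cdata"); break;
                    case Node.COMMENT_NODE: kinds.add("comment"); break;
                    case Node.ELEMENT_NODE: kinds.add("element:" + child.getNodeName()); break;
                    default: kinds.add("other");
                }
            }
            return kinds;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // prints: [text, comment, element:b, cdata]
        System.out.println(childKinds(
                "<para>See <!-- reviewed --><b>this</b><![CDATA[ & that]]></para>"));
    }
}
```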

Re: Reading CDATA content into an EMF model [message #426060 is a reply to message #426053]

Mon, 15 December 2008 09:13

Eclipse User

Could you explain that a bit more? Do you have an example?

Re: Reading CDATA content into an EMF model [message #426066 is a reply to message #426060]

Mon, 15 December 2008 10:50

Eclipse User

Mabu,

This article will likely help:

Binding XML to Java
<http://www.theserverside.com/tt/articles/article.tss?l=BindingXMLJava>

Re: Reading CDATA content into an EMF model [message #426074 is a reply to message #426060]

Mon, 15 December 2008 11:40

Eclipse User

I found the solution, just add the attribute mixed="true" in the schema
and re-generate it. Thanks

Re: Reading CDATA content into an EMF model [message #426076 is a reply to message #426074]

Mon, 15 December 2008 12:33

Eclipse User

Mabu,

Alternatively you can use ecore:mixed="true" so you don't actually
change the schema itself.

Re: Reading CDATA content into an EMF model [message #427624 is a reply to message #425905]

Tue, 24 February 2009 12:14

Eclipse User

Hello,

Ed Merks wrote:
> Robert,
>
> Long time no see...
>
> Comments below.
>
>
> Robert Wloch wrote:
>> Hello EMF community,
>>
>> we're using EMF to read XML files (provided by some other black box
>> software) into a model. The XML files look something like this one:
>> <?xml version="1.0" encoding="UTF-8"?>
>> <rn:structure xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI"
>> xmlns:rn="http://www.foo.bar/rn">
>> <node name="Foo">
>> <documentation><![CDATA[Some text with lots of special
>> chars.]]></documentation>
>> </node>
>> </rn:structure>
>>
>> Now, I'm having some trouble getting the CDATA value into the model's
>> documentation feature. The relevant Ecore model part is as follows:
>> <eClassifiers xsi:type="ecore:EClass" name="Node">
>> <eStructuralFeatures xsi:type="ecore:EAttribute"
>> name="documentation" upperBound="-1"
>> eType="ecore:EDataType
>> http://www.eclipse.org/emf/2002/Ecore#//EString"/>
>> </eClassifiers>
>>
>>
>> With the above, I think EMF doesn't behave correctly or I did
>> something wrong in the Ecore-file. When opening the above XML model
>> with my generated Editor, EMF reads in the value of the CDATA as
>> desired. In the properties view I see "Some text with lots of special
>> chars." as value of the documentation feature.
>>
>> Strange behavior #1:
>> When a alter the read in model using the EMF generated editor and save
>> it, the <![CDATA[]]> disappeared in the XML file.
> Yes, the final EMF model has no idea whether the data was specified
> using CDATA or entities.

I have the same problem: I add CDATA to the FeatureMap, save the XML, then
load it, and the resulting FeatureMap loses the information about CDATA.
I don't completely understand why this is correct, and why EMF can't
determine that the data was specified using CDATA, because text and CDATA
are different node types. And strictly speaking, the saved and loaded
documents are not equal (at least in terms of XMLUnit).

Thanks in advance,
Alexey

Re: Reading CDATA content into an EMF model [message #427625 is a reply to message #427624]

Tue, 24 February 2009 12:40

Eclipse User

Do you have this XMLResource option mapped to true in the resource factory
used to create your resource?

/**
 * Determines whether comments and CDATA will be preserved in any mixed text processing.
 * This option is applicable for loading XML resources (from DOM node or an InputStream)
 */
String OPTION_USE_LEXICAL_HANDLER = "USE_LEXICAL_HANDLER";
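
The option name presumably refers to SAX's `LexicalHandler` interface, which is how a SAX parser reports comments and CDATA boundaries at all. The following is a minimal JDK-only sketch of that mechanism, not EMF code; `CdataEvents` is a made-up name. Without a lexical handler registered, a parser delivers only the character data, so the CDATA markers and comments are invisible to whatever builds the model.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

/** Hypothetical demo class: records the text and lexical events a SAX parser reports. */
class CdataEvents {

    /** Parses the XML and returns text, comment and CDATA-boundary events in document order. */
    static List<String> events(String xml) {
        final List<String> seen = new ArrayList<>();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override public void startCDATA() { seen.add("startCDATA"); }
            @Override public void endCDATA() { seen.add("endCDATA"); }
            @Override public void comment(char[] ch, int start, int length) {
                seen.add("comment:" + new String(ch, start, length));
            }
            @Override public void characters(char[] ch, int start, int length) {
                // Parsers may deliver text in several chunks; merge adjacent ones.
                String chunk = new String(ch, start, length);
                int last = seen.size() - 1;
                if (last >= 0 && seen.get(last).startsWith("text:")) {
                    seen.set(last, seen.get(last) + chunk);
                } else {
                    seen.add("text:" + chunk);
                }
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // Standard SAX2 property for registering a LexicalHandler.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(events("<d><!-- note --><![CDATA[a < b]]></d>"));
        System.out.println(events("<d>a &lt; b</d>"));
    }
}
```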

Re: Reading CDATA content into an EMF model [message #427637 is a reply to message #427625]

Wed, 25 February 2009 02:20

Eclipse User

Oh, I'm sorry, I didn't have this option set to true. Thank you, Ed.
