Re: Force UTF-8 for JET templates [message #59086] Wed, 01 April 2009 09:21
Ed Merks
Joachim,

I think your question is about JET2 because I don't recognize <ws:file>
as JET1 syntax. I've added the m2t newsgroup on the "to" list of the
reply, since that's the place to ask about JET2.


Joachim Rietz wrote:
> Hi,
>
> I have a problem getting Chinese characters to come out correctly in JET.
>
> I force the encoding to "UTF-8" through the 'encoding' attribute
> of the <ws:file> tag:
> <ws:file template="templates/psr.jet" path="{$psrOutFilePath}"
> encoding="UTF-8"/>
>
> Some text strings in the JET template are collected from EMF models
> (and so arrive as Java Strings), while others are written directly
> in the JET template.
> The Chinese characters in strings retrieved from EMF models are
> displayed correctly in the output file, BUT strings entered directly
> in the JET template show up as cryptic symbols in the output file!
> The workspace encoding is set to "UTF-8", and so is the encoding of
> the JET template.
>
> How do I preserve the Chinese characters from the JET template
> all the way through to the output file?
>
> Regards,
> Joachim
>
Re: Force UTF-8 for JET templates [message #59111 is a reply to message #59086] Thu, 02 April 2009 11:47
Paul Elder
Joachim:

Internally, JET (like any Java program using char or String) uses Unicode,
so the problem has to be in one of the following three areas:

1) the template result is being persisted to an encoding that does not
support your characters.
2) static text in the templates is being corrupted.
3) text from your 'model' is being corrupted.

For #1, setting the encoding attribute on ws:file is the way to go.

For #2, you can set the JET template encoding either globally, or by
project, folder or individual file.

For #3, you must ensure the model correctly represents things, and is
getting loaded correctly.
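
To make area #1 concrete, here is a minimal Java sketch of what happens when text is persisted in an encoding that cannot represent it. It is only an illustration, not JET code: the class and method names are made up, and ISO-8859-1 stands in for "an encoding that does not support your characters". The sample text is written with Unicode escapes so the example does not depend on the encoding of its own source file.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from JET.
class OutputEncodingDemo {

    // Encode the text to bytes in the given charset and decode it again,
    // mimicking a write to disk followed by a read with the same encoding.
    static String roundTrip(String text, Charset charset) {
        byte[] persisted = text.getBytes(charset);
        return new String(persisted, charset);
    }

    public static void main(String[] args) {
        String text = "\u4E2D\u6587"; // two Chinese characters

        // UTF-8 can represent every Unicode character, so the text survives.
        System.out.println(text.equals(roundTrip(text, StandardCharsets.UTF_8)));

        // ISO-8859-1 has no mapping for these characters, so getBytes
        // substitutes the charset's replacement byte, a question mark.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1));
    }
}
```

With UTF-8 the round trip is lossless, while with ISO-8859-1 each character is replaced at write time and cannot be recovered later. That is the kind of loss the encoding attribute on ws:file is meant to prevent.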


From your description, the problem is with #2 - static text in the JET
template is getting corrupted. Things to verify:

1) Right-click the .jet file and click Properties. Ensure that the
encoding is reported as UTF-8.

2) Find the corresponding .java file generated by the JET compiler, and
check its encoding. It, too, should be UTF-8. Double-check, by looking at
the Java file contents - you should find the Chinese characters there, in
the source.

I have just tried all of this on a Mac and on Windows (in Japanese), and it
works fine. I also know of a user who makes very significant use of JET
to generate Japanese and other Asian languages.

I think you will find that, when you do this, everything looks OK. That
leaves one more culprit: PDE Build.


I'd bet you a glass of your favourite beverage, that you are compiling JET
generated Java code using PDE build (the ant-based build environment),
rather than with Eclipse's incremental builder mechanism. In this case,
things are a bit more complicated...

PDE Build fact #1: The generated <javac> Ant task is unaware of the
workspace, and hence of any non-default encodings set on .java files.
<javac> uses the JVM's default encoding to read the .java files. I can
corrupt my Japanese characters by building my JET plug-in with PDE Build.
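
The corruption described above can be reproduced in a few lines of plain Java. This is a sketch under stated assumptions: the class and method names are hypothetical, and ISO-8859-1 is used as an example of a JVM default encoding that differs from the UTF-8 the .java file was saved in (your JVM default may be something else).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of JET or PDE Build.
class SourceDecodingDemo {

    // Decode the UTF-8 bytes of 'text' using 'assumed', the way a compiler
    // would if it read a UTF-8 .java file believing it was in another encoding.
    static String readAs(String text, Charset assumed) {
        byte[] onDisk = text.getBytes(StandardCharsets.UTF_8);
        return new String(onDisk, assumed);
    }

    public static void main(String[] args) {
        String literal = "\u4E2D\u6587"; // two Chinese characters

        // This is the encoding an unconfigured compiler run would fall back to.
        System.out.println("JVM default encoding: " + Charset.defaultCharset());

        // Correct assumption: the literal is recovered unchanged.
        System.out.println(literal.equals(readAs(literal, StandardCharsets.UTF_8)));

        // Wrong assumption: every UTF-8 byte turns into its own Latin-1
        // character, so 2 characters become 6 characters of garbage.
        String corrupted = readAs(literal, StandardCharsets.ISO_8859_1);
        System.out.println(corrupted.length());
    }
}
```

The bytes on disk are identical in both cases. Only the decoder's assumption changes, which is why the same template can work in the workspace and break in a PDE build.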

You can overcome this as follows:

1) Use a uniform encoding for .java files in your project/build. This
means using a uniform encoding for your .jet files, too. The best way to do
this is to right-click the project and click Properties. On the Resource
page, set the 'Text file encoding' to UTF-8.

2) invoke your PDE build ant script setting the property 'compilerArg' to
'-encoding UTF-8'
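
To see what the '-encoding UTF-8' compiler argument does in isolation, the sketch below calls the JDK's own compiler API (javax.tools) rather than PDE Build. Everything about it is an assumption made for illustration: the class names, the temporary directory, and the generated sample source. It needs a JDK, since a plain JRE has no system compiler.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical demo class; it drives the JDK compiler API, not PDE Build.
class EncodingFlagDemo {

    // Write a UTF-8 source file containing a Chinese literal, compile it with
    // "-encoding UTF-8", load the class, and return the literal it holds.
    static String compileAndRead() throws Exception {
        Path dir = Files.createTempDirectory("encoding-demo");
        Path source = dir.resolve("JetEncodingSample.java");
        String code = "public class JetEncodingSample {"
                + " public static String text() { return \"\u4E2D\u6587\"; } }";
        Files.write(source, code.getBytes(StandardCharsets.UTF_8));

        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("no system compiler; a JDK is required");
        }
        // Tell the compiler explicitly how the source file is encoded
        // instead of letting it use the JVM default.
        int status = javac.run(null, null, null,
                "-encoding", "UTF-8", "-d", dir.toString(), source.toString());
        if (status != 0) {
            throw new IllegalStateException("compilation failed");
        }

        try (URLClassLoader loader =
                new URLClassLoader(new URL[] { dir.toUri().toURL() })) {
            Class<?> sample = loader.loadClass("JetEncodingSample");
            return (String) sample.getMethod("text").invoke(null);
        }
    }

    public static void main(String[] args) throws Exception {
        // True when the literal survived compilation intact.
        System.out.println("\u4E2D\u6587".equals(compileAndRead()));
    }
}
```

Because the encoding is passed explicitly, the result no longer depends on the JVM default of the machine running the build, which is the point of setting 'compilerArg'.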

Another possible workaround would be to move your 'exotic' characters out
of your templates and into files that you know can be loaded and written
correctly. Some ideas:

1) Put the text into an .xml file (with encoding set to UTF-8). Place the
file somewhere in your templates directory, and use <c:load> to load it.
Then, in your templates, access those values via c:get tags.

2) If you are using the Galileo M6 build of JET, there are new tags
<f:message>, <f:setBundle> and <f:bundle> that allow you to use standard
Java .properties files for your text. Be warned: the convention for such
files is to encode them with ISO8859-1, so Chinese characters would have
to be expressed as \uXXXX. (Not very appealing.)
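
For the .properties route, the escaping itself is mechanical. The sketch below is a hypothetical helper (not part of JET, and not the new f: tags) that converts text to the escaped form and then checks that java.util.Properties reads it back. It conservatively escapes everything outside 7-bit ASCII, even though ISO 8859-1 could hold some of those characters directly. The key name 'greeting' is made up.

```java
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper class, not part of JET.
class PropertiesEscapeDemo {

    // Replace every character outside 7-bit ASCII with a backslash, the
    // letter u, and four hex digits, as .properties files expect.
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String original = "\u4E2D\u6587"; // two Chinese characters
        String escaped = escape(original);
        System.out.println(escaped); // pure ASCII, safe in any encoding

        // Properties.load decodes the escapes back into real characters.
        Properties props = new Properties();
        props.load(new StringReader("greeting=" + escaped));
        System.out.println(original.equals(props.getProperty("greeting")));
    }
}
```

The escaped file is unreadable to a human translator, which is the "not very appealing" part, but it survives any build regardless of the compiler's or JVM's default encoding.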

I'm not very happy forcing you to do such work-arounds, so I have submitted

https://bugs.eclipse.org/bugs/show_bug.cgi?id=270985

I plan to provide a fix in the Galileo M7 build.

Paul
Re: Force UTF-8 for JET templates [message #59241 is a reply to message #59111] Mon, 06 April 2009 08:01
J. Rietz
Hi Paul,

Thanks a lot for your great feedback, really useful!

I tried different solutions, but finally went for the one using an XML
file and retrieving the contents via <c:load>/<c:get>.
This works perfectly and solves my problems. Thanks!!
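
A likely reason the XML route is robust is that an XML parser takes the encoding from the XML declaration in the file itself rather than from the platform default. The sketch below shows that behaviour with the standard JAXP parser. It is not how <c:load> is implemented, and the element name 'text' and attribute name 'value' are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; plain JAXP, not JET's c:load implementation.
class XmlTextDemo {

    // Parse raw XML bytes and return the 'value' attribute of the root
    // element. No charset is passed in: the parser reads it from the
    // XML declaration at the start of the bytes.
    static String readValue(byte[] xmlBytes) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getAttribute("value");
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<text value=\"\u4E2D\u6587\"/>";
        byte[] onDisk = xml.getBytes(StandardCharsets.UTF_8);

        // True when the Chinese text comes back intact.
        System.out.println("\u4E2D\u6587".equals(readValue(onDisk)));
    }
}
```

Since the text never passes through the Java compiler, the compiler's source encoding no longer matters for it.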


However, just out of curiosity...
I tried putting the Chinese characters directly in the JET template
file, and got two different outcomes.
I should mention that I work in a JET project during development, but for
runtime I export it as "Deployable plug-ins and fragments", resulting in a
JAR file. Both the JET project and the project containing the resulting
JAR file have UTF-8 encoding set.

When running directly against my JET project everything works great, and
the Chinese characters are preserved.
BUT, when exporting to a JAR file the characters are corrupted!
Is there a way to force a specific encoding when exporting, or how is
this handled?


Regards,
Joachim



Re: Force UTF-8 for JET templates [message #59265 is a reply to message #59241] Mon, 06 April 2009 08:30
Paul Elder
Joachim:

Exporting a deployable feature/plug-in uses PDE Build under the covers.
And I see no way of telling PDE that the encoding of the .java files is
different from the default.

I'm going to try to get the defect mentioned in my previous message fixed
this week.

Paul