I'm developing a plugin that writes XML files. Since I'm not using Xerces
or any other XML API, I have a problem dealing with Unicode characters. For
instance, the German "Ö" will be written like "ÃÂ€". But I need the
decomposed form like "\u006F". Is there any class in the webtools that
could do this transformation?
I don't believe so. I suspect that the decomposed form you've shown
is really just one method supported by various parsers or compilers
to express Unicode characters and not an official representation to
use yourself. I would suggest switching to APIs such as JAXP (found
in Java SE 1.4 and higher) instead of reading and writing files
yourself, or else always reading and writing your files with a
Structured Source Editing
IBM Emerging Technologies
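To illustrate the JAXP suggestion above, here is a minimal sketch of writing a small document through the standard javax.xml APIs instead of by hand. The class name JaxpWriteSketch, the method toXml, and the element name "greeting" are made up for this example. The idea is that the serializer, not your code, decides how each character is written for the chosen encoding. With the JDK's built-in transformer, a character that the output encoding cannot represent is expected to come out as a numeric character reference, but treat that as an assumption about the bundled implementation and check it on your own JDK.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class JaxpWriteSketch {

    // Builds <greeting>text</greeting> (hypothetical element name) and
    // serializes it, asking the transformer to target the given encoding.
    public static String toXml(String text, String encoding) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element root = doc.createElement("greeting");
        root.setTextContent(text);
        doc.appendChild(root);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, encoding);

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // "\u00D6" is the capital O with diaeresis from the question.
        System.out.println(toXml("\u00D6l", "UTF-8"));
        // Same content, but targeting an encoding that cannot hold the character.
        System.out.println(toXml("\u00D6l", "US-ASCII"));
    }
}
```

Either way the result should be well-formed XML that any parser reads back to the same character, which is usually what matters more than the exact escaped spelling.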
> Since I'm not using Xerces
> or any other XML API, I have a problem dealing with Unicode characters. For
> instance, the German "Ö" will be written like "ÃÂ€". But I need the
> decomposed form like "\u006F". Is there any class in the webtools that
> could do this transformation?
Just to be sure, you do know you don't need XML APIs to read and write UTF-8, right?
There are plain ol' Java IO APIs where you can specify the charset to use.
Also, one step better :) if you are using Eclipse IFiles, they have some capabilities
to automatically detect and use the appropriate charset, depending on the IFile's contentType.
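For the plain Java IO route, a rough sketch could look like the following. The class name Utf8WriteSketch, its two helper methods, and the sample XML string are invented for illustration. The point is only that the charset is named explicitly when the Writer is created, so the platform default encoding never gets a say.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8WriteSketch {

    // Writes the text as UTF-8 bytes, regardless of the platform default charset.
    public static void writeUtf8(Path file, String content) throws IOException {
        try (Writer writer = new OutputStreamWriter(
                Files.newOutputStream(file), StandardCharsets.UTF_8)) {
            writer.write(content);
        }
    }

    // Reads the file back, decoding the bytes as UTF-8.
    public static String readUtf8(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sketch", ".xml");
        // The declared encoding in the XML prolog should match the charset used to write.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<name>\u00D6l</name>\n";
        writeUtf8(file, xml);
        System.out.println(readUtf8(file).equals(xml));
        Files.delete(file);
    }
}
```

If the file then still looks garbled in some viewer, the likely culprit is that the viewer decodes it with a different charset, not that the bytes on disk are wrong.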