Re: [equinox-dev] Equinox and UTF-8

Well, you should not be getting bytes from a String. A String is a sequence of characters. Some characters may fit into a byte, but some are wider.

Also, remember that the length of a String is the number of characters, not the number of bytes into which those characters may be encoded.
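
A minimal sketch of that distinction, assuming Java 7 or later for StandardCharsets (the class and method names are made up for illustration). Strictly speaking, length() counts UTF-16 code units, which for a character like the section sign is the same as the number of characters.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical example class; not part of Equinox or the original code.
public class StringLengthVsBytes {

    // Number of bytes needed to encode s in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Narrowing cast of the first char: keeps only the low 8 bits.
    static byte firstCharAsByte(String s) {
        return (byte) s.charAt(0);
    }

    public static void main(String[] args) {
        // SECTION SIGN (U+00A7), written as an escape so the
        // encoding of the source file does not matter.
        String s = "\u00A7";

        System.out.println("length()         = " + s.length());
        System.out.println("cast to byte     = " + firstCharAsByte(s));
        System.out.println("UTF-8 bytes      = " + utf8Length(s));
        System.out.println("ISO-8859-1 bytes = "
                + s.getBytes(StandardCharsets.ISO_8859_1).length);
    }
}
```

The same one-character String takes one byte in ISO-8859-1 and two in UTF-8, so the byte count only makes sense relative to a named encoding.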

BJ Hargrave
Senior Technical Staff Member, IBM
OSGi Fellow and CTO of the
OSGi Alliance

office: +1 386 848 1781
mobile: +1 386 848 3788

From: Holger Mense <mail@xxxxxxxxxxxxxxx>
To: Equinox development mailing list <equinox-dev@xxxxxxxxxxx>
Date: 2008/07/10 02:37 AM
Subject: Re: [equinox-dev] Equinox and UTF-8


On Wed, 9 Jul 2008 15:45:03 -0400, Oleg Besedin <obesedin@xxxxxxxxxx> wrote:

> To get more consistent results, use String.getBytes("UTF8"). The
> getBytes() method uses default encoding.

Using String.getBytes("UTF-8") does not change the behaviour.

Running the code as a bundle inside Eclipse leads to

=== cut ===
§ length() = 1
§ cast to byte = -89
§ getBytes() = -62 -89
=== cut ===

while starting it as a bundle in a running Equinox framework outside
Eclipse leads to

=== cut ===
+é-º length() = 2
+é-º cast to byte = -62 -89
+é-º getBytes() = -61 -126 -62 -89
=== cut ===
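
For reference, the numbers in the second output are what you would see if the UTF-8 bytes of the character were decoded once as ISO-8859-1 before reaching the String. The sketch below only illustrates that one possible mechanism. It is not a confirmed diagnosis of what happens inside Equinox, and the class and method names are made up for illustration (StandardCharsets assumes Java 7 or later).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical example class; not part of Equinox or the original code.
public class DoubleEncodingSketch {

    // Encodes the text as UTF-8, then decodes those bytes with the
    // wrong charset (ISO-8859-1), turning each byte into one char.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Comparing these values between the two launch environments
        // can help narrow things down.
        System.out.println("default charset = " + Charset.defaultCharset());
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));

        String good = "\u00A7";        // SECTION SIGN, one char
        String bad = misdecode(good);  // two chars: U+00C2, U+00A7

        System.out.println("good length() = " + good.length());
        System.out.println("good bytes    = "
                + Arrays.toString(good.getBytes(StandardCharsets.UTF_8)));
        System.out.println("bad length()  = " + bad.length());
        System.out.println("bad bytes     = "
                + Arrays.toString(bad.getBytes(StandardCharsets.UTF_8)));
    }
}
```

If the literal in the original code is typed directly into a source file, the encoding used to compile that file in each environment would be another thing worth comparing.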

> I've read that Windows has
> different default encodings for GUI and console applications. If that is
> true, it might explain why you see different outputs.

I don't think that this is the reason for the observed behaviour. Executing
the same code without a running Equinox framework gives the same, correct
result both inside and outside Eclipse. When it runs in an Equinox instance,
the two results differ. So it looks to me like an Equinox issue.

Currently I am migrating a larger application to run on top of Equinox. Over
the last few days I was working on getting it to start again outside my
development environment. That is when I noticed this behaviour.

Holger Mense
