UTF-8 and BOM [message #115683] Tue, 11 November 2003 15:43
André-John Mas
I tried opening a UTF-16 file with Eclipse, and all I saw was
binary data. The file does start with a Unicode BOM, so I was
surprised to see this. Checking the properties of the file, I found
no way to explicitly specify its encoding.
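
For illustration, here is a minimal sketch of the kind of BOM check the post expects an editor to perform. The class and method names (BomSniffer, detect) are hypothetical and are not Eclipse APIs; only the byte values of the BOMs themselves are standard. UTF-32 is deliberately left out to keep it short.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Eclipse: maps a leading BOM to a charset.
class BomSniffer {

    /** Returns the charset implied by a leading BOM, or null if none is recognised. */
    static Charset detect(byte[] head) {
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) {
            return StandardCharsets.UTF_8;
        }
        if (startsWith(head, 0xFE, 0xFF)) {
            return StandardCharsets.UTF_16BE;
        }
        if (startsWith(head, 0xFF, 0xFE)) {
            // FF FE 00 00 would be a UTF-32LE BOM; this sketch ignores UTF-32.
            return StandardCharsets.UTF_16LE;
        }
        return null;
    }

    /** True if data begins with the given unsigned byte values. */
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x41}; // BOM followed by 'A' in UTF-16BE
        System.out.println(detect(sample));
    }
}
```

A file without any BOM returns null here, in which case a caller would have to fall back to some default encoding.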

IMO, this is something that should be looked at. In the short
term I am probably in the minority, but in the long term, as
Unicode becomes more common, this should be supported.

regards

Andre
Re: UTF-8 and BOM [message #115736 is a reply to message #115683] Tue, 11 November 2003 16:10
Originally posted by: hcs33.egon.gyaloglo.hu

Hi,

Support for working with different encodings is currently rather weak in
Eclipse. You can view the file (see main menu: Edit-->Encoding), but you
cannot save it in that encoding. Saving always uses the encoding specified in
Preferences-->Workbench-->Editors-->Text file encoding.
Extending this scheme is a plan item for Eclipse 3.0.
Also, if your UTF-16 file is XML, you could try the xmlbuddy plugin
(http://www.xmlbuddy.com). It can read and write XML (and therefore plain text)
files in different encodings, and you can select the encoding on a per-file
basis if you want.

HTH,
Regards,
Csaba
Re: UTF-8 and BOM [message #120091 is a reply to message #115736] Mon, 24 November 2003 14:20
Originally posted by: Andre_Weinand.oti.com

On 11.11.2003 17:10 Uhr, in article bor1j0$q9j$1@eclipse.org, "Horváth,
Csaba" <hcs33@egon.gyaloglo.hu> wrote:

> Hi,
>
> Support for working with different encodings is currently rather weak in
> Eclipse. You can view the file (see main menu: Edit-->Encoding), but you
> cannot save it in that encoding. Saving always uses the encoding specified in
> Preferences-->Workbench-->Editors-->Text file encoding.

Yes, support for 'non-uniform file encodings' is weak in Eclipse.
However, the editors do support saving in an encoding other than the global
workbench encoding. Be aware, though, that some components of Eclipse (most
notably the compiler) aren't aware of the encoding and won't be able to read
these files correctly.

--andre
Re: UTF-8 and BOM [message #120162 is a reply to message #120091] Mon, 24 November 2003 15:30
Originally posted by: hcs33.egon.gyaloglo.hu

Hi,

Yes, you are right, the editor does support saving in a non-workbench
encoding. I tried the M5 build and the save really did work :). However, I
noticed the following problems:
1. The editor does not "convert" the file content, so if I have a file in,
e.g., iso-8859-2, choosing Edit->Encoding->UTF-8 and saving produces a
broken file.
2. If I use 'Save As...', the file is saved in the workbench encoding, even
though the workbench displays a notification about saving in a different
encoding.

So the support is usable only if I want to edit files from scratch or the
file content is already in the encoding I set. I entered bug reports on
these two issues: https://bugs.eclipse.org/bugs/show_bug.cgi?id=47346 and
https://bugs.eclipse.org/bugs/show_bug.cgi?id=47348

Thanks and regards,
Csaba
Re: UTF-8 and BOM [message #122274 is a reply to message #120162] Wed, 26 November 2003 13:04
Originally posted by: Andre_Weinand.oti.com

On 24.11.2003 16:30 Uhr, in article bpt82c$mgs$1@eclipse.org, "Horváth,
Csaba" <hcs33@egon.gyaloglo.hu> wrote:

> Hi,
>
> Yes, you are right, the editor does support saving in a non-workbench
> encoding. I tried the M5 build and the save really did work :). However, I
> noticed the following problems:
> 1. The editor does not "convert" the file content, so if I have a file in,
> e.g., iso-8859-2, choosing Edit->Encoding->UTF-8 and saving produces a
> broken file.

Yes, the editor's encoding option is more like 'read file from disk using
this encoding'; it is not a 'Convert Encoding' command.

The encoding is simply the one that is passed to the InputStream and
OutputStream when reading and writing the file's content.
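
To make the difference concrete, here is a small sketch in plain Java (not Eclipse code; the class name ReinterpretDemo is made up for the example). It decodes the same bytes twice, once with the encoding they were written in and once as UTF-8. ISO-8859-1 is used as a stand-in for ISO-8859-2, which is an assumption for portability: both encode the accented letter in this sample as the single byte 0xE1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ReinterpretDemo {

    /** Decodes the same bytes with whatever charset the caller picks; nothing is converted. */
    static String decode(byte[] fileBytes, Charset chosen) {
        return new String(fileBytes, chosen);
    }

    public static void main(String[] args) {
        String original = "Horv\u00e1th"; // \u00e1 is a-acute
        // Stand-in for an ISO-8859-2 file: a-acute is the byte 0xE1 in both charsets.
        byte[] onDisk = original.getBytes(StandardCharsets.ISO_8859_1);

        String right = decode(onDisk, StandardCharsets.ISO_8859_1);
        String wrong = decode(onDisk, StandardCharsets.UTF_8);

        System.out.println(right.equals(original));
        System.out.println(wrong.equals(original));
        // 0xE1 followed by 't' is not valid UTF-8, so 'wrong' contains U+FFFD.
        // Saving 'wrong' would write that replacement character and lose the original byte.
    }
}
```

This is why switching the editor encoding and saving does not convert the file: the bytes are merely read under a different interpretation.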

If you want to convert the encoding of a file's content you can do this:
- open the file in an editor
- optionally set the encoding of the editor so that the file is displayed
correctly
- 'Select all' text and 'Cut' it to clipboard
- save the file to flush any changes
- select the new encoding
- 'Paste' the clipboard
- 'Save'
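
Outside the editor, the steps above amount to 'decode with the old encoding, encode with the new one'. A minimal sketch in plain Java, assuming the file fits in memory; the class and method names (ConvertEncoding, transcode, convertFile) are made up for the example and are not Eclipse APIs:

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

class ConvertEncoding {

    /** Decodes the bytes using 'from' and re-encodes the resulting text using 'to'. */
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        String text = new String(input, from); // same role as viewing with the right encoding
        return text.getBytes(to);              // same role as saving with the new encoding
    }

    /** Reads source in charset 'from' and writes its text to target in charset 'to'. */
    static void convertFile(Path source, Charset from, Path target, Charset to) throws IOException {
        Files.write(target, transcode(Files.readAllBytes(source), from, to));
    }
}
```

Characters that the target charset cannot represent are replaced by that charset's default replacement (typically '?'), so converting to a narrower encoding can still lose data.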

--andre