Re: [egit-dev] Re: jgit problems for file paths with non-ASCII characters
Robin Rosenberg <robin.rosenberg@xxxxxxxxxx> wrote:
> onsdag 25 november 2009 14:47:25 skrev Marc Strapetz:
> > I have noticed that jgit converts file paths to UTF-8 when querying the
> > repository.
...
> > Is this a bug or a misconfiguration of my repository? I'm using jgit
> > (commit e16af839e8a0cc01c52d3648d2d28e4cb915f80f) on Windows.
>
> A bug.
>
> The problem here is that we need to allow multiple encodings since there
> is no reliable encoding specified anywhere.
This is a design fault of both Linux and git. git gets a byte
sequence from readdir and stores that as-is into the repository.
We have no way of knowing what that encoding is. So now everyone
touching a Git repository is screwed.
> The approach I advocate is
> the one we use for handling encoding in general. I.e. if it looks like UTF-8,
> treat it as such, else fall back. This is expensive however
We should try to work harder with the git-core folks to get character
set encoding for file names worked out. We might be able to use a
configuration setting in the repository to tell us what the proper
encoding should be, and if not set, assume UTF-8.
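
To make the two ideas above concrete, here is a minimal sketch of "if it
looks like UTF-8, treat it as such, else fall back". It is not jgit code;
the class and method names are hypothetical, and the fallback charset is
simply passed in by the caller (it could come from a repository
configuration setting, if one were ever agreed on).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only; not part of jgit.
final class PathDecoder {
    // Decode raw path bytes as strict UTF-8; if the bytes are not
    // well-formed UTF-8, decode them with the caller-supplied fallback.
    static String decodePath(byte[] raw, Charset fallback) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not valid UTF-8, so assume the fallback encoding.
            return new String(raw, fallback);
        }
    }
}
```

The strict decoder is what makes this costly: every path has to be
validated in full before we know which branch applies. Note also that a
byte sequence in a legacy encoding can happen to be valid UTF-8, so this
remains a heuristic.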
> and then we have
> all the other issues with case-insensitive names and the funny property that
> Unicode has when it allows characters to be encoded using multiple sequences
> of code points, as employed by Apple.
But as you said, this still doesn't make the Apple normal form
any easier. Though if we know we are on such a strange filesystem
we might be able to assume the paths in the repository are equally
damaged. Or not.
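
As a small illustration of the normal form problem: the same visible name
can be stored precomposed or decomposed, and the two strings do not
compare equal unless both are normalized first. The sketch below uses
java.text.Normalizer from the JDK; the class and method names are
hypothetical, and choosing NFC as the comparison form is an assumption
made for the example, not something decided in this thread.

```java
import java.text.Normalizer;

// Hypothetical helper, for illustration only; not part of jgit.
final class PathNormalization {
    // Compare two path names after bringing both to the same
    // Unicode normalization form (NFC is assumed here).
    static boolean samePath(String a, String b) {
        String na = Normalizer.normalize(a, Normalizer.Form.NFC);
        String nb = Normalizer.normalize(b, Normalizer.Form.NFC);
        return na.equals(nb);
    }
}
```

For example, "\u00e9" (precomposed) and "e\u0301" (decomposed) are
different strings to String.equals, but samePath treats them as the same
name. This only helps once the bytes have been decoded correctly, so it
does not remove the encoding problem discussed above.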
--
Shawn.