[platform-vcm-dev] Question to provider writers: Text/Binary default
Dear repository providers,
As you know, Team provides an API, available to all providers, that tells
you whether a file is believed to be text or binary. This determination is
based on a table of file types, some of which we contribute and the rest
of which other plugins contribute.
At present, Team is agnostic about whether files of unknown type should be
considered text or binary. We believed it was incorrect of us to assume
for a provider how this should be handled. We are now questioning that
assumption, though.
For CVS we've assumed binary because we're concerned about errant EOL
conversion on GIFs, etc. This is a very bad failure because it results in
corrupted data and potentially lost work.
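To make the failure concrete, here is a minimal sketch of what a naive
LF-to-CRLF conversion does to binary content. The class and method names
are hypothetical and this is not the CVS provider's actual conversion
code; it only assumes a text-mode checkout that inserts a CR before every
bare LF byte.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical demo, not part of Team or the CVS provider.
final class EolCorruptionDemo {

    // Naive LF -> CRLF conversion: inserts a CR before every LF byte
    // that is not already preceded by one.
    static byte[] lfToCrlf(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < in.length; i++) {
            byte b = in[i];
            if (b == '\n' && (i == 0 || in[i - 1] != '\r')) {
                out.write('\r');
            }
            out.write(b);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Start of a GIF: the "GIF89a" signature followed by a 16-bit
        // little-endian width and height. A width or height of 10 is
        // stored as the byte 0x0A, which is also the LF character.
        byte[] gifHeader = { 'G', 'I', 'F', '8', '9', 'a', 0x0A, 0x00, 0x0A, 0x00 };
        byte[] converted = lfToCrlf(gifHeader);
        System.out.println(gifHeader.length + " bytes in, " + converted.length + " bytes out");
    }
}
```

Every byte after the first inserted CR is shifted, so the image no longer
parses, and nothing in the converted file tells you which CR bytes were
inserted and which were original.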
The counter argument is that for the most part people only version control
text files. Furthermore, our support for marking files as derived and not
version controlling them means we catch a lot of the binary cases.
Generally speaking, the remaining set of known binary file types to be
version controlled is relatively small, and we could probably reliably list
most of them as defaults in Team. By contrast, it's much harder to come up
with a list of known text file types.
Problem 1:
For CVS users, maintaining that list is tedious: they must ensure they've
updated the list of text files, otherwise they don't get EOL conversion.
Presumably this will be true for other providers too.
Problem 2:
The problem becomes more interesting with code that reads/writes files.
Because we (CVS) don't convert EOL on unknown file types (assumed binary),
files generated using the platform encoding will show up in compare as
having every line in conflict, unless the person thinks to turn on ignoring
whitespace.
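A small sketch of why this happens, assuming a deliberately simple
positional line comparison rather than the real compare implementation
(the class and method names are hypothetical): the same generated content
written with CRLF on one platform and LF on another differs on every line
until trailing whitespace is ignored.

```java
// Hypothetical demo, not the compare code used by the platform.
final class EolCompareDemo {

    // Counts lines that differ when the two texts are compared position
    // by position. Lines are split on LF only, so a CRLF file keeps a
    // trailing CR on each line.
    static int differingLines(String left, String right, boolean ignoreWhitespace) {
        String[] a = left.split("\n", -1);
        String[] b = right.split("\n", -1);
        int max = Math.max(a.length, b.length);
        int differing = 0;
        for (int i = 0; i < max; i++) {
            String x = i < a.length ? a[i] : "";
            String y = i < b.length ? b[i] : "";
            if (ignoreWhitespace) {
                // trim() removes leading and trailing whitespace, including CR.
                x = x.trim();
                y = y.trim();
            }
            if (!x.equals(y)) {
                differing++;
            }
        }
        return differing;
    }

    public static void main(String[] args) {
        String generatedOnWindows = "name=demo\r\nversion=1\r\n";
        String generatedOnUnix = "name=demo\nversion=1\n";
        System.out.println(differingLines(generatedOnWindows, generatedOnUnix, false));
        System.out.println(differingLines(generatedOnWindows, generatedOnUnix, true));
    }
}
```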
Problem 3:
When someone introduces a new file type and writes code that generates
content, they *must* always add that new file type to the Team global list.
They can't assume that it will be interpreted as either text or binary.
Worse, they may make assumptions about the default based on how their
provider interprets unknown file types, which could be different when used
against a different provider.
Thus the list of text/binary files must be complete. It is unreasonable to
expect plugin writers to be such good Team citizens. If the default were
known, and were text, then it's more believable that someone generating
binary files that aren't derived would think to add them to the Team type
list, although the failure case is still there.
Problem 4:
Our (Team's) current default list only has the text files, and this is
wrong since Team is agnostic for unknown types. That is, we made the exact
error described in problem #3.
My question to you:
Q1: Should Team.getType(IFile) return a hardcoded "text" or "binary" for
unknown files?
Q2: If yes, should it be "text"?
I believe the answer to Q1 should be "yes", and the answer to Q2 should
also be "yes" (text).
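For discussion, the difference between the two behaviours can be sketched
as follows. This is a hypothetical stand-in, not the Team API: the class
FileTypeRegistry, its enum, and the extension-based lookup are all
assumptions made for illustration. The only point is that the registry,
rather than each provider, decides what an unregistered type means.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the Team file type table.
final class FileTypeRegistry {

    enum Type { TEXT, BINARY, UNKNOWN }

    private final Map<String, Type> byExtension = new HashMap<>();
    private final Type defaultType;

    // defaultType is what getType reports for unregistered extensions:
    // UNKNOWN models the agnostic behaviour, TEXT models the proposal.
    FileTypeRegistry(Type defaultType) {
        this.defaultType = defaultType;
    }

    void register(String extension, Type type) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), type);
    }

    Type getType(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return defaultType; // no extension to look up
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return byExtension.getOrDefault(extension, defaultType);
    }

    public static void main(String[] args) {
        FileTypeRegistry agnostic = new FileTypeRegistry(Type.UNKNOWN);
        FileTypeRegistry proposed = new FileTypeRegistry(Type.TEXT);
        agnostic.register("gif", Type.BINARY);
        proposed.register("gif", Type.BINARY);
        System.out.println(agnostic.getType("model.xyz")); // each provider must guess
        System.out.println(proposed.getType("model.xyz")); // same answer for every provider
        System.out.println(proposed.getType("logo.GIF"));  // registered types are unaffected
    }
}
```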
This discussion is occurring much later in the cycle than we would like, but
we've only recently fully understood the problem. If we make any changes
we need to do them next week.
Thanks for your time,
The Team team