[p2-dev] URLs, URIs, and IDs (oh my)
- From: Scott Lewis <slewis@xxxxxxxxx>
- Date: Mon, 29 Sep 2008 13:21:30 -0700
- Delivered-to: email@example.com
- Organization: Code9
- User-agent: Thunderbird 188.8.131.52 (Windows/20080708)
I wanted to throw out some thoughts WRT using URLs, URIs, and IDs, based
on:
a) the brief discussion this morning on the p2 weekly call;
b) some related things being discussed on the e4 mailing list around
common connection management.
P2 Context: The immediate issue with our current usage of URLs is
defined by the p2 plan item: "Convert from using URL to URI where
possible", aka bug #237776 
E4 Context: See 
a) For both P2 and E4 it's necessary to move away from only using
URLs...primarily, I would say, because the URL (as specified in RFC
2396) is insufficiently general to support out-of-process
identification of 'entities' (resources like files of course, but also
services...for p2 repositories, for e4 connections, etc). There are
other reasons not to use URLs as well (e.g. ), of course.
b) The natural 1st alternative to consider is java.net.URI (introduced
in Java 1.4, and also defined by RFC 2396). There is also
the EMF URI class, which is almost certainly a better
implementation...i.e. requiring fewer resources, and providing more
functionality. Further, there are other implementations of the
URI...either implementations of just RFC 2396 and/or having other
functionality. IMHO, there will likely be more in the future e.g. .
c) ECF created its ID interface in order to define an
out-of-process unique identifier (for resource ids, but also for user
ids, service ids, process ids, etc) that we could use within our own
transport-independent connection framework. When this identity API
was created, it was created to support the many ways of addressing
and connecting to external processes that would be necessary for
ECF-based communication. At the time of creation we wanted to allow
clients to run ECF-based code on the CDC 1.0 OSGi profile, which does
not support URI...so there is (so far) no direct reference to URI class
currently in the ecf.identity package(s).
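One concrete Java-level symptom of the generality problem in (a), though
not necessarily the one meant above: on a stock JDK with no custom
protocol handlers installed, java.net.URL rejects any scheme it has no
handler for, while java.net.URI only checks syntax. A minimal sketch (the
class and method names are made up for illustration, and the xmpp
address is just a sample identifier):

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper, for illustration only.
final class UrlVersusUri {

    // True if java.net.URL accepts the string. This requires a protocol
    // handler to be registered for the scheme.
    static boolean parsesAsUrl(String s) {
        try {
            new URL(s);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    // True if java.net.URI accepts the string. Only syntax is checked;
    // no handler lookup takes place.
    static boolean parsesAsUri(String s) {
        try {
            URI.create(s);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

With these helpers, an identifier such as "xmpp:slewis@example.org" is
expected to parse as a URI but not as a URL, whereas an ordinary http
address parses as both.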
I would like to assert that it would be a good idea to consider using
ECF's ID/Namespace API (contained fully within
org.eclipse.ecf.core.identity bundle), for both p2 and e4. I say this
not (just) because ECF created ID/Namespace interfaces, but rather
because I think there would be weaknesses in any implementation based
upon a non-extensible URI class. BTW, I say 'non-extensible' because
the java.net.URI and emf.URI classes are (quite correctly) marked
final...for security as well as other reasons.
a) The java.net.URI class is pretty clearly written specifically to
implement the RFC, meaning that where the RFC is vague, the
implementation is vague. This means, for example, that there would
likely have to be a lot of code duplicated around manipulating paths in
file URIs, creating URIs, etc.
b) java.net.URI is pretty weak in a number of respects where emf URI is
strong, e.g. resource usage, functionality, etc. OTOH, obviously
java.net.URI is 'out there', as it's in the JRE for 1.4+.
c) It would be useful for p2 repository implementers and connection
implementers [e4] to be able to define their own identifier syntax.
That is, since URI cannot be extended, everyone is basically required to
use one implementation or do a lot of converting back and forth (e.g.
consider clients built that use p2, and emf, and ecf, and DSDP, etc).
Further, it's a pain (particularly with java.net.URI) to introduce new
schemes...because it means adding *greater* constraints upon the
construction of URIs (being more specific than the RFC)...e.g. like
requiring some set of properties for your new scheme, or defining a
whole new syntax....you can't create subclasses and new constructors.
Of course you can create factories that do their own String parsing and
return URI instances (of some URI impl).
d) There are others, but this note is getting too long already.
e) The ECF ID/Namespace API can easily *use* either/both/many URI
implementations. There is an extension point called
org.eclipse.ecf.namespace that allows plugins to define new Namespaces,
and these Namespaces basically have control over creating IDs for the
given Namespace. So, for example, it would be trivial to add a
URINamespace that constructed IDs that wrapped URI instances (either
emf URI or java.net.URI) and used them for ID instances. And
alternative impls could be swapped out later. Further, either ID
subinterfaces (e.g. IFileID...see ) OR adapters could be defined that
exposed the public methods of either/both of java.net.URI and emf URIs.
This would allow clients to either cast or adapt the ID implementations
in order to use the URI-specific methods (e.g.
emf.URI.hasRelativePath()). It has been my expectation for some time
that such interfaces will be added to ECF identity or dependent plugins,
so adding them is likely in the cards whether IDs/Namespaces are used
for other things or not.
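To make (e) a bit more concrete, here is a rough sketch of what a
URINamespace wrapping java.net.URI could look like. Note that SketchID
and SketchNamespace are simplified stand-ins invented for this example;
they are NOT the actual org.eclipse.ecf.core.identity signatures, and
the namespace name "sketch.uri" is likewise made up. The point is only
the shape: the namespace owns ID creation, and the ID exposes the
wrapped URI through an adapter-style accessor.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical, simplified stand-in for an ID interface.
interface SketchID {
    String getName();
    SketchNamespace getNamespace();
    // Adapter-style access to the underlying representation, or null
    // if this ID cannot be adapted to the requested type.
    <T> T getAdapter(Class<T> type);
}

// Hypothetical, simplified stand-in for a Namespace: it controls how
// IDs in that namespace are created.
abstract class SketchNamespace {
    private final String name;

    protected SketchNamespace(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    abstract SketchID createID(String value);
}

// A namespace whose IDs wrap java.net.URI instances.
final class UriNamespace extends SketchNamespace {
    UriNamespace() {
        super("sketch.uri"); // assumed name, for illustration
    }

    @Override
    SketchID createID(String value) {
        try {
            // The namespace decides how to parse and canonicalize.
            return new UriBackedID(this, new URI(value).normalize());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URI: " + value, e);
        }
    }
}

// An ID implementation backed by a java.net.URI.
final class UriBackedID implements SketchID {
    private final SketchNamespace namespace;
    private final URI uri;

    UriBackedID(SketchNamespace namespace, URI uri) {
        this.namespace = namespace;
        this.uri = uri;
    }

    @Override
    public String getName() {
        return uri.toString();
    }

    @Override
    public SketchNamespace getNamespace() {
        return namespace;
    }

    @Override
    public <T> T getAdapter(Class<T> type) {
        // Hand out the wrapped URI only when the caller asks for it.
        return type.isInstance(uri) ? type.cast(uri) : null;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof UriBackedID)) {
            return false;
        }
        UriBackedID that = (UriBackedID) other;
        return uri.equals(that.uri)
                && namespace.getName().equals(that.namespace.getName());
    }

    @Override
    public int hashCode() {
        return 31 * namespace.getName().hashCode() + uri.hashCode();
    }
}
```

A client that only needs an opaque identifier would use getName() and
equals(); a client that needs URI-specific methods would call
getAdapter(URI.class) and work with the result. Swapping in a different
URI implementation later would then only touch the namespace and ID
classes, not the clients.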
This note is long enough so I'll stop there. Even if this doesn't make
sense for whatever reason we (ECF) will introduce support for
java.net.URI-based providers as well as emf.URI-based providers so that
URIs of either type can be used to construct IDs, and the underlying URI
can be accessed via the associated ID.
Thanks for reading. Now I suggest having a beverage of your choice
before responding...but then, please respond :).