Optimizing the storage will of course bring benefits. Why, for
instance, do we use XML at all? There are far more efficient ways to
store the data. Another way to reduce the pain would be to introduce
smarter protocols that would enable p2 queries to be sent to and
evaluated on a server rather than copying the "database" in its
entirety to the client. All such improvements, however, require
significant effort. And even if we do manage to reduce the
meta-data, it will still be three times larger than necessary for
the vast majority of users.
The simple solution that is feasible short-term is to maintain
separate repositories: one that provides only the latest release and
is also the default for the installer, and another that is the
composite, suitable for builds or for consumers that want older
releases. It would be easy to fix and be a huge win for a very large number of
On 02/27/2012 07:58 PM, Ian Bull wrote:
What about reducing the size of the data? As you know, the
content.jar contains multiple copies of the same license file.
We've also noticed that stripping whitespace (before jar'ing up
the XML file) has a positive impact on the resulting file size.
We haven't had the time to investigate this fully, but this
might be the best of both worlds that we need (keep the old
IUs, but improve the download time).
Of course we still have the same number of IUs (although I
have some fixes for our queries, hint hint, nudge nudge) that
might help too.
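The whitespace observation above can be checked with a small experiment. The following is only a rough sketch: `sample_metadata` builds a made-up, simplified stand-in for a content.xml (not the real p2 schema), and the regex assumes the metadata has no mixed content, comments or CDATA sections where whitespace between tags would matter.

```python
import re
import zlib


def strip_inter_tag_whitespace(xml_text: str) -> str:
    """Drop whitespace-only runs between a closing '>' and the next '<'.

    Assumption: no mixed content, comments or CDATA where such
    whitespace would be significant.
    """
    return re.sub(r">\s+<", "><", xml_text.strip())


def size_report(xml_text: str) -> dict:
    """Return raw and deflate-compressed sizes before and after stripping."""
    stripped = strip_inter_tag_whitespace(xml_text)
    original_bytes = xml_text.encode("utf-8")
    stripped_bytes = stripped.encode("utf-8")
    return {
        "raw": len(original_bytes),
        "raw_stripped": len(stripped_bytes),
        "deflated": len(zlib.compress(original_bytes, 9)),
        "deflated_stripped": len(zlib.compress(stripped_bytes, 9)),
    }


def sample_metadata(units: int = 200) -> str:
    """Build a hypothetical, pretty-printed metadata document for testing."""
    rows = []
    for i in range(units):
        rows.append(
            f"    <unit id='example.bundle.{i}' version='1.0.{i}'>\n"
            f"      <property name='name' value='Example {i}'/>\n"
            f"    </unit>\n"
        )
    return "<repository>\n  <units>\n" + "".join(rows) + "  </units>\n</repository>\n"


if __name__ == "__main__":
    for key, value in size_report(sample_metadata()).items():
        print(f"{key:>18}: {value} bytes")
```

Running the same report against a real content.xml extracted from a content.jar would show whether the gain survives compression for actual metadata.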
On Sun, Feb 26, 2012 at 1:39 PM, Thomas Hallgren <thomas@xxxxxxx> wrote:
David, I think there's a great need for the
"once a URL works for something, it should always work"
notion. But I'm fairly convinced that its main use is
in the build domain and that a very small percentage of the
day-to-day users really care about older releases.
Optimization in p2 sounds good but it doesn't really help my
use-case. I will still have to wait ten minutes (sounds
long? It's not at all uncommon) for the first feature to
A p2 repository is primarily designed for ease of install.
Right now we're hampering the technology in a bad way in an
attempt to provide a "one size fits all". We need to rethink
that approach and instead let the default be the easy
lightweight thing that is what the vast majority needs.
It seems to me that providing the release train in two
shapes would be an easy thing to do. And it's not just the
release train, b.t.w. Many other projects, including the
Eclipse platform itself, use the same approach.
I guess I'm a bit tired of always waiting for those downloads,
well aware that about 70% of the data is completely
On 02/26/2012 07:46 PM, David M Williams wrote:
I've noticed that too ... now that we have three
releases at one URL! ...
bummer. And ... I seem to recall you have suggested
this same improvement
before, Thomas :) ... but please don't stop trying to
One counter view is that many people also want the "once a URL
works for something, it should always work" notion (i.e. repository
URLs are persistent, and "forever") ... for example, a product or
project or build might really need SR0 content, and they have
printed books or tutorials on "how to install their product" or on
doing maintenance on their SR0 based product, and it would stop
working if SR0 content moved to a ...releases/indigo-with-legacy URL.
(And, keep in mind, for whatever reason, they might really need some
very specific SR0 content constraint, and something would break for
them if users started
Plus, I suspect many people would think that p2 should
be in charge of p2
optimization, not the repository authors or creators
(my personal view is
that it could be a balance ... just stating a counter
example, I have heard rumors that p2 does not
currently take advantage of
the "timestamps" that are part of every p2 repository
something like that could be used (by client-side p2)
to better know "what
to try first" and only fetch more, older stuff, if the
most recent stuff
doesn't satisfy constraints?
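The "what to try first" idea above could look roughly like the sketch below. It is purely illustrative and not how p2 works today: `Child`, `resolve_newest_first` and the `load` callback are hypothetical names, and the sketch assumes the client can learn each child's timestamp cheaply, without downloading the child's full metadata.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Child:
    """A child of a composite repository (hypothetical model)."""
    location: str
    timestamp: int  # assumed to be known before the child is downloaded


def resolve_newest_first(
    children: Iterable[Child],
    load: Callable[[Child], Set[str]],
    wanted: str,
) -> Tuple[Optional[str], List[str]]:
    """Load children newest first and stop once `wanted` is found.

    Returns the location that satisfied the request (or None) together
    with the list of locations that actually had to be loaded.
    """
    loaded: List[str] = []
    for child in sorted(children, key=lambda c: c.timestamp, reverse=True):
        units = load(child)  # the expensive step: fetch and parse metadata
        loaded.append(child.location)
        if wanted in units:
            return child.location, loaded
    return None, loaded


if __name__ == "__main__":
    # Made-up content standing in for three service releases.
    content = {
        "SR0": {"old.feature/1.0.0"},
        "SR1": {"some.feature/1.0.1"},
        "SR2": {"some.feature/1.0.2"},
    }
    children = [Child("SR0", 1), Child("SR1", 2), Child("SR2", 3)]
    print(resolve_newest_first(children, lambda c: content[c.location], "some.feature/1.0.2"))
    print(resolve_newest_first(children, lambda c: content[c.location], "old.feature/1.0.0"))
```

In this model a request that the newest child satisfies costs one load, while a request for SR0-only content still falls back to loading every child.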
Another possible approach ... but kind of a stretch
... might be some
optimization from the opposite direction. We currently
our .../releases/indigo composite made up of
date-time stamp directory name. I have always
maintained these should not
be "published" and should not be considered "permanent"
... they might move
to archives some day, there could be times that even once
"final date-time sub-repo" has to change to correct an
IP issue, or,
perhaps be "added to" to provide security patches, or
But ... after all those qualifiers ... perhaps we
sub-repositories named such as SR0, SR1, SR2 that
would be "published" and
considered permanent, so those that really wanted only
the very specific
content could "manually" add those more specific
software sites to their
lists? This would mean, though, that each of those SRx
would need to be a composite, pointing to dated
repositories (that could move to archives, or be changed
"last minute") ... and the problem with that is that having
"too many indirect children" is already a problem for
some users (in a different way than what you are
describing, having to do with the frequency of HTTP
"round trip" requests).
Lastly, (in the least-likely-to-happen-anytime-soon
category) ... if we
could get people to make sure their version/qualifiers
only changed when it
was really required (i.e. their content substantially
changed) we might do
a better job of "pruning" the release repositories and
of changes. But this would take more work and effort
from each project
(both to produce the right stuff, and to test the
final result), and more
of an effort from the "common repo release engineer"
(currently, me :) to
maintain and quality-check the common (central)
I don't mean to dampen discussions of improvements ...
improvements are possible and would love to see some
and I would be willing
to help (well, a little, if effort is small :) ...
but, thought I'd
mention these counter arguments to simply "moving old
stuff to a different
URL". I think that'd cause more problems than it would
solve. My intuition
is that any substantial improvements will take some
Not to mention, I could be misunderstanding your
suggestion, so do please
keep the discussion going in any case, or, open a
cross-project bug, if you
think there is a problem that could be improved.
From: Thomas Hallgren<thomas@xxxxxxx>
To: P2 developer discussions<p2-dev@xxxxxxxxxxx>,
Date: 02/26/2012 08:23 AM
Subject: [p2-dev] A typical use-case
Sent by: p2-dev-bounces@xxxxxxxxxxx
1. I'm downloading and installing Eclipse Classic
2. I'm installing something from Indigo.
At step 2, I'm forced to wait while all children of the Indigo
release are downloaded. Why? I'm only interested in SR2.
The previous releases will just a) take a long time to download,
and b) consume a lot of memory when perused.
I know there are cases where someone would like to revert to a
previous install etc. and that all previous releases therefore
must be available, but what I can't understand is why this must
control the default behavior for the 99% of the users that never
need that. Why not let releases/indigo point to the last release
and have something like releases/indigo-with-legacy point to the
composite?
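As a rough illustration of the two-URL proposal, the sketch below derives both layouts from one list of children. The function name `split_layouts` and the child directory names are made up for the example, and it assumes child names sort chronologically; it says nothing about how the actual composite files would be written.

```python
from typing import Dict, List


def split_layouts(children: List[str]) -> Dict[str, List[str]]:
    """Map each proposed URL to the children it would expose.

    Assumption: child names sort chronologically (e.g. dated directories).
    """
    if not children:
        raise ValueError("at least one child repository is required")
    ordered = sorted(children)
    return {
        "releases/indigo": [ordered[-1]],          # latest release only
        "releases/indigo-with-legacy": ordered,    # full composite
    }


if __name__ == "__main__":
    # Hypothetical child names, one per service release.
    layouts = split_layouts(["child-2011-09", "child-2012-02", "child-2011-06"])
    for url, kids in layouts.items():
        print(url, "->", kids)
```

With this split, the default URL exposes a single child while builds that need older releases opt in to the larger composite.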
p2-dev mailing list
R. Ian Bull | EclipseSource Victoria | +1 250 477 7484