RE: [cross-project-issues-dev] Download stats and p2


Some good ideas have been raised that are worth discussing, and possibly 
worth (someone) implementing ... but the time to implement them is not 
now, at the end of a release. The timing is what's causing the resistant 
tone of some comments ... normally we all like to be very helpful to 
people who contribute to Eclipse. 

But I find myself -- I cannot believe I am saying this -- wanting to 
come to the defense of the p2 team. :) 

There's something about the tone of some comments that implies there was a 
"loss of function" introduced by p2, but I think that's unfair to the p2 
team. The hack used in update manager was just that, a hack, and just 
happened to work, and it was never planned for and never intended as an 
API, as far as I know (but, I was never part of that team, so others will 
correct me if I am wrong). It was a cool trick that worked by 
coincidence, and worked sufficiently for some (it never worked well for us 
in WTP, so we didn't use it, perhaps because we have more massive content 
in more, smaller pieces and/or installed more frequently, and/or have more 
users behind unusual redirection firewalls). 

But, the point I want to make is that update manager itself was so slow, 
and so unreliable, that I suspect a hack which introduced a little 
slowness or unreliability would not have been noticed in the noise. 

So, that's my concern ... just a lot of unknowns. But, now that p2 is 
faster and more reliable, perhaps you could contribute some performance 
tests to measure the effect of re-introducing the hack file. 

A few last thoughts ... I hope constructive (if not, just ignore me). 

It would be helpful if you studied and reported back how other 
provisioning systems provided statistics. My guess is that they do not, 
but instead rely on IT staff (server logs) and/or opt-in data collection. 

And, maybe you could help in contacting the mirror providers? For example, 
it would be helpful if you looked into contributing a report-back script 
that mirrors could run at the same time they run the one to pull from 
Eclipse. That one would push back logs, filtering out anything that was 
not a request for something with "org.eclipse". Maybe the mirror providers 
would resist ... maybe for good reason ... but maybe not. I think we are 
running on assumptions. (Someone correct me if it has been tried in the 
last few years.) And, the report-back stuff would still not be perfectly 
accurate, but you say you don't need absolute accuracy, just relative 
trends. [And, honestly, the "totals" from those mirrors would be 
interesting too, for other reasons, having to do with weighing which of 
the mirrors were doing us any good!]. An intriguing possibility about the 
mirror logs, is that since mirrors are selected based on geography, you'd 
get some additional geographic trends you didn't get before ... maybe you 
have a growing market in Brazil and didn't even know it? 
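
As a rough illustration of the report-back idea above, here is a minimal 
Python sketch of the filtering step. It assumes the mirror writes Apache 
common/combined style access logs, which nothing in this thread confirms; 
the function names and the sample file names are hypothetical. It keeps 
only request lines whose path contains "org.eclipse" and tallies them per 
file, which is enough for relative trends.

```python
import re

# Assumption: access log lines contain a quoted request such as
# "GET /some/path HTTP/1.1" (Apache common/combined log format).
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) [^"]*"')


def filter_eclipse_requests(lines, marker="org.eclipse"):
    """Yield only the log lines whose request path contains `marker`."""
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and marker in match.group(1):
            yield line


def count_by_artifact(lines, marker="org.eclipse"):
    """Count matching requests per file name (last path segment)."""
    counts = {}
    for line in filter_eclipse_requests(lines, marker):
        path = REQUEST_RE.search(line).group(1)
        name = path.rsplit("/", 1)[-1]
        counts[name] = counts.get(name, 0) + 1
    return counts


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = [
        '1.2.3.4 - - [12/Jun/2009:14:21:00 -0400] '
        '"GET /eclipse/plugins/org.eclipse.birt_2.5.0.jar HTTP/1.1" 200 1234',
        '1.2.3.4 - - [12/Jun/2009:14:21:01 -0400] '
        '"GET /debian/pool/main/foo.deb HTTP/1.1" 200 99',
    ]
    print(count_by_artifact(sample))
```

A mirror would only ever send back the filtered lines (or just the 
counts), so nothing about the other content it hosts leaves its servers.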

Lastly, perhaps most important, it has worked sometimes in the past to 
form a "cross project team" that, say, meets once a week working on the 
issue that affects multiple projects, solves the problem after a few 
months, and then dissolves. So, you can sign me up to help sort through 
some issues (if you think I can be of help) ... but, not until next 


"Wenfeng Li" <wli@xxxxxxxxxxx>
"Cross project issues" <cross-project-issues-dev@xxxxxxxxxxx>
06/12/2009 07:57 PM
RE: [cross-project-issues-dev] Download stats and p2
Sent by:

If a user uses "http://download.eclipse.org/releases/galileo" as the update 
site URL, wouldn't it require eclipse.org to be up already? Are we not 
introducing a new single point of failure?
Can p2 retrieve the feature definition from the domain where the site 
URL is, and then use mirrors to download the bundle binaries? There 
should be no redirect outside of the update site URL that lives behind 
firewalls, right?
Tracking update stats via update manager is important to the BIRT project 
to gauge adoption. While the absolute number might not be accurate, the 
trend can be used as an indication of adoption growth or decline. The BIRT 
project was able to get update manager download stats in 
Callisto/Europa/Ganymede. Update manager stats stopped only as of Ganymede 
SR1 (Oct. 2008), and Bugzilla 251907 was logged on October 23, 2008. 
From: cross-project-issues-dev-bounces@xxxxxxxxxxx [
mailto:cross-project-issues-dev-bounces@xxxxxxxxxxx] On Behalf Of John 
Sent: Friday, June 12, 2009 2:21 PM
To: Cross project issues
Subject: Re: [cross-project-issues-dev] Download stats and p2

I just want to emphasize how risky it is to be attempting this at this 
point in our release. The single point of failure problem is quite 
significant - not only making downloads from Galileo repositories 
impossible if eclipse.org is unresponsive, but also adding round-trips 
through the Ottawa-based Foundation servers which could add significant 
overhead to install times for users in China, for example. In practice 
very few public or corporate mirrors will figure out how to "turn off" 
this rerouting, especially in the short term near the release. This means 
some very large corporate mirror traffic will soon be redirected your way, 
which in the past never would have hit the foundation servers. You may get 
the stats you want, at the expense of people having to wait a few weeks 
after the release before downloads become usable (and completely breaking 
corporate mirrors that live behind firewalls that won't permit redirecting 

This "single file hack" was possible in Update Manager as well for 
gathering download stats, but it was never used in the Callisto/Europa 
simultaneous releases (just look for references to download.php in the 
site.xml for previous release train sites). So after three simultaneous 
releases with no such stats, I'm wondering why it is so urgent to attempt 
introducing this roughly a week before the release date. It seems to me if 
members of the Board have technical requirements, their best route is to 
fund developers to work on them early in the release cycle, rather than 
requesting last minute hacks that will put the entire release at risk. 


Wayne Beaton <wayne@xxxxxxxxxxx> 
Sent by: cross-project-issues-dev-bounces@xxxxxxxxxxx 
06/12/2009 02:51 PM 

Please respond to
Cross project issues <cross-project-issues-dev@xxxxxxxxxxx>

cross project issues <cross-project-issues-dev@xxxxxxxxxxx> 

[cross-project-issues-dev] Download stats and p2

Greetings all. We have a small problem. Actually, I guess that the 
problem is as big as you choose to decide it is...

The Eclipse Foundation tracks downloads that go through the download.php 


This includes things like the packages and direct downloads provided by 
projects (assuming that everybody is using the script in their download 

Downloads that occur through p2 do not go through this script. They go 
directly to our download server and to our mirrors. The mirrors do not 
(and arguably cannot reasonably) provide us with download stats.

So... if somebody, for example, downloads the "Eclipse IDE for PHP 
Developers" we will know that we have one more download of PDT. If they 
instead download the "Eclipse IDE for Java Developers" and then use p2 
to add PDT to their configuration, we currently do not have any way of 
tracking that download of PDT.

Inability to accurately track downloads is a huge concern for the 
Eclipse Board.

We have explored several mechanisms for tracking this download. 
Unfortunately, we've not been holding these conversations as publicly as 
I'd like, so I'll summarize them briefly below...

1. Get mirrors to give us their download stats. We could ask. But most 
will not give them to us. Besides, their logs probably contain 
information about everything they mirror, which will be way more 
information than we need. And it'll be a heck of a lot of information 
for our webmasters to weed through.

2. Add a plug-in that gathers information from p2 post install and sends 
that information to eclipse.org. Effectively, this is a call-home 
mechanism that will require some additional UI elements and considerable 
effort awfully late in our development cycle. Ultimately, it will 
require some kind of opt-in from the user; many of whom will refuse, 
leaving us with incomplete data. FWIW, we could use the UDC for this, 
but it has the same problem.

3. All p2 downloads go through eclipse.org. Denis is concerned that the 
download.php script and--to some degree--the rest of our infrastructure 
will not be able to scale to handle the volume that can potentially come 
from p2 downloads. FWIW, we're not increasing our bandwidth for Galileo; 
instead, we're depending very heavily on mirrors.

Bug 239668 [1] has been open for some time to discuss this issue.

We've decided that the best approach is something that we've been 
calling the "Single File Hack". In this hack, we configure the p2 
metadata (artifacts.xml) to send requests for some small subset of the 
files to eclipse.org. Ideally, we send requests for one plug-in or 
feature for each thing that we need to track. The number of files needs 
to be kept relatively small.
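
To make the routing idea concrete, here is a small Python sketch of the 
decision the hack boils down to. It is not p2 code and does not show the 
actual artifacts.xml syntax; the host names, the function name, and the 
file names are placeholders invented for illustration. A short list of 
tracked files is sent through the counting script, and everything else 
goes to a mirror.

```python
# Hypothetical base URLs; placeholders, not the real Foundation or mirror hosts.
TRACKED_BASE = "http://stats.example.org/download.php?file="
MIRROR_BASE = "http://mirror.example.net/eclipse"


def resolve_artifact_url(path, tracked_files):
    """Return the URL an artifact request would be sent to.

    `path` is the artifact's path within the repository and
    `tracked_files` is the (small) set of file names to be counted.
    """
    name = path.rsplit("/", 1)[-1]
    if name in tracked_files:
        # One countable request per tracked plug-in or feature.
        return TRACKED_BASE + path
    # Everything else is served by a mirror and is not counted.
    return MIRROR_BASE + path


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    tracked = {"org.eclipse.pdt_2.1.0.jar"}
    print(resolve_artifact_url("/plugins/org.eclipse.pdt_2.1.0.jar", tracked))
    print(resolve_artifact_url("/plugins/org.eclipse.jdt.core_3.5.0.jar", tracked))
```

The sketch also shows where the cost lies: every install that includes a 
tracked file depends on the one counting host being reachable, no matter 
how many mirrors are up.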

There are problems with this hack. For one, eclipse.org becomes a single 
point of failure for all downloads. Further, we will have to let 
organizations that mirror our downloads for internal consumption know 
how to turn it off.

What we're going to need from each project is the names of the files 
that we need to be tracking.

I'd love to hear your thoughts on this topic.


cross-project-issues-dev mailing list