Re: [cross-project-issues-dev] Download stats and p2


I just want to emphasize how risky it is to attempt this at this point in our release. The single point of failure problem is quite significant: it not only makes downloads from Galileo repositories impossible if eclipse.org is unresponsive, but also adds round-trips through the Ottawa-based Foundation servers, which could add significant overhead to install times for users in China, for example. In practice very few public or corporate mirrors will figure out how to "turn off" this rerouting, especially in the short term near the release. This means some very large corporate mirror traffic, which in the past never would have hit the Foundation servers, will soon be redirected your way. You may get the stats you want, at the expense of people having to wait a few weeks after the release before downloads become usable (and of completely breaking corporate mirrors that live behind firewalls that won't permit redirecting outside).

This "single file hack" was possible in Update Manager as well for gathering download stats, but it was never used in the Callisto/Europa simultaneous releases (just look for references to download.php in the site.xml for previous release train sites). So after three simultaneous releases with no such stats, I'm wondering why it is so urgent to attempt introducing this roughly a week before the release date. It seems to me that if members of the Board have technical requirements, their best route is to fund developers to work on them early in the release cycle, rather than requesting last-minute hacks that will put the entire release at risk.

John




Wayne Beaton <wayne@xxxxxxxxxxx>
Sent by: cross-project-issues-dev-bounces@xxxxxxxxxxx
06/12/2009 02:51 PM

Please respond to: Cross project issues <cross-project-issues-dev@xxxxxxxxxxx>
To: cross project issues <cross-project-issues-dev@xxxxxxxxxxx>
cc:
Subject: [cross-project-issues-dev] Download stats and p2





Greetings all. We have a small problem. Actually, I guess that the
problem is as big as you decide it is...

The Eclipse Foundation tracks downloads that go through the download.php
script:

http://www.eclipse.org/downloads/download.php?file=[...]

This includes things like the packages and direct downloads provided by
projects (assuming that everybody is using the script in their download
links).

Downloads that occur through p2 do not go through this script. They go
directly to our download server and to our mirrors. The mirrors do not
(and arguably cannot reasonably) provide us with download stats.

So... if somebody, for example, downloads the "Eclipse IDE for PHP
Developers" we will know that we have one more download of PDT. If they
instead download the "Eclipse IDE for Java Developers" and then use p2
to add PDT to their configuration, we currently do not have any way of
tracking that download of PDT.

Inability to accurately track downloads is a huge concern for the
Eclipse Board.

We have explored several mechanisms for tracking this download.
Unfortunately, we've not been holding these conversations as publicly as
I'd like, so I'll summarize them briefly below...

1. Get mirrors to give us their download stats. We could ask. But most
will not give them to us. Besides, their logs probably contain
information about everything they mirror, which will be way more
information than we need. And it'll be a heck of a lot of information
for our webmasters to weed through.

2. Add a plug-in that gathers information from p2 post install and sends
that information to eclipse.org. Effectively, this is a call-home
mechanism that will require some additional UI elements and considerable
effort awfully late in our development cycle. Ultimately, it will
require some kind of opt-in from the user, many of whom will refuse,
leaving us with incomplete data. FWIW, we could use the UDC for this,
but it has the same problem.

3. All p2 downloads go through eclipse.org. Denis is concerned that the
download.php script and--to some degree--the rest of our infrastructure
will not be able to scale to handle the volume that can potentially come
from p2 downloads. FWIW, we're not increasing our bandwidth for Galileo;
instead, we're depending very heavily on mirrors.

Bug 239668 [1] has been open for some time to discuss this issue.

We've decided that the best approach is something that we've been
calling the "Single File Hack". In this hack, we configure the p2
metadata (artifacts.xml) to send requests for some small subset of the
files to eclipse.org. Ideally, we send requests for one plug-in or
feature for each thing that we need to track. The number of files needs
to be kept relatively small.
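
To make the routing idea concrete, here is a minimal sketch in Python. It
is not the actual p2 or artifacts.xml mechanism; it only illustrates the
decision described above. A small set of tracked artifacts resolves to
the download.php script, and everything else resolves to a mirror. The
artifact id, mirror host, repository path, and the function name
artifact_url are all hypothetical, made up for illustration.

```python
from urllib.parse import urlencode

# Tracking script URL as given in this message.
TRACKING_SCRIPT = "http://www.eclipse.org/downloads/download.php"

# Hypothetical: one artifact id per thing we want to count.
# Kept deliberately small, as the text above requires.
TRACKED_IDS = {"org.example.pdt.feature"}


def artifact_url(artifact_id, version, mirror_base, repo_path):
    """Return the URL a client would fetch for one artifact.

    Tracked artifacts are sent through the tracking script on
    eclipse.org; all other artifacts go straight to the mirror.
    """
    path = f"{repo_path}/plugins/{artifact_id}_{version}.jar"
    if artifact_id in TRACKED_IDS:
        # safe="/" keeps the path readable in the query string.
        return f"{TRACKING_SCRIPT}?{urlencode({'file': path}, safe='/')}"
    return f"{mirror_base}{path}"


if __name__ == "__main__":
    mirror = "http://mirror.example.org"   # hypothetical mirror
    repo = "/releases/galileo"             # hypothetical repository path
    print(artifact_url("org.example.pdt.feature", "1.0.0", mirror, repo))
    print(artifact_url("org.example.other", "2.0.0", mirror, repo))
```

In this sketch only the tracked artifact ever touches eclipse.org, which
is also why an unreachable eclipse.org would block any install that
includes one of those artifacts.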

There are problems with this hack. For one, eclipse.org becomes a single
point of failure for all downloads. Further, we will have to let
organizations that mirror our downloads for internal consumption know
how to turn it off.

What we're going to need from each project is the names of the files
that we need to be tracking.

I'd love to hear your thoughts on this topic.

Wayne

[1] https://bugs.eclipse.org/bugs/show_bug.cgi?id=239668
_______________________________________________
cross-project-issues-dev mailing list
cross-project-issues-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/cross-project-issues-dev

