RE: [cross-project-issues-dev] Download stats and p2

Yes, I believe the content.jar and artifacts.jar are single points of 
failure, see for example bug 236939
https://bugs.eclipse.org/bugs/show_bug.cgi?id=236939

But some accommodation has been made for that; see, for example, bug 234515
https://bugs.eclipse.org/bugs/show_bug.cgi?id=234515

I'm assuming that the more requests are made to a single point, the greater 
the opportunity for failure, that is, the higher the probability of failure ... 
perhaps others know better than I; maybe it doesn't.

But it's not just the failure I'm concerned about, but also the 
bottleneck effect. I am sure there is some amount of increased traffic 
that eclipse.org would handle ... but I'm not sure now is the time to find 
out. 

Thanks, 





From:
"Wenfeng Li" <wli@xxxxxxxxxxx>
To:
"Cross project issues" <cross-project-issues-dev@xxxxxxxxxxx>, "Cross 
project issues" <cross-project-issues-dev@xxxxxxxxxxx>
Date:
06/13/2009 03:14 PM
Subject:
RE: [cross-project-issues-dev] Download stats and p2
Sent by:
cross-project-issues-dev-bounces@xxxxxxxxxxx



David
 
I am also concerned about whether any solution for collecting stats 
introduces a single point of failure.  Hence the question: if a user uses 
"http://download.eclipse.org/releases/galileo" as the update site URL, does 
p2 retrieve the feature metadata from download.eclipse.org?  If so, are we 
not introducing a single point of failure?  I am sorry you see this 
question as an unfair comment on the p2 team.  I was only checking whether 
my understanding of p2 is correct.
 
wenfeng

From: cross-project-issues-dev-bounces@xxxxxxxxxxx on behalf of David M 
Williams
Sent: Fri 6/12/2009 8:39 PM
To: Cross project issues
Subject: RE: [cross-project-issues-dev] Download stats and p2

Wenfeng,

Some good ideas have been raised that are worth discussing, and possibly
worth (someone) implementing ... but the time to implement them is not
now, at the end of a release. The timing is what's causing the resistive tone of
some comments ... normally we all like to be very helpful to people who
contribute to Eclipse.

But I find myself -- I cannot believe I am saying this -- wanting to
come to the defense of the p2 team. :)

There's something about the tone of some comments that implies there was a
"loss of function" introduced by p2, but I think that's unfair to the p2
team. The hack used in update manager was just that, a hack, and just
happened to work, and it was never planned for and never intended as an
API, as far as I know (but, I was never part of that team, so others will
correct me if I am wrong). It was a cool trick that worked by
coincidence, and worked well enough for some (it never worked well for us
in WTP, so we didn't use it, perhaps because we have more content split
into more, smaller pieces, and/or are installed more frequently, and/or
have more users behind unusual redirecting firewalls).

But, the point I want to make is that update manager itself was so slow,
and so unreliable, that I suspect a hack which introduced a little
slowness or unreliability would not have been noticed in the noise.

So, that's my concern ... just a lot of unknowns. But, now that p2 is
faster and more reliable, perhaps you could contribute some performance
tests to measure the effect of re-introducing the hack file.

A few last thoughts ... I hope constructive (if not, just ignore me).

It would be helpful if you studied and reported back how other
provisioning systems provided statistics. My guess is that they do not,
but instead rely on IT staff (server logs) and/or opt-in data collection.

And, maybe you could help in contacting the mirror providers? For example,
it would be helpful if you looked into contributing a report-back script
that mirrors could run at same time they run the one to pull from Eclipse.
That one would push back logs, filtering out anything that was not a
request for something with "org.eclipse" . Maybe the mirrors providers
would resist ... maybe for good reason ... but maybe not. I think we are
running on assumptions. (Someone correct me if it has been tried in the
last few years.) And, the report-back stuff would still not be perfectly
accurate, but you say you don't need absolute accuracy, just relative
trends. [And, honestly, the "totals" from those mirrors would be
interesting too, for other reasons, having to do with weighing which of
the mirrors were doing us any good!]. An intriguing possibility about the
mirror logs, is that since mirrors are selected based on geography, you'd
get some additional geographic trends you didn't get before ... maybe you
have a growing market in Brazil and didn't even know it?
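
A minimal sketch of what such a report-back filter might look like,
assuming the mirror writes Apache-style access logs. The function names,
the regular expression, and the choice to count only successful GET
requests are illustrative assumptions, not an existing Eclipse or mirror
script.

```python
import re
from collections import Counter

# Assumption: access logs use the Apache common/combined log format,
# e.g. ... "GET /path/to/file.jar HTTP/1.1" 200 12345
REQUEST_RE = re.compile(r'"GET (?P<path>\S+) [^"]*" (?P<status>\d{3})')


def filter_eclipse_requests(lines, marker="org.eclipse"):
    """Yield paths of successful GET requests whose path contains marker."""
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # not a GET request line, or not in the assumed format
        if match.group("status") != "200":
            continue  # ignore failed or redirected requests
        path = match.group("path")
        if marker in path:
            yield path


def count_downloads(lines, marker="org.eclipse"):
    """Count successful downloads per path, dropping everything else."""
    return Counter(filter_eclipse_requests(lines, marker))
```

A mirror could run something like count_downloads(open(logfile)) and
send back only the resulting counts, so nothing about the other
projects it mirrors would leave the mirror.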

Lastly, perhaps most important, it has worked sometimes in the past to
form a "cross project team" that, say, meets once a week working on the
issue that affects multiple projects, solves the problem after a few
months, and then dissolves. So, you can sign me up to help sort through
some issues (if you think I can be of help) ... but, not until next
release.

Thanks,












From:
"Wenfeng Li" <wli@xxxxxxxxxxx>
To:
"Cross project issues" <cross-project-issues-dev@xxxxxxxxxxx>
Date:
06/12/2009 07:57 PM
Subject:
RE: [cross-project-issues-dev] Download stats and p2
Sent by:
cross-project-issues-dev-bounces@xxxxxxxxxxx



If a user uses "http://download.eclipse.org/releases/galileo" as the update
site URL, wouldn't it require eclipse.org to be up already?  Are we not
introducing a new single point of failure?

Can p2 retrieve the feature definition from the domain where the site
URL is, and then use mirrors to download the bundle binaries?  There
should be no redirect outside of an update site URL that lives behind
a firewall, right?

Tracking update stats via update manager is important to the BIRT project
to gauge adoption. While the absolute number might not be accurate, the
trend can be used as an indication of adoption growth or decline.  The BIRT
project was able to get update manager download stats in
Callisto/Europa/Ganymede.  Update manager stats stopped only with Ganymede
SR1 (Oct. 2008), and Bugzilla 251907 was logged on October 23, 2008.

Wenfeng


From: cross-project-issues-dev-bounces@xxxxxxxxxxx [
mailto:cross-project-issues-dev-bounces@xxxxxxxxxxx] On Behalf Of John
Arthorne
Sent: Friday, June 12, 2009 2:21 PM
To: Cross project issues
Subject: Re: [cross-project-issues-dev] Download stats and p2


I just want to emphasize how risky it is to be attempting this at this
point in our release. The single point of failure problem is quite
significant - not only making downloads from Galileo repositories
impossible if eclipse.org is unresponsive, but also adding round-trips
through the Ottawa-based Foundation servers which could add significant
overhead to install times for users in China, for example. In practice
very few public or corporate mirrors will figure out how to "turn off"
this rerouting, especially in the short term near the release. This means
some very large corporate mirror traffic will soon be redirected your way,
which in the past never would have hit the foundation servers. You may get
the stats you want, at the expense of people having to wait a few weeks
after the release before downloads become usable (and completely breaking
corporate mirrors that live behind firewalls that won't permit redirecting
outside).

This "single file hack" was possible in Update Manager as well for
gathering download stats, but it was never used in the Callisto/Europa
simultaneous releases (just look for references to download.php in the
site.xml for previous release train sites). So after three simultaneous
releases with no such stats, I'm wondering why it is so urgent to attempt
introducing this roughly a week before the release date. It seems to me if
members of the Board have technical requirements, their best route is to
fund developers to work on them early in the release cycle, rather than
requesting last-minute hacks that will put the entire release at risk.

John




Wayne Beaton <wayne@xxxxxxxxxxx>
Sent by: cross-project-issues-dev-bounces@xxxxxxxxxxx
06/12/2009 02:51 PM


Please respond to
Cross project issues <cross-project-issues-dev@xxxxxxxxxxx>



To
cross project issues <cross-project-issues-dev@xxxxxxxxxxx>
cc

Subject
[cross-project-issues-dev] Download stats and p2









Greetings all. We have a small problem. Actually, I guess that the
problem is as big as you choose to decide it is...

The Eclipse Foundation tracks downloads that go through the download.php
script:

http://www.eclipse.org/downloads/download.php?file=[...]

This includes things like the packages and direct downloads provided by
projects (assuming that everybody is using the script in their download
links).

Downloads that occur through p2 do not go through this script. They go
directly to our download server and to our mirrors. The mirrors do not
(and arguably cannot reasonably) provide us with download stats.

So... if somebody, for example, downloads the "Eclipse IDE for PHP
Developers" we will know that we have one more download of PDT. If they
instead download the "Eclipse IDE for Java Developers" and then use p2
to add PDT to their configuration, we currently do not have any way of
tracking that download of PDT.

Inability to accurately track downloads is a huge concern for the
Eclipse Board.

We have explored several mechanisms for tracking this download.
Unfortunately, we've not been holding these conversations as publicly as
I'd like, so I'll summarize them briefly below...

1. Get mirrors to give us their download stats. We could ask. But most
will not give them to us. Besides, their logs probably contain
information about everything they mirror, which will be way more
information than we need. And it'll be a heck of a lot of information
for our webmasters to weed through.

2. Add a plug-in that gathers information from p2 post install and sends
that information to eclipse.org. Effectively, this is a call-home
mechanism that will require some additional UI elements and considerable
effort awfully late in our development cycle. Ultimately, it will
require some kind of opt-in from the user, many of whom will refuse,
leaving us with incomplete data. FWIW, we could use the UDC for this,
but it has the same problem.

3. All p2 downloads go through eclipse.org. Denis is concerned that the
download.php script and--to some degree--the rest of our infrastructure
will not be able to scale to handle the volume that can potentially come
from p2 downloads. FWIW, we're not increasing our bandwidth for Galileo;
instead, we're depending very heavily on mirrors.

Bug 239668 [1] has been open for some time to discuss this issue.

We've decided that the best approach is something that we've been
calling the "Single File Hack". In this hack, we configure the p2
metadata (artifacts.xml) to send requests for some small subset of the
files to eclipse.org. Ideally, we send requests for one plug-in or
feature for each thing that we need to track. The number of files needs
to be kept relatively small.
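
To make the idea concrete, here is a rough sketch of the routing
decision in Python. It is not p2 code and does not reflect the real
artifacts.xml format; the function name, the tracked path set, and the
mirror host are hypothetical. Only the download.php URL comes from the
description above.

```python
from urllib.parse import quote

# Tracking script mentioned earlier in this message.
STATS_SCRIPT = "http://www.eclipse.org/downloads/download.php?file="


def resolve_download_url(path, tracked_paths, mirror_base):
    """Pick where a single file is fetched from.

    path          -- file path within the repository, starting with "/"
    tracked_paths -- small set of paths routed through eclipse.org so
                     that they are counted (hypothetical input)
    mirror_base   -- base URL of the mirror serving everything else
    """
    if path in tracked_paths:
        # Counted download: goes through the Foundation's script.
        return STATS_SCRIPT + quote(path)
    # Uncounted download: served directly by the mirror.
    return mirror_base.rstrip("/") + path
```

With one tracked file per project, every install makes at least one
request to eclipse.org, while the bulk of the bytes still comes from
the mirrors.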

There are problems with this hack. For one, eclipse.org becomes a single
point of failure for all downloads. Further, we will have to let
organizations that mirror our downloads for internal consumption know
how to turn it off.

What we're going to need from each project is the names of the files
that we need to be tracking.

I'd love to hear your thoughts on this topic.

Wayne

[1]https://bugs.eclipse.org/bugs/show_bug.cgi?id=239668
_______________________________________________
cross-project-issues-dev mailing list
cross-project-issues-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/cross-project-issues-dev