On 05/24/2013 06:29 AM, Oberhuber, Martin wrote:
same here, can't access hudson. IMHO this happens too often.
+1
I’m really wondering what is special about the Eclipse.org
infrastructure that causes these failures
After discussions with the owners of CI infrastructures of similar
and larger scale, the answer is that the CI system itself was not
originally designed with this type of scalability in mind. Since
our CI system is open to the world, it's exposed to the Anonymous
User and all the search engines out there that constantly fetch
content (like build artifacts, logs, and such) which use up valuable
threads.
As I write this, one IP address in France has been reloading the
/hudson home page continuously for the last 5 hours, and of the last
5000 page requests, it accounts for 2300. That consumes valuable
threads, memory and CPU cycles. I've blocked that IP.
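For anyone who wants to run the same kind of check on their own server, here is a minimal sketch of how a per-client tally over the most recent requests could be computed. It assumes an access log whose first whitespace-separated field is the client IP (as in the common log format); the function name `top_clients`, the sample log lines, and the documentation-range IP addresses are made up for illustration and are not taken from the Eclipse.org setup.

```python
from collections import Counter, deque


def top_clients(lines, window=5000, top=3):
    """Return the busiest client IPs among the last `window` log lines."""
    # A bounded deque keeps only the newest `window` entries.
    recent = deque(lines, maxlen=window)
    # Assumption: the client IP is the first whitespace-separated field.
    counts = Counter(
        line.split(None, 1)[0] for line in recent if line.strip()
    )
    return counts.most_common(top)


# Hypothetical sample data, not real log entries.
sample = (
    ['192.0.2.10 - - "GET /hudson/ HTTP/1.1" 200'] * 3
    + ['198.51.100.7 - - "GET /hudson/job/x/ HTTP/1.1" 200']
)
print(top_clients(sample))
```

On a real server the `lines` argument could simply be an open log file, since the deque consumes any iterable and discards everything but the tail.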
… the current “single point of failure” is a problem. And
even if it doesn’t fail, with so many jobs on the initial
page it’s often very slow (categorization would help
probably).
There is categorization, and some time ago, the initial page would
default to the EPP view. But someone opened up a bug, and here we
are back to square one. We could implement caches, but often
committers like seeing information the second it is updated and
available.
I’m really looking forward to the HIPP initiative [1] announced by
webmaster, or whatever else can be done to improve robustness.
So am I. There are many things we could do to improve the One Big
CI for All, but in the end I'm still convinced HIPP will be a better
solution for us.
Thanh has already begun deploying HIPP for LTS and other Eclipse
Working Groups. I suspect he will begin working on the Eclipse side
soon.
Thanks for your patience.
Denis