Greetings,
In light of the unusual site outages we've experienced recently, I
thought it would be good to explain them, as well as outline the
corrective steps we've taken to prevent further service disruptions.
Tuesday, Sept. 4: www.eclipse.org was hit by a distributed
denial-of-service (DDoS) attack on our forums. Many IP addresses
were requesting forum pages that take several seconds to load.
This strained our database servers and eventually caused our web
servers to exhaust available RAM and swap themselves into oblivion.
To help resolve the issue, we've restricted access to the forum
pages, implemented query timeouts on our database servers, decreased
the number of Apache handlers and increased the RAM available to the
www.eclipse.org servers. We'll also likely provision a fourth
www.eclipse.org server to better tolerate this type of DDoS activity
in the future.
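For anyone wondering why fewer Apache handlers helps: if every handler
can be tied up by a slow page at once, the handler count times the
memory each one uses has to fit in RAM, or the box starts swapping.
Here is a rough sketch of that arithmetic. The function name and all
the numbers are made up for illustration; they are not our actual
server settings.

```python
def max_handlers(total_ram_mb, reserved_mb, per_handler_mb):
    """Largest handler count whose combined memory still fits in RAM.

    All inputs are illustrative assumptions:
      total_ram_mb   - physical RAM on the web server
      reserved_mb    - RAM set aside for the OS, caches and other daemons
      per_handler_mb - worst-case resident memory of one busy handler
    """
    if per_handler_mb <= 0:
        raise ValueError("per_handler_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    if usable_mb <= 0:
        return 0
    # Round down: one handler too many is what pushes the box into swap.
    return usable_mb // per_handler_mb


# Hypothetical server: 8 GB RAM, 2 GB reserved, 64 MB per busy handler.
print(max_handlers(8192, 2048, 64))
```

Under those assumed numbers the cap works out to 96 handlers. Adding
RAM raises the cap, and lowering the handler limit keeps us under it,
which is why we did both.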
Monday, Sept. 10: eclipsecon.org exhausted its available RAM,
and swapped itself into oblivion. We've increased available RAM on
the server and will enable a second load-balanced server for
increased capacity and fault tolerance.
Tuesday, Sept. 11: www, download and a few other services
became unresponsive as the NFS server hosting the downloads/archives
crashed. Many of our sites, including www.eclipse.org, access the
downloads area to query file names and file sizes for the download
pages. The crash was caused by the same Linux kernel fault that
caused an outage on Feb. 15 [1]. We'll need to investigate further
what can be done to resolve this issue since, last I checked,
the kernel bug was still open.
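Independent of the kernel bug, one way a page could avoid hanging when
the NFS server stops answering is to put a time limit on the file-size
lookup and carry on without the size if it doesn't come back. This is
only a sketch of the idea, not how our download pages are written; the
function name, pool size and timeout are assumptions.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as LookupTimeout

# Shared pool (hypothetical size). A lookup stuck on a dead NFS mount
# keeps its thread busy, but the caller below still gets control back.
_pool = ThreadPoolExecutor(max_workers=4)


def file_size_or_none(path, timeout_s=2.0):
    """Return the size of `path` in bytes, or None if the lookup
    fails or does not finish within `timeout_s` seconds."""
    future = _pool.submit(os.stat, path)
    try:
        return future.result(timeout=timeout_s).st_size
    except (LookupTimeout, OSError):
        # Timed out, missing file, or I/O error: let the page render
        # without a size instead of blocking the web server handler.
        return None
```

A page would call file_size_or_none() for each file and simply leave
the size column blank on None. Note the limits of this approach: the
stuck threads are not freed until the mount recovers, so once the pool
is full every further lookup just times out.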
I apologize for the disruptions these outages have caused; we'll
work on improving the stability of these critical services.
Denis
[1]
http://dev.eclipse.org/mhonarc/lists/eclipse.org-committers/msg00879.html