RE: [cross-project-issues-dev] Build2 Offline?

I will second that. I don't think there is anything inherently unstable
about the Hudson slaves feature. We are running a 14-node cluster for our
Eclipse tooling efforts at Oracle. We have seen Hudson crashes, which
required a reboot of the entire cluster to correct, but they have all been
attributed to one of two things: (a) running out of disk space (especially
on the controller) and (b) unstable behavior of the source control system
plugin. We are running Perforce, so the exact same problems will not apply here,
but considering how many different SCM systems the Hudson installation
integrates with, I would look there first. What I've noticed
is that an SCM plugin can make the cluster controller unresponsive. Slaves
need a continuous responsive link to the controller (for streaming shell
output, for instance). Once the link becomes unresponsive, the slaves die
quickly. Sometimes, the controller will recover eventually, but slaves do
not. If the Hudson dash is still responsive, I try restarting dead slaves
first. That often takes care of the problem without restarting the entire

Oh and one more thing... We found that it works better for cluster stability
if the controller is not tasked with running heavy-duty jobs. We use
virtualization to segment the available hardware and only run builds on
slave nodes.
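
As a rough illustration of guarding against cause (a) above, here is a minimal Python sketch that reports when a volume is running low on free space. It is not part of Hudson. The function names, the 10% threshold, and the idea of running it periodically on the controller are all assumptions made for the example.

```python
import shutil


def free_fraction(total_bytes: int, free_bytes: int) -> float:
    """Return the fraction of the volume that is still free (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def is_low_on_disk(path: str, min_free: float = 0.10) -> bool:
    """Return True if the volume holding `path` has less than `min_free` free.

    The 10% default is an arbitrary illustrative threshold, not a Hudson setting.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return free_fraction(usage.total, usage.free) < min_free


if __name__ == "__main__":
    # "." stands in for whatever directory holds the build workspaces.
    if is_low_on_disk("."):
        print("WARNING: low disk space")
    else:
        print("disk space OK")
```

Run from cron or a similar scheduler, a check like this could give a warning before the controller fills up, rather than after the cluster has already gone down.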

For what it's worth...

- Konstantin

-----Original Message-----
From: cross-project-issues-dev-bounces@xxxxxxxxxxx
[mailto:cross-project-issues-dev-bounces@xxxxxxxxxxx] On Behalf Of David
Sent: Tuesday, June 01, 2010 6:40 AM
To: Cross project issues
Subject: Re: [cross-project-issues-dev] Build2 Offline?

If we are having issues, I'll suggest what we in the Eclipse community
always suggest to our users/adopters: file a bug against the Hudson
project itself, and work with the Hudson developers to address the situation.

Apache is using Hudson with one master server, and 14 slave machines:

So I would suspect that if there were major issues with Slaves, Apache
would be experiencing them as well. If we are having connection and
communication issues, the first place to start is the Forums:

The second is to open bug reports:

So let's work with the Hudson community to find out what is the cause of 
the issues.


On 05/31/2010 06:47 AM, Denis Roy wrote:
> So let me get this straight:
> - starting a Hudson Slave using SSH is problematic
> - starting a Hudson Slave with JNLP is problematic
> It's beginning to sound like Hudson Slaves are a great idea on paper, 
> but in the Real World they don't work.  Perhaps we would be better 
> served if the build2 Hudson Slave was simply a separate master server?
> I'm truly disappointed in how unstable and unpredictable Hudson is.
> Denis
> On 05/31/2010 09:39 AM, Webmaster(Matt Ward) wrote:
>> I've restarted the JNLP service, which looked like it was stuck.
>> -Matt.
>> Eike Stepper wrote:
>>> Hi,
>>> Is there a reason for build2 slave being offline?
>>> Cheers
>>> /Eike
>>> ----
>>> _______________________________________________
>>> cross-project-issues-dev mailing list
>>> cross-project-issues-dev@xxxxxxxxxxx
