multi-threading an IncrementalProjectBuilder?? [message #326954] |
Thu, 03 April 2008 12:17  |
Eclipse User |
I am a performance analyst for an adopting product. As the number of cores
and CPUs increases, we need to parallelize the expensive work our builders
do during a build job. That is, we want to be able to spin off
various jobs and then join with them. Unfortunately, I am finding this
very difficult because the build job holds the workspace rule.
I haven't been able to find ANY documentation or previous posts on this
topic.
Is there some recommended way to do this? It seems that as builders get
more complex, the demand for this type of mechanism would grow.
A few ideas I've had:
(1) Implement my own solution based on IJobManager.transferRule(..). In
this case I would anticipate transferring the workspace rule to and from
child jobs only when necessary to avoid deadlock. However, I really have
no experience using this method, so I don't even know if that could be
made to work.
(2) Implement my own scheduling rules that basically lie, since I know
that the build job will be holding the workspace rule. Obviously, though,
breaking the API contract is never a good idea. Also, coordinating rules
among my child jobs becomes a challenge (this reminds me of real
life--trying to keep track of who you tell the truth to and who you lie
to...).
(3) Use IJobManager.endRule(currentRule) in the build job right before I
try to join with the child jobs, then call
IJobManager.beginRule(currentRule) after the child jobs have completed. I
am really unsure about what things might kick off after I end the
workspace build rule--perhaps things I don't want.
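The concern in idea (3) can be illustrated with a small plain-JDK model. This is only a sketch under stated assumptions: a ReentrantLock stands in for the workspace rule, unlock/lock stand in for endRule/beginRule, and the class and method names are hypothetical, not Eclipse API. It shows that once the build thread gives up the lock before joining, an unrelated thread can acquire it in that window.

```java
import java.util.concurrent.locks.ReentrantLock;

// Plain-JDK model of idea (3); ReentrantLock stands in for the workspace rule.
// Nothing here is Eclipse API, and all names are hypothetical.
class ReleaseBeforeJoinSketch {

    // Returns true if an unrelated thread managed to take the lock while the
    // "build" thread was waiting for it to finish.
    static boolean outsiderGotIn(boolean releaseBeforeJoin) {
        final ReentrantLock workspace = new ReentrantLock();
        final boolean[] acquired = new boolean[1];
        workspace.lock(); // the build starts out holding the "rule"
        Thread outsider = new Thread(() -> {
            acquired[0] = workspace.tryLock();
            if (acquired[0]) {
                workspace.unlock();
            }
        });
        if (releaseBeforeJoin) {
            workspace.unlock(); // analogous to endRule before joining
        }
        outsider.start();
        try {
            outsider.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        if (releaseBeforeJoin) {
            workspace.lock(); // analogous to beginRule after the join
        }
        workspace.unlock();
        return acquired[0];
    }

    public static void main(String[] args) {
        System.out.println(outsiderGotIn(true));
        System.out.println(outsiderGotIn(false));
    }
}
```

In this model the outsider gets the lock only when the build thread released it before joining, which is exactly the window idea (3) worries about.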
Any comments/suggestions?
Re: multi-threading an IncrementalProjectBuilder?? [message #326982 is a reply to message #326974] |
Thu, 03 April 2008 16:39   |
Eclipse User |
Originally posted by: wharley.bea.com
"Tom Schindl" <tom.schindl@bestsolution.at> wrote in message
news:ft3av8$kl8$1@build.eclipse.org...
>I thought there was work done to make the eclipse builder use multi-core
>support in 3.4.
There has been ongoing work to make the Java *compiler* multi-core; I don't
know about the builder. However, the team has been finding that adding
processors does not help much because it very quickly becomes I/O-bound, in
part because the Java I/O libraries are not good at parallelizing access to
many small files (evidently they're better at dealing with synchronous
access to single large files, à la databases).
I suspect the OP might quickly run into the same problem. If compilation is
I/O bound, it's hard to imagine what other build steps wouldn't be.
It reminds me of the problems we ran into in the early 90s with a
distributed make project that an employer of mine at the time was working
on. Good idea in principle, but in practice getting the necessary files to
the right place at the right time was hard enough that it destroyed any
speedup due to parallel processing. Makes only *seem* parallelizable - in
practice, they're much more tightly coupled than one would think, because of
file dependencies.
I suppose that if a build process involved CPU-intensive processing of many
small but independent files, it could work? I'm having a hard time thinking
of an example; almost by definition, a small independent file doesn't
contain enough information to make it expensive to process. Elliptical
hashes?
Re: multi-threading an IncrementalProjectBuilder?? [message #327015 is a reply to message #326982] |
Fri, 04 April 2008 13:28   |
Eclipse User |
Walter Harley wrote:
> "Tom Schindl" <tom.schindl@bestsolution.at> wrote in message
> news:ft3av8$kl8$1@build.eclipse.org...
>>I thought there was work done to make the eclipse builder use multi-core
>>support in 3.4.
> There has been ongoing work to make the Java *compiler* multi-core; I don't
> know about the builder. However, the team has been finding that adding
> processors does not help much because it very quickly becomes I/O-bound, in
> part because the Java I/O libraries are not good at parallelizing access to
> many small files (evidently they're better at dealing with synchronous
> access to single large files, à la databases).
> I suspect the OP might quickly run into the same problem. If compilation is
> I/O bound, it's hard to imagine what other build steps wouldn't be.
> It reminds me of the problems we ran into in the early 90s with a
> distributed make project that an employer of mine at the time was working
> on. Good idea in principle, but in practice getting the necessary files to
> the right place at the right time was hard enough that it destroyed any
> speedup due to parallel processing. Makes only *seem* parallelizable - in
> practice, they're much more tightly coupled than one would think, because of
> file dependencies.
> I suppose that if a build process involved CPU-intensive processing of many
> small but independent files, it could work? I'm having a hard time thinking
> of an example; almost by definition, a small independent file doesn't
> contain enough information to make it expensive to process. Elliptical
> hashes?
Basically, we have multiple builders (well, actually multiple commands
inside a single Eclipse builder) that often need to operate on the same
model (EMF), thus I/O isn't performed each time. We are often CPU bound
and would like to improve response time by taking advantage of
parallelism. Shouldn't Eclipse, as a platform, provide a way to
parallelize some of the work during a build? It seems that requiring the
workspace to be locked on one thread during a build is very prohibitive to
builder implementations.
My ideal mechanism would allow me to set up various child jobs and then
have the build job schedule them and wait for them (join). The build job
would retain the workspace rule and prevent conflicting non-child jobs,
but would allow the child jobs access to the workspace rule--contending
for containing rules among themselves.
This could easily be made generic and put into the Jobs API, though I'd
guess it'd probably go against some sort of "keep it simple" effort.
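To make the shape of that mechanism concrete, here is a rough sketch using only java.util.concurrent. It is not the Jobs API and every name in it is hypothetical: an outer lock stands in for the workspace rule held by the build, per-project locks stand in for the contained rules the children contend for, and summing squares stands in for CPU-bound work on an in-memory model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

// Plain-JDK model only; none of these types or methods are Eclipse API.
class ChildJobBuildSketch {
    // Stands in for the workspace rule that the build job keeps for itself.
    private final ReentrantLock workspaceLock = new ReentrantLock();
    // Stand in for contained rules; children contend for these among themselves.
    private final Map<String, ReentrantLock> projectLocks = new ConcurrentHashMap<>();

    int build(Map<String, List<Integer>> modelByProject, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        workspaceLock.lock(); // keeps conflicting non-child work out
        try {
            List<Future<Integer>> children = new ArrayList<>();
            for (Map.Entry<String, List<Integer>> e : modelByProject.entrySet()) {
                Callable<Integer> child = () -> processProject(e.getKey(), e.getValue());
                children.add(pool.submit(child)); // fork
            }
            int total = 0;
            for (Future<Integer> child : children) {
                total += child.get(); // join
            }
            return total;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        } catch (ExecutionException ex) {
            throw new IllegalStateException(ex.getCause());
        } finally {
            workspaceLock.unlock();
            pool.shutdown();
        }
    }

    // Hypothetical CPU-bound step, guarded by the project's own lock.
    private int processProject(String project, List<Integer> values) {
        ReentrantLock lock = projectLocks.computeIfAbsent(project, k -> new ReentrantLock());
        lock.lock();
        try {
            int sum = 0;
            for (int v : values) {
                sum += v * v;
            }
            return sum;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> model = new HashMap<>();
        model.put("ProjectA", Arrays.asList(1, 2));
        model.put("ProjectB", Arrays.asList(3));
        System.out.println(new ChildJobBuildSketch().build(model, 2));
    }
}
```

The point of the two lock levels is that the parent never gives up the outer lock while the children run, so nothing outside the build can sneak in, yet the children still serialize against each other only where they touch the same project.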
Re: multi-threading an IncrementalProjectBuilder?? [message #327159 is a reply to message #327085] |
Fri, 11 April 2008 15:11   |
Eclipse User |
Philippe Mulet wrote:
> Randall Theobald wrote:
>> In order to do my own rule contention, I would really need to be able to
>> find out what rules a job currently owns (somehow get notified of
>> begin/end rule calls). I don't see any way to currently do this. Am I
>> wrong?
>>
>> I can basically count on the ISchedulingRule.contains(..) method getting
>> called on the base rules of my child jobs to tell me when it needs to
>> begin a rule, but the only way to know that it is done with a rule is to
>> listen for the job to be done (which, since that is potentially a long
>> time, will probably end up defeating the whole purpose of creating
>> parallel jobs since some actions require the workspace root).
>>
>> Any suggestions here?
>>
>>
> IMHO, changing the build manager to perform in parallel would likely be
> an Eclipse 4.0 evolution. Many builders are making assumptions on the
> current state of things.
> I would imagine improvements to the lock system as well, to allow
> concurrent readers etc... so until builders would actually write, they
> could read the same files in parallel.
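For reference, the "concurrent readers" idea in the quoted reply resembles what java.util.concurrent already offers for in-memory data. The following is only an analogy, not a description of how the workspace lock works or would work: it uses ReentrantReadWriteLock, and the class and method names are hypothetical. It probes, from a second thread, whether a read lock and a write lock can be taken while a first reader is active.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Analogy only: a JDK read/write lock, not the Eclipse workspace lock.
class ReadWriteProbeSketch {

    // result[0]: could a second reader get in while the first one is reading?
    // result[1]: could a writer get in while the first one is reading?
    static boolean[] probeWhileReading() {
        final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        final boolean[] result = new boolean[2];
        lock.readLock().lock(); // first reader (the calling thread)
        try {
            Thread other = new Thread(() -> {
                result[0] = lock.readLock().tryLock();
                if (result[0]) {
                    lock.readLock().unlock();
                }
                result[1] = lock.writeLock().tryLock();
                if (result[1]) {
                    lock.writeLock().unlock();
                }
            });
            other.start();
            other.join(); // join makes the other thread's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            lock.readLock().unlock();
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[] r = probeWhileReading();
        System.out.println("second reader admitted: " + r[0]);
        System.out.println("writer admitted: " + r[1]);
    }
}
```

In this model the second reader is admitted and the writer is not, which is the behaviour the quoted suggestion is asking for: builders share read access until one of them actually needs to write.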
What about just enhancing the Jobs API slightly to allow listeners to be
notified when a Job begins and ends a rule? The fact that I can't find out
when a job is done with a rule [when the rule gets popped off the
ThreadJob's rule stack] is really killing our efforts. Like I said above,
I am using the ISchedulingRule.contains(..) method to determine when a Job
needs a rule, but having explicit 'beginningRule'/'endingRule' methods on
the IJobChangeListener interface (or creating a new IJobRuleChangeListener
or something to not break implementing classes) would be cleaner.
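Roughly what I have in mind, as a self-contained toy rather than a patch against the real Jobs API. Everything below is hypothetical: the listener interface does not exist, rules are modelled as plain strings, and RuleStackSketch is just a stand-in for the per-job rule stack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical listener; this is the proposed shape, not an existing interface.
interface RuleChangeListener {
    void beginningRule(String jobName, String rule);
    void endingRule(String jobName, String rule);
}

// Toy model of one job's rule stack that reports every push and pop.
class RuleStackSketch {
    private final String jobName;
    private final Deque<String> rules = new ArrayDeque<>();
    private final List<RuleChangeListener> listeners = new ArrayList<>();

    RuleStackSketch(String jobName) {
        this.jobName = jobName;
    }

    void addListener(RuleChangeListener listener) {
        listeners.add(listener);
    }

    void beginRule(String rule) {
        rules.push(rule);
        for (RuleChangeListener l : listeners) {
            l.beginningRule(jobName, rule);
        }
    }

    void endRule(String rule) {
        // In this model, rules must be ended in the reverse order they began.
        if (!rule.equals(rules.peek())) {
            throw new IllegalArgumentException("rule is not on top of the stack: " + rule);
        }
        rules.pop();
        for (RuleChangeListener l : listeners) {
            l.endingRule(jobName, rule);
        }
    }

    // Small demo: records the notifications a coordinator would see.
    static List<String> demo() {
        final List<String> log = new ArrayList<>();
        RuleStackSketch job = new RuleStackSketch("child-1");
        job.addListener(new RuleChangeListener() {
            @Override
            public void beginningRule(String jobName, String rule) {
                log.add("begin " + jobName + " " + rule);
            }

            @Override
            public void endingRule(String jobName, String rule) {
                log.add("end " + jobName + " " + rule);
            }
        });
        job.beginRule("/ProjectA");
        job.beginRule("/ProjectA/src");
        job.endRule("/ProjectA/src");
        job.endRule("/ProjectA");
        return log;
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

With notifications like these, a coordinator could hand a rule to the next waiting child as soon as the current holder pops it, instead of waiting for the whole job to finish.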
Re: multi-threading an IncrementalProjectBuilder?? [message #327405 is a reply to message #327241] |
Thu, 17 April 2008 13:15  |
Eclipse User |
I opened Bug 227025 to address this. However, I have also noted in the bug
that a more general approach to a multi-threaded build strategy would be
needed as the AutoBuildJob doesn't allow resource modifications outside of
its thread without interruption and rescheduling, which can be prohibitive
in its cost (due to other jobs, resource notifications, etc.).