Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
[platform-dev] How to best use JobGroups in a recursive task?

Hi all,

In the context of https://wiki.eclipse.org/E4/UI/Smart_Import , the main operation that crawls the filesystem to identify patterns and then configure Eclipse projects.
I believe that, when on a directory, crawling and configuring direct children can be treated in parallel. I was thinking of using JobGroups to enable parallelism for this task, and more specifically, I was thinking that I could simply make this loop parallel: http://git.eclipse.org/c/e4/org.eclipse.e4.ui.git/tree/bundles/org.eclipse.e4.ui.importer/src/org/eclipse/ui/internal/wizards/datatransfer/OpenFolderCommand.java#n158 . It's the one that iterates on children.
However, since it's in a recursive operation, I don't believe I can assign 1 job for each iteration, or it will assign all thread to "parent" tasks, without leaving some for children. Example:

Imagine this filesystem structure:
* P
** P1
*** P1a
*** P1b
** P2
*** P2a
*** P2b
** P3
*** P3a
....
And that I use 2 threads to throttle my jobs. Then those 2 threads will be assigned to the job responsible of crawling P1 and P2, and when trying to work on P3, P1a, P1b, P2a or P2b, the new job for this branch will wait for a new thread forever.
I'm not a parallelism expert, so that may be a dumb question, but how can I take benefit from throttling in a recursive algorithm?
A naive implementation would be to only parallelize the execution for the 1st-level branches, so that P1 // P2 // P3, and keep the subdirectories on the same thread, but this is obviously sub-optimal in case P1 and P2 finish quite fast, sub-jobs of P3 should be allowed to use the additional thread.

Cheers,
--
Mickael Istria
Eclipse developer at JBoss, by Red Hat
My blog - My Tweets

Back to the top