Re: [p2-dev] query performance
Hi Thomas,
Thank you for analyzing the bottleneck in my query.
RequiredCapability doesn't support a 'namespace' member, so I used the
query expression below for testing:
"$0.traverse(set(), _, { cache, parent |
parent.requirements.unique(cache).collect(rc | select(iu | iu ~=
rc)).flatten()})"
However, the result is surprising: it takes much more time to query a
large number of IUs. I updated the document and benchmark spreadsheet
listed below. What do you think?
Thanks.
Mengxin Zhu
On 08/09/2011 04:04 PM, Thomas Hallgren wrote:
Hi Mengxin,
I took a look at your query. Here are some comments and hints that
might help you speed things up:
1. You use the method toSet() on the result instead of
toUnmodifiableSet(). This will yield an extra (and probably
unnecessary) copy. Please use toUnmodifiableSet() where possible.
2. You create an array from the incoming collection before you pass it
to the query. You can avoid this extra copy by passing the Collection
directly.
3. The statement:
"select(iu | $0.exists(iu2 | iu2.requirements.exists(r | iu ~= r )))"
suggests that you want to find all IUs that are required by some IU
in the incoming collection. That's a one-step traversal. All those new
IUs will introduce new requirements, and in order to find them all,
the way the planner does, you must continue evaluating this query until
no more units are found. A better way to resolve this is to use a
traverse query:
"$0.traverse(parent | parent.requirements.collect(rc | select(iu | iu
~= rc)).flatten())"
If $0 is a large collection then it's likely that an initial 'unique'
of all relevant requirements will improve performance significantly:
"$0.traverse(set(), _, { cache, parent |
parent.requirements.unique(cache).collect(rc | select(iu | iu ~=
rc)).flatten()})"
To really speed things up, you might also want to prune the unique
list of requirements to only include those that have the desired
namespace:
select(rc | rc.namespace == 'org.eclipse.equinox.p2.iu').
"$0.traverse(set(), _, { cache, parent |
parent.requirements.unique(cache).select(rc | rc.namespace ==
'org.eclipse.equinox.p2.iu').collect(rc | select(iu | iu ~=
rc)).flatten()})"
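The difference between the one-step select and the traverse with a 'unique' cache can be illustrated outside of p2 with plain Java collections. This is only a rough sketch under simplifying assumptions: the class TraverseSketch, its methods, and the model of a repository as a map from a unit id to the ids it requires are all hypothetical, and a requirement is treated as matching exactly the unit with the same id. Including the roots in the traversal result is a choice of this sketch, not a statement about the p2 implementation.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model: repo maps a unit id to the ids of the units it requires.
class TraverseSketch {

    // One-step lookup: only the units directly required by the roots.
    static Set<String> oneStep(Collection<String> roots, Map<String, List<String>> repo) {
        Set<String> found = new TreeSet<>();
        for (String root : roots) {
            for (String req : repo.getOrDefault(root, List.of())) {
                if (repo.containsKey(req)) {
                    found.add(req);
                }
            }
        }
        return found;
    }

    // Transitive traversal: keep following requirements until no new units
    // appear. seenRequirements plays the role of the 'unique' cache, so each
    // distinct requirement is matched against the repository only once.
    static Set<String> traverse(Collection<String> roots, Map<String, List<String>> repo) {
        Set<String> seenRequirements = new HashSet<>();
        Set<String> visited = new TreeSet<>(roots);
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String parent = work.pop();
            for (String req : repo.getOrDefault(parent, List.of())) {
                if (!seenRequirements.add(req)) {
                    continue; // this requirement was already resolved
                }
                if (repo.containsKey(req) && visited.add(req)) {
                    work.push(req); // newly found unit: visit its requirements too
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, List<String>> repo = new HashMap<>();
        repo.put("a", List.of("b"));
        repo.put("b", List.of("c"));
        repo.put("c", List.of("a")); // a cycle must not loop forever
        repo.put("d", List.of());
        System.out.println(oneStep(List.of("a"), repo));
        System.out.println(traverse(List.of("a"), repo));
    }
}
```

In this model the one-step lookup stops at the direct requirements, while the traversal also reaches the units required by those, and the requirement cache keeps shared or cyclic requirements from being resolved more than once.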
If you try this out, please publish your results.
HTH,
Thomas Hallgren
On 2011-08-09 09:13, Mengxin Zhu wrote:
I find that query-language performance degrades badly when querying a
repository with a large number of IUs. I'm not sure whether this is a
common case, but it is in mine.
I already have a list of non-installed root and group IUs, and I want
to query the repository for the non-installed IUs that are required by
those root and group IUs.
I compared three different methods for querying different numbers of
IUs: using the provisioning planner to resolve and query the required
IUs, using the query language, and using a for loop.
I have published my methods as a document [1] and the query benchmark
as a spreadsheet [2].
I actually prefer the query language, since the code looks much
cleaner. Does anybody know why the query language is so slow at
handling a large number of IUs, or how to tune my query expression?
[1]
https://docs.google.com/document/d/1wfnr2d2TF4vIYDCMmWPuYd0kQA32WiWaXTiaCoJovho/edit
[2]
https://spreadsheets.google.com/spreadsheet/ccc?key=0AmxBoq-n1R8KdEZ4czdpQk9lMEpvR3pUbzZaZzltTGc
_______________________________________________
p2-dev mailing list
p2-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/p2-dev