Eclipse Community Forums: BIRT » Data cubes and performance issues

Home » Archived » BIRT » Data cubes and performance issues(How can I filter the datacube to avoid computing aggregates which will never be used)

Show: Today's Messages :: Show Polls :: Message Navigator

Data cubes and performance issues [message #901303]

Fri, 10 August 2012 16:46

ing omini

Messages: 26
Registered: March 2012

Junior Member

Hi,
I have a perfomance issue with a BIRT report (Eclipse BIRT Designer Version 2.3.2.r232_20090202) which is putting a crosstab for each group of a list.

I cannot upload the original report because it contains sensitive data, but the following conceptual example is very similar:

1) I have a "control" dataset DS_COU containing:
"country","filter_column"
This dataset is parametric on "filter column" and only a few countries will be selected.

2) Another dataset DS_POPUL used by datacube contains:
"country","city","genre"(whose values are M or F);"population"

3) A datacube is built over DS_POPUL

3) the report will iterate over a list of countries returned by DS_COU dataset (NOT ALL countries but only a small subset!)

4) for each item of the list, a crosstab must be created using the cube built over DS_POPUL (but also other datasets will be used for other purposes)

Example:

------
SPAIN
------
CITY, MALE, FEMALE
Madrid, 10.000, 12.000
Barcelona, 8.000, 9.000

------
FRANCE
------
CITY, MALE, FEMALE
Paris, 5.000, 8.000

Please note that:
- my datacube is built upon a large dataset (millions of rows), but at the execution time only a small subset of it will be used.
- computing the aggregation for all countries cannot be done for performance issues.
- the datacube query would be very fast if filtered only for the current item of country list

Correct me if I am wrong. The sequence is:
1) by checking active sessions on Oracle, it seems that the datacube is built at the beginning, before iterating on the list, and that it's built over all possible combinations
2) crosstabs can be instantiated basing on the datacube
3) dynamic filters (basing, for example, on the current "country") can be applied

Now, my question is: is it possible to build the datacube only for the countries contained in the list (that is, on the result of parametric dataset DS_COU)?
Or in alternative, to build it N times, for each country?

Thanks in advance for helping me!

Report message to a moderator

Re: Data cubes and performance issues [message #901622 is a reply to message #901303]

Mon, 13 August 2012 17:14

Jason Weathersby

Messages: 9167
Registered: July 2009

Senior Member

If your country select list is a parameter why not cull your DS_POPUL by
using the parameter in a where clause (Let db filter it, it will be
faster). Look at the attached example that filters a dataset based on
parameter values and modifies the query in the beforeOpen script of the
dataset. If you build your cube off of this, it should be much quicker.
Then in your grouping of crosstabs you can filter the crosstab based
on the a particular country.

Jason

On 8/10/2012 12:46 PM, ing omini wrote:
> Hi,
> I have a perfomance issue with a BIRT report (Eclipse BIRT Designer
> Version 2.3.2.r232_20090202) which is putting a crosstab for each group
> of a list.
>
> I cannot upload the original report because it contains sensitive data,
> but the following conceptual example is very similar:
>
> 1) I have a "control" dataset DS_COU containing:
> "country","filter_column"
> This dataset is parametric on "filter column" and only a few countries
> will be selected.
>
> 2) Another dataset DS_POPUL used by datacube contains:
> "country","city","genre"(whose values are M or F);"population"
>
> 3) A datacube is built over DS_POPUL
>
> 3) the report will iterate over a list of countries returned by DS_COU
> dataset (NOT ALL countries but only a small subset!)
>
> 4) for each item of the list, a crosstab must be created using the cube
> built over DS_POPUL (but also other datasets will be used for other
> purposes)
>
> Example:
>
> ------
> SPAIN
> ------
> CITY, MALE, FEMALE
> Madrid, 10.000, 12.000
> Barcelona, 8.000, 9.000
>
> ------
> FRANCE
> ------
> CITY, MALE, FEMALE
> Paris, 5.000, 8.000
>
>
> Please note that:
> - my datacube is built upon a large dataset (millions of rows), but at
> the execution time only a small subset of it will be used.
> - computing the aggregation for all countries cannot be done for
> performance issues.
> - the datacube query would be very fast if filtered only for the current
> item of country list
>
> Correct me if I am wrong. The sequence is: 1) by checking active
> sessions on Oracle, it seems that the datacube is built at the
> beginning, before iterating on the list, and that it's built over all
> possible combinations
> 2) crosstabs can be instantiated basing on the datacube
> 3) dynamic filters (basing, for example, on the current "country") can
> be applied
>
> Now, my question is: is it possible to build the datacube only for the
> countries contained in the list (that is, on the result of parametric
> dataset DS_COU)?
> Or in alternative, to build it N times, for each country?
>
>
> Thanks in advance for helping me!
>

Attachment: MultiselectParameterQueryMod.zip
(Size: 2.58KB, Downloaded 220 times)

Report message to a moderator

Re: Data cubes and performance issues [message #901623 is a reply to message #901303]

Mon, 13 August 2012 17:14

Jason Weathersby

Messages: 9167
Registered: July 2009

Senior Member

Attachment: MultiselectParameterQueryMod.zip
(Size: 2.58KB, Downloaded 215 times)

Report message to a moderator

Re: Data cubes and performance issues [message #901624 is a reply to message #901303]

Mon, 13 August 2012 17:14

Jason Weathersby

Messages: 9167
Registered: July 2009

Senior Member

Attachment: MultiselectParameterQueryMod.zip
(Size: 2.58KB, Downloaded 210 times)

Report message to a moderator

Re: Data cubes and performance issues [message #901625 is a reply to message #901303]

Mon, 13 August 2012 17:14

Jason Weathersby

Messages: 9167
Registered: July 2009

Senior Member

Attachment: MultiselectParameterQueryMod.zip
(Size: 2.58KB, Downloaded 233 times)

Report message to a moderator

Re: Data cubes and performance issues [message #901626 is a reply to message #901303]

Mon, 13 August 2012 17:14

Jason Weathersby

Messages: 9167
Registered: July 2009

Senior Member

Attachment: MultiselectParameterQueryMod.zip
(Size: 2.58KB, Downloaded 211 times)

Report message to a moderator

Re: Data cubes and performance issues [message #901627 is a reply to message #901303]

Mon, 13 August 2012 17:14

Jason Weathersby

Messages: 9167
Registered: July 2009

Senior Member

Attachment: MultiselectParameterQueryMod.zip
(Size: 2.58KB, Downloaded 214 times)

Report message to a moderator

Re: Data cubes and performance issues [message #901628 is a reply to message #901303]

Mon, 13 August 2012 17:14

Jason Weathersby

Messages: 9167
Registered: July 2009

Senior Member

Attachment: MultiselectParameterQueryMod.zip
(Size: 2.58KB, Downloaded 207 times)

Report message to a moderator

Re: Data cubes and performance issues [message #901630 is a reply to message #901303]

Mon, 13 August 2012 17:14

Jason Weathersby

Messages: 9167
Registered: July 2009

Senior Member

Attachment: MultiselectParameterQueryMod.zip
(Size: 2.58KB, Downloaded 215 times)

Report message to a moderator

Re: Data cubes and performance issues [message #901632 is a reply to message #901303]

Mon, 13 August 2012 17:14

Jason Weathersby

Messages: 9167
Registered: July 2009

Senior Member

Attachment: MultiselectParameterQueryMod.zip
(Size: 2.58KB, Downloaded 314 times)

Report message to a moderator

Re: Data cubes and performance issues [message #901633 is a reply to message #901303]

Mon, 13 August 2012 17:14

Jason Weathersby

Messages: 9167
Registered: July 2009

Senior Member

Attachment: MultiselectParameterQueryMod.zip
(Size: 2.58KB, Downloaded 211 times)

Report message to a moderator

Re: Data cubes and performance issues [message #909093 is a reply to message #901622]

Thu, 06 September 2012 14:19

ing omini

Messages: 26
Registered: March 2012

Junior Member

Report generation times are still not so good (dataset logic is very complex and the raw records are so many!), but at least hardcoding parameters within a script, as suggested by you, has improved a little the situation.
Thank you again, Jason!

Report message to a moderator

Previous Topic:	scale on Y axis
Next Topic:	IRunTask run output to a database

Goto Forum:

-=] Back to Top [=-

[ Syndicate this forum (XML) ] [

]

Current Time: Thu Apr 25 05:32:42 GMT 2024

.:: Contact :: Home ::.

Breadcrumbs

Sign up to our Newsletter