Eclipse Community Forums
performance issue when sorting a group [message #368353] Fri, 22 May 2009 17:27
Mike
We have found that several of our reports take disproportionately longer to
complete as the data set size increases; the growth in run time looks closer
to exponential than linear. What these reports have in common is a table
with groups that have a sort key.

One of our reports, with a data set of 200k rows, completes in under
10 minutes without sorting. This time includes rendering the report
to a PDF. However, adding a sort key increases that time to over 18
hours!

While profiling our Java application (which embeds the BIRT engine), we found
that the majority of the time is spent in
GroupInformationUtil.mergeTwoGroupBoundaryInfoGroups().

I have a very simple test case that I can provide. It includes a flat
file that is used as a datasource and two .rptdesign files, one with a
sort key and one without. In my Eclipse environment, viewing the unsorted
report as a PDF takes approximately 30 seconds. Attempting to view the
sorted report runs indefinitely.

This seems to indicate a performance issue in BIRT with sorting data sets.
Has anyone else experienced similar performance problems?
Re: performance issue when sorting a group [message #368371 is a reply to message #368353] Tue, 26 May 2009 15:09
Eclipse User
Originally posted by: jasonweathersby.windstream.net

Mike,

Can you open a bugzilla entry for this and provide your test case?

Jason

Re: performance issue when sorting a group [message #368376 is a reply to message #368371] Tue, 26 May 2009 16:00
Mike
I have opened a bug and attached the test case.

https://bugs.eclipse.org/bugs/show_bug.cgi?id=277885

- mike
Re: performance issue when sorting a group [message #368432 is a reply to message #368371] Thu, 28 May 2009 19:35
Mike
It looks like the bug I filed has been marked for the 2.5 release, which
we will not be able to upgrade to before our product is released. Is
there any workaround for the problem?

From what we have found, the issue seems to happen when BIRT writes data to
disk. Looking at the BIRT DataEngine class, it appears that there are some
constants whose values tell BIRT how it should manage data sets.

Is it possible to set value(s) for these constants in the appContext such
that BIRT will not use the disk cache? Or at least increase the amount of
data or memory BIRT will use for data sets? This would greatly reduce
the impact of this bug on our customers.
Re: performance issue when sorting a group [message #368443 is a reply to message #368432] Fri, 29 May 2009 15:15
Eclipse User
Originally posted by: jasonweathersby.windstream.net

Mike,

Take a look at these two threads. Not certain if it will help though.

http://www.birt-exchange.org/forum/eclipse-birt-newsgroup-mirror/13813-birt-memory-consumption-large-dataset.html
http://www.birt-exchange.org/forum/eclipse-birt-newsgroup-mirror/8401-cache-configuration.html

Jason


Re: performance issue when sorting a group [message #368444 is a reply to message #368443] Fri, 29 May 2009 15:46
Mike
Jason, thanks for your reply. I did find that increasing the memory
buffer by adding an entry in the appContext with the key
org.eclipse.birt.data.query.ResultBufferSize and a value higher than the
default of 10 MB does allow us to prevent BIRT from using the disk cache
in some cases. But there will always be an even larger resultset that
will cause BIRT to use the disk.
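
For readers who want to try the setting described above, here is a minimal Java sketch of placing that key in the application context. The key name and the megabyte unit come from this thread. The class name, the helper method, and the 64 MB value are illustrative assumptions, and a plain Map stands in for the map returned by task.getAppContext() so that the sketch runs without the BIRT libraries.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: raising the in-memory result buffer through the application context. */
final class ResultBufferConfig {
    // Key quoted in this thread; the value is interpreted as megabytes.
    static final String RESULT_BUFFER_SIZE_KEY =
            "org.eclipse.birt.data.query.ResultBufferSize";

    /**
     * Stores the buffer size (in MB) in the given context map.
     * In a real application the map would be task.getAppContext();
     * a plain Map is used here as a stand-in (assumption for this sketch).
     */
    static void setResultBufferSizeMb(Map<String, Object> appContext, int megabytes) {
        if (megabytes < 1) {
            // A later reply in this thread suggests values below 1 MB are not accepted.
            throw new IllegalArgumentException("buffer size must be at least 1 MB");
        }
        appContext.put(RESULT_BUFFER_SIZE_KEY, Integer.valueOf(megabytes));
    }

    public static void main(String[] args) {
        Map<String, Object> appContext = new HashMap<>();
        setResultBufferSizeMb(appContext, 64); // 64 MB is an arbitrary illustrative value
        System.out.println(appContext);
    }
}
```

As the post notes, a larger buffer only postpones the switch to the disk cache; it does not remove it for sufficiently large resultsets.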

However this particular setting does not seem to have any impact on my
problem. It does help delay the inevitable with very large resultsets,
and we have this bug opened for the large resultset problem:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=277908

The problem seems to be how the DiskCache moves the cursor through the
resultset. In that bug post you can see the exponential nature by looking
at the call stack in the report named DiskCache.html. The 654 invocations
of DiskCache.moveTo() result in nearly 31 million calls to
DiskCache.next()!

It seems that, at least in our test case, DiskCache attempts to move the
cursor to the end of the resultset each time it is called. This causes a
ton of disk I/O, so the report takes a very long time to complete.
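
To illustrate why that access pattern is so costly, here is a toy Java model of a forward-only cursor. It is not BIRT's DiskCache code; the class, the method names, and the evenly spaced targets are assumptions made purely for illustration. It compares a moveTo that restarts from the first row on every call with one that continues from the current position, and counts the resulting next() calls.

```java
/** Toy model of a forward-only cursor; NOT BIRT's DiskCache implementation. */
final class ForwardCursorModel {
    private final int rowCount;
    private int position = -1; // index of the current row; -1 means "before first"
    long nextCalls = 0;        // number of next() invocations so far

    ForwardCursorModel(int rowCount) {
        this.rowCount = rowCount;
    }

    /** Advances one row; returns false when already on the last row. */
    boolean next() {
        nextCalls++;
        if (position + 1 >= rowCount) {
            return false;
        }
        position++;
        return true;
    }

    /** Rewinds to the start on every call, then steps forward to the target. */
    void moveToWithRewind(int target) {
        position = -1;
        while (position < target && next()) { /* stepping */ }
    }

    /** Continues from the current row; rewinds only when asked to go backwards. */
    void moveToIncremental(int target) {
        if (target < position) {
            position = -1;
        }
        while (position < target && next()) { /* stepping */ }
    }

    /** Visits `moves` evenly spaced targets and returns the total next() calls. */
    static long countNextCalls(int rows, int moves, boolean rewindEachTime) {
        ForwardCursorModel cursor = new ForwardCursorModel(rows);
        for (int i = 1; i <= moves; i++) {
            int target = (int) ((long) i * rows / moves) - 1;
            if (rewindEachTime) {
                cursor.moveToWithRewind(target);
            } else {
                cursor.moveToIncremental(target);
            }
        }
        return cursor.nextCalls;
    }

    public static void main(String[] args) {
        System.out.println("rewind each time: " + countNextCalls(100_000, 654, true));
        System.out.println("incremental:      " + countNextCalls(100_000, 654, false));
    }
}
```

In this model the rewinding variant costs roughly (number of moves × rows) / 2 next() calls, while the incremental variant costs about one call per row. That is quadratic-style growth rather than strictly exponential, but it is enough to turn a few hundred moveTo() calls into tens of millions of next() calls.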

The bug I entered - https://bugs.eclipse.org/bugs/show_bug.cgi?id=277885 -
is also a problem with BIRT accessing the disk millions of times. In this
case it happens while grouping/sorting data. We can attempt to
"work around" the problem by increasing the number of elements BIRT keeps
in memory in BasicCachedList.java. It is hard-coded to 4000, and if we
increase this to 500,000 then my test cases that used to take several hours
complete in around a minute.

We do not feel that either of these workarounds can be delivered because
we are bumping up against memory limitations. These are critical issues
for us and therefore I would think that other users in the BIRT community
have experienced these same issues. Is it possible to get fixes for these
bugs into the 2.3.2.2 release that is still in development?
Re: performance issue when sorting a group [message #368454 is a reply to message #368444] Fri, 29 May 2009 20:29
Eclipse User
Originally posted by: jasonweathersby.windstream.net

Mike,

Wenfeng has responded to both bugs.

Jason

Re: performance issue when sorting a group [message #542460 is a reply to message #368443] Thu, 24 June 2010 22:22
Ryan
It looks like the 2 links you posted are no longer working.

I tried to search around and find the links, but have not yet been able to find any.

Could you post a new set of links on cache configuration?

Ryan
Re: performance issue when sorting a group [message #542677 is a reply to message #542460] Fri, 25 June 2010 15:09
Jason Weathersby


Look at:
http://www.eclipse.org/forums/index.php?t=msg&th=120380&

I cannot locate the cache configuration message, but its contents
are listed below:

Hello,
Performance has been improved via a combination of memory and disk
cache (http://www.eclipse.org/birt/phoenix/project/notable2.0.php#jump_11).
But I can't find how to configure this; for example, disable it, use
only memory cache, etc...
Thanks
Nicolas

Nicolas,
You can set the memory cache size by using the following.

task.getAppContext().put("org.eclipse.birt.data.query.ResultBufferSize", new Integer(20));
Size is specified in MB.
You should be able to set this for the viewer using:
http://wiki.eclipse.org/Adding_an_Object_to_the_Application_Context_for_the_Viewer_%28BIRT%29
Jason

Hi Jason,
can you please explain the other constants defined in
org.eclipse.birt.data.engine.api.DataEngine too:
DATA_SET_CACHE_ROW_LIMIT = "org.eclipse.birt.data.cache.RowLimit"
DATA_SET_CACHE_DELTA_FILE = "org.eclipse.birt.data.cache.DeltaFile"
MEMORY_DATA_SET_CACHE = "org.eclipse.birt.data.cache.memory"
How can I use them to improve the performance?
Thanks in advance!
Spunk

I believe the other settings have to do with setting the cache
configuration while in the designer. In 2.2.1 there is supposed to be
a project to improve the caching of datasets between containers.
Jason

Hello (again),
I looked at how to use DataEngine.MEMORY_DATA_SET_CACHE, but I can't get it
working.
I can set it to 0 or 10 000; it doesn't seem to change anything.
The logs indicate that the memory cache is always used and that it always
takes 0 sec (even on big reports).
That's why I'm wondering if this option is really working...
Also, I found several constants in DataEngineContext to configure the
cache (CACHE_MODE_IN_DISK, CACHE_MODE_IN_MEMORY, CACHE_USE_DISABLE), but
I don't know how to use them, or how to create a DataEngineContext and
attach it to my EngineTask.
Thanks
Nicolas

Nicolas,
What version of BIRT are you using?
Jason

I'm using Birt 2.2.0 (and waiting for the 2.2.1 ... :-)
Nicolas

Nicolas,
I thought this was working properly in 2.2. I do not believe you can
set it below 1 MB. Create a bugzilla entry if you want to get more details.
Jason

Jason
Re: performance issue when sorting a group [message #1785656 is a reply to message #542677] Wed, 18 April 2018 05:48
Naresh A
Hi,

I have written this script in the table's onCreate event to change the master page in the middle of the report:
this.getStyle().pageBreakBefore="AVOID";
this.getStyle().masterPage="Skipped_Line_Image";
this.getStyle().pageBreakAfter="AVOID";

The master page changes, but after the change a new empty page is generated. How can I avoid this empty page?

Regards,
Naresh.A