Disk cached datasets (goalFiles and data.data files)
Disk cached datasets [message #855233] Tue, 24 April 2012 16:21
john mcteague
I'm experimenting with the DataEngine.MEMORY_BUFFER_SIZE setting in order to manage large datasets in a more memory-efficient manner. I am using the Java API to integrate BIRT.

I notice that when I set MEMORY_BUFFER_SIZE, both a goalFile and a data.data file of approximately equal size are generated in my temporary directory. Given the size of my datasets, these are in excess of 200MB each, so keeping them under control would be good.

The directory structure that is generated is as follows:
Directory of C:\Users\joe\AppData\Local\Temp\DataEngine_619898374_4

24/04/2012 17:12 <DIR> .
24/04/2012 17:12 <DIR> ..
24/04/2012 17:12 <DIR> BirtDataTemp13352837620643
24/04/2012 17:12 <DIR> BirtDataTemp13352839510834
24/04/2012 17:12 <DIR> DataSetCacheObject_1558846527_3
0 File(s) 0 bytes

Directory of C:\Users\joe\AppData\Local\Temp\DataEngine_619898374_4\BirtDataTemp13352837620643

24/04/2012 17:12 <DIR> .
24/04/2012 17:12 <DIR> ..
0 File(s) 0 bytes

Directory of C:\Users\joe\AppData\Local\Temp\DataEngine_619898374_4\BirtDataTemp13352839510834

24/04/2012 17:12 <DIR> .
24/04/2012 17:12 <DIR> ..
24/04/2012 17:12 <DIR> session_13352839511324
0 File(s) 0 bytes

Directory of C:\Users\joe\AppData\Local\Temp\DataEngine_619898374_4\BirtDataTemp13352839510834\session_13352839511324

24/04/2012 17:12 <DIR> .
24/04/2012 17:12 <DIR> ..
24/04/2012 17:12 214,996,498 goalFile
1 File(s) 214,996,498 bytes

Directory of C:\Users\joe\AppData\Local\Temp\DataEngine_619898374_4\DataSetCacheObject_1558846527_3

24/04/2012 17:12 <DIR> .
24/04/2012 17:12 <DIR> ..
24/04/2012 17:12 216,631,128 data.data
24/04/2012 17:12 6,809 meta.data
24/04/2012 17:12 21 time.data
3 File(s) 216,637,958 bytes


In addition, I am storing the rptdocument on disk to enable exporting in different formats and paging of HTML data (it's over 450MB itself). So altogether, each time I run the report I am consuming up to 1GB of temp space (some of which doesn't get deleted until the server reboots - bugs.eclipse.org/bugs/show_bug.cgi?id=369172).
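
For anyone wanting to measure this per run, here is a rough sketch that totals the files under every DataEngine_* directory in the temp folder. It uses only standard java.nio calls; the class and method names (TempUsage, dataEngineUsage) are made up for illustration and are not part of BIRT, and it assumes the engine writes to java.io.tmpdir as in the listing above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not a BIRT class.
class TempUsage {

    /** Sums the sizes of all regular files found anywhere under root. */
    static long sizeOf(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile).mapToLong(p -> {
                try {
                    return Files.size(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }).sum();
        }
    }

    /** Totals every directory directly under tmp whose name starts with "DataEngine_". */
    static long dataEngineUsage(Path tmp) throws IOException {
        long total = 0;
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(tmp, "DataEngine_*")) {
            for (Path dir : dirs) {
                if (Files.isDirectory(dir)) {
                    total += sizeOf(dir);
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the data engine uses the JVM's default temp directory.
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        System.out.println(dataEngineUsage(tmp) + " bytes under DataEngine_* in " + tmp);
    }
}
```

Running it before and after a report run would show how much of the space is actually left behind.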

So my first question is: what is the relationship between goalFile and data.data? What triggers the creation of these files, and why are there two?

I have seen data.data and its related files created in other scenarios, but goalFile has appeared only since I started using the memory buffer option.

Thanks,
John
Re: Disk cached datasets [message #856344 is a reply to message #855233] Wed, 25 April 2012 15:24
Jason Weathersby

John,

Caching only occurs when multiple passes over the data are required or
when a data set is used multiple times.

If you set the value below and your data set exceeds it, the data will
start to be written to disk.

/**
 * Indicate the size of data cached for each result set. We only accept
 * non-negative integer as input, the unit of which would be MB.
 * If this setting is 0, all temporary rows will be cached in memory
 * during query processing.
 */
public static String MEMORY_BUFFER_SIZE =
        "org.eclipse.birt.data.query.ResultBufferSize";


This happens for grouping and aggregations, and it creates the files in
the temp directory under DataEngine_code/BirtDataTempcode, where code is
a unique identifier. The value is given in MB.
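
As a rough sketch of how that could look from the Java API: the key string below is the one quoted above, but the helper class and method names are made up, and passing the value as an Integer is an assumption based on the Javadoc wording. In a real integration the resulting map would be handed to the engine task as its app context (for example through setAppContext), which is not shown here so the snippet runs without BIRT on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not a BIRT class.
class BufferSizeContext {

    // Key string quoted from DataEngine.MEMORY_BUFFER_SIZE above.
    static final String MEMORY_BUFFER_SIZE =
            "org.eclipse.birt.data.query.ResultBufferSize";

    /** Builds an app-context map carrying the per-result-set buffer size in MB. */
    static Map<String, Object> withBufferSize(int megabytes) {
        if (megabytes < 0) {
            // The Javadoc says only non-negative integers are accepted.
            throw new IllegalArgumentException("buffer size must be >= 0 MB");
        }
        Map<String, Object> appContext = new HashMap<>();
        // Assumption: the value is passed as an Integer.
        appContext.put(MEMORY_BUFFER_SIZE, Integer.valueOf(megabytes));
        return appContext;
    }

    public static void main(String[] args) {
        // 10 MB per result set; rows beyond that would spill to the temp directory.
        Map<String, Object> appContext = withBufferSize(10);
        System.out.println(appContext);
    }
}
```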


By default, if you re-use a data set, a separate cache of the processed
results is created in the directory:

DataEngine_code/DataSetCacheObject_code/multiple files

This cache is written to disk. If you set

reportContext.getAppContext().put("org.eclipse.birt.data.cache.memory",
new Integer(-1));

all of the rows will be kept in memory instead of in the above files, and
nothing will get written to DataEngine_code/DataSetCacheObject_code.

public static String MEMORY_DATA_SET_CACHE =
"org.eclipse.birt.data.cache.memory";

A value of 0 will just write to disk; a positive value will only write
that number of rows to memory.
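
To make the three cases easier to compare, here is a small sketch that just restates them in code. It is not BIRT's implementation; the class and method names are made up, and treating every negative value like -1 is an assumption, since only -1 is mentioned above.

```java
// Hypothetical helper that restates the MEMORY_DATA_SET_CACHE cases; not a BIRT class.
class DataSetCacheMode {

    // Key string quoted from MEMORY_DATA_SET_CACHE above.
    static final String MEMORY_DATA_SET_CACHE =
            "org.eclipse.birt.data.cache.memory";

    /** Describes where the data set cache ends up for a given setting value. */
    static String describe(int value) {
        if (value < 0) {
            // Assumption: any negative value behaves like -1.
            return "all rows in memory, no DataSetCacheObject files";
        }
        if (value == 0) {
            return "cache written to disk";
        }
        return "only " + value + " rows written to memory";
    }

    public static void main(String[] args) {
        for (int value : new int[] { -1, 0, 5000 }) {
            System.out.println(MEMORY_DATA_SET_CACHE + " = " + value
                    + " -> " + describe(value));
        }
    }
}
```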


MEMORY_USAGE_AGGRESSIVE

This setting just initializes the list size limits that determine
whether large data sets with many aggregation and grouping components
get written to disk. Take a look at the BasicCachedList class, which is
used for aggregations. If you use crosstabs or other aggregation-heavy
items, they get cached.

Jason






