[QVTO] advice needed for large models [message #854804]
Tue, 24 April 2012 08:42
Eclipse User
Hi all,
I'm using QVTO programmatically and have found that with larger
input data sets I easily run into a hard "GC overhead limit exceeded"
error. I realize that this is also closely related to the transformation
script, but are there any general guidelines how to handle large data
sets? Are there any options or tweaks when configuring the
ExecutionContext so that it might create less runtime data? And what
happens if the model itself is too large to be all in memory at one
time? Is there any demand loading/unloading?
Any input or pointers on the matter will be greatly appreciated!
Thanks
Marius
Re: [QVTO] advice needed for large models [message #855092 is a reply to message #854804]
Tue, 24 April 2012 14:10
Ed Willink
Hi
Assuming that you have done the obvious things like increase the Java
heap, disable tracing, and that you do not use allInstances() or
unnavigable opposites....
A complex transformation may need access to all the data, so I don't
think there is a general solution for QVTo.
In practice your transformation may be localized so that it is amenable
to streaming the model through in fragments. Assuming that QVTo does not
support this directly, you could have a stream reader that passes model
fragments for local transformation, and then have a stream writer that
combines the result fragments.
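A minimal sketch of that fragment-streaming idea in plain Java, assuming a one-fragment-per-line framing and a stand-in transform function (neither is QVTo API; the `FragmentPipeline` class and its names are illustrative only):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.function.Function;

// Sketch: stream model fragments through a local transformation so that
// only one fragment is resident in memory at a time; the writer combines
// the result fragments as they are produced.
public class FragmentPipeline {

    // Reads fragments one at a time, transforms each locally, and appends
    // the result to the writer; no full-model collection is ever built.
    public static void run(BufferedReader in, Writer out,
                           Function<String, String> transform) throws IOException {
        String fragment;
        while ((fragment = in.readLine()) != null) {
            out.write(transform.apply(fragment));
            out.write('\n');
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(
                new StringReader("fragmentA\nfragmentB"));
        StringWriter out = new StringWriter();
        run(in, out, f -> f.toUpperCase());  // stand-in for the local QVTo step
        System.out.print(out);
    }
}
```

The same shape works with file-backed readers and writers, so peak memory stays bounded by the largest single fragment rather than the whole model.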
Alternatively you might contrive to keep the model in a repository such
as CDO so that you only need a small portion in memory at any time.
One day, a declarative transformation language, such as QVTr, could have
streaming operation as one of its compilation strategies.
Regards
Ed Willink
On 24/04/2012 09:42, Marius Gröger wrote:
Re: [QVTO/ATL] advice needed for large models [message #855132 is a reply to message #855092]
Tue, 24 April 2012 14:45
Eclipse User
On 24.04.2012 16:10, Ed Willink wrote:
> Assuming that you have done the obvious things like increase the Java
> heap, disable tracing, and that you do not use allInstances() or
> unnavigable opposites....
Thanks for answering. How do I disable tracing? Before posting, I had
done some research on this using Google and the source code but found no
evidence that this is possible at all. Perhaps you could give me a
pointer on how to do that?
>
> A complex transformation may need access to all the data, so I don't
> think there is a general solution for QVTo.
Ok... I had chosen QVTo over ATL after some evaluation because I found
that QVTo has much better support for the programmatic usage which I
need. So, how does ATL handle large datasets?
Thanks
Marius
Re: [QVTO/ATL] advice needed for large models [message #855831 is a reply to message #855189]
Wed, 25 April 2012 06:20
Eclipse User
On 24.04.2012 17:40, Ed Willink wrote:
> Just leave the trace file blank in the launcher; I'm only guessing that
> this helps.
Hm... as I said, I'm using QVTo programmatically within an application,
not via a launcher. From what I saw in the source code it appeared
to me that you can't disable tracing at all. It even seemed to me that
QVTo's regular operation relies on those traces; whether to save them
to a file or not is merely an option.
I may be very wrong here, so please correct me if so.
Regards
Marius
Re: [QVTO] advice needed for large models [message #855849 is a reply to message #855550]
Wed, 25 April 2012 06:36
Eclipse User
On 25.04.2012 01:10, Alan McMorran wrote:
> How large are you talking about? I've found QVTo scales pretty well as
> the dataset size increases but we're using at most maybe 3-4 million
> objects as the input and maybe 1-2 million on the output. They can be
> pretty complex models though so we're seeing 8GB heap spaces in some
> cases to accommodate the full transformation process.
Ok, that is good to know. We will be working in roughly the same order
of magnitude. The final application will run on a well-equipped server;
unfortunately, my development machine is not as powerful, so I can't
really test that.
> The big challenges we've had to overcome is that our model is
> essentially flat with no containment in it so there are parts of the
We have a very hierarchical model. I still wonder to what extent EMF and
QVTo at least try to let go of objects that are no longer needed so that
they can be garbage collected?
> Is the GC overhead limit not tied to the heap space limits of the JVM?
Apparently not, quoting
http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html:
"The concurrent collector will throw an OutOfMemoryError if too much
time is being spent in garbage collection: if more than 98% of the total
time is spent in garbage collection and less than 2% of the heap is
recovered, an OutOfMemoryError will be thrown. This feature is designed
to prevent applications from running for an extended period of time
while making little or no progress because the heap is too small. If
necessary, this feature can be disabled by adding the option
-XX:-UseGCOverheadLimit to the command line."
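For the record, a launch-command sketch combining these options (the 8 GB heap follows Alan's figure above; MyTransformationApp is a placeholder main class, not a real one):

```shell
# Placeholder launch command: raise the heap, disable the GC-overhead
# check, and select the parallel collector (flags per the HotSpot docs).
java -Xmx8g \
     -XX:-UseGCOverheadLimit \
     -XX:+UseParallelGC \
     MyTransformationApp
```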
I will experiment a little with different GCs, starting with the parallel GC.
Regards
Marius