Skip to main content


Eclipse Community Forums
Forum Search:

Search      Help    Register    Login    Home
Home » Archived » IMP » Possible performance bug in LPG?
Possible performance bug in LPG? [message #28123] Wed, 24 June 2009 18:22 Go to next message
Eclipse UserFriend
Originally posted by: d.kleinrath.inode.at

While trying to improve the performance of my LPG parser I think I found=
a =

serious performance bug in LPG.
Actually I can not believe that this wasn't mentioned before so if this =
is =

known or if it is intended behaviour please appologize.

The problem arises if the same generated parser is reused to parse sever=
al =

files. When a file is parsed with lpg.runtime.DeterministicParser it fir=
st =

calls the method lpg.runtime.Stacks.reallocateStacks(). Here is the code=
=

from this method:

int old_stack_length =3D (stateStack =3D=3D null ? 0 : =

stateStack.length),
stack_length =3D old_stack_length + STACK_INCREMENT;

if (stateStack =3D=3D null)
{
stateStack =3D new int[stack_length];
locationStack =3D new int[stack_length];
parseStack =3D new Object[stack_length];
}
else
{
System.arraycopy(stateStack, 0, stateStack =3D new =

int[stack_length], 0, old_stack_length);
System.arraycopy(locationStack, 0, locationStack =3D new =

int[stack_length], 0, old_stack_length);
System.arraycopy(parseStack, 0, parseStack =3D new =

Object[stack_length], 0, old_stack_length);
}
return;

As you can see if the stateStack was already created, the parser will co=
py =

its contents into a new array with the size =

old_stack_length+STACK_INCREMENT. This is done because the =

reallocateStacks() method is also used to resize the stack when needed. =
=

This behaviour results in a memory leak and in serious performance =

problems when reusing a parser instance for compiling more than one file=
.. =

After I changed my code to not reuse the parser the compilation time for=
=

about 200 files was reduced from 10 seconds to under one second so it is=
=

now 10 times faster.


The ParseController that gets generated by IMP also lazily instantiates =
=

the internal DeterministicParser. So perhaps IMP suffers the same =

performance problems (I'm not sure if IMP reuses the ParseController).


If someone can confirm that this is a bug I'll create a bug report for i=
t.

Greetings,
Dieter
Re: Possible performance bug in LPG? [message #28163 is a reply to message #28123] Wed, 24 June 2009 20:29 Go to previous messageGo to next message
Ed Willink is currently offline Ed WillinkFriend
Messages: 7655
Registered: July 2009
Senior Member
Hi Dieter

If you're reparsing you need to work quite hard to make sure that you
use a new parser and null out every possible reference to anything in the
old Token array, or AST tree; otherwise memory leaks very fast. Use a new parser.

Regards

Ed Willink

Dieter Kleinrath wrote:
> While trying to improve the performance of my LPG parser I think I found
> a serious performance bug in LPG.
> Actually I can not believe that this wasn't mentioned before so if this
> is known or if it is intended behaviour please appologize.
>
> The problem arises if the same generated parser is reused to parse
> several files. When a file is parsed with
> lpg.runtime.DeterministicParser it first calls the method
> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this method:
>
> int old_stack_length = (stateStack == null ? 0 :
> stateStack.length),
> stack_length = old_stack_length + STACK_INCREMENT;
>
> if (stateStack == null)
> {
> stateStack = new int[stack_length];
> locationStack = new int[stack_length];
> parseStack = new Object[stack_length];
> }
> else
> {
> System.arraycopy(stateStack, 0, stateStack = new
> int[stack_length], 0, old_stack_length);
> System.arraycopy(locationStack, 0, locationStack = new
> int[stack_length], 0, old_stack_length);
> System.arraycopy(parseStack, 0, parseStack = new
> Object[stack_length], 0, old_stack_length);
> }
> return;
>
> As you can see if the stateStack was already created, the parser will
> copy its contents into a new array with the size
> old_stack_length+STACK_INCREMENT. This is done because the
> reallocateStacks() method is also used to resize the stack when needed.
> This behaviour results in a memory leak and in serious performance
> problems when reusing a parser instance for compiling more than one
> file. After I changed my code to not reuse the parser the compilation
> time for about 200 files was reduced from 10 seconds to under one second
> so it is now 10 times faster.
>
>
> The ParseController that gets generated by IMP also lazily instantiates
> the internal DeterministicParser. So perhaps IMP suffers the same
> performance problems (I'm not sure if IMP reuses the ParseController).
>
>
> If someone can confirm that this is a bug I'll create a bug report for it.
>
> Greetings,
> Dieter
Re: Possible performance bug in LPG? [message #28201 is a reply to message #28163] Wed, 24 June 2009 21:42 Go to previous messageGo to next message
Eclipse UserFriend
Originally posted by: d.kleinrath.inode.at

Ok. So if one actually wanted to reuse the parser one would have to null
out the arrays in lpg.runtime.Stacks manually. This should really be
documented somewhere in the parser class - it caused me quite some
headache.


Am 24.06.2009, 22:29 Uhr, schrieb Ed Willink <ed@willink.me.uk>:

> Hi Dieter
>
> If you're reparsing you need to work quite hard to make sure that you
> use a new parser and null out every possible reference to anything in the
> old Token array, or AST tree; otherwise memory leaks very fast. Use a
> new parser.
>
> Regards
>
> Ed Willink
>
> Dieter Kleinrath wrote:
>> While trying to improve the performance of my LPG parser I think I
>> found a serious performance bug in LPG.
>> Actually I can not believe that this wasn't mentioned before so if this
>> is known or if it is intended behaviour please appologize.
>> The problem arises if the same generated parser is reused to parse
>> several files. When a file is parsed with
>> lpg.runtime.DeterministicParser it first calls the method
>> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this
>> method:
>> int old_stack_length = (stateStack == null ? 0 :
>> stateStack.length),
>> stack_length = old_stack_length + STACK_INCREMENT;
>> if (stateStack == null)
>> {
>> stateStack = new int[stack_length];
>> locationStack = new int[stack_length];
>> parseStack = new Object[stack_length];
>> }
>> else
>> {
>> System.arraycopy(stateStack, 0, stateStack = new
>> int[stack_length], 0, old_stack_length);
>> System.arraycopy(locationStack, 0, locationStack = new
>> int[stack_length], 0, old_stack_length);
>> System.arraycopy(parseStack, 0, parseStack = new
>> Object[stack_length], 0, old_stack_length);
>> }
>> return;
>> As you can see if the stateStack was already created, the parser will
>> copy its contents into a new array with the size
>> old_stack_length+STACK_INCREMENT. This is done because the
>> reallocateStacks() method is also used to resize the stack when needed.
>> This behaviour results in a memory leak and in serious performance
>> problems when reusing a parser instance for compiling more than one
>> file. After I changed my code to not reuse the parser the compilation
>> time for about 200 files was reduced from 10 seconds to under one
>> second so it is now 10 times faster.
>> The ParseController that gets generated by IMP also lazily
>> instantiates the internal DeterministicParser. So perhaps IMP suffers
>> the same performance problems (I'm not sure if IMP reuses the
>> ParseController).
>> If someone can confirm that this is a bug I'll create a bug report
>> for it.
>> Greetings,
>> Dieter



--
Erstellt mit Operas revolutionärem E-Mail-Modul: http://www.opera.com/mail/
Re: Possible performance bug in LPG? [message #28240 is a reply to message #28201] Thu, 25 June 2009 19:59 Go to previous messageGo to next message
Ed Willink is currently offline Ed WillinkFriend
Messages: 7655
Registered: July 2009
Senior Member
Hi Dieter

No it's 'your' own code that may be a problem.

If you accidentally hang on to some reference to some AST node, it
may reference all the other AST nodes, which may reference an
IToken node which may lock in all IToken nodes via perhaps an array.

One accidental non-nulled reference to the AST can really ripple.
You must null all your AST references so that the garbage collector
can clean up.

Regards

Ed Willink


Dieter Kleinrath wrote:
> Ok. So if one actually wanted to reuse the parser one would have to null
> out the arrays in lpg.runtime.Stacks manually. This should really be
> documented somewhere in the parser class - it caused me quite some
> headache.
>
>
> Am 24.06.2009, 22:29 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
>
>> Hi Dieter
>>
>> If you're reparsing you need to work quite hard to make sure that you
>> use a new parser and null out every possible reference to anything in the
>> old Token array, or AST tree; otherwise memory leaks very fast. Use a
>> new parser.
>>
>> Regards
>>
>> Ed Willink
>>
>> Dieter Kleinrath wrote:
>>> While trying to improve the performance of my LPG parser I think I
>>> found a serious performance bug in LPG.
>>> Actually I can not believe that this wasn't mentioned before so if
>>> this is known or if it is intended behaviour please appologize.
>>> The problem arises if the same generated parser is reused to parse
>>> several files. When a file is parsed with
>>> lpg.runtime.DeterministicParser it first calls the method
>>> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this
>>> method:
>>> int old_stack_length = (stateStack == null ? 0 :
>>> stateStack.length),
>>> stack_length = old_stack_length + STACK_INCREMENT;
>>> if (stateStack == null)
>>> {
>>> stateStack = new int[stack_length];
>>> locationStack = new int[stack_length];
>>> parseStack = new Object[stack_length];
>>> }
>>> else
>>> {
>>> System.arraycopy(stateStack, 0, stateStack = new
>>> int[stack_length], 0, old_stack_length);
>>> System.arraycopy(locationStack, 0, locationStack = new
>>> int[stack_length], 0, old_stack_length);
>>> System.arraycopy(parseStack, 0, parseStack = new
>>> Object[stack_length], 0, old_stack_length);
>>> }
>>> return;
>>> As you can see if the stateStack was already created, the parser
>>> will copy its contents into a new array with the size
>>> old_stack_length+STACK_INCREMENT. This is done because the
>>> reallocateStacks() method is also used to resize the stack when
>>> needed. This behaviour results in a memory leak and in serious
>>> performance problems when reusing a parser instance for compiling
>>> more than one file. After I changed my code to not reuse the parser
>>> the compilation time for about 200 files was reduced from 10 seconds
>>> to under one second so it is now 10 times faster.
>>> The ParseController that gets generated by IMP also lazily
>>> instantiates the internal DeterministicParser. So perhaps IMP suffers
>>> the same performance problems (I'm not sure if IMP reuses the
>>> ParseController).
>>> If someone can confirm that this is a bug I'll create a bug report
>>> for it.
>>> Greetings,
>>> Dieter
>
>
>
Re: Possible performance bug in LPG? [message #28279 is a reply to message #28240] Sat, 27 June 2009 09:39 Go to previous messageGo to next message
Eclipse UserFriend
Originally posted by: d.kleinrath.inode.at

Hi Ed,

Sorry to be annoying but I guess eather I don't understand you or you
don't understand me.

The memory leaks no matter if I keep references or not.

The method lpg.runtime.Stacks.reallocateStacks() does the following each
time a parser is reused:
It copies the arrays stateStack, locationStack and parseStack into new
arrays with the size "stack_length = old_stack_length + STACK_INCREMENT;".
The STACK_INCREMENT is by deafault 1024 so each time you reparse, the
stacks used by the parser will increase by this ammount. This extra stack
space is what leaks the memory and it is absolutly unnecessary because it
is never used. When compiling 200 files with the same parser there are
three stacks of size 205824 in the Parser and each of these stacks gets
copied into an even larger stack during the next parsing.

I just debugged the Stack class while using IMP and IMP suffers the same
memory leak during reconciling. If you keep changing an opened file, each
time IMP reconciles it the stacks will grow and grow and grow...

This bug can be fixed very easy by adding the following lines to the reset
method in the DeterministicParser:
locationStack = null;
parseStack= null;
stateStack= null;

I really hope I do not start to get too annoying with this issue but since
fixing it is so easy I thought I'll give it another try and I also think
that fixing it is really worth it.

best regards,
Dieter



Am 25.06.2009, 21:59 Uhr, schrieb Ed Willink <ed@willink.me.uk>:

> Hi Dieter
>
> No it's 'your' own code that may be a problem.
>
> If you accidentally hang on to some reference to some AST node, it
> may reference all the other AST nodes, which may reference an
> IToken node which may lock in all IToken nodes via perhaps an array.
>
> One accidental non-nulled reference to the AST can really ripple.
> You must null all your AST references so that the garbage collector
> can clean up.
>
> Regards
>
> Ed Willink
>
>
> Dieter Kleinrath wrote:
>> Ok. So if one actually wanted to reuse the parser one would have to
>> null out the arrays in lpg.runtime.Stacks manually. This should really
>> be documented somewhere in the parser class - it caused me quite some
>> headache.
>> Am 24.06.2009, 22:29 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
>>
>>> Hi Dieter
>>>
>>> If you're reparsing you need to work quite hard to make sure that you
>>> use a new parser and null out every possible reference to anything in
>>> the
>>> old Token array, or AST tree; otherwise memory leaks very fast. Use a
>>> new parser.
>>>
>>> Regards
>>>
>>> Ed Willink
>>>
>>> Dieter Kleinrath wrote:
>>>> While trying to improve the performance of my LPG parser I think I
>>>> found a serious performance bug in LPG.
>>>> Actually I can not believe that this wasn't mentioned before so if
>>>> this is known or if it is intended behaviour please appologize.
>>>> The problem arises if the same generated parser is reused to parse
>>>> several files. When a file is parsed with
>>>> lpg.runtime.DeterministicParser it first calls the method
>>>> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this
>>>> method:
>>>> int old_stack_length = (stateStack == null ? 0 :
>>>> stateStack.length),
>>>> stack_length = old_stack_length + STACK_INCREMENT;
>>>> if (stateStack == null)
>>>> {
>>>> stateStack = new int[stack_length];
>>>> locationStack = new int[stack_length];
>>>> parseStack = new Object[stack_length];
>>>> }
>>>> else
>>>> {
>>>> System.arraycopy(stateStack, 0, stateStack = new
>>>> int[stack_length], 0, old_stack_length);
>>>> System.arraycopy(locationStack, 0, locationStack = new
>>>> int[stack_length], 0, old_stack_length);
>>>> System.arraycopy(parseStack, 0, parseStack = new
>>>> Object[stack_length], 0, old_stack_length);
>>>> }
>>>> return;
>>>> As you can see if the stateStack was already created, the parser
>>>> will copy its contents into a new array with the size
>>>> old_stack_length+STACK_INCREMENT. This is done because the
>>>> reallocateStacks() method is also used to resize the stack when
>>>> needed. This behaviour results in a memory leak and in serious
>>>> performance problems when reusing a parser instance for compiling
>>>> more than one file. After I changed my code to not reuse the parser
>>>> the compilation time for about 200 files was reduced from 10 seconds
>>>> to under one second so it is now 10 times faster.
>>>> The ParseController that gets generated by IMP also lazily
>>>> instantiates the internal DeterministicParser. So perhaps IMP suffers
>>>> the same performance problems (I'm not sure if IMP reuses the
>>>> ParseController).
>>>> If someone can confirm that this is a bug I'll create a bug report
>>>> for it.
>>>> Greetings,
>>>> Dieter
>>



--
Erstellt mit Operas revolutionärem E-Mail-Modul: http://www.opera.com/mail/
Re: Possible performance bug in LPG? [message #28353 is a reply to message #28279] Sun, 28 June 2009 05:38 Go to previous messageGo to next message
Ed Willink is currently offline Ed WillinkFriend
Messages: 7655
Registered: July 2009
Senior Member
Hi Dieter

> This bug can be fixed very easy by adding the following lines to the
> reset method in the DeterministicParser:
> locationStack = null;
> parseStack= null;
> stateStack= null;

A bug with LPG should be raised against LPG and perhaps discussed
on an LPG newsgroup.

I am merely pointing out how difficult it is to avoid locking parsed
artefacts in memory and consequently how reuse of a parser perhaps
aggaravates the difficulties.

Regards

Ed Willink
Re: Possible performance bug in LPG? [message #28391 is a reply to message #28353] Sun, 28 June 2009 18:31 Go to previous messageGo to next message
Eclipse UserFriend
Originally posted by: d.kleinrath.inode.at

Hi Ed,

Sorry. For whatever reason I thougth IMP and LPG are somehow connected.
The report would certainly belong to the LPG newsgroup.
Thanks for pointing out the memory problems associated with reusing a
parser in general.

Dieter


Am 28.06.2009, 07:38 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
> Hi Dieter
>
> > This bug can be fixed very easy by adding the following lines to the
> > reset method in the DeterministicParser:
> > locationStack = null;
> > parseStack= null;
> > stateStack= null;
>
> A bug with LPG should be raised against LPG and perhaps discussed
> on an LPG newsgroup.
>
> I am merely pointing out how difficult it is to avoid locking parsed
> artefacts in memory and consequently how reuse of a parser perhaps
> aggaravates the difficulties.
>
> Regards
>
> Ed Willink



--
Erstellt mit Operas revolutionärem E-Mail-Modul: http://www.opera.com/mail/
Re: Possible performance bug in LPG? [message #29128 is a reply to message #28391] Tue, 14 July 2009 16:53 Go to previous message
Robert M. Fuhrer is currently offline Robert M. FuhrerFriend
Messages: 294
Registered: July 2009
Senior Member
Hi there,

Sorry for not having responded earlier; I was on vacation and recovering
from an important deadline. :-)

Yes, LPG issues are best brought up on the LPG forum on SourceForge.

[The connection between IMP and LPG is that the main LPG developer,
Philippe Charles, is also a member of the IMP team, but they're really
separate projects.]

Dieter Kleinrath wrote:
> Hi Ed,
>
> Sorry. For whatever reason I thougth IMP and LPG are somehow connected.
> The report would certainly belong to the LPG newsgroup.
> Thanks for pointing out the memory problems associated with reusing a
> parser in general.
>
> Dieter
>
>
> Am 28.06.2009, 07:38 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
>> Hi Dieter
>>
>> > This bug can be fixed very easy by adding the following lines to the
>> > reset method in the DeterministicParser:
>> > locationStack = null;
>> > parseStack= null;
>> > stateStack= null;
>>
>> A bug with LPG should be raised against LPG and perhaps discussed
>> on an LPG newsgroup.
>>
>> I am merely pointing out how difficult it is to avoid locking parsed
>> artefacts in memory and consequently how reuse of a parser perhaps
>> aggaravates the difficulties.
>>
>> Regards
>>
>> Ed Willink
>
>
>


--
Cheers,
-- Bob

--------------------------------
Robert M. Fuhrer
Research Staff Member
Programming Technologies Dept.
IBM T.J. Watson Research Center

IDE Meta-tooling Platform Project Lead (http://www.eclipse.org/imp)
X10: Productive High-Performance Parallel Programming (http://x10.sf.net)
Re: Possible performance bug in LPG? [message #575596 is a reply to message #28123] Wed, 24 June 2009 20:29 Go to previous message
Ed Willink is currently offline Ed WillinkFriend
Messages: 7655
Registered: July 2009
Senior Member
Hi Dieter

If you're reparsing you need to work quite hard to make sure that you
use a new parser and null out every possible reference to anything in the
old Token array, or AST tree; otherwise memory leaks very fast. Use a new parser.

Regards

Ed Willink

Dieter Kleinrath wrote:
> While trying to improve the performance of my LPG parser I think I found
> a serious performance bug in LPG.
> Actually I can not believe that this wasn't mentioned before so if this
> is known or if it is intended behaviour please appologize.
>
> The problem arises if the same generated parser is reused to parse
> several files. When a file is parsed with
> lpg.runtime.DeterministicParser it first calls the method
> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this method:
>
> int old_stack_length = (stateStack == null ? 0 :
> stateStack.length),
> stack_length = old_stack_length + STACK_INCREMENT;
>
> if (stateStack == null)
> {
> stateStack = new int[stack_length];
> locationStack = new int[stack_length];
> parseStack = new Object[stack_length];
> }
> else
> {
> System.arraycopy(stateStack, 0, stateStack = new
> int[stack_length], 0, old_stack_length);
> System.arraycopy(locationStack, 0, locationStack = new
> int[stack_length], 0, old_stack_length);
> System.arraycopy(parseStack, 0, parseStack = new
> Object[stack_length], 0, old_stack_length);
> }
> return;
>
> As you can see if the stateStack was already created, the parser will
> copy its contents into a new array with the size
> old_stack_length+STACK_INCREMENT. This is done because the
> reallocateStacks() method is also used to resize the stack when needed.
> This behaviour results in a memory leak and in serious performance
> problems when reusing a parser instance for compiling more than one
> file. After I changed my code to not reuse the parser the compilation
> time for about 200 files was reduced from 10 seconds to under one second
> so it is now 10 times faster.
>
>
> The ParseController that gets generated by IMP also lazily instantiates
> the internal DeterministicParser. So perhaps IMP suffers the same
> performance problems (I'm not sure if IMP reuses the ParseController).
>
>
> If someone can confirm that this is a bug I'll create a bug report for it.
>
> Greetings,
> Dieter
Re: Possible performance bug in LPG? [message #575639 is a reply to message #28163] Wed, 24 June 2009 21:42 Go to previous message
Dieter Kleinrath is currently offline Dieter KleinrathFriend
Messages: 5
Registered: July 2009
Junior Member
Ok. So if one actually wanted to reuse the parser one would have to null
out the arrays in lpg.runtime.Stacks manually. This should really be
documented somewhere in the parser class - it caused me quite some
headache.


Am 24.06.2009, 22:29 Uhr, schrieb Ed Willink <ed@willink.me.uk>:

> Hi Dieter
>
> If you're reparsing you need to work quite hard to make sure that you
> use a new parser and null out every possible reference to anything in the
> old Token array, or AST tree; otherwise memory leaks very fast. Use a
> new parser.
>
> Regards
>
> Ed Willink
>
> Dieter Kleinrath wrote:
>> While trying to improve the performance of my LPG parser I think I
>> found a serious performance bug in LPG.
>> Actually I can not believe that this wasn't mentioned before so if this
>> is known or if it is intended behaviour please appologize.
>> The problem arises if the same generated parser is reused to parse
>> several files. When a file is parsed with
>> lpg.runtime.DeterministicParser it first calls the method
>> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this
>> method:
>> int old_stack_length = (stateStack == null ? 0 :
>> stateStack.length),
>> stack_length = old_stack_length + STACK_INCREMENT;
>> if (stateStack == null)
>> {
>> stateStack = new int[stack_length];
>> locationStack = new int[stack_length];
>> parseStack = new Object[stack_length];
>> }
>> else
>> {
>> System.arraycopy(stateStack, 0, stateStack = new
>> int[stack_length], 0, old_stack_length);
>> System.arraycopy(locationStack, 0, locationStack = new
>> int[stack_length], 0, old_stack_length);
>> System.arraycopy(parseStack, 0, parseStack = new
>> Object[stack_length], 0, old_stack_length);
>> }
>> return;
>> As you can see if the stateStack was already created, the parser will
>> copy its contents into a new array with the size
>> old_stack_length+STACK_INCREMENT. This is done because the
>> reallocateStacks() method is also used to resize the stack when needed.
>> This behaviour results in a memory leak and in serious performance
>> problems when reusing a parser instance for compiling more than one
>> file. After I changed my code to not reuse the parser the compilation
>> time for about 200 files was reduced from 10 seconds to under one
>> second so it is now 10 times faster.
>> The ParseController that gets generated by IMP also lazily
>> instantiates the internal DeterministicParser. So perhaps IMP suffers
>> the same performance problems (I'm not sure if IMP reuses the
>> ParseController).
>> If someone can confirm that this is a bug I'll create a bug report
>> for it.
>> Greetings,
>> Dieter



--
Erstellt mit Operas revolutionärem E-Mail-Modul: http://www.opera.com/mail/
Re: Possible performance bug in LPG? [message #575662 is a reply to message #28201] Thu, 25 June 2009 19:59 Go to previous message
Ed Willink is currently offline Ed WillinkFriend
Messages: 7655
Registered: July 2009
Senior Member
Hi Dieter

No it's 'your' own code that may be a problem.

If you accidentally hang on to some reference to some AST node, it
may reference all the other AST nodes, which may reference an
IToken node which may lock in all IToken nodes via perhaps an array.

One accidental non-nulled reference to the AST can really ripple.
You must null all your AST references so that the garbage collector
can clean up.

Regards

Ed Willink


Dieter Kleinrath wrote:
> Ok. So if one actually wanted to reuse the parser one would have to null
> out the arrays in lpg.runtime.Stacks manually. This should really be
> documented somewhere in the parser class - it caused me quite some
> headache.
>
>
> Am 24.06.2009, 22:29 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
>
>> Hi Dieter
>>
>> If you're reparsing you need to work quite hard to make sure that you
>> use a new parser and null out every possible reference to anything in the
>> old Token array, or AST tree; otherwise memory leaks very fast. Use a
>> new parser.
>>
>> Regards
>>
>> Ed Willink
>>
>> Dieter Kleinrath wrote:
>>> While trying to improve the performance of my LPG parser I think I
>>> found a serious performance bug in LPG.
>>> Actually I can not believe that this wasn't mentioned before so if
>>> this is known or if it is intended behaviour please appologize.
>>> The problem arises if the same generated parser is reused to parse
>>> several files. When a file is parsed with
>>> lpg.runtime.DeterministicParser it first calls the method
>>> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this
>>> method:
>>> int old_stack_length = (stateStack == null ? 0 :
>>> stateStack.length),
>>> stack_length = old_stack_length + STACK_INCREMENT;
>>> if (stateStack == null)
>>> {
>>> stateStack = new int[stack_length];
>>> locationStack = new int[stack_length];
>>> parseStack = new Object[stack_length];
>>> }
>>> else
>>> {
>>> System.arraycopy(stateStack, 0, stateStack = new
>>> int[stack_length], 0, old_stack_length);
>>> System.arraycopy(locationStack, 0, locationStack = new
>>> int[stack_length], 0, old_stack_length);
>>> System.arraycopy(parseStack, 0, parseStack = new
>>> Object[stack_length], 0, old_stack_length);
>>> }
>>> return;
>>> As you can see if the stateStack was already created, the parser
>>> will copy its contents into a new array with the size
>>> old_stack_length+STACK_INCREMENT. This is done because the
>>> reallocateStacks() method is also used to resize the stack when
>>> needed. This behaviour results in a memory leak and in serious
>>> performance problems when reusing a parser instance for compiling
>>> more than one file. After I changed my code to not reuse the parser
>>> the compilation time for about 200 files was reduced from 10 seconds
>>> to under one second so it is now 10 times faster.
>>> The ParseController that gets generated by IMP also lazily
>>> instantiates the internal DeterministicParser. So perhaps IMP suffers
>>> the same performance problems (I'm not sure if IMP reuses the
>>> ParseController).
>>> If someone can confirm that this is a bug I'll create a bug report
>>> for it.
>>> Greetings,
>>> Dieter
>
>
>
Re: Possible performance bug in LPG? [message #575714 is a reply to message #28240] Sat, 27 June 2009 09:39 Go to previous message
Dieter Kleinrath is currently offline Dieter KleinrathFriend
Messages: 5
Registered: July 2009
Junior Member
Hi Ed,

Sorry to be annoying but I guess eather I don't understand you or you
don't understand me.

The memory leaks no matter if I keep references or not.

The method lpg.runtime.Stacks.reallocateStacks() does the following each
time a parser is reused:
It copies the arrays stateStack, locationStack and parseStack into new
arrays with the size "stack_length = old_stack_length + STACK_INCREMENT;".
The STACK_INCREMENT is by deafault 1024 so each time you reparse, the
stacks used by the parser will increase by this ammount. This extra stack
space is what leaks the memory and it is absolutly unnecessary because it
is never used. When compiling 200 files with the same parser there are
three stacks of size 205824 in the Parser and each of these stacks gets
copied into an even larger stack during the next parsing.

I just debugged the Stack class while using IMP and IMP suffers the same
memory leak during reconciling. If you keep changing an opened file, each
time IMP reconciles it the stacks will grow and grow and grow...

This bug can be fixed very easy by adding the following lines to the reset
method in the DeterministicParser:
locationStack = null;
parseStack= null;
stateStack= null;

I really hope I do not start to get too annoying with this issue but since
fixing it is so easy I thought I'll give it another try and I also think
that fixing it is really worth it.

best regards,
Dieter



Am 25.06.2009, 21:59 Uhr, schrieb Ed Willink <ed@willink.me.uk>:

> Hi Dieter
>
> No it's 'your' own code that may be a problem.
>
> If you accidentally hang on to some reference to some AST node, it
> may reference all the other AST nodes, which may reference an
> IToken node which may lock in all IToken nodes via perhaps an array.
>
> One accidental non-nulled reference to the AST can really ripple.
> You must null all your AST references so that the garbage collector
> can clean up.
>
> Regards
>
> Ed Willink
>
>
> Dieter Kleinrath wrote:
>> Ok. So if one actually wanted to reuse the parser one would have to
>> null out the arrays in lpg.runtime.Stacks manually. This should really
>> be documented somewhere in the parser class - it caused me quite some
>> headache.
>> Am 24.06.2009, 22:29 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
>>
>>> Hi Dieter
>>>
>>> If you're reparsing you need to work quite hard to make sure that you
>>> use a new parser and null out every possible reference to anything in
>>> the
>>> old Token array, or AST tree; otherwise memory leaks very fast. Use a
>>> new parser.
>>>
>>> Regards
>>>
>>> Ed Willink
>>>
>>> Dieter Kleinrath wrote:
>>>> While trying to improve the performance of my LPG parser I think I
>>>> found a serious performance bug in LPG.
>>>> Actually I can not believe that this wasn't mentioned before so if
>>>> this is known or if it is intended behaviour please appologize.
>>>> The problem arises if the same generated parser is reused to parse
>>>> several files. When a file is parsed with
>>>> lpg.runtime.DeterministicParser it first calls the method
>>>> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this
>>>> method:
>>>> int old_stack_length = (stateStack == null ? 0 :
>>>> stateStack.length),
>>>> stack_length = old_stack_length + STACK_INCREMENT;
>>>> if (stateStack == null)
>>>> {
>>>> stateStack = new int[stack_length];
>>>> locationStack = new int[stack_length];
>>>> parseStack = new Object[stack_length];
>>>> }
>>>> else
>>>> {
>>>> System.arraycopy(stateStack, 0, stateStack = new
>>>> int[stack_length], 0, old_stack_length);
>>>> System.arraycopy(locationStack, 0, locationStack = new
>>>> int[stack_length], 0, old_stack_length);
>>>> System.arraycopy(parseStack, 0, parseStack = new
>>>> Object[stack_length], 0, old_stack_length);
>>>> }
>>>> return;
>>>> As you can see if the stateStack was already created, the parser
>>>> will copy its contents into a new array with the size
>>>> old_stack_length+STACK_INCREMENT. This is done because the
>>>> reallocateStacks() method is also used to resize the stack when
>>>> needed. This behaviour results in a memory leak and in serious
>>>> performance problems when reusing a parser instance for compiling
>>>> more than one file. After I changed my code to not reuse the parser
>>>> the compilation time for about 200 files was reduced from 10 seconds
>>>> to under one second so it is now 10 times faster.
>>>> The ParseController that gets generated by IMP also lazily
>>>> instantiates the internal DeterministicParser. So perhaps IMP suffers
>>>> the same performance problems (I'm not sure if IMP reuses the
>>>> ParseController).
>>>> If someone can confirm that this is a bug I'll create a bug report
>>>> for it.
>>>> Greetings,
>>>> Dieter
>>



--
Erstellt mit Operas revolutionärem E-Mail-Modul: http://www.opera.com/mail/
Re: Possible performance bug in LPG? [message #575754 is a reply to message #28279] Sun, 28 June 2009 05:38 Go to previous message
Ed Willink is currently offline Ed WillinkFriend
Messages: 7655
Registered: July 2009
Senior Member
Hi Dieter

> This bug can be fixed very easy by adding the following lines to the
> reset method in the DeterministicParser:
> locationStack = null;
> parseStack= null;
> stateStack= null;

A bug with LPG should be raised against LPG and perhaps discussed
on an LPG newsgroup.

I am merely pointing out how difficult it is to avoid locking parsed
artefacts in memory and consequently how reuse of a parser perhaps
aggaravates the difficulties.

Regards

Ed Willink
Re: Possible performance bug in LPG? [message #575810 is a reply to message #28353] Sun, 28 June 2009 18:31 Go to previous message
Dieter Kleinrath is currently offline Dieter KleinrathFriend
Messages: 5
Registered: July 2009
Junior Member
Hi Ed,

Sorry. For whatever reason I thougth IMP and LPG are somehow connected.
The report would certainly belong to the LPG newsgroup.
Thanks for pointing out the memory problems associated with reusing a
parser in general.

Dieter


Am 28.06.2009, 07:38 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
> Hi Dieter
>
> > This bug can be fixed very easy by adding the following lines to the
> > reset method in the DeterministicParser:
> > locationStack = null;
> > parseStack= null;
> > stateStack= null;
>
> A bug with LPG should be raised against LPG and perhaps discussed
> on an LPG newsgroup.
>
> I am merely pointing out how difficult it is to avoid locking parsed
> artefacts in memory and consequently how reuse of a parser perhaps
> aggaravates the difficulties.
>
> Regards
>
> Ed Willink



--
Erstellt mit Operas revolutionärem E-Mail-Modul: http://www.opera.com/mail/
Re: Possible performance bug in LPG? [message #575984 is a reply to message #28391] Tue, 14 July 2009 16:53 Go to previous message
Robert M. Fuhrer is currently offline Robert M. FuhrerFriend
Messages: 294
Registered: July 2009
Senior Member
Hi there,

Sorry for not having responded earlier; I was on vacation and recovering
from an important deadline. :-)

Yes, LPG issues are best brought up on the LPG forum on SourceForge.

[The connection between IMP and LPG is that the main LPG developer,
Philippe Charles, is also a member of the IMP team, but they're really
separate projects.]

Dieter Kleinrath wrote:
> Hi Ed,
>
> Sorry. For whatever reason I thougth IMP and LPG are somehow connected.
> The report would certainly belong to the LPG newsgroup.
> Thanks for pointing out the memory problems associated with reusing a
> parser in general.
>
> Dieter
>
>
> Am 28.06.2009, 07:38 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
>> Hi Dieter
>>
>> > This bug can be fixed very easy by adding the following lines to the
>> > reset method in the DeterministicParser:
>> > locationStack = null;
>> > parseStack= null;
>> > stateStack= null;
>>
>> A bug with LPG should be raised against LPG and perhaps discussed
>> on an LPG newsgroup.
>>
>> I am merely pointing out how difficult it is to avoid locking parsed
>> artefacts in memory and consequently how reuse of a parser perhaps
>> aggaravates the difficulties.
>>
>> Regards
>>
>> Ed Willink
>
>
>


--
Cheers,
-- Bob

--------------------------------
Robert M. Fuhrer
Research Staff Member
Programming Technologies Dept.
IBM T.J. Watson Research Center

IDE Meta-tooling Platform Project Lead (http://www.eclipse.org/imp)
X10: Productive High-Performance Parallel Programming (http://x10.sf.net)
Previous Topic:When will IMP leave the incubator state?
Next Topic:Error handling - where to define messages
Goto Forum:
  


Current Time: Thu Apr 18 20:47:33 GMT 2024

Powered by FUDForum. Page generated in 0.03167 seconds
.:: Contact :: Home ::.

Powered by: FUDforum 3.0.2.
Copyright ©2001-2010 FUDforum Bulletin Board Software

Back to the top