Home » Archived » IMP » Possible performance bug in LPG?
Possible performance bug in LPG? [message #28123] |
Wed, 24 June 2009 18:22 |
Eclipse User |
|
|
|
Originally posted by: d.kleinrath.inode.at
While trying to improve the performance of my LPG parser I think I found=
a =
serious performance bug in LPG.
Actually I can not believe that this wasn't mentioned before so if this =
is =
known or if it is intended behaviour please appologize.
The problem arises if the same generated parser is reused to parse sever=
al =
files. When a file is parsed with lpg.runtime.DeterministicParser it fir=
st =
calls the method lpg.runtime.Stacks.reallocateStacks(). Here is the code=
=
from this method:
int old_stack_length =3D (stateStack =3D=3D null ? 0 : =
stateStack.length),
stack_length =3D old_stack_length + STACK_INCREMENT;
if (stateStack =3D=3D null)
{
stateStack =3D new int[stack_length];
locationStack =3D new int[stack_length];
parseStack =3D new Object[stack_length];
}
else
{
System.arraycopy(stateStack, 0, stateStack =3D new =
int[stack_length], 0, old_stack_length);
System.arraycopy(locationStack, 0, locationStack =3D new =
int[stack_length], 0, old_stack_length);
System.arraycopy(parseStack, 0, parseStack =3D new =
Object[stack_length], 0, old_stack_length);
}
return;
As you can see if the stateStack was already created, the parser will co=
py =
its contents into a new array with the size =
old_stack_length+STACK_INCREMENT. This is done because the =
reallocateStacks() method is also used to resize the stack when needed. =
=
This behaviour results in a memory leak and in serious performance =
problems when reusing a parser instance for compiling more than one file=
.. =
After I changed my code to not reuse the parser the compilation time for=
=
about 200 files was reduced from 10 seconds to under one second so it is=
=
now 10 times faster.
The ParseController that gets generated by IMP also lazily instantiates =
=
the internal DeterministicParser. So perhaps IMP suffers the same =
performance problems (I'm not sure if IMP reuses the ParseController).
If someone can confirm that this is a bug I'll create a bug report for i=
t.
Greetings,
Dieter
|
|
|
Re: Possible performance bug in LPG? [message #28163 is a reply to message #28123] |
Wed, 24 June 2009 20:29 |
Ed Willink Messages: 7670 Registered: July 2009 |
Senior Member |
|
|
Hi Dieter
If you're reparsing you need to work quite hard to make sure that you
use a new parser and null out every possible reference to anything in the
old Token array, or AST tree; otherwise memory leaks very fast. Use a new parser.
Regards
Ed Willink
Dieter Kleinrath wrote:
> While trying to improve the performance of my LPG parser I think I found
> a serious performance bug in LPG.
> Actually I can not believe that this wasn't mentioned before so if this
> is known or if it is intended behaviour please appologize.
>
> The problem arises if the same generated parser is reused to parse
> several files. When a file is parsed with
> lpg.runtime.DeterministicParser it first calls the method
> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this method:
>
> int old_stack_length = (stateStack == null ? 0 :
> stateStack.length),
> stack_length = old_stack_length + STACK_INCREMENT;
>
> if (stateStack == null)
> {
> stateStack = new int[stack_length];
> locationStack = new int[stack_length];
> parseStack = new Object[stack_length];
> }
> else
> {
> System.arraycopy(stateStack, 0, stateStack = new
> int[stack_length], 0, old_stack_length);
> System.arraycopy(locationStack, 0, locationStack = new
> int[stack_length], 0, old_stack_length);
> System.arraycopy(parseStack, 0, parseStack = new
> Object[stack_length], 0, old_stack_length);
> }
> return;
>
> As you can see if the stateStack was already created, the parser will
> copy its contents into a new array with the size
> old_stack_length+STACK_INCREMENT. This is done because the
> reallocateStacks() method is also used to resize the stack when needed.
> This behaviour results in a memory leak and in serious performance
> problems when reusing a parser instance for compiling more than one
> file. After I changed my code to not reuse the parser the compilation
> time for about 200 files was reduced from 10 seconds to under one second
> so it is now 10 times faster.
>
>
> The ParseController that gets generated by IMP also lazily instantiates
> the internal DeterministicParser. So perhaps IMP suffers the same
> performance problems (I'm not sure if IMP reuses the ParseController).
>
>
> If someone can confirm that this is a bug I'll create a bug report for it.
>
> Greetings,
> Dieter
|
|
|
Re: Possible performance bug in LPG? [message #28201 is a reply to message #28163] |
Wed, 24 June 2009 21:42 |
Eclipse User |
|
|
|
Originally posted by: d.kleinrath.inode.at
Ok. So if one actually wanted to reuse the parser one would have to null
out the arrays in lpg.runtime.Stacks manually. This should really be
documented somewhere in the parser class - it caused me quite some
headache.
Am 24.06.2009, 22:29 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
> Hi Dieter
>
> If you're reparsing you need to work quite hard to make sure that you
> use a new parser and null out every possible reference to anything in the
> old Token array, or AST tree; otherwise memory leaks very fast. Use a
> new parser.
>
> Regards
>
> Ed Willink
>
> Dieter Kleinrath wrote:
>> While trying to improve the performance of my LPG parser I think I
>> found a serious performance bug in LPG.
>> Actually I can not believe that this wasn't mentioned before so if this
>> is known or if it is intended behaviour please appologize.
>> The problem arises if the same generated parser is reused to parse
>> several files. When a file is parsed with
>> lpg.runtime.DeterministicParser it first calls the method
>> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this
>> method:
>> int old_stack_length = (stateStack == null ? 0 :
>> stateStack.length),
>> stack_length = old_stack_length + STACK_INCREMENT;
>> if (stateStack == null)
>> {
>> stateStack = new int[stack_length];
>> locationStack = new int[stack_length];
>> parseStack = new Object[stack_length];
>> }
>> else
>> {
>> System.arraycopy(stateStack, 0, stateStack = new
>> int[stack_length], 0, old_stack_length);
>> System.arraycopy(locationStack, 0, locationStack = new
>> int[stack_length], 0, old_stack_length);
>> System.arraycopy(parseStack, 0, parseStack = new
>> Object[stack_length], 0, old_stack_length);
>> }
>> return;
>> As you can see if the stateStack was already created, the parser will
>> copy its contents into a new array with the size
>> old_stack_length+STACK_INCREMENT. This is done because the
>> reallocateStacks() method is also used to resize the stack when needed.
>> This behaviour results in a memory leak and in serious performance
>> problems when reusing a parser instance for compiling more than one
>> file. After I changed my code to not reuse the parser the compilation
>> time for about 200 files was reduced from 10 seconds to under one
>> second so it is now 10 times faster.
>> The ParseController that gets generated by IMP also lazily
>> instantiates the internal DeterministicParser. So perhaps IMP suffers
>> the same performance problems (I'm not sure if IMP reuses the
>> ParseController).
>> If someone can confirm that this is a bug I'll create a bug report
>> for it.
>> Greetings,
>> Dieter
--
Erstellt mit Operas revolutionärem E-Mail-Modul: http://www.opera.com/mail/
|
|
|
Re: Possible performance bug in LPG? [message #28240 is a reply to message #28201] |
Thu, 25 June 2009 19:59 |
Ed Willink Messages: 7670 Registered: July 2009 |
Senior Member |
|
|
Hi Dieter
No it's 'your' own code that may be a problem.
If you accidentally hang on to some reference to some AST node, it
may reference all the other AST nodes, which may reference an
IToken node which may lock in all IToken nodes via perhaps an array.
One accidental non-nulled reference to the AST can really ripple.
You must null all your AST references so that the garbage collector
can clean up.
Regards
Ed Willink
Dieter Kleinrath wrote:
> Ok. So if one actually wanted to reuse the parser one would have to null
> out the arrays in lpg.runtime.Stacks manually. This should really be
> documented somewhere in the parser class - it caused me quite some
> headache.
>
>
> Am 24.06.2009, 22:29 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
>
>> Hi Dieter
>>
>> If you're reparsing you need to work quite hard to make sure that you
>> use a new parser and null out every possible reference to anything in the
>> old Token array, or AST tree; otherwise memory leaks very fast. Use a
>> new parser.
>>
>> Regards
>>
>> Ed Willink
>>
>> Dieter Kleinrath wrote:
>>> While trying to improve the performance of my LPG parser I think I
>>> found a serious performance bug in LPG.
>>> Actually I can not believe that this wasn't mentioned before so if
>>> this is known or if it is intended behaviour please appologize.
>>> The problem arises if the same generated parser is reused to parse
>>> several files. When a file is parsed with
>>> lpg.runtime.DeterministicParser it first calls the method
>>> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this
>>> method:
>>> int old_stack_length = (stateStack == null ? 0 :
>>> stateStack.length),
>>> stack_length = old_stack_length + STACK_INCREMENT;
>>> if (stateStack == null)
>>> {
>>> stateStack = new int[stack_length];
>>> locationStack = new int[stack_length];
>>> parseStack = new Object[stack_length];
>>> }
>>> else
>>> {
>>> System.arraycopy(stateStack, 0, stateStack = new
>>> int[stack_length], 0, old_stack_length);
>>> System.arraycopy(locationStack, 0, locationStack = new
>>> int[stack_length], 0, old_stack_length);
>>> System.arraycopy(parseStack, 0, parseStack = new
>>> Object[stack_length], 0, old_stack_length);
>>> }
>>> return;
>>> As you can see if the stateStack was already created, the parser
>>> will copy its contents into a new array with the size
>>> old_stack_length+STACK_INCREMENT. This is done because the
>>> reallocateStacks() method is also used to resize the stack when
>>> needed. This behaviour results in a memory leak and in serious
>>> performance problems when reusing a parser instance for compiling
>>> more than one file. After I changed my code to not reuse the parser
>>> the compilation time for about 200 files was reduced from 10 seconds
>>> to under one second so it is now 10 times faster.
>>> The ParseController that gets generated by IMP also lazily
>>> instantiates the internal DeterministicParser. So perhaps IMP suffers
>>> the same performance problems (I'm not sure if IMP reuses the
>>> ParseController).
>>> If someone can confirm that this is a bug I'll create a bug report
>>> for it.
>>> Greetings,
>>> Dieter
>
>
>
|
|
|
Re: Possible performance bug in LPG? [message #28279 is a reply to message #28240] |
Sat, 27 June 2009 09:39 |
Eclipse User |
|
|
|
Originally posted by: d.kleinrath.inode.at
Hi Ed,
Sorry to be annoying but I guess eather I don't understand you or you
don't understand me.
The memory leaks no matter if I keep references or not.
The method lpg.runtime.Stacks.reallocateStacks() does the following each
time a parser is reused:
It copies the arrays stateStack, locationStack and parseStack into new
arrays with the size "stack_length = old_stack_length + STACK_INCREMENT;".
The STACK_INCREMENT is by deafault 1024 so each time you reparse, the
stacks used by the parser will increase by this ammount. This extra stack
space is what leaks the memory and it is absolutly unnecessary because it
is never used. When compiling 200 files with the same parser there are
three stacks of size 205824 in the Parser and each of these stacks gets
copied into an even larger stack during the next parsing.
I just debugged the Stack class while using IMP and IMP suffers the same
memory leak during reconciling. If you keep changing an opened file, each
time IMP reconciles it the stacks will grow and grow and grow...
This bug can be fixed very easy by adding the following lines to the reset
method in the DeterministicParser:
locationStack = null;
parseStack= null;
stateStack= null;
I really hope I do not start to get too annoying with this issue but since
fixing it is so easy I thought I'll give it another try and I also think
that fixing it is really worth it.
best regards,
Dieter
Am 25.06.2009, 21:59 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
> Hi Dieter
>
> No it's 'your' own code that may be a problem.
>
> If you accidentally hang on to some reference to some AST node, it
> may reference all the other AST nodes, which may reference an
> IToken node which may lock in all IToken nodes via perhaps an array.
>
> One accidental non-nulled reference to the AST can really ripple.
> You must null all your AST references so that the garbage collector
> can clean up.
>
> Regards
>
> Ed Willink
>
>
> Dieter Kleinrath wrote:
>> Ok. So if one actually wanted to reuse the parser one would have to
>> null out the arrays in lpg.runtime.Stacks manually. This should really
>> be documented somewhere in the parser class - it caused me quite some
>> headache.
>> Am 24.06.2009, 22:29 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
>>
>>> Hi Dieter
>>>
>>> If you're reparsing you need to work quite hard to make sure that you
>>> use a new parser and null out every possible reference to anything in
>>> the
>>> old Token array, or AST tree; otherwise memory leaks very fast. Use a
>>> new parser.
>>>
>>> Regards
>>>
>>> Ed Willink
>>>
>>> Dieter Kleinrath wrote:
>>>> While trying to improve the performance of my LPG parser I think I
>>>> found a serious performance bug in LPG.
>>>> Actually I can not believe that this wasn't mentioned before so if
>>>> this is known or if it is intended behaviour please appologize.
>>>> The problem arises if the same generated parser is reused to parse
>>>> several files. When a file is parsed with
>>>> lpg.runtime.DeterministicParser it first calls the method
>>>> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this
>>>> method:
>>>> int old_stack_length = (stateStack == null ? 0 :
>>>> stateStack.length),
>>>> stack_length = old_stack_length + STACK_INCREMENT;
>>>> if (stateStack == null)
>>>> {
>>>> stateStack = new int[stack_length];
>>>> locationStack = new int[stack_length];
>>>> parseStack = new Object[stack_length];
>>>> }
>>>> else
>>>> {
>>>> System.arraycopy(stateStack, 0, stateStack = new
>>>> int[stack_length], 0, old_stack_length);
>>>> System.arraycopy(locationStack, 0, locationStack = new
>>>> int[stack_length], 0, old_stack_length);
>>>> System.arraycopy(parseStack, 0, parseStack = new
>>>> Object[stack_length], 0, old_stack_length);
>>>> }
>>>> return;
>>>> As you can see if the stateStack was already created, the parser
>>>> will copy its contents into a new array with the size
>>>> old_stack_length+STACK_INCREMENT. This is done because the
>>>> reallocateStacks() method is also used to resize the stack when
>>>> needed. This behaviour results in a memory leak and in serious
>>>> performance problems when reusing a parser instance for compiling
>>>> more than one file. After I changed my code to not reuse the parser
>>>> the compilation time for about 200 files was reduced from 10 seconds
>>>> to under one second so it is now 10 times faster.
>>>> The ParseController that gets generated by IMP also lazily
>>>> instantiates the internal DeterministicParser. So perhaps IMP suffers
>>>> the same performance problems (I'm not sure if IMP reuses the
>>>> ParseController).
>>>> If someone can confirm that this is a bug I'll create a bug report
>>>> for it.
>>>> Greetings,
>>>> Dieter
>>
--
Erstellt mit Operas revolutionärem E-Mail-Modul: http://www.opera.com/mail/
|
|
| |
Re: Possible performance bug in LPG? [message #28391 is a reply to message #28353] |
Sun, 28 June 2009 18:31 |
Eclipse User |
|
|
|
Originally posted by: d.kleinrath.inode.at
Hi Ed,
Sorry. For whatever reason I thougth IMP and LPG are somehow connected.
The report would certainly belong to the LPG newsgroup.
Thanks for pointing out the memory problems associated with reusing a
parser in general.
Dieter
Am 28.06.2009, 07:38 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
> Hi Dieter
>
> > This bug can be fixed very easy by adding the following lines to the
> > reset method in the DeterministicParser:
> > locationStack = null;
> > parseStack= null;
> > stateStack= null;
>
> A bug with LPG should be raised against LPG and perhaps discussed
> on an LPG newsgroup.
>
> I am merely pointing out how difficult it is to avoid locking parsed
> artefacts in memory and consequently how reuse of a parser perhaps
> aggaravates the difficulties.
>
> Regards
>
> Ed Willink
--
Erstellt mit Operas revolutionärem E-Mail-Modul: http://www.opera.com/mail/
|
|
|
Re: Possible performance bug in LPG? [message #29128 is a reply to message #28391] |
Tue, 14 July 2009 16:53 |
Robert M. Fuhrer Messages: 294 Registered: July 2009 |
Senior Member |
|
|
Hi there,
Sorry for not having responded earlier; I was on vacation and recovering
from an important deadline. :-)
Yes, LPG issues are best brought up on the LPG forum on SourceForge.
[The connection between IMP and LPG is that the main LPG developer,
Philippe Charles, is also a member of the IMP team, but they're really
separate projects.]
Dieter Kleinrath wrote:
> Hi Ed,
>
> Sorry. For whatever reason I thougth IMP and LPG are somehow connected.
> The report would certainly belong to the LPG newsgroup.
> Thanks for pointing out the memory problems associated with reusing a
> parser in general.
>
> Dieter
>
>
> Am 28.06.2009, 07:38 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
>> Hi Dieter
>>
>> > This bug can be fixed very easy by adding the following lines to the
>> > reset method in the DeterministicParser:
>> > locationStack = null;
>> > parseStack= null;
>> > stateStack= null;
>>
>> A bug with LPG should be raised against LPG and perhaps discussed
>> on an LPG newsgroup.
>>
>> I am merely pointing out how difficult it is to avoid locking parsed
>> artefacts in memory and consequently how reuse of a parser perhaps
>> aggaravates the difficulties.
>>
>> Regards
>>
>> Ed Willink
>
>
>
--
Cheers,
-- Bob
--------------------------------
Robert M. Fuhrer
Research Staff Member
Programming Technologies Dept.
IBM T.J. Watson Research Center
IDE Meta-tooling Platform Project Lead (http://www.eclipse.org/imp)
X10: Productive High-Performance Parallel Programming (http://x10.sf.net)
|
|
|
Re: Possible performance bug in LPG? [message #575596 is a reply to message #28123] |
Wed, 24 June 2009 20:29 |
Ed Willink Messages: 7670 Registered: July 2009 |
Senior Member |
|
|
Hi Dieter
If you're reparsing you need to work quite hard to make sure that you
use a new parser and null out every possible reference to anything in the
old Token array, or AST tree; otherwise memory leaks very fast. Use a new parser.
Regards
Ed Willink
Dieter Kleinrath wrote:
> While trying to improve the performance of my LPG parser I think I found
> a serious performance bug in LPG.
> Actually I can not believe that this wasn't mentioned before so if this
> is known or if it is intended behaviour please appologize.
>
> The problem arises if the same generated parser is reused to parse
> several files. When a file is parsed with
> lpg.runtime.DeterministicParser it first calls the method
> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this method:
>
> int old_stack_length = (stateStack == null ? 0 :
> stateStack.length),
> stack_length = old_stack_length + STACK_INCREMENT;
>
> if (stateStack == null)
> {
> stateStack = new int[stack_length];
> locationStack = new int[stack_length];
> parseStack = new Object[stack_length];
> }
> else
> {
> System.arraycopy(stateStack, 0, stateStack = new
> int[stack_length], 0, old_stack_length);
> System.arraycopy(locationStack, 0, locationStack = new
> int[stack_length], 0, old_stack_length);
> System.arraycopy(parseStack, 0, parseStack = new
> Object[stack_length], 0, old_stack_length);
> }
> return;
>
> As you can see if the stateStack was already created, the parser will
> copy its contents into a new array with the size
> old_stack_length+STACK_INCREMENT. This is done because the
> reallocateStacks() method is also used to resize the stack when needed.
> This behaviour results in a memory leak and in serious performance
> problems when reusing a parser instance for compiling more than one
> file. After I changed my code to not reuse the parser the compilation
> time for about 200 files was reduced from 10 seconds to under one second
> so it is now 10 times faster.
>
>
> The ParseController that gets generated by IMP also lazily instantiates
> the internal DeterministicParser. So perhaps IMP suffers the same
> performance problems (I'm not sure if IMP reuses the ParseController).
>
>
> If someone can confirm that this is a bug I'll create a bug report for it.
>
> Greetings,
> Dieter
|
|
|
Re: Possible performance bug in LPG? [message #575639 is a reply to message #28163] |
Wed, 24 June 2009 21:42 |
Dieter Kleinrath Messages: 5 Registered: July 2009 |
Junior Member |
|
|
Ok. So if one actually wanted to reuse the parser one would have to null
out the arrays in lpg.runtime.Stacks manually. This should really be
documented somewhere in the parser class - it caused me quite some
headache.
Am 24.06.2009, 22:29 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
> Hi Dieter
>
> If you're reparsing you need to work quite hard to make sure that you
> use a new parser and null out every possible reference to anything in the
> old Token array, or AST tree; otherwise memory leaks very fast. Use a
> new parser.
>
> Regards
>
> Ed Willink
>
> Dieter Kleinrath wrote:
>> While trying to improve the performance of my LPG parser I think I
>> found a serious performance bug in LPG.
>> Actually I can not believe that this wasn't mentioned before so if this
>> is known or if it is intended behaviour please appologize.
>> The problem arises if the same generated parser is reused to parse
>> several files. When a file is parsed with
>> lpg.runtime.DeterministicParser it first calls the method
>> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this
>> method:
>> int old_stack_length = (stateStack == null ? 0 :
>> stateStack.length),
>> stack_length = old_stack_length + STACK_INCREMENT;
>> if (stateStack == null)
>> {
>> stateStack = new int[stack_length];
>> locationStack = new int[stack_length];
>> parseStack = new Object[stack_length];
>> }
>> else
>> {
>> System.arraycopy(stateStack, 0, stateStack = new
>> int[stack_length], 0, old_stack_length);
>> System.arraycopy(locationStack, 0, locationStack = new
>> int[stack_length], 0, old_stack_length);
>> System.arraycopy(parseStack, 0, parseStack = new
>> Object[stack_length], 0, old_stack_length);
>> }
>> return;
>> As you can see if the stateStack was already created, the parser will
>> copy its contents into a new array with the size
>> old_stack_length+STACK_INCREMENT. This is done because the
>> reallocateStacks() method is also used to resize the stack when needed.
>> This behaviour results in a memory leak and in serious performance
>> problems when reusing a parser instance for compiling more than one
>> file. After I changed my code to not reuse the parser the compilation
>> time for about 200 files was reduced from 10 seconds to under one
>> second so it is now 10 times faster.
>> The ParseController that gets generated by IMP also lazily
>> instantiates the internal DeterministicParser. So perhaps IMP suffers
>> the same performance problems (I'm not sure if IMP reuses the
>> ParseController).
>> If someone can confirm that this is a bug I'll create a bug report
>> for it.
>> Greetings,
>> Dieter
--
Erstellt mit Operas revolutionärem E-Mail-Modul: http://www.opera.com/mail/
|
|
|
Re: Possible performance bug in LPG? [message #575662 is a reply to message #28201] |
Thu, 25 June 2009 19:59 |
Ed Willink Messages: 7670 Registered: July 2009 |
Senior Member |
|
|
Hi Dieter
No it's 'your' own code that may be a problem.
If you accidentally hang on to some reference to some AST node, it
may reference all the other AST nodes, which may reference an
IToken node which may lock in all IToken nodes via perhaps an array.
One accidental non-nulled reference to the AST can really ripple.
You must null all your AST references so that the garbage collector
can clean up.
Regards
Ed Willink
Dieter Kleinrath wrote:
> Ok. So if one actually wanted to reuse the parser one would have to null
> out the arrays in lpg.runtime.Stacks manually. This should really be
> documented somewhere in the parser class - it caused me quite some
> headache.
>
>
> Am 24.06.2009, 22:29 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
>
>> Hi Dieter
>>
>> If you're reparsing you need to work quite hard to make sure that you
>> use a new parser and null out every possible reference to anything in the
>> old Token array, or AST tree; otherwise memory leaks very fast. Use a
>> new parser.
>>
>> Regards
>>
>> Ed Willink
>>
>> Dieter Kleinrath wrote:
>>> While trying to improve the performance of my LPG parser I think I
>>> found a serious performance bug in LPG.
>>> Actually I can not believe that this wasn't mentioned before so if
>>> this is known or if it is intended behaviour please appologize.
>>> The problem arises if the same generated parser is reused to parse
>>> several files. When a file is parsed with
>>> lpg.runtime.DeterministicParser it first calls the method
>>> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this
>>> method:
>>> int old_stack_length = (stateStack == null ? 0 :
>>> stateStack.length),
>>> stack_length = old_stack_length + STACK_INCREMENT;
>>> if (stateStack == null)
>>> {
>>> stateStack = new int[stack_length];
>>> locationStack = new int[stack_length];
>>> parseStack = new Object[stack_length];
>>> }
>>> else
>>> {
>>> System.arraycopy(stateStack, 0, stateStack = new
>>> int[stack_length], 0, old_stack_length);
>>> System.arraycopy(locationStack, 0, locationStack = new
>>> int[stack_length], 0, old_stack_length);
>>> System.arraycopy(parseStack, 0, parseStack = new
>>> Object[stack_length], 0, old_stack_length);
>>> }
>>> return;
>>> As you can see if the stateStack was already created, the parser
>>> will copy its contents into a new array with the size
>>> old_stack_length+STACK_INCREMENT. This is done because the
>>> reallocateStacks() method is also used to resize the stack when
>>> needed. This behaviour results in a memory leak and in serious
>>> performance problems when reusing a parser instance for compiling
>>> more than one file. After I changed my code to not reuse the parser
>>> the compilation time for about 200 files was reduced from 10 seconds
>>> to under one second so it is now 10 times faster.
>>> The ParseController that gets generated by IMP also lazily
>>> instantiates the internal DeterministicParser. So perhaps IMP suffers
>>> the same performance problems (I'm not sure if IMP reuses the
>>> ParseController).
>>> If someone can confirm that this is a bug I'll create a bug report
>>> for it.
>>> Greetings,
>>> Dieter
>
>
>
|
|
|
Re: Possible performance bug in LPG? [message #575714 is a reply to message #28240] |
Sat, 27 June 2009 09:39 |
Dieter Kleinrath Messages: 5 Registered: July 2009 |
Junior Member |
|
|
Hi Ed,
Sorry to be annoying but I guess eather I don't understand you or you
don't understand me.
The memory leaks no matter if I keep references or not.
The method lpg.runtime.Stacks.reallocateStacks() does the following each
time a parser is reused:
It copies the arrays stateStack, locationStack and parseStack into new
arrays with the size "stack_length = old_stack_length + STACK_INCREMENT;".
The STACK_INCREMENT is by deafault 1024 so each time you reparse, the
stacks used by the parser will increase by this ammount. This extra stack
space is what leaks the memory and it is absolutly unnecessary because it
is never used. When compiling 200 files with the same parser there are
three stacks of size 205824 in the Parser and each of these stacks gets
copied into an even larger stack during the next parsing.
I just debugged the Stack class while using IMP and IMP suffers the same
memory leak during reconciling. If you keep changing an opened file, each
time IMP reconciles it the stacks will grow and grow and grow...
This bug can be fixed very easy by adding the following lines to the reset
method in the DeterministicParser:
locationStack = null;
parseStack= null;
stateStack= null;
I really hope I do not start to get too annoying with this issue but since
fixing it is so easy I thought I'll give it another try and I also think
that fixing it is really worth it.
best regards,
Dieter
Am 25.06.2009, 21:59 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
> Hi Dieter
>
> No it's 'your' own code that may be a problem.
>
> If you accidentally hang on to some reference to some AST node, it
> may reference all the other AST nodes, which may reference an
> IToken node which may lock in all IToken nodes via perhaps an array.
>
> One accidental non-nulled reference to the AST can really ripple.
> You must null all your AST references so that the garbage collector
> can clean up.
>
> Regards
>
> Ed Willink
>
>
> Dieter Kleinrath wrote:
>> Ok. So if one actually wanted to reuse the parser one would have to
>> null out the arrays in lpg.runtime.Stacks manually. This should really
>> be documented somewhere in the parser class - it caused me quite some
>> headache.
>> Am 24.06.2009, 22:29 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
>>
>>> Hi Dieter
>>>
>>> If you're reparsing you need to work quite hard to make sure that you
>>> use a new parser and null out every possible reference to anything in
>>> the
>>> old Token array, or AST tree; otherwise memory leaks very fast. Use a
>>> new parser.
>>>
>>> Regards
>>>
>>> Ed Willink
>>>
>>> Dieter Kleinrath wrote:
>>>> While trying to improve the performance of my LPG parser I think I
>>>> found a serious performance bug in LPG.
>>>> Actually I can not believe that this wasn't mentioned before so if
>>>> this is known or if it is intended behaviour please appologize.
>>>> The problem arises if the same generated parser is reused to parse
>>>> several files. When a file is parsed with
>>>> lpg.runtime.DeterministicParser it first calls the method
>>>> lpg.runtime.Stacks.reallocateStacks(). Here is the code from this
>>>> method:
>>>> int old_stack_length = (stateStack == null ? 0 :
>>>> stateStack.length),
>>>> stack_length = old_stack_length + STACK_INCREMENT;
>>>> if (stateStack == null)
>>>> {
>>>> stateStack = new int[stack_length];
>>>> locationStack = new int[stack_length];
>>>> parseStack = new Object[stack_length];
>>>> }
>>>> else
>>>> {
>>>> System.arraycopy(stateStack, 0, stateStack = new
>>>> int[stack_length], 0, old_stack_length);
>>>> System.arraycopy(locationStack, 0, locationStack = new
>>>> int[stack_length], 0, old_stack_length);
>>>> System.arraycopy(parseStack, 0, parseStack = new
>>>> Object[stack_length], 0, old_stack_length);
>>>> }
>>>> return;
>>>> As you can see if the stateStack was already created, the parser
>>>> will copy its contents into a new array with the size
>>>> old_stack_length+STACK_INCREMENT. This is done because the
>>>> reallocateStacks() method is also used to resize the stack when
>>>> needed. This behaviour results in a memory leak and in serious
>>>> performance problems when reusing a parser instance for compiling
>>>> more than one file. After I changed my code to not reuse the parser
>>>> the compilation time for about 200 files was reduced from 10 seconds
>>>> to under one second so it is now 10 times faster.
>>>> The ParseController that gets generated by IMP also lazily
>>>> instantiates the internal DeterministicParser. So perhaps IMP suffers
>>>> the same performance problems (I'm not sure if IMP reuses the
>>>> ParseController).
>>>> If someone can confirm that this is a bug I'll create a bug report
>>>> for it.
>>>> Greetings,
>>>> Dieter
>>
--
Erstellt mit Operas revolutionärem E-Mail-Modul: http://www.opera.com/mail/
|
|
| | |
Re: Possible performance bug in LPG? [message #575984 is a reply to message #28391] |
Tue, 14 July 2009 16:53 |
Robert M. Fuhrer Messages: 294 Registered: July 2009 |
Senior Member |
|
|
Hi there,
Sorry for not having responded earlier; I was on vacation and recovering
from an important deadline. :-)
Yes, LPG issues are best brought up on the LPG forum on SourceForge.
[The connection between IMP and LPG is that the main LPG developer,
Philippe Charles, is also a member of the IMP team, but they're really
separate projects.]
Dieter Kleinrath wrote:
> Hi Ed,
>
> Sorry. For whatever reason I thougth IMP and LPG are somehow connected.
> The report would certainly belong to the LPG newsgroup.
> Thanks for pointing out the memory problems associated with reusing a
> parser in general.
>
> Dieter
>
>
> Am 28.06.2009, 07:38 Uhr, schrieb Ed Willink <ed@willink.me.uk>:
>> Hi Dieter
>>
>> > This bug can be fixed very easy by adding the following lines to the
>> > reset method in the DeterministicParser:
>> > locationStack = null;
>> > parseStack= null;
>> > stateStack= null;
>>
>> A bug with LPG should be raised against LPG and perhaps discussed
>> on an LPG newsgroup.
>>
>> I am merely pointing out how difficult it is to avoid locking parsed
>> artefacts in memory and consequently how reuse of a parser perhaps
>> aggaravates the difficulties.
>>
>> Regards
>>
>> Ed Willink
>
>
>
--
Cheers,
-- Bob
--------------------------------
Robert M. Fuhrer
Research Staff Member
Programming Technologies Dept.
IBM T.J. Watson Research Center
IDE Meta-tooling Platform Project Lead (http://www.eclipse.org/imp)
X10: Productive High-Performance Parallel Programming (http://x10.sf.net)
|
|
|
Goto Forum:
Current Time: Wed Sep 25 19:28:04 GMT 2024
Powered by FUDForum. Page generated in 0.08417 seconds
|