Eclipse Runs out of Memory when Parsing Include files [message #25998] Wed, 24 April 2002 08:59
Originally posted by: hbhasker.yahoo.com

Hi,

I am having a problem with Eclipse. From what I understand, to enable code
completion one needs to parse the header files, so I set the
parsing level to comprehensive and asked it to parse the DirectX header
files and the Borland\Include header files. It always runs out of
memory when I ask it to parse both of them; if I ask it to parse only the
Borland include files, it manages to do it. But I would like to use code
completion for the DirectX functions as well.

Any suggestions as to how I can manage this? Has anybody faced a
similar problem parsing include files?

If I set the parsing level to fast, it parses comfortably, but then only
my class names show up; when I click on them, the functions of my
classes don't appear in the Project Details window. Why is this so?

Yet if I ask it to show all objects in the Project Object window, it
shows all the functions there.

Any and all help will be appreciated. :-)

regards,
Bhasker
Re: Eclipse Runs out of Memory when Parsing Include files [message #26081 is a reply to message #25998] Wed, 24 April 2002 10:38
Hi,

There are known memory consumption issues when parsing large numbers of
files. This is a problem that we are investigating now. You can use less
memory by changing the parsing level to something less comprehensive. For
code assist, the most comprehensive parse level is not needed - I think the
next highest level should work and consume less memory.

The functions problem that you mention is related to something else being
worked on. Right now, the parser treats function declarations and
definitions as different entities. In the fast parse, the parser does not go
below the class declaration level, but it does find the function definitions
for the functions of a class. Work is now being done to connect the
function definitions to the declarations so that the definitions will be
accessible from under the class. This is described here:
http://bugs.eclipse.org/bugs/show_bug.cgi?id=7544

Hope that helps,
Dave

Re: Eclipse Runs out of Memory when Parsing Include files [message #26399 is a reply to message #26081] Thu, 25 April 2002 00:09
Originally posted by: hbhasker.yahoo.com

Hi David,

From what I understand, why not use something like ctags for finding the
declarations and locating references across all files, the way the V IDE
does? Also, is using an XML file for storing all the declarations a good
idea? From what I understand of XML DOM parsers, they require the whole
tree to be in memory; correct me if I am wrong. How do MSVC or Builder
handle such issues? Wouldn't it be better to use some database structure
and look things up there? The parsed XML information also tends to create
a really big file: just parsing the DirectX and Windows headers produced a
17MB file or more, at one step below the comprehensive level. Is this
really feasible? I have seen that Eclipse hogs huge amounts of memory once
you ask it to parse the header files, sometimes close to 78MB or so.
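
To illustrate what I mean about DOM, here is a minimal sketch using the
standard javax.xml APIs (the file name is made up): a DOM parse
materializes the whole document in memory, while a SAX parse streams it
through callbacks.

// Minimal sketch: DOM materializes the whole tree in memory, SAX streams.
// Uses the standard javax.xml APIs; "symbols.xml" is a made-up file name.
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class XmlMemoryDemo {
    public static void main(String[] args) throws Exception {
        File f = new File("symbols.xml");

        // DOM: the entire document becomes an in-memory object tree,
        // so a 17MB file turns into a much larger object graph.
        Document dom = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(f);
        System.out.println("DOM root: " + dom.getDocumentElement().getTagName());

        // SAX: elements are reported through callbacks as they are read;
        // nothing is retained, so memory use stays flat.
        SAXParserFactory.newInstance().newSAXParser().parse(f,
                new DefaultHandler() {
                    private int count = 0;
                    public void startElement(String uri, String local,
                            String qName, Attributes attrs) {
                        count++;    // count elements without building a tree
                    }
                    public void endDocument() {
                        System.out.println("SAX saw " + count + " elements");
                    }
                });
    }
}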

do let me know your views on the same,
regards,
Bhasker

Re: Eclipse Runs out of Memory when Parsing Include files [message #26496 is a reply to message #26399] Thu, 25 April 2002 09:32
Hi Bhasker,

Improving the performance of the parser is a definite work item which will be
done under Enhancement 14600
(http://bugs.eclipse.org/bugs/show_bug.cgi?id=14600). All methods of
improving parser performance will be "on the table", such as using ctags or
a database backend as you suggest. Some of the reasons we initially decided
against using ctags were:

1. It doesn't provide enough information. Things such as function
callers/callees, where variables are used in statements, parameter lists,
etc. are not supported.
2. Extending it is not easy. If there were features that we wished to add,
we could provide our own language parser, but then we would get into the
business of shipping different builds of it for the various platforms that
we run on (or requiring the user to build it first). Having a parser
written in Java allows it to run anywhere that has a JRE.
3. ctags does not do preprocessing. I guess you could run the code through
something like m4 before giving it to ctags, but then you lose information
about where macros are used. From the ctags documentation:
"Because ctags is neither a preprocessor nor a compiler, use of preprocessor
macros can fool ctags into either missing tags or improperly generating
inappropriate tags"

Anyway, these reasons will need to be re-examined. Another interesting
development is the discussion on the cdt-dev mailing list about coming up
with APIs that will allow various pieces of the CDT to be replaceable. The
parser is an obvious candidate: if we could come up with a good set of
APIs, then various parsers, each with its own strengths, could be
plugged in by the user depending on what sort of requirements they have.
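
To make the idea concrete, such an API might look roughly like the sketch
below. This is purely hypothetical; none of these interfaces exist in the
CDT today, and all the names are invented.

// Hypothetical sketch of a pluggable parser API. None of these
// interfaces exist in the CDT; the names are invented to illustrate
// swapping parser implementations behind a common contract.
import java.io.Reader;
import java.util.List;

interface ISymbol {
    String getName();   // symbol name, e.g. a class or function
    String getFile();   // file it was declared in
    int getLine();      // line of the declaration
}

interface IParser {
    // Parse one translation unit and return the symbols found.
    List<ISymbol> parse(String fileName, Reader source) throws Exception;
}

// A ctags-backed, fuzzy, or database-backed parser could then be
// registered in place of the default one:
class ParserRegistry {
    private IParser parser;
    public void setParser(IParser p) { parser = p; }
    public IParser getParser() { return parser; }
}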

Jeff.
Re: Eclipse Runs out of Memory when Parsing Include files [message #27104 is a reply to message #26496] Fri, 26 April 2002 17:12
>>>>> "Jeff" == Jeff Turnham <turnham@ca.ibm.com> writes:

Jeff> Improving the performance of the parser is a definite work item
Jeff> which will be done under Enhancement 14600
Jeff> (http://bugs.eclipse.org/bugs/show_bug.cgi?id=14600). All
Jeff> methods of improving parser performance will be "on the table",
Jeff> such as using ctags or a database backend as you suggest.

Great! We've done some investigation and we've also found problems
with the current parser performance (specifically memory use). So far
we haven't tried to find out exactly what the problem might be.

Jeff> Some of the reasons we initially decided against using ctags
Jeff> were:

I agree, ctags doesn't seem all that useful. It really provides the
bare minimum of information.

Jeff> 3. ctags does not do preprocessing...I guess you could run the
Jeff> code through something like m4 before giving it to ctags, but
Jeff> then you lose information about where macros are used.

I'm curious about this point. How well does the current parser deal
with code that can't compile? How does it handle
conditionally-compiled code? Is there any documentation on the parser
I could read? I looked at the source a little but there isn't too
much documentation there, especially on the "big picture".


For Source Navigator (http://sources.redhat.com/sourcenav/), we used a
fuzzy parser, which, as I recall, didn't run the preprocessor. So
generally it had to guess about certain things. In practice this
turned out to work fairly well. It had the nice feature of being able
to do something intelligent with code that wouldn't compile or
preprocess; this is useful in a number of situations.

S-N's parsers fed all the information into a group of Berkeley
databases (as I recall -- I haven't worked on S-N for a few years --
they were all just different ways of indexing the same data). We ran
the parser once per file in the project (and automatically re-ran it
whenever the file was saved). This approach had a bit of overhead but
the memory use was low; we were able to scan all of gcc using S-N.
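
In outline, the scheme looks something like the following toy Java sketch.
It is invented for illustration only; S-N's real parsers are C programs
feeding Berkeley DB tables, not a Java map.

// Toy sketch of per-file indexing: each file is parsed on its own and
// its tags replace the previous entries for that file, so parser memory
// stays proportional to one file while the index covers the project.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Tag {
    final String symbol;
    final int line;
    Tag(String symbol, int line) { this.symbol = symbol; this.line = line; }
}

class ProjectIndex {
    // one bucket per file; re-parsing a file simply replaces its bucket
    private final Map<String, List<Tag>> byFile = new HashMap<>();

    void reindex(String file, List<Tag> tags) {
        byFile.put(file, tags);   // run once per file, and again on save
    }

    List<Tag> lookup(String symbol) {
        List<Tag> hits = new ArrayList<>();
        for (List<Tag> tags : byFile.values())
            for (Tag t : tags)
                if (t.symbol.equals(symbol))
                    hits.add(t);
        return hits;
    }
}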

Jeff> Another interesting development is the discussions on the
Jeff> cdt-dev mailing lists about coming up with APIs that will allow
Jeff> various pieces of the CDT to be replaceable. The parser is an
Jeff> obvious candidate...

This would be interesting.

I'm curious to know if the parser is the major contributor to the
problem, or if the data store itself is. Have your investigations
looked at this? (If I'm way off base here, please point it out. I'm
still early in the learning phase here.)

It is possible that we could re-use the existing S-N parsers for
Eclipse. They're written in C. For us, having to rebuild parsers for
a particular machine isn't a big problem. I think we'd gladly trade
this little build difficulty for a performance/memory usage gain
(barring other factors). The S-N parsers are GPL; Red Hat might be
willing to relicense them (I'd have to ask management).

Tom
Re: Eclipse Runs out of Memory when Parsing Include files [message #27182 is a reply to message #27104] Fri, 26 April 2002 18:25
Tom Tromey wrote:

> I'm curious to know if the parser is the major contributor to the
> problem, or if the data store itself is. Have your investigations
> looked at this?

The amount of memory taken up by DataElements in the DataStore is
certainly a big part of the memory problem. Right now, every piece of
parse information is stored in DataElement form, and the strings used
to represent attributes of the elements seem to be consuming most
of the memory. DataElements are used in order to get information into
views, to communicate information across a network, and as the means of
integration with other tools (miners). In the case of the parser, we
should really only need to turn parse info into DataElements when we want
to provide information to the UI or other tools. That is one way that
memory can be reduced specifically for the parser. The other thing is
that, for DataElements in general, we need to provide a more optimal way
of managing strings. Currently there is a lot of redundant string
information that could be reduced; for example, the source attribute of
each parse element contains both the qualified name of the source file
that was parsed and the line of the file in which the element was
found. If a single file contains 1000 elements, then there could be 1000
duplicates of the string that represents the same source file name. One
improvement would be to have a single element that represents the source
file attribute (without the line) and have each parse element reference
it instead of containing its own copy.
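
A rough sketch of that sharing idea, using invented names rather than the
actual DataStore classes:

// Hypothetical flyweight for the source-file attribute: all parse
// elements from one file share a single FileRef instead of each holding
// its own copy of the qualified file name. Invented names, not the
// actual DataStore API.
import java.util.HashMap;
import java.util.Map;

class FileRef {
    final String qualifiedName;
    FileRef(String qualifiedName) { this.qualifiedName = qualifiedName; }
}

class FileRefPool {
    private final Map<String, FileRef> pool = new HashMap<>();

    // one shared instance per distinct file name
    FileRef get(String qualifiedName) {
        return pool.computeIfAbsent(qualifiedName, FileRef::new);
    }
}

class ParseElement {
    final FileRef source;  // shared reference replaces 1000 string copies
    final int line;        // the per-element part stays per element
    ParseElement(FileRef source, int line) {
        this.source = source;
        this.line = line;
    }
}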



Dave