RE: [cdt-dev] CDT performance

Thanks for looking into this Oleg. The indexer performance is our number one
problem and it is the top priority for my personal development work on the
CDT (now if I can get development work to be my top priority :( ).

We spent a lot of time back at IBM optimizing performance at this level, but
I'm not surprised that there is more we can do like this. Raising the issues
you have found as bug reports is a good idea, and I will accept patches that
address these issues, especially for 3.0.2.

However from what I've seen and heard, we really need to increase
performance by orders of magnitude and this will require a whole new
indexing strategy. My work on the PDOM for 3.1, which in the end is really
just a new indexer, is aimed at indexing each file once, somewhat like ctags
but with a real parser and a symbols database to fill in the gaps. The
accuracy will be a bit less but the performance gains should be very
significant.

In the meantime, I am recommending that users use the ctags indexer on large
projects. It doesn't pick up references and it doesn't handle heavily macro'd
code very well, but it is lightning fast (3 minutes on Mozilla's 17000 files)
and works in a lot of cases.

Cheers,
Doug

> -----Original Message-----
> From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On
> Behalf Of Krasilnikov, Oleg
> Sent: Friday, November 18, 2005 9:48 AM
> To: CDT General developers list.
> Cc: Voronin, Mikhail; Sennikovsky, Mikhail
> Subject: RE: [cdt-dev] CDT performance
> 
> Hi all.
> 
> I am not sure whether the thing described below is useful or not.
> What do you think, is it reasonable to continue this work?
> 
> In brief, there are a lot of complaints about CDT indexing speed.
> After tracing the process, I obtained the following results:
> 
> - approximately 2/3 of the time is consumed by the "parser.parse()"
>   operation. This is no secret. Maybe this process can be enhanced, but
>   for now I have no exact proposals. See further section.
> 
> - approximately 1/4 of the time is consumed by
>   "CGenerateIndexVisitor.visit(..)" calls.
>   In fact, these methods are called very frequently. For example, when I
>   tried a small C project (Midnight Commander source, 125 files),
>   visit(..) was called ~140 000 times. So it seems reasonable to tune
>   this code as much as possible.
> 
> I've tested some minor changes today. Details are below.
> In total, they can save ~1-2% of indexing time, according to my
> measurements.
> It's not much, but the work is not finished yet.
> 
> My question is:
> should a corresponding Bugzilla entry be created,
> or is all of this useless?
> 
> ------------------------------------------------------------------------
> All changes are related to package
> org.eclipse.cdt.internal.core.index.domsourceindexer
> 
> 1.	CGenerateIndexVisitor.getFullyQualifiedName():
> 
> Avoid calling "name.resolveBinding()"; instead pass the binding in from
> processNameBinding(), where it's already obtained.
> 
> 2.	CGenerateIndexVisitor.processNameBinding():
> 
> Avoid calling "IndexEncoderUtil.getFileLocation(name)",
> because a direct call to "name.getFileLocation()" returns the same value.
> The static method currently used is only a wrapper around it.
> 
> 3.	CGenerateIndexVisitor.processNameBinding():
> 
> Reorder if..elseif blocks according to their statistical frequency:
> 
> if (name.isReference())
> else if (name.isDefinition())
> else if (name.isDeclaration())
> 
> if (binding instanceof IVariable && !(binding instanceof IParameter))
> else if (binding instanceof IFunction)
> else if (binding instanceof IField)
> else if (binding instanceof ITypedef)
> else if (binding instanceof IEnumerator)
> else if (binding instanceof ICompositeType)
> else if (binding instanceof IEnumeration)
> 
> 4.	CGenerateIndexVisitor.processNameBinding():
> 
> Avoid calling "name.getTranslationUnit().getScope()" for each variable.
> Instead, store it in a class variable and initialize it in the object
> constructor.
> (Since a CGenerateIndexVisitor object is created for each translation
> unit, we can assume that the scope value will be constant during the
> object's lifetime.)
> 
> 5.	IndexEncoderUtil.nodeInVisitedExternalHeader()
> 
> Currently, only the last file name is remembered. When names alternate
> (stdio.h string.h stdio.h string.h etc), caching does not work, and full
> checks are performed again and again. It affects ~10% of
> 
> Providing a level 2 cache (in an ArrayList) can help to avoid extra
> operations.
> 
> Statistics:
> Total calls:        ~165 000
> Repeated last name:  148 767 (already implemented)
> Name in L2 cache:     14 421 (to be implemented)
> New names:             2 384 (full check is performed)
> 
> Note: the method mentioned is static. I cannot imagine how it can work
> with multiple threads.
> So it seems this functionality needs to be reviewed in general.
> 
> 6.	To be continued with:
> 
> - parse()
> - CPPGenerateIndexVisitor specific
> - and so on.
> -----------------------------------
> With best regards, Oleg Krasilnikov
> Software designer, Eclipse team.
> Intel corp.
> +7 8312 162 444 ext. 2587
> (Russia, Nizhny Novgorod)
> _______________________________________________
> cdt-dev mailing list
> cdt-dev@xxxxxxxxxxx
> https://dev.eclipse.org/mailman/listinfo/cdt-dev
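For illustration only, the level 2 cache proposed in item 5 of the quoted message could look roughly like the sketch below. This is not CDT code: the class name HeaderCheckCache, the check() method, and the fullCheck predicate are hypothetical stand-ins for whatever IndexEncoderUtil.nodeInVisitedExternalHeader() really does. It also swaps in a HashMap where the message suggests an ArrayList, and keeps the state per instance rather than static, so that each indexing thread could own its own cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch, not CDT code: a two-level cache in front of an
// expensive per-file-name check.
final class HeaderCheckCache {
    // Stand-in for the real, costly check (an assumption for this sketch).
    private final Predicate<String> fullCheck;

    // Level 1: the most recently checked file name and its result.
    private String lastName;
    private boolean lastResult;

    // Level 2: every file name checked so far, with its result.
    private final Map<String, Boolean> seen = new HashMap<>();

    // Counters mirroring the three rows of the statistics in the message.
    int lastNameHits;
    int level2Hits;
    int fullChecks;

    HeaderCheckCache(Predicate<String> fullCheck) {
        this.fullCheck = fullCheck;
    }

    boolean check(String fileName) {
        // Level 1 hit: same name as the previous call.
        if (fileName.equals(lastName)) {
            lastNameHits++;
            return lastResult;
        }
        boolean result;
        Boolean cached = seen.get(fileName);
        if (cached != null) {
            // Level 2 hit: name seen earlier, e.g. stdio.h after string.h.
            level2Hits++;
            result = cached;
        } else {
            // New name: run the full check once and remember the answer.
            fullChecks++;
            result = fullCheck.test(fileName);
            seen.put(fileName, result);
        }
        lastName = fileName;
        lastResult = result;
        return result;
    }
}
```

With an alternating sequence such as stdio.h, string.h, stdio.h, string.h, only the first two calls reach the full check; the later ones are answered from level 2.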

