RE: [cdt-dev] Performance problems building C++ ASTs

Yes. Oh what joy. :)
So what should I be doing? I have tried this:

ITranslationUnit tu = ...
IIndex index = CCorePlugin.getIndexManager().getIndex(tu.getCProject());
IASTTranslationUnit ast = tu.getAST(index, ITranslationUnit.AST_SKIP_ALL_HEADERS);
// initially I was doing just getAST() with no args

which does indeed run much faster :) but then I can't find declarations for anything.

If I do getAST(index, 0), there is no speedup.

What I'm trying to do is walk the AST and, for each API found, determine whether it comes from a header file in a particular path.
Is there a way to only have it, say, parse the header files of the ones I'm interested in?
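
For the "header file in a particular path" test itself, here is a minimal sketch. It assumes the declaring file's name is already available as a string; in CDT that would presumably come from the file location of the declaration's node (for example IASTFileLocation.getFileName()), but that lookup is not shown and is an assumption here. HeaderPathFilter and isUnderRoot are hypothetical names, and only the JDK is used.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: decides whether a file lies under a given include root. */
final class HeaderPathFilter {
    private final Path root;

    HeaderPathFilter(String includeRoot) {
        // Normalize once so "." and ".." segments cannot defeat the comparison.
        this.root = Paths.get(includeRoot).toAbsolutePath().normalize();
    }

    /** Returns true if fileName is the root itself or lies anywhere below it. */
    boolean isUnderRoot(String fileName) {
        if (fileName == null) {
            return false;
        }
        Path candidate = Paths.get(fileName).toAbsolutePath().normalize();
        // Path.startsWith compares whole name elements, so a root of
        // /usr/include/mpi does not match /usr/include/mpich.
        return candidate.startsWith(root);
    }
}
```

Comparing normalized Path objects rather than raw strings avoids false matches on sibling directories that merely share a prefix.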

Then, in some other code by a different author, doing a very different analysis, we get the AST like this (I have not tested its performance on large C++ apps, because it was aimed at C-only apps anyway):
IFile file = ...
IASTTranslationUnit ast = CDOM.getInstance().getASTService().getTranslationUnit(file,
    CDOM.getInstance().getCodeReaderFactory(CDOM.PARSE_SAVED_RESOURCES));

Pros/cons of each way? Suggestions for improving?


...Beth

Beth Tibbitts (859) 243-4981 (TL 545-4981)
High Productivity Tools / Parallel Tools http://eclipse.org/ptp
IBM T.J.Watson Research Center
Mailing Address: IBM Corp., 745 West New Circle Road, Lexington, KY 40511
"Schaefer, Doug" <Doug.Schaefer@xxxxxxxxxxxxx>
Sent by: cdt-dev-bounces@xxxxxxxxxxx
05/21/08 03:18 PM
Please respond to: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
To: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
Subject: RE: [cdt-dev] Performance problems building C++ ASTs

It sounds like you're joining in the fun we've had over the last 5 years with performance of parsing :) Welcome!

Does the LOC include header files? That's the usual killer in C++ land. iostream with all its template madness is huge.

Your times look very familiar. The way I figured it, the first run read all the header files into the operating system's cache, making the subsequent ones a lot faster (this happens almost all the time). I usually ran the performance tests until I got three in a row that were close.
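
The "three in a row that were close" rule can be sketched as below. This is only an illustration under stated assumptions: StableTimer, the relative tolerance, and the run cap are hypothetical and not part of CDT, and System.nanoTime() is swapped in for the System.currentTimeMillis() used elsewhere in this thread.

```java
import java.util.Arrays;

/** Hypothetical helper: repeat a task until three consecutive timings agree. */
final class StableTimer {

    /** True if the three values differ by at most tol, relative to the largest. */
    static boolean closeEnough(double a, double b, double c, double tol) {
        double max = Math.max(a, Math.max(b, c));
        double min = Math.min(a, Math.min(b, c));
        return max - min <= tol * max;
    }

    /**
     * Returns the mean of the first window of three consecutive samples
     * that are close to each other, or NaN if no such window exists.
     */
    static double stableMean(double[] samples, double tol) {
        for (int i = 2; i < samples.length; i++) {
            if (closeEnough(samples[i - 2], samples[i - 1], samples[i], tol)) {
                return (samples[i - 2] + samples[i - 1] + samples[i]) / 3.0;
            }
        }
        return Double.NaN;
    }

    /** Runs the task up to maxRuns times; returns stable mean seconds or NaN. */
    static double measureSeconds(Runnable task, double tol, int maxRuns) {
        double[] samples = new double[maxRuns];
        for (int n = 0; n < maxRuns; n++) {
            long start = System.nanoTime();
            task.run();
            samples[n] = (System.nanoTime() - start) / 1e9;
            // Re-check after every run so we stop as soon as timings settle.
            double mean = stableMean(Arrays.copyOfRange(samples, 0, n + 1), tol);
            if (!Double.isNaN(mean)) {
                return mean;
            }
        }
        return Double.NaN;
    }
}
```

With a 20% tolerance, the three C++ timings reported in this thread (3.3, 1.6, 1.9 sec) would not yet count as stable, because the cold first run dominates the window.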

In the end, this is what led us to the architecture of getting information out of the index instead of parsing the header files. The performance gains were dramatic.

Cheers,
Doug.


From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Beth Tibbitts
Sent: Wednesday, May 21, 2008 3:11 PM
To: CDT General developers list.
Subject: RE: [cdt-dev] Performance problems building C++ ASTs

To be honest, I don't think I ever ran the analysis on a large MPI C++ project, only on small C++ test cases; the large benchmark projects I used were C-only.
As my C++ test cases got bigger I noticed the slowdown, and it *may* have been there for a long time.
Blame Greg. He found a large C++ MPI program testcase. :)
But if this helps me speed 'em all up, then that's terrific.

>apples vs. apples
The difference in time is remarkable, yes.
I don't know about #tokens but here are my rough test cases in lines of code ... tokens would be roughly proportional.

C++ loc=56 getAST() timing, three tries: 3.3,1.6,1.9 sec
C loc=80 getAST() timing, three tries: .19,.10,.03 sec

These times are just System.currentTimeMillis() before and after the getAST() call (with no args; I'm looking at using args now).

I assume the first run does some setup that the subsequent runs get to share.
I even did the C one before the C++ one once, but the times were still in the same ballpark.
I repeated it a couple of times, bringing up the runtime workbench anew each time, and the results were always in the same ballpark too.

This is CDT 4.0.3.200802251018


...Beth

"Schaefer, Doug" <Doug.Schaefer@xxxxxxxxxxxxx>
Sent by: cdt-dev-bounces@xxxxxxxxxxx
05/21/08 02:34 PM
Please respond to: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
To: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
Subject: RE: [cdt-dev] Performance problems building C++ ASTs

Before going too far, Beth, is this a regression? Was it faster before?


Our timing for the trilogy (stdio.h, windows.h, iostream) was always in the 3 second area doing a full parse. The time of the parse is relative to the number of tokens read in. Are you comparing C apples with C++ apples?


Cheers,
Doug.




From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Mike Kucera
Sent: Wednesday, May 21, 2008 2:15 PM
To: CDT General developers list.
Subject: RE: [cdt-dev] Performance problems building C++ ASTs

I think you can get a reference to the IIndex using CCorePlugin.getIndexManager().getIndex(tu.getCProject()).
Then call getAST() with the option ITranslationUnit.AST_SKIP_ALL_HEADERS.

This will cause the preprocessor to ignore #include directives, which is a huge performance boost. Bindings and macro calls will be resolved using the index when necessary.

I don't know if the OpenMP analysis code depends on a full parse though.

But it seems strange that parsing would be so much faster for C than C++. Are you calling the parser exactly the same way in both cases? I can see that the current code uses the CDOM.

Is CDOM something that is supposed to be deprecated eventually? Should we do it for 5.0?


Mike Kucera
Software Developer
IBM Eclipse CDT Team
mkucera@xxxxxxxxxx



Beth Tibbitts <tibbitts@xxxxxxxxxx>
Sent by: cdt-dev-bounces@xxxxxxxxxxx
05/21/2008 01:13 PM
Please respond to: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
To: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
Subject: RE: [cdt-dev] Performance problems building C++ ASTs

So how do I improve this? Can you provide some suggestions?

...Beth


"Schorn, Markus" <Markus.Schorn@xxxxxxxxxxxxx>
Sent by: cdt-dev-bounces@xxxxxxxxxxx
05/21/08 01:00 PM
Please respond to: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
To: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
Subject: RE: [cdt-dev] Performance problems building C++ ASTs

The speed depends on the options you use with ITranslationUnit.getAST(...). If you are not using an index, parsing an entire project will be too slow.
Markus.



From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Beth Tibbitts
Sent: Wednesday, May 21, 2008 5:01 PM
To: cdt-dev@xxxxxxxxxxx
Subject: [cdt-dev] Performance problems building C++ ASTs
Importance: Low

I'm tracking down a performance problem (CDT 4.0.3 for now) running our PTP analysis tools on C++ code.
Getting the IASTTranslationUnit from the ITranslationUnit seems to be the culprit.
For C code,

IASTTranslationUnit atu = itranslationunit.getAST();

takes a few hundredths of a second.

For a C++ file, it takes on the order of a second and a half.

Any idea how I can speed this up? Analyzing a large C++ project is impossible at this rate.
Most of our initial analysis was on plain C code and it seems to pose no large problems.



...Beth

_______________________________________________
cdt-dev mailing list
cdt-dev@xxxxxxxxxxx

https://dev.eclipse.org/mailman/listinfo/cdt-dev