[cdt-core-dev] Re: Open C++ Parser
I have a question about C/C++ parsing. We have a fair-sized body of C code, maybe 30 million lines, that makes heavy use of .h files. Each C file might #include 10k or 100k lines of header files, and I'm concerned that a real C/C++ parser would bog down on that (not to mention the need to list 300+ include file directories, which vary with the target you are compiling for and with which of 10k executables you are building). The question is: how far can you get without doing a real preprocessor phase? If macros are fairly well formed (i.e. they expand to complete expressions), as most are, it seems to me that a parser would not need to visit the #include files or have a full preprocessor in order to understand the structure of the C code. The only trouble would be with code that uses macros like:
#define BEGIN {
#define END }
void main(void)
BEGIN
printf("hi");
END
Or:
#define Doit dofunc(X); /* note semi */
...
X=5;
Doit
X=4;
Doit
X=3;
Doit
These would be very complicated to handle in any round-trip CDOM or code formatter anyway, don't you think? So if you could handle the simpler case that doesn't need the preprocessor, then I think you could do the job much faster for large projects. The big disadvantage is that most people put their structure definitions in header files, so if you don't look there you can't do code completion, member refactoring, and other things that depend on those types. Maybe there's a middle ground I'm not thinking of...
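To make "well formed" a bit more concrete, here is a rough sketch of the kind of check I have in mind. It is only an illustration, not anything that exists in the CDT or Open C++: the function name macro_is_well_formed is made up, and the rule it applies (brackets balance and the replacement text does not end in a semicolon) is my own assumption about what a parser could safely treat as an expression without expanding it. It skips /* */ comments and string and character literals, and ignores everything else a real scanner would have to deal with.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when a macro's replacement text looks
 * like a self-contained expression, i.e. its (), {} and [] balance and it
 * does not end in a semicolon. */
bool macro_is_well_formed(const char *body)
{
    int paren = 0, brace = 0, bracket = 0;
    char last = '\0';                    /* last significant character seen */

    for (size_t i = 0; body[i] != '\0'; i++) {
        char c = body[i];

        /* Skip block comments such as the "note semi" one. */
        if (c == '/' && body[i + 1] == '*') {
            i += 2;
            while (body[i] != '\0' && !(body[i] == '*' && body[i + 1] == '/'))
                i++;
            if (body[i] == '\0')
                break;                   /* unterminated comment: stop scanning */
            i++;                         /* now on the closing '/' */
            continue;
        }

        /* Skip string and character literals so brackets inside them
         * are not counted. */
        if (c == '"' || c == '\'') {
            char quote = c;
            i++;
            while (body[i] != '\0' && body[i] != quote) {
                if (body[i] == '\\' && body[i + 1] != '\0')
                    i++;                 /* skip the escaped character */
                i++;
            }
            if (body[i] == '\0')
                return false;            /* unterminated literal */
            last = quote;
            continue;
        }

        if (isspace((unsigned char)c))
            continue;

        switch (c) {
        case '(': paren++;   break;
        case ')': paren--;   break;
        case '{': brace++;   break;
        case '}': brace--;   break;
        case '[': bracket++; break;
        case ']': bracket--; break;
        default: break;
        }
        if (paren < 0 || brace < 0 || bracket < 0)
            return false;                /* closes something it never opened */
        last = c;
    }
    return paren == 0 && brace == 0 && bracket == 0 && last != ';';
}
```

With this check, the BEGIN, END and Doit macros above would all be flagged as needing real expansion, while something like ((a) + (b)) would pass and could be left as an opaque identifier in the parse tree.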
Anyway, if nobody agrees with this and a preprocessor is added, then I'd just like to request that care be taken to use caching, pretokenizing, incremental or background processing, or other techniques to handle large projects; otherwise it could be too slow to be useful.
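As a sketch of the caching idea, something along these lines would let a header that has not changed be tokenized only once. Everything here is hypothetical: the names header_cache, header_cache_store and header_cache_lookup are invented for illustration, token_count stands in for whatever pretokenized data a real scanner would keep, and I am assuming the caller gets the modification time from something like stat().

```c
#include <stdbool.h>
#include <string.h>
#include <time.h>

#define CACHE_SLOTS  64
#define MAX_PATH_LEN 256

/* One cached header: the path, the modification time it was scanned at,
 * and a stand-in for the pretokenized result. */
struct header_entry {
    char   path[MAX_PATH_LEN];
    time_t mtime;
    int    token_count;
    bool   used;
};

struct header_cache {
    struct header_entry slots[CACHE_SLOTS];
    int hits;
    int misses;
};

/* Simple string hash (djb2) mapped onto a slot index. */
static unsigned slot_for(const char *path)
{
    unsigned h = 5381;
    for (; *path != '\0'; path++)
        h = h * 33u + (unsigned char)*path;
    return h % CACHE_SLOTS;
}

/* Record the result of scanning a header. The table is direct-mapped,
 * so a colliding entry is simply overwritten. */
void header_cache_store(struct header_cache *c, const char *path,
                        time_t mtime, int token_count)
{
    struct header_entry *e = &c->slots[slot_for(path)];
    strncpy(e->path, path, MAX_PATH_LEN - 1);
    e->path[MAX_PATH_LEN - 1] = '\0';
    e->mtime = mtime;
    e->token_count = token_count;
    e->used = true;
}

/* Return true and fill *token_count if the header was scanned before
 * and has the same modification time; otherwise report a miss. */
bool header_cache_lookup(struct header_cache *c, const char *path,
                         time_t mtime, int *token_count)
{
    struct header_entry *e = &c->slots[slot_for(path)];
    if (e->used && e->mtime == mtime && strcmp(e->path, path) == 0) {
        *token_count = e->token_count;
        c->hits++;
        return true;
    }
    c->misses++;
    return false;
}
```

Keying on path and modification time alone is only safe for the raw token stream. Anything cached after macro expansion would also have to depend on which macros were defined at the point of inclusion, since that changes from target to target.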
Thanks.
> -----Original Message-----
>
> Message: 1
> From: "Schaefer, Doug" <dschaefer@xxxxxxxxxxxx>
> To: "'cdt-core-dev@xxxxxxxxxxx'" <cdt-core-dev@xxxxxxxxxxx>
> Date: Thu, 12 Dec 2002 11:39:14 -0500
> Subject: [cdt-core-dev] Open C++ Parser
> Reply-To: cdt-core-dev@xxxxxxxxxxx
>
> Hey all,
>
> We're having a lot of fun here porting the Open C++ parser to Java for
> potential use in the CDT. We've started parsing some basic things like
> stdio.h and performance up until now seems to be reasonable.
>
> However, before we get too far, people on the conference call on Monday
> mentioned they had experience with the Open C++ parser and were willing to
> share that with us. If you were one of those persons, could you please drop
> us a line. I have made some changes to make the parser do what we want
> including: our own handwritten scanner that also handles pre-processor
> directives, using exceptions for backtracking, replacing the Ptree with our
> own JDT-like AST, amongst other minor changes. In the end, we're really
> only using the grammar and the strategy of handwriting the parser. And it
> seems to be working although there is a lot of testing that needs to go
> on...
>
> Cheers,
>
> Doug Schaefer
> Senior Staff Software Engineer
>
> Rational - the software development company
> Ottawa (Kanata), Ontario, Canada