RE: [cdt-dev] decoupled preprocessor


From: Mike Kucera <mkucera@xxxxxxxxxx>
Date: Tue, 19 Jun 2007 17:40:01 -0400

It looks like you are planning to do preprocessing on the raw character
stream and then feed the result to your ANTLR lexer.

The C99 preprocessor works differently: it processes a token stream, not a
character stream. It creates a CodeReader for each include, passes it to
the lexer and expects a token stream as the result. It then adds the token
stream to its own input and continues processing.
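To make the include handling concrete, here is a rough sketch of the idea, not the actual C99 preprocessor code. Every name in it (IncludeSplicingSketch, lex, preprocess, the files map) is made up for illustration; the map stands in for opening a CodeReader per include, and the whitespace-splitting lexer stands in for the real one. There are no include guards, so a self-including file would loop forever.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: all names are invented, none come from CDT.
final class IncludeSplicingSketch {

    // Stand-in lexer: splits on whitespace. A real lexer would also
    // classify tokens and record their offsets.
    static List<String> lex(String source) {
        List<String> tokens = new ArrayList<>();
        for (String t : source.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Preprocesses a token stream. `files` maps include names to source text
    // (an assumption standing in for one reader per include).
    static List<String> preprocess(String source, Map<String, String> files) {
        Deque<String> input = new ArrayDeque<>(lex(source));
        List<String> output = new ArrayList<>();
        while (!input.isEmpty()) {
            String token = input.removeFirst();
            if (token.equals("#include") && !input.isEmpty()) {
                String name = input.removeFirst();
                String included = files.get(name);
                if (included == null) {
                    throw new IllegalArgumentException("unknown include: " + name);
                }
                // Lex the included file, then splice its tokens onto the front
                // of the preprocessor's own input (pushed in reverse order so
                // the first included token is processed next).
                List<String> includedTokens = lex(included);
                for (int i = includedTokens.size() - 1; i >= 0; i--) {
                    input.addFirst(includedTokens.get(i));
                }
            } else {
                output.add(token);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, String> files = Map.of(
                "a.h", "#include b.h int a ;",
                "b.h", "int b ;");
        System.out.println(preprocess("#include a.h int main ;", files));
    }
}
```

Because the included tokens go back onto the input rather than straight to the output, nested includes fall out of the same loop with no special casing.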

I don't know which approach makes more sense with ANTLR. With LPG I was
able to separate the lexer and parser and stick the preprocessor
in-between.

I believe that doing lexing before preprocessing makes the preprocessing
phase much easier to write and maintain. For example, the C99 preprocessor
doesn't need to deal with comments, which, judging from the bug reports,
have caused many issues in the DOM scanner. The code is also cleaner
because it processes a token stream instead of a raw character stream
(compare Macro.invoke() to BaseScanner.expandFunctionStyleMacro()).

Also, if you return raw characters from the preprocessor, how will you
calculate the offsets on the AST nodes? The offsets are normally
contained in the tokens.
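A small sketch of what I mean by comments and offsets, under my own assumptions: the lexer drops comments before anything downstream sees them, and each token remembers where it started in the original source. OffsetTokenSketch and Token are invented names, not CDT classes, and the lexer is deliberately naive (no string literals, no multi-character operators).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names are invented for illustration only.
final class OffsetTokenSketch {

    static final class Token {
        final String text;
        final int offset; // index of the token's first character in the source

        Token(String text, int offset) {
            this.text = text;
            this.offset = offset;
        }

        int endOffset() {
            return offset + text.length();
        }
    }

    private static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    // Lexes words and single-character punctuation, skipping whitespace and
    // both comment styles. Offsets always refer to the original source text.
    static List<Token> lex(String src) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (src.startsWith("/*", i)) {
                int end = src.indexOf("*/", i + 2);
                i = (end < 0) ? src.length() : end + 2; // unterminated: skip to end
            } else if (src.startsWith("//", i)) {
                int end = src.indexOf('\n', i);
                i = (end < 0) ? src.length() : end;
            } else if (isWordChar(c)) {
                int start = i;
                while (i < src.length() && isWordChar(src.charAt(i))) {
                    i++;
                }
                tokens.add(new Token(src.substring(start, i), start));
            } else {
                tokens.add(new Token(String.valueOf(c), i));
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : lex("int /* c */ x; // tail")) {
            System.out.println(t.text + " @ " + t.offset);
        }
    }
}
```

A preprocessor that consumes these tokens never sees a comment, and whatever it passes on to the parser still carries source offsets that AST nodes can reuse.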

> But if you already have everything we've done
> there, then that might be the better approach.

Well, I hope so :) It's pretty new and I'm still working out the bugs. It
does have a few features the DOM scanner doesn't, like support for
trigraphs.

I hope you do decide to give it a try. I'll decouple it soon.


Mike Kucera
Software Developer
IBM CDT Team, Toronto
mkucera@xxxxxxxxxx



                                                                           
Doug Schaefer <DSchaefer@xxxxxxm>
Sent by: cdt-dev-bounces@eclipse.org
06/19/2007 03:53 PM
Please respond to: "CDT General developers list." <cdt-dev@eclipse.org>

To: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
cc:
Subject: RE: [cdt-dev] decoupled preprocessor
                                                                           




Yes, it is definitely something I'll need. I'll need to take a look at what
you've done. ANTLR uses its own character stream interface to feed
characters to the lexer. It provides implementations that can pull them out
of Readers and InputStreams. I will likely want to create a new one that
doesn't try to load everything into a char[] at startup like the built-in
ones do. We can then hook that up to the preprocessor.
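As a rough illustration of that idea, here is a sketch of a character source that reads from a Reader on demand and keeps only a sliding window of unconsumed characters. LazyCharSource is a hypothetical class and does not implement ANTLR's actual character stream interface, which has more methods than this; only the lookahead/consume shape is shown, and the window size of 16 is arbitrary.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch: not ANTLR's interface, just the general shape.
final class LazyCharSource {
    static final int EOF = -1;

    private final Reader reader;
    private char[] window = new char[16];
    private int start = 0;   // index in window of the next char to consume
    private int count = 0;   // number of buffered, unconsumed chars
    private boolean exhausted = false;

    LazyCharSource(Reader reader) {
        this.reader = reader;
    }

    // Returns the k-th upcoming character (k >= 1) without consuming it,
    // or EOF if the input ends before that position.
    int lookahead(int k) {
        fill(k);
        return k <= count ? window[start + k - 1] : EOF;
    }

    // Discards the next character, if there is one.
    void consume() {
        fill(1);
        if (count > 0) {
            start++;
            count--;
        }
    }

    // Reads more input until `needed` chars are buffered or input runs out.
    private void fill(int needed) {
        while (count < needed && !exhausted) {
            if (start + count == window.length) {
                makeRoom(needed);
            }
            try {
                int free = window.length - start - count;
                int n = reader.read(window, start + count, free);
                if (n < 0) {
                    exhausted = true;
                } else {
                    count += n;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Slides unconsumed chars to the front, growing only if lookahead needs it.
    private void makeRoom(int needed) {
        char[] target = needed > window.length
                ? new char[Math.max(needed, window.length * 2)]
                : window;
        System.arraycopy(window, start, target, 0, count);
        window = target;
        start = 0;
    }

    public static void main(String[] args) {
        LazyCharSource in = new LazyCharSource(new StringReader("#define X 1"));
        while (in.lookahead(1) != EOF) {
            System.out.print((char) in.lookahead(1));
            in.consume();
        }
        System.out.println();
    }
}
```

Memory use is bounded by the deepest lookahead requested rather than by the file size, which is the property the built-in load-everything streams lack.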

I'm not sure how you built yours, but the easiest path I can see is to take
our current scanner, replace nextToken with getChar, and strip out
anything that creates a token. But if you already have everything we've
done there, then that might be the better approach.

Anyway, another shiny object flew by called CDT user docs, so I'll get back
to ANTLR in a few days :).

Cheers,
Doug Schaefer, QNX Software Systems
Eclipse CDT Project Lead, http://cdtdoug.blogspot.com


> -----Original Message-----
> From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On
> Behalf Of Mike Kucera
> Sent: Tuesday, June 19, 2007 3:43 PM
> To: CDT General developers list.
> Subject: [cdt-dev] decoupled preprocessor
>
>
> Hi Doug,
>
> I take it from your latest blog post that you are going to be in need of a
> preprocessor for your ANTLR C++ experiment. I was planning on decoupling the
> preprocessor that I wrote for the C99 parser so that it can be used with
> any parser. If you are interested in picking this up, when would you need it?
>
> Mike Kucera
> Software Developer
> IBM CDT Team, Toronto
> mkucera@xxxxxxxxxx
>
> _______________________________________________
> cdt-dev mailing list
> cdt-dev@xxxxxxxxxxx
> https://dev.eclipse.org/mailman/listinfo/cdt-dev
