Re: [cdt-dev] Suggestions for dealing with tests

Hello again,
I have the basic semantics in place; auto types are deduced as long as the operators return simple types. For the rest of the semantics, I'm going to need some help. I'd also like to move much of the code I wrote in CPPASTLiteralExpression somewhere else, but I don't know where would be appropriate. Any suggestions?

Thanks, Richard.

On 29 June 2012 13:37, Richard <naddiseo+cdt-dev@xxxxxxxxx> wrote:
Hi,
I've finally had some more time to work on this. Markus, I've followed your suggestion to process the tokens separately, and this has fixed all the tests that were failing before, but I'm not sure what ramifications it may have for speed, or for the semantics needed for UDLs. My current implementation (https://github.com/Naddiseo/cdt/compare/3b5ee13aeb2...udl) is more or less complete in terms of syntax. I did have to add two more error messages, which I found GCC has, for invalid prefixes or suffixes.

Anyway, I need help with the only test that's still failing, ASTWriterTest. I don't know where to begin with it, since it reads a file in and I can't reproduce the problem in the editor itself. I'm getting the following comparison failure:
Expected:
int i = ZWO + 2;
int i = 2 + ZWO;

Actual:
int i = 2 + 2;
int i = 2 + 2;

Any help or insight anyone can provide would be appreciated.

Richard.

On 13 June 2012 23:05, Schorn, Markus <Markus.Schorn@xxxxxxxxxxxxx> wrote:

Hi,

I disagree; the preprocessor should just do what the specification asks it to do:

                Create preprocessor number tokens.

That is because multiple languages use the same specification for their preprocessing (at least C, C++ and Objective-C); however, each language is free to interpret the pp-number tokens as it likes. I’d like to leave the path open to use it for whatever language you need it for.

 

The preprocessor itself needs to interpret pp-number tokens within #if preprocessing directives and can use its own algorithm for that. What it has to do is not 100% specified, however it’s obvious that it shall make an attempt to interpret as many of the integral numbers as possible and generate an error on the others.

However, there is no need to do the interpretation twice: Either it needs to be interpreted by the preprocessor (in #if directives) or by the parser, elsewhere.

 

Markus.

 

From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Richard
Sent: Wednesday, June 13, 2012 7:04 PM


To: CDT General developers list.
Subject: Re: [cdt-dev] Suggestions for dealing with tests

 

The problem with doing that is that you then have to parse the number twice (or more) to determine what kind of number it is. In CDT, the lexer first returns the number, possibly with a suffix. Then the preprocessor has to check the suffix to see if it's a UDL (you can't have UDLs in preprocessor directives), and then in the parser the number is broken down further to check that it has the correct format, depending on whether it's an int/hex/binary/etc. My proposal is to ignore the suffixes in the lexer and return a number token and an identifier. The check for number validity can be done in the lexer, and the checks on the suffix can be done in the preprocessor and/or the parser.

 

An update on the progress I've made since this discussion began: 

- I've moved the lexing of numbers into the lexer

- The lexer doesn't return the suffix as part of the number

- Added parser and scanner options of UDLs

- Added a suffix to CPPASTLiteralExpression

 

I'm still working through some bugs, but it's mostly there. The only questions I have now are how to pass more information from the lexer to the parser, because it would be helpful if the lexer could tell the parser whether the number is float/hex/int/hexfloat/etc., and how to use IASTImplicitNameOwner; I'm having trouble finding an example of how to use it.

 

On 13 June 2012 00:28, Corbat Thomas (tcorbat@xxxxxx) <tcorbat@xxxxxx> wrote:

Hm... Actually, I got it differently, regarding handling of numbers.

The grammar for pp-number (preprocessor numbers) is as follows [lex.ppnumber]:

 

pp-number:

                digit

                . digit

                pp-number digit

                pp-number identifier-nondigit

                pp-number e sign

                pp-number E sign

                pp-number .

 

Therefore, everything starting with a digit or a dot and a digit is a number. Thus .55.4h5ze+E-5gg would be a pp-number, even though it will not be convertible into a meaningful floating number.
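
To make the grammar concrete, here is a minimal Java sketch of a recognizer for it. It is not CDT code: the class PpNumberSketch and its methods are made up for illustration, identifier-nondigit is simplified to ASCII letters and underscore, and universal-character-names are left out.

```java
/** Illustrative recognizer for the pp-number grammar; not part of CDT. */
final class PpNumberSketch {

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** identifier-nondigit, simplified here to ASCII letters and underscore. */
    private static boolean isIdentifierNondigit(char c) {
        return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    /** Length of the longest pp-number at the start of s, or 0 if there is none. */
    static int matchLength(String s) {
        int i;
        if (s.length() >= 1 && isDigit(s.charAt(0))) {
            i = 1; // pp-number: digit
        } else if (s.length() >= 2 && s.charAt(0) == '.' && isDigit(s.charAt(1))) {
            i = 2; // pp-number: . digit
        } else {
            return 0;
        }
        while (i < s.length()) {
            char c = s.charAt(i);
            boolean exponentWithSign = (c == 'e' || c == 'E')
                    && i + 1 < s.length()
                    && (s.charAt(i + 1) == '+' || s.charAt(i + 1) == '-');
            if (exponentWithSign) {
                i += 2; // pp-number e sign, pp-number E sign
            } else if (isDigit(c) || isIdentifierNondigit(c) || c == '.') {
                i += 1; // pp-number digit, identifier-nondigit, or .
            } else {
                break;
            }
        }
        return i;
    }

    /** True if the whole string is a single pp-number. */
    static boolean isPpNumber(String s) {
        return !s.isEmpty() && matchLength(s) == s.length();
    }
}
```

With this sketch, isPpNumber(".55.4h5ze+E-5gg") returns true, even though no later phase could turn that text into a meaningful number.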

I guess what Markus suggested was to rip the part that distinguishes the kinds of numbers out of the preprocessor, leaving it to yield pp-numbers, and to move that part to the parser.

 

 

From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Richard
Sent: Monday, June 11, 2012 5:39 PM


To: CDT General developers list.
Subject: Re: [cdt-dev] Suggestions for dealing with tests

 

Regarding the tests: At the moment I have 17 tests that error in a suite that tests included files (I forget exactly which one it is), and a few that fail because there aren't any tests in the suite (base classes, I think). I'm assuming the latter is because I'm running the suite incorrectly: I'm right-clicking on the package and choosing Run As->JUnit Plug-in Test.

 

For the UDL implementation, I understand what you're saying about not introducing new elements. That is what my first attempt tried to do, and I had problems with it, but now that I look at it again, I think I understand how you're suggesting to implement it.

 

Yes, dealing with the suffix is the parser's task. By the standard it’s even the task of the parser to distinguish between the various kinds of number literals. I don’t exactly know why this is done in the preprocessor. It’d actually be nice to move this task away from it, which would collapse the various kinds of number tokens to one preprocessor number token as described in 2.10.

So, if I understand correctly, the lexer should return two tokens for numbers with suffixes: the number and an identifier. It would then be the job of the parser to combine these into one AST node. I agree, that does seem like the best approach. It would mean that the more-or-less identical number-parsing code that currently lives in several places (I've counted at least two, Lexer.java and CPreprocessor.java) can live in just the one place it belongs, the lexer. The AST token can then be used in CPreprocessor to determine whether the token is a UDL by adding an extra method, isUserDefined, and maybe other methods such as getNumberType, which would return INT, HEX, BINARY, etc. Another method I can see being useful is isMalformed(), for use in CPreprocessor.
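
A rough, standalone Java sketch of that split, to show the idea rather than the actual CDT change. Everything here (NumberSplitSketch, NumberParts, NumberType, split) is a hypothetical name, not an existing CDT class or method. Assumptions: octal is not separated from INT, hex floats are not handled, and an 'e' that is not followed by digits is treated as the start of the suffix. Deciding whether a suffix is a standard one or user-defined is left to the caller.

```java
/** Illustrative only: splits a numeric literal into a number part and an identifier suffix. */
final class NumberSplitSketch {

    enum NumberType { INT, HEX, BINARY, FLOAT }

    static final class NumberParts {
        final NumberType type; // null when the literal is malformed
        final String number;
        final String suffix;

        NumberParts(NumberType type, String number, String suffix) {
            this.type = type;
            this.number = number;
            this.suffix = suffix;
        }

        boolean isMalformed() { return type == null; }
        boolean hasSuffix() { return !suffix.isEmpty(); }
    }

    private static boolean isDigit(char c) { return c >= '0' && c <= '9'; }

    private static boolean isHexDigit(char c) {
        return isDigit(c) || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    private static boolean isLetterOrUnderscore(char c) {
        return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    /** An empty suffix is fine; otherwise it must look like an identifier. */
    private static boolean isIdentifierOrEmpty(String s) {
        for (int k = 0; k < s.length(); k++) {
            char c = s.charAt(k);
            if (!(isLetterOrUnderscore(c) || (k > 0 && isDigit(c)))) return false;
        }
        return true;
    }

    private static boolean hasPrefix(String s, char lower) {
        return s.length() >= 2 && s.charAt(0) == '0'
                && Character.toLowerCase(s.charAt(1)) == lower;
    }

    static NumberParts split(String text) {
        int n = text.length();
        int i = 0;
        NumberType type;
        boolean hasDigits;
        if (hasPrefix(text, 'x')) {
            type = NumberType.HEX;
            i = 2;
            while (i < n && isHexDigit(text.charAt(i))) i++;
            hasDigits = i > 2;
        } else if (hasPrefix(text, 'b')) {
            type = NumberType.BINARY;
            i = 2;
            while (i < n && (text.charAt(i) == '0' || text.charAt(i) == '1')) i++;
            hasDigits = i > 2;
        } else {
            type = NumberType.INT;
            int digits = 0;
            while (i < n && isDigit(text.charAt(i))) { i++; digits++; }
            if (i < n && text.charAt(i) == '.') {
                type = NumberType.FLOAT;
                i++;
                while (i < n && isDigit(text.charAt(i))) { i++; digits++; }
            }
            hasDigits = digits > 0;
            // Accept an exponent only if at least one digit follows the optional sign.
            int j = i;
            if (hasDigits && j < n && (text.charAt(j) == 'e' || text.charAt(j) == 'E')) {
                j++;
                if (j < n && (text.charAt(j) == '+' || text.charAt(j) == '-')) j++;
                if (j < n && isDigit(text.charAt(j))) {
                    while (j < n && isDigit(text.charAt(j))) j++;
                    type = NumberType.FLOAT;
                    i = j;
                }
            }
        }
        String suffix = text.substring(i);
        if (!hasDigits || !isIdentifierOrEmpty(suffix)) {
            return new NumberParts(null, text, "");
        }
        return new NumberParts(type, text.substring(0, i), suffix);
    }
}
```

For example, split("12_km") yields an INT with number "12" and suffix "_km", while split("0b012") is reported as malformed because "2" cannot start an identifier.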

 

I think the code for all this is already written between Corbat and myself, and just needs placing in the right places and testing.

 

 

On 11 June 2012 07:08, Corbat Thomas (tcorbat@xxxxxx) <tcorbat@xxxxxx> wrote:

I had a closer look at the standard and I agree with you:

pp-number contains any number including user defined literals. But for preprocessing the characters and strings are distinct from user defined characters and strings, considering preprocessing tokens (from translation phase 3).

Looking at translation phase 7 (conversion of preprocessing tokens to tokens) user defined literals are distinct from the other kinds of literals [lex.literal.kinds], with the subcategories int, float, char and string.

From the description the conversion of the tokens seems to me like a pre-step of phase 7. But I guess it could be done on the fly, while parsing as well.

 

 

From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Schorn, Markus
Sent: Monday, June 11, 2012 11:31 AM


To: CDT General developers list.
Subject: Re: [cdt-dev] Suggestions for dealing with tests

 

Hi,

A user-defined literal is still of kind integer, char, floating-point or string. So it’d be natural to add the property isUserDefined() to IASTLiteralExpression.

 

Yes, dealing with the suffix is the parser's task. By the standard it’s even the task of the parser to distinguish between the various kinds of number literals. I don’t exactly know why this is done in the preprocessor. It’d actually be nice to move this task away from it, which would collapse the various kinds of number tokens to one preprocessor number token as described in 2.10.

However, the LR-Parsers probably rely on receiving the tokens as they are delivered today. So we could have an option whether the preprocessor shall classify the number tokens or not. For the GNUCPPSourceParser we could then move the classification into the parser.

 

Exactly, CPPASTLiteralExpression.getExpressionType() needs to be changed. Plus, as mentioned before, by using IASTImplicitNameOwner the literal expression can provide the binding to the operator that is called to create the object denoted by the literal.

 

Markus.

 

From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Corbat Thomas (tcorbat@xxxxxx)
Sent: Monday, June 11, 2012 9:42 AM
To: CDT General developers list.
Subject: Re: [cdt-dev] Suggestions for dealing with tests

 

Hi

 

Regarding the UDL implementation:

So your suggestion, Markus, is to lex user defined literals as tSTRING, tCHAR, tINTEGER and tFLOATINGPT?

Shall IASTExpression get a new kind for user defined literals?

Recognizing the suffix will be the task of the parser then, right?

I guess getExpressionType() in CPPASTLiteralExpression must be extended to return the type of the resolved literal operator.

 

That probably makes the implementation a bit easier.

 

Regards

Thomas

 

 

 

From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Schorn, Markus
Sent: Monday, June 11, 2012 7:51 AM
To: CDT General developers list.
Subject: Re: [cdt-dev] Suggestions for dealing with tests

 

Hi,

The preprocessor is not run in a specific language. In case there needs to be different behavior for C and C++, you need to introduce an option, which should be controlled via IScannerExtensionConfiguration (the GPPLanguage and the GCCLanguage provide different objects for this configuration). The test case needs to be elaborated to test with the two different configurations.

For the case where we allow user-defined literals, the checkNumber method needs to behave differently. I also think that the classification of the number literals in the lexer needs to be changed; however, it can be done such that it works independently of whether user-defined literals are allowed or not.

My recommendation is to neither introduce new kinds of tokens nor to create a new IASTNode. I think it is sufficient to let CPPASTLiteralExpression implement IASTImplicitNameOwner, which allows it to provide the references to the implicit function calls.

The tests nested in org.eclipse.cdt.core.suite.AutomatedIntegrationSuite should all pass.

Markus.

 

From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Richard
Sent: Saturday, June 09, 2012 10:35 PM
To: CDT General developers list.
Subject: [cdt-dev] Suggestions for dealing with tests

 

Hello all,

I've been working on getting user-defined literals working syntactically for the past week. I believe I'm in the home stretch, as I'm now working through the failing test cases under org.eclipse.cdt.core.tests.

I have a few questions regarding this. 

 

- Firstly, how do I deal with tests that are supposed to fail in C mode, but pass in C++ mode?

The current test that demonstrates this is PreprocessorTests.testGCC43BinaryNumbers.

There are 5 binary literals that are tested for failing: 0b012, 0b01b, 0b1111e01, 0b1111p10, 0b10010.10010

With UDLs the first and the last of these should fail, and the middle three can be considered binary literals with UDL suffixes. But in C mode, or C++ sans UDL, they all should fail. Is there a way to test which language the current test is being run in?
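
To spell out the two expected outcomes, here is a small self-contained Java sketch that checks the five literals in both modes. It is only an illustration and does not use any CDT API: BinaryLiteralModes and isValid are made-up names, and, as in the description above, any identifier-like suffix is accepted as a UDL suffix when UDLs are allowed.

```java
import java.util.regex.Pattern;

/** Illustrative check of binary literals with and without UDL suffixes; not CDT code. */
final class BinaryLiteralModes {

    // Standard integer suffixes: u, l, ll and their combinations.
    private static final String STD_SUFFIX =
            "(?:[uU](?:ll|LL|[lL])?|(?:ll|LL|[lL])[uU]?)?";

    // Assumption: with UDLs enabled, any identifier is accepted as a suffix.
    private static final String UDL_SUFFIX = "(?:[A-Za-z_][A-Za-z0-9_]*)?";

    private static final Pattern WITHOUT_UDL = Pattern.compile("0[bB][01]+" + STD_SUFFIX);
    private static final Pattern WITH_UDL = Pattern.compile("0[bB][01]+" + UDL_SUFFIX);

    /** True if the whole literal is a valid binary literal in the given mode. */
    static boolean isValid(String literal, boolean allowUdl) {
        return (allowUdl ? WITH_UDL : WITHOUT_UDL).matcher(literal).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"0b012", "0b01b", "0b1111e01", "0b1111p10", "0b10010.10010"};
        for (String s : samples) {
            System.out.println(s + " withUdl=" + isValid(s, true)
                    + " withoutUdl=" + isValid(s, false));
        }
    }
}
```

A test that needs both behaviours could run the same inputs once per mode, which is the kind of check I'd like to be able to write against the real scanner.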

 

- Lastly, I've been testing my branch against master, and I've noticed there are a fair number of tests with errors or failing. Is this expected, or do I have my project set up incorrectly?

 

Thanks, Richard


_______________________________________________
cdt-dev mailing list
cdt-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/cdt-dev

 

