Language supporting Integers, Floats, Hex Numbers and Range Expressions [message #1021370] Wed, 20 March 2013 00:04
David Pace
Hello all,

I am developing a grammar for an existing language and have ambiguity problems with integers, floats, hex numbers and range expressions.

Here is a minimal grammar that reproduces my problem:

grammar de.davehofmann.test.TestDSL hidden (WS)

import "[url removed because of spam filter]/emf/2002/Ecore" as ecore

generate testDSL "[url removed because of spam filter]/test/TestDSL"

Model:
	rangeExpression += RangeExpression*;

terminal WS			: (' '|'\t'|'\r'|'\n')+;

terminal fragment DIGIT: ('0'..'9');
terminal INT returns ecore::EInt: DIGIT+;

terminal fragment HEX_DIGIT: (DIGIT|'a'..'f'|'A'..'F');
terminal HEX returns ecore::EInt: '0x' HEX_DIGIT*;

terminal IDENTIFIER : ('a'..'z') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;

FLOAT returns ecore::EFloat: INT DOT INT ('e' (MINUS|PLUS)? INT)? | (INT 'e' (MINUS|PLUS)? INT);

terminal PLUS: '+';
terminal MINUS: '-';
terminal DOT: '.';

DOTDOT: DOT DOT;

RangeExpression:
	'(' lowerBound=Expression DOTDOT upperBound=Expression ')';

Expression:
	IntegerLiteral | FloatLiteral | Variable;

Variable:
	name=IDENTIFIER;

IntegerLiteral:
	value=(INT | HEX);

FloatLiteral:
	value=FLOAT;


The following expressions should be valid:

(1..3)
(2.2..5.1)
(2e-10..3e-8)
(0x1f..0x2d)
(e..x)


Problems:

1. Hex numbers are not recognized at all (line 4 of the examples). What's wrong here?
2. 'e' must be a valid variable name, but it clashes with the exponent 'e' in the FLOAT rule (line 5).

If FLOAT becomes a terminal rule instead, then

3. The expression (1..3) (line 1) is no longer valid, because the lexer sees "1." and tries to read a float.

How can I solve this?
Thanks in advance for any hints!
Dave
  • Attachment: test.tdsl
    (Size: 0.05KB, Downloaded 171 times)
  • Attachment: TestDSL.xtext
    (Size: 0.89KB, Downloaded 156 times)
Re: Language supporting Integers, Floats, Hex Numbers and Range Expressions [message #1021425 is a reply to message #1021370] Wed, 20 March 2013 03:36
Christian Dietrich
Hi,

you may want to search the forum for
- conflicting lexer terminal rules
- datatype rules to solve that problem
- solving such things with semantic validation rather than entirely in the grammar


Twitter : @chrdietrich
Blog : https://www.dietrich-it.de
Re: Language supporting Integers, Floats, Hex Numbers and Range Expressions [message #1021676 is a reply to message #1021370] Wed, 20 March 2013 13:54
Henrik Lindberg
I posted on this topic a couple of times in this forum. Let me know if
you need help finding such posts.

- henrik

Re: Language supporting Integers, Floats, Hex Numbers and Range Expressions [message #1022990 is a reply to message #1021676] Sat, 23 March 2013 00:32
David Pace
Hi,

thank you for your hints. I searched the forum for a solution and finally found out that the problem with the hex numbers was neither in the lexer nor in the parser. The problem was a missing value converter for hex numbers.
The error message was simply: "for input String". Suggestion: There should be a more meaningful error message like "Could not convert value 0xff to Integer".

However, I was not able to find a solution for the second problem: 'e' is not a valid variable name because it is used in the FLOAT rule for the exponent notation. Can you put me on the right track to make 'e' a valid IDENTIFIER? Thanks in advance!
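
To illustrate the hex problem described in this post: Java's decimal integer parsing rejects the 0x prefix with a NumberFormatException whose message starts with "For input string", which would be consistent with the error reported here. The following is a minimal plain-Java sketch of the conversion logic a hex value converter needs. It does not use the Xtext API; the class name HexLiteralSketch and the method toInt are hypothetical. In an Xtext project this logic would presumably live in a value converter registered for the HEX rule.

```java
final class HexLiteralSketch {

    /**
     * Converts the text matched by the INT or HEX terminal into an int.
     * Hypothetical helper, not part of Xtext.
     */
    static int toInt(String text) {
        if (text == null || text.isEmpty()) {
            throw new IllegalArgumentException("Could not convert empty value to Integer");
        }
        try {
            if (text.startsWith("0x")) {
                // Strip the prefix and parse the rest in base 16.
                // An empty remainder (plain "0x") makes parseInt throw.
                return Integer.parseInt(text.substring(2), 16);
            }
            // Plain decimal literal.
            return Integer.parseInt(text, 10);
        } catch (NumberFormatException e) {
            // Report the offending literal, as suggested in the post.
            throw new IllegalArgumentException(
                "Could not convert value " + text + " to Integer", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toInt("0x1f")); // prints 31
        System.out.println(toInt("0x2d")); // prints 45
        System.out.println(toInt("42"));   // prints 42
    }
}
```

Note that the HEX rule in the original grammar ('0x' HEX_DIGIT*) also matches a bare 0x, so a converter along these lines has to reject that case itself, as the sketch does.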
Re: Language supporting Integers, Floats, Hex Numbers and Range Expressions [message #1023218 is a reply to message #1022990] Sat, 23 March 2013 16:47
Henrik Lindberg
I have this in one grammar:

terminal HEX : '0' ('x'|'X')(('0'..'9')|('a'..'f')|('A'..'F'))+ ;
terminal INT : ('0'..'9')+;
REAL hidden(): INT '.' (EXT_INT | INT); // INT ? '.' (EXT_INT | INT);
terminal EXT_INT: INT ('e'|'E')('-'|'+') INT;

Regards
- henrik
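
The grammar in this post keeps floats out of the lexer: only INT, HEX, EXT_INT and '.' are tokens, and REAL is assembled by the parser. The plain-Java sketch below imitates that idea with a hand-written tokenizer to show that a range such as (1..3) then lexes without any float ambiguity. It is only an illustration under simplifying assumptions: the class and method names are hypothetical, EXT_INT is omitted, and the sketch does not check that at least one digit follows 0x.

```java
import java.util.ArrayList;
import java.util.List;

final class RangeLexerSketch {

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    private static boolean isHexDigit(char c) {
        return isDigit(c) || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    private static boolean isIdPart(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_' || isDigit(c);
    }

    /** Splits the input into HEX, INT, ID and single-character tokens. Floats are deliberately not lexed. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace is hidden
                continue;
            }
            int start = i;
            boolean hexPrefix = c == '0' && i + 1 < input.length()
                    && (input.charAt(i + 1) == 'x' || input.charAt(i + 1) == 'X');
            if (hexPrefix) {
                i += 2;
                while (i < input.length() && isHexDigit(input.charAt(i))) i++;
                tokens.add("HEX:" + input.substring(start, i));
            } else if (isDigit(c)) {
                // INT stops at the first non-digit, so "1." never starts a float here.
                while (i < input.length() && isDigit(input.charAt(i))) i++;
                tokens.add("INT:" + input.substring(start, i));
            } else if (c >= 'a' && c <= 'z') {
                i++;
                while (i < input.length() && isIdPart(input.charAt(i))) i++;
                tokens.add("ID:" + input.substring(start, i));
            } else {
                i++;
                tokens.add(String.valueOf(c)); // '(', ')', '.', '+', '-' and so on
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("(1..3)")); // prints [(, INT:1, ., ., INT:3, )]
        System.out.println(tokenize("(2.2..5.1)"));
        System.out.println(tokenize("(0x1f..0x2d)"));
    }
}
```

With this token stream, deciding whether INT '.' INT is a float or whether '.' '.' is a range operator is left to the parser, which can look at the following tokens.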
Re: Language supporting Integers, Floats, Hex Numbers and Range Expressions [message #1023660 is a reply to message #1023218] Sun, 24 March 2013 21:52
David Pace
I found the solution. Basically I had to introduce a new datatype rule for identifiers which has 'e' as a valid alternative:

IDENTIFIER: ID | 'e'; // all "keywords" used in Parser / Data Type Rules must be listed here


Also see this post for a step by step solution.

Thanks for your help!
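
The reason the datatype rule helps can be pictured as follows: because 'e' appears as a keyword in a parser or datatype rule, the exact input "e" reaches the parser as a keyword token and never as an ID token, so a rule that only accepts the terminal rejects it. The plain-Java sketch below models that behaviour under this assumption. It is not Xtext code, and all names in it are hypothetical.

```java
final class IdentifierRuleSketch {

    enum Kind { KEYWORD_E, ID, OTHER }

    /** Models a lexer in which the keyword 'e' wins over the ID terminal for the exact text "e". */
    static Kind lex(String word) {
        if (word.equals("e")) {
            return Kind.KEYWORD_E;
        }
        // Same shape as the identifier terminal in the original grammar.
        if (word.matches("[a-z][a-zA-Z_0-9]*")) {
            return Kind.ID;
        }
        return Kind.OTHER;
    }

    /** Models "Variable: name=ID;" where only the terminal is accepted. */
    static boolean acceptedByTerminalOnly(String word) {
        return lex(word) == Kind.ID;
    }

    /** Models "IDENTIFIER: ID | 'e';" where the keyword is listed as an alternative. */
    static boolean acceptedByDatatypeRule(String word) {
        Kind kind = lex(word);
        return kind == Kind.ID || kind == Kind.KEYWORD_E;
    }

    public static void main(String[] args) {
        for (String word : new String[] {"e", "x", "ex"}) {
            System.out.println(word
                + " terminal only: " + acceptedByTerminalOnly(word)
                + ", datatype rule: " + acceptedByDatatypeRule(word));
        }
    }
}
```

Longer names that merely start with e, such as ex, are unaffected in this model because they never match the keyword exactly.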