Xtext grammar for real literals and filenames with a . dot extension [message #996670] Wed, 02 January 2013 16:42
Elvis Dowson
Messages: 38
Registered: December 2011
Member
Hi,
I have the following grammar defined, but real literals are no longer detected since I added support for filenames with a . dot extension.

The real literal expressions worked before I added the filename terminal.

Failed tests
real 1312.123e+123
real 1312.123
real 1312_1312.123_123
real 1312_1312.123123e+123
real 1234.1232e-234234
real 1234.123123E+1232

include <header>


Passed tests
include "filename"
include "filename.txt"
include <filename.txt>
include <filename.txt.bak>


Terminals.xtext
grammar com.example.mydsl.xtext.common.Terminals hidden(WS, ML_COMMENT, SL_COMMENT)

import "http://www.eclipse.org/emf/2002/Ecore" as ecore

// Identifiers
terminal identifier           : '^'?('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;

// Comments and whitespace
terminal ML_COMMENT  : '/*' -> '*/';
terminal SL_COMMENT  : '//' !('\n'|'\r')* ('\r'? '\n')?;

terminal WS       : (' '|'\t'|'\r'|'\n')+;

// Integer literals
terminal sign                 :  ('+' | '-');
terminal binDigits            :  ('0' | '1')+;
terminal binDigitsUnderscore  :  (binDigits | '_')*;
//terminal octDigits            :  ('0' | '1'..'7' '0'..'7'*);
//terminal octDigitsUnderscore  :  (octDigits | '_')*;
terminal decDigits            :  ('0'..'9')+;
terminal decDigitsUnderscore  :  (decDigits | '_')*;
terminal hexDigitsUnderscore  :  ('0'..'9' | 'a'..'f' |'A'..'F' | '_')+;

bitWidth                      :  decDigits;

decNum                        :  (decDigits | decDigitsUnderscore)?;

baseLiteral                   :  ('b' | 'B') binDigitsUnderscore
//                              |  ('o' | 'O') octDigitsUnderscore
                              |  ('d' | 'D') decDigitsUnderscore
                              |  ('h' | 'H') hexDigitsUnderscore;

unsizedIntLiteral             :  ( sign? decNum ) | ( sign? baseLiteral );

sizedIntLiteral               :  bitWidth baseLiteral;

intLiteral                    :  '0' | '1'
                              |  sizedIntLiteral
                              |  unsizedIntLiteral;

// Real literals
realLiteral hidden()          :  decNum '.' (expDecDigits|decNum);
terminal expDecDigits         :  (decDigits ('e' | 'E') sign? decDigits);


// String literals
terminal stringLiteral        :
         '"' ( '\\' ('b'|'t'|'n'|'f'|'r'|'u'|'"'|"'"|'\\') | !('\\'|'"') )* '"' |
         "'" ( '\\' ('b'|'t'|'n'|'f'|'r'|'u'|'"'|"'"|'\\') | !('\\'|"'") )* "'"
      ;

terminal filenameLiteral      :
         // Identify filenames in the following format:
         // filename.txt, filename (i.e., without the . filename extension)
         ( ('a'..'z'|'A'..'Z'|'_'|'0'..'9')* ('.' ('a'..'z'|'A'..'Z'|'_'|'0'..'9')* )* )
;
// Bug: terminal filenameLiteral: include <filename> does not work without a file extension.


mydsl.xtext
MyDSLDomainModel:
   elements += AbstractElement*;

AbstractElement:
   IncludeDirective | DefineDirective | IntegerToken | RealToken;

// Compiler directives

// File inclusion: include and line
IncludeDirective:
      'include'     filename=stringLiteral
   |  'include' '<' filename=filenameLiteral '>'
;

// Macro definition and substitutions: define and related directives
DefineDirective:
      'define' macroName=identifier //(macroText=(identifier | intLiteral | realLiteral))?
;

IntegerToken:
   'integer' name=(unsizedIntLiteral | binDigits | binDigitsUnderscore | hexDigitsUnderscore);

RealToken:
   'real' name=realLiteral;


How can I fix the grammar so that I am able to detect real numbers, plus filenames with a . dot extension?

Elvis Dowson

[Updated on: Wed, 02 January 2013 16:43]


Re: Xtext grammar for real literals and filenames with a . dot extension [message #996690 is a reply to message #996670] Wed, 02 January 2013 17:17
Henrik Lindberg
Messages: 2500
Registered: July 2009
Senior Member
This has been asked and answered earlier in this forum. See if you can
find it by searching for "REAL" - I posted an example of how I solved it.

If you can't find it, ping again, and I will dig it up.
Regards
- henrik

Re: Xtext grammar for real literals and filenames with a . dot extension [message #996712 is a reply to message #996690] Wed, 02 January 2013 18:31
Elvis Dowson
Messages: 38
Registered: December 2011
Member
Hi Henrik,
Is this the post that you are referring to:
http://www.eclipse.org/forums/index.php/mv/msg/369620/903198/

terminal INT : ('0'..'9')+;
REAL hidden(): INT '.' (EXT_INT | INT);
terminal EXT_INT: INT ('e'|'E')('-'|'+') INT;



I did get real literals to work. The problem is that they stopped working *after* I added support for filenames with . dot extensions.


My code snippet for real literals is:
terminal sign                 :  ('+' | '-');
terminal decDigits            :  ('0'..'9')+;
terminal decDigitsUnderscore  :  (decDigits | '_')*;
decNum                        :  (decDigits | decDigitsUnderscore)?;
// Real literals
realLiteral hidden()          :  decNum '.' (expDecDigits|decNum);
terminal expDecDigits         :  (decDigits ('e' | 'E') sign? decDigits);


My code snippet for filenames with extensions is:
terminal filenameLiteral      :
         // Identify filenames in the following format:
         // filename.txt, filename (i.e., without the . filename extension)
         ( ('a'..'z'|'A'..'Z'|'_'|'0'..'9')* ('.' ('a'..'z'|'A'..'Z'|'_'|'0'..'9')* )* )
;


Best regards,

Elvis Dowson

[Updated on: Wed, 02 January 2013 19:40]


Re: Xtext grammar for real literals and filenames with a . dot extension [message #996801 is a reply to message #996712] Wed, 02 January 2013 23:27
Henrik Lindberg
Messages: 2500
Registered: July 2009
Senior Member
On 2013-02-01 19:31, Elvis Dowson wrote:
> Hi Henrik,
> I searched the TMF forums for the word "REAL" and looked up
> posts with your name, but couldn't find the appropriate solution. The
> word "REAL" is too generic, to attempt to try against the whole forum!
>
> If you could locate it or let me know where I'm going wrong, it would be
> great!
>
> Best regards,
>
> Elvis Dowson
ok

Here are the relevant portions of the b3 grammar that show how I dealt
with both real numbers and integers in hex/octal/decimal.

Hope this helps...
Regards
- henrik

IntegerLiteral returns be::BExpression : {be::BLiteralInteger}
value= RadixIntValue
;

RealLiteral returns be::BExpression : {be::BLiteralExpression}
value = RealValue
;

// Has conversion rule
RealValue returns ecore::EDoubleObject: REAL ;

// Has conversion rule that handles decimal, octal, and hexadecimal values but returns an Integer
IntValue returns ecore::EIntegerObject : INT | HEX ;

// Has conversion rule that handles decimal, octal, and hexadecimal values with radix
RadixIntValue returns ecore::EJavaObject : INT | HEX ;

terminal HEX : '0' ('x'|'X')(('0'..'9')|('a'..'f')|('A'..'F'))+ ;
terminal INT : ('0'..'9')+;
REAL hidden(): INT '.' (EXT_INT | INT); // INT ? '.' (EXT_INT | INT);
terminal EXT_INT: INT ('e'|'E')('-'|'+') INT;
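
To make the shape accepted by these rules concrete, here is a minimal sketch that transcribes INT, EXT_INT and REAL into a single regular expression. It is only an approximation for illustration, not something Xtext generates; the class name RealShapeSketch and the method isReal are made up. Because REAL is declared with hidden(), no whitespace is allowed between its parts, which the regex mirrors.

```java
import java.util.regex.Pattern;

class RealShapeSketch {
    // Hand transcription of the posted rules (illustrative only):
    //   INT     : ('0'..'9')+
    //   EXT_INT : INT ('e'|'E') ('-'|'+') INT
    //   REAL    : INT '.' (EXT_INT | INT)   with no hidden tokens in between
    static final Pattern REAL =
        Pattern.compile("[0-9]+\\.(?:[0-9]+[eE][-+][0-9]+|[0-9]+)");

    // True when the whole text has the shape described by the REAL rule.
    static boolean isReal(String text) {
        return REAL.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"1312.123", "1312.123e+123", "0.3e+2", ".3e+2", "0.3e2", "1 . 5"};
        for (String s : samples) {
            System.out.println(s + " -> " + isReal(s));
        }
    }
}
```

Note that EXT_INT as posted requires a sign after the exponent marker, so in this transcription 0.3e+2 matches while 0.3e2 does not; making the sign optional (as in the `sign?` of the original question) would accept both.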


Here is some additional code to reuse - or get it from github...

https://github.com/eclipse/b3/tree/master/org.eclipse.b3.beelang/src/org/eclipse/b3

-------
terminal converters

@ValueConverter(rule = "RealValue")
public IValueConverter<Double> RealValue() {
    return new IValueConverter<Double>() {

        public String toString(Double value) {
            return value.toString();
        }

        public Double toValue(String string, AbstractNode node) {
            // Reject empty input with a meaningful error instead of a NumberFormatException
            if(Strings.isEmpty(string))
                throw new ValueConverterException("Could not convert empty string to double", node, null);
            try {
                return Double.valueOf(string);
            }
            catch(NumberFormatException e) {
                throw new ValueConverterException("Could not convert '" + string + "' to double", node, null);
            }
        }

    };
}
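
The converter above hands the matched text straight to Double, which would reject the underscore-separated literals from the original question (for example 1312_1312.123_123). The following is a minimal, Xtext-free sketch of how the conversion step could strip the separators first. RealTextSketch and parseReal are hypothetical names, and the sketch throws IllegalArgumentException where a real converter would throw ValueConverterException.

```java
class RealTextSketch {
    // Hypothetical helper: converts literal text such as "1312_1312.123_123" to a double.
    static double parseReal(String text) {
        if (text == null || text.isEmpty()) {
            throw new IllegalArgumentException("Could not convert empty string to double");
        }
        // Underscores are only digit separators, so drop them before parsing.
        String digits = text.replace("_", "");
        try {
            return Double.parseDouble(digits);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Could not convert '" + text + "' to double", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseReal("1312_1312.123_123"));
        System.out.println(parseReal("1234.1232e-2"));
    }
}
```

Keeping this logic in a converter rather than in the grammar means the grammar only has to recognize the rough shape of the literal, and malformed text produces a readable error.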


@ValueConverter(rule = "RadixIntValue")
public IValueConverter<IntegerWithRadix> RadixIntValue() {
    return new IValueConverter<IntegerWithRadix>() {

        public String toString(IntegerWithRadix value) {
            return value.toString();
        }

        public IntegerWithRadix toValue(String string, AbstractNode node) throws ValueConverterException {
            int radix = 10;
            if(Strings.isEmpty(string))
                throw new ValueConverterException("Can not convert empty string to int", node, null);
            try {
                // "0x"/"0X" prefix means hexadecimal; a leading "0" on a longer string means octal
                if(string.startsWith("0x") || string.startsWith("0X")) {
                    radix = 16;
                    string = string.substring(2);
                }
                else if(string.startsWith("0") && string.length() > 1)
                    radix = 8;

                return new IntegerWithRadix(Integer.valueOf(string, radix), radix);
            }
            catch(NumberFormatException e) {
                String format = "";
                switch(radix) {
                    case 8:
                        format = "octal";
                        break;
                    case 10:
                        format = "decimal";
                        break;
                    case 16:
                        format = "hexadecimal";
                        break;
                }
                throw new ValueConverterException(
                    "Can not convert to " + format + " integer : " + string, node, null);
            }
        }

    };
}
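
For readers who want to try the radix detection without an Xtext project, here is a small standalone sketch of the same prefix convention. RadixSketch and parse are made-up names, and the result is returned as a plain {value, radix} pair instead of an IntegerWithRadix.

```java
class RadixSketch {
    // Returns {value, radix} using the prefix convention of the converter above:
    // "0x"/"0X" means hexadecimal, a leading "0" on a longer string means octal.
    static int[] parse(String text) {
        int radix = 10;
        String digits = text;
        if (text.startsWith("0x") || text.startsWith("0X")) {
            radix = 16;
            digits = text.substring(2);
        } else if (text.startsWith("0") && text.length() > 1) {
            radix = 8;
        }
        // Throws NumberFormatException for digits that are invalid in the radix, e.g. "08".
        return new int[] { Integer.parseInt(digits, radix), radix };
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0x1F", "017", "42", "0"}) {
            int[] r = parse(s);
            System.out.println(s + " -> value " + r[0] + ", radix " + r[1]);
        }
    }
}
```

One consequence of this convention is that a decimal literal with a leading zero, such as 08, is treated as malformed octal rather than as the number eight.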


----
and if you need it - integer with radix
/**
 * Copyright (c) 2010, Cloudsmith Inc.
 * The code, documentation and other materials contained herein have been
 * licensed under the Eclipse Public License - v 1.0 by the copyright holder
 * listed above, as the Initial Contributor under such license. The text of
 * such license is available at www.eclipse.org.
 */

package org.eclipse.b3.backend.core.datatypes;

import java.io.Serializable;

/**
 * A representation of an integer with a radix
 *
 */
public class IntegerWithRadix implements Serializable {
    private static final long serialVersionUID = 3249927626857631021L;

    private Integer value;

    private int radix;

    public IntegerWithRadix(int value) {
        this.value = Integer.valueOf(value);
        this.radix = 10;
    }

    public IntegerWithRadix(int value, int radix) {
        if(!(radix == 8 || radix == 10 || radix == 16))
            throw new IllegalArgumentException("Only radix 8, 10 or 16 supported");
        this.value = value;
        this.radix = radix;
    }

    public int getRadix() {
        return radix;
    }

    public Integer getValue() {
        return value;
    }

    @Override
    public String toString() {
        StringBuilder buf = new StringBuilder();
        if(radix == 8)
            buf.append("0");
        else if(radix == 16)
            buf.append("0x");
        buf.append(Integer.toString(value, radix));
        return buf.toString();
    }
}
Re: Xtext grammar for real literals and filenames with a . dot extension [message #996816 is a reply to message #996801] Thu, 03 January 2013 00:34
Henrik Lindberg
Messages: 2500
Registered: July 2009
Senior Member
I should have pointed out that REAL is not a terminal rule; it is a data type rule declared with hidden().
I had issues with supporting real numbers without an integral part, e.g.
.3e2 (that is the alternative you can see commented out in the grammar). I decided that the
solution was good enough with a requirement to write that as 0.3e2.

Also worth mentioning: in a later grammar (for a different language) I
switched to using an external lexer. If I had to do this again, I
would use that approach to produce INT, HEX, OCTAL, or REAL tokens, since
lexing this way makes more powerful things possible.
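
As a rough illustration of what hand-written lexing code can decide that is awkward to express with overlapping terminals, here is a minimal sketch of a classifier that sorts numeric text into INT, HEX, OCTAL or REAL. It is a hypothetical standalone helper (NumberKindSketch, classify and the Kind enum are invented names), not Xtext's external lexer mechanism, and its rules for what counts as REAL are assumptions: a dot and fraction digits are required, the integral part and the exponent sign are optional.

```java
class NumberKindSketch {
    enum Kind { INT, HEX, OCTAL, REAL, NONE }

    static boolean isDigit(char c) { return c >= '0' && c <= '9'; }

    static boolean isHexDigit(char c) {
        return isDigit(c) || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // Classifies the whole string; returns NONE when it is not a number in this sketch's sense.
    static Kind classify(String s) {
        int n = s.length();
        if (n == 0) return Kind.NONE;
        // HEX: "0x" or "0X" followed by at least one hex digit
        if (n > 2 && s.charAt(0) == '0' && (s.charAt(1) == 'x' || s.charAt(1) == 'X')) {
            for (int i = 2; i < n; i++) {
                if (!isHexDigit(s.charAt(i))) return Kind.NONE;
            }
            return Kind.HEX;
        }
        int i = 0;
        while (i < n && isDigit(s.charAt(i))) i++;
        if (i == n) {
            // Digits only: a leading zero on a longer string means octal
            if (n > 1 && s.charAt(0) == '0') {
                for (int k = 1; k < n; k++) {
                    if (s.charAt(k) > '7') return Kind.NONE;
                }
                return Kind.OCTAL;
            }
            return Kind.INT;
        }
        // REAL: optional integral part, '.', fraction digits, optional exponent
        if (s.charAt(i) != '.') return Kind.NONE;
        i++;
        int fractionStart = i;
        while (i < n && isDigit(s.charAt(i))) i++;
        if (i == fractionStart) return Kind.NONE;
        if (i < n && (s.charAt(i) == 'e' || s.charAt(i) == 'E')) {
            i++;
            if (i < n && (s.charAt(i) == '+' || s.charAt(i) == '-')) i++;
            int exponentStart = i;
            while (i < n && isDigit(s.charAt(i))) i++;
            if (i == exponentStart) return Kind.NONE;
        }
        return i == n ? Kind.REAL : Kind.NONE;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"42", "017", "0x1F", "0.3e2", ".3e2", "08"}) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

In this sketch the case without an integral part (.3e2) is simply one more branch, because the code sees the whole character sequence and does not have to compete with other token rules.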

Regards
- henrik

Re: Xtext grammar for real literals and filenames with a . dot extension [message #996916 is a reply to message #996816] Thu, 03 January 2013 07:21
Alexander Nittka
Messages: 1156
Registered: July 2009
Senior Member
Hi,

Make filename a datatype rule and not a terminal rule. Actually, this applies to almost all of your terminal rules. Not everything has to be dealt with in the grammar. In most cases it is better to have datatype rules and value converters that throw meaningful errors. Overlapping terminal rules are a pain, in particular if you don't really know how the terminals work.
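
To see why overlapping terminals cause trouble, here is a simplified model of a lexer that picks terminals without any knowledge of the parser context: the longest match wins, and ties go to the rule declared first. This is only an approximation (the ANTLR-generated lexer behind Xtext does not decide exactly this way), and the regular expressions are hand transcriptions of three terminals from the question. OverlapSketch and firstToken are made-up names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OverlapSketch {
    // Regex approximations of three terminals from the question, in declaration order.
    static final Map<String, Pattern> TERMINALS = new LinkedHashMap<>();
    static {
        TERMINALS.put("identifier", Pattern.compile("\\^?[a-zA-Z_][a-zA-Z_0-9]*"));
        TERMINALS.put("decDigits", Pattern.compile("[0-9]+"));
        TERMINALS.put("filenameLiteral", Pattern.compile("[a-zA-Z_0-9]*(?:\\.[a-zA-Z_0-9]*)*"));
    }

    // Returns the name of the terminal with the longest match at the start of the input;
    // on a tie the earlier-declared terminal is kept. Returns "none" if nothing matches.
    static String firstToken(String input) {
        String best = "none";
        int bestLength = 0;
        for (Map.Entry<String, Pattern> entry : TERMINALS.entrySet()) {
            Matcher m = entry.getValue().matcher(input);
            if (m.lookingAt() && m.end() > bestLength) {
                best = entry.getKey();
                bestLength = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"header", "filename.txt", "1312.123"}) {
            System.out.println(s + " -> " + firstToken(s));
        }
    }
}
```

Under this model, header comes out as identifier (so the parser never sees a filenameLiteral inside the angle brackets), filename.txt comes out as filenameLiteral, and 1312.123 is swallowed whole as filenameLiteral, so realLiteral never receives its separate tokens. That matches the failed and passed tests reported in the first post. A datatype rule avoids the problem because it is assembled by the parser, which knows whether it is inside an include directive or a real literal.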

Alex

