Lexer mixes some inputs even though it is set to non-greedy? [message #1841801] Sun, 30 May 2021 17:40
M D
I have created a simple grammar in which a value can be a RefType:

RefType: 'Aggregate' | 'Composite' | 'Association';

When the value is set to Aggregate or Association, no error is reported. However, when Composite is written, I get "no viable alternative at character 'o'". In the DSL this happens at "value": "Composite". The line responsible is in ClassReferences, the second-to-last parser rule of the grammar: '"value"' ':' '"' typeValue = RefType '"'.

I had a similar error before, which was fixed by enabling backtracking in the lexer:
parserGenerator = {
   options = {
      backtrackLexer = true
   }
}

This however doesn't solve it this time, and I don't really get how changing Composite to Aggregate or Association fixes the problem.

The grammar:
Model:
   {Model}
   '[' tables += Table? (',' tables += Table)* ']'
;

Table:
   ClassTable | ClassReferencesTable
;

ClassTable:
   {ClassTable}
   '{'
      '"Name"' ':' '"Class"' ','
      '"Table"' ':' '['
           class += Class? (',' class += Class)*
      ']'
   '}'
;

Class:
   '{'
      '"Name"' ':' '{'
         '"column"' ':' nameColumn = INT ','
         '"row"' ':' nameRow = INT ','
         '"value"' ':' name = STRING
      '}' (',')? 
      '"References"' ':' '['
         ((references += ClassReferences (',' references += ClassReferences)*) | (references += ClassReferencesReference (',' references += ClassReferencesReference)*))
      ']'
   '}'
;

ClassReferencesTable:
   {ClassReferencesTable}
   '{'
      '"Name"' ':' '"ClassReferences"' ','
      '"Table"' ':' '['
           references += ClassReferences? (',' references += ClassReferences)*
      ']'
   '}'
;

ClassReferencesReference:
   '{'
      '"Name"' ':' '{'
         '"column"' ':' column = INT ','
         '"row"' ':' row = INT ','
         '"value"' ':' name = [ClassReferences|STRING]
      '}'
   '}'
;

ClassReferences:
   '{'
      '"Type"' ':' '{'
         '"column"' ':' typeColumn = INT ','
         '"row"' ':' typeRow = INT ','
         '"value"' ':' '"' typeValue = RefType '"' 
      '}' (',')? 
      '"To"' ':' '{'
         '"column"' ':' toColumn = INT ','
         '"row"' ':' toRow = INT ','
         '"value"' ':' toValue = [Class|STRING]
      '}'
   '}'
;

RefType: 'Aggregate' | 'Composite' | 'Association';

terminal NULL: 'null';

terminal ID: '^'?('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;

terminal STRING: '"OKZVVTSPKHOVYSMU' -> 'SQPSUQMWUPQSBXDT"';

terminal INT returns ecore::EInt: ('0'..'9')+;

terminal FLOAT: '-'? INT? '.' INT (('E'|'e') '-'? INT)?;

terminal BOOLEAN: 'true' | 'false';

terminal ML_COMMENT: '/*' -> '*/';

terminal SL_COMMENT: '//' !('\n'|'\r')* ('\r'? '\n')?;

terminal WS: (' '|'\t'|'\r'|'\n')+;

terminal ANY_OTHER: .;



The DSL:
[
  {
    "Name": "Class",
    "Table": [
      {
        "Name": {
          "column": 0,
          "row": 4,
          "value": "OKZVVTSPKHOVYSMUSchoolSQPSUQMWUPQSBXDT"
        },
        "References": [
          {
            "Type": {
              "column": 1,
              "row": 4,
              "value": "Composite" 
            },
            "To": {
              "column": 2,
              "row": 4,
              "value": "OKZVVTSPKHOVYSMUClassRoomSQPSUQMWUPQSBXDT"
            }
          }
        ]
      },
      {
        "Name": {
          "column": 0,
          "row": 5,
          "value": "OKZVVTSPKHOVYSMUClassRoomSQPSUQMWUPQSBXDT"
        },
        "References": [
          {
            "Type": {
              "column": 1,
              "row": 5,
              "value": "Aggregate"
            },
            "To": {
              "column": 2,
              "row": 5,
              "value": "OKZVVTSPKHOVYSMUSchoolSQPSUQMWUPQSBXDT"
            }
          }
        ] 
      }
    ]
  }
]

[Updated on: Sun, 30 May 2021 19:42]

Re: Lexer mixes some inputs even though it is set to greedy? [message #1841802 is a reply to message #1841801] Sun, 30 May 2021 18:51
Christian Dietrich
I have no idea about that. But maybe ANTLR lexers cannot handle everything you do with ", in this case "Class" vs. " Composite ", even with backtracking.
Maybe JFlex can handle this.
https://github.com/TypeFox/xtext-jflex


Twitter : @chrdietrich
Blog : https://www.dietrich-it.de

[Updated on: Sun, 30 May 2021 18:55]

Re: Lexer mixes some inputs even though it is set to greedy? [message #1841803 is a reply to message #1841802] Sun, 30 May 2021 19:07
Christian Dietrich
Maybe you can also try customizing the lexer to be non-greedy instead of backtracking only.
I have zero idea whether this has any side effects.

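parserGenerator = CustomXtextAntlrGeneratorFragment2 {
	options = {
		backtrackLexer = true
	}
}


import org.eclipse.xtext.Grammar
import org.eclipse.xtext.xtext.generator.parser.antlr.AntlrOptions
import org.eclipse.xtext.xtext.generator.parser.antlr.XtextAntlrGeneratorFragment2

class CustomXtextAntlrGeneratorFragment2 extends XtextAntlrGeneratorFragment2 {
	// Meant to add greedy=false to the lexer options emitted when backtrackLexer is set.
	protected def compileLexerOptions(Grammar it, AntlrOptions options) '''
		«IF options.backtrackLexer»

			options {
				backtrack=true;
				memoize=true;
				greedy=false;
			}
		«ENDIF»
	'''
}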


Re: Lexer mixes some inputs even though it is set to greedy? [message #1841804 is a reply to message #1841803] Sun, 30 May 2021 19:27
M D
Thanks, I'll give it a shot. Where am I supposed to put the CustomXtextAntlrGeneratorFragment2 class?

Re: Lexer mixes some inputs even though it is set to greedy? [message #1841805 is a reply to message #1841804] Sun, 30 May 2021 19:42
Christian Dietrich
It depends on how you build.

If you use Maven or Gradle, you need to create a separate module/plugin.
Otherwise, putting it next to the workflow is OK.


Re: Lexer mixes some inputs even though it is set to greedy? [message #1841810 is a reply to message #1841805] Mon, 31 May 2021 06:24
Ed Willink
Hi

Your grammar is provocative in a couple of ways and provocative grammars have an uncanny ability to target obscure tooling bugs/features ....

Your use of quotes in keywords causes an ambiguity with respect to STRING. I strongly recommend that you use conventional keyword spelling to avoid STRING backtracking.

Your failure to extract '"column"', '"row"', '"value"' as a common element may require backtracking since the keyword leads to multiple possible states. I strongly recommend that you factor out the tuple as a rule.

Regards

Ed Willink
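
For illustration, the factoring suggested above could look roughly like the following. This is a minimal sketch, assuming a hypothetical rule named Position that does not exist in the posted grammar. Only the shared "column"/"row" prefix is extracted, because the "value" part differs between rules. Only ClassReferences is shown, and the features typePosition and toPosition are likewise made up for the example, so the generated model changes compared with the original typeColumn/typeRow features.

```xtext
// Hypothetical rule: the "column"/"row" pair that every cell repeats.
Position:
   '"column"' ':' column = INT ','
   '"row"' ':' row = INT
;

ClassReferences:
   '{'
      '"Type"' ':' '{'
         typePosition = Position ','
         '"value"' ':' '"' typeValue = RefType '"'
      '}' (',')?
      '"To"' ':' '{'
         toPosition = Position ','
         '"value"' ':' toValue = [Class|STRING]
      '}'
   '}'
;
```

The same Position rule could then be reused in Class and ClassReferencesReference, so that the parser reaches a single state after '"column"' instead of several.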

Re: Lexer mixes some inputs even though it is set to greedy? [message #1841815 is a reply to message #1841810] Mon, 31 May 2021 07:02
M D
By conventional keyword spelling, I assume you mean without double quotes? This is made to support JSON, so I don't see how I could support the double quotes then.

I'll look into extracting column, row, and value. Thanks.

Re: Lexer mixes some inputs even though it is set to greedy? [message #1841819 is a reply to message #1841815] Mon, 31 May 2021 08:13
Ed Willink
Hi

Your STRING challenge is potentially the same as the one usually faced between keywords and IDs, which requires a keywordOrID rule to allow keywords as IDs. So you might need the same.

But when I look at STRING, which has true usages, I am baffled by its definition. It looks like a bit of development debug typing to define the impossible. Your mistake? My stupidity?

Terminals such as ID and STRING are partially built-in to Xtext and so can be difficult to totally replace. I suggest you use a different spelling for a different purpose.

Regards

Ed Willink
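
As a reminder of the idiom referred to above, a data type rule can list keywords next to a terminal so that both are accepted in the same position. The sketch below is only an illustration: the rule names KeywordOrID and KeywordOrString are hypothetical, and which keywords belong in the list depends on what the DSL should allow.

```xtext
// Usual form of the idiom: an ID, or a keyword that should also be usable as a name.
KeywordOrID:
   ID | 'Aggregate' | 'Composite' | 'Association'
;

// Analogous form for this grammar (assumption): the wrapped STRING terminal,
// or one of the quoted keywords that could otherwise shadow it.
KeywordOrString:
   STRING | '"Class"' | '"ClassReferences"'
;
```

A data type rule like this yields the matched text as a string, so any unwrapping of quotes or marker strings would still need a value converter.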

Re: Lexer mixes some inputs even though it is set to greedy? [message #1841820 is a reply to message #1841819] Mon, 31 May 2021 08:44 Go to previous messageGo to next message
Christian Dietrich is currently offline Christian DietrichFriend
Messages: 14716
Registered: July 2009
Senior Member
i mean e.g. '"Class"'
vs
'"' 'Composite' '"'
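
One way to read the contrast above: every other quoted word in the grammar, such as '"Class"', is a single keyword token that includes its quotes, while the RefType value is split into three tokens. A minimal sketch that makes RefType follow the single-token style is shown here. It is an untested assumption, not a fix confirmed in this thread.

```xtext
// Each alternative is one keyword that includes its quotes,
// in the same style as '"Class"' and '"value"' elsewhere in the grammar.
RefType:
   '"Aggregate"' | '"Composite"' | '"Association"'
;

// In ClassReferences the separate '"' keywords would then be dropped:
//    '"value"' ':' typeValue = RefType
```

With this change typeValue would contain the quotes as well, unless a value converter strips them.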


Re: Lexer mixes some inputs even though it is set to greedy? [message #1841822 is a reply to message #1841820] Mon, 31 May 2021 08:56
M D
Ed Willink wrote on Mon, 31 May 2021 08:13
Your STRING challenge is potentially the same as the one usually faced between keywords and IDs, which requires a keywordOrID rule to allow keywords as IDs. So you might need the same.

But when I look at STRING, which has true usages, I am baffled by its definition. It looks like a bit of development debug typing to define the impossible. Your mistake? My stupidity?

Terminals such as ID and STRING are partially built-in to Xtext and so can be difficult to totally replace. I suggest you use a different spelling for a different purpose.


The STRING inputs are retrieved from the user and wrapped between "OKZVVTSPKHOVYSMU and SQPSUQMWUPQSBXDT". This was done to allow the user to write " within the actual value field. OKZVVTSPKHOVYSMU and SQPSUQMWUPQSBXDT were chosen as random strings because the user is not expected to write either of them.

Christian Dietrich wrote on Mon, 31 May 2021 08:44
i mean e.g. '"Class"'
vs
'"' 'Composite' '"'

I had a previous problem with this, but enabling backtracking fixed it. Why would it suddenly not work anymore?

Re: Lexer mixes some inputs even though it is set to greedy? [message #1841823 is a reply to message #1841822] Mon, 31 May 2021 09:46
Christian Dietrich
I have no idea. Did you try the greedy option?

Re: Lexer mixes some inputs even though it is set to greedy? [message #1841826 is a reply to message #1841823] Mon, 31 May 2021 11:50
Ed Willink
Hi

Quote:
The STRING inputs are retrieved from the user and wrapped "OKZVVTSPKHOVYSMU and SQPSUQMWUPQSBXDT"


Ok. Though you might prefer to name the terminal WRAPPED_STRING to avoid confusing the casual reader and to allow a ValueConverter to unwrap in diagnostics.

Temporarily replacing the " in just STRING with, say, # might help to identify whether this is a backtracking issue.

Regards

Ed Willink
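
Both suggestions above are small grammar changes. A minimal sketch, assuming the terminal keeps its posted definition and only the name changes; the commented-out variant is the temporary # experiment and is meant for diagnosis only.

```xtext
// Renamed terminal, same definition as the posted STRING.
terminal WRAPPED_STRING: '"OKZVVTSPKHOVYSMU' -> 'SQPSUQMWUPQSBXDT"';

// Usages change accordingly, for example in Class and ClassReferences:
//    '"value"' ':' name = WRAPPED_STRING
//    '"value"' ':' toValue = [Class|WRAPPED_STRING]

// Temporary diagnostic variant: # instead of " in this terminal only.
// The test input has to use the same delimiters while this is active.
// terminal WRAPPED_STRING: '#OKZVVTSPKHOVYSMU' -> 'SQPSUQMWUPQSBXDT#';
```

If the error on "Composite" disappears with the # variant, that would point at the interaction between the " keywords and this terminal rather than at RefType itself.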