Lexer mixes some inputs even though it is set to non-greedy? [message #1841801] Sun, 30 May 2021 13:40
Eclipse UserFriend
I have created a simple grammar, where a value can be a refType:

RefType: 'Aggregate' | 'Composite' | 'Association';

When the value is Aggregate or Association, there is no error. However, when it is Composite, the lexer reports "no viable alternative at character 'o'". In the DSL this happens at "value": "Composite". The affected rule is ClassReferences, the second-to-last parser rule in the grammar, at this exact line: '"value"' ':' '"' typeValue = RefType '"'.

I have had a similar error before, which was fixed by enabling backtracking in the lexer:
parserGenerator = {
   options = {
      backtrackLexer = true
   }
}

This, however, doesn't solve it this time, and I don't understand why changing Composite to Aggregate or Association makes the error go away.

The grammar:
Model:
   {Model}
   '[' tables += Table? (',' tables += Table)* ']'
;

Table:
   ClassTable | ClassReferencesTable
;

ClassTable:
   {ClassTable}
   '{'
      '"Name"' ':' '"Class"' ','
      '"Table"' ':' '['
           class += Class? (',' class += Class)*
      ']'
   '}'
;

Class:
   '{'
      '"Name"' ':' '{'
         '"column"' ':' nameColumn = INT ','
         '"row"' ':' nameRow = INT ','
         '"value"' ':' name = STRING
      '}' (',')? 
      '"References"' ':' '['
         ((references += ClassReferences (',' references += ClassReferences)*) | (references += ClassReferencesReference (',' references += ClassReferencesReference)*))
      ']'
   '}'
;

ClassReferencesTable:
   {ClassReferencesTable}
   '{'
      '"Name"' ':' '"ClassReferences"' ','
      '"Table"' ':' '['
           references += ClassReferences? (',' references += ClassReferences)*
      ']'
   '}'
;

ClassReferencesReference:
   '{'
      '"Name"' ':' '{'
         '"column"' ':' column = INT ','
         '"row"' ':' row = INT ','
         '"value"' ':' name = [ClassReferences|STRING]
      '}'
   '}'
;

ClassReferences:
   '{'
      '"Type"' ':' '{'
         '"column"' ':' typeColumn = INT ','
         '"row"' ':' typeRow = INT ','
         '"value"' ':' '"' typeValue = RefType '"' 
      '}' (',')? 
      '"To"' ':' '{'
         '"column"' ':' toColumn = INT ','
         '"row"' ':' toRow = INT ','
         '"value"' ':' toValue = [Class|STRING]
      '}'
   '}'
;

RefType: 'Aggregate' | 'Composite' | 'Association';

terminal NULL: 'null';

terminal ID: '^'?('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*;

terminal STRING: '"OKZVVTSPKHOVYSMU' -> 'SQPSUQMWUPQSBXDT"';

terminal INT returns ecore::EInt: ('0'..'9')+;

terminal FLOAT: '-'? INT? '.' INT (('E'|'e') '-'? INT)?;

terminal BOOLEAN: 'true' | 'false';

terminal ML_COMMENT: '/*' -> '*/';

terminal SL_COMMENT: '//' !('\n'|'\r')* ('\r'? '\n')?;

terminal WS: (' '|'\t'|'\r'|'\n')+;

terminal ANY_OTHER: .;



The DSL:
[
  {
    "Name": "Class",
    "Table": [
      {
        "Name": {
          "column": 0,
          "row": 4,
          "value": "OKZVVTSPKHOVYSMUSchoolSQPSUQMWUPQSBXDT"
        },
        "References": [
          {
            "Type": {
              "column": 1,
              "row": 4,
              "value": "Composite" 
            },
            "To": {
              "column": 2,
              "row": 4,
              "value": "OKZVVTSPKHOVYSMUClassRoomSQPSUQMWUPQSBXDT"
            }
          }
        ]
      },
      {
        "Name": {
          "column": 0,
          "row": 5,
          "value": "OKZVVTSPKHOVYSMUClassRoomSQPSUQMWUPQSBXDT"
        },
        "References": [
          {
            "Type": {
              "column": 1,
              "row": 5,
              "value": "Aggregate"
            },
            "To": {
              "column": 2,
              "row": 5,
              "value": "OKZVVTSPKHOVYSMUSchoolSQPSUQMWUPQSBXDT"
            }
          }
        ] 
      }
    ]
  }
]

[Updated on: Sun, 30 May 2021 15:42] by Moderator

Re: Lexer mixes some inputs even though it is set to greedy? [message #1841802 is a reply to message #1841801] Sun, 30 May 2021 14:51
Eclipse UserFriend
I have no idea on that, but maybe ANTLR lexers cannot handle all the " stuff you do, e.g. the keyword '"Class"' vs. '"' 'Composite' '"', even with backtracking.
Maybe JFlex can handle this:
https://github.com/TypeFox/xtext-jflex

[Updated on: Sun, 30 May 2021 14:55] by Moderator

Re: Lexer mixes some inputs even though it is set to greedy? [message #1841803 is a reply to message #1841802] Sun, 30 May 2021 15:07
Eclipse UserFriend
Maybe you can also try to customize the lexer to be non-greedy, instead of only enabling backtracking.
I have zero idea whether this has any side effects.

parserGenerator = CustomXtextAntlrGeneratorFragment2 {
	options = {
		backtrackLexer = true
	}
}


class CustomXtextAntlrGeneratorFragment2 extends XtextAntlrGeneratorFragment2 {
	protected def compileLexerOptions(Grammar it, AntlrOptions options) '''
		«IF options.backtrackLexer»

			options {
				backtrack=true;
				memoize=true;
				greedy=false;
			}
		«ENDIF»
	'''
}
Re: Lexer mixes some inputs even though it is set to greedy? [message #1841804 is a reply to message #1841803] Sun, 30 May 2021 15:27
Eclipse UserFriend
Thanks, I'll give it a shot. Where am I supposed to put the CustomXtextAntlrGeneratorFragment2 class?
Re: Lexer mixes some inputs even though it is set to greedy? [message #1841805 is a reply to message #1841804] Sun, 30 May 2021 15:42
Eclipse UserFriend
It depends on how you build.

If you use Maven or Gradle, you need to create a separate module/plugin.
Otherwise, placing it next to the workflow is OK.
Re: Lexer mixes some inputs even though it is set to greedy? [message #1841810 is a reply to message #1841805] Mon, 31 May 2021 02:24
Eclipse UserFriend
Hi

Your grammar is provocative in a couple of ways and provocative grammars have an uncanny ability to target obscure tooling bugs/features ....

Your use of quotes in keywords causes an ambiguity with respect to STRING. I strongly recommend that you use conventional keyword spelling to avoid STRING backtracking.

Your failure to extract '"column"', '"row"', '"value"' as a common element may require backtracking, since the keyword leads to multiple possible states. I strongly recommend that you factor out the tuple as a rule.
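
A minimal sketch of what factoring out the shared part could look like, assuming the grammar posted above. The rule name Position and the feature name position are hypothetical, and only ClassReferencesReference is shown; the same replacement would apply to the "Name", "Type" and "To" objects in the other rules. This has not been run against the poster's project.

```xtext
// Hypothetical helper rule: the shared "column"/"row" pair, parsed in one place.
Position:
   '"column"' ':' column = INT ','
   '"row"' ':' row = INT
;

// ClassReferencesReference rewritten against Position; the "value" entry stays
// in the calling rule because its type differs from rule to rule.
ClassReferencesReference:
   '{'
      '"Name"' ':' '{'
         position = Position ','
         '"value"' ':' name = [ClassReferences|STRING]
      '}'
   '}'
;
```

Note that this changes the inferred model: column and row move from the containing object into a separate Position object, so code that reads the old features would have to be adjusted.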

Regards

Ed Willink
Re: Lexer mixes some inputs even though it is set to greedy? [message #1841815 is a reply to message #1841810] Mon, 31 May 2021 03:02
Eclipse UserFriend
By conventional keyword spelling, I assume you mean without double quotes? The grammar is meant to parse JSON, so I don't see how I could support the double quotes then.

I'll look into extracting column, row and value. Thanks.
Re: Lexer mixes some inputs even though it is set to greedy? [message #1841819 is a reply to message #1841815] Mon, 31 May 2021 04:13
Eclipse UserFriend
Hi

Your STRING challenge is potentially the same as the one usually faced between keywords and IDs, which requires a keywordOrID rule to allow keywords as IDs. So you might need the same.

But when I look at STRING which has true usages I am baffled by its definition. It looks like a bit of development debug typing to define the impossible. Your mistake ? My stupidity?

Terminals such as ID and STRING are partially built-in to Xtext and so can be difficult to totally replace. I suggest you use a different spelling for a different purpose.

Regards

Ed Willink

Re: Lexer mixes some inputs even though it is set to greedy? [message #1841820 is a reply to message #1841819] Mon, 31 May 2021 04:44
Eclipse UserFriend
i mean e.g. '"Class"'
vs
'"' 'Composite' '"'
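
One possible reading of this hint, offered as an assumption rather than a confirmed diagnosis: the input "Composite" begins with "C, which is also the start of the keywords '"Class"' and '"ClassReferences"', whereas "Aggregate" and "Association" begin with "A, which no quoted keyword in the grammar shares. That would be consistent with the error being reported at the character 'o'. An untested sketch that avoids the standalone '"' keyword by making the quotes part of each literal:

```xtext
// Assumption: with the quotes inside each literal, the grammar no longer needs a
// separate '"' keyword, and '"Composite"' and '"Class"' become ordinary keywords
// that differ at the third character.
RefType: '"Aggregate"' | '"Composite"' | '"Association"';

// Inside ClassReferences, the "value" line would then read:
//    '"value"' ':' typeValue = RefType
```

With this spelling, the value stored in typeValue would include the surrounding double quotes, so they would have to be stripped (or compared against) wherever the value is used.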
Re: Lexer mixes some inputs even though it is set to greedy? [message #1841822 is a reply to message #1841820] Mon, 31 May 2021 04:56
Eclipse UserFriend
Ed Willink wrote on Mon, 31 May 2021 08:13
Hi

Your STRING challenge is potentially the same as the one usually faced between keywords and IDs, which requires a keywordOrID rule to allow keywords as IDs. So you might need the same.

But when I look at STRING which has true usages I am baffled by its definition. It looks like a bit of development debug typing to define the impossible. Your mistake ? My stupidity?

Terminals such as ID and STRING are partially built-in to Xtext and so can be difficult to totally replace. I suggest you use a different spelling for a different purpose.

Regards

Ed Willink


The STRING inputs are retrieved from the user and wrapped in "OKZVVTSPKHOVYSMU and SQPSUQMWUPQSBXDT". This was done to allow the user to write " within the actual value field. OKZVVTSPKHOVYSMU and SQPSUQMWUPQSBXDT were chosen as random strings, since the user is not expected to type either of them.

Christian Dietrich wrote on Mon, 31 May 2021 08:44
i mean e.g. '"Class"'
vs
'"' 'Composite' '"'

I had a similar problem with this before, and enabling backtracking fixed it. Why would it suddenly not work anymore?
Re: Lexer mixes some inputs even though it is set to greedy? [message #1841823 is a reply to message #1841822] Mon, 31 May 2021 05:46
Eclipse UserFriend
I have no idea. Did you try the greedy option?
Re: Lexer mixes some inputs even though it is set to greedy? [message #1841826 is a reply to message #1841823] Mon, 31 May 2021 07:50
Eclipse UserFriend
Hi

Quote:
The STRING inputs are retrieved from the user and wrapped "OKZVVTSPKHOVYSMU and SQPSUQMWUPQSBXDT"


Ok. Though you might prefer to name the terminal WRAPPED_STRING to avoid confusing the casual reader and to allow a ValueConverter to unwrap in diagnostics.

Temporarily replacing the " in just STRING with, say, # might help to tell whether this is a backtracking issue.
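
Both suggestions could be sketched as follows, assuming the terminal definition from the original post. The '#' variant is only a diagnostic experiment, and the test input would have to be changed to match it.

```xtext
// Renamed terminal. Every STRING usage in the grammar has to be renamed as well,
// e.g. name = WRAPPED_STRING and toValue = [Class|WRAPPED_STRING].
terminal WRAPPED_STRING: '"OKZVVTSPKHOVYSMU' -> 'SQPSUQMWUPQSBXDT"';

// Temporary diagnostic variant with '#' instead of '"' at both ends
// (swap it in for the definition above; do not keep both):
// terminal WRAPPED_STRING: '#OKZVVTSPKHOVYSMU' -> 'SQPSUQMWUPQSBXDT#';
```

If the error at "Composite" persists with the '#' variant, the wrapped-string terminal is probably not involved, and the quoted keywords become the more likely cause.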

Regards

Ed Willink