whitespaces not allowed in grammar [message #870725] |
Wed, 09 May 2012 09:21  |
Eclipse User |
I'm trying to build an Xtext grammar rule that accepts the following syntax:
system check_hardware
system check_hardware
system check_hardware -all
system check_hardware -all -i
system check_hardware -all -i "HDD"
system -i check_hardware
system -i check_hardware
system -i check_hardware -all
system -i check_hardware -all -i
system -i check_hardware -all -i "HDD"
As you can imagine, the keyword system announces that a system command will follow.
Optionally, the keyword system can take one option, -i.
After that (in place of check_hardware in my example), any Linux command or program can follow with its own options, for example 'echo "Hello World"' or 'grep -i test *'.
My approach for that is the following rule:
system: 'system' '-i'? prog=ID ('-'ID STRING?)* ;
So we have the keyword system, an optional option -i, the program/command name (which usually has the syntax of an ID) and finally the options for the command...
Here are some results from this rule:
system echo //OK
system -i echo //OK
system -i echo -n "Hello World!" //OK
system -i echo -i "Hello World!" //NOT OK -> "no viable alternative at input '-i'" //(every other character as an option except -i would be ok)
Can somebody tell me what is wrong with my grammar?
The other open issue concerns whitespace...
How can I prevent Xtext from accepting:
system -i echo - n "Hello World!"
system-i echo -n "Hello World!"
system-iecho-n"Hello World!"
These last three inputs are accepted as valid, but they should be rejected...
Many thanks in advance.
Re: whitespaces not allowed in grammar [message #870799 is a reply to message #870778] |
Wed, 09 May 2012 14:03   |
Eclipse User |
On 2012-09-05 17:39, Tim Student wrote:
> Thanks, the approach is clear so far, but could you explain it in a bit
> more detail?
>
> My problem is that I cannot handle whitespace characters. For example, I
> would like to implement something like a variable assignment that is
> restricted to:
>
> //only this should be valid:
> var="anystring";
>
> //and not
> var= "anystring";
> //and not
> var ="anystring";
> //and not
> var = "anystring";
>
>
> When my grammar says:
>
> variable: name=ID '=' varcontent=STRING;
>
> ... all three of the entries mentioned above will be valid, because Xtext
> allows an arbitrary amount of whitespace between the individual elements.
>
> What can I do there?
>
What I meant, but did not say (I wrote hidden(WS)): it should be the
other way around. You specify what should be hidden, i.e.
hidden(SL_COMMENT, ML_COMMENT, ...) if you want to allow comments, or
hidden() if nothing should be hidden. You then reference WS explicitly
wherever whitespace is allowed.
With an empty hidden() clause on the rule:
variable hidden(): name=ID '=' varcontent=STRING;
should work.
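To make the effect of hidden() concrete, here is a minimal Python model of it. This is only a sketch and not Xtext code: the token patterns are simplified stand-ins for Xtext's ID, STRING and WS terminals, and the names `tokenize` and `matches_variable` are made up for this illustration.

```python
import re

# Simplified stand-ins for the terminals used by the "variable" rule.
TOKEN_SPEC = [
    ("ID", re.compile(r"[A-Za-z_][A-Za-z_0-9]*")),
    ("'='", re.compile(r"=")),
    ("STRING", re.compile(r'"[^"]*"')),
    ("WS", re.compile(r"\s+")),
]

def tokenize(text):
    """Return the token kinds of text, including WS tokens."""
    kinds, pos = [], 0
    while pos < len(text):
        for kind, pattern in TOKEN_SPEC:
            match = pattern.match(text, pos)
            if match:
                kinds.append(kind)
                pos = match.end()
                break
        else:
            raise ValueError(f"cannot tokenize at position {pos}")
    return kinds

def matches_variable(text, hidden=("WS",)):
    """Check text against: ID '=' STRING, after dropping the hidden token kinds."""
    visible = [kind for kind in tokenize(text) if kind not in hidden]
    return visible == ["ID", "'='", "STRING"]

# Default behaviour: WS is hidden, so spaces around '=' are accepted.
print(matches_variable('var = "anystring"'))
# Model of hidden(): nothing is hidden, so the WS tokens break the match.
print(matches_variable('var = "anystring"', hidden=()))
print(matches_variable('var="anystring"', hidden=()))
```

With WS hidden, the parser never sees the whitespace tokens. With nothing hidden they stay in the token stream, and a rule that does not mention WS rejects them.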
- henrik
Re: whitespaces not allowed in grammar [message #875369 is a reply to message #870725] |
Tue, 22 May 2012 11:59  |
Eclipse User |
Tim Student wrote on Wed, 09 May 2012 15:21:
I'm trying to build a Xtext
system echo //OK
system -i echo //OK
system -i echo -n "Hello World!" //OK
system -i echo -i "Hello World!" //NOT OK -> "no viable alternative at input '-i'" //(every other character as an option except -i would be ok)
Can somebody tell me, what is wrong with my grammar?
The problem lies in your lexer tokens. Let me explain.
Before the actual parser sees your input, the input is preprocessed by the so-called lexer. The lexer converts the input, which is just a stream of characters, into a so-called token stream. Let's identify the tokens in your grammar:
'system'
'-i'
'-'
ID
STRING
These tokens follow directly from your grammar, where ID and STRING are predefined tokens (also called terminals).
The lexer now converts the stream of characters into a stream of tokens. Let's look at two of your examples:
system -i echo -n "Hello World!" ==> 'system' '-i' ID '-' ID STRING
system -i echo -i "Hello World!" ==> 'system' '-i' ID '-i' STRING
Note that in the second example the "-i" is converted to '-i' instead of '-' ID. This is because the lexer prefers the keyword token '-i' over splitting the input into '-' followed by ID.
What your parser expects is the following stream of tokens:
'system' '-i'? ID ('-' ID STRING?)*
In the second example the parser cannot recognize the input because it sees the '-i' token where it expects the '-' token.
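The tokenization described above can be reproduced with a small Python model. This is only a sketch: the real lexer is generated by Xtext/ANTLR, the ID and STRING patterns here are simplified stand-ins for the predefined terminals, and the "longest match wins, keywords win ties" rule is an assumption that approximates the generated lexer's behaviour for these inputs.

```python
import re

# Token kinds in priority order: keywords from the grammar first, then terminals.
# ID and STRING are simplified stand-ins, not Xtext's real terminal definitions.
TOKEN_SPEC = [
    ("'system'", re.compile(r"system")),
    ("'-i'", re.compile(r"-i")),
    ("'-'", re.compile(r"-")),
    ("ID", re.compile(r"[A-Za-z_][A-Za-z_0-9]*")),
    ("STRING", re.compile(r'"[^"]*"')),
    ("WS", re.compile(r"\s+")),
]

def tokenize(text):
    """Split text into token kinds; longest match wins, earlier entries win ties."""
    kinds = []
    pos = 0
    while pos < len(text):
        best_kind, best_len = None, 0
        for kind, pattern in TOKEN_SPEC:
            match = pattern.match(text, pos)
            if match and match.end() - pos > best_len:
                best_kind, best_len = kind, match.end() - pos
        if best_kind is None:
            raise ValueError(f"cannot tokenize at position {pos}")
        if best_kind != "WS":  # whitespace is hidden from the parser
            kinds.append(best_kind)
        pos += best_len
    return kinds

print(tokenize('system -i echo -n "Hello World!"'))
print(tokenize('system -i echo -i "Hello World!"'))
print(tokenize('system-iecho-n"Hello World!"'))
```

The first and third calls produce the same token kinds, which is why the input without any spaces is accepted: once whitespace is hidden, the parser cannot tell the two apart. The second call yields '-i' where the rule needs '-' followed by ID.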
I hope the problem is clear to you now.
There are several ways to fix this problem:
One solution would be to allow '-i' as an alternative to '-'ID:
system: 'system' '-i'? prog=ID (('-'ID | '-i') STRING?)* ;
This solution feels a little dirty to me.
Another solution, which I would prefer, is to allow any flag after system and use the validator to ensure that only "-i" is used. I would also define a dedicated token (terminal) for the flags:
system: 'system' FLAG? prog=ID (FLAG STRING?)* ;
terminal FLAG: '-' ('a'..'z'|'A'..'Z'|'0'..'9')+;
The dedicated FLAG token also solves your problem with spaces between the '-' and the flag characters.
Don't forget to implement the validation that ensures that the first FLAG is exactly "-i".
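As a rough illustration of the FLAG approach plus the validation step, here is a Python model. It is a sketch under assumptions, not an Xtext validator: the token patterns are simplified stand-ins, and `tokenize` and `check_system_command` are hypothetical names invented for this example. In a real project the check would live in the validator class of your language.

```python
import re

# Simplified stand-ins for the tokens of the FLAG-based grammar.
TOKEN_SPEC = [
    ("'system'", re.compile(r"system")),
    ("FLAG", re.compile(r"-[A-Za-z0-9]+")),
    ("ID", re.compile(r"[A-Za-z_][A-Za-z_0-9]*")),
    ("STRING", re.compile(r'"[^"]*"')),
    ("WS", re.compile(r"\s+")),
]

def tokenize(text):
    """Return (kind, lexeme) pairs; longest match wins, whitespace is dropped."""
    tokens, pos = [], 0
    while pos < len(text):
        best = None
        for kind, pattern in TOKEN_SPEC:
            match = pattern.match(text, pos)
            if match and (best is None or match.end() > best[1].end()):
                best = (kind, match)
        if best is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        kind, match = best
        if kind != "WS":
            tokens.append((kind, match.group()))
        pos = match.end()
    return tokens

def check_system_command(text):
    """Model of: 'system' FLAG? prog=ID (FLAG STRING?)* plus the "-i" validation.

    Returns a list of error messages; an empty list means the input is accepted.
    """
    try:
        tokens = tokenize(text)
    except ValueError as error:
        return [str(error)]
    kinds = [kind for kind, _ in tokens]
    i = 0
    if i >= len(tokens) or kinds[i] != "'system'":
        return ["expected keyword 'system'"]
    i += 1
    first_flag = None
    if i < len(tokens) and kinds[i] == "FLAG":
        first_flag = tokens[i][1]
        i += 1
    if i >= len(tokens) or kinds[i] != "ID":
        return ["expected a program name (ID)"]
    i += 1
    while i < len(tokens) and kinds[i] == "FLAG":
        i += 1
        if i < len(tokens) and kinds[i] == "STRING":
            i += 1
    if i != len(tokens):
        return [f"unexpected token {tokens[i][1]!r}"]
    # Validation step: the grammar accepts any FLAG here, only "-i" is allowed.
    if first_flag is not None and first_flag != "-i":
        return [f"only '-i' is allowed directly after 'system', found {first_flag!r}"]
    return []

print(check_system_command('system -i echo -i "Hello World!"'))
print(check_system_command('system -x echo'))
print(check_system_command('system -i echo - n "Hello World!"'))
```

In this model the previously failing input is accepted, a wrong first flag is caught by the validation step rather than by the grammar, and a lone '-' followed by a space no longer forms any token.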
I hope that was helpful,
Felix
[Updated on: Tue, 22 May 2012 12:00] by Moderator