What is wrong with my grammar [message #1769726]
Wed, 02 August 2017 18:37
Eclipse User
Hi,
I don't like to ask overly broad questions in the hope that someone will do my work for me. But I am stuck and don't have many clues as to what is wrong with my grammar. I have several issues to solve, so I am starting with something basic and minimal.
Have a look at this grammar:
/* Example:
/pls/;
/ {
    p = "hello";
    q = &label;
    label: a {
        r = 23;
    }
}
*/
root:
    v='/pls/' ';'
    '/' '{'
        properties+=property*
        node+=Node*
    '}'
;
Node:
    (label=Label)? name=IDS
    '{'
        properties+=property*
        nodes+=Node*
    '}'
;
property:
    name=IDS '=' (startLabel=Label)? (value=STRING | literal=Literal | ref=Reference) (endLabel=Label)? ';'
;
Label hidden():
    name=IDS ':'
;
Reference hidden():
    '&' label=[Label|IDS]
;
Literal returns ecore::ELong hidden():
    ('0x'|'0X')?
    (
        'a'|'b'|'c'|'d'|'e'|'f'|
        'A'|'B'|'C'|'D'|'E'|'F'|
        '0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
    )+
;
IDS returns ecore::EString hidden():
    (
        'a'|'b'|'c'|'d'|'e'|'f'|'g'|'h'|'i'|'j'|'k'|'l'|'m'|'n'|'o'|'p'|'q'|'r'|'s'|'t'|'u'|'v'|'w'|'x'|'y'|'z'|
        'A'|'B'|'C'|'D'|'E'|'F'|'G'|'H'|'I'|'J'|'K'|'L'|'M'|'N'|'O'|'P'|'Q'|'R'|'S'|'T'|'U'|'V'|'W'|'X'|'Y'|'Z'|
        '0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'|','|'.'|'_'|'+'|'*'|'#'|'?'|'-'
    )+
;
// Terminals
terminal STRING :
    '"' ( '\\' . /* 'b'|'t'|'n'|'f'|'r'|'u'|'"'|"'"|'\\' */ | !('\\'|'"') )* '"'
;
terminal ML_COMMENT :
    '/*' -> '*/'
;
terminal WS:
    (' '|'\t'|'\r'|'\n')+
;
A few points on the choices above:
- A little while ago my grammar was working fine, when I had just one terminal that matched all identifiers and literals in the same rule. I used datatype rules to convert values and everything seemed to work fine.
- Now I need to add expression handling, so I can't work with a simple terminal. As you can see, identifiers can contain characters like the hyphen (-) :'( , and so can expressions. So I need context awareness, which I understand terminals cannot have.
- So I decided to make datatype rules for both integer literals and identifiers, so that I can use "-" and "+" in expressions.
- Problem: This grammar doesn't compile. If I remove (endLabel=Label)? from property, it compiles. I understand that it doesn't know where an integer ends and a label starts, so it errors out. But why is that, when the integer literal rule uses hidden(), which should mean that any whitespace stops the rule from consuming more characters?
- Problem: If (endLabel=Label)? is removed and the grammar compiles, it starts consuming strings such as "2 2" as a single literal, or strings such as " - a -" as an identifier. Why is that? I don't want any spaces to be consumed like this.
- Problem: Having specified individual characters as keywords makes them terminals. So when I double-click a word in the text editor, it selects only one character. Also, EVERY character appears as a keyword.
Basically, I am converting an existing bison grammar into Xtext, and it uses start conditions, which allow context-specific lexical rules. What is the best way to define such a grammar in Xtext?
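To make the "context-specific lexical rules" requirement concrete, here is a minimal, self-contained Java sketch of a lexer with two start-condition-like modes. It is only an illustration of the idea, not Xtext or JFlex code: the class name ModalLexer, the modes NAME and VALUE, and the rule that '=' switches to VALUE while ';' switches back to NAME are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a mode-aware (start-condition style) lexer.
public class ModalLexer {
    // NAME: '-' and '+' are part of an identifier. VALUE: they are operators.
    enum Mode { NAME, VALUE }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Mode mode = Mode.NAME;
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace always ends the current token and is skipped
                continue;
            }
            if (c == '=') { // assumed rule: '=' starts a value context
                tokens.add("=");
                mode = Mode.VALUE;
                i++;
                continue;
            }
            if (c == ';') { // assumed rule: ';' returns to the name context
                tokens.add(";");
                mode = Mode.NAME;
                i++;
                continue;
            }
            if (mode == Mode.VALUE && (c == '-' || c == '+')) {
                tokens.add(String.valueOf(c)); // operator token in value context
                i++;
                continue;
            }
            int start = i;
            while (i < input.length() && isWordChar(input.charAt(i), mode)) {
                i++;
            }
            if (i == start) {
                throw new IllegalArgumentException("Unexpected character '" + c + "' at " + i);
            }
            tokens.add(input.substring(start, i));
        }
        return tokens;
    }

    // The set of characters allowed inside a word depends on the current mode.
    static boolean isWordChar(char c, Mode mode) {
        if (Character.isLetterOrDigit(c) || c == '_') {
            return true;
        }
        return mode == Mode.NAME && (c == '-' || c == '+');
    }
}
```

With this sketch, tokenize("a-b = 1-2;") keeps "a-b" as one identifier but splits "1-2" into three tokens, and "2 2" in a value context yields two separate tokens because whitespace is never part of a word.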
Re: What is wrong with my grammar [message #1769830 is a reply to message #1769736]
Thu, 03 August 2017 15:51
Eclipse User
Hi Christian,
Thanks for the reply and suggestions!
>> how does your datatype rule solve the expression problem? 1+1 will be an IDS?
I did not add expression syntax to the example grammar above, because I did not want to complicate the scenario. I just wanted to let you know that I am not using terminals because identifiers can contain, for example, a hyphen (-), and so can values that contain expressions. So terminals cannot be used and I have to rely on datatype rules (if I understand correctly). The real problem I am facing is that I can't understand why a Literal or IDS rule is also consuming whitespace characters.
>> how does the header of the grammar look like?
Here is the header:
grammar org.xtext.example.sdt.SysDefTree hidden(ML_COMMENT, WS)
generate sysDefTree "http://www.xtext.org/example/sdt/SysDefTree"
import "http://www.eclipse.org/emf/2002/Ecore" as ecore
>> with grammar org.xtext.example.mydsl.MyDsl hidden(WS, ML_COMMENT) it generates with warnings, even if you have endLabel
For me, with endLabel I get errors; without it I can build, and the editor is generated.
>> could be there are still some bugs around the hidden stuff.
>> you can tell antlr to create a debug grammar. (e.g. to be opened with antlrworks)
I have generated an ANTLR debug grammar, which I used to identify that the Literal rule is consuming whitespace when it shouldn't. What confuses me more is that if I remove endLabel and generate the editor, it seems to work fine; that is, it does not consume extra whitespace for the given Literal rule. However, when I load the generated debug grammar in ANTLRWorks (v1.5.2), it consumes whitespace without complaint. So for a property like "s = 2 3 4;", ANTLRWorks gives no errors and parses it as a single literal "234", but the Xtext-generated editor reports an error on this line. So it looks to me like there is some difference between Xtext and ANTLR that I am caught in between.
>> maybe you need an external / custom lexer, e.g. built with jflex (there are some blog posts around the topic)
I can look at JFlex, and if it is somewhat compatible with flex, I may be able to use the original flex-bison source. I will try to find some blogs once I understand the above problem.
>> maybe you can work around the buggy hidden support using explicit WS
I tried that but somewhat casually. I will have a look at it again, and report back.
>> the double click problem has to do with the partitioning. same for the coloring behaviour. so maybe you need to customize syntax highlighting. or you can somehow trick DefaultAntlrTokenToAttributeIdMapper to behave differently regarding "some" keywords
Thanks, I will look at it once I have solved the grammar issues.
Re: What is wrong with my grammar [message #1775993 is a reply to message #1769952]
Wed, 08 November 2017 18:20
Eclipse User
Coming back here just to give an update: I was able to plug in a custom lexer based on JFlex with the help of the links you provided. It was not easy, but it was a fun experience, and certainly doable. Thanks for all the help!
[Updated on: Thu, 09 November 2017 13:04] by Moderator