Cross-references for multi-word identifiers (cont.) [message #758525]
Wed, 23 November 2011 15:14
Registered: August 2011
This is following on from a previous thread I posted quite a while ago, under "Cross-references for multi-word identifiers". (Sorry, I tried to post a link, but I haven't posted here enough to be allowed to yet!)
Basically, I'm writing a grammar which can accept some limited (very limited) natural language input. I recognise that this is stretching Xtext quite a lot, in particular because I'm not allowed to use any symbolic means to delimit separate phrases during parsing. So if I have:
Thales of Miletus lives on an island.
the parser has no way to tell that it should be grouped as follows:
[Thales of Miletus] [lives on] an [island].
('an' left outside the square brackets because it's a keyword, that's all.)
I thought I could get around this by asking the user to specify which of certain kinds of phrase they wanted to use. So I want to accept
nounphrase Thales of Miletus.
verbphrase lives on.
Thales of Miletus lives on an island.
I had assumed that by specifying the sentence rule as something like:
Sentence : subject=[NounPhrase|MultiWordID] verb=[VerbPhrase|MultiWordID] (article='a' | article='an') object=[NounPhrase|MultiWordID] '.';
where the other rules are:
MultiWordID hidden() : ID (WS ID)*;
NounPhrase : 'nounphrase' name=MultiWordID '.';
VerbPhrase : 'verbphrase' name=MultiWordID '.';
then Xtext, with backtracking enabled, would be able to parse Sentences because it would already have the phrases to match against. It seems, though, that it doesn't work like that; I get errors saying "Couldn't resolve reference to NounPhrase 'Thales of Miletus lives on'", etc.
I can see why it's doing that - 'Thales of Miletus lives on' is the largest contiguous chunk matching the MultiWordID rule - but I don't understand why, given that the subject of a Sentence is specified as a cross-reference to an already-declared NounPhrase, it isn't looking up what possible values of the cross-reference there are.
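To make that expectation concrete, here is a minimal Python sketch of the declaration-driven lookup I had in mind. It is not Xtext code and not how the generated parser behaves; the function name match_declared is made up for illustration, and I'm assuming 'island' has also been declared as a nounphrase.

```python
def match_declared(tokens, start, declared):
    """Return (phrase, next_index) for the longest declared phrase that
    begins at tokens[start], or (None, start) if none matches."""
    best = None
    for phrase in declared:
        words = phrase.split()
        if tokens[start:start + len(words)] == words:
            if best is None or len(words) > len(best):
                best = words
    if best is None:
        return None, start
    return " ".join(best), start + len(best)


# Hypothetical declarations; "island" is an assumption added for the example.
nouns = {"Thales of Miletus", "island"}
verbs = {"lives on"}

tokens = "Thales of Miletus lives on an island .".split()

subject, i = match_declared(tokens, 0, nouns)   # stops after "Miletus"
verb, i = match_declared(tokens, i, verbs)      # stops after "on"
article = tokens[i]                             # keyword 'a' or 'an'
obj, i = match_declared(tokens, i + 1, nouns)

print(subject, "|", verb, "|", article, "|", obj)
```

In other words, the set of declared names decides where each phrase ends, rather than the longest run of ID tokens.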
I'm also getting another problem where a rule like:
IsSentence: subject=[NounPhrase|MultiWordID] 'is' verb=[PassiveVerbPhrase|MultiWordID] (article='a' | article='an') object=[NounPhrase|MultiWordID] '.';
doesn't seem to be used at all - the error messages suggest that the parser is trying to handle
Thales of Miletus is seen on an island.
with the Sentence rule, not the IsSentence rule, and again, I don't know what's going on. I'd have expected backtracking and the presence of the keyword 'is' to have allowed this to be parsed too, but apparently they don't.
No doubt I've either missed something very very basic, or I'm trying the impossible, but I must admit I'm stuck at resolving this myself. Any help would be very much appreciated. Sorry if I haven't been very clear, do let me know if I can give any more details about anything.
Re: Cross-references for multi-word identifiers (cont.) [message #761041 is a reply to message #760913]
Mon, 05 December 2011 19:57
Meinte Boersma
Registered: July 2009
Location: Leiden, Netherlands
Your grammar is unambiguous (or so ANTLR tells me), so the generated parser can only take the one path and will not backtrack at all. The MultiWordID datatype rule simply gobbles up all ID tokens (separated by WS) until it encounters a token that's neither, meaning one of the keywords '.', 'a', 'an' or 'is'. Also, the (Is)Sentence rules will consume one (1) ID token and will not use the MultiWordID, so unless your declarations have names consisting of just 1 ID, the references will not be resolved.
Note that Xtext uses ANTLR, which is an LL(*) parser generator that produces predictive parsers - meaning they are essentially greedy, and that's totally different from a true backtracking parser.
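As a rough illustration of the greedy behaviour, here is a small Python sketch that mimics a rule like MultiWordID. It is only a simulation under simplifying assumptions, not what ANTLR actually generates: the tokenizer just splits on whitespace, and the names KEYWORDS, tokenize and greedy_multi_word_id are made up for the example.

```python
# Keywords of the grammar; every other word is treated as an ID token.
KEYWORDS = {"a", "an", "is", ".", "nounphrase", "verbphrase"}


def tokenize(text):
    """Very rough tokenizer: split on whitespace, '.' becomes its own token."""
    return text.replace(".", " . ").split()


def greedy_multi_word_id(tokens, start):
    """Consume ID tokens from tokens[start] until a keyword is hit.
    Declared names are never consulted, which is the point of the sketch."""
    end = start
    while end < len(tokens) and tokens[end] not in KEYWORDS:
        end += 1
    return " ".join(tokens[start:end]), end


tokens1 = tokenize("Thales of Miletus lives on an island.")
name1, next1 = greedy_multi_word_id(tokens1, 0)
print(name1)  # Thales of Miletus lives on

tokens2 = tokenize("Thales of Miletus is seen on an island.")
name2, next2 = greedy_multi_word_id(tokens2, 0)
print(name2)  # Thales of Miletus
```

The first call swallows 'lives on' along with the subject because nothing but a keyword stops the loop, which matches the text in the "Couldn't resolve reference" error; the second stops early only because 'is' happens to be a keyword.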