Cross-references for multi-word identifiers (cont.) [message #758525] Wed, 23 November 2011 15:14
SWAT
Hi,

This is following on from a previous thread I posted quite a while ago, under "Cross-references for multi-word identifiers". (Sorry, I tried to post a link, but I haven't posted here enough to be allowed to yet!)

Basically, I'm writing a grammar which can accept some limited (very limited) natural language input. I recognise that this is stretching Xtext quite a lot, in particular because I'm not allowed to use any symbolic means to delimit separate phrases during parsing. So if I have:

Thales of Miletus lives on an island.

the parser has no way to tell that it should be grouped as follows:

[Thales of Miletus] [lives on] an [island].

('an' left outside the square brackets because it's a keyword, that's all.)

I thought I could get around this by asking the user to specify which of certain kinds of phrase they wanted to use. So I want to accept

nounphrase Thales of Miletus.
verbphrase lives on.
nounphrase island.
Thales of Miletus lives on an island.

I had assumed that by specifying the sentence rule as something like:

Sentence : subject=[NounPhrase|MultiWordID] verb=[VerbPhrase|MultiWordID] (article='a' | article='an') object=[NounPhrase|MultiWordID] '.';

where the other rules are:

MultiWordID hidden() : ID (WS ID)*;
NounPhrase : 'nounphrase' name=MultiWordID '.';
VerbPhrase : 'verbphrase' name=MultiWordID '.';

then Xtext, with backtracking enabled, would be able to parse Sentences because it would already have the phrases to match against. It seems, though, that it doesn't work like that; I get errors saying "Couldn't resolve reference to NounPhrase 'Thales of Miletus lives on'", etc.

I can see why it's doing that - 'Thales of Miletus lives on' is the largest contiguous chunk matching the MultiWordID rule - but I don't understand why, given that the subject of a Sentence is specified as a cross-reference to an already-declared NounPhrase, it isn't looking up what possible values of the cross-reference there are.

I'm also getting another problem where a rule like:

IsSentence: subject=[NounPhrase|MultiWordID] 'is' verb=[PassiveVerbPhrase|MultiWordID] (article='a' | article='an') object=[NounPhrase|MultiWordID] '.';

doesn't seem to be called at all - the error messages suggest that the parser is trying to handle

Thales of Miletus is seen on an island.

with the Sentence rule, not the IsSentence rule, and again, I don't know what's going on. I'd have expected backtracking and the presence of the keyword 'is' to allow this to be parsed too, but apparently they don't.

No doubt I've either missed something very very basic, or I'm trying the impossible, but I must admit I'm stuck at resolving this myself. Any help would be very much appreciated. Sorry if I haven't been very clear, do let me know if I can give any more details about anything.

Many thanks!
Re: Cross-references for multi-word identifiers (cont.) [message #758579 is a reply to message #758525] Wed, 23 November 2011 18:12
Meinte Boersma
To start with the first thing: a cross reference is not a symbol table. If you have a =[NounPhrase] somewhere, then another rule should exist with an assignment name=NounPhrase - using 'name' ensures that it's exposed to the linker. I don't see that in your grammar, hence the "Couldn't resolve reference" errors.

Secondly, backtracking is a tricky thing: if you rely on it, then you're probably going to see some weird things. You didn't give your entire grammar: could it be that IsSentence indeed is never called? In that case, backtracking isn't going to help you out.


Re: Cross-references for multi-word identifiers (cont.) [message #760913 is a reply to message #758579] Mon, 05 December 2011 15:15
SWAT
Hi,

Thanks for the response and sorry for the long delay, I've been away from my desk for too long!

This is the full grammar I've got:

grammar org.xtext.example.nltest.NLTest with org.eclipse.xtext.common.Terminals

generate nLTest "NLTest URL (censored by forum software, not me!)"

Model:
sentences+=Sentence* issentences+=IsSentence* declaration+=Declaration*;

Declaration : NounPhrase | VerbPhrase | PassiveVerbPhrase;

Sentence : subject=[NounPhrase] verb=[VerbPhrase] (article='a' | article='an') object=[NounPhrase] '.';
IsSentence: subject=[NounPhrase] 'is' verb=[PassiveVerbPhrase] (article='a' | article='an') object=[NounPhrase] '.';

MultiWordID hidden() : ID (WS ID)*;
NounPhrase : 'nounphrase' name=MultiWordID '.';
VerbPhrase : 'verbphrase' name=MultiWordID '.';
PassiveVerbPhrase : 'passive' name=MultiWordID '.';


and on the following input:
Albert Albertson owns a dog..
Albert Albertson is nicknamed Al.
nounphrase Albert Albertson.
passive nicknamed.
verbphrase owns.
nounphrase dog.
nounphrase Al.

I get several errors:

No viable alternative at input '.'
Couldn't resolve reference to VerbPhrase 'dog'.
Mismatched input 'a' expecting RULE_ID.
Couldn't resolve reference to NounPhrase 'Albert Albertson owns a'.

just on the first line of input.

I can see that probably all of these come from the fact that it's trying to parse 'Albert Albertson owns a' as a noun phrase. What I'd expect is that when this fails, backtracking would lead to a correct parse, and this doesn't seem to be happening.

I'm from a Prolog background, where ambiguous grammars and backtracking are the norm, so I confess I'm a bit lost as to why this doesn't work. Unrealistic expectations are almost certainly the reason!

Any further advice would be much appreciated!

Many thanks!
Re: Cross-references for multi-word identifiers (cont.) [message #761041 is a reply to message #760913] Mon, 05 December 2011 19:57
Meinte Boersma
Your grammar is unambiguous (or so ANTLR tells me), so the generated parser can only take the one path and will not backtrack at all. The MultiWordID datatype rule simply gobbles up all ID tokens (separated by WS) until it encounters a token that's neither, meaning one of the keywords '.', 'a', 'an' or 'is'. Also, the (Is)Sentence rules will consume one (1) ID token and will not use the MultiWordID, so unless your declarations have names consisting of just 1 ID, the references will not be resolved.
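
To make the greedy behaviour concrete, here is a small Python sketch. It is not Xtext or ANTLR code: the tokenizer, the keyword set, and the function names are assumptions made up for illustration, and whitespace is simply dropped rather than treated as a WS token. It only mimics how a rule shaped like MultiWordID keeps consuming ID tokens until it hits a keyword.

```python
import re

# Keywords of the example grammar in this thread (assumed set);
# every other word-like token is treated as an ID.
KEYWORDS = {".", "a", "an", "is", "nounphrase", "verbphrase", "passive"}


def tokenize(text):
    """Split input into word tokens and '.' tokens, dropping whitespace."""
    return re.findall(r"[A-Za-z_]\w*|\.", text)


def greedy_multi_word_id(tokens, start=0):
    """Mimic MultiWordID: consume IDs until the first keyword token.

    Returns the consumed phrase and the index of the next token.
    """
    pos = start
    while pos < len(tokens) and tokens[pos] not in KEYWORDS:
        pos += 1
    return " ".join(tokens[start:pos]), pos


tokens = tokenize("Thales of Miletus lives on an island.")
phrase, next_pos = greedy_multi_word_id(tokens)
print(phrase)            # Thales of Miletus lives on
print(tokens[next_pos])  # an
```

The phrase it returns is the same over-long string that shows up in the "Couldn't resolve reference" message from the first post: the declared names are never consulted while the tokens are being consumed.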

Note that Xtext uses ANTLR, an LL(*) parser generator that produces predictive parsers - meaning they are essentially greedy, and that's totally different from a true backtracking parser.
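
For contrast, a backtracking matcher of the kind expected from a Prolog background would try each declared phrase in turn and abandon a choice when the rest of the sentence fails. The Python sketch below only illustrates that idea. The names match_phrase and match_sentence and the phrase sets are hypothetical, and this is not something Xtext generates.

```python
def match_phrase(words, pos, phrases):
    """Yield (end position, phrase) for every declared phrase matching at pos."""
    for phrase in phrases:
        parts = phrase.split()
        if words[pos:pos + len(parts)] == parts:
            yield pos + len(parts), phrase


def match_sentence(text, nouns, verbs):
    """Parse 'subject verb (a|an) object.' by trying all declared phrases.

    Returns (subject, verb, object), or None if no segmentation fits.
    """
    if not text.endswith("."):
        return None
    words = text[:-1].split()
    for p1, subject in match_phrase(words, 0, nouns):
        for p2, verb in match_phrase(words, p1, verbs):
            # The article is a keyword, not a declared phrase.
            if p2 < len(words) and words[p2] in ("a", "an"):
                for p3, obj in match_phrase(words, p2 + 1, nouns):
                    if p3 == len(words):
                        return subject, verb, obj
    return None


# "Thales" alone is also declared, so one candidate subject must be abandoned.
nouns = {"Thales", "Thales of Miletus", "island"}
verbs = {"lives on"}
print(match_sentence("Thales of Miletus lives on an island.", nouns, verbs))
# ('Thales of Miletus', 'lives on', 'island')
```

The nested loops are what give the backtracking: if "Thales" is tried as the subject, no verb phrase matches at "of", so the search moves on to the next candidate instead of reporting an error.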