Re: Exception when testing language [message #801536 is a reply to message #801447] |
Sat, 18 February 2012 15:04 |
Henrik Lindberg Messages: 2509 Registered: July 2009 |
Senior Member |
On 2012-18-02 12:51, Vlad Dumitrescu wrote:
> Hello Henrik and thanks for your heroic attempt to make sense of my
> scribblings! I probably tried too hard to simplify the context.
>
> There are two issues in my mind:
> 1. if the grammar is nonsense, couldn't the compiler give a warning or
> error? It feels like the root cause for a NPE in the serializer could
> have been detected by the compiler.
>
An NPE is never good; the serializer should have reported the problem
in some other way. Note that there are valid grammars that are not
serializable - they still serve a purpose but are essentially one way
(they can only be used to parse). The serialization validation should
have caught the problem IMO.
> 2. what would be the right way to write the grammar rules?
>
Depends on what you really want... which is what I am trying to figure
out :)
> There are four expressions that I need to parse here:
> * ExprMax
> * ExprMax(arg) which is a function call
> * ExprMax#type which is a typed expression
> * #type which is a type constructor
>
> The way I started with it was something like
>
>
> Expression: FunCall | TypedExpr;
> FunCall: ExprMax ('(' ')')?;
> TypedExpr: ExprMax? => '#' ID;
>
>
This is ambiguous since a FunCall can consist of only an ExprMax, and it
is impossible to detect whether a following '#' ID belongs to the ExprMax
or is free standing. Using 'a' for a literal: given "a#b", is this a
FunCall followed by a TypedExpr, or a single TypedExpr?
Try something like:
Expression : TypedExpression ;
TypedExpression
: FunctionCall
({TypedExpression.left = current}
'#' type = ID)?
;
FunctionCall
: PrimaryExpression
({FunctionCall.left = current}
'(' args += Expression* ')')?
;
PrimaryExpression
: Literal
| ParenthesizedExpression
;
Literal : value = ID;
ParenthesizedExpression : '(' Expression ')' ;
This allows any primary expression to be used as LHS in FunctionCall,
and anything can be typed. If you need to disallow certain input - say
"(a)#b" simply add validation to TypedExpression and check the class of
the LHS. Likewise, if you do not allow "(a)()", simply check LHS of
FunctionCall for acceptable class.
a => (Literal a)
a#b => (TypedExpression (Literal a) b)
a() => (FunctionCall (Literal a))
a()#b => (TypedExpression (FunctionCall (Literal a)) b)
etc.
Did you want #ID to mean "type creation expression"? You could do this:
PrimaryExpression
: Literal
| ParenthesizedExpression
| TypeDeclaration
;
TypeDeclaration
: '#' type = ID
;
You also need to have a separator (e.g. ',') between the arguments in
the function call as Expression* becomes ambiguous.
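A sketch of how the separator could be added to FunctionCall, assuming
',' is the separator you want (the rule and feature names are the same
as in the grammar above; an empty argument list is still allowed):

```xtext
FunctionCall
  : PrimaryExpression
    ({FunctionCall.left = current}
     // zero arguments, or one argument followed by ',' argument pairs
     '(' (args += Expression (',' args += Expression)*)? ')')?
  ;
```

With this, "a(b, c)" puts two entries in args, while "a(b c)" is a
syntax error instead of an ambiguity.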
This could work, but it makes it possible to express things that seem
nonsensical - e.g. passing a TypeDeclaration as an argument, calling it,
etc. That is, these expressions would be valid:
#b
#b()
#b#c
a(#b, #c)
....
You probably want to have it at a higher level in the grammar.
Something like this perhaps:
Statements :
statements += Statement+
;
Statement : Expression '.' | TypeDeclaration ;
Note that a separator between expressions is required.
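Putting the pieces together, a complete grammar file might look roughly
like the sketch below. It is untested, and several things in it are
assumptions rather than anything from your project: the grammar name
and nsURI are placeholders, the "returns Expression" clauses are added
so that all expression rules share one supertype, and ',' is assumed as
the argument separator.

```xtext
// Hypothetical grammar name and nsURI - replace with your own.
grammar org.example.expr.Expr with org.eclipse.xtext.common.Terminals

generate expr "http://www.example.org/expr/Expr"

Statements
  : statements += Statement+
  ;

// An expression statement ends with '.'; a type declaration stands alone.
Statement
  : Expression '.'
  | TypeDeclaration
  ;

Expression
  : TypedExpression
  ;

TypedExpression returns Expression
  : FunctionCall
    ({TypedExpression.left = current}
     '#' type = ID)?
  ;

FunctionCall returns Expression
  : PrimaryExpression
    ({FunctionCall.left = current}
     '(' (args += Expression (',' args += Expression)*)? ')')?
  ;

PrimaryExpression returns Expression
  : Literal
  | ParenthesizedExpression
  ;

Literal
  : value = ID
  ;

ParenthesizedExpression returns Expression
  : '(' Expression ')'
  ;

// Only reachable from Statement, so "#b()" and "a(#b)" are rejected.
TypeDeclaration
  : '#' type = ID
  ;
```

Because TypeDeclaration is no longer a PrimaryExpression, a leading '#'
can only start a statement, and a '#' after an expression can only be
the typed-expression suffix.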
> but that is ambiguous on ExprMax. Maybe it would be enough to set "k=2"
> on the ANTLR grammar? Problem is that I get syntax errors when I try it
> - "backtrack=true" works, but "k=2" gives "no viable alternative at
> input '2'"...
>
These are the kinds of things that are getting you into trouble - you
have to start with a grammar that is not ambiguous. When you turn on
backtracking without knowing why, you are basically letting the parser
guess for you. Setting k to a number fixes the look-ahead to that many
tokens (2 in your case); it is better to leave it at its default.
> best regards,
> Vlad
>
Hope that helps you.
- henrik