How to convert STRING terminal to ecore::EInt when parse XML using xtext grammar? [message #1776902] |
Wed, 22 November 2017 10:57  |
Eclipse User |
I am experimenting with using Xtext to parse XML, because I want to represent the AST as XML instead of plain text.
However, I have run into a problem with STRING terminal conversion.
For instance, I want to parse this XML node:
<MyEnum name = "TestEnum">
<MyEnumLiteral name = "Unknown" value = "-1" />
<MyEnumLiteral name = "First" value = "0"/>
<MyEnumLiteral name = "Second" value = "1"/>
</MyEnum>
I expect the value attribute to be an int.
The grammar is:
MyEnum:
'<MyEnum' 'name' '=' name=STRING '>'
literals += MyEnumLiteral*
'</MyEnum>'
;
MyEnumLiteral:
'<MyEnumLiteral' 'name' '=' name=STRING ('value' '=' value=STRING)? '/>'
;
After running the MWE2 workflow, the generated MyEnumLiteral interface contains two methods whose types are not what I expected:
String getValue(); // <---- It is String, but I expected int...
void setValue(String value);
So, how do I specify the underlying data type in the grammar?
When parsing the STRING terminal, can it be converted automatically to ecore::EInt? (If the conversion fails, it is fine for the entire parse to fail and report an error.)
I hope Xtext provides a feature like {ecore::EInt}, for instance:
MyEnumLiteral:
'<MyEnumLiteral' 'name' '=' name=STRING ('value' '=' value={ecore::EInt} STRING)? '/>'
Thanks very much.
[Updated on: Wed, 22 November 2017 11:14] by Moderator
Re: How to convert STRING terminal to ecore::EInt when parse XML using xtext grammar? [message #1776903 is a reply to message #1776902] |
Wed, 22 November 2017 11:09   |
Eclipse User |
Is value=STRING there just for the syntax? If not, change it to value=INT.
If it is, change the grammar to:
import "http://www.eclipse.org/emf/2002/Ecore" as ecore
Model:
value=STRING_THAT_IS_ACTUALLY_AN_INT
;
STRING_THAT_IS_ACTUALLY_AN_INT returns ecore::EInt:
STRING
;
and implement a value converter
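// new class (Java)
import org.eclipse.xtext.common.services.DefaultTerminalConverters;
import org.eclipse.xtext.conversion.IValueConverter;
import org.eclipse.xtext.conversion.ValueConverter;
import org.eclipse.xtext.conversion.ValueConverterException;
import org.eclipse.xtext.nodemodel.INode;
import com.google.inject.Inject;

public class MyVC extends DefaultTerminalConverters {
    @Inject
    private MyINTValueConverter myINTValueConverter;

    // Registers the converter for the data type rule of the same name.
    @ValueConverter(rule = "STRING_THAT_IS_ACTUALLY_AN_INT")
    public IValueConverter<Integer> STRING_THAT_IS_ACTUALLY_AN_INT() {
        return myINTValueConverter;
    }

    // TODO make this better
    public static class MyINTValueConverter implements IValueConverter<Integer> {
        // Parsing direction: quoted text from the document -> Integer in the model.
        @Override
        public Integer toValue(String string, INode node) throws ValueConverterException {
            if (string == null) {
                throw new ValueConverterException("Couldn't convert '" + string + "' to an int value.", node, null);
            }
            string = string.trim();
            // The shortest valid input is a quote, one digit and a quote.
            if (string.length() < 3) {
                throw new ValueConverterException("Couldn't convert '" + string + "' to an int value.", node, null);
            }
            try {
                // Strip the surrounding quotes before parsing the number.
                int intValue = Integer.parseInt(string.substring(1, string.length() - 1), 10);
                return Integer.valueOf(intValue);
            } catch (NumberFormatException e) {
                throw new ValueConverterException("Couldn't convert '" + string + "' to an int value.", node, e);
            }
        }

        // Serialization direction: Integer in the model -> quoted text.
        @Override
        public String toString(Integer value) throws ValueConverterException {
            return "\"" + value + "\"";
        }
    }
}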
// adapt the runtime module (Xtend)
class MyDslRuntimeModule extends AbstractMyDslRuntimeModule {
    override bindIValueConverterService() {
        MyVC
    }
}
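To try the conversion logic without an Xtext project, here is a minimal plain-Java sketch of the same quote-stripping and parsing steps. The class name QuotedIntSketch and the method names are made up for illustration, and IllegalArgumentException stands in for ValueConverterException; this is not part of the Xtext API.

```java
// Plain-Java sketch (no Xtext dependency) of the conversion done by the value converter.
// QuotedIntSketch, toValue and toLiteral are hypothetical names used only for this example.
final class QuotedIntSketch {

    /** Converts a quoted literal such as "-1" (including the quotes) to an int. */
    static int toValue(String literal) {
        if (literal == null) {
            throw new IllegalArgumentException("Couldn't convert null to an int value.");
        }
        String trimmed = literal.trim();
        // The shortest valid input is a quote, one digit and a quote.
        if (trimmed.length() < 3) {
            throw new IllegalArgumentException("Couldn't convert '" + trimmed + "' to an int value.");
        }
        try {
            // Drop the first and last character (the quotes) and parse the rest as base 10.
            return Integer.parseInt(trimmed.substring(1, trimmed.length() - 1), 10);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Couldn't convert '" + trimmed + "' to an int value.", e);
        }
    }

    /** Serializes an int back to its quoted form. */
    static String toLiteral(int value) {
        return "\"" + value + "\"";
    }

    public static void main(String[] args) {
        System.out.println(toValue("\"-1\""));
        System.out.println(toLiteral(1));
    }
}
```

A non-numeric attribute such as value = "abc" ends in the exception branch, which corresponds to the "parsing fails and reports an error" behaviour the question asks for.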
Re: How to convert STRING terminal to ecore::EInt when parse XML using xtext grammar? [message #1777395 is a reply to message #1777291] |
Wed, 29 November 2017 02:57   |
Eclipse User |
Jan Koehnlein wrote on Tue, 28 November 2017 08:35:
Why not use EMF directly to parse XML?
I know that Xtext can automatically generate the Ecore model (model/generated/MyDsl.ecore) for an Xtext grammar.
We can load an instance of that Ecore model by registering the model or by embedding a schemaLocation in the instance file.
I use Xtext to parse XML for two main reasons:
(1) I want to be able to edit XML (as DSL code) in the Xtext-generated IDE. This makes the most of Xtext's validation and content assist.
Some XML editors (e.g. oXygen XML Editor) also support content assist, but not as well as the Xtext IDE (some features, such as scoping, may need an extra plugin).
(2) This experiment is only for my personal interest, not a company project.
I see the following correspondences:
Xtext grammar == XML Schema / RelaxNG (Compact version)
Xtext validation == XML Schematron
Xtext IDE == XML editor (e.g. oXygen XML Editor)
Xtext generator/interpreter == XSLT
editing DSL code == editing an XML instance as DSL AST nodes
So what I want to do is write two pieces of code, one for Xtext and one for XML, and compare their differences (maybe a third for Lisp expressions). This is all for learning purposes.
In this case, the generated MyDsl.ecore or MyDsl.xsd (a .ecore can be converted to .xsd) is not very useful to me.
Also, the generated MyDsl.xsd depends strongly on Ecore, so it is not pure XSD. (As I mentioned before, I consider XSD a type-2 grammar.)
Nor will it automatically generate a Schematron validation file for me.
[Updated on: Wed, 29 November 2017 10:28] by Moderator
Re: How to convert STRING terminal to ecore::EInt when parse XML using xtext grammar? [message #1777401 is a reply to message #1777306] |
Wed, 29 November 2017 03:15   |
Eclipse User |
Ed Willink wrote on Tue, 28 November 2017 09:57:
Hi
Indeed EMF will do the parser for free. Re-implementing XML parsing is not particularly easy.
However autogenerating a parser from its Ecore metamodel is a very promising research direction that should enable standard EMF models to be loaded significantly (perhaps 3-fold) quicker and facilitate non-standard in memory representations that can be very beneficial for model to model transformation. See https://bugs.eclipse.org/bugs/show_bug.cgi?id=507391
Hi Willink,
In your context, what is the Ecore metamodel, and what is a parser generated from its Ecore metamodel?
In general, we define an Ecore model (an instance of the Ecore metamodel), not the Ecore metamodel itself.
The Ecore metamodel consists of classes such as EClass, EEnum, EAttribute, EReference, ...
Even in a DSL, MyClass, MyEnum, MyField, ... all belong to an Ecore model (an instance of the Ecore metamodel).
[Updated on: Wed, 29 November 2017 03:16] by Moderator
Re: How to convert STRING terminal to ecore::EInt when parse XML using xtext grammar? [message #1777412 is a reply to message #1777407] |
Wed, 29 November 2017 04:50   |
Eclipse User |
Ed Willink wrote on Wed, 29 November 2017 09:21:
Hi
Ecore is its own metamodel.
An autogenerated parser for Ecore has keywords such as "<xml", "<ePackage", "xmi" which a tokenizer converts direct to integers that a large state table can dispatch in accordance with the LALR analysis. The parser can therefore work using substrings, integers, states and tables using compile-time knowledge. How much speed advantage/size penalty this gives over SAX that builds an intermediate set of objects is something of a matter of guesswork and personal belief until a prototype has been built and instrumented.
Regards
Ed Willink
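The idea in the quoted post (keywords turned into integers, then dispatched through a state table) can be illustrated with a toy example. This is only a sketch under strong assumptions: it uses a plain finite-state table rather than a real LALR table, it covers just the name attribute of the MyEnumLiteral element from the first post, it assumes tokens are separated by whitespace, and every name in it (KeywordTableSketch, the token codes, the table layout) is made up for illustration. It is not taken from the parser discussed in the thread.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration: keywords become small integers, and a table drives the recognizer.
final class KeywordTableSketch {
    // Hypothetical integer token codes; the names are illustrative only.
    static final int OPEN = 0, NAME = 1, EQ = 2, STRING = 3, END = 4, UNKNOWN = 5;

    static final Map<String, Integer> KEYWORDS = new HashMap<>();
    static {
        KEYWORDS.put("<MyEnumLiteral", OPEN);
        KEYWORDS.put("name", NAME);
        KEYWORDS.put("=", EQ);
        KEYWORDS.put("/>", END);
    }

    static final int ACCEPT = 5;

    // TABLE[state][token] is the next state, or -1 for a syntax error.
    static final int[][] TABLE = {
        // OPEN NAME  EQ  STRING END UNKNOWN
        {   1,  -1,  -1,  -1,  -1,  -1 }, // state 0: expect '<MyEnumLiteral'
        {  -1,   2,  -1,  -1,  -1,  -1 }, // state 1: expect 'name'
        {  -1,  -1,   3,  -1,  -1,  -1 }, // state 2: expect '='
        {  -1,  -1,  -1,   4,  -1,  -1 }, // state 3: expect a quoted string
        {  -1,  -1,  -1,  -1,   5,  -1 }, // state 4: expect '/>'
        {  -1,  -1,  -1,  -1,  -1,  -1 }, // state 5: accepted, no further input allowed
    };

    /** Splits on whitespace and maps each piece to an integer token code. */
    static int[] tokenize(String input) {
        String[] parts = input.trim().split("\\s+");
        int[] codes = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            String part = parts[i];
            Integer keyword = KEYWORDS.get(part);
            if (keyword != null) {
                codes[i] = keyword;
            } else if (part.length() >= 2 && part.startsWith("\"") && part.endsWith("\"")) {
                codes[i] = STRING;
            } else {
                codes[i] = UNKNOWN;
            }
        }
        return codes;
    }

    /** Runs the token codes through the table; true only if the accept state is reached. */
    static boolean accepts(int[] tokens) {
        int state = 0;
        for (int token : tokens) {
            state = TABLE[state][token];
            if (state < 0) {
                return false;
            }
        }
        return state == ACCEPT;
    }

    public static void main(String[] args) {
        System.out.println(accepts(tokenize("<MyEnumLiteral name = \"First\" />")));
    }
}
```

Once tokenized, the recognizer only compares integers and indexes arrays, which is the property the quoted post relies on for speed.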
I understand what you said.
That is great!
Is the autogenerated parser generated by Xtext?
If it is, we can use the Xtext IDE to edit Ecore model XML directly and make the most of Xtext's validation and content assist.
I am very interested. Could you create a ZIP for me to learn from?
serioadamo97@gmail.com
Thanks very much.
[Updated on: Wed, 29 November 2017 04:52] by Moderator
Re: How to convert STRING terminal to ecore::EInt when parse XML using xtext grammar? [message #1777930 is a reply to message #1777432] |
Wed, 06 December 2017 05:13  |
Eclipse User |
Hi
You motivated me to have a further play. A mostly working alternative Ecore parser can be found in the ewillink/528050 branch of the Eclipse OCL GIT.
It appears that a non-standard metamodel-driven LALR parser can load an Ecore file perhaps two times faster than the standard EMF SAXParser. There are opportunities for further improvement but I doubt that a further factor of two is available.
BUT much more significant is the comparison between cold parsing (first use with a new JVM) and warm (after many hundreds of repeated usages). For both approaches, a cold parse is at least 50 times slower than warm.
Since the standard parser has diverse usage, it is likely to be warmed up anyway. A non-standard parser is therefore only worth consideration for huge files or many files with the same metamodel or a non-standard memory representation.
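For readers who want to observe the cold versus warm effect themselves, here is a rough sketch using only the JDK's SAX parser on a small in-memory document. It is not the benchmark behind the figures above: the class name ColdWarmSketch, the sample XML and the warm-up count are assumptions, the "cold" figure also includes parser factory lookup, and it is only truly cold if nothing else in the JVM has used the XML parser yet.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Rough timing sketch: first parse in the JVM versus a parse after many repetitions.
final class ColdWarmSketch {
    // Small sample document, borrowed from the first post of this thread.
    static final String XML =
        "<MyEnum name=\"TestEnum\">"
        + "<MyEnumLiteral name=\"Unknown\" value=\"-1\"/>"
        + "<MyEnumLiteral name=\"First\" value=\"0\"/>"
        + "</MyEnum>";

    /** Parses the sample once with a no-op handler and returns the elapsed nanoseconds. */
    static long timeOneParse() {
        byte[] bytes = XML.getBytes(StandardCharsets.UTF_8);
        long start = System.nanoTime();
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new ByteArrayInputStream(bytes), new DefaultHandler());
        } catch (Exception e) {
            throw new IllegalStateException("Parsing the sample document failed", e);
        }
        return System.nanoTime() - start;
    }

    /** Returns {cold, warm}: the first parse, then one parse after the given number of warm-ups. */
    static long[] measure(int warmups) {
        long cold = timeOneParse();
        for (int i = 0; i < warmups; i++) {
            timeOneParse();
        }
        long warm = timeOneParse();
        return new long[] { cold, warm };
    }

    public static void main(String[] args) {
        long[] times = measure(500);
        System.out.println("cold ns: " + times[0] + ", warm ns: " + times[1]);
    }
}
```

Single measurements like these are noisy; a benchmarking harness would be needed for numbers worth quoting.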
Regards
Ed Willink