[TCS] Inverted Coma symbol [message #378437] |
Fri, 06 July 2007 14:19 |
Eclipse User |
|
|
|
Originally posted by: remi.viaux.mbda-systems.com
Hi,
I am having trouble inserting the inverted comma symbol (i.e. " ) into the
symbols table of my .tcs file.
I tried to handle it with the '\' escape character, like this:
template include :
"#""include" "\""
;
...
symbols {
...
invcoma "\""
}
But I got this error message from ANTLR:
[antlr]./file.g:290.178 : unexpected char : '/'
[antlr]TokenStreamException : unexpected char : '-'
Does anybody have a solution for defining the inverted comma as a symbol?
Thanks,
Remi Viaux
PS: I'm just a beginner in TCS, so I hope my question isn't too naïve.
|
|
| |
Re: [TCS] Inverted Coma symbol [message #378448 is a reply to message #378442] |
Mon, 09 July 2007 08:34 |
Eclipse User |
|
|
|
Originally posted by: remi.viaux.mbda-systems.com
Hello Frederic,
Yes, of course " is used to delimit strings and keywords, but I want it
to take the role of a simple character.
How can I do that?
In the .g file I can see that the symbol is correctly interpreted:
include returns ....
SHARP "include" INVCOMA temp=identifier {ei.set(ret,"machin",temp);}
INVCOMA)
...
...
INVCOMA
: """
;
Thanks,
Remi Viaux
|
|
|
Re: [TCS] Inverted Coma symbol [message #378451 is a reply to message #378448] |
Mon, 09 July 2007 09:14 |
Eclipse User |
|
|
|
Originally posted by: quentin.glineur.obeo.fr
Hi Remi,
It seems that you are trying to redefine something that TCS already does
another way.
One of the first things to do when building a recognizer is to define what
is an identifier and what is a literal.
Typically, tokens after an include directive (enclosed by '"') are
string literals.
Thus, your lexer part should contain a rule to tokenize string literals:
STRING
: '\\"'!
( '\\n' {newline();}
| ~('\\"'|'\\n')
)*
'\\"'!
;
Then your syntax part should contain a primitive template:
primitiveTemplate stringSymbol for String using STRING:
value = "%token%",
serializer="'\"' + %value%.toCString() + '\"'";
and a rule looking like this one :
template includeDirective :
"#" "include" include_file{as=stringSymbol}
;
Regards,
Quentin Glineur
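As a rough illustration of what the STRING rule above does, here is a minimal Python sketch. It assumes the ANTLR 2 convention that a `!` suffix in a lexer rule leaves the matched character out of the token text; the function name `scan_string` and its return shape are hypothetical, chosen only for this example.

```python
def scan_string(text, pos=0):
    """Scan a double-quoted literal starting at text[pos].

    Returns (token_text, next_pos, newlines). The two quotes are consumed
    but left out of token_text, mirroring the `!` suffix in the rule.
    """
    if pos >= len(text) or text[pos] != '"':
        raise ValueError(f"expected a double quote at position {pos}")
    end = text.find('"', pos + 1)  # everything up to the next quote is the body
    if end == -1:
        raise ValueError("unterminated string literal")
    body = text[pos + 1:end]
    # The rule calls newline() for each line break; here we just count them.
    return body, end + 1, body.count("\n")


if __name__ == "__main__":
    print(scan_string('"local.h" rest'))
```

The point of the sketch is that the quotes act as delimiters recognized by the lexer, so the parser only ever sees the text between them.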
|
|
|
Re: [TCS] Inverted Coma symbol [message #378504 is a reply to message #378451] |
Tue, 17 July 2007 11:48 |
Eclipse User |
|
|
|
Originally posted by: remi.viaux.mbda-systems.com
Hi Quentin,
I have tried to rewrite this new primitive template and it seems to
recognize "___" as stringSymbol.
But I don't think this solves my problem, because I don't want to
recognize strings in the code (that's what the lexer does, I think); I want
to use " as a keyword, the way other symbols such as '#' already can be.
The purpose is to distinguish between:
' include "local" ' and ' include <global> '
Thanks,
Remi Viaux
|
|
|
Re: [TCS] Inverted Coma symbol [message #378513 is a reply to message #378504] |
Wed, 18 July 2007 10:08 |
Eclipse User |
|
|
|
Originally posted by: quentin.glineur.obeo.fr
Hi,
Remi Viaux a
|
|
|
Re: [TCS] Inverted Coma symbol [message #378524 is a reply to message #378513] |
Mon, 23 July 2007 11:56 |
Eclipse User |
|
|
|
Originally posted by: remi.viaux.mbda-systems.com
Hi,
It seems impossible to redefine " as a keyword. I tried many solutions
with backslashes, but none of them works ...
I'm now trying to define a primitive template to recognize strings like
this: "< file.h >". But I don't know how to write it ...
I don't know whether I have to change the STRING lexer, write another lexer,
or just create a primitive template with a serializer.
Can someone help me with the ANTLR syntax?
Thanks,
Remi Viaux
Quentin Glineur wrote:
> Hi,
> Remi Viaux wrote:
>> Hi Quentin,
>>
>> I have tried to rewrite this new primitive template and it seems to
>> recognize "___" as stringSymbol.
> You are right!
>> But I don't think this solves my problem, because I don't want to
>> recognize strings in the code (that's what the lexer does, I think); I
>> want to use " as a keyword, the way other symbols such as '#' already can be.
> If that is what you REALLY want, the way to put it in the symbols list is
> to use three backslashes (this is due to the way special escape characters
> are rendered).
>> The purpose is to distinguish between:
>> ' include "local" ' and ' include <global> '
>>
> Maybe you could define another primitiveTemplate dedicated to
> <something> token recognition.
> Then make your class for the "include" statement abstract and define two
> classes inheriting from it, each with its own
> textual form.
> KM3 :
> abstract class includeStmt {
> attribute includedFile : String;
> }
> class localIncludeStmt extends includeStmt {
> }
> class globalIncludeStmt extends includeStmt{
> }
> TCS :
> template includeStmt abstract;
> template localIncludeStmt :
> "include" includedFile {as=stringLiteral}
> ;
> template globalIncludeStmt :
> "include" includedFile {as=chevronLiteral}
> ;
> I recommend this because I have a hunch that defining a token that
> collides with others could lead to problems.
> HTH,
> Quentin GLINEUR
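The design suggested above (one abstract include statement, two concrete subclasses that differ only in their textual form) can be sketched in Python as follows. This is only an illustration of the idea, not TCS or KM3 output: the class names follow the KM3 snippet, while `to_text` and `parse_include` are hypothetical helpers invented for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class IncludeStmt(ABC):
    """Abstract include statement holding the included file name."""
    included_file: str

    @abstractmethod
    def to_text(self) -> str:
        """Serialize the statement back to its textual form."""


class LocalIncludeStmt(IncludeStmt):
    def to_text(self) -> str:
        return f'include "{self.included_file}"'


class GlobalIncludeStmt(IncludeStmt):
    def to_text(self) -> str:
        return f"include <{self.included_file}>"


def parse_include(line: str) -> IncludeStmt:
    """Pick the subclass from the delimiters around the file name."""
    keyword, _, rest = line.strip().partition(" ")
    rest = rest.strip()
    if keyword != "include" or len(rest) < 2:
        raise ValueError(f"not an include statement: {line!r}")
    if rest[0] == '"' and rest[-1] == '"':
        return LocalIncludeStmt(rest[1:-1])
    if rest[0] == "<" and rest[-1] == ">":
        return GlobalIncludeStmt(rest[1:-1])
    raise ValueError(f"unrecognized include form: {line!r}")


if __name__ == "__main__":
    for src in ('include "local"', "include <global>"):
        stmt = parse_include(src)
        print(type(stmt).__name__, "->", stmt.to_text())
```

With this split, the delimiter is never a standalone keyword: it is part of how each subclass reads and writes its single attribute.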
|
|
|
Re: [TCS] Inverted Coma symbol [message #378526 is a reply to message #378524] |
Mon, 23 July 2007 13:31 |
Eclipse User |
|
|
|
Originally posted by: quentin.glineur.obeo.fr
Hi,
Remi Viaux a
|
|
|
Re: [TCS] Inverted Coma symbol [message #378528 is a reply to message #378526] |
Wed, 25 July 2007 12:18 |
Eclipse User |
|
|
|
Originally posted by: remi.viaux.mbda-systems.com
Hi,
Quentin Glineur wrote:
> Hi,
> Remi Viaux wrote:
>> Hi,
>>
>> It seems impossible to redefine " as a keyword. I tried many solutions
>> with backslashes, but none of them works ...
> Indeed, the way ANTLR works makes this tricky: the lexer tokenizes
> the character stream before passing it to the parser. So if it finds
> double quotes, it will look in its lexer rules for how to tokenize them,
> *independently* of what you have defined in your parser rules!
Thanks for the explanation of how ANTLR works, it helps me a lot!
> Think of your parser rules (your templates) as mere filters. When the token
> stream passes through a filter, it is handled the way defined by the
> rule (in our case, creating the corresponding model element).
>> I'm now trying to define a primitive template to recognize strings
>> like this: "< file.h >". But I don't know how to write it ... I don't
>> know whether I have to change the STRING lexer, write another lexer, or
>> just create a primitive template with a serializer.
>> Can someone help me with the ANTLR syntax?
> Try this in the lexer part:
> protected CHEVRON_REGEX
> : '<'
> ( ~('>'))*
> '>'
> ;
I tried this lexer rule, but it doesn't seem to work. At injection
I got the error: "expecting CHEVRON_REGEX found '<'".
> or even better, in the TCS part:
> token CHEVRON_PATTERN : multiLine(start = "<", end = ">");
I don't understand what this solution is. Is it TCS or ANTLR syntax?
Where should I put it in my TCS part? In the template, or in a primitive
template?
Regards,
Remi Viaux
> Regards,
> Quentin GLINEUR
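The lexer-before-parser behaviour described in this exchange can be illustrated with a small Python tokenizer. This is a sketch under assumptions, not how ANTLR is implemented: the token names and patterns in `TOKEN_SPEC` are hypothetical stand-ins for the lexer rules of a grammar.

```python
import re

# Hypothetical lexer rules; the first alternative that matches wins.
TOKEN_SPEC = [
    ("STRING", r'"[^"\n]*"'),
    ("SHARP", r"#"),
    ("NAME", r"[A-Za-z_][A-Za-z_0-9.]*"),
    ("LT", r"<"),
    ("GT", r">"),
    ("WS", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Turn the whole character stream into tokens before any parsing."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected char: {text[pos]!r}")
        if match.lastgroup != "WS":  # whitespace is skipped
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    print(tokenize('#include "local.h"'))
    print(tokenize("#include <stdio.h>"))
```

In the first input the double quotes never reach the parser as separate tokens, because the STRING rule has already consumed them. That is why a parser-level keyword for " cannot match, whatever escaping is used.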
|
|
|
Re: [TCS] Inverted Coma symbol [message #378530 is a reply to message #378528] |
Thu, 26 July 2007 07:58 |
Eclipse User |
|
|
|
Originally posted by: quentin.glineur.obeo.fr
Hello,
Remi Viaux a
|
|
|
Re: [TCS] Inverted Coma symbol [message #378532 is a reply to message #378530] |
Thu, 26 July 2007 13:43 |
Eclipse User |
|
|
|
Originally posted by: remi.viaux.mbda-systems.com
Hi Quentin,
I almost succeeded!
The problem with the direct CHEVRON_REGEX approach was the "protected"
keyword.
After removing it, it works properly.
This is what I have for now:
_______________________________________________________
TCS
---------------
primitiveTemplate chevronExpression for string using CHEVRON_REGEX
value = "%token%";
Template StdIncludeExpression
"#""include"
file {as=chevronExpression}
Lexer
----------------
CHEVRON_REGEX
: '<'
( ~('>'))*
'>'
;
_________________________________________________________
Some problems remain:
- I think there might be ambiguity problems with the greater-than and
less-than symbols in other expressions (ANTLR raises some warnings), but for
now I don't care.
- In the file token I get the whole CHEVRON expression, i.e. a string
with "<" at the beginning and ">" at the end,
whereas I want just the string between the chevrons.
- The final problem is about the other template, which uses the STRING lexer.
This lexer handles text between ' and not between ". So I am trying to
modify it to recognize things like "file.h", but for now I haven't
succeeded.
Do you have any solution for recovering the string in chevronExpression
without the '<' and '>' symbols?
Regards,
Remi Viaux
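The result being asked for (the text between the chevrons, without the delimiters) can be shown with a short Python sketch. It only illustrates the intended transformation of the token text; `chevron_value` is a hypothetical helper, not part of TCS or ANTLR. Within the grammar itself, one candidate worth trying is the `!` suffix that the STRING rule earlier in this thread applies to its quotes, on the assumption that it behaves the same way for '<' and '>'.

```python
import re

# Same shape as the CHEVRON_REGEX rule: '<', anything but '>', then '>'.
CHEVRON = re.compile(r"<([^>]*)>")


def chevron_value(token_text):
    """Return the text between the chevrons of a whole CHEVRON token."""
    match = CHEVRON.fullmatch(token_text)
    if match is None:
        raise ValueError(f"not a chevron expression: {token_text!r}")
    return match.group(1)  # group 1 excludes the '<' and '>' delimiters


if __name__ == "__main__":
    print(chevron_value("<stdio.h>"))
```

Surrounding spaces, as in "< file.h >", are kept by this sketch; whether they should be trimmed is a separate decision.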
|
|
|
Re: [TCS] Inverted Coma symbol [message #378534 is a reply to message #378532] |
Thu, 26 July 2007 15:36 |
Eclipse User |
|
|
|
Originally posted by: quentin.glineur.obeo.fr
Hi Remi,
Remi Viaux a
|
|
| |