How to change lexer to copy tokens from other files [message #1785730] Thu, 19 April 2018 05:58
Virag Purnam
Messages: 142
Registered: June 2014
Senior Member
Below is an example grammar, followed by an explanation of my problem scenario.

Grammar:
grammar com.hpe.nsdee.lexer.LexerDsl with org.eclipse.xtext.common.Terminals hidden ( COPY_STMT, WS )

generate lexerDsl "http://www.hpe.com/nsdee/lexer/LexerDsl"

Model:
	greetings+=Greeting*;
	
Greeting:
	'Hello' name=ID '!';


terminal COPY_STMT	: 'Copy' -> ';';


Model file (test1.lex):
Hello virag! 
Copy c:\temp\test2.lex;
Purnam!         
Hello newName! 


Contents of test2.lex, which is outside the workspace:
Hello anotherName!
Hello 


The parser reports an error at "Purnam!".
What I want to achieve is that, at "Copy c:\temp\test2.lex;", DslLexer returns the tokens from the test2.lex file. If DslLexer did that, the effective content would be:
Hello virag!
Hello anotherName!
Hello 
Purnam!         
Hello newName! 

In this case the parser would not report any error.
How can I achieve this?

I tried changing DslLexer directly with hard-coded values.
When the lexer reaches the Copy statement, I return a Hello token with the correct token type. That worked.
But I want this to be dynamic, and this is only an example grammar.
The actual grammar will have many more combinations.

Thanks in advance.

Thanks and regards,
Virag Purnam

[Updated on: Thu, 19 April 2018 06:00]


Re: How to change lexer to copy tokens from other files [message #1785733 is a reply to message #1785730] Thu, 19 April 2018 07:31
Ed Willink
Messages: 7655
Registered: July 2009
Senior Member
Hi

In https://www.eclipse.org/forums/index.php/mv/msg/1092639/1785018/#msg_1785018 you seemed to accept my recommendation that overriding org.eclipse.xtext.parser.IParser.createTokenStream provides considerable flexibility to mess around with the token sequence without affecting the parsing. I'm not sure why you are asking the same question again.

Regards

Ed Willink
Re: How to change lexer to copy tokens from other files [message #1785739 is a reply to message #1785733] Thu, 19 April 2018 08:20
Virag Purnam
Hi Mr. Ed Willink,

Thanks for your reply.
I tried the approach of overriding createTokenStream, but it did not fully work.
In my case I need to get tokens from another file, in the example from "test2.lex".
How can I get tokens from "test2.lex", which is outside the workspace?
c:\\temp\\test2.lex

I want to achieve all of this in the lexer, without touching the parser.
So I am seeking some help here.

Thanks in advance.
Best regards,
Virag Purnam
Re: How to change lexer to copy tokens from other files [message #1785745 is a reply to message #1785739] Thu, 19 April 2018 09:33
Ed Willink
Hi

"outside of workspace" is irrelevant. Either the source is readable or it isn't, in which case you are dead. Since you have an irregular import you have to recognize the import in order to trigger an irregular read and tokenization so that you can feed the tokens to the reference. Since you may have idiot users or complex usage patterns there may be a cyclic copy a into b and copy b into a. You need to cache the tokenizations for re-use and trap cyclic re-use.

Regards

Ed Willink
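
The advice above (recognize the copy, tokenize the referenced source, cache the result, trap cycles) can be sketched in plain Java. This is only an illustration under stated assumptions, not code from the thread: `CopyExpander` is a hypothetical name, the `sources` map stands in for reading files from disk, and splitting on whitespace stands in for the real lexer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: expands "Copy <path>;" statements at token level.
final class CopyExpander {
    // Assumption: in-memory stand-in for the file system (path -> content).
    private final Map<String, String> sources;
    // Fully expanded tokenizations, cached per path for re-use.
    private final Map<String, List<String>> cache = new HashMap<>();
    // Paths whose expansion is in progress; used to trap cyclic copies.
    private final Deque<String> inProgress = new ArrayDeque<>();

    CopyExpander(Map<String, String> sources) {
        this.sources = sources;
    }

    List<String> expand(String path) {
        List<String> cached = cache.get(path);
        if (cached != null) {
            return cached;
        }
        if (inProgress.contains(path)) {
            throw new IllegalStateException("Cyclic copy: " + inProgress + " -> " + path);
        }
        String content = sources.get(path);
        if (content == null) {
            throw new IllegalArgumentException("Unreadable source: " + path);
        }
        inProgress.push(path);
        try {
            List<String> result = new ArrayList<>();
            // Whitespace splitting is a simplification of real tokenization.
            String[] words = content.trim().split("\\s+");
            for (int i = 0; i < words.length; i++) {
                if (words[i].isEmpty()) {
                    continue; // empty source
                }
                if (words[i].equals("Copy") && i + 1 < words.length && words[i + 1].endsWith(";")) {
                    String target = words[++i];
                    // Splice the referenced tokens in place of the Copy statement.
                    result.addAll(expand(target.substring(0, target.length() - 1)));
                } else {
                    result.add(words[i]);
                }
            }
            List<String> frozen = Collections.unmodifiableList(result);
            cache.put(path, frozen);
            return frozen;
        } finally {
            inProgress.pop();
        }
    }

    public static void main(String[] args) {
        Map<String, String> sources = new HashMap<>();
        sources.put("test1.lex", "Hello virag! Copy test2.lex; Purnam! Hello newName!");
        sources.put("test2.lex", "Hello anotherName! Hello");
        // prints: Hello virag! Hello anotherName! Hello Purnam! Hello newName!
        System.out.println(String.join(" ", new CopyExpander(sources).expand("test1.lex")));
    }
}
```

With two sources that copy each other, `expand` throws an `IllegalStateException` instead of recursing forever.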
Re: How to change lexer to copy tokens from other files [message #1785795 is a reply to message #1785745] Fri, 20 April 2018 03:35
Virag Purnam
Hi,

Thanks for the reply.
In the example project you shared, the change to the runtime module (shown below) is confusing.
Can you please explain it?
And how can I write the same for my problem scenario?

@Override
public Class<? extends org.eclipse.xtext.parser.IParser> bindIParser() {
	return RetokenizingCompleteOCLParser.class;
}

public static class RetokenizingCompleteOCLParser extends CompleteOCLParser {
	@Override
	protected XtextTokenStream createTokenStream(TokenSource tokenSource) {
		return super.createTokenStream(new RetokenizingTokenSource(tokenSource, getTokenDefProvider().getTokenDefMap()));
	}
}



Thanks and regards,
Virag Purnam

[Updated on: Fri, 20 April 2018 03:36]
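
The runtime module snippet in the post above binds a parser subclass whose createTokenStream wraps the lexer's TokenSource in a second TokenSource before the token stream is built, so the parser only ever sees the wrapped sequence. A minimal sketch of that wrapping idea, written without ANTLR or Xtext on the classpath: `WordSource` is a hypothetical stand-in for `org.antlr.runtime.TokenSource`, and `Rewriting` is a hypothetical stand-in for a class like `RetokenizingTokenSource`, whose actual implementation is not shown in this thread.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

final class RetokenizingSketch {
    /** Hypothetical stand-in for a token source; null signals end of input. */
    interface WordSource {
        String next();
    }

    /** Wraps another source and replaces selected tokens by a list of tokens. */
    static final class Rewriting implements WordSource {
        private final WordSource delegate;
        // Returns the replacement tokens, or null to pass the token through.
        private final Function<String, List<String>> rewrite;
        private final Deque<String> pending = new ArrayDeque<>();

        Rewriting(WordSource delegate, Function<String, List<String>> rewrite) {
            this.delegate = delegate;
            this.rewrite = rewrite;
        }

        @Override
        public String next() {
            while (true) {
                if (!pending.isEmpty()) {
                    return pending.poll(); // drain injected tokens first
                }
                String token = delegate.next();
                if (token == null) {
                    return null;
                }
                List<String> replacement = rewrite.apply(token);
                if (replacement == null) {
                    return token;
                }
                pending.addAll(replacement); // an empty list drops the token
            }
        }
    }

    static WordSource of(String... words) {
        Iterator<String> it = Arrays.asList(words).iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    static List<String> drain(WordSource source) {
        List<String> out = new ArrayList<>();
        for (String t = source.next(); t != null; t = source.next()) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        WordSource lexer = of("Hello", "a", "!", "COPY", "Hello", "b", "!");
        WordSource wrapped = new Rewriting(lexer,
                t -> t.equals("COPY") ? Arrays.asList("Hello", "x", "!") : null);
        // prints: [Hello, a, !, Hello, x, !, Hello, b, !]
        System.out.println(drain(wrapped));
    }
}
```

The consumer of `wrapped` cannot tell which tokens came from the original source and which were injected, which is why this kind of wrapper leaves the parser itself untouched.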


Re: How to change lexer to copy tokens from other files [message #1786557 is a reply to message #1785795] Mon, 07 May 2018 08:26
Virag Purnam
Hello,

I have added the code below to the runtime module.

@Override
public Class<? extends org.eclipse.xtext.parser.IParser> bindIParser() {
	return RetokenizingParser.class;
}

public static class RetokenizingParser extends LexerDslParser {
	@Override
	protected XtextTokenStream createTokenStream(TokenSource tokenSource) {
		return super.createTokenStream(new CustomTokenStream(tokenSource, getTokenDefProvider().getTokenDefMap()));
	}
}


CustomTokenStream class:
package com.hpe.nsdee.lexer;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

import org.antlr.runtime.CommonToken;
import org.antlr.runtime.Token;
import org.antlr.runtime.TokenSource;

public class CustomTokenStream implements TokenSource {

	protected final TokenSource tokenSource;
	protected final Map<Integer, String> tokenDefMap;
	// Words from the copied file that have not been returned as tokens yet.
	// Instance state (not static), so that separate parses do not interfere.
	private final LinkedList<String> words = new LinkedList<>();
	// Position of the Copy statement that triggered the current insertion.
	private int lineNumber = 0;
	private int charPositionInLine = 0;
	private int start = 0;
	private int stop = 0;

	public CustomTokenStream(TokenSource tokenSource, Map<Integer, String> tokenDefMap) {
		this.tokenSource = tokenSource;
		this.tokenDefMap = tokenDefMap;
	}

	@Override
	public String getSourceName() {
		return tokenSource.getSourceName();
	}

	@Override
	public Token nextToken() {
		// Return pending words from a copied file before reading further input.
		if (!words.isEmpty()) {
			return nextCopiedToken();
		}
		Token nextToken = tokenSource.nextToken();
		String text = nextToken.getText();
		if (text == null) {
			return Token.EOF_TOKEN;
		}
		if (text.startsWith("Copy")) {
			// The path is still hard-coded; it is not yet taken from the Copy statement.
			words.addAll(readFileAndReturnWords("c:\\temp\\test1.lex"));
			lineNumber = nextToken.getLine();
			charPositionInLine = nextToken.getCharPositionInLine();
			start = ((CommonToken) nextToken).getStartIndex();
			stop = ((CommonToken) nextToken).getStopIndex();
			// If the file was empty or unreadable, fall through and return the Copy token itself.
			if (!words.isEmpty()) {
				return nextCopiedToken();
			}
		}
		System.out.println(nextToken.toString());
		return nextToken;
	}

	// Turns the next pending word into a token and logs it.
	private CommonToken nextCopiedToken() {
		CommonToken commonToken = getToken(words.removeFirst());
		System.out.println(commonToken.toString());
		return commonToken;
	}

	private CommonToken getToken(String word) {
		CommonToken commonToken;
		// Token types are hard-coded from the generated lexer of this example grammar.
		switch (word) {
		case "Hello":
			commonToken = new CommonToken(12, "Hello");
			break;
		case "!":
			commonToken = new CommonToken(13, "!");
			break;
		case "\r\n":
			commonToken = new CommonToken(10, " ");
			break;
		default:
			commonToken = new CommonToken(4, word);
			break;
		}
		commonToken.setLine(lineNumber);
		commonToken.setCharPositionInLine(0);
		// Start and stop indexes are not set; earlier attempts:
//		commonToken.setStartIndex(start+1);
//		start = start + commonToken.getText().length();
//		commonToken.setStopIndex(start);
		return commonToken;
	}

	private List<String> readFileAndReturnWords(String filePath) {
		List<String> result = new ArrayList<>();
		try (InputStream fis = new FileInputStream(filePath);
				InputStreamReader isr = new InputStreamReader(fis, StandardCharsets.UTF_8);
				BufferedReader br = new BufferedReader(isr)) {
			String line;
			while ((line = br.readLine()) != null) {
				for (String word : line.split(" ")) {
					if (!word.trim().isEmpty()) {
						result.add(word);
						// The "\r\n" entry becomes a whitespace token after each word.
						result.add("\r\n");
					}
				}
			}
		} catch (IOException e) {
			// Report the failure and treat the file as empty.
			System.err.println("Cannot read " + filePath + ": " + e.getMessage());
			return new ArrayList<>();
		}
		return result;
	}

}
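
An editorial aside on the hard-coded token types (12, 13, 10, 4) in getToken: they come from the generated lexer and can shift whenever the grammar changes. Since the constructor already receives tokenDefMap, one option is to invert that map and look the types up by name. The sketch below is plain Java and assumes that the map values are token names as they appear in the generated .tokens file, for example `'Hello'` for a keyword (including the single quotes) and `RULE_ID` for a terminal rule; check your own generated file before relying on this. `TokenTypeLookup` is a hypothetical name, and the numbers in `main` are simply the ones used in the post.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: resolves token types by name instead of hard-coding them.
final class TokenTypeLookup {
    private final Map<String, Integer> typeByName = new HashMap<>();
    private final int fallbackType;

    TokenTypeLookup(Map<Integer, String> tokenDefMap, String fallbackName) {
        // Invert type -> name into name -> type.
        for (Map.Entry<Integer, String> entry : tokenDefMap.entrySet()) {
            typeByName.put(entry.getValue(), entry.getKey());
        }
        Integer fallback = typeByName.get(fallbackName);
        if (fallback == null) {
            throw new IllegalArgumentException("Unknown token name: " + fallbackName);
        }
        this.fallbackType = fallback;
    }

    /** Returns the keyword type if the word is a keyword, else the fallback type. */
    int typeOf(String word) {
        // Assumption: keywords are stored with surrounding single quotes.
        Integer keywordType = typeByName.get("'" + word + "'");
        return keywordType != null ? keywordType : fallbackType;
    }

    public static void main(String[] args) {
        Map<Integer, String> tokenDefMap = new HashMap<>();
        tokenDefMap.put(12, "'Hello'");
        tokenDefMap.put(13, "'!'");
        tokenDefMap.put(4, "RULE_ID");
        TokenTypeLookup lookup = new TokenTypeLookup(tokenDefMap, "RULE_ID");
        System.out.println(lookup.typeOf("Hello")); // prints 12
        System.out.println(lookup.typeOf("!"));     // prints 13
        System.out.println(lookup.typeOf("name1")); // prints 4
    }
}
```

Treating every non-keyword as an ID mirrors the default branch of getToken; a real grammar with more terminal rules would need a proper lexer pass over the copied text instead.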



Model file:
Hello Aneesh!
Copy C:\\temp\\test1.lex; 
Hello arun!
Hello Anitha!
Hello Senthil!


test1.lex
Hello name1 !
Hello name2 !
Hello name3 !
Hello name4 !
Hello name5 !
Hello name6 !


Correct output:
Hello Aneesh!
Hello name1 !
Hello name2 !
Hello name3 !
Hello name4 !
Hello name5 !
Hello name6 !
Hello arun!
Hello Anitha!
Hello Senthil!


But after this, the lines and positions in the DSL editor go wrong on edit.
On edit, changes are reflected in the wrong place.
I need help here.
How can I update the editor with the proper line numbers, and how do I modify the text region?


Thanks and regards,
Virag Purnam
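
The thread ends without an answer to this question. One way to reason about it: the injected tokens describe text that is not in the editor's document, so any offset they carry should still point into the document, most naturally at the Copy statement that caused them. The sketch below only illustrates that bookkeeping in plain Java. `Tok` is a hypothetical stand-in for a token with ANTLR-style inclusive start and stop indexes, and whether Xtext's node model and editor accept tokens laid out this way is an assumption that has not been verified here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RegionAnchoring {
    /** Hypothetical token: text plus inclusive start/stop offsets in the edited document. */
    static final class Tok {
        final String text;
        final int start;
        final int stop;

        Tok(String text, int start, int stop) {
            this.text = text;
            this.start = start;
            this.stop = stop;
        }

        int length() {
            return stop - start + 1;
        }
    }

    /**
     * Anchors injected words to the Copy statement at [copyStart, copyStop]:
     * the first word covers the whole statement, the others are zero-length
     * regions at its end, so the lengths still add up to the statement length.
     */
    static List<Tok> anchor(List<String> injected, int copyStart, int copyStop) {
        List<Tok> out = new ArrayList<>();
        for (int i = 0; i < injected.size(); i++) {
            if (i == 0) {
                out.add(new Tok(injected.get(i), copyStart, copyStop));
            } else {
                // stop == start - 1 encodes a zero-length region.
                out.add(new Tok(injected.get(i), copyStop + 1, copyStop));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed offsets: the Copy statement occupies document offsets 14..37.
        List<Tok> tokens = anchor(Arrays.asList("Hello", "name1", "!"), 14, 37);
        int total = 0;
        for (Tok t : tokens) {
            total += t.length();
        }
        System.out.println(total); // prints 24, the length of the Copy statement
    }
}
```

The design goal is that tokens after the Copy statement keep the offsets the lexer gave them, so edits further down the file are not shifted by the length of the copied text.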