Eclipse Community Forums: Eclipse Titan » Titan Architecture Internals: On the structure of Abstract Syntax Tree (AST) nodes.

(Some background information that is important to know when developing Titan)

Titan Architecture Internals: On the structure of Abstract Syntax Tree (AST) nodes. [message #1748940]

Wed, 30 November 2016 09:56

Kristof Szabados

The mapping of TTCN-3 modules to the Java/C++ AST elements may not look like it, but it follows a very simple logic.
And although the C++ classes in the compiler and the Java classes in the Designer look very different, they mostly have the same abstract behavior/logic.
(Please note that the previous post on where files/classes are located is not repeated here.
Please also note that ASN.1 involves some specific procedures that are best described in a standalone topic.)

To identify the similarities and differences, let's go through a few interesting points.

1)
The AST (Abstract Syntax/Semantic Tree) is mapped into classes, where the abstract branches are class members.
The class hierarchy is very similar at the abstract level in both the compiler and the Designer, but not in the actual implementation.

Example 1: a TTCN-3 module can have a name, imports, definitions, a control part and with attributes.
On the Java side the TTCN3Module class (in package org.eclipse.titan.designer.AST.TTCN3.definitions) has:

	protected String name;	/*inherited from base Module class */
	private final List<ImportModule> importedModules;
	protected Definitions definitions;
	private ControlPart controlpart;
	protected WithAttributesPath withAttributesPath;

On the compiler side the Module class (in compiler2/ttcn3/AST_ttcn3.{hh/cc}) has:

	Identifier *modid; /*module identifier inherited from common::Module */
	Imports *imp;
	Definitions *asss;
	ControlPart* controlpart;
	WithAttribPath* w_attrib_path;

Example 2: a TTCN-3 function can have a name, formal parameter list, runs on reference, a return type and a statement block and with attributes.
On the Java side the Def_Function class (in package org.eclipse.titan.designer.AST.TTCN3.definitions) has:

	protected Identifier identifier; /* function name inherited from assignment */
	private final FormalParameterList formalParList;
	private final Reference runsOnRef;
	private final Type returnType;
	private final StatementBlock block;
	protected WithAttributesPath withAttributesPath; /* inherited from Definition class */

On the compiler side the Def_Function class (compiler2/ttcn3/AST_ttcn3.{hh/cc}) has:

	Identifier *id; /* inherited from assignment class */
	FormalParList *fp_list; /* inherited from Def_Function_Base class */
	Reference *runs_on_ref;
	Type *return_type; /* inherited from Def_Function_Base class */
	StatementBlock *block;
	WithAttribPath* w_attrib_path; /*inherited from Definition class */

Example 3: a statement block can have many statements
On the Java side the StatementBlock class (in package org.eclipse.titan.designer.AST.TTCN3.statements) has:

	private final List<Statement> statements;

On the compiler side the StatementBlock class (compiler2/ttcn3/Statement.{hh/cc}) has:

	vector<Statement> stmts;

1/a)
In most cases the name and type of a member closely reflect the data it holds.

1/b)
Minor differences in naming conventions.

We tried to keep the names similar between the C++ and Java sides.
But the examples above also show clearly that they follow somewhat different conventions:

- as work on the compiler started in 2000 and it is written in C/C++, the member names follow the short naming conventions of that time (for example, reducing Formal Parameters to "fp_").
- on the Java side we decided to go with longer, more expressive names (formalParList instead of fp_list). In the end they are the same, just different naming conventions.

1/c)
A bigger difference in the mapping can be seen in statements, types, values and templates.

On the Java side these are handled with class inheritance.
On the compiler side the different kinds are not actual sub-classes: a single class stores a type code, keeps the kind-specific members in unions, and selects the code for each kind in switches with many branches.

Example 4: the expression to add 2 values
On the Java side the AddExpression class (in package org.eclipse.titan.designer.AST.TTCN3.values.expressions) has:

	private final Value value1;
	private final Value value2;

On the compiler side this can be found in compiler2/Value.{hh/cc}:

	"V_EXPR, /**< expressions */" in valuetype_t represents that the value is an expression.
	"OPTYPE_ADD, // v1 v2" in operationtype_t represents that the expression is an addition.

and in the union:

	Value *v1;
	Value *v2; /* represent the two value parameters the addition expression can have */
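To make the contrast concrete, here is a minimal Java sketch of the two styles. It is only an illustration under assumptions, not the actual Titan code: AddExpression, value1/value2, V_EXPR, OPTYPE_ADD and v1/v2 come from the example above, while IntegerValue, TaggedValue, V_INT, OPTYPE_SUBTRACT and evaluate() are hypothetical names added so that the example can run. As Java has no union, the second class can only imitate the compiler's layout by keeping every member in every object.

```java
// Style 1 (Designer-like): one sub-class per kind of value.
abstract class Value {
    abstract long evaluate(); // hypothetical operation, for illustration only
}

final class IntegerValue extends Value { // hypothetical leaf node
    private final long value;

    IntegerValue(long value) {
        this.value = value;
    }

    @Override
    long evaluate() {
        return value;
    }
}

final class AddExpression extends Value {
    // Only the members this kind of node needs.
    private final Value value1;
    private final Value value2;

    AddExpression(Value value1, Value value2) {
        this.value1 = value1;
        this.value2 = value2;
    }

    @Override
    long evaluate() {
        return value1.evaluate() + value2.evaluate();
    }
}

// Style 2 (compiler-like): one class, type codes and switches.
final class TaggedValue {
    enum ValueType { V_INT, V_EXPR }                   // V_INT is an assumption
    enum OperationType { OPTYPE_ADD, OPTYPE_SUBTRACT } // OPTYPE_SUBTRACT is an assumption

    private final ValueType valueType;
    private final long intValue;        // meaningful only for V_INT
    private final OperationType opType; // meaningful only for V_EXPR
    private final TaggedValue v1;       // meaningful only for V_EXPR
    private final TaggedValue v2;       // meaningful only for V_EXPR

    private TaggedValue(ValueType valueType, long intValue, OperationType opType,
            TaggedValue v1, TaggedValue v2) {
        this.valueType = valueType;
        this.intValue = intValue;
        this.opType = opType;
        this.v1 = v1;
        this.v2 = v2;
    }

    static TaggedValue ofInt(long value) {
        return new TaggedValue(ValueType.V_INT, value, null, null, null);
    }

    static TaggedValue ofExpr(OperationType opType, TaggedValue v1, TaggedValue v2) {
        return new TaggedValue(ValueType.V_EXPR, 0, opType, v1, v2);
    }

    long evaluate() {
        switch (valueType) {
        case V_INT:
            return intValue;
        case V_EXPR:
            switch (opType) {
            case OPTYPE_ADD:
                return v1.evaluate() + v2.evaluate();
            case OPTYPE_SUBTRACT:
                return v1.evaluate() - v2.evaluate();
            default:
                throw new IllegalStateException("unknown operation " + opType);
            }
        default:
            throw new IllegalStateException("unknown value type " + valueType);
        }
    }
}
```

Both `new AddExpression(new IntegerValue(2), new IntegerValue(3)).evaluate()` and `TaggedValue.ofExpr(TaggedValue.OperationType.OPTYPE_ADD, TaggedValue.ofInt(2), TaggedValue.ofInt(3)).evaluate()` compute the same sum. In the first style a new kind of value is a new class, in the second it is a new code plus a new branch in each switch.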

The reason for this difference lies in the language support, the purpose of the tools and the prevalent code style at the time they were written.

- Language support: Java does not support this kind of union/switch based operation, so we had to go with deeper object hierarchies.
- Difference of purpose:
In the compiler:
-- Speed is important, so we tried to use the most efficient way of writing code.
-- These union structures also provide tight memory usage.
-- And the compiler is not expected to be extended with new features by third parties.
On the Java side:
-- As this form of union is not supported, we can best save memory by storing only the precise members of each sub-class in its own class (if something needs only 1 member, it should not receive unnecessary extra members).
-- The Designer is expected to offer high-level semantic functionality and to be extended with such functionality by third parties (for example, Titanium does static checking and visualization, and we are also experimenting with large scale refactoring).
-- Speed is still very important, but as the Designer is used in an interactive environment, optimization for speed has a very different meaning.

While the compiler is perceived as fast when it takes only a few seconds before the C/C++ compiler works on the generated code for minutes or hours, users expect split-second reaction times from the Designer plugin in the Eclipse IDE (more on this in a separate post later).

1/d)
For the above reasons there are also smaller differences in what the class hierarchy looks like.

For example, in the compiler both Def_Function and Def_ExtFunction derive from Def_Function_Base as a base class, to handle common operations.
On the Java side Def_Function and Def_ExtFunction derive directly from Definition, as we deemed their semantic behavior too different to have a common base class.

Yet in the end the final classes used provide the same behavior on both sides.

2)
There are some extra members for optimization.
These look the same on both sides, but they are used very differently.

A good example to demonstrate the similarity and difference is found in reference resolution.
When semantic checking starts both the compiler and the Designer start to work on a list of definitions, ordered by their appearance in the file (as the parser found them).
This is a good data structure for checking these definitions in order ... but is very inefficient when we check where a reference is pointing.

For this reason, the semantic checking of a module first checks that definition names are unique and builds a hashmap of the definitions (with their names as keys).
When deciding where a reference is pointing this will mean a simple lookup instead of a linear search (per scope) ... as there are millions of references in a typical project this is a very important optimization.

For this reason, in the compiler the Definitions class has 2 members:

	vector<Definition> ass_v; /* the list of definitions as received from the parser */
	map<string, Definition> ass_m; /* the hashmap of definitions */

On the Java side in the Definitions class we have:

	private final List<Definition> definitions; /* the list of definitions as received from the parser */
	private HashMap<String, Definition> definitionMap; /* the hashmap of definitions */

These seem to be the same, but they differ greatly in their usage.
In the compiler we do the conversion only once, after which only the hashmaps are used (the original vectors could be deleted).
In the Designer the original definition list remains important, as the hashmap might need to be refreshed based on it (I will show this in a separate topic later).
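The list-plus-hashmap idea can be sketched in Java roughly as follows. This is a simplified illustration, not the Designer's actual code: the member names definitions and definitionMap come from the snippet above, while addDefinition, checkUniqueness, lookup, getErrors and the error text are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, stripped-down definition node: only a name.
final class Definition {
    private final String name;

    Definition(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

final class Definitions {
    // The definitions in the order the parser found them.
    private final List<Definition> definitions = new ArrayList<>();
    // Built from the list during semantic checking, null while out of date.
    private Map<String, Definition> definitionMap;
    private final List<String> errors = new ArrayList<>();

    void addDefinition(Definition definition) {
        definitions.add(definition);
        definitionMap = null; // the map must be rebuilt from the list
    }

    // Checks that the names are unique and (re)builds the map in the same pass.
    void checkUniqueness() {
        definitionMap = new HashMap<>(definitions.size());
        errors.clear();
        for (Definition definition : definitions) {
            Definition previous = definitionMap.putIfAbsent(definition.getName(), definition);
            if (previous != null) {
                errors.add("Duplicate definition with name " + definition.getName());
            }
        }
    }

    // Reference resolution: a map lookup instead of a linear search.
    Definition lookup(String name) {
        if (definitionMap == null) {
            checkUniqueness();
        }
        return definitionMap.get(name);
    }

    List<String> getErrors() {
        return errors;
    }
}
```

Keeping the list as the source of truth is what allows the map to be thrown away and rebuilt, which matches the Designer-side usage described above. A compiler-style variant would build the map once and never invalidate it.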

Please note that the exact same optimization logic is also present in StatementBlocks ... with all of the definition and label names extracted and stored in hashmaps for faster access.

In several cases we also store the results of lengthy calculations as members. For example, during semantic check we determine whether a function could be started on a component, so afterward the code just needs to read this property instead of calculating it again and again (types, encodings, etc. could also be stored like this).
Usually the "final" members come from the parser, while other members are calculated during semantic check.
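A small Java sketch of this pattern, under loud assumptions: FunctionNode, hasRunsOnClause, startable and getCalculationCount are hypothetical names, and "has a runs on clause" is only a placeholder for the real, more involved startability rule.

```java
// Hypothetical node: one member from the parser, one calculated later.
final class FunctionNode {
    // Comes from the parser, so it can be final.
    private final boolean hasRunsOnClause;

    // Calculated during semantic check and then only read.
    private boolean checked;
    private boolean startable;
    private int calculationCount; // only here to show how often the work is done

    FunctionNode(boolean hasRunsOnClause) {
        this.hasRunsOnClause = hasRunsOnClause;
    }

    void check() {
        if (checked) {
            return; // result already stored
        }
        checked = true;
        calculationCount++;
        // Placeholder for the lengthy calculation (assumption, not the real rule).
        startable = hasRunsOnClause;
    }

    boolean isStartable() {
        check();
        return startable;
    }

    int getCalculationCount() {
        return calculationCount;
    }
}
```

However many times isStartable() is called, the calculation itself runs once and later calls only read the stored member.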

3)
In Java there are some extra static final String members.
They hold the texts of error messages and are factored out of the code, so that we don't duplicate them.

This is done to have a better style in the code and also to help in debugging the code.
Please note that we also usually try to keep the phrases of an error message together, so that searching for them is easy.
(In contrast to cutting them off according to the line length, which could make it hard to find where a specific error message is reported.)
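As a sketch of the convention (the class name, constant name and message text are made up for this example, they are not taken from the Designer's sources):

```java
import java.text.MessageFormat;

final class UniquenessMessages {
    // The whole phrase stays on one line, even if the line gets long,
    // so that a text search for the reported message finds it.
    private static final String DUPLICATE_DEFINITION = "Duplicate definition with name {0}";

    private UniquenessMessages() {
        // constants and helpers only
    }

    // Every place that reports this error goes through the same constant.
    static String duplicateDefinition(String name) {
        return MessageFormat.format(DUPLICATE_DEFINITION, name);
    }
}
```

Calling `UniquenessMessages.duplicateDefinition("f_start")` yields "Duplicate definition with name f_start", and searching the sources for "Duplicate definition with name" leads straight to the single place where the text is defined.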

4)
The last kind of members not coming directly from the source code are parts of the machinery operating the tools.

For example, Location is used by the tools to know where in the code to report errors.
In the compiler all settings (and many other AST nodes) are derived from the Location class (compiler2/Setting.hh).
In the Designer all settings (and many other AST nodes) have a member of type Location with the name location.

Another example is the checked/lastTimeChecked members.
In the compiler these are bool members, set to true once an AST node has been semantically checked.
In the Designer this is a CompilationTimeStamp holding the timestamp of the last semantic check of the AST node.
(More on this difference in the next topic, when I discuss how the compiler and the Designer are optimized differently.)
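The difference between the two kinds of members can be sketched in Java like this. The CompilationTimeStamp below is a hypothetical stand-in with an invented isLess method, and OnceCheckedNode, RecheckableNode and getCheckCount are likewise assumptions made for illustration; only the member names checked and lastTimeChecked are taken from the text above.

```java
// Hypothetical stand-in for the Designer's timestamp type.
final class CompilationTimeStamp {
    private final long value;

    CompilationTimeStamp(long value) {
        this.value = value;
    }

    boolean isLess(CompilationTimeStamp other) {
        return value < other.value;
    }
}

// Compiler-like: a node is checked once and never again.
final class OnceCheckedNode {
    private boolean checked;
    private int checkCount; // stands in for the actual checking work

    void check() {
        if (checked) {
            return;
        }
        checked = true;
        checkCount++;
    }

    int getCheckCount() {
        return checkCount;
    }
}

// Designer-like: a node is checked once per semantic checking round.
final class RecheckableNode {
    private CompilationTimeStamp lastTimeChecked; // null until the first check
    private int checkCount; // stands in for the actual checking work

    void check(CompilationTimeStamp timestamp) {
        if (lastTimeChecked != null && !lastTimeChecked.isLess(timestamp)) {
            return; // already checked in this round
        }
        lastTimeChecked = timestamp;
        checkCount++;
    }

    int getCheckCount() {
        return checkCount;
    }
}
```

With a bool, a second round of checking would require resetting the flag on every node. With a timestamp, starting a new round with a newer timestamp is enough to make every node eligible for checking again.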
