Eclipse Community Forums: Eclipse Titan » Titan Architecture Internals: On how the compiler and the Designer are optimized differently

(Some background information that is important to know when developing Titan)


Titan Architecture Internals: On how the compiler and the Designer are optimized differently [message #1750053]

Thu, 15 December 2016 14:16

Kristof Szabados


When looking at the compiler and the Designer from the outside, at first they might seem to be doing the same thing: reading and checking TTCN-3 files.
- The core compiler reads TTCN-3 and ASN.1 files, checks them for syntactic and semantic errors, and finally generates C/C++ code.
- The Designer reads TTCN-3 and ASN.1 files, checks them for syntactic and semantic errors, and offers some IDE functionality.

Although they seem to do the same things in most of their source code too ... they actually work under very different constraints, and are optimized for different use cases:
The compiler:
- is usually invoked rarely and in the background, as part of a CI system, maybe doing nightly builds and builds after each commit.
- is viewed by users as part of the build process, which starts with TTCN-3 and ASN.1 files and finishes with the creation of the executable tests ... also involving GCC/CLANG. In a full build the compiler might take up as little as 1%-5% of the build time.
- as such, the compiler is optimized to minimize the whole build time (the compiler may take a few more seconds if that saves minutes in the C/C++ compiler).
- when invoked all code is generated (even if just into memory at first).
- the memory consumption is usually not a problem ... as GCC/CLANG will require much more from the same system to finish the build.
- when there is an error in the code the build stops and the user needs to correct it before the code can compile.
When the compiler finds an unparsable piece of code like "variable_name." it can report an error and stop processing.
- users don't expect interaction.
- extending the generated code can be done with scripts and modified makefiles if needed.

The Designer:
- up close and personal, this is the tool used to write TTCN-3 and ASN.1 code.
- after each keystroke the user can expect to have some syntactic and semantic checks done to ensure that the code just written is ok.
(without optimization this would mean 6-10 semantic checks within a second)
- As this is the editing interface, interactivity is expected.
Users usually expect a response time below 0.1 second ... whatever the operation was.
- the TTCN-3 code is most of the time erroneous (sometimes on several levels) as it is being written.
But the tool must not stop working.
- In fact the Designer is expected to work best and offer helpful support when the code has the most problems.
When the user presses ctrl+space after an unparsable piece of code like "variable_name." the Designer is expected to list all possible completions that could make the code correct.
- advanced interactions, like outline views, module structure graph visualization, code smell checking and refactoring, are expected to operate ... also interactively.

For these reasons, even though most of the code seems to be similar, they follow very different ways of working:
The compiler:
- reads all TTCN-3 and ASN.1 files it received as parameters.
- does the semantic checking.
- if no error is found, generates the C/C++ code.
- exits leaving no information behind.

The Designer works in loops:
In the first loop:
- reads all TTCN-3 and ASN.1 files that are in the project (or in related projects)
- does the syntactic and semantic analysis.
- keeps the results in memory to speed up further operations.
In subsequent loops (after something was edited):
- using the information present in memory tries to calculate the minimum amount of data that needs to be re-processed
- does the syntactic and semantic checking using the already existing information, to minimize work.
- keeps the results in memory to speed up further operations.
If some data is needed by the user or a tool:
- the information is always available in memory to be used.
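
To make the loop structure above more concrete, here is a minimal Java sketch of it. All names in it (AnalysisCacheSketch, FileResult, runLoop) are made up for illustration and do not come from the Titan code base, and the "analysis" is just a stand-in that counts the word "function". It only shows the shape of the idea: the first loop processes everything, later loops re-process only what changed, and queries are answered from memory.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Titan code base.
final class AnalysisCacheSketch {
    // Per-file result kept in memory between loops.
    static final class FileResult {
        final String source;
        final int definitionCount;

        FileResult(String source, int definitionCount) {
            this.source = source;
            this.definitionCount = definitionCount;
        }
    }

    private final Map<String, FileResult> results = new HashMap<>();
    int filesAnalyzed = 0; // counts real work, to show what each loop costs

    // Stand-in for syntactic and semantic analysis: counts "function" keywords.
    private FileResult analyze(String source) {
        filesAnalyzed++;
        int count = 0;
        for (int i = source.indexOf("function"); i >= 0; i = source.indexOf("function", i + 1)) {
            count++;
        }
        return new FileResult(source, count);
    }

    // One loop: re-analyze only the files whose text differs from the cached copy.
    Set<String> runLoop(Map<String, String> project) {
        Set<String> reprocessed = new LinkedHashSet<>();
        results.keySet().retainAll(project.keySet()); // forget deleted files
        for (Map.Entry<String, String> file : project.entrySet()) {
            FileResult cached = results.get(file.getKey());
            if (cached == null || !cached.source.equals(file.getValue())) {
                results.put(file.getKey(), analyze(file.getValue()));
                reprocessed.add(file.getKey());
            }
        }
        return reprocessed;
    }

    // Queries never trigger analysis; they only read what is already in memory.
    int definitionCount(String fileName) {
        FileResult result = results.get(fileName);
        return result == null ? 0 : result.definitionCount;
    }
}
```

Calling runLoop on an unchanged project does no analysis at all, which is the property the Designer relies on to stay responsive. A real implementation has to work at a much finer granularity than whole files.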

This however makes it harder for the Designer to meet the user's more demanding needs:
- while the compiler always starts from zero, the Designer has to maintain consistency.
In the compiler this means that a single boolean "checked" variable per AST node is enough to know if the node was already checked during the semantic processing, or if it still needs to be checked (in the semantic graph the order of processing is not linear).
In the Designer the AST nodes have to have a timestamp ("lastTimeChecked") showing in which semantic checking cycle that node was last checked. If it was already checked in an earlier loop and the change triggering the current loop cannot change its semantic properties ... we don't need to re-analyze it.
This timestamp is created at the beginning of each semantic check loop ... and passed down from the Module level to each checked AST node via their "check" function (that does the semantic checking).
Please note that at any point in time the semantic graph can have coexisting regions that were last checked at very different points in time.
- To read the files as fast as possible, the syntactic checking of files happens in parallel, before the semantic checking starts (in org.eclipse.titan.designer.parsers.ProjectSourceSyntacticAnalyzer the internalDoAnalyzeSyntactically function).
As many files are processed at the same time as there are cores in the user's machine (if not configured otherwise).
- In the compiler before the semantic check we know that all nodes need to be checked ... in the Designer the nodes in the semantic graph that might be affected by a change have to be discovered (in org.eclipse.titan.designer.AST.brokenpartsanalyzers.BrokenPartsViaReferences).
/* using data from the previous semantic check iteration ... that might have changed just now */
- while the compiler can run in its own process (as no interactivity is needed) ... the Designer (and all functions accessing its data) needs to be prepared to work as a distributed system.
The user (or system) can trigger changes in many files and many lines at the same time.
The user could also try to invoke some functionality while the syntactic and semantic checking of the last changes are still running.
... invoke operations on incorrect code ... and in situations where they cannot realistically work (try renaming a word doc to a TTCN-3 file, open it and ask for code completion).
For this reason, operations reaching the semantic data are run in WorkspaceJobs to provide protection from concurrent access.
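
The "lastTimeChecked" timestamp and the discovery of affected nodes described in the list above can be sketched together in a few lines of Java. This is only an illustration under simplifying assumptions: CheckCycle, SketchNode and IncrementalCheckSketch are hypothetical names, the propagation rule (everything that transitively refers to an edited node is re-checked) is the coarsest possible one, and the real BrokenPartsViaReferences logic is certainly more selective than this.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; the real Designer classes are more fine-grained.
final class CheckCycle {
    final long id; // created once at the beginning of each semantic check loop

    CheckCycle(long id) {
        this.id = id;
    }
}

final class SketchNode {
    final String name;
    // Reverse edges ("who refers to me"), as known from the previous cycle.
    final List<SketchNode> referencedBy = new ArrayList<>();
    CheckCycle lastTimeChecked; // null until the first check
    int timesAnalyzed = 0;

    SketchNode(String name) {
        this.name = name;
    }

    void refersTo(SketchNode target) {
        target.referencedBy.add(this);
    }

    // The timestamp is handed down by the caller on every call.
    void check(CheckCycle timestamp) {
        if (lastTimeChecked != null && lastTimeChecked.id >= timestamp.id) {
            return; // already checked in this cycle, possibly reached via another path
        }
        lastTimeChecked = timestamp;
        timesAnalyzed++; // the actual semantic analysis would happen here
    }
}

final class IncrementalCheckSketch {
    // Collect the edited nodes plus everything that transitively refers to them.
    static Set<SketchNode> affectedBy(Set<SketchNode> edited) {
        Set<SketchNode> affected = new LinkedHashSet<>(edited);
        Deque<SketchNode> work = new ArrayDeque<>(edited);
        while (!work.isEmpty()) {
            for (SketchNode user : work.poll().referencedBy) {
                if (affected.add(user)) {
                    work.add(user);
                }
            }
        }
        return affected;
    }

    // One semantic check cycle over the nodes that need it.
    static void runCycle(Set<SketchNode> toCheck, CheckCycle timestamp) {
        for (SketchNode node : toCheck) {
            node.check(timestamp);
        }
    }
}
```

After a second cycle that only touches the affected nodes, the untouched nodes still carry the timestamp of the first cycle. That is exactly the situation of coexisting regions checked at different points in time, and a plain boolean "checked" flag could not represent it.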


Re: Titan Architecture Internals: On how the compiler and the Designer are optimized differently [message #1750056 is a reply to message #1750053]

Thu, 15 December 2016 14:19

Kristof Szabados


One example of this difference is that, surprisingly, the Designer (written in Java) might need less memory to process the same TTCN-3 code than the compiler.

In the compiler:
When the compiler runs it can always assume that the code will be correct and it will need to generate large amounts of C/C++ code.
In this generated code there will be error messages for each type and for many statements, which need to be logged for the user in their respective cases.
Each type and problematic statement needs to be identified accurately, so that the user knows where to look for the correction.
As such it is a good decision to calculate these unique names during the semantic check and store them for each possible type separately (the "fullname" member of Nodes set via set_fullname functions).

In the Designer:
We can assume that there are only a few errors/warnings displayed to the user at any time ... or he/she will work towards reducing their numbers.
This means just 1-2 stopping issues if any, or a few hundred warnings at any point in time ... even in several million lines of code and at every keystroke.
+ as in an IDE the user can just jump to the location of an error with a double-click, most of the error messages can be re-phrased to not even need this form of precise textual identification.
For this reason, the Designer does not calculate and store this information during semantic checking ... instead it calculates it only when and where it is needed in an error message (in the getFullName functions).
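
As a rough illustration of computing the name on demand, here is a minimal Java sketch. The NamedNodeSketch class and its parent link are assumptions made for the example, and only the getFullName method name is taken from the text above. The actual Designer code may build the name differently.

```java
// Hypothetical sketch; only the method name getFullName comes from the post.
final class NamedNodeSketch {
    private final String name;
    private final NamedNodeSketch parent; // null for the top-level (module) node

    NamedNodeSketch(NamedNodeSketch parent, String name) {
        this.parent = parent;
        this.name = name;
    }

    // Computed on demand by walking up to the root; nothing is stored per node.
    String getFullName() {
        StringBuilder builder = new StringBuilder(name);
        for (NamedNodeSketch p = parent; p != null; p = p.parent) {
            builder.insert(0, '.').insert(0, p.name);
        }
        return builder.toString();
    }
}
```

Each node pays only for a reference to its parent, and the string is built just for the handful of nodes that actually appear in an error message. Storing the full name eagerly, as the compiler does, costs one string per node whether or not it is ever shown.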

This way the maximum amount of memory consumed by the compiler (measured with, for example, top) can be higher than what the Designer uses (measured after garbage collection).
