Eclipse Community Forums
Titan Architecture Internals: On parsing and checking ASN.1 [message #1753448] Mon, 06 February 2017 15:23
Kristof Szabados
ASN.1 has some special features that make it hard to separate parsing from semantic checking.
One set of them is:
- it is possible to define for each Information Object Class the syntax with which its Information Objects should be described.
- these Information Objects (described by unique syntaxes) can appear in the file before the definition of their Information Object Class (that defines how to read the unique syntax)
- without semantic information it is in some cases not possible to tell what kind of assignment has been parsed.
(even in simple cases this depends on factors such as whether the name of the assignment starts with a lower case or an upper case letter, and whether it contains only capital letters)
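
To make the last point a bit more concrete, here is a minimal Python sketch (not Titan code) of how far the letter case of a name alone can narrow things down. It assumes the usual ASN.1 lexical conventions (value and object references start with a lower case letter, type and object set references with an upper case letter, object class references contain no lower case letters). The function name candidate_kinds is made up for this illustration and the list of kinds is not exhaustive.

```python
def candidate_kinds(name):
    """Return the assignment kinds a name could denote, judging by letter case alone."""
    if not name or not name[0].isalpha():
        raise ValueError(f"not an ASN.1 reference name: {name!r}")
    if name[0].islower():
        # value references and object references share the same lexical form
        return {"value", "object"}
    kinds = {"type", "object set"}
    if not any(ch.islower() for ch in name):
        # an object class reference looks like a type reference without lower case letters
        kinds.add("object class")
    return kinds
```

Note that for a name like c-ASNSequence1 the result still contains two candidates, so semantic information is needed to decide between them.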

This means that while processing ASN.1 files the parsing of the text cannot be separated from semantic analysis as cleanly as in the case of TTCN-3.
For example: During parsing the syntax that can be used to describe Information Objects is not known.

To solve this problem, we process ASN.1 in a recursive way (designed and first implemented in Titan by Matyas Forstner for his MSc Diploma work).
At first ASN.1 files are parsed ... with each block (region between "{" and "}") only tokenized and represented in the AST as a Block.
Then during semantic checking, Titan checks these parsed assignments.
- If an assignment does not have a block to be processed, the semantic checking can process that assignment directly.
- If the assignment does have a block and it is not an Information Object (for example, it is a value), Titan parses the tokens stored inside the block according to the 'type' of the assignment.
- If the assignment is an Information Object (and so has a block), Titan first processes the referenced Information Object Class ... to discover the syntax, builds a simple processing engine from it and uses this engine to parse the tokens inside the block.
Should any of this parsing find a new block, it is yet again parsed only as a block, to be processed when the semantic checking reaches it later.
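
The "only tokenized" first pass can be sketched roughly as follows. This is a simplified Python illustration and not the Titan implementation: the regular expression, the Block dataclass and the function names tokenize and fold_blocks are all assumptions made for the example. The point it shows is that only the outermost braces are folded into a Block; nested braces stay inside as plain tokens until that Block is processed later.

```python
import re
from dataclasses import dataclass

# Very rough tokenizer: quoted strings, "::=", single punctuation marks, everything else.
TOKEN = re.compile(r'"[^"]*"|::=|[{}(),|]|[^\s{}(),|"]+')


@dataclass
class Block:
    """An unparsed region between '{' and '}', stored as raw tokens."""
    tokens: list


def tokenize(text):
    return TOKEN.findall(text)


def fold_blocks(tokens):
    """Replace each outermost '{' ... '}' region with a single Block node."""
    out, depth, current = [], 0, None
    for tok in tokens:
        if tok == "{":
            depth += 1
            if depth == 1:
                current = []          # start collecting a new block
                continue
        elif tok == "}":
            depth -= 1
            if depth < 0:
                raise SyntaxError("unbalanced '}'")
            if depth == 0:
                out.append(Block(current))
                current = None
                continue
        # outside any block the token goes to the result, otherwise into the block
        (out if depth == 0 else current).append(tok)
    if depth != 0:
        raise SyntaxError("unbalanced '{'")
    return out
```

Calling fold_blocks again on the tokens of a Block peels off exactly one more level, which is the behaviour described above.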

The cycle of parsing and semantic checking proceeds recursively.

Let us walk through a relatively simple example:
c-ASNSequence1 ASNSequenceType ::= { field1 1, field2 { field1 1, field2 "akarmi"}}

ASNSequenceType ::= SEQUENCE {
	field1 INTEGER,
	field2 ASNSequenceType2}
ASNSequenceType2 ::= SEQUENCE {
	field1 INTEGER,
	field2 GeneralString}

/* we have 2 record types (one used as a field in the other) and define a constant of this type */

Step 1:
The ASN.1 module is parsed.
This finds 3 assignments: c-ASNSequence1, ASNSequenceType and ASNSequenceType2.
Each of them has a block that has not been parsed yet.
At this point the semantic checking can only make sure that each assignment name is unique, so that we can reference them ... but not much more.
Then we start the semantic checking of the assignments.

Step 2:
The first assignment to be processed is c-ASNSequence1.
It is represented in the Designer by the Undefined_Assignment_O_or_V class as at this point it is not possible to know if it is a value or an object.
To classify it we check what kind of setting it might be referencing and if it can be a value based on its name.
As its name starts with a lowercase letter and the reference refers to a type setting, we can deduce that we have a value that is described by a block that we don't know anything about yet.
So we set the realAssignment field (of this Undefined_Assignment_O_or_V instance) to be a Value_Assignment that is parameterized with an Undefined_Block_Value class, itself parameterized with the not yet processed block.
Before proceeding with the semantic analysis of this Value_Assignment, its type must be processed first (so that we can check the value against it).
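
The classification in this step can be pictured with the small Python sketch below. It is only an illustration under simplifying assumptions, not the Designer's code: the UndefinedAssignment dataclass, the classify function and the settings lookup table (mapping a defined name to "type" or "object class") are hypothetical stand-ins for Undefined_Assignment_O_or_V and its real reference resolution.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UndefinedAssignment:
    name: str                        # e.g. "c-ASNSequence1"
    reference: str                   # name of the referenced setting
    block: list                      # tokens of the not yet processed block
    real_kind: Optional[str] = None  # filled in by classify()


def classify(assignment, settings):
    """Decide whether the assignment is a value or an object (hypothetical helper)."""
    if not assignment.name[0].islower():
        raise ValueError("values and objects must start with a lowercase letter")
    kind = settings.get(assignment.reference)
    if kind == "type":
        assignment.real_kind = "value"    # a value, still described by an unparsed block
    elif kind == "object class":
        assignment.real_kind = "object"   # an Information Object
    else:
        raise LookupError(f"cannot resolve reference {assignment.reference!r}")
    return assignment.real_kind
```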

Step 3:
/* please note that this step and some further ones that check the type are in the implementation interwoven with the previous one (and done when we classify the assignment) ... I'm only separating it to simplify the explanation */
We have an ASN1_Sequence_Type with a block not yet processed, so we parse it (using parseBlockSequence) into a list of component fields (which might themselves still have unprocessed blocks at this time) and check these components.
At this point we can only check that the names of the parsed component fields are unique.
We start checking the parsed component fields themselves.

Step 4:
The first component field defines an INTEGER field, so it can be semantically checked simply (it also has no unprocessed blocks).
The second field however refers to ASNSequenceType2, which needs to be checked before proceeding.
So the type checking part of Step 3 and Step 4 (this step) is applied to it.
/* this is the recursive part */
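
The recursion over the types, together with the rule that each definition is checked only once per timestamp, might be sketched like this in Python. It is a toy model and not Titan's code: SequenceType, check_type, the BUILTIN set and the log list are names invented for the illustration, and the component fields are assumed to be parsed already.

```python
class SequenceType:
    def __init__(self, name, fields):
        self.name = name
        self.fields = fields        # list of (field name, type name) pairs
        self.last_checked = None    # timestamp of the last semantic check


BUILTIN = {"INTEGER", "GeneralString"}


def check_type(name, types, timestamp, log):
    """Check a type and, recursively, the types of its fields (once per timestamp)."""
    if name in BUILTIN:
        return
    current = types[name]
    if current.last_checked == timestamp:
        return                      # already checked in this round
    current.last_checked = timestamp
    log.append(name)
    names = [field_name for field_name, _ in current.fields]
    if len(names) != len(set(names)):
        raise ValueError(f"duplicate field name in {name}")
    for _, field_type in current.fields:
        check_type(field_type, types, timestamp, log)
```

Setting the timestamp before descending into the fields also keeps this sketch from looping forever on types that refer to each other.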

Step n:
After both ASNSequenceType and ASNSequenceType2 have been processed the recursion is resolved to the point where the Value_Assignment built from c-ASNSequence1 is now checked against the types.
/*as the semantic checking is optimized to check each assignment/definition only once per timestamp we don't have to worry about re-checking the already processed types again. */
At this point we still don't have a 'real' value to work with, only a block wrapped into the Undefined_Block_Value class.
So when the ASN1_Sequence_Type class tries to check it, it is first converted into a sequence value (in the UNDEFINED_BLOCK branch of the checkThisValue function of ASN1_Sequence_Type).
This triggers the parsing of the block using the parseBlockSequenceValue function of the Undefined_Block_Value class (please note: only the outermost block at this time).
Then the fields of this newly produced value are checked.

Step n+1:
The first field is simple to check as it was parsed to be an integer, so it only needs to be checked against ASNSequenceType's first field (which is an integer too).
However, the second field contained a block and so was parsed into an Undefined_Block_Value.
To process it we need to repeat the recursive block parsing shown in Step n and in this step.
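
Steps n and n+1 can be summarised with the following Python sketch of "parse one level, check, recurse". It is a simplified model, not the Titan code: UndefinedBlockValue, parse_sequence_value, check_value and the TYPES table are assumptions standing in for Undefined_Block_Value, parseBlockSequenceValue and checkThisValue, and the types are taken as already checked.

```python
from dataclasses import dataclass


@dataclass
class UndefinedBlockValue:
    tokens: list   # raw tokens between '{' and '}', braces excluded


def parse_sequence_value(tokens):
    """Parse only the outer level; a nested '{...}' becomes a new UndefinedBlockValue."""
    fields, i = {}, 0
    while i < len(tokens):
        name = tokens[i]
        i += 1
        if tokens[i] == "{":
            depth, start = 1, i + 1
            i += 1
            while depth:
                depth += {"{": 1, "}": -1}.get(tokens[i], 0)
                i += 1
            fields[name] = UndefinedBlockValue(tokens[start:i - 1])
        else:
            fields[name] = tokens[i]
            i += 1
        if i < len(tokens):
            if tokens[i] != ",":
                raise SyntaxError(f"expected ',' but found {tokens[i]!r}")
            i += 1
    return fields


# The two types of the example, assumed to be already parsed and checked.
TYPES = {
    "ASNSequenceType": [("field1", "INTEGER"), ("field2", "ASNSequenceType2")],
    "ASNSequenceType2": [("field1", "INTEGER"), ("field2", "GeneralString")],
}


def check_value(value, type_name):
    """Check a value against a type, parsing blocks lazily, one level at a time."""
    if type_name in ("INTEGER", "GeneralString"):
        if not isinstance(value, str):
            raise TypeError(f"{type_name} value expected")
        if type_name == "INTEGER":
            return int(value)
        if not (value.startswith('"') and value.endswith('"')):
            raise TypeError("string value expected")
        return value[1:-1]
    if not isinstance(value, UndefinedBlockValue):
        raise TypeError(f"block expected for {type_name}")
    parsed = parse_sequence_value(value.tokens)   # only this level is parsed now
    expected = TYPES[type_name]
    if list(parsed) != [name for name, _ in expected]:
        raise TypeError(f"fields do not match {type_name}")
    return {name: check_value(parsed[name], field_type) for name, field_type in expected}
```

The inner block of field2 is only parsed when check_value reaches it with ASNSequenceType2 as the governing type.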


After this example, showing how Information Object Classes and Information Objects are processed is relatively easy, as it is "just" a specific case.

For example:

getCustomersNum OPERATION ::=
    OPCODE IS SOMETHING LIKE get-customers-num-op-type-code
    I HEARBY REQUEST THESE ARGUMENTS Get-customers-num-req-pars-type
    I BELIEVE Get-customers-num-ind-pars-type IS A GOOD RESPONSE ARGUMENT
    ERRORS WERE MADE FOR EXAMPLE { wrong-product | wrong-department }


    &opcode INTEGER UNIQUE,
    &ExceptionList ERROR OPTIONAL

/* the example is based on ROSE, modified to be a fun reward for those who have read so far ;) */
In the above example the "OPERATION" Information Object class has a "WITH SYNTAX" part.
In this part "OPERATION" describes a syntax that can be used to describe Information Objects of this Information Object Class.

During the semantic checking of an Information Object Class (after checking the field specifications) Titan checks whether the withSyntaxBlock field is not null (indicating that it contains the tokens of a 'WITH SYNTAX' block).
To build the parser based on a with syntax block Titan uses the ObjectClassSyntax_Builder class.
This will be essentially a simple tree, even the building of which is done by a visitor.
At each block level it:
- first invokes the "pr_special_ObjectClassSyntax_Builder" function/rule of the ASN.1 parser to build a list of ObjectClassSyntax_Node objects that describe the syntax found.
- adds these nodes to the current level in the syntax tree.
- if during the parsing it finds a new block: a new level is added to the tree and a new builder object is created to process the tokens in the block into the tree elements on this level.
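
As a rough picture of such a syntax tree, here is a small Python sketch. It does not reproduce ObjectClassSyntax_Builder: the Literal, Setting and Group classes and the build_syntax function are invented for the illustration, a stack is used instead of creating a new builder object per level, and the nested levels are assumed to be the optional groups that ASN.1 writes in square brackets inside WITH SYNTAX. The syntax used in the test is a shortened, hypothetical one.

```python
from dataclasses import dataclass, field


@dataclass
class Literal:
    word: str            # a fixed word that must appear in the object definition


@dataclass
class Setting:
    field_name: str      # a class field to be filled in, e.g. "&opcode"


@dataclass
class Group:
    nodes: list = field(default_factory=list)
    optional: bool = False


def build_syntax(tokens):
    """Turn the tokens of a WITH SYNTAX block into a tree of nodes."""
    root = Group()
    stack = [root]
    for tok in tokens:
        if tok == "[":
            group = Group(optional=True)   # a new level in the tree
            stack[-1].nodes.append(group)
            stack.append(group)
        elif tok == "]":
            if len(stack) == 1:
                raise SyntaxError("unbalanced ']'")
            stack.pop()
        elif tok.startswith("&"):
            stack[-1].nodes.append(Setting(tok))
        else:
            stack[-1].nodes.append(Literal(tok))
    if len(stack) != 1:
        raise SyntaxError("unbalanced '['")
    return root
```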

When the semantic checking encounters an Information Object (Object_Definition):
- it first determines and checks the governing Information Object Class (thus also creating the special parser for the special syntax if needed).
- it then parses the block of the Information Object (which has not been processed so far):
  - an ObjectClassSyntax_Parser object is created.
  - this parser object traverses the syntax tree and the tokens found in the block at the same time, checking that all literals are in the expected order and reading out the field settings.
- finally it analyzes the recovered field settings.
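
The simultaneous traversal of the syntax tree and the tokens might look like the Python sketch below. Again this is an assumption-laden illustration rather than ObjectClassSyntax_Parser itself: the syntax is given as nested (kind, payload) tuples, every setting is assumed to occupy a single token, and the shortened SYNTAX is hypothetical. An optional group is entered only when the next token equals its leading literal, so there is no backtracking.

```python
def parse_object(syntax, tokens):
    """Walk the syntax nodes and the tokens together, collecting field settings."""
    settings = {}
    pos = _match(syntax, tokens, 0, settings)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return settings


def _match(nodes, tokens, pos, settings):
    for kind, payload in nodes:
        if kind == "lit":
            # a literal must appear exactly where the syntax expects it
            if pos >= len(tokens) or tokens[pos] != payload:
                raise SyntaxError(f"expected literal {payload!r}")
            pos += 1
        elif kind == "set":
            if pos >= len(tokens):
                raise SyntaxError(f"missing setting for {payload}")
            settings[payload] = tokens[pos]   # read out the field setting
            pos += 1
        elif kind == "opt":
            first_kind, first_payload = payload[0]
            if pos < len(tokens) and first_kind == "lit" and tokens[pos] == first_payload:
                pos = _match(payload, tokens, pos, settings)
        else:
            raise ValueError(f"unknown node kind {kind!r}")
    return pos


# Hypothetical, shortened syntax in the spirit of the OPERATION example.
SYNTAX = [
    ("lit", "OPCODE"), ("lit", "IS"), ("set", "&opcode"),
    ("opt", [("lit", "ERRORS"), ("set", "&ExceptionList")]),
]
```

The recovered settings (a plain dictionary here) are what would then be analyzed against the field specifications of the class.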


Please note 1:
This really makes it possible to process ASN.1 files.
On each level the parsers only process the tokens representing the actual level ... never going one step further.
This way a parser cannot encounter syntax it is not able to process (unless it is syntactically incorrect).

The user defined syntax in "getCustomersNum" is only encountered when the Information Object is checked against its Information Object Class, using a special parser built for this purpose.
The general parser will simply 'skip over' that block while first processing the module.

Please note 2:
When the special parser is built from the 'WITH SYNTAX' block of an Information Object Class, Titan uses the ANTLR parser it has for ASN.1 files.
But the dynamically built parser of Information Objects is very simple.
This parser either finds the expected literals in the right places or it does not; it will not do complex LL/LR processing or apply recovery strategies.