N4JS Language Specification

1. Introduction

This specification defines the N4JS language.

In general, the N4JS JavaScript dialect used is identical to the standard ECMAScript as defined in the 6th edition of ECMA-262, also known as ECMAScript 2015, referred to as [ECMA15a].

1.1. Notation

1.1.1. Grammar Notation

For the specification of the syntax and structure of elements, we use a slightly augmented similar to the grammar language of Xtext Grammar Language.

Similar to [ECMA11a], we define types with properties only for the purpose of explanation and usage within this specification. We use the Xtext notation style to assign values to meta-properties. Particularly, we use the Xtext notation for collection (+=) and boolean (?=) values. These properties are written in italics. Enumerations are defined similar to Xtext. In order to allow the specification of default values, which are often defined by omitting the value, we always define the literal explicitly if it can be defined by the user.

The following lists informally defines the grammar:

Terminal: Terminals (or terminal strings) are enclosed in single quotes, e.g., terminal.
Enumerations: Rules which contain only terminals used as values for properties are marked with enum for enumeration.
Properties: Values of non-terminals, e.g., other rules, can be assigned to properties. The property name and the assignment are not part of the original syntax and only used for the meta description. E.g., name=Identifier.
Collection Properties: If a property is a collection, values are added to that list via +=. E.g.,property+=Value .
Boolean Properties: Boolean properties are set to false by default, if the value (usually a terminal) is found, the boolean value is set to true. Often, the name of the property is similar to the terminal. E.g., final?='final'?.

Properties of a non-terminal are sometimes listed again below the grammar. In that case, often pseudo properties are introduced which are derived from other properties and which are only used for simplification.

1.1.2. Type Judgments and Rules and Constraints Notation

1.1.2.1. Typing Rules and Judgments

Definition: Rule

We use the common notation for rules such as type inference rules ^[1], that is

\frac{p r e m i s e s}{c o n c l u s i o n} rule name

$p r e m i s e s$ is the rule’s premises (e.g., the expression to be inferred), $c o n c l u s i o n$ the result of the rule. $r u l e n a m e$ is an optional condition which may be omitted.

Both parts of the rule may contain multiple expressions, which are concatenated via 'and'.

For example, the following

\frac{P_{1} P_{2} P_{3}}{C}

can be read as

if $P_{1}$ , $P_{2}$ , and $P_{3}$ are all true, then $C$ is true as well.

The following judgments (with relation symbols) are used:

subtype $<$: -
type $:$: in which the left hand side is a declaration or expression, and the right hand side a type. We also use $[[. . .]]$ as a function returning the (inferred) type of an expression.
expectedTypeIn $⊲ :$: a relation with three arguments: $c o n t a i n e r ⊲ e x p r e s s i o n : t y p e$ means, that $e x p r e s s i o n$ is expected to be a subtype of $t y p e$ inside $c o n t a i n e r$

The following statement, for example, defines transitivity of subtypes (in a simplified manner):

\frac{Γ ⊢ B < : A Γ ⊢ C < : B}{Γ ⊢ C < : A}

$⊢$ is the context containing (bound) type variables etc., $⊢$ can be read as entails. Thus, the rule can be read as follows:

if the type B is a subtype of type A in context $Γ$ (i.e. with constraints on type variables specified in $Γ$ ), and if type C is a subtype of B, then C is also a subtype of A in context $Γ$ .

In rules, we sometimes omit the environment if it is not needed. New information is sometimes added to the environment, in particular, substitutions (that is binding type variables to a type). The set of substitutions is written with $θ$ (theta). If new substitutions are explicitly added to that set, we write $θ (V \leftarrow T)$ ( $V$ is substituted with type $T$ ). Often, these bindings are computed from a parameterized type reference which declares type arguments which are bound to the type variables of the generic declaration. In this case we simply write $θ (p)$ , in which $p$ is the parameterized type declaration. As these new substitutions must become part of a (newly) created environment, we then usually write $Γ \leftarrow θ (p)$ . These substitutions are usually omitted.

1.1.2.2. Types of an Element

A variable or other typed element may be associated with three types:

Declared type: the type explicitly specified in the code, e.g., var s: string.
Inferred type: the type inferred by the type inferencer, e.g., var s = "Hello" infers the type of s to string. I.e. $Γ ⊢ s : s t r i n g$ will be true, or $[[s]] < : s t r i n g$ . If an element is annotated with a type ,i.e. it has a declared type, the inferred type will always be the declared type.
Actual type: the actual type of a variable during runtime. This type information is not available at compile time and ignored in this specification.

These types are not type declarations but type references, in fact, as they may be parameterized. For the sake of simplicity, we often omit the $R e f$ suffix to shorten formulas. Consequently, we define the following properties and pseudo properties for typed elements such as variables:

declaredType $R e f$: The explicitly declared type, this is usually a real property of the construct. Not all elements allow the specification of a declared type, such as expressions.
inferredType $R e f$ or $[[. . .]]$: This pseudo property is the inferred type computed by the type inferencer.
type $R e f$: A pseudo property for elements with a $d e c l a r e d T y p e$ property. It is similar to the inferred type, i.e. $e . t y p e = [[e]]$

1.2. Auxiliary Functions

This section describes some auxiliary functions required for definition of type inference rules later on.

1.2.1. Binding

Binding an identifier (variable reference) to a variable declaration (or variable definition) is not part of this specification as this is standard ECMAScript functionality. However, some valid ECMAScript bindings are permitted due to visibility constraints.

Definition: Binding Relation

We define a pseudo relation

b i n d : V a r i a b l e R e f e r e n c e \times V a r i a b l e D e c l a r a t i o n

which binds a reference, i.e. an identifier, to a declaration (e.g.,variable declaration).

Binding of variable references to declaration is defined by ECMAScript already. Type references only occur in type expressions, how these are handled is explained in Type Expressions.

We usually omit this binding mechanism in most rules and use the reference similarly to the declaration or definition it is bound to. If a variable reference $r$ , for example, is bound to a variable declaration $D$ , i.e. $b i n d (r, D)$ , we simply write $r . t y p e$ instead of $b i n d (r, D), D . t y p e$ to refer to the type expression (of the variable).^[2]

A DeclaredType references the type declaration by its simple name that has been imported from a module specifier. We define the method $b i n d$ for declared types as well:

Definition: Binding Relation of Types

We define a pseudo relation

b i n d : D e c l a r e d T y p e \times C l a s s | I n t e r f a c e | E n u m

which binds a type reference, i.e. a simple name, to the type declaration.

1.2.2. Merging Types

In some cases we have to merge types, e.g., types of a union type or item types of an array. For that purpose, we define a method $m e r g e$ as follows.

Definition: Merge Function

We define a pseudo function

m e r g e : T y p e \times . . . \times T y p e \Rightarrow P (T y p e)

The idea of this function is to remove duplicates. For example; if a union type contains two type expressions $t e_{1}$ and $t e_{k}$ , and if $τ (t e_{1}) = τ (t e_{2})$ , then $m e r g e (τ (t e_{1}), τ (t e_{2}))$ contains only one element. The order of the elements is lost, however.

1.2.2.1. Logic Formulas

In general, we use a pragmatic mixture of pseudo code, predicate logic, and OCL. Within constraints (also within the inference rules), the properties defined in the grammar are used.

In some rules, it is necessary to type the rule variables. Instead of explicitly checking the metatype (via $μ (X) = : M e t a T y p e$ ), we precede the variable with the type, that is: $M e t a T y p e X$ .

Instead of "type casting" elements, often properties are simply accessed. If an element does not define that element, it is either assumed to be false or null by default.

If a property $p$ is optional and not set, we write $p = n u l l$ to test its absence. Note that $p = n u l l$ is different from $p = N u l l$ , as the latter refers to the null type. Non-terminals may implicitly be subclasses. In that case, the concrete non-terminal, or type, of a property may be subject for a test in a constraint.

1.2.3. Symbols and Font Convention

Variables and their properties are printed in italic when used in formulas (such as rules). A dot-notation is used for member access, e.g. $v . n a m e$ . Also defined functions are printed in italic, e.g., $a c c (r, D)$ . Properties which define sets are usually ordered and we assume 0-indexed access to elements, the index subscripted, e.g., $v . m e t h o d s_{i}$ .

We use the following symbols and font conventions:

$\land$ , $\lor$ , $\oplus$ , $\neg$: Logical and, or, exclusive or (xor), and not.
$\Rightarrow$ , $\Leftrightarrow$ , $if . . .$ , $then . . .$ , $else . . .$: Logical implication, if and only if, and if-then-else.
$true$ , $false$ , $null$ , $\emptyset$: Boolean true, boolean false, null (i.e., not specified, e.g., $v . s u p =$ means that there are is no $s u p$ (super class) specified), empty set.
$\in$ , $\notin$ , $\cup$ , $\cap$ , $| x |$: Element of, not an element of, union set, intersection set, cardinality of set x.
$P (X)$: Power set of $X$ , i.e. $P (X) = \{U : U \subseteq X\}$ .
$\exists$ , $∄$ , $\forall$: Exists, not exists, for all; we write $\exists x, . . ., z : P (x, . . ., z)$ and say

"there exists $x, . . ., z$ such that predicate $P$ is true".

Note that $∄ x : P (x) \Leftrightarrow \forall x : \neg P (x)$ .
$μ (. . .)$: (mu) read "metatype of"; metatype of a variable or property, e.g.,

$if . . . μ (x) = : C l a s s then . . . else . . .$
$\bar{x}$: Sequence of elements $x_{1}, . . ., x_{n}$ . E.g., if we want to define a constraint that the owner of a members of a class $C$ is the class, we simply write

$C . \bar{m e m b e r s} . o w n e r = C$

instead of

$\forall m \in C . m e m b e r s : m . o w n e r = C$

or even more complicated with index variables.

Sequences are 1-based, e.g., a sequence $s$ with length $| s | = n$ , has elements $s_{1}, . . ., s_{n}$ .

N4JS Language Specification

Quick Links