4.10 UL syntax

In this section, we describe the syntax of User Language (UL) grammar files in Extended Backus-Naur Form (EBNF), as defined in the W3C XML specification (see http://www.w3.org/TR/REC-xml#sec-notation).

4.10.1 UL lexical syntax

In this section, we lay out the lexical syntax of the UL.

4.10.1.1 Keywords

Here are the keywords of the UL:

     <keyword> ::= args | attrs |
                   bool | bot |
                   card |
                   defattrstype | defclass | defdim | defentry |
                   defentrytype | defgrammar | deflabeltype | deftype |
                   dim | dims |
                   entry |
                   infty | int | ints | iset |
                   label | list |
                   vec |
                   output |
                   ref |
                   set | string |
                   top | tuple | tv |
                   useclass | usedim | useoutput | useprinciple |
                   valency

4.10.1.2 Operators

Here are the operators of the UL:

     <operator> ::= { | } | ( | ) | * | & | '|' | @ | [ | ] | < | > |
                    $ | . | :: | _ | ^ | ! | ? | + | # | :

4.10.1.3 Identifiers

Identifiers consist of letters and the underscore:

     <id> ::= [a-zA-Z_]+

4.10.1.4 Integers

Integers consist of one or more decimal digits:

     <int> ::= [0-9]+

4.10.1.5 Strings

Strings can be quoted using single quotes (<sstring>), double quotes (<dstring>), or guillemet quotes (<gstring>). You can freely choose between the different kinds of quotes. Inside the quotes, you can write strings using any characters from the ISO 8859-1 character set. We write . for “any character from the ISO 8859-1 character set”:

     <sstring> ::= '.+'
     
     <dstring> ::= ".+"
     
     <gstring> ::= «.+»

4.10.1.6 End of line comments

End of line comments are written using the percent symbol %.

4.10.1.7 Balanced comments

Balanced comments start with /* and end with */.
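
The lexical rules of this section can be collected into a small tokenizer. The following Python sketch is only an illustration and is not the lexer used by the compiler. The names KEYWORDS, TOKEN_RE, and tokenize are hypothetical, a string is assumed not to contain its own closing quote, and balanced comments are assumed not to nest:

```python
import re

# Keywords as listed in the Keywords section above.
KEYWORDS = set("""
    args attrs bool bot card defattrstype defclass defdim defentry
    defentrytype defgrammar deflabeltype deftype dim dims entry infty
    int ints iset label list vec output ref set string top tuple tv
    useclass usedim useoutput useprinciple valency
""".split())

# Hypothetical token table. Order matters: comments and strings come first.
# Because <id> admits the underscore, a lone "_" is reported as an id here.
TOKEN_RE = re.compile(r"""
    (?P<comment>  %[^\n]* | /\*.*?\*/ )   # end-of-line and balanced comments
  | (?P<sstring>  '[^']+' )
  | (?P<dstring>  "[^"]+" )
  | (?P<gstring>  «[^»]+» )
  | (?P<int>      [0-9]+ )
  | (?P<id>       [a-zA-Z_]+ )
  | (?P<operator> :: | [{}()*&|@\[\]<>$._^!?+\#:] )
  | (?P<space>    \s+ )
""", re.VERBOSE | re.DOTALL)


def tokenize(text):
    """Return a list of (kind, lexeme) pairs, dropping comments and whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind = match.lastgroup
        if kind not in ("comment", "space"):
            if kind == "id" and match.group() in KEYWORDS:
                kind = "keyword"
            tokens.append((kind, match.group()))
        pos = match.end()
    return tokens
```

For instance, tokenize('deftype "syn.label" {subj obj}') yields a keyword, a double-quoted string, and the operator and identifier tokens of the finite domain.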

4.10.1.8 Includes

Files can be included using the \input directive. For example, to include the file Chorus_header.ul, you write:

     \input "Chorus_header.ul"
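
To illustrate what textual inclusion amounts to, here is a minimal Python sketch that splices included files into the including file. It rests on assumptions that this manual does not state: that the directive is spelled \input, that the file name is double-quoted, and that it is resolved relative to the directory of the including file. The names INPUT_RE and expand_inputs are hypothetical:

```python
import re
from pathlib import Path

# Assumed form of the directive: \input "file name"
INPUT_RE = re.compile(r'\\input\s+"([^"]+)"')


def expand_inputs(path, _active=()):
    """Return the text of `path` with every \\input directive replaced
    by the (recursively expanded) text of the named file."""
    path = Path(path).resolve()
    if path in _active:
        raise ValueError(f"circular \\input of {path}")
    text = path.read_text(encoding="latin-1")  # ISO 8859-1, as for strings

    def replace(match):
        # Assumption: names are relative to the including file's directory.
        included = path.parent / match.group(1)
        return expand_inputs(included, _active + (path,))

    return INPUT_RE.sub(replace, text)
```

The sketch does not skip directives that occur inside comments or strings; a real implementation would presumably handle inclusion in the lexer.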

4.10.2 UL context-free syntax

In this section, we lay out the context-free syntax of the UL. We write all keywords in lower case, and all non-terminals with an initial upper case letter. We use single quotes to escape the meta characters (, ), [, ], ?, *, +, #, |, and the dot (.).

4.10.2.1 Start symbol (S)

The start symbol of our context-free grammar is S:

     S ::= Defgrammar*

4.10.2.2 Grammar definitions (Defgrammar)

Here is the UL Syntax for grammar definitions:

     Defgrammar ::= defdim Constant { Defdim* }
                |   defclass Constant Constant* { Class }
                |   defentry { Class* }
                |   usedim Constant

defdim Constant { Defdim* } defines a dimension with identifier Constant, and dimension definitions Defdim*.

defclass Constant Constant* { Class } defines a lexical class with identifier Constant, class variables Constant*, and class body Class.

defentry { Class* } defines a lexical entry defined by class bodies Class*.

usedim Constant uses the dimension with identifier Constant.

4.10.2.3 Dimension definitions (Defdim)

Here is the UL syntax for dimension definitions:

     Defdim ::= defattrstype Type
            |   defentrytype Type
            |   deflabeltype Type
            |   deftype Constant Type
            |   useprinciple Constant { Useprinciple* }
            |   output Constant
            |   useoutput Constant

defattrstype Type defines the attributes type Type.

defentrytype Type defines the entry type Type.

deflabeltype Type defines the label type Type.

deftype Constant Type defines the type Type with identifier Constant.

useprinciple Constant { Useprinciple* } uses the principle with identifier Constant and dimension and argument mappings Useprinciple*.

output Constant chooses output Constant.

useoutput Constant uses output Constant.

4.10.2.4 Principle use instructions (Useprinciple)

Here is the UL syntax for principle use instructions:

     Useprinciple ::= dims { VarTermFeat* }
                  |   args { VarTermFeat* }

dims { VarTermFeat* } is the dimension mapping VarTermFeat*.

args { VarTermFeat* } is the argument mapping VarTermFeat*.

4.10.2.5 Types (Type)

This is the UL syntax of types:

     Type ::= { Constant* }
          |   set '(' Type ')'
          |   iset '(' Type ')'
          |   tuple '(' Type* ')'
          |   list '(' Type ')'
          |   valency '(' Type ')'
          |   { TypeFeat+ }
          |   { : }
          |   vec '(' Type_1 Type_2 ')'
          |   card
          |   int
          |   ints
          |   string
          |   bool
          |   ref '(' Constant ')'
          |   Constant
          |   label '(' Constant ')'
          |   tv '(' Constant ')'
          |   '(' Type ')'

{ Constant* } is a finite domain consisting of the constants Constant*.

set '(' Type ')' is an accumulative set with domain Type.

iset '(' Type ')' is an intersective set with domain Type.

tuple '(' Type* ')' is a tuple with projections Type*.

list '(' Type ')' is a list with domain Type.

valency '(' Type ')' is a valency with domain Type.

{ TypeFeat+ } is a record with features TypeFeat+.

{ : } is the empty record.

vec '(' Type_1 Type_2 ')' is a vector with fields Type_1 and values of type Type_2.

card is a cardinality set.

int is an integer.

ints is a set of integers.

string is a string.

bool is a boolean.

ref '(' Constant ')' is a type reference to the type with identifier Constant.

Constant is a shortcut for ref '(' Constant ')'.

label '(' Constant ')' is a label reference to the label type on the dimension referred to by dimension variable Constant.

tv '(' Constant ')' is a type variable.

'(' Type ')' encapsulates type Type.
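
The type grammar above lends itself to a recursive-descent parser. The following Python sketch is one possible reading of the rules, not the compiler's implementation. The names TOKEN, expect, and parse_type and the tuple-based tree shape are hypothetical. The simplified tokenizer silently skips characters it does not know, and a brace is read as a record when its second token is a colon and as a finite domain otherwise:

```python
import re

# Simplified tokens: quoted strings, identifiers, and the punctuation of types.
TOKEN = re.compile(r"""'[^']+'|"[^"]+"|«[^»]+»|[a-zA-Z_]+|[{}():]""")

ATOMS = {"card", "int", "ints", "string", "bool"}
UNARY = {"set", "iset", "list", "valency"}
NAMED = {"ref", "label", "tv"}


def expect(tokens, symbol):
    """Consume `symbol` from the front of `tokens` or fail."""
    if not tokens or tokens[0] != symbol:
        raise SyntaxError(f"expected {symbol!r}")
    return tokens[1:]


def parse_type(tokens):
    """Parse one Type from the front of `tokens`; return (tree, rest)."""
    if not tokens or tokens[0] in (")", "}", ":"):
        raise SyntaxError("type expected")
    head, rest = tokens[0], tokens[1:]
    if head == "{":
        if rest[:2] == [":", "}"]:                    # { : }
            return ("record", {}), rest[2:]
        if len(rest) > 1 and rest[1] == ":":          # { TypeFeat+ }
            fields = {}
            while rest and rest[0] != "}":
                name = rest[0]
                value, rest = parse_type(expect(rest[1:], ":"))
                fields[name] = value
            return ("record", fields), expect(rest, "}")
        end = rest.index("}")                         # { Constant* }
        return ("domain", rest[:end]), rest[end + 1:]
    if head == "(":                                   # '(' Type ')'
        inner, rest = parse_type(rest)
        return inner, expect(rest, ")")
    if head in ATOMS:
        return (head,), rest
    if head in UNARY:                                 # set, iset, list, valency
        inner, rest = parse_type(expect(rest, "("))
        return (head, inner), expect(rest, ")")
    if head in ("tuple", "vec"):
        rest = expect(rest, "(")
        parts = []
        while rest and rest[0] != ")":
            part, rest = parse_type(rest)
            parts.append(part)
        return (head, *parts), expect(rest, ")")
    if head in NAMED:                                 # ref, label, tv
        rest = expect(rest, "(")
        return (head, rest[0]), expect(rest[1:], ")")
    return ("ref", head), rest                        # Constant = ref(Constant)
```

For example, parse_type(TOKEN.findall("{ in: valency(label(D)) out: valency(label(D)) }")) returns a record tree with the two fields in and out, each a valency over the label type of dimension variable D.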

4.10.2.6 Class bodies (Class)

Here is the UL syntax of a lexical class body:

     Class ::= dim Constant Term
           |   useclass Constant
           |   useclass Constant { VarTermFeat* }
           |   Constant
           |   Constant { VarTermFeat* }
           |   Class_1 & Class_2
           |   Class_1 '|' Class_2
           |   '(' Class ')'

dim Constant Term defines the entry Term for the dimension with identifier Constant.

useclass Constant uses the lexical class with identifier Constant.

Constant is a shortcut for useclass Constant.

useclass Constant { VarTermFeat* } uses the lexical class with identifier Constant and class parameters VarTermFeat*.

Constant { VarTermFeat* } is a shortcut for useclass Constant { VarTermFeat* }.

Class_1 & Class_2 is the conjunction of Class_1 and Class_2.

Class_1 '|' Class_2 is the disjunction of Class_1 and Class_2.

'(' Class ')' brackets class Class.

4.10.2.7 Terms (Term)

Here is the UL syntax of terms:

     Term ::= Constant
          |   Integer
          |   top
          |   bot
          |   Featurepath
          |   CardFeat
          |   { Term* }
          |   '[' Term* ']'
          |   { Recspec+ }
          |   { : }
          |   $ Setgen
          |   $ '(' ')'
          |   Term :: Type
          |   Term_1 & Term_2
          |   Term_1 '|' Term_2
          |   Term_1 @ Term_2
          |   '<' Term* '>'
          |   '(' Term ')'

Constant is a constant.

Integer is an integer.

top is lattice top.

bot is lattice bottom.

Featurepath is a feature path.

CardFeat is a cardinality specification.

{ Term* } is a set of the elements Term*.

'[' Term* ']' is a list of the elements Term* (in this order).

{ Recspec+ } is a record with specification Recspec+.

{ : } is the empty record.

$ Setgen introduces a set generator expression with body Setgen.

$ '(' ')' is the empty set generator expression.

Term :: Type is a type annotation of term Term with type Type.

Term_1 & Term_2 is the conjunction of Term_1 and Term_2.

Term_1 '|' Term_2 is the disjunction of Term_1 and Term_2.

Term_1 @ Term_2 is the concatenation of Term_1 and Term_2. Concatenation is restricted to strings.

'<' Term* '>' is an order generator specification of a list of elements Term*.

'(' Term ')' brackets term Term.

4.10.2.8 Feature paths (Featurepath)

Here is the UL syntax of feature paths:

     Featurepath ::= Root '.' Constant '.' Aspect ('.' Constant)+
     
     Root ::= _ | ^
     
     Aspect ::= attrs | entry

Root '.' Constant '.' Aspect ('.' Constant)+ is a feature path with root variable Root, dimension variable Constant, aspect Aspect, and the non-empty list of fields ('.' Constant)+.
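
As an illustration, a feature path can be split into its four components with a single regular expression. This Python sketch makes a simplifying assumption: constants are restricted to identifiers (<id>), although the grammar also admits quoted strings. The names FEATUREPATH_RE and parse_featurepath are hypothetical:

```python
import re

# Root '.' Constant '.' Aspect ('.' Constant)+, with identifier constants only.
FEATUREPATH_RE = re.compile(
    r"(?P<root>[_^])\.(?P<dim>[a-zA-Z_]+)\.(?P<aspect>attrs|entry)"
    r"(?P<fields>(?:\.[a-zA-Z_]+)+)"
)


def parse_featurepath(text):
    """Split a feature path into root, dimension variable, aspect, and fields."""
    match = FEATUREPATH_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a feature path: {text!r}")
    return {
        "root": match["root"],
        "dim": match["dim"],
        "aspect": match["aspect"],
        # Drop the leading dot, then split the remaining fields.
        "fields": match["fields"].lstrip(".").split("."),
    }
```

A path without any field after the aspect, such as _.D.entry, is rejected because the grammar requires at least one field.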

4.10.2.9 Record specifications (Recspec)

Here is the UL syntax of record specifications:

     Recspec ::= TermFeat
             |   Recspec_1 & Recspec_2
             |   Recspec_1 '|' Recspec_2
             |   '(' Recspec ')'

TermFeat is a feature.

Recspec_1 & Recspec_2 is the conjunction of Recspec_1 and Recspec_2.

Recspec_1 '|' Recspec_2 is the disjunction of Recspec_1 and Recspec_2.

'(' Recspec ')' brackets record specification Recspec.

4.10.2.10 Set generator expression bodies (Setgen)

Here is the UL syntax of set generator expression bodies:

     Setgen ::= Constant
            |   Setgen_1 & Setgen_2
            |   Setgen_1 '|' Setgen_2
            |   '(' Setgen ')'

Constant is a constant.

Setgen_1 & Setgen_2 is the conjunction of Setgen_1 and Setgen_2.

Setgen_1 '|' Setgen_2 is the disjunction of Setgen_1 and Setgen_2.

'(' Setgen ')' brackets set generator expression body Setgen.

4.10.2.11 Constants (Constant)

Here is the UL syntax of constants:

     Constant ::= <id> | <sstring> | <dstring> | <gstring>

That is, a constant is either an identifier (<id>), a single quoted string (<sstring>), a double quoted string (<dstring>), or a guillemet quoted string (<gstring>).

4.10.2.12 Integers (Integer)

Here is the UL syntax of integers:

     Integer ::= <int> | infty

That is, an integer is either an integer literal (<int>) or the keyword infty, which stands for “infinity”.

4.10.2.13 Features (ConstantFeat, TermFeat, VarTermFeat, TypeFeat, and CardFeat)

Here is the UL syntax of features:

     ConstantFeat ::= Constant_1 : Constant_2
     
     TermFeat ::= Constant : Term
     
     VarTermFeat ::= Constant : Term
     
     TypeFeat ::= Constant : Type
     
     CardFeat ::= Constant Card

ConstantFeat is a feature with field Constant_1 and value Constant_2.

TermFeat and VarTermFeat are features with field Constant and value Term.

TypeFeat is a feature with field Constant and value Type.

CardFeat is a cardinality specification with field Constant and cardinality set Card.

4.10.2.14 Cardinality sets (Card)

Here is the UL syntax of cardinality sets:

     Card ::= !
          |   '?'
          |   '*'
          |   '+'
          |   '#' { Integer* }
          |   '#' '[' Integer_1 Integer_2 ']'

! is the cardinality set {1}.

'?' is the cardinality set {0,1}.

'*' is the cardinality set {0,...,infty} where infty means “infinity”.

'+' is the cardinality set {1,...,infty}.

'#' { Integer* } is the cardinality set including the integers Integer*.

'#' '[' Integer_1 Integer_2 ']' is the cardinality set including the closed interval between Integer_1 and Integer_2.
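
The cardinality sets above can be made concrete as membership tests. The following Python sketch is illustrative only. It reads ! as "exactly one", represents infty by math.inf, and returns a predicate rather than a set, since the sets for '*' and '+' are infinite. The names parse_integer and cardinality_set are hypothetical:

```python
import math
import re


def parse_integer(token):
    """Integer ::= <int> | infty"""
    return math.inf if token == "infty" else int(token)


def cardinality_set(card):
    """Return a predicate telling whether a count n lies in the set `card`."""
    card = card.strip()
    simple = {"!": (1, 1), "?": (0, 1), "*": (0, math.inf), "+": (1, math.inf)}
    if card in simple:
        lo, hi = simple[card]
        return lambda n: lo <= n <= hi
    match = re.fullmatch(r"#\s*\{([^}]*)\}", card)               # '#' { Integer* }
    if match:
        members = {parse_integer(tok) for tok in match.group(1).split()}
        return lambda n: n in members
    match = re.fullmatch(r"#\s*\[\s*(\w+)\s+(\w+)\s*\]", card)    # '#' '[' I_1 I_2 ']'
    if match:
        lo, hi = parse_integer(match.group(1)), parse_integer(match.group(2))
        return lambda n: lo <= n <= hi
    raise ValueError(f"not a cardinality set: {card!r}")
```

For example, cardinality_set("#[2 infty]") accepts every count of two or more, and cardinality_set("#{1 3}") accepts exactly the counts 1 and 3.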