4.11 XML syntax

In this section, we describe the syntax of the XML language by going through its Document Type Definition (DTD). To validate your own XML language files, we recommend using an XML validator such as the free rxp (available here: http://www.cogsci.ed.ac.uk/~richard/rxp.html). To understand the XML terminology, we recommend reading the W3C XML specification (available here: http://www.w3.org/TR/REC-xml).

Note that we provide an example grammar file in the XML language in Grammars/Acl01.xml. The grammar file defines exactly the same grammar as Grammars/Acl01.ul, so that it is easy to compare the two grammar file input languages.

4.11.1 Parameter entities

The DTD for the XML language begins with the definition of several parameter entities, followed by two external general entities:

     <!ENTITY % type "(typeDomain|typeSet|typeISet|typeTuple|typeList|
                       typeRecord|typeValency|typeCard|typeVec|
                       typeInt|typeInts|typeString|typeBool|typeRef|
                       typeLabelRef|typeVariable)">
     <!ENTITY % types "(%type;*)">
     <!ENTITY % term "(constant|variable|integer|top|bot|
                       constantCard|constantCardSet|constantCardInterval|
                       variableCard|variableCardSet|variableCardInterval|
                       set|list|record|setGen|featurePath|annotation|
                       conj|disj|concat|order|feature|varFeature)">
     <!ENTITY % terms "(%term;*)">
     <!ENTITY % class "(classDimension|useClass|classConj|classDisj)">
     <!ENTITY % classes "(%class;*)">
     <!ENTITY % recSpec "(feature|varFeature|recordConj|recordDisj)">
     <!ENTITY % recSpecs "(%recSpec;*)">
     <!ENTITY % setGenSpec "(constant|setGenConj|setGenDisj)">
     <!ENTITY % setGenSpecs "(%setGenSpec;*)">
     
     <!ENTITY principleDefs SYSTEM "../../Solver/Principles/principles.xml">
     <!ENTITY outputDefs SYSTEM "../../Outputs/outputs.xml">

The parameter entity type corresponds to the enumerated type encompassing the elements typeDomain, typeSet, typeISet, typeTuple, typeList, typeRecord, typeValency, typeCard, typeVec, typeInt, typeInts, typeString, typeBool, typeRef, typeLabelRef, and typeVariable.

The parameter entity types corresponds to zero or more occurrences of the parameter entity type.

The parameter entity term corresponds to the enumerated type encompassing the elements constant, variable, integer, top, bot, constantCard, constantCardSet, constantCardInterval, variableCard, variableCardSet, variableCardInterval, set, list, record, setGen, featurePath, annotation, conj, disj, concat, order, feature and varFeature.

The parameter entity terms corresponds to zero or more occurrences of the parameter entity term.

The parameter entity class corresponds to the enumerated type encompassing the elements classDimension, useClass, classConj, classDisj.

The parameter entity classes corresponds to zero or more occurrences of the parameter entity class.

The parameter entity recSpec corresponds to the enumerated type encompassing the elements feature, varFeature, recordConj, recordDisj.

The parameter entity recSpecs corresponds to zero or more occurrences of the parameter entity recSpec.

The parameter entity setGenSpec corresponds to the enumerated type encompassing the elements constant, setGenConj, setGenDisj.

The parameter entity setGenSpecs corresponds to zero or more occurrences of the parameter entity setGenSpec.

The general entity principleDefs (declared without %, so it is not a parameter entity) corresponds to the system identifier "../../Solver/Principles/principles.xml", an XML file which declares all available principle identifiers. Since XDK grammar files do not contain principle definitions but only principle uses, this is how the XML grammar file “knows” the principle identifiers which can be used. Note that you can adapt the path of the system identifier to a more suitable one (whether it is suitable will depend on your XML validator).

The general entity outputDefs corresponds to the system identifier "../../Outputs/outputs.xml", an XML file which declares all available output identifiers. Since XDK grammar files do not contain output definitions but only output uses, this is how the XML grammar file “knows” the output identifiers which can be used. As above, you can adapt the path of the system identifier to a more suitable one (whether it is suitable will depend on your XML validator).
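
As a sketch of how these two entities might be used, a grammar file can reference them at the start of the grammar element, so that the identifier declarations are expanded in place. The DTD location xdk.dtd below is a placeholder (an assumption, not a path given in this section), and the sketch assumes that the two external files contain nothing but principleDef and outputDef elements, respectively:

     <?xml version="1.0"?>
     <!DOCTYPE grammar SYSTEM "xdk.dtd">
     <grammar>
       <!-- expands to the principleDef elements of principles.xml -->
       &principleDefs;
       <!-- expands to the outputDef elements of outputs.xml -->
       &outputDefs;
       <!-- dimension uses, dimension definitions, classes and
            entries would follow here -->
     </grammar>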

4.11.2 Elements

4.11.2.1 Root element (grammar)

The root element type of the XML language is grammar, defined as follows:

     <!ELEMENT grammar (principleDef*,outputDef*,useDimension*,
                        dimension*,classDef*,entry*)>

I.e. a grammar file in the XML language starts with zero or more principle definitions (principleDef*), then zero or more output definitions (outputDef*), then zero or more dimension uses (useDimension*), then zero or more dimension definitions (dimension*), then zero or more lexical class definitions (classDef*), and finally zero or more lexical entry definitions (entry*).

4.11.2.2 Principle definitions (principleDef)

XDK grammar files do not include principle definitions. The principle definitions in the XML language only introduce the principle identifiers to enable the file to be validated properly:

     <!ELEMENT principleDef EMPTY>
     <!ATTLIST principleDef id ID #REQUIRED>

The principleDef element has the required attribute id which is an XML ID corresponding to the principle identifier.

4.11.2.3 Output definitions (outputDef)

XDK grammar files do not include output definitions. The output definitions in the XML language only introduce the output identifiers to enable the file to be validated properly:

     <!ELEMENT outputDef EMPTY>
     <!ATTLIST outputDef id ID #REQUIRED>

The outputDef element has the required attribute id which is an XML ID corresponding to the output identifier.
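
For illustration, declarations of this kind could look as follows. The identifiers are only examples chosen for this sketch; the identifiers actually available are the ones declared in the files referenced by the principleDefs and outputDefs entities:

     <!-- each element only introduces an identifier (an XML ID) -->
     <principleDef id="principle.graph"/>
     <principleDef id="principle.valency"/>
     <outputDef id="output.pretty"/>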

4.11.2.4 Dimension use (useDimension)

Here is the syntax for using dimensions:

     <!ELEMENT useDimension EMPTY>
     <!ATTLIST useDimension idref IDREF #REQUIRED>

The useDimension element has the required attribute idref which is an XML ID reference corresponding to the dimension identifier.

4.11.2.5 Dimension definition (dimension)

Here is the syntax for defining dimensions:

     <!ELEMENT dimension (attrsType?,entryType?,labelType?,typeDef*,
                          usePrinciple*,output*,useOutput*)>
     <!ATTLIST dimension id ID #REQUIRED>

I.e. a dimension definition starts with at most one definition of the attributes type (attrsType?), then at most one definition of the entry type (entryType?), then at most one definition of the label type (labelType?). It continues with zero or more additional type definitions (typeDef*), then zero or more used principles (usePrinciple*), zero or more chosen outputs (output*), and finally zero or more used outputs (useOutput*).

It has the required attribute id which is an XML ID corresponding to the dimension identifier.
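
The following fragment sketches a small dimension definition together with the matching dimension use. All identifiers (id, id.label, principle.graph, output.pretty) and field names are illustrative assumptions; for the fragment to validate, the principle and output identifiers would have to be declared by principleDef and outputDef elements elsewhere in the file:

     <!-- child of grammar: switch the dimension on -->
     <useDimension idref="id"/>

     <!-- child of grammar: define the dimension -->
     <dimension id="id">
       <entryType>
         <typeRecord>
           <typeFeature data="in">
             <typeValency><typeRef idref="id.label"/></typeValency>
           </typeFeature>
           <typeFeature data="out">
             <typeValency><typeRef idref="id.label"/></typeValency>
           </typeFeature>
         </typeRecord>
       </entryType>
       <labelType><typeRef idref="id.label"/></labelType>
       <typeDef id="id.label">
         <typeDomain>
           <constant data="subj"/>
           <constant data="obj"/>
         </typeDomain>
       </typeDef>
       <usePrinciple idref="principle.graph">
         <dim var="D" idref="id"/>
       </usePrinciple>
       <output idref="output.pretty"/>
       <useOutput idref="output.pretty"/>
     </dimension>

Note that the children appear in exactly the order required by the content model: entry type, label type, type definitions, principle uses, chosen outputs, used outputs. The attributes type is simply omitted here, which attrsType? permits.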

4.11.2.6 Attributes type (attrsType)

Here is the syntax for defining the attributes type:

     <!ELEMENT attrsType %type;>

I.e. the attrsType element has one obligatory child which is a type.

4.11.2.7 Entry type (entryType)

Here is the syntax for defining the entry type:

     <!ELEMENT entryType %type;>

I.e. the entryType element has one obligatory child which is a type.

4.11.2.8 Label type (labelType)

Here is the syntax for defining the label type:

     <!ELEMENT labelType %type;>

I.e. the labelType element has one obligatory child which is a type.

4.11.2.9 Choosing an output (output)

Here is the syntax for choosing an output:

     <!ELEMENT output EMPTY>
     <!ATTLIST output idref IDREF #REQUIRED>

The output element has the required attribute idref which is an XML ID reference corresponding to the output identifier.

4.11.2.10 Using an output (useOutput)

Here is the syntax for using an output:

     <!ELEMENT useOutput EMPTY>
     <!ATTLIST useOutput idref IDREF #REQUIRED>

The useOutput element has the required attribute idref which is an XML ID reference corresponding to the output identifier.

4.11.2.11 Type definition (typeDef)

Here is the syntax for defining a type:

     <!ELEMENT typeDef %type;>
     <!ATTLIST typeDef id ID #REQUIRED>

I.e. the typeDef element has one child which is a type.

It has the required attribute id which is an XML ID corresponding to the type identifier.

4.11.2.12 Types (type parameter entity)

Here is the syntax of types:

     <!ELEMENT typeDomain (constant*)>
     <!ELEMENT typeSet %type;>
     <!ELEMENT typeISet %type;>
     <!ELEMENT typeTuple %types;>
     <!ELEMENT typeList %type;>
     <!ELEMENT typeRecord (typeFeature*)>
     <!ELEMENT typeValency %type;>
     <!ELEMENT typeCard EMPTY>
     <!ELEMENT typeVec (%type;,%type;)>
     <!ELEMENT typeInt EMPTY>
     <!ELEMENT typeInts EMPTY>
     <!ELEMENT typeString EMPTY>
     <!ELEMENT typeBool EMPTY>
     <!ELEMENT typeRef EMPTY>
     <!ATTLIST typeRef idref IDREF #REQUIRED>
     <!ELEMENT typeLabelRef EMPTY>
     <!ATTLIST typeLabelRef data NMTOKEN #REQUIRED>
     <!ELEMENT typeVariable EMPTY>

typeDomain is a finite domain of constants constant*.

typeSet is an accumulative set with domain %type;.

typeISet is an intersective set with domain %type;.

typeTuple is a tuple with projections %types;.

typeList is a list with domain %type;.

typeRecord is a record with features typeFeature*.

typeValency is a valency with domain %type;.

typeCard is a cardinality set.

typeVec is a vector with fields and value type (%type;,%type;).

typeInt is an integer.

typeInts is a set of integers.

typeString is a string.

typeBool is a boolean.

typeRef is a type reference to the type identifier specified by its required idref attribute (an XML ID reference).

typeLabelRef is a reference to the label type of the dimension variable specified by the required data attribute (an XML name token).

typeVariable is a type variable.
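
To make the type elements more concrete, here is a sketch of three type definitions. The type identifiers (example.agr, example.pair, example.vec) and the constants are hypothetical names made up for this example:

     <!-- a finite domain of two constants -->
     <typeDef id="example.agr">
       <typeDomain>
         <constant data="sg"/>
         <constant data="pl"/>
       </typeDomain>
     </typeDef>

     <!-- a tuple whose projections are that domain and an integer -->
     <typeDef id="example.pair">
       <typeTuple>
         <typeRef idref="example.agr"/>
         <typeInt/>
       </typeTuple>
     </typeDef>

     <!-- a vector: first child gives the fields, second the value type -->
     <typeDef id="example.vec">
       <typeVec>
         <typeRef idref="example.agr"/>
         <typeInts/>
       </typeVec>
     </typeDef>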

4.11.2.13 Features (typeFeature and feature)

Here is the syntax for type features:

     <!ELEMENT typeFeature %type;>
     <!ATTLIST typeFeature
               data NMTOKEN #REQUIRED>
     <!ELEMENT feature %term;>
     <!ATTLIST feature
               data NMTOKEN #REQUIRED>
     <!ELEMENT varFeature %term;>
     <!ATTLIST varFeature
               data NMTOKEN #REQUIRED>

The typeFeature element has one child which is a type (%type;), and the required attribute data (an XML name token) which is its field.

The feature element has one child which is a term (%term;), and the required attribute data (an XML name token) which is its field.

The varFeature element has one child which is a term (%term;), and the required attribute data (an XML name token) which is its field.
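
The relation between typeFeature and feature can be sketched as follows: the first fragment declares a record type with one field, the second is a record term assigning a value to that field. The field name agr, the type identifier example.agr and the constant sg are assumptions made for this example only:

     <!-- type side: a record type with the single field agr -->
     <typeRecord>
       <typeFeature data="agr"><typeRef idref="example.agr"/></typeFeature>
     </typeRecord>

     <!-- term side: a record assigning the constant sg to agr -->
     <record>
       <feature data="agr"><constant data="sg"/></feature>
     </record>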

4.11.2.14 Principle use (usePrinciple)

Here is the syntax for using principles:

     <!ELEMENT usePrinciple (dim*,arg*)>
     <!ATTLIST usePrinciple idref IDREF #REQUIRED>
     
     <!ELEMENT dim EMPTY>
     <!ATTLIST dim
               var NMTOKEN #REQUIRED
               idref IDREF #REQUIRED>
     
     <!ELEMENT arg %term;>
     <!ATTLIST arg
               var NMTOKEN #REQUIRED>

The usePrinciple element has zero or more dim children which establish the dimension mapping, followed by zero or more arg children which establish the argument mapping. It has the required attribute idref which is an XML ID reference to the used principle identifier.

The dim element has the required attributes var (an XML name token), and idref (an XML ID reference). var is the dimension variable, and idref is the dimension ID to which the former is bound.

The arg element has one child which is a term (%term;). It has the required attribute var (an XML name token). var is the argument variable to which the term is bound.
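
As an illustration, a principle use with one dimension mapping and two argument mappings might look like this. The principle identifier principle.valency, the dimension identifier id, the argument variables In and Out, and the fields in and out are assumed names for the sake of the sketch:

     <usePrinciple idref="principle.valency">
       <!-- bind the dimension variable D to the dimension id -->
       <dim var="D" idref="id"/>
       <!-- bind each argument variable to a feature path term -->
       <arg var="In">
         <featurePath root="down" dimension="D" aspect="entry">
           <constant data="in"/>
         </featurePath>
       </arg>
       <arg var="Out">
         <featurePath root="down" dimension="D" aspect="entry">
           <constant data="out"/>
         </featurePath>
       </arg>
     </usePrinciple>

All dim children must precede the arg children, as required by the content model (dim*,arg*).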

4.11.2.15 Class definitions (classDef)

Here is the syntax for class definitions:

     <!ELEMENT classDef (variable*,%classes;)>
     <!ATTLIST classDef
               id ID #REQUIRED>

I.e. the classDef element has zero or more variable children, followed by zero or more class bodies (%classes;).

It has the required attribute id, an XML ID corresponding to the class identifier.

4.11.2.16 Class bodies

Here is the syntax for class bodies. A class body (parameter entity class) is one of the elements classDimension, useClass, classConj, or classDisj:

     <!ELEMENT classDimension %term;>
     <!ATTLIST classDimension idref IDREF #REQUIRED>
     
     <!ELEMENT useClass (feature*)>
     <!ATTLIST useClass idref IDREF #REQUIRED>
     
     <!ELEMENT classConj %classes;>
     
     <!ELEMENT classDisj %classes;>

The classDimension element specifies a dimension entry (%term;) for the dimension with the identifier given by the required attribute idref (an XML ID reference).

The useClass element specifies the use of a lexical class with the class identifier given by the required attribute idref (an XML ID reference). The parameters of this class are specified as a list of features (feature*).

The classConj element specifies the conjunction of its children.

The classDisj element specifies the disjunction of its children.
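
A class definition with one variable and two class bodies could be sketched as follows. The class identifier noun, the variable Word, the dimension identifiers id and lex, and all field names are hypothetical and serve only to show the nesting:

     <classDef id="noun">
       <variable data="Word"/>
       <!-- entry for the dimension id -->
       <classDimension idref="id">
         <record>
           <feature data="in">
             <set>
               <constantCard data="subj" card="opt"/>
               <constantCard data="obj" card="opt"/>
             </set>
           </feature>
         </record>
       </classDimension>
       <!-- entry for the dimension lex, using the variable -->
       <classDimension idref="lex">
         <record>
           <feature data="word"><variable data="Word"/></feature>
         </record>
       </classDimension>
     </classDef>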

4.11.2.17 Lexical entries (entry)

Here is the syntax for lexical entries:

     <!ELEMENT entry %classes;>

I.e. the entry element specifies a lexical entry as a list of class bodies (%classes;).
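
For example, a lexical entry that uses a class and supplies a value for one of its parameters might be written like this. The class identifier noun, the parameter name Word and the constant Mary are assumptions for this sketch; the class would have to be defined by a classDef element with that id:

     <entry>
       <useClass idref="noun">
         <!-- the parameter Word is given as a feature -->
         <feature data="Word"><constant data="Mary"/></feature>
       </useClass>
     </entry>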

4.11.2.18 Terms (term parameter entity)

Here is the syntax for terms:

     <!ELEMENT constant EMPTY>
     <!ATTLIST constant data NMTOKEN #REQUIRED>
     <!ELEMENT integer EMPTY>
     <!ATTLIST integer data NMTOKEN #REQUIRED>
     <!ELEMENT top EMPTY>
     <!ELEMENT bot EMPTY>
     <!ELEMENT variable EMPTY>
     <!ATTLIST variable data NMTOKEN #REQUIRED>
     <!ELEMENT constantCard EMPTY>
     <!ATTLIST constantCard
               data NMTOKEN #REQUIRED
               card (one|opt|any|geone) "one">
     <!ELEMENT constantCardSet (integer*)>
     <!ATTLIST constantCardSet
               data NMTOKEN #REQUIRED>
     <!ELEMENT constantCardInterval (integer,integer)>
     <!ATTLIST constantCardInterval
               data NMTOKEN #REQUIRED>
     <!ELEMENT variableCard EMPTY>
     <!ATTLIST variableCard
               data NMTOKEN #REQUIRED
               card (one|opt|any|geone) "one">
     <!ELEMENT variableCardSet (integer*)>
     <!ATTLIST variableCardSet
               data NMTOKEN #REQUIRED>
     <!ELEMENT variableCardInterval (integer,integer)>
     <!ATTLIST variableCardInterval
               data NMTOKEN #REQUIRED>
     <!ELEMENT set %terms;>
     <!ELEMENT list %terms;>
     <!ELEMENT record %recSpecs;>
     <!ELEMENT recordConj %recSpecs;>
     <!ELEMENT recordDisj %recSpecs;>
     <!ELEMENT setGen %setGenSpecs;>
     <!ELEMENT setGenConj %setGenSpecs;>
     <!ELEMENT setGenDisj %setGenSpecs;>
     <!ELEMENT featurePath (constant*)>
     <!ATTLIST featurePath
               root (down|up) #REQUIRED
               dimension NMTOKEN #REQUIRED
               aspect (entry|attrs) #REQUIRED>
     <!ELEMENT annotation (%term;,%type;)>
     <!ELEMENT conj %terms;>
     <!ELEMENT disj %terms;>
     <!ELEMENT concat %terms;>
     <!ELEMENT order %terms;>

The constant element defines a constant. It has the required attribute data (an XML name token) which is the constant itself.

The integer element defines an integer. It has the required attribute data (an XML name token) which is the integer itself.

The top element corresponds to lattice top.

The bot element corresponds to lattice bottom.

The variable element defines a variable. It has the required attribute data (an XML name token) which is the variable itself.

The constantCard element defines a cardinality specification. It has the attributes data (an XML name token) and card, of which data is required and card is optional (with attribute default one). data corresponds to the field of the cardinality specification, and card to the cardinality set. Here, one corresponds to ! in the UL, opt to ?, any to *, and geone to +.

The constantCardSet element also defines a cardinality specification. It has zero or more integer children and the required attribute data (an XML name token). data is the field of the cardinality specification. The integer children are the integers that make up the cardinality set.

The constantCardInterval element also defines a cardinality specification. It has two integer children and the required attribute data (an XML name token). data is the field of the cardinality specification. The two integers define the cardinality set by a closed interval.

variableCard, variableCardSet and variableCardInterval are the counterparts of constantCard, constantCardSet and constantCardInterval whose field (data) is a variable instead of a constant.
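
The three ways of writing a cardinality specification can be compared in the following sketch. The field names subj, adv, obj and det are illustrative only:

     <set>
       <!-- card omitted: defaults to one (! in the UL) -->
       <constantCard data="subj"/>
       <!-- any number of occurrences (* in the UL) -->
       <constantCard data="adv" card="any"/>
       <!-- cardinality set given by enumeration: 0 or 2 -->
       <constantCardSet data="obj">
         <integer data="0"/>
         <integer data="2"/>
       </constantCardSet>
       <!-- cardinality set given by the closed interval from 1 to 3 -->
       <constantCardInterval data="det">
         <integer data="1"/>
         <integer data="3"/>
       </constantCardInterval>
     </set>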

The set element specifies a set of terms (%terms;).

The list element specifies a list of terms (%terms;).

The record element specifies a record. Therefore, it utilizes record specifications (%recSpecs;). A record specification is either a feature (feature), a variable feature (varFeature), a conjunction of record specifications (recordConj), or a disjunction of record specifications (recordDisj).

The setGen element specifies a set generator expression. The body of a set generator expression is a list of specifications (%setGenSpecs;). A set generator expression specification is either a constant (constant), a conjunction of set generator expression specifications (setGenConj), or a disjunction of set generator expression specifications (setGenDisj).
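
Structurally, a set generator expression nests its specifications like this. The constants masc, fem and sg are made-up values, and the sketch is meant to show only the permitted nesting, not the meaning of any particular expression:

     <setGen>
       <setGenConj>
         <!-- a disjunction of two constants ... -->
         <setGenDisj>
           <constant data="masc"/>
           <constant data="fem"/>
         </setGenDisj>
         <!-- ... conjoined with a third constant -->
         <constant data="sg"/>
       </setGenConj>
     </setGen>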

The featurePath element specifies a feature path. The required attribute root (down or up) corresponds to the root variable of the feature path, the required attribute dimension to the dimension variable, and the required attribute aspect to the aspect (entry or attrs). The constant children of the featurePath element correspond to the fields of the feature path. Note that the root variable value down corresponds to _ in the UL, and up to ^.

The annotation element specifies a type annotation for a term. Its first child is a term (%term;), and its second child a type (%type;).

The conj element specifies the conjunction of a list of terms (%terms;).

The disj element specifies the disjunction of a list of terms (%terms;).

The concat element specifies the concatenation of a list of terms (%terms;). Concatenation is restricted to strings.

The order element specifies an order generator for a list of terms (%terms;).
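
As a final illustration, terms can be nested freely. The sketch below annotates a disjunction of two constants with a type; the constants sg and pl and the type identifier example.agr are hypothetical and would have to be defined by a suitable typeDef element:

     <annotation>
       <!-- first child: the term being annotated -->
       <disj>
         <constant data="sg"/>
         <constant data="pl"/>
       </disj>
       <!-- second child: its type -->
       <typeRef idref="example.agr"/>
     </annotation>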