In this section, we describe the syntax of the XML language by going through its Document Type Definition (DTD). To validate your own grammar files in the XML language, we recommend an XML validator such as the free rxp (available at http://www.cogsci.ed.ac.uk/~richard/rxp.html). To understand the XML terminology, we recommend reading the W3C XML specification (available at http://www.w3.org/TR/REC-xml).
Note that we provide an example grammar file in the XML language in Grammars/Acl01.xml. It defines exactly the same grammar as Grammars/Acl01.ul, so that the two grammar file input languages are easy to compare.
The DTD for the XML language begins with the definition of a couple of parameter entities:
<!ENTITY % type "(typeDomain|typeSet|typeISet|typeTuple|typeList|
                  typeRecord|typeValency|typeCard|typeVec|
                  typeInt|typeInts|typeString|typeBool|typeRef|
                  typeLabelRef|typeVariable)">
<!ENTITY % types "(%type;*)">
<!ENTITY % term "(constant|variable|integer|top|bot|
                  constantCard|constantCardSet|constantCardInterval|
                  variableCard|variableCardSet|variableCardInterval|
                  set|list|record|setGen|featurePath|annotation|
                  conj|disj|concat|order|feature|varFeature)">
<!ENTITY % terms "(%term;*)">
<!ENTITY % class "(classDimension|useClass|classConj|classDisj)">
<!ENTITY % classes "(%class;*)">
<!ENTITY % recSpec "(feature|varFeature|recordConj|recordDisj)">
<!ENTITY % recSpecs "(%recSpec;*)">
<!ENTITY % setGenSpec "(constant|setGenConj|setGenDisj)">
<!ENTITY % setGenSpecs "(%setGenSpec;*)">
<!ENTITY principleDefs SYSTEM "../../Solver/Principles/principles.xml">
<!ENTITY outputDefs SYSTEM "../../Outputs/outputs.xml">
The parameter entity type corresponds to the enumerated type encompassing the elements typeDomain, typeSet, typeISet, typeTuple, typeList, typeRecord, typeValency, typeCard, typeVec, typeInt, typeInts, typeString, typeBool, typeRef, typeLabelRef, and typeVariable.
The parameter entity types corresponds to zero or more occurrences of the parameter entity type.
The parameter entity term corresponds to the enumerated type encompassing the elements constant, variable, integer, top, bot, constantCard, constantCardSet, constantCardInterval, variableCard, variableCardSet, variableCardInterval, set, list, record, setGen, featurePath, annotation, conj, disj, concat, order, feature, and varFeature.
The parameter entity terms corresponds to zero or more occurrences of the parameter entity term.
The parameter entity class corresponds to the enumerated type encompassing the elements classDimension, useClass, classConj, and classDisj.
The parameter entity classes corresponds to zero or more occurrences of the parameter entity class.
The parameter entity recSpec corresponds to the enumerated type encompassing the elements feature, varFeature, recordConj, and recordDisj.
The parameter entity recSpecs corresponds to zero or more occurrences of the parameter entity recSpec.
The parameter entity setGenSpec corresponds to the enumerated type encompassing the elements constant, setGenConj, and setGenDisj.
The parameter entity setGenSpecs corresponds to zero or more occurrences of the parameter entity setGenSpec.
The general entity principleDefs (declared without %, so it is not a parameter entity) corresponds to the system identifier "../../Solver/Principles/principles.xml", an XML file which declares all available principle identifiers. Since XDK grammar files do not contain principle definitions but only principle uses, this is how the XML grammar file “knows” which principle identifiers can be used. Note that you can change the path of the system identifier to a more suitable one (whether it is suitable depends on your XML validator).
The general entity outputDefs (declared without %, so it is not a parameter entity) corresponds to the system identifier "../../Outputs/outputs.xml", an XML file which declares all available output identifiers. Since XDK grammar files do not contain output definitions but only output uses, this is how the XML grammar file “knows” which output identifiers can be used. Note that you can change the path of the system identifier to a more suitable one (whether it is suitable depends on your XML validator).
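As a rough sketch of how the two external entities could be used, a grammar file might reference them at the start of the grammar element. This assumes that the referenced files contain nothing but principleDef and outputDef elements respectively, and that the DTD declaring the entities is already attached to the document; neither assumption is confirmed by the text above.

```xml
<grammar>
  <!-- assumed to expand to the principleDef elements in principles.xml -->
  &principleDefs;
  <!-- assumed to expand to the outputDef elements in outputs.xml -->
  &outputDefs;
  <!-- dimension uses, dimension definitions, classes and entries follow here -->
</grammar>
```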
The root element type of the XML language is grammar, defined as follows:
<!ELEMENT grammar (principleDef*,outputDef*,useDimension*, dimension*,classDef*,entry*)>
That is, a grammar file in the XML language starts with zero or more principle definitions (principleDef*), then zero or more output definitions (outputDef*), then zero or more dimension uses (useDimension*), then zero or more dimension definitions (dimension*), then zero or more lexical class definitions (classDef*), and finally zero or more lexical entry definitions (entry*).
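The following minimal skeleton illustrates the required order of children under the root element. All identifiers (principle.example, output.example, lex, exampleClass) are made up for illustration and are not taken from the XDK distribution. Since XML IDs share a single namespace, every id value in the file must be unique.

```xml
<grammar>
  <!-- 1. principle identifiers, declared so that IDREFs to them validate -->
  <principleDef id="principle.example"/>
  <!-- 2. output identifiers -->
  <outputDef id="output.example"/>
  <!-- 3. dimension uses -->
  <useDimension idref="lex"/>
  <!-- 4. dimension definitions (all children of dimension are optional) -->
  <dimension id="lex"/>
  <!-- 5. lexical class definitions -->
  <classDef id="exampleClass"/>
  <!-- 6. lexical entries -->
  <entry/>
</grammar>
```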
XDK grammar files do not include principle definitions. The principle definitions in the XML language only introduce the principle identifiers to enable the file to be validated properly:
<!ELEMENT principleDef EMPTY>
<!ATTLIST principleDef id ID #REQUIRED>
The principleDef element has the required attribute id, which is an XML ID corresponding to the principle identifier.
XDK grammar files do not include output definitions. The output definitions in the XML language only introduce the output identifiers to enable the file to be validated properly:
<!ELEMENT outputDef EMPTY>
<!ATTLIST outputDef id ID #REQUIRED>
The outputDef element has the required attribute id, which is an XML ID corresponding to the output identifier.
Here is the syntax for using dimensions:
<!ELEMENT useDimension EMPTY>
<!ATTLIST useDimension idref IDREF #REQUIRED>
The useDimension element has the required attribute idref, which is an XML ID reference corresponding to the dimension identifier.
Here is the syntax for defining dimensions:
<!ELEMENT dimension (attrsType?,entryType?,labelType?,typeDef*,
                     usePrinciple*,output*,useOutput*)>
<!ATTLIST dimension id ID #REQUIRED>
That is, a dimension definition starts with zero or one definitions of the attributes type (attrsType?), then zero or one definitions of the entry type (entryType?), then zero or one definitions of the label type (labelType?). It continues with zero or more additional type definitions (typeDef*), then zero or more used principles (usePrinciple*), zero or more chosen outputs (output*), and finally zero or more used outputs (useOutput*).
It has the required attribute id, which is an XML ID corresponding to the dimension identifier.
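A dimension definition following this content model could look like the sketch below. The dimension name syn, the type identifiers syn.label and syn.agr, the field names, and the identifiers principle.example and output.example are all hypothetical; the latter two would have to be declared by principleDef and outputDef elements elsewhere in the file.

```xml
<dimension id="syn">
  <!-- attributes type: a record with one field -->
  <attrsType>
    <typeRecord>
      <typeFeature data="agr"><typeRef idref="syn.agr"/></typeFeature>
    </typeRecord>
  </attrsType>
  <!-- entry type: a record with one valency field -->
  <entryType>
    <typeRecord>
      <typeFeature data="out">
        <typeValency><typeRef idref="syn.label"/></typeValency>
      </typeFeature>
    </typeRecord>
  </entryType>
  <!-- label type -->
  <labelType><typeRef idref="syn.label"/></labelType>
  <!-- additional type definitions -->
  <typeDef id="syn.label">
    <typeDomain><constant data="subj"/><constant data="obj"/></typeDomain>
  </typeDef>
  <typeDef id="syn.agr">
    <typeDomain><constant data="sg"/><constant data="pl"/></typeDomain>
  </typeDef>
  <!-- used principles, chosen outputs, used outputs -->
  <usePrinciple idref="principle.example"/>
  <output idref="output.example"/>
  <useOutput idref="output.example"/>
</dimension>
```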
Here is the syntax for defining the attributes type:
<!ELEMENT attrsType %type;>
That is, the attrsType element has one obligatory child, which is a type.
Here is the syntax for defining the entry type:
<!ELEMENT entryType %type;>
That is, the entryType element has one obligatory child, which is a type.
Here is the syntax for defining the label type:
<!ELEMENT labelType %type;>
That is, the labelType element has one obligatory child, which is a type.
Here is the syntax for choosing an output:
<!ELEMENT output EMPTY>
<!ATTLIST output idref IDREF #REQUIRED>
The output element has the required attribute idref, which is an XML ID reference corresponding to the output identifier.
Here is the syntax for using an output:
<!ELEMENT useOutput EMPTY>
<!ATTLIST useOutput idref IDREF #REQUIRED>
The useOutput element has the required attribute idref, which is an XML ID reference corresponding to the output identifier.
Here is the syntax for defining a type:
<!ELEMENT typeDef %type;>
<!ATTLIST typeDef id ID #REQUIRED>
That is, the typeDef element has one child, which is a type. It has the required attribute id, which is an XML ID corresponding to the type identifier.
Here is the syntax of types:
<!ELEMENT typeDomain (constant*)>
<!ELEMENT typeSet %type;>
<!ELEMENT typeISet %type;>
<!ELEMENT typeTuple %types;>
<!ELEMENT typeList %type;>
<!ELEMENT typeRecord (typeFeature*)>
<!ELEMENT typeValency %type;>
<!ELEMENT typeCard EMPTY>
<!ELEMENT typeVec (%type;,%type;)>
<!ELEMENT typeInt EMPTY>
<!ELEMENT typeInts EMPTY>
<!ELEMENT typeString EMPTY>
<!ELEMENT typeBool EMPTY>
<!ELEMENT typeRef EMPTY>
<!ATTLIST typeRef idref IDREF #REQUIRED>
<!ELEMENT typeLabelRef EMPTY>
<!ATTLIST typeLabelRef data NMTOKEN #REQUIRED>
<!ELEMENT typeVariable EMPTY>
typeDomain is a finite domain of constants (constant*).
typeSet is an accumulative set with domain %type;.
typeISet is an intersective set with domain %type;.
typeTuple is a tuple with projections %types;.
typeList is a list with domain %type;.
typeRecord is a record with features typeFeature*.
typeValency is a valency with domain %type;.
typeCard is a cardinality set.
typeVec is a vector with fields and value type (%type;,%type;).
typeInt is an integer.
typeInts is a set of integers.
typeString is a string.
typeBool is a boolean.
typeRef is a type reference to the type identifier specified by its required idref attribute (an XML ID reference).
typeLabelRef is a reference to the label type of the dimension variable specified by the required data attribute (an XML name token).
typeVariable is a type variable.
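To illustrate how these type elements nest, here are a few type definitions written against the declarations above. The identifiers starting with example. and the constants are invented for this sketch.

```xml
<!-- a finite domain of two constants -->
<typeDef id="example.label">
  <typeDomain>
    <constant data="subj"/>
    <constant data="obj"/>
  </typeDomain>
</typeDef>

<!-- an accumulative set over that domain, referenced through typeRef -->
<typeDef id="example.labels">
  <typeSet><typeRef idref="example.label"/></typeSet>
</typeDef>

<!-- a tuple with two projections: a label and an integer -->
<typeDef id="example.pair">
  <typeTuple>
    <typeRef idref="example.label"/>
    <typeInt/>
  </typeTuple>
</typeDef>

<!-- a vector: the first child is the field type, the second the value type -->
<typeDef id="example.vec">
  <typeVec>
    <typeRef idref="example.label"/>
    <typeCard/>
  </typeVec>
</typeDef>
```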
Here is the syntax for type features:
<!ELEMENT typeFeature %type;>
<!ATTLIST typeFeature data NMTOKEN #REQUIRED>
<!ELEMENT feature %term;>
<!ATTLIST feature data NMTOKEN #REQUIRED>
<!ELEMENT varFeature %term;>
<!ATTLIST varFeature data NMTOKEN #REQUIRED>
The typeFeature element has one child, which is a type (%type;), and the required attribute data (an XML name token), which is its field.
The feature element has one child, which is a term (%term;), and the required attribute data (an XML name token), which is its field.
The varFeature element has one child, which is a term (%term;), and the required attribute data (an XML name token), which is its field.
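The parallel between typeFeature (in record types) and feature (in record terms) may be easier to see side by side. The field names cat and num and the constants are hypothetical.

```xml
<!-- a record type with two fields -->
<typeRecord>
  <typeFeature data="cat">
    <typeDomain><constant data="noun"/><constant data="verb"/></typeDomain>
  </typeFeature>
  <typeFeature data="num"><typeInt/></typeFeature>
</typeRecord>

<!-- a record term assigning a value to each of those fields -->
<record>
  <feature data="cat"><constant data="noun"/></feature>
  <feature data="num"><integer data="1"/></feature>
</record>
```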
Here is the syntax for using principles:
<!ELEMENT usePrinciple (dim*,arg*)>
<!ATTLIST usePrinciple idref IDREF #REQUIRED>
<!ELEMENT dim EMPTY>
<!ATTLIST dim var NMTOKEN #REQUIRED
              idref IDREF #REQUIRED>
<!ELEMENT arg %term;>
<!ATTLIST arg var NMTOKEN #REQUIRED>
The usePrinciple element has zero or more dim children, which establish the dimension mapping, followed by zero or more arg children, which establish the argument mapping. It has the required attribute idref, which is an XML ID reference to the used principle identifier.
The dim element has the required attributes var (an XML name token) and idref (an XML ID reference). var is the dimension variable, and idref is the dimension ID to which it is bound.
The arg element has one child, which is a term (%term;). It has the required attribute var (an XML name token). var is the argument variable to which the term is bound.
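A principle use with one dimension binding and one argument binding might be written as follows. The principle identifier principle.example, the dimension syn, and the variable names D and Agr are placeholders chosen for this sketch, not names defined by the XDK.

```xml
<usePrinciple idref="principle.example">
  <!-- dimension mapping: dimension variable D is bound to dimension syn -->
  <dim var="D" idref="syn"/>
  <!-- argument mapping: argument variable Agr is bound to a feature path term -->
  <arg var="Agr">
    <featurePath root="down" dimension="D" aspect="attrs">
      <constant data="agr"/>
    </featurePath>
  </arg>
</usePrinciple>
```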
Here is the syntax for class definitions:
<!ELEMENT classDef (variable*,%classes;)>
<!ATTLIST classDef id ID #REQUIRED>
That is, the classDef element has zero or more variable children, followed by the content of the parameter entity classes (%classes;), i.e. zero or more class bodies.
It has the required attribute id, an XML ID corresponding to the class identifier.
Here is the syntax for class bodies:
The parameter entity classes corresponds to zero or more of the elements classDimension, useClass, classConj, or classDisj:
<!ELEMENT classDimension %term;>
<!ATTLIST classDimension idref IDREF #REQUIRED>
<!ELEMENT useClass (feature*)>
<!ATTLIST useClass idref IDREF #REQUIRED>
<!ELEMENT classConj %classes;>
<!ELEMENT classDisj %classes;>
The classDimension element specifies a dimension entry (%term;) for the dimension with the identifier given by the required attribute idref (an XML ID reference).
The useClass element specifies the use of a lexical class with the class identifier given by the required attribute idref (an XML ID reference). The parameters of this class are specified as a list of features (feature*).
The classConj element specifies the conjunction of its children.
The classDisj element specifies the disjunction of its children.
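As an illustration of class definitions and class bodies, the sketch below defines a class with one variable whose body conjoins a dimension entry with the use of another class. The identifiers exampleNoun, exampleBase and syn, the variable Word, and the field names are all hypothetical; exampleBase and syn would have to be defined elsewhere in the file.

```xml
<classDef id="exampleNoun">
  <!-- class variable -->
  <variable data="Word"/>
  <!-- class body: conjunction of two class bodies -->
  <classConj>
    <!-- a dimension entry for dimension syn -->
    <classDimension idref="syn">
      <record>
        <feature data="cat"><constant data="noun"/></feature>
      </record>
    </classDimension>
    <!-- use of another class, passing a parameter as a feature -->
    <useClass idref="exampleBase">
      <feature data="Word"><variable data="Word"/></feature>
    </useClass>
  </classConj>
</classDef>
```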
Here is the syntax for lexical entries:
<!ELEMENT entry %classes;>
That is, the entry element specifies a lexical entry as a list of class bodies (%classes;).
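A lexical entry built from two class bodies could then look like this. The dimension lex, the class exampleNoun, and the field word are assumed names for the purpose of the example.

```xml
<entry>
  <!-- a dimension entry given directly -->
  <classDimension idref="lex">
    <record>
      <feature data="word"><constant data="Mary"/></feature>
    </record>
  </classDimension>
  <!-- the rest of the entry comes from a lexical class -->
  <useClass idref="exampleNoun"/>
</entry>
```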
Here is the syntax for terms:
<!ELEMENT constant EMPTY>
<!ATTLIST constant data NMTOKEN #REQUIRED>
<!ELEMENT integer EMPTY>
<!ATTLIST integer data NMTOKEN #REQUIRED>
<!ELEMENT top EMPTY>
<!ELEMENT bot EMPTY>
<!ELEMENT variable EMPTY>
<!ATTLIST variable data NMTOKEN #REQUIRED>
<!ELEMENT constantCard EMPTY>
<!ATTLIST constantCard data NMTOKEN #REQUIRED
                       card (one|opt|any|geone) "one">
<!ELEMENT constantCardSet (integer*)>
<!ATTLIST constantCardSet data NMTOKEN #REQUIRED>
<!ELEMENT constantCardInterval (integer,integer)>
<!ATTLIST constantCardInterval data NMTOKEN #REQUIRED>
<!ELEMENT variableCard EMPTY>
<!ATTLIST variableCard data NMTOKEN #REQUIRED
                       card (one|opt|any|geone) "one">
<!ELEMENT variableCardSet (integer*)>
<!ATTLIST variableCardSet data NMTOKEN #REQUIRED>
<!ELEMENT variableCardInterval (integer,integer)>
<!ATTLIST variableCardInterval data NMTOKEN #REQUIRED>
<!ELEMENT set %terms;>
<!ELEMENT list %terms;>
<!ELEMENT record %recSpecs;>
<!ELEMENT recordConj %recSpecs;>
<!ELEMENT recordDisj %recSpecs;>
<!ELEMENT setGen %setGenSpecs;>
<!ELEMENT setGenConj %setGenSpecs;>
<!ELEMENT setGenDisj %setGenSpecs;>
<!ELEMENT featurePath (constant*)>
<!ATTLIST featurePath root (down|up) #REQUIRED
                      dimension NMTOKEN #REQUIRED
                      aspect (entry|attrs) #REQUIRED>
<!ELEMENT annotation (%term;,%type;)>
<!ELEMENT conj %terms;>
<!ELEMENT disj %terms;>
<!ELEMENT concat %terms;>
The constant element defines a constant. It has the required attribute data (an XML name token), which is the constant itself.
The integer element defines an integer. It has the required attribute data (an XML name token), which is the integer itself.
The top element corresponds to lattice top.
The bot element corresponds to lattice bottom.
The variable element defines a variable. It has the required attribute data (an XML name token), which is the variable itself.
The constantCard element defines a cardinality specification. It has the attributes data (an XML name token) and card, of which data is required and card is optional (with attribute default one). data corresponds to the field of the cardinality specification, and card to the cardinality set. Here, one corresponds to ! in the UL, opt to ?, any to *, and geone to +.
The constantCardSet element also defines a cardinality specification. It has zero or more integer children and the required attribute data (an XML name token). data is the field of the cardinality specification. The integer children are the integers in the cardinality set.
The constantCardInterval element also defines a cardinality specification. It has two integer children and the required attribute data (an XML name token). data is the field of the cardinality specification. The two integers define the cardinality set as a closed interval.
variableCard, variableCardSet and variableCardInterval are the counterparts of these three elements in which the field (data) is a variable instead of a constant.
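The three constant cardinality specifications can be compared in a small example. Here they appear as members of a set term; the field names subj, adv, obj and pp are invented for illustration.

```xml
<set>
  <!-- card omitted: the DTD default "one" applies -->
  <constantCard data="subj"/>
  <!-- card given explicitly: any number of occurrences -->
  <constantCard data="adv" card="any"/>
  <!-- cardinality set listed explicitly: the integers 0 and 1 -->
  <constantCardSet data="obj">
    <integer data="0"/>
    <integer data="1"/>
  </constantCardSet>
  <!-- cardinality set given as the closed interval from 0 to 2 -->
  <constantCardInterval data="pp">
    <integer data="0"/>
    <integer data="2"/>
  </constantCardInterval>
</set>
```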
The set element specifies a set of terms (%terms;).
The list element specifies a list of terms (%terms;).
The record element specifies a record. To do so, it uses record specifications (%recSpecs;). A record specification is either a feature (feature), a variable feature (varFeature), a conjunction of record specifications (recordConj), or a disjunction of record specifications (recordDisj).
The setGen element specifies a set generator expression. The body of a set generator expression is a list of specifications (%setGenSpecs;). A set generator expression specification is either a constant (constant), a conjunction of set generator expression specifications (setGenConj), or a disjunction of set generator expression specifications (setGenDisj).
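Structurally, a set generator expression nests conjunctions and disjunctions over constants, as in the sketch below. The constants are hypothetical, and the example only shows the permitted nesting, not how the expression is evaluated.

```xml
<setGen>
  <setGenConj>
    <!-- a disjunction of two constants -->
    <setGenDisj>
      <constant data="masc"/>
      <constant data="fem"/>
    </setGenDisj>
    <!-- conjoined with a single constant -->
    <constant data="sg"/>
  </setGenConj>
</setGen>
```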
The featurePath element specifies a feature path. The required attribute root (down or up) corresponds to the root variable of the feature path, the required attribute dimension to the dimension variable, and the required attribute aspect to the aspect (entry or attrs). The constant children of the featurePath element correspond to the fields of the feature path. Note that the root variable value down corresponds to _ in the UL, and up to ^.
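For example, a feature path with two fields could be written as follows. The dimension variable D and the field names agr and num are assumptions made for this sketch.

```xml
<!-- root variable down (written _ in the UL), dimension variable D,
     aspect attrs, followed by the fields agr and num -->
<featurePath root="down" dimension="D" aspect="attrs">
  <constant data="agr"/>
  <constant data="num"/>
</featurePath>
```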
The annotation element specifies a type annotation for a term. Its first child is a term (%term;), and its second child a type (%type;).
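A type annotation pairs a term with a type, in that order. In this sketch, the type identifier example.label is hypothetical and would need a matching typeDef elsewhere in the file.

```xml
<annotation>
  <!-- first child: the term being annotated -->
  <set>
    <constant data="subj"/>
  </set>
  <!-- second child: its type -->
  <typeSet><typeRef idref="example.label"/></typeSet>
</annotation>
```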
The conj element specifies the conjunction of a list of terms (%terms;).
The disj element specifies the disjunction of a list of terms (%terms;).
The concat element specifies the concatenation of a list of terms (%terms;). Concatenation is restricted to strings.
The order element specifies an order generator for a list of terms (%terms;).