5.2 Formal Presentation

In this section, we briefly outline the formal presentation of dependency grammar presented in [Duc99a]. The basic idea is that each node of the dependency tree must be assigned a lexical entry from the lexicon, and that certain principles of well-formedness must be verified. For example, the lexical entry stipulates a valency, and the node must have precisely the outgoing dependency edges that realize this valency.

A dependency grammar is a 7-tuple

$\TUP{\setof{Words},\setof{Cats},\setof{Args},\setof{Comps},\setof{Mods},\setof{Lexicon},\setof{Rules}}$

$\setof{Words}$: a finite set of strings notating fully inflected forms of words.
$\setof{Cats}$: a finite set of categories such as n (noun), det (determiner), or vfin (finite verb).
$\setof{Args}$: a finite set of (what we call) agreement tuples such as <masc sing 3 nom>.
$\setof{Comps}$: a finite set of complement roles, such as subject or zu_infinitive.
$\setof{Mods}$: a finite set of modifier roles, such as adj (adjectives), disjoint from $\setof{Comps}$ . We write $\setof{Roles}=\setof{Comps}\uplus\setof{Mods}$ for the set of all types of roles.
$\setof{Lexicon}$: a finite set of lexical entries (see below).
$\setof{Rules}$: a finite family of binary predicates, indexed by role labels, expressing local grammatical principles: for each $\rho\in\setof{Roles}$ , there is $\Gamma_\rho\in\setof{Roles}$ such that $\Gamma_\rho(w_1,w_2)$ characterizes the grammatical admissibility of a dependency edge labeled $\rho$ from mother to daughter .

A lexical entry is an attribute value matrix (AVM) with signature:

$\avm{% {string}{\setof{Words}}% {cat}{\setof{Cats}}% {agr}{\setof{Agrs}}% {comps}{\mbox{$2^{\text{\setof{Comps}}}$}}}$

and we write attribute access in functional notation. If is a lexical entry, then $\Feature{string}(e)$ is the full form of the corresponding word, $\Feature{cat}(e)$ is the category, $\Feature{agr}(e)$ is the agreement tuple, and $\Feature{comps}(e)$ the valency expressed as a set of complement roles.

5.2.1 Dependency Tree

We assume given a set of nodes $\VV$ which for simplicity we identify with the set of integers $\{1,\ldots,n\}$ , and a set $\EE\subseteq\VV\times\VV\times\setof{Roles}$ of labeled directed edges between these node. $(\VV,\EE)$ is a directed graph, in the classical sense, with labeled edges. We restrict our attention to finite graphs that are also trees.

A dependency tree is then defined as a pair $\TT\equiv((\VV,\EE),\Feature{entry})$ of a tree as stipulated above and a function $\Feature{entry}$ mapping each node in $\VV$ to a lexical entry in $\setof{Lexicon}$ .

5.2.2 Well-Formedness Principles

Not all dependency trees as described above are grammatically admissible. We now describe the conditions of admissibility, aka well-formedness.

While we have identified nodes with integers, we still prefer to write w_i (and often just or ) instead of to remind the reader that they correspond to words and to distinguish them from other occurrences of integers that have no such interpretation. We also write $w\EDGE\rho w'$ to represent an edge labeled $\rho$ from mother to daughter .

First, any complement required by a node's valency must be realized precisely once:

$\forall w\in\VV,\ \forall\rho\in\Feature{entry}(w),\ \exists! w'\in\VV,\ w\EDGE\rho w'\in\EE$

Second, if there is an outgoing edge from , then it must be labeled by a modifier role or by a complement role in 's valency:

$\forall w\EDGE\rho w'\in\EE,\ \rho\in\setof{Mods}\UNION \Feature{comps}(\Feature{entry}(w))$

Third, whenever there is an edge $w\EDGE\rho w'$ , then the grammatical condition $\Gamma_\rho(w,w')$ for $\Gamma_\rho\in\setof{Rules}$ must be satisfied in $\TT$ :

$\forall w\EDGE\rho w'\in\EE,\ \TT\models\Gamma_\rho(w,w')$