<?xml version="1.0" standalone="yes"?>
<Paper uid="J89-3001">
  <Title>PRACTICAL PARSING OF GENERALIZED PHRASE STRUCTURE GRAMMARS</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 DEFINITIONS
</SectionTitle>
    <Paragraph position="0"> Preliminary definitions. The following definitions, of standard terms of formal language theory, are given in, for example, Salomaa (1973).</Paragraph>
    <!-- Running header from the source page: 140 Computational Linguistics, Volume 15, Number 3, September 1989; Anthony J. Fisher, Practical Parsing of Generalized Phrase Structure Grammars -->
    <Paragraph position="1"> An alphabet is a finite non-empty set. The elements of an alphabet are called letters. A word over an alphabet V is a finite string consisting of zero or more letters of V, whereby the same letter may occur several times. The set of all words over an alphabet V is denoted by W(V). For any V, W(V) is infinite.</Paragraph>
    <Paragraph position="2"> We denote by $\Pi(S)$ the powerset of a set S, that is, the set of all subsets of S.</Paragraph>
    <Paragraph position="3"> Definitions. A generalised phrase structure grammar (GPSG) is an ordered 7-tuple $G = (V_F, V_T, X_0, R, F, F_P, F_T)$, where: $V_F$ is a finite set of features; $V_T$ is a finite set of terminals, $V_F \cap V_T = \emptyset$; $X_0$ is the starting category, a finite subset of $V_F$; R is the set of rules, a finite set of ordered pairs $P \to Q$, such that P is a subset of $V_F$ (i.e. $P \in \Pi(V_F)$), and Q is a word over the alphabet $V = \Pi(V_F) \cup V_T$; F is the FCR set, a function from $\Pi(V_F)$ to {true, false}; $F_P$ is the set of percolating features, a subset of $V_F$; and $F_T$ is the set of trickling features, a subset of $V_F$.</Paragraph>
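    <Paragraph>As an illustration only, the 7-tuple above can be written down as a small data structure. The sketch below is not part of the original definition: the class name GPSG, the field names, the choice of strings for features and terminals, and the toy grammar are all assumptions made for the example. It checks only the side conditions stated in the definition (disjoint feature and terminal sets, and the subset requirements).</Paragraph>
    <Paragraph><![CDATA[
```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple, Union

# Hypothetical encodings chosen for this sketch.
Feature = str
Terminal = str
Category = FrozenSet[Feature]                 # an element of the powerset of V_F
Symbol = Union[Category, Terminal]            # a letter of V = powerset(V_F) | V_T
Rule = Tuple[Category, Tuple[Symbol, ...]]    # the ordered pair P -> Q


@dataclass(frozen=True)
class GPSG:
    features: FrozenSet[Feature]       # V_F
    terminals: FrozenSet[Terminal]     # V_T
    start: Category                    # X_0
    rules: FrozenSet[Rule]             # R
    fcr: Callable[[Category], bool]    # F
    percolating: FrozenSet[Feature]    # F_P
    trickling: FrozenSet[Feature]      # F_T

    def __post_init__(self):
        # V_F and V_T must be disjoint.
        if self.features & self.terminals:
            raise ValueError("features and terminals must be disjoint")
        # X_0, F_P and F_T must each be subsets of V_F.
        for name in ("start", "percolating", "trickling"):
            if not getattr(self, name) <= self.features:
                raise ValueError(name + " must be a subset of the feature set")
        # Each rule P -> Q has P a subset of V_F and Q a word over V.
        for mother, daughters in self.rules:
            if not mother <= self.features:
                raise ValueError("rule mother must be a subset of the feature set")
            for symbol in daughters:
                if isinstance(symbol, frozenset):
                    if not symbol <= self.features:
                        raise ValueError("daughter category uses an unknown feature")
                elif symbol not in self.terminals:
                    raise ValueError("unknown terminal: " + repr(symbol))


# A toy grammar (invented for illustration) with no FCRs and no propagation.
toy = GPSG(
    features=frozenset({"S", "NP", "VP"}),
    terminals=frozenset({"kim", "sleeps"}),
    start=frozenset({"S"}),
    rules=frozenset({
        (frozenset({"S"}), (frozenset({"NP"}), frozenset({"VP"}))),
        (frozenset({"NP"}), ("kim",)),
        (frozenset({"VP"}), ("sleeps",)),
    }),
    fcr=lambda category: True,
    percolating=frozenset(),
    trickling=frozenset(),
)
```
]]></Paragraph>
    <Paragraph>Representing a category as a frozenset keeps it hashable, so that rules can themselves be stored in a set, mirroring the statement that R is a finite set of ordered pairs.</Paragraph>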
    <Paragraph position="5"> that the following all hold: 1. $\alpha P' \beta = P$ and $\alpha Q'_1 Q'_2 \cdots Q'_n \beta = Q$; 2. $P'' \to Q''_1 Q''_2 \cdots Q''_n \in R$; 3. [Extension:] (i) $P'' \subseteq P'$; (ii) if $Q'_i \in V_T$ then $Q''_i = Q'_i$, and if $Q'_i \in \Pi(V_F)$ then $Q''_i \subseteq Q'_i$ ($i = 1, 2, \ldots, n$); 4. [FCR constraints:] $F(P')$; 5. [Propagation constraints:] (i) $Q'_i \in V_T \vee (Q'_i \cap F_P) \subseteq P'$ ($i = 1, 2, \ldots, n$); (ii) $Q'_i \in V_T \vee (P' \cap F_T) \subseteq Q'_i$ ($i = 1, 2, \ldots, n$). We denote by $\Rightarrow^*$ the reflexive and transitive closure of $\Rightarrow$.</Paragraph>
    <Paragraph position="6"> The language generated by G, written L(G), is defined by</Paragraph>
    <Paragraph position="8"> End of definitions.</Paragraph>
    <Paragraph position="9"> Informally, the definitions of GPSG, $\Rightarrow$, $\Rightarrow^*$ and L(G) are the standard definitions of a context-free grammar, modified by defining the non-terminal of the standard CFG definition as a set of features. Furthermore, the standard definition of $\Rightarrow$ has been extended to take into account feature matching by the free addition of features to categories specified in rules (extension), the FCR constraints, and the propagation constraints.</Paragraph>
    <Paragraph position="10"> The original definition of GPSGs postulated a set of FCRs whose conjunction is required to hold; this conjunction has been collapsed into a single Boolean function F in our definition. F is required to hold on each non-terminal node of the parse tree by virtue of condition 4 above. (It is unnecessary to specify that $F(Q'_i)$ must hold; this is implied by the use of the reflexive and transitive closure of $\Rightarrow$ in the definition of L(G).) Finally, condition 5(i) requires that, if P' is the mother of a non-terminal node $Q'_i$ (considering P' and $Q'_i$ as nodes in a parse tree), then for each feature f which has been defined as percolating from daughter to mother (i.e. is a member of $F_P$), if f is present on the daughter node, then it must be present also on the mother.</Paragraph>
    <Paragraph position="11"> Condition 5(ii) is the corresponding statement for trickling features.</Paragraph>
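    <Paragraph>The extension, FCR and propagation conditions (3 to 5) can be read as a test on a single local tree: a mother category, its daughters, and the rule that is claimed to license them. The following is a minimal sketch under assumptions of our own for the example: categories are frozensets of strings, terminals are plain strings, and the function name licenses, its parameter names, and the feature names SLASH and FIN are hypothetical. It does not implement the derivation relation itself or any parsing algorithm.</Paragraph>
    <Paragraph><![CDATA[
```python
def is_terminal(symbol):
    # In this sketch a terminal is a plain string; a category is a frozenset.
    return isinstance(symbol, str)


def licenses(rule_mother, rule_daughters, mother, daughters,
             fcr, percolating, trickling):
    """Return True if the local tree (mother, daughters) satisfies
    conditions 3 to 5 with respect to the rule (rule_mother, rule_daughters)."""
    if len(rule_daughters) != len(daughters):
        return False
    # Condition 3(i): the rule's mother is a subset of the instantiated mother.
    if not rule_mother <= mother:
        return False
    # Condition 4: the FCR function must hold on the instantiated mother.
    if not fcr(mother):
        return False
    for rule_d, d in zip(rule_daughters, daughters):
        if is_terminal(d):
            # Condition 3(ii), terminal case: the symbols must be identical.
            if rule_d != d:
                return False
            continue  # conditions 5(i) and 5(ii) hold trivially for terminals
        # Condition 3(ii), category case: the rule's daughter is a subset.
        if is_terminal(rule_d) or not rule_d <= d:
            return False
        # Condition 5(i): percolating features on the daughter reach the mother.
        if not (d & percolating) <= mother:
            return False
        # Condition 5(ii): trickling features on the mother reach the daughter.
        if not (mother & trickling) <= d:
            return False
    return True


# Hypothetical rule S -> NP VP, with SLASH percolating and FIN trickling.
rule = (frozenset({"S"}), (frozenset({"NP"}), frozenset({"VP"})))
percolating = frozenset({"SLASH"})
trickling = frozenset({"FIN"})
no_fcrs = lambda category: True

good = licenses(rule[0], rule[1],
                frozenset({"S", "FIN", "SLASH"}),
                (frozenset({"NP", "FIN"}), frozenset({"VP", "FIN", "SLASH"})),
                no_fcrs, percolating, trickling)

# SLASH is on a daughter but missing from the mother, so 5(i) fails.
bad = licenses(rule[0], rule[1],
               frozenset({"S", "FIN"}),
               (frozenset({"NP", "FIN"}), frozenset({"VP", "FIN", "SLASH"})),
               no_fcrs, percolating, trickling)
```
]]></Paragraph>
    <Paragraph>In this sketch the two propagation checks are skipped for terminal daughters, which corresponds to the first disjunct of conditions 5(i) and 5(ii).</Paragraph>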
    <Paragraph position="12"> Our definition is given in terms of features that can be either present in or absent from a category, whereas in the original definition a feature has a value. This distinction is merely a mathematical device to simplify the definition and the discussion of the algorithm which will follow. Our definition can be related to the original definition by reading "feature" as "feature-value pair". For example, a "real" GPSG might contain a feature PAST which can take a value which is either + or -.</Paragraph>
    <Paragraph position="13"> We interpret that as two separate features, say PAST+ and PAST-. The standard definition would require that a category may not contain both PAST+ and PAST-.</Paragraph>
    <Paragraph position="14"> This can be expressed by conjoining the FCR $\neg(\text{PAST+} \wedge \text{PAST-})$ to the FCR set. Such an FCR is called a group FCR.</Paragraph>
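    <Paragraph>To make the group FCR concrete, the sketch below builds the Boolean function F as a conjunction of individual FCRs, one of which forbids PAST+ and PAST- from occurring together. The helper names exclusion_fcr and conjoin are hypothetical and are introduced only for this example; categories are again assumed to be frozensets of feature names.</Paragraph>
    <Paragraph><![CDATA[
```python
def exclusion_fcr(first, second):
    """Group FCR for one two-valued feature: not (first and second)."""
    return lambda category: not (first in category and second in category)


def conjoin(fcrs):
    """Collapse a collection of FCRs into a single Boolean function F."""
    fcrs = tuple(fcrs)
    return lambda category: all(fcr(category) for fcr in fcrs)


# F holds on a category unless it contains both PAST+ and PAST-.
F = conjoin([exclusion_fcr("PAST+", "PAST-")])

past_only = F(frozenset({"V", "PAST+"}))            # consistent category
clash = F(frozenset({"V", "PAST+", "PAST-"}))       # violates the group FCR
```
]]></Paragraph>
    <Paragraph>Further FCRs would simply be added to the list passed to conjoin, which mirrors the way the definition collapses the conjunction of all FCRs into the single function F.</Paragraph>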
  </Section>
</Paper>