<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-1023">
  <Title>STOCK OF SHARED KNOWLEDGE - A TOOL FOR SOLVING PRONOMINAL ANAPHORA</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. MOTIVATION
</SectionTitle>
    <Paragraph position="0"> In our analysis, we work within the framework of the functional generative description (see Sgall, Hajičová and Panevová, 1986). We represent the meaning structure of a sentence as a dependency tree rooted in the main verb, the nodes of the tree being labelled by lexical and morphological meanings. The edges denote the underlying grammatical relations between nodes. All nodes of the tree can be either contextually bound (CB) - if the objects they denote are "given", "known" from the context - or non-bound (NB) - if they introduce new information into discourse.</Paragraph>
    <Paragraph position="1"> The meaning of a sentence represented by such a tree is then viewed as divided into two parts - a topic (T), "stating" what the sentence is about, and a focus (F), commenting on or developing the topic.</Paragraph>
    <Paragraph position="2"> The topic-focus articulation (TFA) of a sentence can be specified according to the sentence structure as follows (cf. Sgall 1979): (i) F contains the main verb iff the verb is NB; (ii) F contains all daughter nodes of the verb which are NB, together with all nodes subordinated to them (which in turn are either NB or CB); (iii) if the verb together with all daughter nodes is CB (and, therefore, none of (i),(ii) applies), F is defined with respect to a deeper embedded node.</Paragraph>
    <Paragraph position="3"> (This case is rather rare and we do not consider it in our analysis for the sake of simplicity.) (iv) T consists of all the nodes not contained in F.</Paragraph>
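As an illustration only, rules (i), (ii) and (iv) can be rendered as the following Python sketch; the Node class with its CB flag and children list is our own assumption, not part of the formal framework, and the rare case (iii) is left out, as in our analysis.

# Illustrative sketch of rules (i), (ii) and (iv); Node is a hypothetical
# representation of a node of the underlying dependency tree.

class Node:
    def __init__(self, label, is_cb, children=None):
        self.label = label          # lexical/morphological label
        self.is_cb = is_cb          # True = contextually bound (CB), False = NB
        self.children = children or []

def subtree(node):
    """The node together with all nodes subordinated to it."""
    nodes = [node]
    for child in node.children:
        nodes.extend(subtree(child))
    return nodes

def topic_focus(verb):
    """Split the tree rooted in the main verb into (topic, focus)."""
    focus = []
    if not verb.is_cb:                       # rule (i)
        focus.append(verb)
    for daughter in verb.children:           # rule (ii)
        if not daughter.is_cb:
            focus.extend(subtree(daughter))
    topic = [n for n in subtree(verb) if n not in focus]   # rule (iv)
    return topic, focus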
    <Paragraph position="4"> Thus, for the purpose of this paper, only the difference between NB nodes and CB nodes on the first level of dependency is taken into consideration while specifying the TFA of a sentence. We would like to show in the sequel that there is linguistic evidence which suggests that deeper levels of syntactic embedding (at least the second level) be accounted for in the resolution of anaphora.</Paragraph>
    <Paragraph position="5"> For the sake of simplicity, we represent the sentence schematically:</Paragraph>
    <Paragraph position="6"> verb ( G(1) ( G(3), G(4) ), G(2) ( G(5), G(6) ) )</Paragraph>
    <Paragraph position="7"> where G(1) is a group of CB nodes on the first level of dependency (belonging to T), G(3) and G(5) are CB nodes on the second level of dependency (belonging to T and F, respectively), G(2) is a group of NB nodes on the first level of dependency (belonging to F), and G(4) and G(6) are NB nodes on the second level of dependency (belonging to T and F, respectively).
2.1 Level of dependency in a syntactic tree
Let us introduce one of the examples which show the necessity of a further extension of the scale in the SSK. Consider the following sample of text - Hoskovec (1989): Ex.1: (1) At the railway station I saw a dog with long ears.</Paragraph>
    <Paragraph position="8"> (2) It was funny to observe them dangling in the wind.</Paragraph>
    <Paragraph position="9"> (3) I wondered how he happened to get there.  According to our Coling '90 paper there is no distinction in the SSK between dog and ears. Both are contained in the focus of (1), which means that they have the highest degree of salience in the next sentence. Such an account does not explain the fact that the above introduced order of sentences is possible and the order (1)-(3)-(2) does not constitute a coherent text - it seems to be impossible to refer to ears from the third sentence using the personal pronoun them as a referring expression.</Paragraph>
    <Paragraph position="10"> The scheme of the syntactic tree as introduced above offers us a key to the solution of this problem. From this point of view there is a distinction between dog and ears in the sentence (1). According to our scheme, the word dog stands in the position G(2), the word ears is in the position G(6). Both are contextually non-bound.</Paragraph>
    <Paragraph position="11"> Thus, examples along this line seem to suggest that the modified SSK has to take into account the distinction between immediate members of a respective verb frame and words which are embedded on a deeper level of the syntactic tree.</Paragraph>
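To make the positions used above explicit, the following sketch (our own illustration, with hypothetical arguments) maps a node's level of dependency, its contextual boundness and its topic/focus membership to the groups G(1)-G(6), and shows how dog and ears of Ex.1 end up in G(2) and G(6).

def g_position(level, is_cb, in_topic):
    """Map (dependency level, CB/NB, topic/focus) to the schematic group G(1)-G(6)."""
    if level == 1:
        return "G(1)" if is_cb else "G(2)"   # first level: CB in T, NB in F
    if in_topic:
        return "G(3)" if is_cb else "G(4)"   # deeper levels inside the topic
    return "G(5)" if is_cb else "G(6)"       # deeper levels inside the focus

# Ex.1, sentence (1): dog depends directly on the verb and is NB, ears is
# embedded one level deeper inside the focus and is also NB.
assert g_position(1, False, False) == "G(2)"   # dog
assert g_position(2, False, False) == "G(6)"   # ears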
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Contextual boundness and non-boundness
</SectionTitle>
      <Paragraph position="0"> The distinction between contextually bound and non-bound elements is also significant. Let us consider the following example from Hoskovec (1989): Ex.2:  (4) At the railway station I saw their dog. (5) I realized they would look for him the whole afternoon.</Paragraph>
      <Paragraph position="1"> Although this sample text seems to have the same distribution of pronouns as (1)-(3), the difference between the two texts shows when we change the order of sentences to (4)-(6)-(5). In the latter case, the change of the order is possible. Since the sentences (1) and (4) differ only in the contextual non-boundness of long ears vs. the contextual boundness of their, respectively, both expressions being on the second level of dependency, we conclude that the distinction between contextual boundness/non-boundness of the nodes in the syntactic tree of the sentence is important for the resolution of anaphora and, therefore, must be captured by the new version of the SSK.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3 Syntactic associations
</SectionTitle>
      <Paragraph position="0"> The notion of syntactic associations is introduced by means of slightly modified examples found in technical texts. Let us start with the following sample text: Ex.3:  (7) In the residence quarter of Brno it is possible to find a villa of professor Schmidt.</Paragraph>
      <Paragraph position="1"> (8) It was built during the thirties. (9) His other two houses are to be found in Olomouc and Jihlava.</Paragraph>
      <Paragraph position="2"> In this case the assignment of him to its antecedent is straightforward; although the expression professor Schmidt is in the focus part of (7), it does not depend directly on the governing verb and, moreover, it is contextually non-bound. At first sight this seems to be a counterexample to the above introduced scheme of the role of CB and NB elements of a sentence, namely, to the impossibility of referring to NB nodes on the second level of dependency by means of personal pronouns across one embedded sentence (see Ex. 1). However, we believe that the difference between (1)-(2)-(3) and (7)-(8)-(9) lies in the fact that his is in the third sentence accompanied by the full noun reference to the villa using a similar word (house), which certainly influences the salience of the item professor Schmidt. The structure of the noun phrase governed by villa in (7) is the same as the dependency structure of the noun phrase governed by houses in (9); therefore the salience of the item professor Schmidt is evidently higher than without that association. We can support our observations with the modified example: Ex.4:  (7) In the residence quarter of Brno it is possible to find a villa of professor Schmidt.</Paragraph>
      <Paragraph position="3"> (8) It was built during the thirties.</Paragraph>
      <Paragraph position="4"> (9a) He was known as a collector of paintings of young local painters.</Paragraph>
      <Paragraph position="5">  In our opinion, the process of assigning the antecedent professor Schmidt to the referring expression him is not as straightforward as in Ex.3; indeed, some of the hearers have difficulties with accepting Ex.4 as a valid text.</Paragraph>
      <Paragraph position="6"> The degree of the influence of syntactic associations on anaphora resolution can vary for different languages. It is also clear that at least a small stock of related notions plays a very important role in this mechanism. We will discuss these problems in more detail in Sect. 4 of this paper, where we show the approach for a particular language under consideration (Czech).</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.4 Topology
</SectionTitle>
      <Paragraph position="0"> We can use Ex.3 to show another important fact which has an influence on the reference assignment.</Paragraph>
      <Paragraph position="1"> The sentence (8) is a very simple one; in particular, it does not introduce any new element into the SSK except the word thirties. The situation is very different if we replace (8) by (8a): Ex.5:  (7) In the residence quarter of Brno it is possible to find a villa of professor Schmidt.</Paragraph>
      <Paragraph position="2"> (8a) The building was built by a group of architects in the late thirties.</Paragraph>
      <Paragraph position="3"> (9) His other two houses are to be found in Olomouc and Jihlava.  The reference by him in (9) is in this case still possible, but the text is not as clear as in Ex.3. Any other new element in (8a) makes the reference almost unclear.</Paragraph>
      <Paragraph position="4"> Supported by this observation, we believe that the linear distance between an antecedent and a referring expression also influences to some extent the salience of the referred item.</Paragraph>
      <Paragraph position="5"> It is clear that the function which expresses the degree of salience is not continuous. The end of the paragraph seems to have a strong effect: it leads to a drop of the salience of almost all possible antecedents except for those whose activation has been established by repeated mentioning in the previous paragraph. The exact values of the function are now the object of intensive investigation. We discuss some results of our investigations into this problem in Sect. 4 below.</Paragraph>
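The paragraph-boundary effect just described can be sketched as follows; the decay factor and the threshold for "repeated mentioning" are placeholders of our own, since the exact values of the salience function are still under investigation.

def end_of_paragraph_update(salience, occurrences_in_paragraph,
                            drop=0.5, repeated=2):
    """Lower the salience of items at a paragraph boundary, sparing the
    items whose activation was established by repeated mentioning."""
    updated = {}
    for item, value in salience.items():
        if occurrences_in_paragraph.get(item, 0) >= repeated:
            updated[item] = value            # activation kept
        else:
            updated[item] = value * drop     # sharp drop for everything else
    return updated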
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.5 Existence of competitors
</SectionTitle>
      <Paragraph position="0"> The last feature which is considered in our system is the role of competing elements. We can demonstrate the problem by means of a slight change of (8a), which introduces a new competing element into the text: Ex.6: (7) In the residence quarter of Brno it is possible to find a villa of professor Schmidt.</Paragraph>
      <Paragraph position="1"> (8b) The building was built by architect Hovorka in the late thirties.</Paragraph>
      <Paragraph position="2"> (9) His other two houses are to be found in Olomouc and Jihlava.</Paragraph>
      <Paragraph position="3"> In this case professor Schmidt is no longer available as an antecedent for pronominal anaphora, since architect Hovorka has a greater degree of salience and the same morphological categories. All the previous examples show the necessity of including into the evaluation procedure of the SSK not only the notion of contextual boundness, but also associations, the complexity of the sentences and the existence/non-existence of possible competitors. Their role in the evaluation procedure is described in more detail in the following paragraph.</Paragraph>
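The competitor effect of Ex.6 can be pictured by the following toy fragment; the salience values are invented for this illustration only.

# Both candidates pass the morphological filter (masculine singular nouns),
# so the higher degree of salience decides; the numbers are invented.
candidates = {"professor Schmidt": 0.4, "architect Hovorka": 0.7}
antecedent = max(candidates, key=candidates.get)    # "architect Hovorka"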
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3. THE GENERAL EVALUATING PROCEDURE
</SectionTitle>
    <Paragraph position="0"> Before we start the explanation of our evaluation procedure, we must make clear that we restrict ourselves in our considerations to those items of knowledge (i.e. the mental representations of the objects of the outer world) referred to in the sentence by a noun or by a pronoun. The starting conditions for the evaluating procedure are then as follows: We assume that our procedure is a part of a larger complex system which is able to provide our procedure with the result of syntactico-semantic parsing of any sentence in the form of a dependency tree as a representation of the meaning of the sentence in the sense of Sgall, Hajičová and Panevová (1986). We do not assume the existence of any special knowledge base, any semantic evaluation procedure or semantic features present in the syntactic tree. For the time being we restrict ourselves to those items (mental objects) that are rendered by nouns or pronouns.</Paragraph>
    <Paragraph position="1"> The SSK as a basic data structure can be viewed in our modified account as a set of items, which represent all mental objects rendered by nouns or pronouns from the respective text. Each data entry has the form of an ordered quintuple:</Paragraph>
    <Paragraph position="2"> (LEX, MORPH, LAST, SYNT, OCCUR)</Paragraph>
    <Paragraph position="3"> LEX represents the lexical value of the item;</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
MORPH
</SectionTitle>
    <Paragraph position="0"> is a set of morphological characteristics of the word (e.g. gender, number, etc.). These characteristics are used in the so-called morphological filter, which filters out the impossible antecedents of the referring expression.</Paragraph>
    <Paragraph position="1"> LAST are the coordinates of the latest occurrence of the word or of the pronominal reference to it. These coordinates are composed of a "surface" and a "deep" part. The "surface" coordinates contain the number of the sentence and a serial number of the node in the sentence structure, and they serve as a basis for the "topological" part of the evaluation procedure.</Paragraph>
    <Paragraph position="2"> The "deep" part contains the code for the position of the word in the syntactic tree as introduced above (G(i)). This information determines the contextual (non)boundness of the word.</Paragraph>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
SYNT
</SectionTitle>
    <Paragraph position="0"> contains the data about the syntactic structure of the sentence where the respective LEX was mentioned for the last time. The structure is represented only partially, by means of pointers, which point to the governing node and also to all dependent nodes if they are contained in the SSK.</Paragraph>
    <Paragraph position="1"> This system of syntactic pointers serves as a basic data structure for the simple handling of associations.</Paragraph>
  </Section>
  <Section position="8" start_page="0" end_page="0" type="metho">
    <SectionTitle>
OCCUR
</SectionTitle>
    <Paragraph position="0"> is a pair of integers which represent the number of occurrences of the given item both from the beginning of the text and from the beginning of the paragraph.</Paragraph>
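A minimal sketch of one SSK entry follows; the field names and types are our own assumptions derived from the descriptions of LEX, MORPH, LAST, SYNT and OCCUR above, not the authors' implementation.

from dataclasses import dataclass, field

@dataclass
class SSKItem:
    lex: str                        # LEX: lexical value of the item
    morph: dict                     # MORPH: e.g. {"gender": "masc", "number": "sg"}
    last_surface: tuple             # LAST, surface part: (sentence number, node serial number)
    last_deep: str                  # LAST, deep part: position code G(1)..G(6)
    synt_governor: object = None    # SYNT: pointer to the governing item, if present in the SSK
    synt_dependents: list = field(default_factory=list)   # SYNT: dependent items present in the SSK
    occur_text: int = 0             # OCCUR: occurrences since the beginning of the text
    occur_paragraph: int = 0        # OCCUR: occurrences since the beginning of the paragraph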
    <Paragraph position="1"> The algorithm processes the given text sentence by sentence. It receives the dependency tree of a new sentence from the syntactico-semantical preprocessor, together with the list of all the pronominal referring expressions contained in the sentence. Each referring expression in this list carries the information about its position in the sentence (the same as LAST) and about its form (weak or strong pronoun, etc.). Using SSK, the algorithm finds the antecedents for all referring expressions. Afterwards, it changes the degrees of salience of the items in the SSK and reads the next sentence from the input.</Paragraph>
    <Paragraph position="2"> Having stated the general idea of the algorithm, we can describe the evaluation process in more detail as follows:</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Algorithm:
</SectionTitle>
      <Paragraph position="0"> (i) Read an input (the syntactic structure of the new sentence and the list of referring expressions). For every referring expression R_i, i = 1,...,k in the list do the following (preserve the order of the referring expressions with regard to the hierarchy of communicative dynamism in case the sentence contains more than one referring expression): a) Use the morphemic filter to filter out all units from the SSK which cannot be considered as possible antecedents of the referring expression R_i.</Paragraph>
      <Paragraph position="1"> b) Apply the evaluating function E(w) to all possible antecedents W_i^1, W_i^2, ..., W_i^(l_i) and sort them according to the obtained results, from the most probable antecedent W_i^1 to the least probable antecedent W_i^(l_i). (ii) For all referring expressions R_i and all results of the evaluation W_i^j, i = 1,...,k; j = 1,...,l_i, find the best solution. Thus we are looking for the optimal solution of anaphora for the sentence as a whole, since some "best" solution for a particular expression can block successful reference assignment for other referring expressions (cf.</Paragraph>
      <Paragraph position="2"> examples in Hajičová, Kuboň and Kuboň, 1990).</Paragraph>
      <Paragraph position="3"> Generally, this is a computationally expensive solution, but in practice the number of referring expressions and possible antecedents is strongly limited and, therefore, this phase does not impose a serious restriction on the performance of the algorithm.</Paragraph>
      <Paragraph position="4"> (iii) Update the data in the SSK:
  - change OCCUR if the item was mentioned or referred to in the current sentence
  - add items mentioned for the first time into the SSK
  - remove all the items with a value of the salience function smaller than some constant THRESHOLD (which may vary with respect to the type of the text and the particular language).
The function of salience has the form:</Paragraph>
      <Paragraph position="6"> where w is the item of the SSK under consideration, O is the number of occurrences of the item in the given paragraph, N is the serial number of the current utterance (in the given paragraph), and L is the serial number of the utterance in the paragraph where this item was mentioned for the last time.</Paragraph>
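The overall shape of steps (i)-(iii) may be sketched as follows, reusing the SSKItem sketch above; the morphemic filter, the evaluating function E(w), the global assignment step and the salience function are passed in as parameters, since their concrete forms are discussed in Sect. 3.2 and 4, and THRESHOLD is a placeholder value.

THRESHOLD = 0.1   # placeholder; the paper leaves it text- and language-dependent

def resolve_sentence(ssk, sentence_items, referring_expressions,
                     morph_filter, evaluate, best_global_assignment, salience):
    # (i) for every referring expression (in the order of communicative
    #     dynamism) filter and rank the candidate antecedents
    ranked = {}
    for r in referring_expressions:
        candidates = [w for w in ssk if morph_filter(w, r)]            # (i a)
        candidates.sort(key=lambda w: evaluate(w, r), reverse=True)    # (i b)
        ranked[r] = candidates

    # (ii) choose the best assignment for the sentence as a whole, so that a
    #      locally "best" antecedent does not block the other expressions
    assignment = best_global_assignment(ranked)

    # (iii) update the SSK: occurrence counts, newly mentioned items, pruning
    for item in sentence_items:
        known = next((w for w in ssk if w.lex == item.lex), None)
        if known is not None:
            known.occur_text += 1
            known.occur_paragraph += 1
            known.last_surface = item.last_surface
            known.last_deep = item.last_deep
        else:
            ssk.append(item)
    ssk[:] = [w for w in ssk if salience(w) >= THRESHOLD]
    return assignment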
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 The general evaluating function
</SectionTitle>
      <Paragraph position="0"> This function is essential for the whole process of anaphora resolution. Also, it is considerably more dependent on the language under consideration than all the other parts of the process. For this reason we have divided its description into two parts: in this section we describe the function only generally. The method of customizing all the constants according to the needs of a particular language (in our case Czech) is described in Sect. 4 below.</Paragraph>
    </Section>
  </Section>
  <Section position="9" start_page="0" end_page="0" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> The basic form of the function is:</Paragraph>
    <Paragraph position="1"> E(w) = c_1 * f_1(w) + c_2 * f_2(w) + ... + c_n * f_n(w)</Paragraph>
    <Paragraph position="2"> where f_i is a function describing the value of the factor i and c_i is a constant expressing the weight of the factor i.</Paragraph>
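A direct rendering of this weighted sum as a sketch; the factor functions f_i and the weights c_i are supplied by the caller, since both are customized for the particular language (see Sect. 4).

def evaluating_function(w, factors, weights):
    """E(w) = c_1*f_1(w) + ... + c_n*f_n(w); factors are the f_i, weights the c_i."""
    assert len(factors) == len(weights)
    return sum(c * f(w) for f, c in zip(factors, weights))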
  </Section>
  <Section position="10" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4. THE METHOD OF THE CUSTOMIZATION
OF THE EVALUATING FUNCTION
</SectionTitle>
    <Paragraph position="0"> In this paragraph we want to show the method chosen for finding the values of the c_i and f_i for the particular language.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 First step of the method is to find the form of
</SectionTitle>
      <Paragraph position="0"> f_i for all factors taken into account. All functions should have a common value range. The balance of influence of all factors is achieved with the help of the constants c_i. After a complex examination of Czech texts (with a special stress on technical texts) we have come to the following results: a) Contextual boundness - the word w is either bound or non-bound, therefore f_1 takes only two values.</Paragraph>
      <Paragraph position="2"> b) Underlying structure - for the definition of this function it is necessary to extend our schema from Sect. 2 deeper than to the second level of dependency. The rule for the extension is the following: all deeper levels consist only of nodes belonging to the groups G(3)-G(6), so that any governing node in the topic governs nodes G(3) and G(4), and the governing nodes from the focus govern nodes G(5) and G(6).</Paragraph>
      <Paragraph position="3"> The function f_2 has been assigned the following tentative forms:</Paragraph>
      <Paragraph position="5"> The motivation for this distribution of values can be found in Hoskovec (1989).</Paragraph>
      <Paragraph position="6"> c) Associations - if the word w_1 depends directly on the word w_2, it shares a part of the value of E(w_2). We do not restrict the dependency only to immediate dominance, but words on a deeper level share less of the value E(w_2). We also take into account that one word can in principle be associated with more than one other member of the SSK. Therefore the form of the function f_3 is the following: w_1, ..., w_n are the governing words of w, ordered according to the syntactic level (w_1 being the immediate governor of w).</Paragraph>
      <Paragraph position="8"> d) Linear distance - this function is quite simple; it is only necessary to count the linear distance between w and the possible referring expression. The counting is easy - we count only the members of the SSK.</Paragraph>
      <Paragraph position="9"> The function is simple:</Paragraph>
      <Paragraph position="11"> where d is the distance between the word w and a possible referring expression.</Paragraph>
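The shapes of the four factor functions can be sketched as follows, reusing the last_deep field of the SSKItem sketch in Sect. 3; every concrete number (the two values of f_1, the distribution over G(1)-G(6), the sharing factor of f_3 and the form of f_4) is a placeholder of our own, not a value taken from the paper.

CB_POSITIONS = ("G(1)", "G(3)", "G(5)")     # the CB groups of the schema in Sect. 2

def f1_contextual_boundness(w):
    # two-valued: one value for CB words, another for NB words (placeholders)
    return 1.0 if w.last_deep in CB_POSITIONS else 0.0

G_VALUES = {"G(1)": 1.0, "G(2)": 0.8, "G(3)": 0.6,
            "G(4)": 0.5, "G(5)": 0.4, "G(6)": 0.2}   # placeholder distribution

def f2_underlying_structure(w):
    # a tentative value per position G(1)-G(6) in the extended schema
    return G_VALUES[w.last_deep]

def f3_associations(governors, evaluate, share=0.5):
    # w shares a part of E(w_i) of each governing word w_i (w_1 = immediate
    # governor); deeper governors contribute less; `share` is a placeholder
    return sum((share ** (i + 1)) * evaluate(wi) for i, wi in enumerate(governors))

def f4_linear_distance(d):
    # decreases with the linear distance d, counted over SSK members only
    return 1.0 / (1.0 + d)                   # placeholder form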
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.2 There is of course a significant difference
</SectionTitle>
      <Paragraph position="0"> between the way of computing f_i and c_i. The latter is a constant which describes the role of the particular factor in the respective language.</Paragraph>
      <Paragraph position="1"> For the evaluation of the weights c_i we use the following method: in real texts we look for pieces of text with a complicated referring structure. Any such text is modified by adding or removing items. The results are given to a group of randomly chosen native speakers, who are asked to mark the understandability of all the texts. One example of this method is given here by the modification of sentences (8) and (9) in Ex. 4 and 5 above.</Paragraph>
      <Paragraph position="2"> The basic constraint on the c_i is described in the following equation: c_1 + c_2 + ... + c_n = 1, which means that every c_i describes the role of the factor i in per cent. This constraint serves the purpose of keeping the balance between the particular factors under control. It is also useful in the case of some future extension of the whole system by adding new factors.</Paragraph>
      <Paragraph position="3"> There can of course be other constraints according to the needs of a particular language. We do not have any additional constraint for Czech at the moment.</Paragraph>
      <Paragraph position="4"> The work on collecting material for the tests on the c_i is now in progress. The following constants were chosen as initial values:
  - contextual boundness and non-boundness 0.25
  - syntactic structure of the sentence 0.25
  - associations 0.25
  - linear distance 0.25</Paragraph>
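With these initial weights the evaluating function reduces to a plain average of the four factors; a usage sketch reusing the helpers above (item, governors and distance are hypothetical inputs):

def E(item, governors, distance, evaluate_governor):
    """Combine the four factor functions with the equal initial weights."""
    factors = [
        f1_contextual_boundness(item),
        f2_underlying_structure(item),
        f3_associations(governors, evaluate_governor),
        f4_linear_distance(distance),
    ]
    weights = [0.25, 0.25, 0.25, 0.25]   # c_1..c_4, with c_1 + c_2 + c_3 + c_4 = 1
    return sum(c * f for c, f in zip(weights, factors))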
    </Section>
  </Section>
</Paper>