<?xml version="1.0" standalone="yes"?>
<Paper uid="E89-1010">
  <Title>Ambiguity Resolution in the DMTRANS PLUS</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Overview of DMTRANS PLUS
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Memory Access Parsing
</SectionTitle>
      <Paragraph position="0"> DMTRANS PLUS is a second generation DMA system based upon DMTRANS (\[19\]) with new methods of ambiguity resolution based on costs.</Paragraph>
      <Paragraph position="1"> Unlike most natural language systems, which are based on the &quot;Build-and-Store&quot; model, our system employs a &quot;Recognize-and-Record&quot; model (\[14\], \[19\], \[21\]). Understanding of an input sentence (or speech input in ΦDMTRANS PLUS) is defined as changes made in a memory network. Parsing and natural language understanding in these systems are considered to be memory-access processes, identifying existent knowledge in memory with the current input. Sentences are always parsed in context, i.e., through utilizing the existing and (currently acquired) knowledge about the world. In other words, during parsing, relevant discourse entities in memory are constantly being remembered. The model behind DMTRANS PLUS is a simulation of such a process. The memory network incorporates knowledge from morphophonetics to discourse. Each node represents a concept (Concept Class node; CC) or a sequence of concepts (Concept Sequence Class node; CSC).</Paragraph>
      <Paragraph position="2"> CCs represent such knowledge as phones (i.e. \[k\]), phonemes (i.e. /k/), concepts (i.e. *Hand-Gun, *Event, *Mtrans-Action), and plans (i.e. *Pick-Up-Gun). A hierarchy of Concept Class (CC) entities stores knowledge both declaratively and procedurally as described in \[19\] and \[21\]. Lexical entries are represented as lexical nodes, which are a kind of CC.</Paragraph>
      <Paragraph position="3"> Phoneme sequences are used only for ΦDMTRANS PLUS, the speech-input version of DMTRANS PLUS.</Paragraph>
      <Paragraph position="4"> CSCs represent sequences of concepts such as phoneme sequences (i.e. &lt;/k/ /a/ /i/ /g/ /i/&gt;), concept sequences (i.e. &lt;*Conference *Goal-Role *Attend *Want&gt;), and plan sequences (i.e. &lt;*Declare-Want-Attend *Listen-Instruction&gt;). The linguistic knowledge represented as CSCs can be low-level surface-specific patterns such as phrasal lexicon entries \[1\] or material at higher levels of abstraction such as in MOPs \[16\]. However, CSCs should not be confused with 'discourse segments' \[6\]. In our model, the information represented in discourse segments is distributively incorporated in the memory network.</Paragraph>
      <Paragraph position="5"> During sentence processing we create concept instances (CIs) corresponding to CCs and concept sequence instances (CSIs) corresponding to CSCs. This is a substantial improvement over past DMA research.</Paragraph>
      <Paragraph position="6"> Lack of instance creation and reference in past research was a major obstacle to seriously modelling discourse phenomena.</Paragraph>
      <Paragraph position="7"> CIs and CSIs are connected through several types of links. A guided marker passing scheme is employed for inference on the memory network following methods adopted in past DMA models.</Paragraph>
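As a concrete but purely illustrative sketch of the four node types described above, the memory network could be modeled with four small classes. All class names, fields, and the example hierarchy below are our own assumptions, not the paper's implementation:

```python
class CC:
    """Concept Class node: a concept such as *Hand-Gun or *Event."""
    def __init__(self, name, isa=None):
        self.name = name
        self.isa = isa          # parent CC along the abstraction (is-a) link
        self.instances = []     # CI nodes created during discourse

class CSC:
    """Concept Sequence Class node: an ordered sequence of CC nodes."""
    def __init__(self, name, elements):
        self.name = name
        self.elements = elements
        self.instances = []     # CSI nodes

class CI:
    """Concept Instance: a specific discourse entity under a CC."""
    def __init__(self, cc, label):
        self.cc = cc
        self.label = label
        cc.instances.append(self)

class CSI:
    """Concept Sequence Instance: packages CIs under an accepted CSC."""
    def __init__(self, csc, role_bindings):
        self.csc = csc
        self.role_bindings = role_bindings  # maps role names to CIs
        csc.instances.append(self)

# A tiny fragment of the hierarchy, plus one discourse entity.
event = CC("*Event")
shoot = CC("*Shoot", isa=event)
mary = CI(CC("*Mary"), "MARY1")
```

The packaging links from a CSI to its role-filling CIs are kept here as a plain dictionary; the paper describes several link types, which this sketch does not distinguish.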
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
DMTRANS PLUS uses three markers for parsing:
</SectionTitle>
    <Paragraph position="0"> * An Activation Marker (A-Marker) is created when a concept is initially activated by a lexical item or as a result of concept refinement. It indicates which instance of a concept is the source of activation and contains relevant cost information.</Paragraph>
    <Paragraph position="1"> A-Markers are passed upward along is-a links in the abstraction hierarchy.</Paragraph>
    <Paragraph position="2"> * A Prediction Marker (P-Marker) is passed along a concept sequence to identify the linear order of concepts in the sequence. When an A-Marker reaches a node that has a P-Marker, the P-Marker is sent to the next element of the concept sequence, thus predicting which node is to be activated next.</Paragraph>
    <Paragraph position="3"> * A Context Marker (C-Marker) is placed on a node which has contextual priming.</Paragraph>
    <Paragraph position="4"> Information about which instances originated activations is carried by A-Markers. The binding list of instances and their roles are held in P-Markers 1. The following is the algorithm used in DMTRANS PLUS parsing: Let Lex, Con, Elem, and Seq be a set of lexical nodes, conceptual nodes, elements of concept sequences, and concept sequences, respectively.</Paragraph>
    <Paragraph position="5"> Parse(S): For each word w in S, do: Activate(w). For all i and j: if Active(Ni) ∧ Ni ∈ Con ... where Ni and ej.Ni denote a node in the memory network indexed by i and the j-th element of a node Ni, respectively. 1 Marker-passing spreading activation is our choice over a connectionist network precisely because of this reason: variable binding (which cannot be easily handled in a connectionist network) can be trivially attained through the structure (information) passing of A-Markers.</Paragraph>
    <Paragraph position="6"> Active(N) is true iff a node or an element of a node gets an A-Marker.</Paragraph>
    <Paragraph position="7"> Activate(N) sends A-Markers to nodes and elements given in the argument.</Paragraph>
    <Paragraph position="8"> Predict(N) moves a P-Marker to the next element of the CSC.</Paragraph>
    <Paragraph position="9"> Predicted(N) is true iff a node or an element of a node gets a P-Marker.</Paragraph>
    <Paragraph position="10"> Pmark(N) puts a P-Marker on a node or an element given in the argument.</Paragraph>
    <Paragraph position="11"> Last(N) is true iff an element is the last element of the concept sequence.</Paragraph>
    <Paragraph position="12"> Accept(N) creates an instance under N with links which connect the instance to other instances.</Paragraph>
    <Paragraph position="13"> isa(N) returns a list of nodes and elements which are connected to the node in the argument by abstraction links.</Paragraph>
    <Paragraph position="14"> isainv(N) returns a list of nodes and elements which are daughters of a node N.</Paragraph>
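A minimal runnable sketch of how several of these primitives (Activate, Predict, Pmark, Last) could interact is given below. The paper defines the operations only abstractly, so the classes, the marker representation, and the cost bookkeeping are all our assumptions:

```python
class Node:
    """A CC node with is-a parents and an A-Marker slot."""
    def __init__(self, name, isa=None):
        self.name = name
        self.isa = isa or []    # parents in the abstraction hierarchy
        self.a_marker = None

class Sequence:
    """A CSC: ordered elements with a single P-Marker position."""
    def __init__(self, name, elements):
        self.name = name
        self.elements = elements
        self.p_pos = 0          # Pmark: the P-Marker starts on the first element

def activate(node, cost=0):
    """Activate: send an A-Marker to the node and upward along is-a links."""
    node.a_marker = {"cost": cost}
    for parent in node.isa:
        activate(parent, cost)

def predict(seq, node):
    """Predict: if the P-Marked element was activated, advance the P-Marker."""
    if seq.elements[seq.p_pos] is node:
        seq.p_pos += 1
        return True
    return False

def accepted(seq):
    """Last: true once every element of the sequence has been recognized."""
    return seq.p_pos == len(seq.elements)

# Recognize the concept sequence (agent, shoot, object) element by element.
agent, shoot, obj = Node("agent"), Node("shoot"), Node("object")
csc1 = Sequence("CSC1", [agent, shoot, obj])
for concept in (agent, shoot, obj):
    activate(concept)
    predict(csc1, concept)
```

After the loop, the P-Marker has moved past the last element, which is the condition under which Accept would create a CSI in the full algorithm.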
    <Paragraph position="15"> Some explanation will help in understanding this algorithm:  1. Prediction.</Paragraph>
    <Paragraph position="16"> Initially all the first elements of concept sequences (CSC - Concept Sequence Class) are predicted by putting P-Markers on them.</Paragraph>
    <Paragraph position="17"> 2. Lexical Access.</Paragraph>
    <Paragraph position="18"> A lexical node is activated by the input word. 3. Concept Activation.</Paragraph>
    <Paragraph position="19"> An A-Marker is created and sent to the corresponding CC (Concept Class) nodes. A cost is added to the A-Marker if the CC is not C-Marked (i.e., a C-Marker is not placed on it).</Paragraph>
    <Paragraph position="20"> 4. Discourse Entity Identification. A CI (Concept Instance) under the CC is searched for.</Paragraph>
    <Paragraph position="21"> If the CI exists, an A-Marker is propagated to higher CC nodes.</Paragraph>
    <Paragraph position="22"> Else, a CI node is created under the CC, and an A-Marker is propagated to higher CC nodes.</Paragraph>
    <Paragraph position="23"> 5. Activation Propagation.</Paragraph>
    <Paragraph position="24"> An A-Marker is propagated upward in the abstraction hierarchy.</Paragraph>
    <Paragraph position="25"> 6. Sequential Prediction.</Paragraph>
    <Paragraph position="26"> When an A-Marker reaches any P-Marked node (i.e. part of CSC), the P-Marker on the node is sent to the next element of the concept sequence.</Paragraph>
    <Paragraph position="27"> 7. Contextual Priming.  When an A-Marker reaches any Contextual Root node, C-Markers are put on the contextual children nodes designated by the root node.</Paragraph>
    <Paragraph position="28"> 8. Conceptual Relation Instantiation.</Paragraph>
    <Paragraph position="29"> When the last element of a concept sequence receives an A-Marker, constraints (world and discourse knowledge) are checked.</Paragraph>
    <Paragraph position="30"> A CSI is created under the CSC with packaging links to each CI. This process is called concept refinement. See \[19\].</Paragraph>
    <Paragraph position="31"> The memory network is modified by performing inferences stored in the root CSC which had the accepted CSC attached to it.</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
9. Activation Propagation
</SectionTitle>
      <Paragraph position="0"> An A-Marker is propagated from the CSC to higher nodes.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Memory Network Modification
</SectionTitle>
      <Paragraph position="0"> Several different incidents trigger the modification of the memory network during parsing: * An individual concept is instantiated (i.e., an instance is created) under a CC when the CC receives an A-Marker and a CI (an instance created by preceding utterances) does not exist. This instantiation creates a specific discourse entity which may be used as an existing instance in subsequent recognition.</Paragraph>
      <Paragraph position="1"> * A concept sequence instance is created under the accepted CSC. In other words, if a whole concept sequence is accepted, we create an instance of the sequence, instantiating it with the specific CIs that were created by (or identified with) the specific lexical inputs. This newly created instance is linked to the accepted CSC with an instance-relation link and to the instances of the elements of the concept sequence by links labelled with their roles given in the CSC.</Paragraph>
      <Paragraph position="2"> * Links are created or removed in the CSI creation phase as a result of invoking inferences based on the knowledge attached to CSCs. For example, when the parser accepts the sentence I went to the UMIST, an instance of I is created under the CC representing I. Next, a CSI is created under PTRANS. Since PTRANS entails that the agent is at the location, a location link must be created between the discourse entities I and UMIST. Such revision of the memory network is conducted by invoking knowledge attached to each CSC.</Paragraph>
      <Paragraph position="3"> Since modification of any part of the memory network requires some workload, certain costs are added to analyses which require such modifications.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Cost-based Approach to the
Ambiguity Resolution
</SectionTitle>
    <Paragraph position="0"> Ambiguity resolution in DMTRANS PLUS is based on the calculation of the cost of each parse. Costs are attached to each parse during the parse process.</Paragraph>
      <Paragraph position="1"> Costs are attached when:  1. a CC with insufficient priming is activated, 2. a CI is created under a CC, and 3. constraints imposed on a CSC are not satisfied initially and links are created or removed to satisfy the constraint.</Paragraph>
      <Paragraph position="2"> Costs are attached to A-Markers when these operations are taken because these operations modify the memory network and, hence, require work. Cost information is then carried upward by A-Markers. The parse with the least cost will be chosen. The cost of each hypothesis is calculated by:</Paragraph>
      <Paragraph position="4"> C_i = \sum_j c_{ij} + \sum_k constraint_{ik} + bias_i, where C_i is the cost of the i-th hypothesis, c_{ij} is the cost carried by an A-Marker activating the j-th element of the CSC for the i-th hypothesis, constraint_{ik} is the cost of assuming the k-th constraint of the i-th hypothesis, and bias_i represents the lexical preference of the CSC for the i-th hypothesis. This cost is assigned to each CSC and the value of C_i is passed up by A-Markers if higher-level processing is performed. At higher levels, each c_{ij} may itself be the sum of costs at lower levels.</Paragraph>
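Reading the cost of a hypothesis as a plain linear sum of element costs, constraint costs, and a lexical bias, the computation can be sketched as follows (the function and variable names are ours, and the numeric values are placeholders):

```python
def hypothesis_cost(element_costs, constraint_costs, bias):
    """C_i = sum of c_ij over j, plus sum of constraint_ik over k, plus bias_i."""
    return sum(element_costs) + sum(constraint_costs) + bias

# Hypothesis 1: every element supported by memory, no forced constraints.
c1 = hypothesis_cost([0, 0, 0], [0], bias=0)
# Hypothesis 2: one constraint had to be forced at a cost of 10.
c2 = hypothesis_cost([0, 0, 0, 0, 0], [10], bias=0)
# The least-cost parse is chosen.
preferred = min((c1, "hypothesis 1"), (c2, "hypothesis 2"))
```

Note that no threshold or squashing is applied anywhere, in keeping with the text's characterization of the equation as a simple linear function.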
      <Paragraph position="5"> It should be noted that this equation is very similar to the activation function of most neural networks, except that our equation is a simple linear equation which does not have a threshold value. In fact, if we only assume the addition of cost by priming at the lexical level, our mechanism of ambiguity resolution would behave much like connectionist models without inhibition among syntactic nodes and excitation links from syntax to lexicon 2. However, the major difference between our approach and the connectionist approach is the addition of costs for instance creation and constraint satisfaction. We will show that these factors are especially important in resolving structural ambiguities.</Paragraph>
      <Paragraph position="6"> The following subsections describe three mechanisms that play a role in ambiguity resolution. However, we do not claim that these are the only mechanisms involved in the examples which follow 3.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 Contextual Priming
</SectionTitle>
      <Paragraph position="0"> In our system, some CC nodes designated as Contextual Root Nodes have a list of thematically relevant nodes. C-Markers are sent to these nodes as soon as a Contextual Root Node is activated. Thus each sentence and/or each word might influence the interpretation of following sentences or words. When a node with C-Marker is activated by receiving an A-Marker, the activation will be propagated with no cost. Thus, a parse using such nodes would have no cost. However, when a node without a C-Marker is activated, a small cost is attached to the interpretation using that node.</Paragraph>
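The priming behavior just described amounts to a simple cost rule: activation through a C-Marked node is free, and activation through an unprimed node incurs a small cost. A sketch, with an arbitrary placeholder penalty:

```python
def activation_cost(node, c_marked, penalty=1):
    """Activating a C-Marked node is free; an unprimed node costs a little."""
    return 0 if node in c_marked else penalty

# After a Contextual Root Node fires, its thematically relevant children
# are C-Marked, so subsequent activation through them carries no cost.
c_marked = {"*have-a-problem"}
cost_primed = activation_cost("*have-a-problem", c_marked)
cost_unprimed = activation_cost("*have-a-symptom", c_marked)
```

These per-node costs are the c_ij terms that A-Markers carry upward into the hypothesis-cost sum.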
      <Paragraph position="1"> In \[19\] the discussion of C-Marker propagation concentrated on the resolution of word-level ambiguities.</Paragraph>
      <Paragraph position="2"> However, C-Markers are also propagated to conceptual class nodes, which can represent word-level, phrasal, or sentential knowledge. Therefore, C-Markers can be used for resolving phrasal-level and sentential-level ambiguities such as structural ambiguities. For example, atama ga itai literally means '(my) head hurts.' This is normally identified with the concept sequences associated with the *have-a-symptom concept class node, but if the preceding sentence is asita yakuinkai da ('There is a board of directors meeting tomorrow'), the *have-a-problem concept class node must be activated instead. Contextual priming attained by C-Markers can also help resolve structural ambiguity in sentences like did you read about the problem with the students? The cost of each parse will be determined by whether reading with students or problems with students is contextually activated. (Of course, many other factors are involved in resolving this type of ambiguity.) Our model can incorporate either C-Markers or a connectionist-type competitive activation and inhibition scheme for priming. In the current implementation, we use C-Markers for priming simply because C-Marker propagation is computationally less expensive than connectionist-type competitive activation and inhibition schemes 4. Although connectionist approaches can resolve certain types of lexical ambiguity, they are computationally expensive unless we have massively parallel computers. C-Markers are a reasonable compromise because they are sent to semantically relevant concept nodes to attain contextual priming without computationally expensive competitive activation and inhibition methods.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.2 Reference to the Discourse Entity
</SectionTitle>
      <Paragraph position="0"> When a lexical node activates any CC node, a CI node under the CC node is searched for (\[19\], \[21\]). This activity models reference to an already established discourse entity \[27\] in the hearer's mind. If such a CI node exists, the reference succeeds and this parse will be attached with no cost. However, if no such instance is found, reference failure results. If this happens, an instantiation activity is performed, creating a new instance with certain costs. As a result, a parse using a newly created instance node will be attached with some cost.</Paragraph>
      <Paragraph position="1"> For example, if a preceding discourse contained a reference to a thesis, a CI node such as THESIS005 would have been created. Now if a new input sentence contains the word paper, CC nodes for THESIS and SHEET-OF-PAPER are activated. This causes a search for CI nodes under both CC nodes. Since the CI node THESIS005 will be found, the reading where paper means thesis will not acquire a cost. However, assuming that there is not a CI node corresponding to a sheet of paper, we will need to create a new one for this reading, thus incurring a cost. 3 This does not mean that our model cannot incorporate a connectionist model. The choice of C-Markers over the connectionist approach is mostly due to computational cost. As we will describe later, our model is capable of incorporating a connectionist approach.</Paragraph>
      <Paragraph position="2"> We can also use reference to discourse entities to resolve structural ambiguities. In the sentence We sent her papers, if the preceding discourse mentioned Yoshiko's papers, a specific CI node such as YOSHIKO-PAPER003 representing Yoshiko's papers would have been created. Therefore, during the processing of We sent her papers, the reading which means we sent papers to her needs to create a CI node representing papers that we sent, incurring some cost for creating that instance node. On the other hand, the reading which means we sent Yoshiko's papers does not need to create an instance (because it was already created), so it is costless. Also, the reading that uses paper as a sheet of paper is costly, as we have demonstrated above.</Paragraph>
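The thesis/sheet-of-paper example above can be sketched as a lookup-or-instantiate step. The instantiation cost, the memory representation, and the CI naming scheme below are all illustrative assumptions:

```python
INSTANCE_COST = 5   # arbitrary placeholder for the instantiation workload

def refer(memory, cc_name):
    """Reuse an existing CI under the CC at no cost, or create one at a cost."""
    if memory.get(cc_name):
        return memory[cc_name][0], 0
    new_ci = cc_name + "001"          # hypothetical CI naming scheme
    memory.setdefault(cc_name, []).append(new_ci)
    return new_ci, INSTANCE_COST

# A preceding discourse created THESIS005; 'paper' activates both CCs.
memory = {"THESIS": ["THESIS005"]}
thesis_reading = refer(memory, "THESIS")          # reference succeeds
sheet_reading = refer(memory, "SHEET-OF-PAPER")   # reference fails, CI created
```

The zero versus non-zero cost returned here is what ultimately makes the thesis reading the preferred parse in the cost comparison.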
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.3 Constraints
</SectionTitle>
      <Paragraph position="0"> Constraints are attached to each CSC. These constraints play important roles during disambiguation.</Paragraph>
      <Paragraph position="1"> Constraints define relations between instances when sentences or sentence fragments are accepted. When a constraint is satisfied, the parse is regarded as plausible. On the other hand, the parse is less plausible when the constraint is unsatisfied. Whereas traditional parsers simply reject a parse which does not satisfy a given constraint, DMTRANS PLUS builds or removes links between nodes, forcing them to satisfy constraints.</Paragraph>
      <Paragraph position="2"> A parse with such forced constraints will record an increased cost and will be less preferred than parses without attached costs.</Paragraph>
      <Paragraph position="3"> The following example illustrates how this scheme resolves an ambiguity. As an initial setting, we assume that the memory network has instances of 'man' (MAN1) and 'hand-gun' (HAND-GUN1) connected with a POSSES relation (i.e., link). The input utterance is: &quot;Mary picked up an Uzzi. Mary shot the man with the hand-gun.&quot; The second sentence is ambiguous in isolation, and it is also ambiguous if it is not known that an Uzzi is a machine gun. However, when it is preceded by the first sentence and if the hearer knows that an Uzzi is a machine gun, the ambiguity is drastically reduced. DMTRANS PLUS hypothesizes and models this disambiguation activity, utilizing knowledge about the world through the cost-recording mechanism described above.</Paragraph>
      <Paragraph position="4"> During the processing of the first sentence, DMTRANS PLUS creates instances of 'Mary' and 'Uzzi' and records them as active instances in memory (i.e., MARY1 and UZZI1 are created). In addition, a link between MARY1 and UZZI1 is created with the POSSES relation label. This link creation is invoked by triggering side-effects (i.e., inferences) stored in the CSC representing the action of 'MARY1 picking up the UZZI1'. We omit the details of marker passing (for A-, P-, and C-Markers) since it is described in detail elsewhere (particularly in \[19\]).</Paragraph>
      <Paragraph position="5"> When the second sentence comes in, an instance MARY1 already exists and, therefore, no cost is charged for parsing 'Mary' 5. However, we now have three relevant concept sequences (CSCs 6): CSC1: (&lt;agent&gt; &lt;shoot&gt; &lt;object&gt;) CSC2: (&lt;agent&gt; &lt;shoot&gt; &lt;object&gt; &lt;with&gt; &lt;instrument&gt;) CSC3: (&lt;person&gt; &lt;with&gt; &lt;instrument&gt;) These sequences are activated when concepts in the sequences are activated in order from below in the abstraction hierarchy. When 'man' comes in, recognition of CSC3: (&lt;person&gt; &lt;with&gt; &lt;instrument&gt;) starts. When the whole sentence is received, we have two top-level CSCs (i.e., CSC1 and CSC2) accepted (all elements of the sequences recognized). The acceptance of CSC1 is performed through first accepting CSC3 and then substituting CSC3 for &lt;object&gt;.</Paragraph>
      <Paragraph position="6"> When the concept sequences are satisfied, their constraints are tested. A constraint for CSC2 is (POSSES &lt;agent&gt; &lt;instrument&gt;) and a constraint for CSC3 (and CSC1, which uses CSC3) is (POSSES &lt;person&gt; &lt;instrument&gt;). Since 'MARY1 POSSES HAND-GUN1' now has to be satisfied and there is no instance of this in memory, we must create a POSSES link between MARY1 and HAND-GUN1. A certain cost, say 10, is associated with the creation of this link. On the other hand, MAN1 POSSES HAND-GUN1 is known in memory because of an earlier sentence. As a result, CSC3 is instantiated with no cost and an A-Marker from CSC3 is propagated upward to CSC1 with no cost. Thus, the cost of instantiating CSC1 is 0 and the cost of instantiating CSC2 is 10. In this way, the interpretation with CSC1 is favored by our system.</Paragraph>
      <Paragraph position="7"> 5 Of course, 'Mary' can be 'She'. The method for handling this type of pronoun reference was already reported in \[19\] and we do not discuss it here.</Paragraph>
      <Paragraph position="8"> 6 As we can see from this example of CSCs, a concept sequence can normally be regarded as a subcategorization list of a VP head. However, concept sequences are not restricted to such lists and are actually often at higher levels of abstraction, representing MOP-like sequences.</Paragraph>
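The constraint-cost comparison in the Uzzi example can be sketched directly. The link-creation cost of 10 follows the text; the triple representation of POSSES facts is our assumption:

```python
LINK_CREATION_COST = 10   # cost of forcing an unsupported POSSES link

def constraint_cost(facts, triple):
    """A constraint already in memory is free; otherwise a link must be created."""
    return 0 if triple in facts else LINK_CREATION_COST

facts = {("POSSES", "MARY1", "UZZI1"),       # from the first sentence
         ("POSSES", "MAN1", "HAND-GUN1")}    # from the initial setting

# CSC1/CSC3 require (POSSES person instrument); CSC2 requires
# (POSSES agent instrument), which must be forced into memory.
cost_csc1 = constraint_cost(facts, ("POSSES", "MAN1", "HAND-GUN1"))
cost_csc2 = constraint_cost(facts, ("POSSES", "MARY1", "HAND-GUN1"))
```

Since cost_csc1 is 0 and cost_csc2 is 10, the least-cost selection favors the CSC1 reading, as the text describes.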
    </Section>
  </Section>
</Paper>