<?xml version="1.0" standalone="yes"?>
<Paper uid="T75-2001">
  <Title>AUGMENTED PHRASE STRUCTURE GRAMMARS</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
SERVIC ('ACTIVITY',E,ES,ING,ED,TRANS,PS='VERB',XYZ=3)
</SectionTitle>
    <Paragraph position="0"> It is convenient to picture a record as a box enclosing a column of relation and property names on the left and a column of corresponding values on the right.</Paragraph>
    <Paragraph position="1"> Indicators which are present in the record (i.e. have a non-zero value) are listed at the bottom of the box. The named record &quot;SERVIC&quot; defined above could be drawn as:
      NAME  &quot;SERVIC&quot;
      SUP   'ACTIVITY'
      PS    'VERB'
      XYZ   3
      E,ES,ING,ED,TRANS
Double quotes enclose a character string; single quotes enclose the name of a named record. The values of the SUPerset and PS (part-of-speech) attributes are really pointers to the records 'ACTIVITY' and 'VERB' and could be drawn as directed lines to those other records if they were included in the diagram.</Paragraph>
    <Paragraph position="2"> The named record &quot;SERVIC&quot; given here could be considered to be a dictionary entry stating that the VERB stem SERVIC can take endings E, ES, ING and ED, the VERB SERVIC is TRANSitive, and the concept SERVIC is an ACTIVITY. (When a named record name appears without the explicit mention of an attribute name, the SUPerset attribute is assumed.) The XYZ attribute was included just to illustrate a numerically-valued property.</Paragraph>
    <Paragraph position="3"> Of course, the true meaning of any of this information depends completely upon the way it is used by the APSG rules.</Paragraph>
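    <Paragraph> As an informal illustration (not part of the original system), the named record defined above could be modeled in Python roughly as follows. This is a minimal sketch: the dictionary layout, the INDICATORS key, and the helpers `make_record` and `has_indicator` are assumptions made here, not names from the APSG rule language.

```python
# Hypothetical model of APSG named records; layout and names are assumptions.
NAMED_RECORDS = {}

def make_record(name, sup=None, indicators=(), **attributes):
    """Build a named record: attribute/value pairs plus a set of indicators."""
    record = {"NAME": name, "INDICATORS": set(indicators)}
    if sup is not None:
        record["SUP"] = sup  # name of another named record, i.e. a pointer
    record.update(attributes)
    NAMED_RECORDS[name] = record
    return record

def has_indicator(record, indicator):
    """An indicator counts as present when it has a non-zero value."""
    return indicator in record["INDICATORS"]

make_record("ACTIVITY")
make_record("VERB")
# SERVIC ('ACTIVITY',E,ES,ING,ED,TRANS,PS='VERB',XYZ=3)
servic = make_record("SERVIC", "ACTIVITY", ["E", "ES", "ING", "ED", "TRANS"],
                     PS="VERB", XYZ=3)
```

Storing SUP and PS as names that are looked up in `NAMED_RECORDS` mirrors the remark that these values are really pointers to other records.
    </Paragraph>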
    <Paragraph position="4"> During decoding and encoding, records called &quot;segment records&quot; are employed to hold information about segments of text.</Paragraph>
    <Paragraph position="5"> For example, the segment &quot;are servicing&quot; could be described by the record [ SUP 'SERVIC'; PRES,P3,PLUR,PROG ], which could be interpreted as saying that &quot;are servicing&quot; is the present, third person, plural, progressive form of &quot;service&quot;. Similarly, the sentence &quot;The big men are servicing a truck.&quot; could be described by:</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
SUP 'SERVIC'
AGENT --&gt; [ SUP 'MAN'; SIZE &quot;BIG&quot;; DEF,PLUR ]
GOAL  --&gt; [ SUP 'TRUCK'; INDEF,SING ]
PRES,PROG
</SectionTitle>
    <Paragraph position="0"> where the indicators DEF and INDEF mean definite and indefinite, respectively. The sentence &quot;A truck is being serviced by the big men.&quot; could be described by exactly the same record structure but with the addition of a PASSIVE indicator in the record on the left.</Paragraph>
    <Paragraph position="1"> During a dialogue some records that begin as segment records may be kept to become part of longer term memory to represent the entities (in the broadest sense of the term) that are being discussed. Segment records then might have pointers into this longer term memory to show referents. So, for example, the sentence &quot;They are servicing a truck.&quot; might be described by the same record structure shown above if the referent of &quot;they&quot; was known to be a certain group of men who are big.</Paragraph>
    <Paragraph position="2"> III. ANALYSIS OF TEXT (DECODING) Decoding is the process by which record structures of the sort just shown are constructed from strings of text. The manner in which these records are to be built is specified by APSG decoding rules.</Paragraph>
    <Paragraph position="3"> A decoding rule consists of a list of one or more &quot;segment types&quot; (meta-symbols) on the left of an arrow to indicate which types of contiguous segments must be present in order for a segment of the type on the right of the arrow to be formed. Conditions which must be satisfied in order for the rule to be applicable may be stated in parentheses on the left side of the rule, and structure-building operations to be performed when a new segment record is created are stated in parentheses on the right side.</Paragraph>
    <Paragraph position="4"> For illustrative purposes, some of the rules which would be required to produce the segment records shown in the previous section will be discussed here. Complete examples are given in Reference 3.</Paragraph>
    <Paragraph position="5"> If the string &quot;servicing&quot; appeared in the input, and the substring &quot;servic&quot; were described by the VERBSTEM segment record</Paragraph>
    <Paragraph position="7"> would form the VERB segment record</Paragraph>
    <Paragraph position="9"> to describe the string &quot;servicing&quot;, identifying it as the present participle form of service. This rule says that if a segment of the string being decoded is described as a VERBSTEM, and the associated segment record has a SUP attribute which points to a named record which has an ING indicator (as the named record for &quot;SERVIC&quot; defined in the previous section would), and this segment is followed immediately by the characters &quot;i&quot;, &quot;n&quot; and &quot;g&quot;, then create a VERB segment record with the same SUP as the VERBSTEM and with a PRESPART indicator, to describe the entire segment (&quot;servicing&quot; in this case).</Paragraph>
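    <Paragraph> The behavior of this rule, as described in prose above, can be sketched in Python. This is only an illustration under stated assumptions: the TYPE and INDICATORS keys, the `NAMED_RECORDS` table, and the function `apply_ing_rule` are hypothetical and do not reproduce the APSG rule notation.

```python
# Hypothetical sketch of the rule described above; the data layout is assumed.
NAMED_RECORDS = {
    "SERVIC": {"SUP": "ACTIVITY",
               "INDICATORS": {"E", "ES", "ING", "ED", "TRANS"}},
}

def apply_ing_rule(verbstem, following_text):
    """Return a VERB segment record, or None if the rule does not apply."""
    if verbstem["TYPE"] != "VERBSTEM":
        return None
    concept = NAMED_RECORDS.get(verbstem.get("SUP"))
    # Condition: the named record pointed to by SUP must have an ING indicator.
    if concept is None or "ING" not in concept["INDICATORS"]:
        return None
    # The stem must be followed immediately by the characters i, n, g.
    if not following_text.startswith("ing"):
        return None
    # Creation: same SUP as the VERBSTEM, plus a PRESPART indicator.
    return {"TYPE": "VERB", "SUP": verbstem["SUP"], "INDICATORS": {"PRESPART"}}

stem = {"TYPE": "VERBSTEM", "SUP": "SERVIC", "INDICATORS": set()}
verb = apply_ing_rule(stem, "ing")
```
    </Paragraph>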
    <Paragraph position="10"> Then the rule</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
VERB --&gt; VERBPH(~VERB)
</SectionTitle>
    <Paragraph position="0"> would create a VERBPHrase segment record which is a copy (4) of the VERB segment record just shown.</Paragraph>
    <Paragraph position="1"> If the string &quot;are&quot; input were described  would produce the new VERBPH segment record [ SUP 'SERVIC'; PRES,P3,PLUR,PROG ] from the two just shown, to describe the string &quot;are servicing&quot;. This rule says that if a segment of the string being decoded is described as a VERB with a SUP of 'BE', and it is followed by a segment described as a VERBPH with a PRESPART indicator, then create a new VERBPH segment record which is a copy (automatically, because the segment type is the same) of the VERBPH segment record referred to on the left of the rule, but which has a PROGressive indicator and the FORM information from the VERB. FORM would have previously been defined as the name of a group of indicators (i.e. those having to do with tense, person and number). Similar rules can be used to recognize passives, perfects and modal constructions.</Paragraph>
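    <Paragraph> A rough Python sketch of the progressive rule just described follows. The record layout and the function name `apply_progressive_rule` are assumptions. The membership of the FORM group is also assumed: PRESPART is included in it here only so that the result matches the record shown in the text, which no longer carries PRESPART.

```python
# Sketch only: FORM names a group of indicators. Its membership here,
# including PRESPART, is an assumption made for this illustration.
FORM = {"PRES", "PAST", "P1", "P2", "P3", "SING", "PLUR", "PRESPART"}

def apply_progressive_rule(verb, verbph):
    """VERB with SUP 'BE' followed by VERBPH(PRESPART) gives a new VERBPH."""
    if verb["TYPE"] != "VERB" or verb["SUP"] != "BE":
        return None
    if verbph["TYPE"] != "VERBPH" or "PRESPART" not in verbph["INDICATORS"]:
        return None
    # Same segment type on both sides, so start from a copy of the old VERBPH.
    new = dict(verbph)
    # PROG is set, and the FORM group is replaced by the FORM of the VERB.
    kept = verbph["INDICATORS"] - FORM
    new["INDICATORS"] = kept.union({"PROG"},
                                   verb["INDICATORS"].intersection(FORM))
    return new

are = {"TYPE": "VERB", "SUP": "BE", "INDICATORS": {"PRES", "P3", "PLUR"}}
servicing = {"TYPE": "VERBPH", "SUP": "SERVIC", "INDICATORS": {"PRESPART"}}
are_servicing = apply_progressive_rule(are, servicing)
```
    </Paragraph>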
    <Paragraph position="2"> Continuing with the example, if the string &quot;the big men&quot; were decoded to the  would produce the new VERBPH segment record (the one on the left in this diagram)
      SUP 'SERVIC'
      SUBJECT --&gt; [ SUP 'MAN'; SIZE &quot;BIG&quot;; DEF,PLUR ]
      PRES,PROG
from the previous VERBPH record, to describe the string &quot;the big men are servicing&quot;. It is important to realize that the record on the left in the above diagram is a segment record that &quot;covers&quot; the entire string and that the record shown on the right (which is the same one from the previous diagram) just serves as the value of its SUBJECT attribute. The rule above says that if a NOUNPH is followed by a VERBPH, and the NUMBer indicators of the VERBPH are the same as the NUMBer indicators of the NOUNPH, and the VERBPH does not already have a SUBJECT attribute, then create a new VERBPH segment record which is a copy of the old one, give it a SUBJECT attribute pointing to the NOUNPH record, and delete the NUMBer and PERson indicators.</Paragraph>
    <Paragraph position="4"> Considering the subject to be part of the verb phrase in this manner can simplify the handling of some constructions involving inverted word order.</Paragraph>
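    <Paragraph> The subject rule described above (number agreement, no existing SUBJECT, then attach the NOUNPH and drop the number and person indicators) might be sketched as follows. The group memberships for NUMB and PERS, the record layout, and the name `apply_subject_rule` are assumptions for illustration.

```python
# Assumed indicator groups; the source names NUMBer and PERson but does not
# list their members.
NUMB = {"SING", "PLUR"}
PERS = {"P1", "P2", "P3"}

def apply_subject_rule(nounph, verbph):
    """NOUNPH followed by an agreeing, subjectless VERBPH gives a new VERBPH."""
    same_number = (nounph["INDICATORS"].intersection(NUMB)
                   == verbph["INDICATORS"].intersection(NUMB))
    if not same_number or "SUBJECT" in verbph:
        return None
    new = dict(verbph)                      # copy of the old VERBPH record
    new["SUBJECT"] = nounph                 # pointer to the NOUNPH record
    new["INDICATORS"] = verbph["INDICATORS"] - NUMB - PERS
    return new

the_big_men = {"SUP": "MAN", "SIZE": "BIG", "INDICATORS": {"DEF", "PLUR"}}
are_servicing = {"SUP": "SERVIC",
                 "INDICATORS": {"PRES", "P3", "PLUR", "PROG"}}
clause = apply_subject_rule(the_big_men, are_servicing)
```

The new record refers to the NOUNPH record rather than copying it, in line with the remark that the record on the right just serves as the value of the SUBJECT attribute.
    </Paragraph>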
    <Paragraph position="5"> If the string being decoded were &quot;the big men are servicing a truck.&quot;, a rule similar to the last one shown above could be used to pick up the direct object. Then the rule . VERBPH(SUBJECT,OBJECT|¬TRANS*|PASSIVE) .</Paragraph>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
--&gt; SENT (~VERBPH)
</SectionTitle>
    <Paragraph position="0"> could be applied, which says if a VERBPH extending between two periods has a SUBJECT attribute and also either has an OBJECT attribute or does not need one because there is no TRANSitive indicator in the named record pointed to by the SUP (i.e. the verb is intransitive) or because there is a PASSIVE indicator, then call it a SENTence.</Paragraph>
    <Paragraph position="1"> To get the record structure describing this string into the form shown near the end of the previous section, one more rule would be needed:  This says that for a non-PASSIVE ACTION SENTence that still has a SUBJECT attribute, set the AGENT and GOAL attributes to the values of the SUBJECT and OBJECT attributes, respectively, and then delete the SUBJECT and OBJECT attributes from the record. The notation $'ACTION' is read &quot;in the set 'ACTION'&quot; and means that the named record &quot;ACTION&quot; must appear somewhere in the SUPerset chain of the current record. In the previous section the named record &quot;SERVIC&quot; was defined to have a SUP of 'ACTIVITY'. If the named record &quot;ACTIVITY&quot; were similarly defined to have a SUP of 'ACTION', the segment record under discussion here would satisfy the condition $'ACTION'.</Paragraph>
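    <Paragraph> The set-membership test can be illustrated by walking the SUPerset chain. In this sketch the table `NAMED_RECORDS` and the function `in_set` are hypothetical names, and the SUP of ACTIVITY is the supposition made in the text above, not an established definition.

```python
# Hypothetical named records; ACTIVITY having a SUP of ACTION is the
# supposition made in the text.
NAMED_RECORDS = {
    "SERVIC": {"SUP": "ACTIVITY"},
    "ACTIVITY": {"SUP": "ACTION"},
    "ACTION": {},
}

def in_set(record, set_name, named_records=NAMED_RECORDS):
    """True if set_name appears somewhere in the SUP chain of record."""
    seen = set()
    current = record.get("SUP")
    while current is not None and current not in seen:
        if current == set_name:
            return True
        seen.add(current)                      # guard against circular chains
        current = named_records.get(current, {}).get("SUP")
    return False

sentence = {"SUP": "SERVIC", "INDICATORS": {"PRES", "PROG"}}
satisfies_action = in_set(sentence, "ACTION")
```
    </Paragraph>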
    <Paragraph position="2"> From the above examples it can be seen that the condition specifications take the form of logical expressions involving the values of attributes. Each element in a condition specification is basically of the form value.relation.value, but this is not obvious because there are several notational shortcuts available in the rule language.</Paragraph>
    <Paragraph position="3"> For example, 'BE' is short for SUP.EQ.'BE', PRESPART is short for PRESPART.NE.0, and ¬SUBJECT is short for SUBJECT.EQ.0. The elements are combined by and's (commas) and or's (vertical bars).</Paragraph>
    <Paragraph position="4"> In most cases the attribute whose value is being tested is to be found in the segment record associated with the constituent, but that is not always the case. For example, ING* tests the value of the ING indicator in the named record pointed to by the SUP of the segment record, and could be written ING(SUP) or ING(SUP).NE.0. Another example is NUMB(NOUNPH) which was used to refer to the value of the NUMB indicators in the NOUNPH segment in one of the rules above.</Paragraph>
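    <Paragraph> To make the value.relation.value reading concrete, the expanded forms of the shortcuts can be evaluated with a small helper. This is a sketch under assumptions: absent attributes are treated as 0, present indicators as 1, and the names `value_of`, `check`, and the `via_sup` flag are invented for the illustration.

```python
import operator

RELATIONS = {"EQ": operator.eq, "NE": operator.ne}
NAMED_RECORDS = {"SERVIC": {"INDICATORS": {"ING", "TRANS"}}}

def value_of(record, attribute):
    """Present indicators count as 1; absent attributes count as 0."""
    if attribute in record.get("INDICATORS", ()):
        return 1
    return record.get(attribute, 0)

def check(record, attribute, relation, value, via_sup=False):
    """Evaluate one element of the form attribute.relation.value."""
    if via_sup:
        # As with ING* or ING(SUP): test the named record pointed to by SUP.
        record = NAMED_RECORDS.get(record.get("SUP"), {})
    return RELATIONS[relation](value_of(record, attribute), value)

verbph = {"SUP": "SERVIC", "INDICATORS": {"PRESPART"}}
# PRESPART, ¬SUBJECT  (elements joined by commas are combined with "and")
both = check(verbph, "PRESPART", "NE", 0) and check(verbph, "SUBJECT", "EQ", 0)
# 'BE' | ING*  (elements joined by vertical bars are combined with "or")
either = (check(verbph, "SUP", "EQ", "BE")
          or check(verbph, "ING", "NE", 0, via_sup=True))
```
    </Paragraph>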
    <Paragraph position="5"> From the examples it can also be seen that creation specifications take the form of short procedures consisting of statements for setting the values of attributes. Each element in a creation specification is basically of the form attribute=value (where &quot;=&quot; means replacement), but again this is not obvious because of the notational shortcuts used. For example, SUP(VERBSTEM) is short for SUP=SUP(VERBSTEM), PRESPART is short for PRESPART=1 (note that this form has a different meaning when it is used in a condition specification), and -SUBJECT is short for SUBJECT=0.</Paragraph>
    <Paragraph position="6"> In all of the examples here, the attribute whose value is set would be in the segment record being built, but that need not always be the case. If, for example, there were some reason to want to give the AGENT record of an action an ABC attribute equal to one more than the XYZ attribute of the concept record associated with that action (i.e. the named record pointed to by its SUP), the following could be included in the last rule shown: ABC(AGENT)=XYZ(SUP)+1 which can be read as &quot;set the ABC attribute of the AGENT of this record to the value of the XYZ attribute of the SUP of this record plus 1.&quot; There is no limit to the nesting of attribute names used in this manner.</Paragraph>
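    <Paragraph> The nested assignment ABC(AGENT)=XYZ(SUP)+1 can be mimicked by following a path of attribute names before reading or writing a value. The helper `resolve`, the path-as-list convention, and the sample records are assumptions for this sketch; only the attribute names and the XYZ value of 3 come from the text.

```python
NAMED_RECORDS = {"SERVIC": {"XYZ": 3}}

def resolve(record, path):
    """Follow a list of attribute names, e.g. ['SUP'] or ['AGENT']."""
    for attribute in path:
        target = record[attribute]
        # Assumption: a string value is the name of a named record.
        record = NAMED_RECORDS[target] if isinstance(target, str) else target
    return record

sentence = {"SUP": "SERVIC",
            "AGENT": {"SUP": "MAN"},
            "GOAL": {"SUP": "TRUCK"}}

# ABC(AGENT) = XYZ(SUP) + 1
resolve(sentence, ["AGENT"])["ABC"] = resolve(sentence, ["SUP"])["XYZ"] + 1
```

Because `resolve` accepts a path of any length, deeper nesting is handled by the same loop.
    </Paragraph>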
    <Paragraph position="7"> Although in the example rules given here the conditions are primarily syntactic, semantic constraints can be stated in exactly the same manner. Much of the record building shown here can be considered semantic (and somewhat case oriented). The important point, however, is that the kind of condition testing and structure building done is at the discretion of the person who writes the rules. Complete specifications for the APSG rule language are given in Reference 3.</Paragraph>
    <Paragraph position="8"> The decoding algorithm used with APSG is basically that of a bottom-up, left-to-right, parallel-processing, syntax-directed compiler. An important and novel feature of this algorithm is something called a &quot;rule instance record&quot;, which primarily maintains information about the potential applicability of a rule. A rule instance record is initially created for a rule whenever a segment which can be the first constituent of that rule becomes available. (A terminal segment becomes available by being obtained from the input stream, and a non-terminal segment becomes available whenever a rule is applied.) Then the rule instance record &quot;waits&quot; for a segment which can be the next constituent of the associated rule to become available.</Paragraph>
    <Paragraph position="9"> When such a segment becomes available, the rule instance record is &quot;extended&quot;. When a rule instance record becomes complete (i.e. all of its constituents are available), the associated rule is applied (i.e. the segment record specified on the right is built and made available).</Paragraph>
    <Paragraph position="10"> There may be many rule instance records in existence for a particular rule at any point in time.</Paragraph>
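    <Paragraph> The life cycle of rule instance records (created, waiting, extended, complete) can be sketched with a much simplified recognizer. This is not the NLP implementation: conditions and structure building are omitted, segments are reduced to (type, start, end) triples, and the toy rules and the name `decode` are assumptions for the illustration.

```python
from collections import deque

# Toy decoding rules: constituent types on the left, resulting type on the
# right. These rules are invented for the example.
RULES = [
    (("ART", "NOUN"), "NOUNPH"),
    (("VERB",), "VERBPH"),
    (("NOUNPH", "VERBPH"), "SENT"),
]

def decode(terminals):
    """Return every (type, start, end) segment that becomes available."""
    segments = []
    waiting = {}  # position -> rule instance records waiting at that position
    for position, terminal in enumerate(terminals):   # left to right
        agenda = deque([(terminal, position, position + 1)])
        while agenda:
            segment = agenda.popleft()
            if segment in segments:
                continue
            segments.append(segment)                   # now available
            seg_type, start, end = segment
            # Instances waiting here, plus a fresh instance for every rule.
            candidates = list(waiting.get(start, []))
            candidates += [(rule, 0, start) for rule in RULES]
            for rule, matched, inst_start in candidates:
                constituents, result = rule
                if constituents[matched] != seg_type:
                    continue
                if matched + 1 == len(constituents):   # complete: apply rule
                    agenda.append((result, inst_start, end))
                else:                                  # extended: keep waiting
                    waiting.setdefault(end, []).append(
                        (rule, matched + 1, inst_start))
    return segments

found = decode(["ART", "NOUN", "VERB"])
```

Waiting instances are never removed when they are extended, so several segments can extend the same instance and earlier segments remain available alongside later ones.
    </Paragraph>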
    <Paragraph position="11"> Because of the parallel processing nature of the decoding algorithm, when a segment record is created to describe a portion of the input text it does not result in the destruction of other records describing the same portion or parts of it.</Paragraph>
    <Paragraph position="12"> Local ambiguities caused by multiple word senses, idioms and the like may result in more than one segment record being created to describe a particular portion of the text, but usually only one of them is able to combine with its neighbors to become part of the analysis for an entire sentence.</Paragraph>
    <Paragraph position="13"> IV. SYNTHESIS OF TEXT (ENCODING) Encoding is the process by which strings of text are produced from record structures of the sort already shown. The manner in which this processing is to be done is specified by APSG encoding rules.</Paragraph>
    <Paragraph position="14"> The right side of an encoding rule specifies what segments a segment of the type on the left side is to be expanded into.</Paragraph>
    <Paragraph position="15"> Conditions and structure-building actions are included in exactly the same manner as in decoding rules.</Paragraph>
    <Paragraph position="16"> The encoding algorithm begins with a single segment record and its associated type side-by-side on a stack. At each cycle through the algorithm, the top pair is removed from the stack and examined. If there is a rule that can be applied, it results in new pairs being put on the top of the stack, according to its right hand side. Otherwise, either the character string value of the NAME attribute of the SUP of the segment record (e.g. &quot;servic&quot;) is put out, or the name of the segment type itself (e.g. &quot;I&quot;) is put out. Eventually the stack becomes empty and the algorithm terminates, having produced the desired output string.</Paragraph>
    <Paragraph position="17"> For example, if at some point the following pair were to come off the top of the stack: VERBPH [ SUP 'SERVIC'; PRES,P3,PLUR,PROG ] the following encoding rule could be applied:</Paragraph>
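    <Paragraph> The stack discipline just described can be sketched in a few lines of Python. Everything beyond the cycle itself is an assumption: rules are modeled as functions returning a list of pairs or None, `verb_prespart_rule` is a hypothetical rule invented for the example, and the NAME value is taken in upper case from the earlier record definition.

```python
NAMED_RECORDS = {"SERVIC": {"NAME": "SERVIC"}}

def verb_prespart_rule(seg_type, record):
    """Hypothetical rule: a PRESPART VERB expands to VERBSTEM, I, N, G."""
    if seg_type == "VERB" and "PRESPART" in record.get("INDICATORS", ()):
        stem = {"SUP": record["SUP"], "INDICATORS": set()}
        return [("VERBSTEM", stem), ("I", {}), ("N", {}), ("G", {})]
    return None

def encode(start_type, start_record, rules, named_records):
    stack = [(start_type, start_record)]
    output = []
    while stack:
        seg_type, record = stack.pop()          # remove the top pair
        for rule in rules:
            expansion = rule(seg_type, record)
            if expansion is not None:
                # Push in reverse so the first new pair ends up on top.
                stack.extend(reversed(expansion))
                break
        else:
            # No rule applies: put out the NAME of the SUP, or else the
            # name of the segment type itself.
            sup = record.get("SUP")
            if sup in named_records:
                output.append(named_records[sup]["NAME"])
            else:
                output.append(seg_type)
    return "".join(output)

verb = {"SUP": "SERVIC", "INDICATORS": {"PRESPART"}}
text = encode("VERB", verb, [verb_prespart_rule], NAMED_RECORDS)
```
    </Paragraph>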
    <Paragraph position="19"> resulting in the following two pairs being put on the top of the stack:</Paragraph>
    <Paragraph position="21"> The above rule says that a VERBPH segment with a PROGressive indicator should be expanded into a VERB segment with a SUP of &quot;BE&quot; and the same FORM indicators as the VERBPH, followed by a new VERBPH segment which begins as a copy (automatically) of the old one and then is modified by deleting the PROG and FORM indicators and setting the PRESPART indicator.</Paragraph>
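    <Paragraph> The expansion described above might look as follows when written as a Python function that returns the two new pairs. The FORM membership, the record layout, and the name `expand_progressive` are assumptions for this sketch.

```python
# Assumed members of the FORM group (tense, person and number indicators).
FORM = {"PRES", "PAST", "P1", "P2", "P3", "SING", "PLUR"}

def expand_progressive(seg_type, record):
    """Expand a PROG VERBPH into a VERB (SUP 'BE') and a reduced VERBPH."""
    if seg_type != "VERBPH" or "PROG" not in record["INDICATORS"]:
        return None
    form = record["INDICATORS"].intersection(FORM)
    be = {"SUP": "BE", "INDICATORS": set(form)}   # same FORM as the VERBPH
    rest = dict(record)                           # copy of the old VERBPH
    # Delete the PROG and FORM indicators, then set PRESPART.
    rest["INDICATORS"] = (record["INDICATORS"] - form - {"PROG"}).union(
        {"PRESPART"})
    return [("VERB", be), ("VERBPH", rest)]

verbph = {"SUP": "SERVIC", "INDICATORS": {"PRES", "P3", "PLUR", "PROG"}}
pairs = expand_progressive("VERBPH", verbph)
```
    </Paragraph>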
    <Paragraph position="22"> When the VERB segment shown above comes off the stack, a rule would be applied to put the string &quot;are&quot; into the output. Then, after application of a couple more rules, the top of the stack would have the four</Paragraph>
    <Paragraph position="24"> which would result in the string &quot;servicing&quot; being produced after four cycles of the algorithm. Complete encoding examples may be found in Reference 3.</Paragraph>
    <Paragraph position="25"> V. IMPLEMENTATIONS AND APPLICATIONS As part of the original work on APSG a computer system called NLP (Natural Language Processor) was developed in 1968. This is a FORTRAN program for the IBM 360/370 computers which will accept as input named record definitions and decoding and encoding rules in exactly the form shown in this paper and then perform decoding and encoding of text [3]. A set of about 300 named record definitions and 800 rules was written for NLP to implement a specific system (called NLPQ) which is capable of carrying on a dialogue in English about a simple queuing problem and then producing a program in the GPSS simulation language to solve the problem [3,4].</Paragraph>
    <Paragraph position="26"> More recently a LISP implementation of NLP has been done, which accepts exactly the same input and does the same processing as the FORTRAN version. An interesting feature of this new version is that the compiler part, whose primary task is to translate condition and creation specifications (i.e. the information in parentheses) into lambda expressions, is itself written as a set of APSG rules.</Paragraph>
    <Paragraph position="27"> This work is part of a project at IBM Research to develop a system which will produce appropriate accounting application programs after carrying on a natural language dialogue with a businessman about his requirements. APSG is also being used in the development of a natural language query system for relational data bases and is being considered for use in other projects at IBM. None of this recent work has been documented yet.</Paragraph>
  </Section>
</Paper>