File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/87/e87-1036_metho.xml
Size: 19,998 bytes
Last Modified: 2025-10-06 14:12:01
<?xml version="1.0" standalone="yes"?> <Paper uid="E87-1036"> <Title>Building Expert Systems. Addison-Wesley Publishing</Title> <Section position="4" start_page="218" end_page="219" type="metho"> <SectionTitle> 2 GRAMMAR DESCRIPTION </SectionTitle> <Paragraph position="0"> In developing the grammar notation, the idiosyncrasies of the object language had to be observed. Finnish is a relatively free word order language, and syntactic-semantic knowledge is often expressed in the inflections of the words.</Paragraph> <Paragraph position="1"> Furthermore, the parser had to work as a practical tool for real production applications, so the process of parsing, rather than sentence generation, was taken as the starting point.</Paragraph> <Paragraph position="2"> A grammar description consists of four parts: (1) Type definitions: linguistic properties, features and categories.</Paragraph> <Paragraph position="3"> (2) A lexicon for associating features with words. (3) Binary dependency relations that may hold between regents and their dependents.</Paragraph> <Paragraph position="4"> (4) Functional schemata for defining the local environments of regents.</Paragraph> <Section position="1" start_page="218" end_page="218" type="sub_section"> <SectionTitle> 2.1 Type definitions </SectionTitle> <Paragraph position="0"> In the type definition part a grammar writer defines the types and their values used in a grammar description. This corresponds to the classification of linguistic properties. There are three kinds of types: CATEGORIES, FEATURES and PROPERTIES. In addition, the structure of the lexical entries is described in this part.</Paragraph> <Paragraph position="1"> A CATEGORY statement assigns names in hierarchies. For example, a category SyntCat for word classes could be defined as [example missing in the source] ... are defined. 
Values can be mutually exclusive: adding the complement value automatically destroys the old value.</Paragraph> <Paragraph position="2"> (FEATURE: SyntFeat &lt; (Locative) ;a name of a place (InfAttr) ;a noun that may have an infinitival attribute (CountMeasure) ;a countable measure noun ...</Paragraph> <Paragraph position="3"> PROPERTY values are like FEATURES except that they may have default values. For example: (PROPERTY: Polar &lt; ( Pos ) Neg &gt;) In this type definition polarity is positive by default.</Paragraph> </Section> <Section position="2" start_page="218" end_page="218" type="sub_section"> <SectionTitle> 2.2 Lexicon </SectionTitle> <Paragraph position="0"> The parser is preceded by a morphological analyzer (Jäppinen and Ylilammi 1986). The morphological analyzer produces for each word its morphological interpretation, including lexical information. The parser associates default features with words. Words that have idiosyncratic features, as all verbs do, are in the parser's lexicon. Some example entries of the parser's lexicon: &quot;Metri&quot; (meter) is a measure unit for common nouns. &quot;Helsinki&quot; is a proper noun and a name of a place. &quot;Ajatella&quot; (to think) is a transitive verb that may have infinitival or participle objects.</Paragraph> </Section> <Section position="3" start_page="218" end_page="219" type="sub_section"> <SectionTitle> 2.3 Binary dependency relations </SectionTitle> <Paragraph position="0"> The dependency parsing model aims at providing analyzed sentences with their dependency trees.</Paragraph> <Paragraph position="1"> According to this approach two elements of a sentence are directly related in a dependency relation if one depends on the other. The two elements are called the regent R (or head or governor) and the dependent D (or modifier). Binary relations define all permitted dependency relations that may exist between two words in Finnish sentences. 
For example, the binary relation Subject is the following boolean expression of the morphological and syntactic features of a finite verb and its nominal subject: [expression missing in the source] ... appear within angle brackets, which indicate a disjunction. Negation is expressed by &quot;-&quot;. (PersonP D) (PersonH D) indicates an agreement test. D must be a personal pronoun in the nominative case in this fragment.</Paragraph> <Paragraph position="2"> In our computational model the words of an input sentence appear as complexes of their morphological, syntactic and semantic properties. We call such a complex a constituent. If a binary relation holds between R and D, they are adjoined into a single constituent. This is what we mean by a functional description. It can be stated formally as a mapping f(R, D) -> R'</Paragraph> <Paragraph position="4"> where R' stands for the regent R after it has bound D. The function f is defined by the corresponding binary relation. This function abstraction should be distinguished from grammatical functions, even though in our grammar specification dependency relations also estimate grammatical functions.</Paragraph> </Section> <Section position="4" start_page="219" end_page="219" type="sub_section"> <SectionTitle> 2.4 Functional schemata </SectionTitle> <Paragraph position="0"> In functional schemata the local environment of a regent is described by dependency functions.</Paragraph> <Paragraph position="1"> Functional schemata can be seen as partial dependency tree descriptions. A simplified schema for verb phrases, when the regent is a transitive verb and is preceded by a negative auxiliary verb, could be defined as [schema definition missing in the source]</Paragraph> <Paragraph position="3"> This schema is able to recognize and build, for instance, the partial dependency trees shown in Figure 2.</Paragraph> <Paragraph position="4"> [Figure 2: partial dependency trees] There are three parts in the simplified schema NegTransVerb: WHEN, FUNCTIONS and MARK. 
The WHEN part describes features of the regent and its context. The FUNCTIONS part describes the dependents of the regent. The MULTIPLE clause indicates which dependents may occur multiple times. OBLIGATORY names the obligatory dependents. LEFT and RIGHT give the structure of the left and right context of the regent.</Paragraph> <Paragraph position="5"> Free word order is allowed by default because of the particular interpretation of the clauses LEFT and RIGHT. The definition only indicates which dependents exist in the named context, not their mutual order. All permutations are allowed. There is also a means of fixing word order: the ORDER clause indicates the mutual ordering of dependents. For example, a grammar writer may define for simple NPs (ORDER AdjAttr GenAttr R RelAttr) For this particular regent the most immediate left neighbour must be a genitive attribute. The next to that is an adjective attribute. The right neighbour is a relative clause.</Paragraph> <Paragraph position="6"> For long-distance dependencies the local decision strategy must be augmented. The binding of long-distance dependents has two phases: the recognition and the actual binding.</Paragraph> <Paragraph position="7"> In transformational grammar, long-distance dependencies are dealt with by assuming that in the deep structure the missing word is in the place it would occupy in the corresponding simple sentence. It is then moved or deleted by a transformation. The essential point is that a long-distance dependency is caused by an element which has moved from the local environment of one regent to the local environment of another regent. Hence a moved element must be recognized by the functional schema associated with the latter regent. 
The binding, then, is done later on by the schema of the former regent.</Paragraph> <Paragraph position="8"> In the recognition phase the long-distance dependents are recognized and bound &quot;away&quot; (captured), so that the current regent can govern its environment.</Paragraph> <Paragraph position="9"> After this capture the possible long-distance dependent remains waiting for binding by another schema.</Paragraph> <Paragraph position="10"> Capturing dependency functions are marked in the</Paragraph> </Section> </Section> <Section position="5" start_page="219" end_page="219" type="metho"> <SectionTitle> CAPTURE clause: (CAPTURE DistantMember) </SectionTitle> <Paragraph position="0"> The dependency function DistantMember is general enough to capture all possible long-distance dependents. For the actual binding of long-distance dependents, one must mark in the clause DISTANT the dependents which may be distant:</Paragraph> </Section> <Section position="6" start_page="219" end_page="220" type="metho"> <SectionTitle> (DISTANT Object) </SectionTitle> <Paragraph position="0"/> </Section> <Section position="7" start_page="220" end_page="223" type="metho"> <SectionTitle> 3 BLACKBOARD-BASED CONTROL FOR DEPENDENCY PARSING </SectionTitle> <Paragraph position="0"> Blackboard is a problem-solving model for expert systems (Hayes-Roth et al. 1983, Nii 1986). We have adopted that concept and utilized it for parsing purposes. Our blackboard model application is rather simple (Figure 3).</Paragraph> <Paragraph position="1"> There are three main components: a blackboard, a control part and knowledge sources. The blackboard contains the active environment description for a regent. According to the structural knowledge in that environment description, a corresponding partial parse tree is built on the blackboard. 
All other changes in the state of computation are also marked on the blackboard.</Paragraph> <Paragraph position="2"> Functional schemata and binary dependency relations are independent and separate knowledge sources; no communication happens between them. All data flow takes place through the blackboard. Which knowledge module to apply is determined dynamically, one step at a time, resulting in the incremental generation of partial solutions.</Paragraph> <Paragraph position="3"> In functional schemata a grammar writer has described local environments for regents by dependency functions. The schemata are compiled into an internal LISP form. At any time, only one of the schemata is chosen as the active environment description for the current regent. The activated schema is matched with the environment of the regent by binary relation tests. The binary relations respond to the changes on the blackboard according to the structural description in the active schema and the properties of the regent and dependent candidates. At the same time the partial dependency tree is built by the corresponding dependency function applications. When a schema has been fully matched and the active regent bound to its dependents through function links, the local partial dependency parse is complete.</Paragraph> <Paragraph position="4"> A scheduler for knowledge sources controls the whole system. It monitors the changes on the blackboard and decides what actions to take next. The scheduler employs a finite two-way automaton for recognition of the dependents.</Paragraph> <Section position="1" start_page="221" end_page="222" type="sub_section"> <SectionTitle> 3.1 The blackboard-based control strategy for dependency parsing </SectionTitle> <Paragraph position="0"> For the formal definition of the parsing process we describe the input sentence as a sequence (c(1),c(2),...,c(i-1), c(i), c(i+1),...,c(n)) of word constituents. 
With each constituent c(i) there is associated a set (s(i,1),...,s(i,m)) of functional schemata. The general parsing strategy for each word constituent c(i) can be modelled using a transition network. During parsing there are five possible computational states for each constituent c(i): S1 The initial state. One of the schemata associated with c(i) is activated.</Paragraph> <Paragraph position="1"> S2 Left dependents are searched for c(i).</Paragraph> <Paragraph position="2"> S3 c(i) is waiting for the building of the right context.</Paragraph> <Paragraph position="3"> S4 Right dependents are searched for c(i).</Paragraph> <Paragraph position="4"> S5 The final state. The schema associated with c(i) has been fully matched and becomes inactive; c(i) is the head of the completed (partial) dependency tree.</Paragraph> <Paragraph position="5"> At any time, only one schema is active, i.e. only one constituent c(i) may be in the state S2 or S4. Only a completed constituent (one in the state S5) is allowed to be bound as a dependent of a regent. There may be a number of constituents simultaneously in the state S3. We call these pending constituents (implemented as a stack PENDING).</Paragraph> <Paragraph position="6"> The parsing process starts with c(1) and proceeds to the right. Initially all constituents c(1),...,c(n) are in the state S1. A sentence is well formed if at the end of the parsing process the result is a single constituent that has reached the state S5 and contains all other constituents bound in its dependency tree. For each constituent c(i) the parsing process can be described by the following five steps. Parsing begins from step 1 with i,k = 1.</Paragraph> <Paragraph position="8"> 1) A schema candidate s(i,k) associated with c(i) is activated, i.e. the constituent c(i) takes the role of a regent. Following the environment description in s(i,k), dependents for c(i) are searched from its immediate neighbourhood. Go to step 2 with j = i-1.</Paragraph> <Paragraph position="9"> 2) The search of left dependents. There are two subcases: 2a) There are no left neighbours (j = 0), none is expected for c(i), or c(j) (j &lt; i) exists and is in the state S3. Go to step 3 with j = j+1.</Paragraph> <Paragraph position="10"> 2b) c(j) (j &lt; i) exists and is in the state S5. Binary relation tests are done. In the case of a success the mapping f(c(i), c(j)) -> c(i)' takes place. Repeat step 2 with j = j-1 and c(i) = c(i)'.</Paragraph>
<Paragraph position="11"> 3) Building the right context of the regent. There are two subcases: 3a) There are no right neighbours (j &gt; n) or none is expected for c(i). Go to step 5.</Paragraph> <Paragraph position="12"> 3b) c(j) (j &gt; i) exists. Go to step 1 with c(i) = c(i+1) and PENDING = push(c(i), PENDING).</Paragraph> <Paragraph position="13"> 4) The search of right dependents. Binary relation tests are done. In the case of a success the mapping f(c(i), c(j)) -> c(i)' takes place. Repeat step 3 with j = j+1 and c(i) = c(i)'.</Paragraph> <Paragraph position="14"> 5) The final state. There are two subcases: 5a) The environment description has been matched. If there remain no unbound c(j)'s (j &lt; i or j &gt; i) the sentence is parsed. If c(i+1) exists go to step 1 with i = i+1. If c(i+1) doesn't exist or the steps following the previous case returned a failure, go to step 4 with c(i) = pop(PENDING).</Paragraph> </Section> <Section position="2" start_page="222" end_page="222" type="sub_section"> <SectionTitle> 3.2 The implementation of the control strategy </SectionTitle> <Paragraph position="0"> The control system has two levels: the basic level employs a general two-way automaton and the upper level uses a blackboard system. There is a clear correspondence between the grammar description and the control system: the two-way automaton makes local decisions according to the binary relations. 
These local decisions are controlled by the blackboard system, which utilizes the environment descriptions written in the schemata. This two-level control model has certain advantages. The two-way automaton is computationally efficient in local decisions. On the other hand, the blackboard system is able to utilize global knowledge of the input sentence.</Paragraph> <Paragraph position="1"> Chronological backtracking. To account for ambiguities there are three kinds of backtracking points in the control system.</Paragraph> <Paragraph position="2"> Backtracking may be done in regard to the choice of dependency functions, homographic word forms, or associated schemata. Backtracking is chronological.</Paragraph> <Paragraph position="3"> In our system a constituent c(i) may contain several different morphotactic interpretations of a word form. Function backtracking takes place if there are several possible binary relations between a given constituent pair. The preconditions of the schemata may allow multiple schema candidates for a given constituent.</Paragraph> <Paragraph position="4"> All alternatives are gone through one by one, if necessary, in chronological backtracking. As a result, the system may perform an exhaustive search and produce all possible solutions.</Paragraph> <Paragraph position="5"> Register for long-distance dependencies. The recognition of possible long-distance dependencies is done by the capture function. An element is bound as a possible &quot;distant member&quot; in the context where the capture function fires. The element is also moved to a special register for the set of distant elements. The actual binding is done by the distant function from another schema. 
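
A minimal sketch of such a register, assuming (hypothetically) that it is kept as a list of waiting elements together with a chronological trail of register actions. The class, its method names and the sample strings are illustrative only and are not taken from the parser; binding the most recently captured element first is likewise an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class DistantRegister:
    """Hypothetical register for captured long-distance elements."""

    waiting: list = field(default_factory=list)  # captured, not yet bound
    trail: list = field(default_factory=list)    # chronological action log

    def capture(self, element, schema):
        """Capture function: store `element` as a possible distant member."""
        self.waiting.append(element)
        self.trail.append(("capture", element, schema))

    def bind_distant(self, relation, regent):
        """Distant function: bind the latest waiting element to `regent`.

        Taking the latest element first is an assumption of this sketch.
        """
        if not self.waiting:
            return None
        element = self.waiting.pop()
        self.trail.append(("bind", element, regent, relation))
        return (relation, regent, element)

    def undo(self):
        """Revert the most recent register action (strictly last-in, first-out)."""
        if not self.trail:
            return
        action = self.trail.pop()
        if action[0] == "capture":
            self.waiting.pop()              # the captured element leaves the register
        else:
            self.waiting.append(action[1])  # the bound element waits again


register = DistantRegister()
register.capture("distant-word", "capturing-schema")
binding = register.bind_distant("Object", "other-regent")
register.undo()  # the distant binding is reverted; the element waits again
```

Because every action is logged in order and reverted in reverse order, capture, bind_distant and undo each touch only the ends of the two lists.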
In chronological backtracking distant bindings are also undone.</Paragraph> <Paragraph position="6"> The strategy of local decisions controlled by global knowledge of the input sentence yields a strongly data-driven, left-to-right and bottom-up parse whereby partial dependency trees are built proceeding from the middle outwards.</Paragraph> </Section> <Section position="3" start_page="222" end_page="223" type="sub_section"> <SectionTitle> 3.3 EXAMPLES </SectionTitle> <Paragraph position="0"> To visualize our discussion, a functional schema IntrImperNegVP is described in Figure 5. A grammar writer has declared in the WHEN part that R must be a transitive process verb in active tense and imperative mood. In its left context there must be a negative verb in imperative mood and of the lexical form &quot;EI&quot; (&quot;NOT&quot;). There is one obligatory dependency relation, NegVerb. Adverbials may exist multiple times. A grammar writer has written in the clauses LEFT and RIGHT the left and right context binary relations of the regent. After the schema has fully matched, the regent is marked VerbP and the features PersonH and PersonP of the dependent recognized as NegVerb are marked for the regent.</Paragraph> <Paragraph position="1"> The next line indicates the selected schema and the dependents that are tested. The first word &quot;älä&quot; is identified as a negative imperative verb with no dependents (schema DummyVP ok). The imperative verb &quot;eksy&quot; (to get lost) is then tried by the schema IntrImperNegVP. The binary relation NegVerb holds between the two verbs, and the corresponding dependency function adjoins them. The other functions fail. Dependents are searched next from the right context. The control proceeds to the word &quot;metsässä&quot; (forest). For that word no dependents are found and the system returns to the unfinished regent &quot;eksy&quot;. The schema IntrImperNegVP has only two relations remaining: Connect and Adverbial. 
The word &quot;metsässä&quot; is bound as an adverbial. The schema has been fully matched and the input sentence is completely parsed.</Paragraph> <Paragraph position="2"> The second example shows how our parser solves the following sentence (adopted from Karttunen, 1986b), which has a long-distance dependency: En minä tennistä aio ruveta pelaamaan.</Paragraph> <Paragraph position="3"> not I tennis intend start play I do not intend to start to play tennis.</Paragraph> <Paragraph position="4"> The object of the subordinated infinitival clause (&quot;tennistä&quot;) has been raised into the main clause, thus creating a gap. The parse tree of the sentence is in [figure reference missing in the source]. In the parsing process the schema NO-VP has matched the environment of the verb &quot;aio&quot; (intend) and the schema O-LocativeVP that of the verb &quot;pelaamaan&quot; (play).</Paragraph> <Paragraph position="6"> The schema NO-VP has captured the word &quot;tennistä&quot; as a DistantMember. The schema O-LocativeVP has later on bound it as a removed Object.</Paragraph> </Section> </Section></Paper>
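
The control strategy of Section 3.1, applied to the first example sentence of Section 3.3, can be sketched roughly as follows. This is a simplified illustration, not the parser's implementation: it has no backtracking, no OBLIGATORY or ORDER checks and no long-distance register, and it closes pending regents only at the end of the input. The feature names, the relation tests, the dictionary layout and the schema name NoDeps are assumptions made for the sketch; only the state names S3 and S5, the schema names DummyVP and IntrImperNegVP and the relation names NegVerb and Adverbial come from the text.

```python
def match(allowed, regent, dependent, relations):
    """Return the first relation name in `allowed` whose test holds, else None."""
    for name in allowed:
        if relations[name](regent, dependent):
            return name
    return None


def reduce_right(stack, relations):
    """Bind a completed top entry (S5) to the pending regent (S3) below it."""
    while len(stack) >= 2 and stack[-1]["state"] == "S5" and stack[-2]["state"] == "S3":
        regent = stack[-2]
        name = match(regent["schema"]["right"], regent["node"], stack[-1]["node"], relations)
        if name is None:
            return
        regent["node"]["deps"].append((name, stack.pop()["node"]))


def parse(words, schemata, relations):
    """Left-to-right pass without backtracking; returns the head node or None."""
    stack = []
    for word in words:
        node = {"form": word["form"], "feats": word["feats"], "deps": []}
        schema = schemata[word["schema"]]
        # Left dependents: completed neighbours directly to the left.
        while stack and stack[-1]["state"] == "S5":
            name = match(schema["left"], node, stack[-1]["node"], relations)
            if name is None:
                break
            node["deps"].append((name, stack.pop()["node"]))
        # A regent that expects right dependents waits as pending (S3).
        state = "S3" if schema["right"] else "S5"
        stack.append({"node": node, "schema": schema, "state": state})
        reduce_right(stack, relations)
    # End of input: close pending regents from right to left.
    while stack and stack[-1]["state"] == "S3":
        stack[-1]["state"] = "S5"
        reduce_right(stack, relations)
    if len(stack) == 1 and stack[0]["state"] == "S5":
        return stack[0]["node"]
    return None


# Hypothetical relation tests and schemata for the first example sentence.
relations = {
    "NegVerb": lambda r, d: "negative" in d["feats"] and "imperative" in r["feats"],
    "Adverbial": lambda r, d: "inessive" in d["feats"],
}
schemata = {
    "DummyVP": {"left": [], "right": []},
    "IntrImperNegVP": {"left": ["NegVerb"], "right": ["Adverbial"]},
    "NoDeps": {"left": [], "right": []},
}
words = [
    {"form": "älä", "feats": {"verb", "negative", "imperative"}, "schema": "DummyVP"},
    {"form": "eksy", "feats": {"verb", "imperative"}, "schema": "IntrImperNegVP"},
    {"form": "metsässä", "feats": {"noun", "inessive"}, "schema": "NoDeps"},
]
tree = parse(words, schemata, relations)
print(tree["form"], [(name, dep["form"]) for name, dep in tree["deps"]])
# prints: eksy [('NegVerb', 'älä'), ('Adverbial', 'metsässä')]
```

A single stack stands in here for both the completed left neighbours and the PENDING stack: an entry in state S3 blocks the search for left dependents, which mirrors subcase 2a.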