File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/92/c92-3130_metho.xml
Size: 17,127 bytes
Last Modified: 2025-10-06 14:13:01
<?xml version="1.0" standalone="yes"?> <Paper uid="C92-3130"> <Title>An Abstraction Method Using a Semantic Engine Based on Language Information Structure</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 A semantic engine </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 What is the Language Information Structure (LIS) form? </SectionTitle> <Paragraph position="0"> The LIS form expresses the information structure that permits communication between individuals. If two individuals communicate about something that happened (or will happen) in the real world, the core information is the event. Sometimes a speaker attaches an attitude to the event. Information about the real world is therefore expressed by the event and the attitude of the speaker.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 LIS form </SectionTitle> <Paragraph position="0"> In the LIS form, there are two types of feature-structure: the word feature-structure and the event feature-structure. Almost all slots of the word feature-structure are filled with appropriate values, and few slots of the event feature-structure are empty. The semantic engine tries to fill all slots of the feature-structure.</Paragraph> <Paragraph position="1"> An event has one feature-structure, the role feature. A sentence contains one or more events, and the event feature-structure indicates the role of words or phrases in the event. The role feature is either essential or extensional. Seven essential roles are defined: AGENT, OBJECT, ACTION, LOCATION, TIME, FROM, and TO.</Paragraph> <Paragraph position="2"> These roles are defined not for verbs but for events. This is quite different from Fillmore's cases \[2\]. 
Therefore, the action in the event is represented by the ACTION slot, which can be filled by verbs, nouns, gerunds, and so on.</Paragraph> <Paragraph position="3"> It is not necessary to fill the ACTION slot with a verb.</Paragraph> <Paragraph position="4"> For example, the phrase &quot;a land purchase agreement&quot; is dealt with as one event in the LIS, and the ACTION slot-value is &quot;agreement&quot;. The other slots, AGENT, OBJECT, LOCATION, TIME, FROM, and TO, are almost the same as Fillmore's cases 'agent', 'object', 'location', 'time', 'source', and 'goal (or experiencer)'.</Paragraph> <Paragraph position="5"> It is important that our role model deals with the roles of words (or phrases) in an event, not with word meaning.</Paragraph> <Paragraph position="6"> Using just seven essential roles, it is difficult to assign a role to a word (or a phrase). To overcome this problem, we introduce extensional roles, which allow an essential role to be modified by the addition of &quot;/constraint&quot;. \[Proc. of COLING-92, Nantes, Aug. 23-28, 1992, p. 875\]</Paragraph> </Section> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 2.2.2 Word feature-structure </SectionTitle> <Paragraph position="0"> A word has six features: the semantic feature (DDF) slot, the numerical-value slot, the date slot, the constraint slot, the modality slot, and the word string slot.</Paragraph> <Paragraph position="1"> Using the semantic feature, the event feature-structure is determined during the semantic interpretation process.</Paragraph> <Paragraph position="2"> Six classes of semantic features are defined: INDIVIDUAL, ELEMENT, THING, ACTION, LOCATION, and TIME. These classes are instantiated to the Domain Dependent semantic Features (DDF) when the domain is decided.</Paragraph> <Paragraph position="3"> The constraint feature restricts the feature-structure of brother words or phrases. 
Furthermore, the constraint feature determines the relations between the word feature-structure and the event feature-structure. In Japanese, a word which has an ACTION DDF usually has a constraint feature that determines the slots of the event feature-structure.</Paragraph> <Paragraph position="4"> The numerical-value slot expresses the numerical value of a word: 0, 1, 2, 一 (one), 百 (hundred), 千 (thousand), and so on. Counting up and down must be calculated, so all digits are kept separate. The numerical-value feature is expressed as follows. (Our notation for a feature-structure is \[feature-name = feature-value\].) \[numerical-value = \[first-digit ..., second-digit ..., ...\]\] The date slot expresses the event occurrence time and is expressed in the Christian era. In the Christian era, days are counted by numbers, so date slots are calculated using the numerical-value feature. The date slot has a minute slot, a second slot, an hour slot, a day slot, a month slot, and a year slot. Each slot is expressed as a numerical value.</Paragraph> <Paragraph position="5"> The modality slot is classified into three categories: tense, aspect, and mood. Since tense and aspect are linguistically fixed, we use an ordinary categorization.</Paragraph> <Paragraph position="6"> However, mood needs to be categorized differently, because the information unit used in this system is an event. So we categorize mood as a combination of Bratman's Belief-Desire-Intention model \[1\] and modal logic. That is, the state of an event is expressed by modal logic (necessity operator, possibility operator, and negation sign) and the attitude of speakers can be classified into belief, desire, and intention. 
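The mood representation just described, a speaker attitude wrapped around a modal-logic state of an event, can be pictured as nested terms. The following is only a minimal sketch under stated assumptions, not the system's implementation: the class names Event, Modal, and Attitude, the function render, and the operator spellings are hypothetical, and the sample term is an arbitrary illustration.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical names for illustration; the paper does not define these classes.
MODAL_OPERATORS = ("Necessary", "Possible", "Not")   # state of the event (modal logic)
ATTITUDES = ("Belief", "Desire", "Intention")        # attitude of the speaker


@dataclass(frozen=True)
class Event:
    description: str


@dataclass(frozen=True)
class Modal:
    operator: str
    body: Union["Event", "Modal"]

    def __post_init__(self):
        if self.operator not in MODAL_OPERATORS:
            raise ValueError("unknown modal operator: " + self.operator)


@dataclass(frozen=True)
class Attitude:
    kind: str
    body: Union["Event", "Modal"]

    def __post_init__(self):
        if self.kind not in ATTITUDES:
            raise ValueError("unknown attitude: " + self.kind)


def render(term):
    """Render a term as nested operators, writing the event itself as E."""
    if isinstance(term, Event):
        return "E"
    label = term.kind if isinstance(term, Attitude) else term.operator
    return label + "(" + render(term.body) + ")"


# Arbitrary illustration: the speaker desires that some event be necessary.
mood = Attitude("Desire", Modal("Necessary", Event("some event")))
print(render(mood))  # Desire(Necessary(E))
```

Keeping the attitude and the modal operator as separate wrappers mirrors the two-part categorization above: either layer can be changed without touching the event itself.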
For example, the sentence &quot;I think it is possible to construct a plant there&quot; is expressed as Belief(Possible(E)), where E is the event &quot;construct a plant there&quot;; that is, the individual believes that E is possible.</Paragraph> <Paragraph position="7"> Furthermore, it is necessary to consider the situation in which information is transferred. In a newspaper, information is written by journalists who get it from other services (a person or a company information bureau). In this situation, the event and the attitude of the information possessor (IP) are passed to a speaker (SP), the journalist. The journalist then re-expresses the information to reflect his own attitude. If the modalities of IP and SP are expressed as Modality_IP() and Modality_SP(), respectively, information in a newspaper is expressed as Modality_SP(Modality_IP(EVENT)).</Paragraph> <Paragraph position="8"> If the target document is a newspaper, the LIS form includes the modality of the speaker (Modality_SP) and the modality of the information possessor (Modality_IP).</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.3 Projection Mechanism </SectionTitle> <Paragraph position="0"> Parsing is done using morphological analysis and dependency analysis \[4\] and yields a syntactic tree for a sentence. After parsing, we search a feature-structure dictionary to extract the feature-structures of all words related to the domain. To perform semantic analysis with limited knowledge, word feature-structures are prepared only for abstract words. The registration of proper nouns is left to the user. The semantic engine infers the meaning of words or phrases from the system defaults and user registrations held in the dictionary. This means the semantic engine does not need complete knowledge of words for semantic interpretation. 
Thus only a small number of words need to be maintained.</Paragraph> <Paragraph position="1"> After attaching the appropriate word feature-structure to all important words, semantic interpretation can proceed. According to how a feature propagates through the parsing tree, there are two types of features. One is the synthesized type, whose value is calculated from the sons up to the father in the parsing tree. The other is the inherited type, whose value is calculated from the father or brothers. The word string, DDF, numerical-value, date, and modality features are of the synthesized type, and the other features, such as constraint and role, are of the inherited type. The propagation of a feature-structure is accomplished by unification calculus, but the grammar differs from feature to feature.</Paragraph> <Paragraph position="2"> For DDF features, the grammar is as follows.</Paragraph> <Paragraph position="4"> Note: Uncapitalized words denote terminals and capitalized words denote nonterminals. E means the EVENT node structure, N means any other node structure, and N.DDF means the DDF feature-value in node 'N'. The symbol 'n' is the number of nodes. The operator in the rule denotes unification.</Paragraph> <Paragraph position="5"> For the constraint feature,</Paragraph> <Paragraph position="7"> The date and numerical-value features are rather complicated because we have to deal with the semantic meaning of time.</Paragraph> <Paragraph position="8"> The grammar for the date feature is,</Paragraph> <Paragraph position="10"> The calculation of the number and date features is done in a stack-like manner. The numerical-value feature has one stack and the date feature has six stacks.</Paragraph> <Paragraph position="11"> For example, for the number 1992, all the digits, 1,</Paragraph> <Paragraph position="12"> 2, 
and 9, are expressed as follows,</Paragraph> <Paragraph position="14"> Note: The symbol 'eval' means that the next form will be evaluated by Common Lisp. The symbol 'push-stack' is the function that puts the argument value on the top of the stack. The equation for the numerical-value of 1992 is,</Paragraph> <Paragraph position="16"> which, after evaluation, gives the value of 1992 as the following expression, counting right to left, the first digit being 2, the second digit being 9, and so on,</Paragraph> <Paragraph position="18"> \[numerical-value = \[first-digit 2, second-digit 9, third-digit 9, fourth-digit 1\]\] If we process the phrase 1992年 (year 1992), the equation becomes,</Paragraph> <Paragraph position="20"> Note: The nonterminal 'SELF' refers to the node's own feature-structure. The symbol 'push-year-stack' is the function that places the argument value on the year stack in the date feature.</Paragraph> <Paragraph position="21"> Then we get the time feature-structure as,</Paragraph> <Paragraph position="23"> The grammar for modality is quite simple,</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.4 An example of the projection process in the semantic engine </SectionTitle> <Paragraph position="0"> This passage comes from the Nikkan-kougyou shinbun (Daily Industrial Newspaper). The headline is (Takada-kiko Co., Ltd. is constructing a new plant to assemble large-scale steel bridges in Wakayama-ken.).</Paragraph> <Paragraph position="1"> (... announced a land purchase agreement in Shimotsu-machi, Wakayama-ken, where they will construct a new plant giving them additional space and capabilities to fabricate larger scale steel bridge structures.) (The 170,S56-sq.-meter construction site, previously occupied by part of a Maruzen Oil Co., Ltd. refinery, was purchased from Maruzen Shimotsu Kosan for an estimated ¥10 billion. Construction on the new plant facility is slated to begin this coming spring.) ... events which are related to the domain: company act. 
From the feature-structure dictionary, each sentence is quickly reviewed to determine whether it contains a word which has an ACTION DDF. If there is no such word, the system stops analyzing the sentence. In the example, the first sentence (S1) is made of two events. One is the word glossed as &quot;construction&quot; and the other is the word glossed as &quot;agreement&quot;. The two events are connected by &quot;surukoto-de&quot;, so sentence S1 is separated into two events. Sentence S3 does not have an ACTION DDF, so further analysis is not conducted.</Paragraph> <Paragraph position="2"> Therefore, we can obtain five events from the five sentences S1 to S5.</Paragraph> <Paragraph position="3"> (Takada-kiko Co., Ltd. will construct a new plant giving them additional space and capabilities to fabricate larger scale steel bridge structures.) After the event separation process concludes, semantic interpretation commences. The first stage is attaching a feature-structure to each word.</Paragraph> <Paragraph position="4"> Let us consider Event4 in S2: &quot;raisyun kara shin koujyou kensetsu ni chakkou suru&quot; (Construction on the new plant facility is slated to begin this coming spring.). In this passage, there are three Bunsetsu, five independent words, and three dependent words. We need only five feature-structures, as shown below.</Paragraph> <Paragraph position="5"> \[Feature-structure listing; surviving entry: raisyun, this coming spring\] Note: if a constraint is unified with a node, then the variable '_agent' is bound to that node's feature-structure. The variable *article-year* is bound to the year in which the article was published. The example is parsed as shown in figure 1.</Paragraph> <Paragraph position="6"> Once the parsing is finished, the semantic interpretation process begins. 
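The synthesized (bottom-up) part of this process can be pictured as merging the feature-structures of child nodes into their parent. The sketch below is a simplification under stated assumptions, not the engine's own grammar: feature-structures are flat Python dictionaries, unification fails only when two atomic values conflict, the function names unify and synthesize are hypothetical, and the feature values given to the sample word are invented for illustration.

```python
def unify(left, right):
    """Unify two flat feature-structures; return None when values conflict."""
    result = dict(left)
    for name, value in right.items():
        if name in result and result[name] != value:
            return None  # conflicting values: unification fails
        result[name] = value
    return result


def synthesize(children):
    """Compute a parent's synthesized features from its children.

    A child with no feature-structure (None) contributes nothing, so the
    features of its siblings pass up to the parent unchanged.
    """
    parent = {}
    for child in children:
        if child is None:
            continue
        parent = unify(parent, child)
        if parent is None:
            return None
    return parent


# Illustrative values only: a content word with features and a particle without.
content_word = {"DDF": "time", "string": "raisyun"}
particle = None
print(synthesize([content_word, particle]))
```

In this simplified form, a parent whose only informative child is the content word ends up with exactly that child's features.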
Node n1 has the feature-structure that results from the calculation between the feature-structures of &quot;raisyun&quot; and &quot;kara&quot;, but the word &quot;kara&quot; has no feature-structure, so the feature-structure of &quot;raisyun&quot;, u1, is propagated to node n1. The feature-structures of all nodes are calculated in the same way.</Paragraph> <Paragraph position="7"> For the constraint feature, unification is done over all brothers. For example, \[_agent.DDF = individual(company)\] means that one brother node is needed which has the DDF value of agent(company). \[Figure 1: First stage of syntactic tree, with leaves (raisyun) (kara) (shin) (koujyou) (kensetsu) (ni) (chakkou) (suru) and nodes u1 to u5\]</Paragraph> <Paragraph position="8"> If there is a node which satisfies the constraint, then the variable _agent is bound to that node's feature-structure. If there is no node which satisfies the constraint, then the variable _agent is unbound.</Paragraph> <Paragraph position="9"> Consider the constraint feature in &quot;kensetsu&quot;. There is no node that has agent(company) in its DDF, but there are nodes which satisfy the other constraints: _action, _object, and _time are bound to nodes n3, n2, and n1, respectively.</Paragraph> <Paragraph position="10"> Finally we get the event feature-structure of the top node n-top, shown in figure 2.</Paragraph> <Paragraph position="11"> In the abstraction, we utilize classification of the LIS output. First, a sentence is put into the LIS form by the semantic engine.</Paragraph> <Paragraph position="12"> The LIS output is used to commence the abstraction procedure. To extract information from sentences, we think classification is the best way. The semantic engine analyzes sentences in a fixed domain, after the semantic analysis. 
Sentences are classified according to whether or not they express an event, and the system extracts the events which are related to the domain.</Paragraph> <Paragraph position="13"> Finally, ABEX provides an abstraction. One abstraction proposed here is classification by event occurrence time and similarity of events. This classification reveals the relationships between events. Individual event occurrence times are determined from the value of the time feature, and the similarity of events is calculated by comparing event feature-structure slots.</Paragraph> <Paragraph position="14"> The other method is classification by the modality of information. From the viewpoint of Modality_IP, we can classify an event according to the modality of the information possessor (IP). If there is no modality in the event, we classify it as 'fact'. Others are classified using the modality feature. This classification of the event's modality reveals the attitude of the information possessor.</Paragraph> <Paragraph position="15"> \[Figure residue: event feature-structure listing. Recoverable entries: constraints on _object.DDF and _time.DDF; string = &quot;shin koujyou&quot;; date with month = 4 and year = 1991; modality with tense = future\]</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.2 An example of the abstraction </SectionTitle> <Paragraph position="0"> Figure 2 shows a typical abstraction result of ABEX.</Paragraph> <Paragraph position="1"> The events are classified by event occurrence time and by the similarity of each event. 
In this figure, the x-axis indicates absolute event occurrence time, the y-axis indicates relative similarity of events, and circled icons indicate single events.</Paragraph> <Paragraph position="2"> A typical classification result using the modality of information is shown in figure 3.</Paragraph> <Paragraph position="3"> Event 2 has the modality of an official bulletin and Event 5 has the modality of company intention, so we get the abstraction result shown in figure 3.</Paragraph> </Section> </Section></Paper>
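The modality-based classification in this example can be pictured as a simple grouping. This is an illustration under assumptions rather than ABEX itself: events are plain dictionaries, the key name modality_ip and the function classify_by_modality are hypothetical, the labels for Events 2 and 5 follow the example above, and Event 1 is invented only to show the 'fact' default.

```python
from collections import defaultdict


def classify_by_modality(events):
    """Group event ids by the modality of the information possessor (IP).

    Events that carry no IP modality are classified as 'fact'.
    """
    groups = defaultdict(list)
    for event in events:
        label = event.get("modality_ip") or "fact"
        groups[label].append(event["id"])
    return dict(groups)


events = [
    {"id": "Event 1"},  # no modality; invented to show the default
    {"id": "Event 2", "modality_ip": "official bulletin"},
    {"id": "Event 5", "modality_ip": "company intention"},
]
print(classify_by_modality(events))
```

Treating a missing modality as 'fact' follows the rule stated for this classification; every other label is taken directly from the event's modality feature.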