<?xml version="1.0" standalone="yes"?>
<Paper uid="C86-1058">
  <Title>Context Analysis System for Japanese Text</Title>
  <Section position="1" start_page="0" end_page="245" type="abstr">
    <SectionTitle>
ABSTRACT
</SectionTitle>
    <Paragraph position="0"> A natural language understanding system is described which extracts contextual information from Japanese texts. It integrates syntactic, semantic and contextual processing serially. The syntactic analyzer obtains rough syntactic structures from the text. The semantic analyzer treats modifying relations inside noun phrases and case relations among verbs and noun phrases.</Paragraph>
    <Paragraph position="1"> Then, the contextual analyzer obtains contextual information from the semantic structure extracted by the semantic analyzer. Our system understands the context using previously prepared contextual knowledge on terrorism and plugs the event information in input sentences into the contextual structure.</Paragraph>
    <Paragraph position="2"> 1: Introduction Despite the advanced state of syntactic analysis research for natural language processing and the many useful results it has produced, there have been few studies involving contextual information, and many problems remain unsolved.</Paragraph>
    <Paragraph position="3"> The natural language understanding system described here employs a syntactic analyzer, a semantic analyzer treating modifying relations inside noun phrases and the relations among verbs and phrases, that is, word-level semantics, and a contextual analyzer (Fig. 1). These analyzers operate in a serially integrated fashion. Though humans seem to understand natural language texts using these three analyzers simultaneously, we have made their methodology essentially different from their human counterparts for more efficient computing. Our system uses a context-free grammar parser named Extended-Lingol as a syntactic analyzer to analyze the Japanese sentences and produce parsing trees. From an analysis of these, in turn, it obtains word-level semantic structures expressed in frame-like representations. Finally, it extracts contextual information, using our representation, from the semantic structures. We remain far from certain at this stage whether this system represents the best realization of an engineering-based natural language understanding system. Future plans include combining these three processes into one process and bringing the system closer to the human process.</Paragraph>
    <Paragraph position="4"> Because our system uses bottom-up analysis first (including syntactic analysis and word-level semantic analysis), it can obtain not only the outline of the input sentences but also their details, as necessary. This method is the best one in situations where the detailed information of texts is quite important, such as machine translation systems and precise question-answering systems. Of course, in this way, we must build up a sizable dictionary of precise word definitions.</Paragraph>
    <Paragraph position="5"> In our system, predictive-style processing is not used in syntactic analysis and word-level semantic analysis. In the contextual analysis part, however, predictions from the tree structure of the contextual information are used to instantiate the contextual structure.</Paragraph>
    <Paragraph position="6"> We are now developing a system which can understand newspaper articles through contextual structure (see Fig. 2a). After applying the procedures outlined above, the system obtains contextual representations expressed as shown in Fig. 3. Some details of the input text are abbreviated in the figure.</Paragraph>
    <Paragraph position="7"> In the morning of the 29th, at Palermo, Sicily in Italy, a parked car exploded, which killed 4 people including a judge who had directed an investigation into Mafia crimes, and injured about 10 people seriously or slightly. This is the fourth murder case on judges at Palermo and is of the largest scale.</Paragraph>
    <Paragraph position="8"> Judge Rocco Chinnici, 58, the director of the Palermo preliminary court, police bodyguards and others were murdered. At the moment when the judge left home, the bomb exploded which had been set in the car of Fiatt parked near there. The explosion involved the residents, windows of the apartment and about 10 cars near there.</Paragraph>
    <Paragraph position="9"> b: The translation of the example article (a) from Japanese into English. Fig. 2. An example of newspaper articles</Paragraph>
    <Paragraph position="10"> 2: Syntactic and semantic analysis [2] Let us proceed to an explanation of the methodologies adopted by our system, using the newspaper article in Fig. 2a as an example.</Paragraph>
    <Paragraph position="11"> First, the system analyzes each sentence syntactically, obtaining parsing trees. Next, the system constructs a semantic structure for each phrase. Word meanings in our word dictionary are described in SRL (Semantic Representation Language), which uses frame-like expressions as shown in Fig. 4. Each word meaning occupies a suitable position in the hierarchy of concepts. SRL enables deep semantic analysis in a flexible way. The formal definition of its syntax and semantics is not stated here. In our system, a word meaning written in the lexical entry using SRL plays an important role in semantic analysis. The interaction between the word meanings is the central issue of the semantic analysis. The modifying relations inside noun phrases and the case relations among verbs and noun phrases are determined in the word-level semantic structure. In Fig. 4, three scenes (explosion, death and injury) are obtained by analyzing the first sentence of the article in Fig. 2a. &quot;Human&quot; is a dummy node that means human beings. Here, the people who died include a judge and some policemen.</Paragraph>
    <Paragraph position="12"> There are several types of ambiguity in input text. In syntactic analysis, ambiguity means the existence of several parsing trees. Word-level semantics often specify which should be selected. Here, we should use a kind of prediction. For example, people who are in authority could be a target of terrorism (see Fig. 2a). These constraints are very helpful in eliminating ambiguity, as is surface syntactic information. Some of this processing is done in an interactive way in our system. Our system asks the user how to specify the relations between events at some decision points. Even after the elimination of ambiguity by the word semantics, there may be unsolved ambiguities.</Paragraph>
    <Paragraph position="13"> These will be eliminated by contextual analysis with the contextual structure.</Paragraph>
    <Paragraph position="14"> 3: Features of contextual representation Our contextual structure fits into a tree structure with one root node and a number of leaf nodes. Relations between events in a story are defined in the structure as &quot;scenes&quot;, and the relations among the scenes are defined by the tree structure. A structure can share scenes with other structures.</Paragraph>
    <Paragraph position="15"> Leaf nodes with a shared root node have either an &quot;and&quot; or an &quot;or&quot; relationship with each other. The hierarchy shown in Fig. 5 is an example. The node &quot;terrorism involving bomb&quot; has, as in Fig. 5, three leaf nodes (scenes) - &quot;explosion,&quot; &quot;damage&quot; and &quot;rescue&quot;. Since those seem to occur serially, the relationship among them is an &quot;and&quot; relationship. On the other hand, the root node &quot;terrorist action&quot; in Fig. 5 has several leaf nodes - &quot;terrorism involving bomb&quot;, &quot;shooting&quot; and so on. As only one of these usually corresponds to the main topic in newspaper stories, they share an &quot;or&quot; relationship with each other.</Paragraph>
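    <Paragraph> For illustration only, the &quot;and&quot; and &quot;or&quot; relationships described above can be sketched in Python as follows. This is not the implementation of the system described here. The class name Node, the function satisfied, and the rule that an &quot;and&quot; node needs all of its scenes while an &quot;or&quot; node needs at least one are assumptions introduced for this sketch; &quot;shooting&quot; is also treated as a single scene only because its own scenes are not listed in the text.</Paragraph>
    <Paragraph><![CDATA[
```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """A node of a contextual structure; a node without children is a scene."""
    name: str
    relation: str = "and"  # how the children relate: "and" or "or"
    children: List["Node"] = field(default_factory=list)

    def is_scene(self) -> bool:
        return not self.children


def satisfied(node: Node, observed: Set[str]) -> bool:
    """Assumed rule: an "and" node needs every child, an "or" node needs one."""
    if node.is_scene():
        return node.name in observed
    results = [satisfied(child, observed) for child in node.children]
    return all(results) if node.relation == "and" else any(results)


# Node names follow the text; the tree shape beyond them is hypothetical.
bomb = Node("terrorism involving bomb", "and",
            [Node("explosion"), Node("damage"), Node("rescue")])
shooting = Node("shooting")  # simplified to one scene for this sketch
terrorist_action = Node("terrorist action", "or", [bomb, shooting])
```
]]></Paragraph>
    <Paragraph> Under these assumptions, observing only an explosion does not yet complete &quot;terrorism involving bomb&quot;, whereas any one completed alternative is enough for the &quot;or&quot; node &quot;terrorist action&quot;.</Paragraph>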
    <Paragraph position="16"> Input events are matched not only directly with scenes in the structure, but also with higher concepts in accordance with a predefined tree structure of a concept hierarchy like that in Fig. 6. In other words, the system has a concept thesaurus. So, matching between the scene of the structure and the input events  4: Contextual structure selection process We have now implemented two methods for selecting the contextual structure, a &quot;two-event method&quot; and a &quot;title-based method&quot;. First, we explain the &quot;two-event method&quot;, in which titles are not processed by the system for selection. In sentence processing, after two events are obtained, the system begins a search for a structure involving these two events as its scenes. The use of two events helps decrease the number of possible structures during the search. As mentioned previously, selection of suitable structures and scenes can be accomplished flexibly with the concept thesaurus.</Paragraph>
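    <Paragraph> As a rough sketch of the idea, and not of the actual system, the following Python fragment matches input events against scenes through a small concept thesaurus and then searches for structures that contain two events. The tables PARENT and STRUCTURES, their entries, and the function names ancestors, matches, candidates and select_by_two_events are all hypothetical and chosen only for illustration.</Paragraph>
    <Paragraph><![CDATA[
```python
from typing import Dict, List, Set

# Hypothetical concept thesaurus: concept -> next higher concept (acyclic).
PARENT: Dict[str, str] = {
    "car bomb blast": "explosion",
    "death": "damage",
    "injury": "damage",
}

# Hypothetical contextual structures: structure name -> its scenes.
STRUCTURES: Dict[str, Set[str]] = {
    "terrorism involving bomb": {"explosion", "damage", "rescue"},
    "shooting": {"gunfire", "damage", "rescue"},
}


def ancestors(concept: str) -> List[str]:
    """Return the concept followed by its higher concepts, nearest first."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def matches(event: str, scenes: Set[str]) -> bool:
    """An event matches if it, or one of its higher concepts, is a scene."""
    return any(concept in scenes for concept in ancestors(event))


def candidates(*events: str) -> List[str]:
    """Names of the structures whose scenes match every given event."""
    return [name for name, scenes in STRUCTURES.items()
            if all(matches(event, scenes) for event in events)]


def select_by_two_events(first: str, second: str) -> List[str]:
    """Search for structures that involve both events as scenes."""
    return candidates(first, second)
```
]]></Paragraph>
    <Paragraph> With this toy data, the single event &quot;death&quot; is compatible with both structures, while the pair &quot;car bomb blast&quot; and &quot;death&quot; leaves only &quot;terrorism involving bomb&quot;, which mirrors the remark that using two events narrows the search.</Paragraph>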
    <Paragraph position="17"> After developing the &quot;two-event method&quot;, we began to implement the &quot;title-based method&quot;. In newspaper articles, titles carry important information for the selection of suitable contextual structures. If there is a special word (noun or verb) in the title, the contextual representation indicated by that word is selected. In this way, the system can almost always select suitable structures. Newspaper titles should be written so that readers can get enough information for the selection of the topic from the title alone. The correct selection rate of our &quot;title-based method&quot; is shown in Table I. Derivatives point to their original words and, through them, can select a suitable structure.</Paragraph>
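    <Paragraph> The lookup described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the system itself: it uses English words instead of Japanese ones, and the tables KEYWORD_TO_STRUCTURE and DERIVATIVE_TO_ORIGINAL, their entries, and the function select_by_title are hypothetical.</Paragraph>
    <Paragraph><![CDATA[
```python
import re
from typing import Dict, Optional

# Hypothetical special words and the contextual structures they indicate.
KEYWORD_TO_STRUCTURE: Dict[str, str] = {
    "bomb": "terrorism involving bomb",
    "shoot": "shooting",
}

# Hypothetical derivatives, each pointing to its original word.
DERIVATIVE_TO_ORIGINAL: Dict[str, str] = {
    "bombing": "bomb",
    "bombed": "bomb",
    "shooting": "shoot",
    "shot": "shoot",
}


def select_by_title(title: str) -> Optional[str]:
    """Return the structure indicated by the first special word in the title."""
    for word in re.findall(r"[a-z]+", title.lower()):
        # A derivative is first mapped back to its original word.
        original = DERIVATIVE_TO_ORIGINAL.get(word, word)
        if original in KEYWORD_TO_STRUCTURE:
            return KEYWORD_TO_STRUCTURE[original]
    return None  # no special word found in the title
```
]]></Paragraph>
    <Paragraph> In this sketch a title containing &quot;bombing&quot; selects &quot;terrorism involving bomb&quot; through the original word &quot;bomb&quot;, and a title without any special word selects nothing.</Paragraph>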
    <Paragraph position="18"> In our experience, there is no difference in the correct selection rate between these two methods. At present, our system uses the &quot;title-based method&quot; because of its similarity to human behaviour.</Paragraph>
  </Section>
</Paper>