<?xml version="1.0" standalone="yes"?> <Paper uid="W00-1215"> <Title>Semantic Annotation of Chinese Phrases Using Recursive-Graph</Title> <Section position="3" start_page="0" end_page="103" type="intro"> <SectionTitle> 2 Motivation </SectionTitle> <Paragraph position="0"> For a specific semantic annotation task, two problems should be made clear first.</Paragraph> <Paragraph position="1"> In general, semantic annotation of linguistic forms associates them with their semantic information, represented in some formal language or diagram. As the semantic information involved varies, the formal languages or diagrams may differ.</Paragraph> <Paragraph position="2"> One commonly used diagram for semantic annotation of linguistic forms is the dependent tree, which depicts the dependence or control relationships between the constituents of a linguistic form (Langacker, 1997). But such trees may not be powerful enough to differentiate Chinese phrases that comprise the same content words but hold different meanings because of their word order or the function words involved. As an example, consider 1) and 2) 1.</Paragraph> <Paragraph position="3"> words, but hold different word order.</Paragraph> <Paragraph position="4"> Regarding their meanings, 1) is an ambiguous phrase, corresponding to two English translations, 3) and 4). 3) to smuggle cars 4) smuggled cars The translation of 2) is 5).</Paragraph> <Paragraph position="5"> 5) the smuggling of cars So the two phrases hold three meanings altogether. But the two content words can form only two dependent trees, listed in 6) and 7).</Paragraph> <Paragraph position="6"> 1 In this paper, whenever listing a Chinese word, we always list its Pinyin, enclosed between two '/' symbols, and its English translation. 
For a Chinese phrase, we also list its English translation when necessary.</Paragraph> <Paragraph position="8"> Obviously, these two dependent trees cannot code the three meanings listed in 3), 4) and 5). Only if we treated the same word 走私 (/zousi/, smuggle) as semantically different in 1) and 2) could we add another dependent tree, 8), to code 5), with 6) and 7) corresponding to 3) and 4) respectively.</Paragraph> <Paragraph position="10"> But this view is quite unintuitive and leads to contradiction. Consider two other phrases, 9) and 10).</Paragraph> <Paragraph position="11"> Intuitively, the word 走私 (/zousi/, smuggle) in 2) holds the same meaning as the word in 10), which in turn is equivalent to the same word in 9). On the other hand, there is no reason to treat the same word 走私 (/zousi/, smuggle) as semantically different in 1) and 9), two typical noun phrases. In particular, in both phrases it is followed by a typical noun. The second problem with using dependent trees to annotate linguistic forms semantically is that, because of its tree nature, a dependent tree cannot represent multi-dependence relationships, in which one node is controlled by several nodes. For example, consider 11).</Paragraph> <Paragraph position="12"> 11) 喜欢 (/xihuan/, like) 自己 (/ziji/, self) 的 (/de/, of) 人 (/ren/, people): the people who like oneself. Conventionally, its dependent tree should be 12).</Paragraph> <Paragraph position="14"> But intuitively, there should be some dependence relationship between 人 (/ren/, people) and 自己 (/ziji/, self). If we add this relationship, the tree becomes a graph.</Paragraph> <Section position="1" start_page="102" end_page="103" type="sub_section"> <SectionTitle> 2.2 Conceptual Graph </SectionTitle> <Paragraph position="0"> The conceptual graph is another diagram for semantic annotation of linguistic forms; it comprises the concepts and conceptual relationships denoted by linguistic forms (Eklund, 1996). 
Although it is claimed to be a directed graph in its original form, it is in nature equivalent to an undirected graph, in which each relationship node and its directed edges are replaced by a single undirected edge that denotes the relationship directly.</Paragraph> <Paragraph position="1"> One problem with this diagram for semantic annotation is that it cannot code information about the head, if any, of a linguistic form, which intuitively specifies the main information the form carries. This leads to severe problems when the graph is used to represent linguistic forms. For example, both phrases 1) and 2), with all three meanings 3), 4) and 5), would be represented by the same diagram, 13) 3 .</Paragraph> <Paragraph position="2"> 13) [diagram: 走私 (/zousi/, smuggle), 汽车 (/qiche/, car)] To differentiate the three meanings held by the two phrases, we suggest using two weighted graphs and one unweighted graph, i.e., 14), 15) and 16), to represent 3), 4) and 5) respectively.</Paragraph> <Paragraph position="3"> 14) [diagram: 走私 (/zousi/, smuggle), 汽车 (/qiche/, car); no head marked] 15) [diagram: 走私 (/zousi/, smuggle), 汽车 (/qiche/, car); one word marked as head by a rectangle] 16) [diagram: 走私 (/zousi/, smuggle), 汽车 (/qiche/, car); one word marked as head by a rectangle] Here we basically use undirected graphs to annotate phrases, and introduce a rectangle to denote the head of a linguistic phrase, if any. Note that we do not mark a head in 14), which means that we do not take the verb 走私 (/zousi/, smuggle) as the head of the verb phrase, as is usually done. In general, most verb phrases like 14) in Chinese correspond to two modifier-center phrases like 15) and 16), which comprise the same content words but have different meanings. For such phrases, we generally use the presence of a headword to differentiate the verb phrase from the two modifier-center phrases, and then use different headwords to distinguish the two modifier-center phrases from each other. Another problem with conceptual structure concerns its ability to deal with hierarchical structures. 
Although nested conceptual structure has been introduced to describe nested belief, one particular kind of hierarchical structure (Genevirve, 1998), some simple hierarchical structures cannot be distinguished or annotated appropriately. As an example, consider 17) and 18).</Paragraph> <Paragraph position="4"> to smuggle beautiful cars Using a conceptual graph, we can annotate both of them as 19), in which case the two phrases cannot be distinguished.</Paragraph> <Paragraph position="5"> 19) [diagram: 走私 (/zousi/, smuggle), 汽车 (/qiche/, car), 漂亮 (/piaoliang/, beautiful)] Using an undirected graph plus head, 17) can be annotated as 20).</Paragraph> <Paragraph position="6"> 20) [diagram] But there is no appropriate annotation for 18): on the one hand, 汽车 (/qiche/, car) should be coded as a head because of its relationship with 漂亮 (/piaoliang/, beautiful); on the other hand, its role as the object of the verb 走私 (/zousi/, smuggle) makes it illegal as a head.</Paragraph> <Paragraph position="7"> To differentiate 17) and 18), we further suggest specifying the embedded structures in linguistic forms and using circles to denote them. On this view, 17) and 18) can be annotated as 21) and 22) respectively. 21) [diagram] In this diagram, the smaller rectangle denotes the head of the modifier-center phrase 走私汽车 (/zousiqiche/, smuggled cars), the circle codes that phrase as an embedded structure of the whole phrase, and the bigger rectangle denotes the head of the whole phrase.</Paragraph> <Paragraph position="8"> 22) [diagram] In this diagram, the circle denotes the embedded structure 漂亮的汽车 (/piaoliang de qiche/, beautiful cars), while the rectangle denotes the head of the embedded structure. Note that the phrase as a whole is a verb phrase, so no head is coded for it.</Paragraph> </Section> </Section></Paper>