<?xml version="1.0" standalone="yes"?> <Paper uid="W03-1119"> <Title>A Sentence Reduction Using Syntax Control</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Sentence reduction using syntax control </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Formulation </SectionTitle> <Paragraph position="0"> Let E and V be two different languages, and let $e = e_1, e_2, \ldots, e_n$ be a long sentence in language E.</Paragraph> <Paragraph position="1"> The task of sentence reduction into the two languages E and V is to remove or replace redundant words in the sentence $e$ so as to generate two new sentences, $e'_1, e'_2, \ldots, e'_m$ in language E and $v_1, v_2, \ldots, v_k$ in language V, whose gist meanings are unchanged.</Paragraph> <Paragraph position="2"> In practice, we used English as the source language and English and Vietnamese as the target languages. However, our method can be applied to any pair of languages.</Paragraph> <Paragraph position="3"> In the following, we present a sentence reduction algorithm that uses syntax control with rich semantic information.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Sentence reduction algorithm </SectionTitle> <Paragraph position="0"> We present an algorithm based on semantic parsing that generates two short sentences in different languages. The reduction algorithm using syntax control has three steps. In the first step, the input sentence $e$ is parsed into a syntax tree $t$ by a syntax parser.</Paragraph> <Paragraph position="1"> In the second step, a semantic parser adds rich semantic information to the syntax tree, associating each node of the tree with a specific syntax control. 
The final step generates two different sentences, one in language E and one in language V, from the syntax tree $t$ annotated with rich semantic information. First, we parse a sentence into a syntax tree. Our syntax parser locates the subject, object, and head word of a sentence. It also recognizes phrasal verbs, cue phrases, and expressions in English sentences. This information is useful for reducing a sentence. Figure 2 explains the correspondence between our grammar symbols and English grammar symbols.</Paragraph> <Paragraph position="2"> Figure 1 shows an example of our syntax parsing for the sentence &quot;Like FaceLift, much of ATM's screen performance depends on the underlying application&quot;. To reduce ambiguity, we designed a syntactic parser based on finely classified grammar symbols. The parts of speech of words were extended to cope with the ambiguity problem. For example, in Figure 2, &quot;noun&quot; is divided into &quot;private noun&quot; and &quot;general noun&quot;.</Paragraph> <Paragraph position="3"> The bilingual dictionary contains about 200,000 English words and their meanings in Vietnamese. Each English word entry includes several Vietnamese meanings, and each meaning is associated with a symbol meaning. The set of symbol meanings in each word entry is defined using the WordNet database (C. Fellbaum, 1998). [Figure 1: syntax parse of the sentence &quot;Like FaceLift, much of ATM's screen performance depends on the underlying application&quot;] [Figure: ... English and its equivalent in Vietnamese]</Paragraph> <Paragraph position="4"> After producing a syntax tree with rich information, we apply semantic parsing to that tree.</Paragraph> <Paragraph position="5"> Let $N$ be an internal node of the syntax tree $t$ with $k$ child nodes $n_1, n_2, \ldots, n_k$.</Paragraph> <Paragraph position="6"> The node $N$ uses the semantic information of its $k$ child nodes to decide which part should remain in the reduced sentence.</Paragraph> <Paragraph position="7"> During semantic parsing of the syntax tree $t$, each node $N$ must use the information of its child nodes to define its own information. We call this the semantic-information of the node $N$ and denote it $N.sem$. In addition, each semantic-information of a node $N$ is mapped to a meaning in the target language.</Paragraph> <Paragraph position="8"> For convenience, we define $SI$ as the set of semantic-information and write $n_i[j]$ for the $j$th semantic-information of the node $n_i$.</Paragraph> <Paragraph position="9"> To determine the meaning of the node $N$, we need to know the meaning of each child node and how to combine these meanings into a meaning for $N$.</Paragraph> <Paragraph position="10"> Figure 3 shows two choices for the sequence of meanings of the node $N$ in a reduction process.</Paragraph> <Paragraph position="11"> A human can easily decide which meaning each $n_i$ should take and then decode these meanings as objects to memorize. Building on this idea, we designed a control language for the task.</Paragraph> <Paragraph position="12"> The $k$ child nodes $n_1, n_2, \ldots, n_k$ are associated with a set of syntax controls that guide the sentence reduction process. The node $N$ and its children are associated with a set of rules. 
To express the set of rules, we use a simple control-language syntax with three parts: 1) Syntax to specify the order of child nodes and the nodes to be removed.</Paragraph> <Paragraph position="13"> 2) Syntax to constrain each meaning of a child node by the meanings of the other child nodes.</Paragraph> <Paragraph position="14"> 3) Syntax to combine a sequence of meanings into one symbol meaning (we call this the inherit process between the node $N$ and its children).</Paragraph> <Paragraph position="15"> A syntax control rule is encoded as one generation rule together with a set of condition rules that the generation rule has to satisfy. Given a specific condition rule, we can define its generation rule directly.</Paragraph> <Paragraph position="16"> Condition rule. A condition rule is formulated as follows: if</Paragraph> <Paragraph position="18"> then $N.sem = v$, with $v$ and $v_j \in SI$. Generation rule. A generation rule is a sequence of symbols that transfers the internal node $N$ into the corresponding internal node of the reduced sentence. We use two generation rules, one for E and one for V. A generation rule is a sequence of symbols $g = g_1 g_2 \ldots g_m$, in which each $g_i$ is an integer or a string. If $g_i = j$, the child node at position $j$ is kept in the target node. If $g_i$ is a string &quot;$v_1 v_2 \ldots v_l$&quot;, that string becomes the $i$th child of the target node.</Paragraph> <Paragraph position="19"> Figure 1 shows a syntax tree of the input sentence: &quot;Much of ATM's performance depends on the underlying application.&quot; In this syntax tree, the syntax control is specified by a condition rule and a generation rule. The condition rule &quot;default&quot; means that the generation rule is applied under any condition. 
The generation rule &quot;1 2&quot; means that, of the rule &quot;S1=Bng-daucau Subj cdgt Bng-cuoicau&quot;, only the node (Subj) at index 1 and the node (cdgt) at index 2 remain in the reduced sentence.</Paragraph> <Paragraph position="20"> If the syntax control is changed so that its condition rule requires the semantic information of the child node &quot;Subj&quot; to be &quot;HUMAN&quot;, the generation rule &quot;1 2&quot; is applied in the reduction process only in that case. With the default condition rule, the reduced sentences are generated as follows.</Paragraph> <Paragraph position="21"> Original sentence: Like FaceLift, much of ATM's screen performance depends on the underlying application.</Paragraph> <Paragraph position="22"> Reduced sentence in English: Much of ATM's performance depends on the underlying application. Reduced sentence in Vietnamese: Nhieu hieu suat cua ATM phu thuoc vao nhung ung dung tiem an.</Paragraph> <Paragraph position="23"> To generate the reduced sentence in Vietnamese, condition rules and generation rules are designed as well. This process works in the same way as the transfer translation method.</Paragraph> <Paragraph position="24"> Because the gist meaning of the short sentence is unchanged with respect to the original sentence, the gist meaning of a node after applying the syntax control is also unchanged. Under this assumption, the syntax control used for translating the original sentence into another language (English into Vietnamese) can be reused for translating the reduced sentence. Therefore, our sentence reduction program can produce two reduced sentences in two different languages. Our semantic parser uses the set of rules to select suitable rules for the current context. Selecting suitable rules for the current context of the current node $N$ amounts to finding the most likely condition rule among the syntax control rules associated with it. 
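The way a single node is reduced by a syntax control can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the names Child, condition_holds, apply_generation_rule and reduce_node, the dictionary encoding of a condition rule, the text spans, and the symbol meanings "ABSTRACT", "STATE" and "NONE" are all assumptions made for illustration. Only the rule "S1=Bng-daucau Subj cdgt Bng-cuoicau", the generation rule "1 2", and the conditions "default" and "HUMAN" come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Child:
    label: str  # grammar symbol of the child node, e.g. "Subj"
    text: str   # surface string covered by the child
    sem: str    # symbol meaning chosen for the child (hypothetical values)


def condition_holds(condition, children):
    """A condition is "default" or a dict {child label: required symbol meaning}."""
    if condition == "default":
        return True
    sem_by_label = {c.label: c.sem for c in children}
    return all(sem_by_label.get(label) == sem for label, sem in condition.items())


def apply_generation_rule(rule, children):
    """Integer symbols keep the child at that index; other symbols are literal strings.

    Splitting on whitespace is a simplification: literal strings cannot contain spaces.
    """
    parts = []
    for symbol in rule.split():
        parts.append(children[int(symbol)].text if symbol.isdigit() else symbol)
    return " ".join(parts)


def reduce_node(children, controls):
    """Apply the generation rule of the first control whose condition holds.

    If no condition holds, all children are kept (an assumption of this sketch).
    """
    for condition, rule in controls:
        if condition_holds(condition, children):
            return apply_generation_rule(rule, children)
    return " ".join(c.text for c in children)


# Children of the rule "S1=Bng-daucau Subj cdgt Bng-cuoicau"; the text spans
# and the symbol meanings below are illustrative guesses, not parser output.
children = [
    Child("Bng-daucau", "Like FaceLift,", "NONE"),
    Child("Subj", "much of ATM's screen performance", "ABSTRACT"),
    Child("cdgt", "depends on the underlying application", "STATE"),
    Child("Bng-cuoicau", ".", "NONE"),
]

default_result = reduce_node(children, [("default", "1 2")])
human_only_result = reduce_node(children, [({"Subj": "HUMAN"}, "1 2")])
```

With the default condition, default_result keeps only the Subj and cdgt spans. With the "HUMAN" condition the rule does not fire for this guessed subject meaning, so human_only_result keeps every child. The sketch covers one node only; removing "screen" inside the subject would presumably be the work of a rule at a lower node.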
Thus, the problem of semantic parsing using syntax control can be stated as follows. Given a sequence of child nodes $n_1, n_2, \ldots, n_k$ of a node $N$, each node $n_i$ has a list of meanings, and each meaning is associated with a symbol meaning. The syntax rule for the node $N$ is associated with a set of condition rules, and each condition rule is mapped to a specific generation rule.</Paragraph> <Paragraph position="25"> Find the most likely condition rule for that node sequence. This problem can be solved by using a variant of the Viterbi algorithm (A.J. Viterbi, 1967).</Paragraph> <Paragraph position="26"> First, for each semantic-information of a child node, we record the indices of all condition rules that it satisfies. Second, we find all sequences that come from the same condition rule.</Paragraph> <Paragraph position="27"> Algorithm 1 A definition of condition rules algorithm. FindRule(N)
Require: Input: N is a node
Ensure: A syntax control for a rule
{Initialization step:}
for i = 1 to k do
  for j = 1 to K_i do
    Set stack s[i] = all index rules in the set of condition rules that satisfy n_i.sem = n_i[j]
  end for
  for i = 1 to K_1 do
    Cost[0][i] = 1;
    Back[0][i] = 0;
  end for
end for
{Iteration step:}
for i = 1 to k do
  for j = 1 to K_i do
    Cost[i][j] = max Cost[i - 1][l] PS</Paragraph> <Paragraph position="29"> After defining a set of semantic-information for each internal node, we obtain the semantic parsing framework shown in Algorithm 2. 
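As a rough illustration of the rule search for one node, the following Python sketch follows the two steps described above: it records which condition rules each child can satisfy, then keeps the rules that every child can satisfy. It swaps the Viterbi variant for a plain set intersection and should not be read as Algorithm 1. The function name find_rule, the tuple encoding of a condition rule, and the None wildcard are assumptions of this sketch. Because the text gives no scoring function, the lowest surviving rule index stands in for the most likely rule.

```python
def find_rule(children_meanings, condition_rules):
    """Return (rule index, chosen meanings) for a satisfiable rule, or None.

    children_meanings: one list per child holding the symbol meanings
        available to that child, i.e. n_i[1], ..., n_i[K_i].
    condition_rules: tuples (v_1, ..., v_k); a None entry places no
        constraint on that child (a convention assumed by this sketch).
    """
    k = len(children_meanings)

    # Step 1: for each child, collect the indices of the rules it can satisfy.
    compatible = []
    for i in range(k):
        rules_i = set()
        for r, rule in enumerate(condition_rules):
            if rule[i] is None or rule[i] in children_meanings[i]:
                rules_i.add(r)
        compatible.append(rules_i)

    # Step 2: keep only the rules that every child can satisfy.
    surviving = set(range(len(condition_rules)))
    for rules_i in compatible:
        surviving = surviving.intersection(rules_i)
    if not surviving:
        return None

    # Arbitrary tie-break: take the lowest rule index.
    best = min(surviving)
    rule = condition_rules[best]
    chosen = []
    for i in range(k):
        if rule[i] is not None:
            chosen.append(rule[i])
        else:
            # Unconstrained child: fall back to its first listed meaning.
            chosen.append(children_meanings[i][0] if children_meanings[i] else None)
    return best, chosen
```

For example, with child meanings [["HUMAN", "ANIMAL"], ["ACTION"]] and the hypothetical rules ("OBJECT", "ACTION"), ("HUMAN", "ACTION") and (None, "ACTION"), the second and third rules survive and the sketch returns the second one together with the meanings it fixes.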
Our semantic parsing using syntax control is fast because the syntax control rule for each tree node is found by dynamic programming.</Paragraph> <Paragraph position="30"> Algorithm 2 Semantic parsing algorithm Require: A syntax tree and a set of syntax controls for each node of the syntax tree.</Paragraph> <Paragraph position="31"> Ensure: A syntax tree with rich semantic information. The input of the generation process is the syntax tree annotated with rich information by the semantic parsing process. The tree is traversed bottom-up, and at each tree node a short sub-sentence is generated using the corresponding generation rule. Because each tree node has two generation rules, we obtain two reduced sentences in two different languages.</Paragraph> </Section> </Section> </Paper>