<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1040"> <Title>A Deterministic Word Dependency Analyzer Enhanced With Preference Learning</Title> <Section position="4" start_page="0" end_page="96" type="metho"> <SectionTitle> 3 Results </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 Root-Node Finder </SectionTitle> <Paragraph position="0"> For the Root-Node Finder, we used a quadratic kernel K(xi, xj) = (xi · xj + 1)^2 because it was better than the linear kernel in preliminary experiments.</Paragraph> <Paragraph position="1"> When we used the 'correct' POS tags given in the Penn Treebank and the 'correct' base NP tags given by a tool provided by the CoNLL 2000 shared task (Matsumoto, 2003), the root accuracy was better than that of Charniak's MEIP and Collins' Model 3 parser.</Paragraph> <Paragraph position="2"> We also conducted an experiment to judge the effectiveness of the base NP chunker. Here, we used only the first 10,000 sentences (about 1/4) of the training data. When we used all the features described above and the POS tags given in the Penn Treebank, the root accuracy was 95.4%. When we removed the base NP information (bi, Li, Ri), it dropped to 94.9%. Therefore, the base NP information improves the Root-Node Finder's performance.</Paragraph> <Paragraph position="3"> Figure 3 compares SVM and Preference Learning in terms of the root accuracy. We used the first 10,000 sentences for training again. According to this graph, Preference Learning is better than SVM, but the difference is small. (Both are better than Maximum Entropy Modeling, which yielded RA = 91.5% for the same data.) The soft margin parameter C does not affect the scores very much unless it is too small. In this experiment, we used the Penn Treebank's 'correct' POS tags.
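The quadratic kernel used here can be written down directly. The following is a minimal Python sketch; the toy indicator vectors stand in for the paper's actual word/POS features, which are not reproduced here:

```python
import numpy as np

def quadratic_kernel(xi, xj):
    """Quadratic polynomial kernel K(xi, xj) = (xi . xj + 1)^2,
    which implicitly includes all pairwise feature conjunctions."""
    return (np.dot(xi, xj) + 1.0) ** 2

# Toy binary indicator vectors (hypothetical features, for illustration only).
a = np.array([1.0, 0.0, 1.0])
b = np.array([1.0, 1.0, 0.0])
print(quadratic_kernel(a, b))  # (1 + 1)^2 = 4.0
```

The implicit conjunction features are one reason a quadratic kernel can beat a linear one on sparse indicator features of this kind.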
When we used Collins' POS tags, the scores dropped by about one point.</Paragraph> </Section> <Section position="2" start_page="0" end_page="96" type="sub_section"> <SectionTitle> 3.2 Dependency Analyzer and PPAR </SectionTitle> <Paragraph position="0"> For dependency learning, we again used the quadratic kernel because it gives the best results according to Yamada's experiments. Following Yamada's experiments, the soft margin parameter C was set to 1. We conducted an experiment to judge the effectiveness of the Root-Node Finder.</Paragraph> <Paragraph position="1"> We follow Yamada's definition of accuracy, which excludes punctuation marks.</Paragraph> <Paragraph position="2"> We also compared SVM and Preference Learning in terms of the Dependency Accuracy of prepositions. SVM's performance is unstable for this task, and Preference Learning outperforms SVM. (We could not obtain Maximum Entropy Modeling scores because of a memory shortage.) Table 2 shows the improvement given by PPAR.</Paragraph> <Paragraph position="3"> Since training PPAR takes a very long time, we used only the first 35,000 sentences of the training data. We also calculated the Dependency Accuracy of Collins' Model 3 parser's output for section 23. According to this table, PPAR is better than the Model 3 parser.</Paragraph> <Paragraph position="4"> Now, we use PPAR's output for each preposition instead of the dependency parser's output unless the modification turns the dependency tree into a non-tree graph. Table 3 compares the proposed method with other methods in terms of accuracy. According to this table, the proposed method is close to the phrase structure parsers except in Complete Rate.
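Yamada's punctuation-excluding accuracy measure can be illustrated as follows. This is a minimal sketch; the punctuation tag set and the toy head indices are assumptions for illustration, not taken from the paper:

```python
def dependency_accuracy(gold_heads, pred_heads, pos_tags):
    """Fraction of non-punctuation words whose predicted head index
    matches the gold head, following Yamada's convention of
    excluding punctuation marks from the count."""
    punct = {".", ",", ":", "``", "''"}  # assumed punctuation POS tags
    total = correct = 0
    for g, p, t in zip(gold_heads, pred_heads, pos_tags):
        if t in punct:
            continue  # punctuation is not scored
        total += 1
        correct += (g == p)
    return correct / total if total else 0.0

# Toy sentence: heads are token indices (-1 marks the root word).
gold = [-1, 0, 3, 0]
pred = [-1, 0, 0, 0]
tags = ["VBD", "NN", "NN", "."]
print(dependency_accuracy(gold, pred, tags))  # 2 of 3 scored words correct
```

Complete Rate would additionally require every scored word in a sentence to receive its correct head.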
Without PPAR, the Dependency Accuracy dropped to 90.9% and the Complete Rate dropped to 39.7%.</Paragraph> </Section> </Section> <Section position="5" start_page="96" end_page="96" type="metho"> <SectionTitle> 4 Discussion </SectionTitle> <Paragraph position="0"> We used Preference Learning to improve the SVM-based Dependency Analyzer for root-node finding and PP-attachment resolution. Preference Learning gave better scores than Collins' Model 3 parser on these subproblems. Therefore, we expect that our method is also applicable to phrase structure parsers. Root-node finding seems relatively easy, and SVM worked well for it. However, PP attachment is more difficult, and SVM's behavior was unstable whereas Preference Learning was more robust. We want to fully exploit Preference Learning for dependency analysis and parsing, but training takes too long. (Empirically, it takes O(ℓ^2) time or more.) Further study is needed to reduce the computational complexity. (Since we used Isozaki's methods (Isozaki and Kazawa, 2002), the run-time complexity is not a problem.) Kudo and Matsumoto (2002) proposed an SVM-based Dependency Analyzer for Japanese sentences. Japanese word dependency is simpler because no word modifies a word to its left. Collins and Duffy (2002) improved Collins' Model 2 parser by reranking possible parse trees. Shen and Joshi (2003) also used the preference kernel K(xi, xj) for reranking. They compare parse trees, whereas our system compares words.</Paragraph> </Section> </Paper>