<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2934">
  <Title>Multi-lingual Dependency Parsing with Incremental Integer Linear Programming</Title>
  <Section position="5" start_page="227" end_page="228" type="metho">
    <SectionTitle>
3 System Summary
</SectionTitle>
    <Paragraph position="0"> We use four different feature sets. The first feature set, BASELINE, is taken from McDonald and Pereira (2005b). It uses the FORM and the POSTAG fields. This set also includes features that combine the label and POS tag of head and child such as (Label,POSHead) and (Label,POSChild-1). For our Arabic and Japanese development sets we obtained the best results with this configuration. We also use this configuration for Chinese, German and Portuguese because training with other configurations took too much time (more than 7 days).</Paragraph>
    <Paragraph position="1"> BASELINE also uses a pseudo-coarse-POS tag (the first character of the POSTAG) and a pseudo-lemma tag (4 characters of the FORM when its length is more than 3). For the next configuration we replace these pseudo-tags with the CPOSTAG and LEMMA fields that were given in the data. This configuration was used for Czech because training with other configurations could not be finished in time.</Paragraph>
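The two pseudo-tags described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names are hypothetical, and the sketch assumes that the 4 characters are taken from the start of the FORM and that shorter forms are kept unchanged, neither of which the text states.

```python
def pseudo_cpostag(postag):
    """Return the first character of the POSTAG as a stand-in coarse POS tag."""
    return postag[:1]


def pseudo_lemma(form, n_chars=4):
    """Return a stand-in lemma built from the FORM.

    Assumption: the characters come from the start of the FORM.
    Assumption: forms of length 3 or less are returned unchanged.
    """
    if len(form) > 3:
        return form[:n_chars]
    return form


if __name__ == "__main__":
    print(pseudo_cpostag("NN"), pseudo_lemma("parsing"), pseudo_lemma("the"))
```

Replacing these functions with a lookup of the CPOSTAG and LEMMA columns gives the second configuration.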
    <Paragraph position="2"> The third feature set tries to exploit the generic FEATS field, which can contain a list of features such as case and gender. A set of features per dependency is extracted using this information. It consists of the cross product of the features in FEATS. We used this configuration for Danish, Dutch, Spanish  CPOSTAG and LEMMA fields for the head. This configuration is used for Slovene and Swedish data where it performed best during development.</Paragraph>
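A cross product over FEATS values, as used by the third feature set, might look like the following Python sketch. It relies on the CoNLL-X convention that FEATS entries are separated by a vertical bar and that an underscore marks an empty field. Pairing every head feature with every child feature is an assumption made here for illustration, since the text does not say which two sets are crossed; the function names and the "*" separator are likewise hypothetical.

```python
from itertools import product


def parse_feats(feats):
    """Split a CoNLL-X FEATS string such as 'case=nom|gender=fem' into a list."""
    if not feats or feats == "_":
        return []
    return feats.split("|")


def feats_cross_product(head_feats, child_feats):
    """Pair each head feature with each child feature.

    Assumption: the cross product runs over head features times child features.
    """
    pairs = product(parse_feats(head_feats), parse_feats(child_feats))
    return [f"{h}*{c}" for h, c in pairs]


if __name__ == "__main__":
    for feature in feats_cross_product("case=nom|gender=fem", "case=acc"):
        print(feature)
```

The number of generated features is the product of the two list lengths, so it grows quickly for morphologically rich languages.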
    <Paragraph position="3"> Finally, we add constraints for Chinese, Dutch, Japanese and Slovene. In particular, we add arity constraints for Chinese and Slovene, coordination and arity constraints for Dutch, and arity and selective projectivity constraints for Japanese. For all experiments b was set to 2. We did not apply additional constraints to any other language due to lack of time.</Paragraph>
  </Section>
</Paper>