<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-3009">
  <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics. Integrated Morphological and Syntactic Disambiguation for Modern Hebrew</Title>
  <Section position="7" start_page="52" end_page="53" type="evalu">
    <SectionTitle>
4.4 Results
</SectionTitle>
    <Paragraph position="0"> Table 4 shows the evaluation scores for models I-A to II-C. To the best of our knowledge, these are the first parsing results for MH assuming no manual interference for morphological disambiguation.</Paragraph>
    <Paragraph position="1"> For all tag sets, parsing tagged segments (Model II) improves by up to 2% over parsing sequences of bare segments (Model I). This indicates that morphosyntactic information selected in tandem with morphological segmentation is more informative for syntactic analysis than segmentation alone. We also observe decreasing string coverage for Model II, possibly because disambiguation based on short context may yield a probable, yet incorrect, POS tag assignment for which the parser cannot recover a syntactic analysis. Correct disambiguation may depend on long-distance cues, e.g., agreement, so we advocate percolating the ambiguity further up to the parser.</Paragraph>
    <Paragraph position="2"> Comparing the performance for the different tag sets, parsing accuracy increases for models I-B/C and II-B/C while POS tagging results decrease.</Paragraph>
    <Paragraph position="3"> These results seem to contradict the common wisdom that performance on a 'complex' task depends on a 'simpler', preceding one; yet, they support our thesis that morphological information orthogonal to syntactic categories facilitates syntactic analysis and improves disambiguation capacity. (Footnote 12: Since we evaluate the models' performance on an integrated task, sentences in which one of the subcomponents failed to propose an analysis count as zero for all subtasks.)</Paragraph>
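    <Paragraph position="4"> The scoring convention of footnote 12 can be illustrated with a minimal sketch. It assumes per-sentence scores are available for each subtask; the subtask names, the function integrated_scores, and the toy numbers are hypothetical and are not taken from the paper's evaluation code.

```python
from statistics import mean

# Hypothetical subtask names, chosen for illustration only.
SUBTASKS = ("segmentation", "tagging", "parsing")


def integrated_scores(sentences):
    """Average per-subtask scores, zeroing every subtask of a failed sentence.

    Each sentence is a dict mapping a subtask name to a score in [0, 1],
    or to None when that subcomponent proposed no analysis.
    """
    collected = {task: [] for task in SUBTASKS}
    for scores in sentences:
        # One failing subcomponent makes the whole sentence count as zero.
        failed = any(scores.get(task) is None for task in SUBTASKS)
        for task in SUBTASKS:
            collected[task].append(0.0 if failed else scores[task])
    # An empty evaluation set is reported as 0.0 rather than raising.
    return {task: (mean(vals) if vals else 0.0) for task, vals in collected.items()}


# Toy data (not from the paper): the parser fails on the second sentence.
example = [
    {"segmentation": 1.0, "tagging": 0.9, "parsing": 0.8},
    {"segmentation": 1.0, "tagging": 0.8, "parsing": None},
]
result = integrated_scores(example)
```

    Under this convention a parser failure also lowers the reported segmentation and tagging averages, which is one reason reduced coverage for Model II can depress every column of the results at once.</Paragraph>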
  </Section>
  <Section position="8" start_page="53" end_page="53" type="evalu">
    <SectionTitle>
5 Discussion
</SectionTitle>
    <Paragraph position="0"> Devising a baseline model for morphological and syntactic processing is of great importance for the development of a broad-coverage statistical parser for MH. Here we provide a set of standardized baseline results for later comparison while consolidating the formal and architectural underpinning of an integrated model. However, our results were obtained using a relatively small set of training data and a weak (unlexicalized) parser, due to the size of the corpus and its annotation scheme. Training a PCFG on our treebank resulted in a severely ambiguous grammar, mainly due to high phrase structure variability.</Paragraph>
    <Paragraph position="1"> To compensate for the flat, ambiguous phrase structures, in the future we intend to employ probabilistic grammars in which all levels of non-terminals are augmented with morphological information percolated up the tree. Furthermore, the MH treebank annotation scheme features a set of so-called functional features which express grammatical relations. We propose to learn the correlation between various morphological markings and functional features, thereby constraining the space of syntactic structures to those which express meaningful predicate-argument structures.</Paragraph>
    <Paragraph position="2"> Since our data set is relatively small, introducing morphological information orthogonal to syntactic categories may result in severe data sparseness. In the current architecture, smoothing is handled separately by each of the subcomponents.</Paragraph>
    <Paragraph position="3"> Enriched grammars would allow us to exploit multiple levels of information in smoothing the estimated probabilities and to redistribute probability mass to unattested events based on their similarity to attested events in their integrated representation.</Paragraph>
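    <Paragraph position="4"> One way to picture such redistribution, as a sketch only: linearly interpolate the relative frequency of a rule over morphologically enriched categories with the relative frequency of the same rule over bare categories. The paper does not specify a smoothing method; the NP.fem notation, the weight lam, the helper names, and the toy counts below are assumptions made for illustration, and the result is a score that is not renormalized over enriched rules.

```python
from collections import Counter


def strip_morph(symbol):
    """Map an enriched symbol such as 'NP.fem' to its bare category 'NP'."""
    return symbol.split(".", 1)[0]


def bare_rule(rule):
    """Drop morphological marking from both sides of a (lhs, rhs) rule."""
    lhs, rhs = rule
    return strip_morph(lhs), tuple(strip_morph(s) for s in rhs)


def relative_freq(rule_counts):
    """Relative-frequency estimates P(rhs | lhs) from rule counts."""
    lhs_totals = Counter()
    for (lhs, _rhs), count in rule_counts.items():
        lhs_totals[lhs] += count
    return {rule: n / lhs_totals[rule[0]] for rule, n in rule_counts.items()}


def smoothed_score(rule, enriched_counts, lam=0.7):
    """Interpolate the enriched estimate with a back-off to bare categories."""
    bare_counts = Counter()
    for enriched, count in enriched_counts.items():
        bare_counts[bare_rule(enriched)] += count
    p_enriched = relative_freq(enriched_counts).get(rule, 0.0)
    p_bare = relative_freq(bare_counts).get(bare_rule(rule), 0.0)
    return lam * p_enriched + (1.0 - lam) * p_bare


# Toy counts (hypothetical): the masculine noun-adjective rule is unattested.
counts = Counter({
    ("NP.fem", ("N.fem", "ADJ.fem")): 3,
    ("NP.fem", ("N.fem",)): 1,
    ("NP.masc", ("N.masc",)): 4,
})
unseen = smoothed_score(("NP.masc", ("N.masc", "ADJ.masc")), counts)
seen = smoothed_score(("NP.fem", ("N.fem", "ADJ.fem")), counts)
```

    Here the unattested masculine rule receives a non-zero score because it resembles the attested feminine rule once morphology is stripped, which is the kind of similarity-based redistribution described above.</Paragraph>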
  </Section>
</Paper>