<?xml version="1.0" standalone="yes"?>
<Paper uid="P97-1024">
  <Title>Independence Assumptions Considered Harmful</Title>
  <Section position="7" start_page="187" end_page="188" type="concl">
    <SectionTitle>
5 Conclusions
</SectionTitle>
    <Paragraph position="0"> We have contrasted two types of statistical language models: A model that derives a probability distribution over the response variable that is properly conditioned on the combination of the explanatory variable, and a simpler model that treats the explanatory variables as independent, and therefore models the response variable simply a~s the addition of the individual main effects of the explanatory variables.</Paragraph>
    <Paragraph position="1"> 2These features use tile s~unc Mutual Informationba.~ed measure of lcxic',d a.sso(:iation a.s tim prc.vious log-linear model for two possibh~&amp;quot; attachment sites, which wcrc estimated from all nomin'M azt(l vcrhal PP att~t(:hments in the corpus. The features FIRST-NOUN-LEVEL aaM SECOND-NOUN-LEVEL use the same estimates: in other words, in contrm~t to the &amp;quot;split Lexi(:al Association&amp;quot; method, they were not estimated sepaxatcly for the two different nominaJ, attachment sites.</Paragraph>
    <Paragraph position="2">  The experimental results show that, with the same feature set, inodeling feature interactions yields better performance: such nmdels achieves higher accuracy, and its accura~,y can be raised with additional features. It is interesting to note that modeling variable interactions yields a higher perforlnanee gain than including additional explanatory variables.</Paragraph>
    <Paragraph position="3"> While these results do not prove that modeling feature interactions is necessary, we believe that they provide a strong indication. This suggests a mlmber of avenues for filrther research.</Paragraph>
    <Paragraph position="4"> First, we could attempt to improve the specific models that were presented by incorporating additional features, and perhal)S by taking into account higher-order features. This might help to address the performance gap between our models and human subjects that ha,s been documented in the literature, z A more ambitious idea would be to use a statistical model to rank overall parse quality for entire sentences. This would be an improvement over schemes that a,ssnlne independence between a number of individual scoring fimctions, such ms (Alshawi and Carter, 1994). If such a model were to include only a few general variables to account for such features a.~ lexical a.ssociation and recency preference for syntactic attachment, it might even be worthwhile to investigate it a.s an approximation to the human parsing mechanism.</Paragraph>
  </Section>
class="xml-element"></Paper>