<?xml version="1.0" standalone="yes"?> <Paper uid="W97-0101"> <Title>Summary of Invited Speech Mitch Marcus</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> PART I </SectionTitle> <Paragraph position="0"> Over the past several years, there has been very significant and continuing progress in the development of accurate parsers for unconstrained text; much of this breakthrough has depended crucially on the use of statistical methods which estimate model parameters from a &quot;tree bank&quot; of hand-parsed sentences. This use of statistical models was clearly inspired by the use of statistical methods for speech recognition, where self-organizing systems based on statistics are now beginning to achieve commercial success, and greatly outperform systems which attempt to explicitly encode linguistic knowledge. Many of the techniques used in statistical parsing derive rather directly from methods used for speech recognition; this is particularly true of methods for dealing with sparse data.</Paragraph> <Paragraph position="1"> This progress in parsing accuracy continues; within the last month, several new parsers have been reported, two developed within the speaker's group at the University of Pennsylvania, which show distinct improvement over the best parsing results of even a year ago. Most surprisingly, each of these parsers incorporates a very different model, yet they perform similarly.</Paragraph> <Paragraph position="2"> This talk will argue that this progress in parsing, unlike the earlier progress in speech recognition, depends crucially on the combination of explicit linguistic representation and statistical estimation. I will claim that the reason these recent systems perform so similarly is that they explicitly encode very similar levels of linguistic representation. 
Furthermore, even in the use of smoothing techniques to deal with sparse data, paying close attention to underlying representational issues is often crucial; systems that utilize designer-encoded linguistic knowledge about representations currently significantly outperform representation-poor &quot;self-organizing&quot; smoothing methods.</Paragraph> </Section> </Paper>