File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/98/p98-2184_abstr.xml
Size: 1,699 bytes
Last Modified: 2025-10-06 13:49:27
<?xml version="1.0" standalone="yes"?> <Paper uid="P98-2184"> <Title>How Verb Subcategorization Frequencies Are Affected By Corpus Choice</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> The probabilistic relation between verbs and their arguments plays an important role in modern statistical parsers and supertaggers, and in psychological theories of language processing. But these probabilities are computed in very different ways by the two sets of researchers. Computational linguists compute verb subcategorization probabilities from large corpora while psycholinguists compute them from psychological studies (sentence production and completion tasks).</Paragraph> <Paragraph position="1"> Recent studies have found differences between corpus frequencies and psycholinguistic measures. We analyze subcategorization frequencies from four different corpora: psychological sentence production data (Connine et al. 1984), written text (Brown and WSJ), and telephone conversation data (Switchboard). We find two different sources for the differences.</Paragraph> <Paragraph position="2"> Discourse influence is a result of how verb use is affected by different discourse types such as narrative, connected discourse, and single sentence productions. Semantic influence is a result of different corpora using different senses of verbs, which have different subcategorization frequencies. We conclude that verb sense and discourse type play an important role in the frequencies observed in different experimental and corpus based sources of verb subcategorization frequencies.</Paragraph> </Section> class="xml-element"></Paper>