<?xml version="1.0" standalone="yes"?> <Paper uid="C00-2105"> <Title>Robust German Noun Chunking With a Probabilistic Context-Free Grammar</Title> <Section position="6" start_page="730" end_page="731" type="concl"> <SectionTitle> 5 Summary </SectionTitle> <Paragraph position="0"> We presented a German noun chunker for unrestricted text. The chunker is based on a head-lexicalised probabilistic context-free grammar and trained on unlabelled data. The base grammar was semi-automatically augmented with robustness rules in order to cover unrestricted input. An algorithm for chunk extraction was developed which maximises the probability of the chunk sets rather than the probability of single parses like the Viterbi algorithm. German noun chunks were detected with 93% precision and 92% recall. Asking the chunker to additionally identify the syntactic category and the case of the chunks resulted in recall of 83% and precision of 84%. A comparison of different training strategies showed that unlexicalised parsing information was sufficient for noun chunk extraction with and without case information. The base grammar played an important role in the chunker development: (i) building the chunker on the basis of an already trained grammar improved the chunker rules, and (ii) refining the base grammar with even simple verb-first and verb-second rules improved accuracy, so it should be worthwhile to further extend the grammar rules. Increasing the amount of training data also improved noun chunk recognition, especially case disambiguation. Better heuristics for guessing the parts-of-speech of unknown words should further improve the noun chunk recognition, since many errors were caused by unknown words.</Paragraph> </Section> </Paper>