<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-1044">
  <Title>for Psycholinguistics</Title>
  <Section position="7" start_page="347" end_page="347" type="concl">
    <SectionTitle>
6 Conclusions
</SectionTitle>
    <Paragraph position="0"> In this paper we have investigated a number of methods for the empirical estimation of probabilistic context-free grammars, and have shown that the resulting grammars have the so-called consistency property. This property guarantees that all the probability mass of the grammar is used for the finite strings it derives. Thus if the grammar is used in combination with other probabilistic models, as for instance in a speech processing system, consistency allows us to combine or compare scores from different modules in a sound way.</Paragraph>
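As an illustration of the consistency property (the grammar below is our own example, not one from the paper): for the PCFG with rules S → S S (probability p) and S → a (probability 1 − p), the total probability mass assigned to finite strings is the least fixed point of z = p·z² + (1 − p), which equals 1 exactly when p ≤ 1/2. A minimal numerical sketch:

```python
def partition_mass(p_branch, iters=10000):
    # Grammar (hypothetical example): S -> S S with prob. p_branch,
    #                                 S -> a   with prob. 1 - p_branch.
    # Iterate z_{n+1} = p*z_n^2 + (1 - p) from z_0 = 0; the limit is the
    # probability that a derivation from S terminates, i.e. the total
    # mass the grammar puts on finite strings.
    z = 0.0
    for _ in range(iters):
        z = p_branch * z * z + (1.0 - p_branch)
    return z
```

For p = 0.4 the iteration converges to 1.0 (consistent); for p = 0.6 it converges to 2/3 (inconsistent: one third of the mass leaks into infinite derivations).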
    <Paragraph position="1"> To obtain our results, we have used a novel proof technique that exploits an already known construction for the renormalization of probabilistic context-free grammars. Our proof technique seems more intuitive than those previously used in the literature to establish consistency, which rely on counting arguments or on spectral analysis. It is not difficult to see that our proof technique can also be applied to probabilistic rewriting formalisms whose underlying derivations can be characterized by means of context-free rewriting. This is for instance the case with probabilistic tree-adjoining grammars (Schabes, 1992; Sarkar, 1998), for which consistency results have not yet been shown in the literature.</Paragraph>
    <Paragraph position="2"> A Cross-entropy minimization In order to make this paper self-contained, we sketch a proof of the claim in Section 3 that the estimator in (12) minimizes the cross-entropy in (11). A full proof appears in (Corazza and Satta, 2006).</Paragraph>
    <Paragraph position="3"> Let D, p_D, and G = (N, Σ, R, S) be defined as in Section 3. We want to find a proper PCFG G = (G, p_G) such that the cross-entropy H(p_D || p_G) is minimal. We use Lagrange multipliers λ_A for each A ∈ N</Paragraph>
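The displayed equations at this point were lost in extraction; a plausible reconstruction of the Lagrangian form, consistent with the partial derivatives taken in the steps that follow (the original typesetting is not recoverable, so symbol choices here are our assumption):

```latex
\nabla(\lambda, p_G)
  = \sum_{A \in N} \lambda_A \Bigl(1 - \sum_{\alpha} p_G(A \to \alpha)\Bigr)
  + \sum_{d} p_D(d) \log p_G(d),
\qquad
p_G(d) = \prod_{(A \to \alpha) \in R} p_G(A \to \alpha)^{f(A \to \alpha,\, d)} .
```

With this form, ∂∇/∂p_G(A → α) = 0 gives p_G(A → α) · λ_A = E_{p_D} f(A → α, d), and ∂∇/∂λ_A = 0 recovers the properness constraints, matching the two steps below.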
    <Paragraph position="5"> By setting to zero all of the above partial derivatives, we obtain a system of |N| + |R| equations, which we must solve. From ∂∇/∂p_G(A → α) = 0 we obtain</Paragraph>
    <Paragraph position="7"> We sum over all strings α such that (A → α) ∈ R, which gives λ_A = E_{p_D} f(A, d). (28) From each equation ∂∇/∂λ_A = 0 we obtain Σ_α p_G(A → α) = 1 for each A ∈ N (our original constraints). Combining this with (28) we obtain</Paragraph>
    <Paragraph position="9"> This is the estimator introduced in Section 3.</Paragraph>
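Concretely, the estimator p_G(A → α) = E_{p_D} f(A → α, d) / E_{p_D} f(A, d) can be sketched in code. The representation of the distribution p_D as a list of (weight, rule-count) pairs is our assumption, introduced only for illustration:

```python
from collections import defaultdict

def estimate_pcfg(weighted_trees):
    """weighted_trees: list of (weight, rule_counts) pairs, where weight
    is p_D(d) for a derivation d and rule_counts maps a rule (A, alpha)
    to its count f(A -> alpha, d) in d.  Returns the cross-entropy
    minimizing rule probabilities:
        p_G(A -> alpha) = E_pD f(A -> alpha, d) / E_pD f(A, d).
    """
    rule_exp = defaultdict(float)  # E_pD f(A -> alpha, d), per rule
    lhs_exp = defaultdict(float)   # E_pD f(A, d), per left-hand side
    for weight, counts in weighted_trees:
        for (lhs, rhs), count in counts.items():
            rule_exp[(lhs, rhs)] += weight * count
            lhs_exp[lhs] += weight * count
    return {rule: v / lhs_exp[rule[0]] for rule, v in rule_exp.items()}
```

Note that the resulting grammar is proper by construction: for each nonterminal A, the estimated probabilities of its rules sum to 1.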
  </Section>
</Paper>