<?xml version="1.0" standalone="yes"?> <Paper uid="N04-1011"> <Title>Sentence-Internal Prosody Does not Help Parsing the Way Punctuation Does</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> NORM LAST RHYME DUR, FOK WRD DIFF MNMN N, FOK LR MEAN KBASELN and SLOPE MEAN DIFF N in </SectionTitle> <Paragraph position="0"> the data provided by Ferrer et al. (2002).</Paragraph> <Paragraph position="1"> While Ferrer (2002) should be consulted for full details, PAU DUR N is pause duration normalized by the speaker's mean sentence-internal pause duration, NORM LAST RHYME DUR is the duration of the phone minus the mean phone duration, normalized by the standard deviation of the phone duration for each phone in the rhyme, FOK WRD DIFF MNMN N is the log of the mean f0 of the current word divided by the log mean f0 of the following word, normalized by the speaker's mean range,</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> FOK LR MEAN KBASELN is the log of the mean f0 </SectionTitle> <Paragraph position="0"> of the word normalized by the speaker's baseline, and</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> SLOPE MEAN DIFF N is the difference in the f0 slope </SectionTitle> <Paragraph position="0"> normalized by the speaker's mean f0 slope.</Paragraph> <Paragraph position="1"> These variables all range over continuous values.</Paragraph> <Paragraph position="2"> Modern statistical parsing technology has been developed assuming that all of the input variables are categorical, and currently our parser can only use categorical inputs. 
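For concreteness, the normalizations sketched above might be computed roughly as follows. This is an illustrative reconstruction, not Ferrer et al.'s code; the function names, argument names, and the exact placement of the log and baseline terms are assumptions, and the cited papers should be consulted for the real definitions.

```python
import math

def pau_dur_n(pause_dur, speaker_mean_pause_dur):
    """Pause duration normalized by the speaker's mean
    sentence-internal pause duration (assumed to be a simple ratio)."""
    return pause_dur / speaker_mean_pause_dur

def norm_last_rhyme_dur(rhyme_phone_durs, phone_mean_durs, phone_sd_durs):
    """For each phone in the rhyme: duration minus that phone's mean
    duration, normalized by the standard deviation of its duration."""
    return [(d - m) / s
            for d, m, s in zip(rhyme_phone_durs, phone_mean_durs, phone_sd_durs)]

def fok_lr_mean_kbaseln(mean_f0, speaker_baseline):
    """Log of the word's mean f0, normalized by the speaker's baseline
    (whether the baseline itself is on a log scale is an assumption here)."""
    return math.log(mean_f0) / speaker_baseline
```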
Given the complexity of the dynamic programming algorithms used by the parser, it would be a major research undertaking to develop a statistical parser of the same quality as the one used here that is capable of using both categorical and continuous variables as input.</Paragraph> <Paragraph position="3"> In the experiments below we binned the continuous prosodic variables to produce the actual categorical values used in our experiments. Binning involves a trade-off, as fewer bins involve a greater loss of information, whereas a large number of bins splits the data so finely that the statistical models used in the parser fail to generalize. We binned by first constructing a histogram of each feature's values and then dividing these values into bins in such a way that each bin contained the same number of samples. In runs in which a single feature is the sole prosodic feature we divided that feature's values into 10 bins, while in runs in which two or more prosodic features were conjoined we divided each feature into 5 bins.</Paragraph> <Paragraph position="4"> While not reported here, we experimented with a wide variety of different binning strategies, including using the bins proposed by Ferrer et al. (2002). In fact the number of bins used does not affect the results markedly; we obtained virtually the same results with only two bins.</Paragraph> <Paragraph position="5"> We generated pseudo-punctuation symbols based on these binned values and inserted them into the parser input as described below. In general, a pseudo-punctuation symbol is the conjunction of the binned values of all of the prosodic features used in a particular run. 
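This equal-frequency binning can be sketched as follows (a minimal hypothetical helper, not the authors' code; note that ties falling across a bin boundary are split arbitrarily here, whereas a careful implementation would keep equal values in the same bin):

```python
def equal_frequency_bins(values, n_bins):
    """Assign each value a bin index in 0..n_bins-1 such that every bin
    receives (as nearly as possible) the same number of samples."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        # Bin boundaries fall at equal quantiles of the sorted values.
        bins[i] = rank * n_bins // len(values)
    return bins
```

Under the scheme described above, a feature used alone would be passed `n_bins=10`, and each of two or more conjoined features `n_bins=5`.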
When mapping from binned prosodic variables to pseudo-punctuation symbols, some of the binned values can be represented by the absence of a pseudo-punctuation symbol.</Paragraph> <Paragraph position="6"> Because we intend these pseudo-punctuation symbols to be as similar as possible to normal punctuation, we generated pseudo-punctuation symbols only when the corresponding prosodic variable falls outside of its typical values. The ranges are given below, and were chosen so that they align with bin boundaries and result in each type of pseudo-punctuation symbol occurring on 40% of words.</Paragraph> <Paragraph position="7"> Thus when a prosodic feature is used alone only 4 of its 10 bins are represented by a pseudo-punctuation symbol.</Paragraph> <Paragraph position="8"> However, when two or more types of the prosodic pseudo-punctuation symbols are used at once there is a larger number of different pseudo-punctuation symbols and a greater number of words appearing with a following pseudo-punctuation symbol.</Paragraph> <Paragraph position="9"> For example, when P, R and S prosodic annotations are used together there are 89 distinct types of prosodic pseudo-punctuation symbols in our corpus, and 54% of words are followed by a prosodic pseudo-punctuation symbol.</Paragraph> <Paragraph position="10"> The experiments below make use of the following types of pseudo-punctuation symbols, either alone or concatenated in combination. 
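The mapping just described can be sketched as follows. The feature names, the per-feature `atypical` flags (standing in for the threshold tests given below), and the hyphenated conjunction format are illustrative assumptions, not the paper's actual encoding:

```python
def pseudo_punct_symbol(binned, atypical):
    """Conjoin the binned values of the prosodic features used in a run
    into a single pseudo-punctuation symbol (e.g. 'P3-R7').  A feature
    whose value fell inside its typical range contributes nothing, and
    when no feature is atypical no symbol is generated at all (None),
    mirroring the absence of punctuation after most words."""
    parts = ["%s%d" % (feat, b) for feat, b in binned.items() if atypical[feat]]
    return "-".join(parts) if parts else None
```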
See Figure 2 for an example tree with pseudo-punctuation symbols inserted.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> Rb This is based on the bin b of the binned NORM LAST RHYME DUR value, and is only </SectionTitle> <Paragraph position="0"> generated when that value is greater than -0.061.</Paragraph> <Paragraph position="1"> Wb This is based on the bin b of the binned</Paragraph> </Section> <Section position="8" start_page="0" end_page="0" type="metho"> <SectionTitle> FOK WRD DIFF MNMN N value, and is only </SectionTitle> <Paragraph position="0"> generated when that value is less than -0.071 or greater than 0.0814.</Paragraph> <Paragraph position="1"> Lb This is based on the bin b of the</Paragraph> </Section> <Section position="9" start_page="0" end_page="0" type="metho"> <SectionTitle> FOK LR MEAN KBASELN value, and is only </SectionTitle> <Paragraph position="0"> generated when that value is less than 0.157 or greater than 0.391.</Paragraph> <Paragraph position="1"> Sb This is based on the bin b of the</Paragraph> </Section> <Section position="10" start_page="0" end_page="0" type="metho"> <SectionTitle> SLOPE MEAN DIFF N value, and is only </SectionTitle> <Paragraph position="0"> generated whenever that value is non-zero.</Paragraph> <Paragraph position="1"> In addition, we also created a binary version of the P feature in order to evaluate the effect of binarization. 
NP This is based on the PAU DUR N value, and is only generated when that value is greater than 0.285.</Paragraph> <Paragraph position="2"> We actually experimented with a much wider range of binned variables, but they all produced results similar to those described below.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Parse corpus construction </SectionTitle> <Paragraph position="0"> We tried to incorporate the binned prosodic information described in the previous subsection in a manner that corresponds as closely as possible to the way that punctuation is represented in this corpus, because previous experiments have shown that punctuation improves parser performance (Charniak and Johnson, 2001; Engel et al., 2002). We deleted disfluency tags and EDITED subtrees from our training and test corpora.</Paragraph> <Paragraph position="1"> We investigated several combinations of prosodic pseudo-punctuation symbols. For each of these we generated a training and test corpus. The pseudo-punctuation symbols are dominated by a new preterminal PROSODY to produce a well-formed tree.</Paragraph> <Paragraph position="2"> These prosodic local trees are introduced into the tree following the word they describe, and are attached as high as possible in the tree, just as punctuation is in the Penn treebank. Figure 2 depicts a typical tree that contains P, R and S prosodic pseudo-punctuation symbols inserted following the word they describe.</Paragraph> <Paragraph position="3"> We experimented with several other ways of incorporating prosody into parse trees, none of which greatly affected the results. For example, we also experimented with a raised representation in which the prosodic pseudo-punctuation symbol also serves as the preterminal label. The corresponding raised version of the example tree is depicted in Figure 3.</Paragraph> <Paragraph position="4"> The motivation for raising is as follows. 
The statistical parser used for this research generates the siblings of a head in a sequential fashion, first predicting the category label of a sibling and later conditioning on that label to predict the remaining siblings. Raising should permit the generative model to condition not just on the presence of a prosodic pseudo-punctuation symbol but also on its actual identity. If some but not all of the prosodic pseudo-punctuation symbols were especially indicative of some aspect of phrase structure, then the raised structures should permit the parsing model to detect this and condition on just those symbols. Note that in the Penn treebank annotation scheme, different types of punctuation are given different preterminal categories, so punctuation is encoded in the treebank using a raised representation.</Paragraph> <Paragraph position="5"> The resulting corpora contain both prosodic and punctuation information. We prepared our actual training and testing corpora by selectively removing subtrees from these corpora. By removing all punctuation subtrees we obtain corpora that contain prosodic information but no punctuation, by removing all prosodic information we obtain the original treebank data, and by removing both prosodic and punctuation subtrees we obtain corpora that contain neither type of information.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.3 Evaluation </SectionTitle> <Paragraph position="0"> We trained and evaluated the parser on the various types of corpora described in the previous section.</Paragraph> <Paragraph position="1"> The parser was trained and tested on corpora with varying prosodic pseudo-punctuation symbols. The entry punctuation gives the parser's performance on input with standard punctuation, while none gives the parser's performance on input without any punctuation or prosodic pseudo-punctuation whatsoever. (We always tested on the type of corpora that corresponded to the training data). 
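The insertion of prosodic local trees described in Section 2.2 can be illustrated as follows, assuming a simple nested-list tree encoding [label, child, ...]. The helper names are hypothetical and this is a sketch of the attachment rule, not the authors' implementation:

```python
def yield_len(t):
    """Number of words spanned by a subtree (leaves are strings)."""
    return 1 if isinstance(t, str) else sum(yield_len(c) for c in t[1:])

def insert_prosody(tree, word_index, symbol):
    """Insert ['PROSODY', symbol] immediately after the word at
    word_index (0-based), attaching it as high as possible: scanning
    top-down, we stop at the first node where some child's yield ends
    exactly at that word, and add the prosodic local tree as a
    following sibling of that child."""
    boundary = word_index + 1
    def walk(node, offset):
        pos = offset
        for i, child in enumerate(node[1:], start=1):
            end = pos + yield_len(child)
            if end == boundary:
                node.insert(i + 1, ["PROSODY", symbol])
                return True
            if end > boundary:
                # Boundary falls strictly inside this child: recurse.
                return walk(child, pos)
            pos = end
        return False
    walk(tree, 0)
    return tree
```

In the raised variant, the inserted local tree would instead use the pseudo-punctuation symbol itself as the preterminal label (roughly, `[symbol, symbol]` in this encoding), paralleling the way the Penn treebank gives different punctuation types different preterminal categories.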
We evaluated parser performance using the methodology described in Engel et al. (2002), which is a simple adaptation of the well-known PARSEVAL measures in which punctuation and prosody preterminals are ignored. This evaluation yields precision, recall and F-score values for each type of training and test corpus.</Paragraph> </Section> </Section></Paper>