<?xml version="1.0" standalone="yes"?>
<Paper uid="A00-2038">
  <Title>A New Algorithm for the Alignment of Phonetic Sequences</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Identification of the corresponding segments in sequences of phones is a necessary step in many applications in both diachronic and synchronic phonology. Usually we are interested in aligning sequences that represent forms that are related in some way: a pair of cognates, or the underlying and the surface forms of a word, or the intended and the actual pronunciations of a word. Alignment of phonetic sequences presupposes transcription of sounds into discrete phonetic segments, and so differs from matching of utterances in speech recognition. On the other hand, it has much in common with the alignment of proteins and DNA sequences. Many methods developed for molecular biology can be adapted to perform accurate phonetic alignment.</Paragraph>
    <Paragraph position="1"> Alignment algorithms usually contain two main components: a metric for measuring distance between phones, and a procedure for finding the optimal alignment. The former is often calculated on the basis of phonological features that encode certain properties of phones. An obvious candidate for the latter is the well-known dynamic programming (DP) algorithm for string alignment (Wagner and Fischer, 1974), although other algorithms can be used as well. The task of finding the optimal alignment is closely linked to the task of calculating the distance between two sequences. The basic DP algorithm accomplishes both tasks. Depending on the application, either of the results, or both, can be used. Within the last few years, several different approaches to phonetic alignment have been reported.</Paragraph>
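    <Paragraph> The two components described above can be illustrated with a minimal Python sketch. It is not the algorithm of any of the papers cited here: the three binary features, the FEATURES table, the INDEL_COST value, and the function names phone_distance and align are all assumptions chosen for illustration. The sketch pairs a feature-based phone distance with the basic Wagner-Fischer DP recurrence and returns both results mentioned in the text, the distance and one optimal alignment.</Paragraph>
    <Paragraph>
```python
# Hypothetical binary feature vectors (voiced, nasal, labial).
# Illustrative only; not the feature system of any algorithm cited in the text.
FEATURES = {
    "p": (0, 0, 1),
    "b": (1, 0, 1),
    "m": (1, 1, 1),
    "t": (0, 0, 0),
    "d": (1, 0, 0),
    "n": (1, 1, 0),
}

INDEL_COST = 2.0  # assumed cost of aligning a phone with a gap


def phone_distance(a, b):
    """Distance metric: number of features on which two phones differ."""
    return float(sum(x != y for x, y in zip(FEATURES[a], FEATURES[b])))


def align(src, tgt):
    """Basic DP alignment: return (distance, alignment) for two phone sequences."""
    n, m = len(src), len(tgt)
    # dist[i][j] holds the minimal cost of aligning src[:i] with tgt[:j].
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + INDEL_COST
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + INDEL_COST
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist[i][j] = min(
                dist[i - 1][j - 1] + phone_distance(src[i - 1], tgt[j - 1]),
                dist[i - 1][j] + INDEL_COST,  # src phone aligned with a gap
                dist[i][j - 1] + INDEL_COST,  # tgt phone aligned with a gap
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i or j:
        if i and j and dist[i][j] == (
            dist[i - 1][j - 1] + phone_distance(src[i - 1], tgt[j - 1])
        ):
            pairs.append((src[i - 1], tgt[j - 1]))
            i, j = i - 1, j - 1
        elif i and dist[i][j] == dist[i - 1][j] + INDEL_COST:
            pairs.append((src[i - 1], "-"))
            i -= 1
        else:
            pairs.append(("-", tgt[j - 1]))
            j -= 1
    pairs.reverse()
    return dist[n][m], pairs


if __name__ == "__main__":
    distance, alignment = align(["m", "n"], ["n"])
    print(distance, alignment)
```
    </Paragraph>
    <Paragraph> Because every cost in this sketch is a whole number, the equality tests in the traceback are exact; with fractional costs it would be safer to store back-pointers while filling the table. When several alignments share the minimal cost, the traceback returns only one of them, preferring substitutions over gaps.</Paragraph>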
    <Paragraph position="2"> Covington (1996) used depth-first search and a special distance function to align words for historical comparison. In a follow-up paper (Covington, 1998), he extended the algorithm to align words from more than two languages. Somers (1998) proposed a special algorithm for aligning children's articulation data with the adult model. Gildea and Jurafsky (1996) applied the DP algorithm to pre-align input and output phonetic strings in order to improve the performance of their transducer induction system. Nerbonne and Heeringa (1997) employed a similar procedure to compute relative distance between words from various Dutch dialects. Some characteristics of these implementations are juxtaposed in Table 1.</Paragraph>
    <Paragraph position="3"> In this paper, I present a new algorithm for the alignment of cognates. It combines various techniques developed for sequence comparison with an appropriate scoring scheme for computing phonetic similarity on the basis of multivalued features. The new algorithm performs better, in terms of accuracy and efficiency, than comparable algorithms reported by Covington (1996) and Somers (1999). Although the main focus of this paper is diachronic phonology, the techniques proposed here can also be applied in other contexts where it is necessary to align phonetic sequences.</Paragraph>
  </Section>
</Paper>