<?xml version="1.0" standalone="yes"?> <Paper uid="H89-2027"> <Title>The N-Best Algorithm: An Efficient Procedure for Finding Top N Sentence Hypotheses</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> In a spoken language system (SLS) we have a large search problem. We must find the most likely word sequence consistent with all knowledge sources (speech, statistical N-gram, natural language). The natural language (NL) knowledge sources are many and varied, and might include syntax, semantics, discourse, pragmatics, and prosodics. One way to use all of these constraints is to perform a top-down search that, at each point, uses all of the knowledge sources (KSs) to determine which words can come next, and with what probabilities. Assuming an exhaustive search in this space, we can find the most likely sentence. However, since many of these KSs contain &quot;long-distance&quot; effects (for example, agreement between words that are far apart in the input), the search space can be quite large, even when pruned using various beam-search or best-first search techniques. Furthermore, a top-down search strategy requires that all of the KSs be formulated in a predictive, left-to-right manner. This may place an unnecessary restriction on the type of knowledge that can be used.</Paragraph> <Paragraph position="1"> The general solution that we have adopted is to apply the KSs in the proper order to constrain the search progressively. Thus, we trade off the entropy reduction that a KS provides against the cost of applying that KS. Naturally, we can also use a pruning strategy to reduce the search space further. By ordering the various KSs, we attempt to minimize the computational cost and complexity for a given level of search error rate. To do this we apply the most powerful and cheapest KSs first to generate the top N hypotheses. 
Then, these hypotheses are evaluated using the remaining KSs. In the remainder of this paper we present the N-best search paradigm, followed by the N-best search algorithm. Finally, we present statistics of the rank of the correct sentence in a list of the top N sentences using acoustic-phonetic models and a statistical language model.</Paragraph> </Section> </Paper>