<?xml version="1.0" standalone="yes"?>
<Paper uid="H91-1019">
  <Title>A Textual processor to handle ATIS queries</Title>
  <Section position="3" start_page="0" end_page="116" type="metho">
    <SectionTitle>
TEXT PROCESSING FOR OAG
QUERIES
</SectionTitle>
    <Paragraph position="0"> The task of natural language processing (including deviations as found in spontaneous speech) is difficult (e.g., witness the difficulty of automatic machine translation of natural languages).</Paragraph>
    <Paragraph position="1"> However, we have simplified the task here by assuming that the user is querying the OAG database. Thus we have a good idea of the type of questions that will usually be asked, and of the typical subjects of those questions. We do not, however, know the format that any individual user may employ. Furthermore, each user is free to use his or her own style of speaking and choice of words. We start out with a vocabulary that includes all the words (including names) in the OAG database, and extend that vocabulary to include words discovered during training sessions with trial users. While one could theoretically access a dictionary of over 100,000 words (as might be found in a large English dictionary, augmented by the names found in the OAG database), such an approach is probably inefficient for this OAG application (especially if such a large vocabulary had to be searched in a full speech recognition task). Thus we have chosen to limit ourselves to a dictionary of about 700 words (with separate entries for parts of contractions, e.g., 're, 'll). We also employ a list of 47 common word suffixes (e.g., -s, -ed); unrecognized words with such endings have the corresponding final letters removed before the dictionary is tried again. For example, the word 'cities' is not in the dictionary; so the final -s is removed (and the 'ie' changed to 'y') to locate 'city' in the lexicon (the plural nature of the located noun is also noted).</Paragraph>
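The suffix-stripping lookup described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the four-word lexicon, the three suffix rules, and the name lookup_word are hypothetical stand-ins for the roughly 700-word dictionary and 47 suffixes, whose exact contents are not given here.

```python
# Hypothetical miniature lexicon and suffix rules (assumptions for illustration only).
LEXICON = {"city", "flight", "fare", "land"}
# Each rule is (suffix to strip, replacement to append); longer suffixes come first.
SUFFIX_RULES = [("ies", "y"), ("ed", ""), ("s", "")]
PLURAL_SUFFIXES = ("ies", "s")


def lookup_word(word):
    """Return (base_form, is_plural), or None if the word stays unrecognized."""
    word = word.lower()
    if word in LEXICON:
        return word, False
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            base = word[: len(word) - len(suffix)] + replacement
            if base in LEXICON:
                # Record plurality only when a plural ending was removed.
                return base, suffix in PLURAL_SUFFIXES
    return None
```

With these assumed rules, 'cities' resolves to 'city' and is marked plural, while a word such as 'queen' remains unrecognized and would simply be ignored.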
    <Paragraph position="2"> When a word outside this vocabulary is employed, it does not directly affect the text analysis. (From context, we can usually determine the syntactic category of such a word, but not its semantic content.) We assume that such words are not critical for the OAG queries here (empirically this has been generally true; although future training data will likely discover some words which belong in the vocabulary because their presence affects the query response).</Paragraph>
    <Paragraph position="3"> A standard parser for English (if one can be said to exist to handle all of natural English) was not used for this OAG application for two reasons: 1) a significant number (perhaps 10-15%) of the sentences are not grammatical (which would cause ordinary parsers to make errors), and 2) the limited nature of the task does not require a full English parser. In particular, the query system needs only to extract certain critical information from the text queries, and can largely ignore other extraneous information in the text input. For example, the most common query in the training data appears to be asking about flights (e.g., the user specifies departure and destination cities, with optional timing constraints and/or factors dealing with meals and service class, and wishes to receive details about flights that meet these requirements). In such a scenario, the system must determine the identities of the departure and destination cities and properly extract other information relevant to selecting the desired flights from among the hundreds in the database.</Paragraph>
    <Paragraph position="4"> Extraneous information in such sentences can be in many forms. Idioms, for example, are common in spontaneous speech, but contain little information relevant to help the system to give the correct answer (e.g., &amp;quot;hello,&amp;quot; &amp;quot;all right,&amp;quot; &amp;quot;excuse me,&amp;quot; &amp;quot;thank you&amp;quot;). One could list all known idioms in the dictionary (to be recognized and ignored when found); however, our approach was to use a relatively constrained dictionary of 700 words and simply ignore words that were not found in the dictionary.</Paragraph>
  </Section>
  <Section position="4" start_page="116" end_page="116" type="metho">
    <SectionTitle>
DISCOVERING THE NATURE OF THE DESIRED
INFORMATION
</SectionTitle>
    <Paragraph position="0"> One key aspect of the system's task is to identify what type of information the user wants. For example, does the user indicate a desire for a listing of flights, fares, available meals, stopover cities, explanations, etc.? Is the user's request in the form of a question or a statement? Does the user want a list of information, or a simple yes-or-no answer? We follow the convention that appears to be the case in the training examples, in that we give a list of information in virtually all cases. The assumption is that, even when asking a yes/no question, the user will be better informed if he receives more information than actually requested. For example, in response to &amp;quot;Are there any flights to Pittsburgh?,&amp;quot; instead of simply responding &amp;quot;yes,&amp;quot; we list the appropriate flights. When a person says &amp;quot;Can you show me...?,&amp;quot; &amp;quot;Do you know...?,&amp;quot; or &amp;quot;Don't you have a ...?,&amp;quot; he really does not want only a yes-or-no answer.</Paragraph>
    <Paragraph position="1"> Thus our system identifies the desired information and lists it, rather than giving a yes/no answer. This also avoids the problem of necessarily determining whether the query is a question or a statement. Usually this latter fact can be discovered by the existence of a reversal of the initial words in the utterance (for a yes-no question), or the presence of a Wh-word (e.g., what, when, how) at the start. We cannot simply look at the sentence-final punctuation, because the query text has no punctuation marks (only words).</Paragraph>
    <Paragraph position="2"> We employ heuristics to determine the subject of the request (i.e., the desired information). Keywords are used to discover the subject, and the first such keyword in each query is usually assumed to be the one asked about (ensuing keywords are then assumed to be used to qualify the request). The major keywords include: flight, fare, time, airport, city, class, code, cost, capacity, distance, reservation, book, ground, date, day, restriction, define, describe, explain, abbreviation. For example, keywords such as mean(ing), explain, abbreviation, represent, and stand for signal that an explanation is desired.</Paragraph>
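As a rough Python sketch of this first-keyword heuristic: the keyword set is the list given above, while the function name find_subject and the crude plural handling (dropping a final 's') are assumptions made only for this illustration.

```python
# Major subject keywords as listed in the text.
SUBJECT_KEYWORDS = {
    "flight", "fare", "time", "airport", "city", "class", "code", "cost",
    "capacity", "distance", "reservation", "book", "ground", "date", "day",
    "restriction", "define", "describe", "explain", "abbreviation",
}


def find_subject(query):
    """Return (subject, qualifiers): the first keyword is the subject, later ones qualify it."""
    found = []
    for word in query.lower().split():
        if word in SUBJECT_KEYWORDS:
            found.append(word)
        elif word.endswith("s") and word[:-1] in SUBJECT_KEYWORDS:
            found.append(word[:-1])  # crude plural handling (an assumption)
    if not found:
        return None, []
    return found[0], found[1:]
```

For a query such as 'show me the fares for flights from boston', the sketch returns 'fare' as the subject and treats 'flight' as qualifying information.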
    <Paragraph position="3"> In sentences starting with &amp;quot;What is X...?&amp;quot; or &amp;quot;Show me X...,&amp;quot; the choice of subject arrives early in the sentence and is obvious.</Paragraph>
    <Paragraph position="4"> In other sentences, the topic can arrive late (e.g., &amp;quot;All right now, can you please show me all the available flights...?&amp;quot;). Sentences starting with &amp;quot;Which X...&amp;quot; or &amp;quot;How (long, far, big, much,...)...&amp;quot; also lead to an obvious topic choice. Those beginning directly with a noun (e.g., &amp;quot;Cost of...&amp;quot;) are interpreted as having an implied &amp;quot;Show me the&amp;quot; preceding. Sentences starting with a preposition, on the other hand, are more difficult to analyze as to topic (e.g., &amp;quot;On the flights to Atlanta are meals served?&amp;quot; requests meal information, not a general flight listing); in such cases, the noun in the prepositional phrase is treated as qualifying information.</Paragraph>
  </Section>
  <Section position="5" start_page="116" end_page="117" type="metho">
    <SectionTitle>
FILLING SLOTS IN THE QUALIFYING
INFORMATION
</SectionTitle>
    <Paragraph position="0"> Most tables in the OAG database contain columns of information organized by type (e.g., codes, flight numbers, company names, days, classes, etc.), and each row is an entry relating typically a code (number or letter sequence) to relevant information describing a flight, a fare, an aircraft, etc. Most requests are filled by listing information from lines in a table in the database.</Paragraph>
    <Paragraph position="1"> The subject of the request specifies which table to use. To select which lines to list, we must extract relevant qualifying information from the input text (e.g., if 'flights' is the subject, qualifying data may be the departure and/or destination cities and may concern time of travel). After the query subject is established early in each query sentence, ensuing words form phrases and clauses that fill slots in the qualifying information. These ensuing words may form prepositional phrases or relative clauses; no major distinction or classification as to phrase or clause function is needed here, only identification as to what information is contained in the phrases and clauses.</Paragraph>
    <Paragraph position="2"> Flight table It is a simple task to identify city names in a textual query, but more difficult to determine whether a city is the departure, stopover, or destination point. Many requests are straightforward, however (e.g., &amp;quot;...from X to Y...&amp;quot;). Keywords preceding a city or airport name usually identify a city's role: departure (from, out of, leav(ing), depart(ing)), destination (to, land(ing), arriv(ing)), or stopover (connecting at, stop(ping) at, via, through). Lacking such keywords (e.g., &amp;quot;the Atlanta Boston flight&amp;quot;), we assume that the first city is the departing one and the second is the landing one. If only one city is named without keywords, its role must be gleaned from other parts of the query text. (Repeated information is ignored; e.g., &amp;quot;..from Dallas to Baltimore leaving Dallas..&amp;quot;) The keywords above set flags to look for a matching city (e.g., when &amp;quot;to&amp;quot; is encountered, the  destination flag is set; upon finding an ensuing city name, that flag is turned off and the destination city slot is filled). Intervening words such as &amp;quot;the airport at&amp;quot; do not affect the flag. If the sentence ends without a match and the query subject is a city or airport (e.g., &amp;quot;what cities does United fly to?&amp;quot;), the system will look under the corresponding column in the flight table.</Paragraph>
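The flag mechanism for city roles can be sketched in Python as below. This is a simplified illustration, not the system's code: the city list, the restriction to single-word cues (so 'out of' and 'stopping at' are omitted), and the name fill_city_slots are all assumptions.

```python
# Hypothetical single-word city names and role cues (assumptions for illustration).
CITIES = {"atlanta", "boston", "dallas", "baltimore", "pittsburgh", "oakland"}
ROLE_CUES = {
    "from": "departure", "leaving": "departure", "departing": "departure",
    "to": "destination", "landing": "destination", "arriving": "destination",
    "via": "stopover", "through": "stopover",
}


def fill_city_slots(query):
    """Assign each city in the query to a departure, destination, or stopover slot."""
    slots = {}
    pending = None   # role flag set by the most recent cue word
    unmarked = []    # cities seen while no flag was set
    for word in query.lower().split():
        if word in ROLE_CUES:
            pending = ROLE_CUES[word]
        elif word in CITIES:
            if pending is None:
                unmarked.append(word)
            else:
                slots.setdefault(pending, word)  # repeated information is ignored
                pending = None                   # flag is turned off once the slot is filled
        # Any other word (e.g. 'the airport at') leaves the flag untouched.
    # Without cue words, the first city is the departure and the second the destination.
    if len(unmarked) == 2 and not slots:
        slots["departure"], slots["destination"] = unmarked
    return slots
```

The sketch deliberately leaves a single unmarked city unassigned, since its role would have to be gleaned from other parts of the query.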
    <Paragraph position="3"> When the user specifies an airline name, it is easily identified, with the possible exception of companies whose names are uttered as a sequence of letters (e.g. US, TWA). The latter case can cause confusion because user requests often contain letter sequences that refer to other tables or to items other than airline companies. For example, the user may be spelling out the code name for a column in a table, the code name of an aircraft or airport, or the code for a service class.</Paragraph>
    <Paragraph position="4"> Aircraft table As a second table example, the aircraft table is usually accessed via a query about an aircraft model number (e.g., &amp;quot;what is a 737,&amp;quot; &amp;quot;describe a D8S&amp;quot;), but it may also be queried in terms of its column entries (e.g., &amp;quot;what airplane is the fastest,&amp;quot; &amp;quot;which plane has the longest range&amp;quot;). Each column is labeled with a noun heading, to which relevant adjectives are associated (e.g., the weight column is associated in the system to the descriptors heavy and light), which allows comparative requests between aircraft. Where the OAG model number differs from the company's public model number (e.g., a DC10 is officially a 'D10'), the system notes this as a special case.</Paragraph>
  </Section>
  <Section position="6" start_page="117" end_page="117" type="metho">
    <SectionTitle>
SEQUENCES OF LETTERS
</SectionTitle>
    <Paragraph position="0"> A two-letter code followed immediately by a digit sequence is tested as a possible airline name + flight number, by looking for that entry in the flight table. A three-letter code invokes a search of the airport table, for a possible airport code name; a four-letter sequence calls for a possible city code. A letter sequence containing a slash ('/') invokes a look at the restriction table.</Paragraph>
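A minimal Python sketch of this dispatch on code shape follows. It only names the table that would be searched; the function name classify_code and the sample codes used to exercise it are hypothetical, and the actual table look-up is not shown.

```python
def classify_code(token, next_token=None):
    """Guess which table a letter sequence should be looked up in, or None."""
    if "/" in token:
        return "restriction"
    if not token.isalpha():
        return None
    if len(token) == 2 and next_token is not None and next_token.isdigit():
        return "flight"    # candidate airline code + flight number
    if len(token) == 3:
        return "airport"   # candidate airport code
    if len(token) == 4:
        return "city"      # candidate city code
    return None
```

A candidate returned here would still have to be confirmed by finding the entry in the corresponding table.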
    <Paragraph position="1"> Since the input text is all in capital letters, the distinction between the letter 'A' (as part of a code name) and the article 'a' can lead to ambiguities. For example, 'WHAT IS A D EIGHT S?' requests an entry in the aircraft table corresponding to the code 'D8S.' However, there conceivably could be a code elsewhere in the OAG database of the form 'AD8S.' The system looks first to match the longer letter (and digit) sequence; if no match is located, it strips off the initial 'A' and tries again.</Paragraph>
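The longest-match-first strategy for a leading 'A' might look like the following Python sketch. The set of known codes is passed in as a parameter and is purely hypothetical, as is the name match_code.

```python
DIGIT_WORDS = {
    "ZERO": "0", "ONE": "1", "TWO": "2", "THREE": "3", "FOUR": "4",
    "FIVE": "5", "SIX": "6", "SEVEN": "7", "EIGHT": "8", "NINE": "9",
}


def match_code(tokens, known_codes):
    """Try the longest letter/digit sequence first, then retry without a leading 'A'."""
    symbols = [DIGIT_WORDS.get(token, token) for token in tokens]
    candidate = "".join(symbols)
    if candidate in known_codes:
        return candidate
    if symbols and symbols[0] == "A":
        shorter = "".join(symbols[1:])   # treat the 'A' as an article instead
        if shorter in known_codes:
            return shorter
    return None
```

Given the tokens A, D, EIGHT, S, the sketch returns 'D8S' when only that code is known, but prefers 'AD8S' if such a code exists.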
    <Paragraph position="2"> Users may utter a code name of two or more letters as a single word. Such pronounceable code names are included in the system dictionary. Confusions can arise when such words also have other meanings. For example, in &amp;quot;WHAT DOES AS MEAN (IN THE AIRLINE TABLE)?,&amp;quot; if the user does not specify the airline table (where &amp;quot;AS&amp;quot; means Alaska Airlines), the system might not understand that &amp;quot;AS&amp;quot; is a code (and not a conjunction). Given the frequency of ungrammatical queries in the training data, this is not unreasonable. However, the system looks for the subject of the query when it sees the word &amp;quot;mean(ing)&amp;quot; in the context &amp;quot;what does X mean&amp;quot; or &amp;quot;what is the meaning of (the) X.&amp;quot;</Paragraph>
  </Section>
  <Section position="7" start_page="117" end_page="117" type="metho">
    <SectionTitle>
NUMBERS
</SectionTitle>
    <Paragraph position="0"> Numbers in the input text can refer to dates, times, prices, groups of people, flight numbers, flight codes, fare codes, aircraft codes, etc. The system uses context to correctly interpret digit sequences as numbers. For example, ordinal numbers (except for 'first' - which is often associated with 'first class') are usually associated with dates (similarly for cardinal numbers adjacent to a month name); a number followed by 'a m' or 'pm' is also easy to interpret as a time. Numbers preceded by an article (e.g., 'a 737') are tested to see if they match a model number for an aircraft.</Paragraph>
    <Paragraph position="1"> More interesting are cases of numbers run together; e.g., &amp;quot;Is the departure time for two thirteen four twenty?&amp;quot; (flight 213 leaving at 4:20?).</Paragraph>
    <Paragraph position="2"> The system assumes that numbers are spoken following certain syntactic rules. In particular, people say times as hour + minutes (e.g., 11:40 is 'eleven forty', and not 'one thousand one hundred forty' or any other possibility). Digits are converted into a full number form (e.g., 'sixteen eight twenty' = 16820), including time of day (e.g., 'seven o'clock' = 700); thus someone using military time (e.g., 'eighteen hundred hours' = 1800) will be properly interpreted.</Paragraph>
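One way to sketch this conversion in Python is shown below. It is an assumption-laden illustration rather than the system's actual rules: a tens word is joined with an immediately following unit word, 'hundred' and 'o'clock' each contribute two zeros, and any other word is skipped. The name spoken_to_digits is hypothetical.

```python
UNITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9}
TEENS = {"ten": 10, "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
         "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18, "nineteen": 19}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}


def spoken_to_digits(words):
    """Concatenate spoken number words into one digit string."""
    digits = ""
    pending_tens = None   # a tens word waiting for a possible unit word
    for word in (w.lower() for w in words):
        if word in UNITS and pending_tens is not None:
            digits += str(pending_tens + UNITS[word])   # 'forty' + 'two' gives 42
            pending_tens = None
            continue
        if pending_tens is not None:
            digits += str(pending_tens)                 # the tens word stood alone
            pending_tens = None
        if word in TENS:
            pending_tens = TENS[word]
        elif word in UNITS:
            digits += str(UNITS[word])
        elif word in TEENS:
            digits += str(TEENS[word])
        elif word in ("zero", "oh"):
            digits += "0"
        elif word in ("hundred", "o'clock"):
            digits += "00"
        # Any other word (e.g. 'hours') is skipped.
    if pending_tens is not None:
        digits += str(pending_tens)
    return digits
```

The result is returned as a string so that it can be matched against table entries or split later into, for example, a flight number and a time.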
    <Paragraph position="3"> Faced with a number of several digits (e.g., a flight number or code), people usually pronounce it digit-by-digit. For 3- or 4-digit numbers, however, the pronunciation is often grouped into digit pairs (e.g., 'flight twenty three forty two' = 2342). Lastly, there is the question of interpreting times as AM or PM; when not explicitly specified, the system assumes flights at reasonable hours, i.e., no departures or landings between 11 pm and 6 am (e.g., 'twelve o'clock' means noon and not midnight).</Paragraph>
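The reasonable-hours assumption can be illustrated with the Python sketch below, which returns every 24-hour reading of a 12-hour time that falls outside the excluded night period. Treating 6:00 am as allowed and 11:00 pm as excluded is an assumption about the boundaries, and the text does not say how hours such as eight o'clock, which remain ambiguous, are resolved; the name plausible_times is hypothetical.

```python
EARLIEST = 6 * 60    # 6:00 am, in minutes past midnight (boundary handling is an assumption)
LATEST = 23 * 60     # 11:00 pm


def plausible_times(hour, minute=0):
    """Return the 24-hour (hour, minute) readings of a 12-hour time outside 11 pm to 6 am."""
    am_hour = hour % 12        # 12 am is hour 0
    pm_hour = am_hour + 12     # 12 pm is hour 12
    readings = []
    for candidate in (am_hour, pm_hour):
        total = candidate * 60 + minute
        if total >= EARLIEST and LATEST > total:
            readings.append((candidate, minute))
    return readings
```

Under these assumptions 'twelve o'clock' has only the noon reading and 'four twenty' only the afternoon one, while 'eight o'clock' keeps both a morning and an evening reading.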
    <Paragraph position="4"> If a digit sequence is preceded by the words 'flight (code)' or 'fare (code),' the identification of the sequence is obvious. Otherwise, a six-digit sequence starting with '1' is assumed to be a flight code, and a seven-digit one starting with '7' to be a fare code. The sequence 'nineteen ninety-X' after a word sequence containing a month is interpreted as a year.</Paragraph>
    <Paragraph position="5"> The preferred times of flights can be specified as: 1) 'after X' and/or 'before Y,' 2) 'between X and Y,' or 3) 'around Z.' Alternatively, the user may specify vague times with terms such as morning, evening, and night.</Paragraph>
    <Paragraph position="6"> A number between about 80 and 1000 is assumed to be a fare if followed by the word 'dollars,' adjacent to the word(s) 'fare (of),' or even followed by the word 'flight.'</Paragraph>
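The digit-sequence rules in this section can be combined into a small Python classifier, sketched here under assumptions: the order in which the rules are tried, the label strings, and the name classify_number are all hypothetical, and 'about 80 to 1000' is treated as an inclusive range.

```python
def classify_number(digits, previous_word=None, next_word=None):
    """Label a digit string using its neighbouring words and its length."""
    # An explicit 'flight' or 'fare' just before the digits settles the matter.
    if previous_word in ("flight", "fare"):
        return previous_word
    if len(digits) == 6 and digits.startswith("1"):
        return "flight code"
    if len(digits) == 7 and digits.startswith("7"):
        return "fare code"
    value = int(digits)
    if value >= 80 and 1000 >= value and next_word in ("dollars", "flight"):
        return "fare"
    return "unknown"
```

Dates, times, and aircraft model numbers would need further context checks that this sketch leaves out.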
  </Section>
  <Section position="8" start_page="117" end_page="118" type="metho">
    <SectionTitle>
SPECIAL REQUESTS
</SectionTitle>
    <Paragraph position="0"> Occasionally, the user wishes to view the desired information in a specific fashion, e.g., flights ordered by departure time, or fares in order of increasing price. This is determined by the keywords 'sort(ed)' or '(in) order(ed)' plus 'by X' or 'de/increasing X' (where X is price, weight, etc.), or 'alphabetically.' Mathematical operations are sometimes requested: &amp;quot;the difference between fares class Y and F,&amp;quot; &amp;quot;the difference in time between Atlanta and Dallas.&amp;quot; The keywords difference, sum, and average invoke the corresponding mathematical operations using values extracted from the tables for the coordinated items mentioned immediately after the keywords (e.g., &amp;quot;the average fare for classes Y and F&amp;quot;).</Paragraph>
  </Section>
  <Section position="9" start_page="118" end_page="118" type="metho">
    <SectionTitle>
COORDINATION
</SectionTitle>
    <Paragraph position="0"> Coordination in general is a difficult computational linguistic problem. The system attempts to group words and phrases on as local a basis as possible. Thus, the word and (or or) will link adjacent words to form a single unit if the words are from the same syntactic class (e.g., &amp;quot;between Dallas and Baltimore&amp;quot;, &amp;quot;fares for taxis and limousines&amp;quot;). If necessary, larger units are grouped next (e.g., &amp;quot;Delta 402 and United 567&amp;quot;); finally the conjunction is treated as separating clauses if followed by a verb (e.g., &amp;quot;... and list the...&amp;quot;). A local coordination is verified, if possible, through the appearance of a plural classifying word just before or after the coordinated units (e.g., &amp;quot;flights thirty four and ninety three,&amp;quot; &amp;quot;the Y and F classes&amp;quot;).</Paragraph>
    <Paragraph position="1"> The coordination routine normally links at the most local level (e.g., &amp;quot;flights from Oakland or Dallas to Atlanta&amp;quot; will group the first two cities as departure sites). However, if an inconsistency arises immediately afterward (e.g., an attempt to fill a slot already filled), the routine will attempt to link larger units (e.g., &amp;quot;flights from Boston to Pittsburgh and Dallas to Atlanta&amp;quot; would normally link Pittsburgh and Dallas as destination cities, but the &amp;quot;to Atlanta&amp;quot; words are inconsistent with that interpretation; so the coordination routine will group the first two cities together and the last two cities together, giving two listings as output).</Paragraph>
    <Paragraph position="2"> The conjunctions and and but invoke a logical 'and' (intersection) when linking separate qualifying information (e.g., &amp;quot;leaving Boston and landing at Atlanta&amp;quot;), whereas or invokes a logical 'or' (union) (e.g., &amp;quot;arriving at or before five o'clock&amp;quot;). On the other hand, when the words immediately following an and relate to a subject topic (e.g., &amp;quot;show the flights and fares...&amp;quot;), then the topic is augmented to deal with both items (e.g., list both flight and fare information). Similarly, when the words after an and attempt to fill qualifying information slots already filled, the system produces an output using the information up to the and, and then continues further using the new qualifying information (e.g., in the 4-city example above, flights would be listed first for the first two cities, then for the next two).</Paragraph>
  </Section>
  <Section position="10" start_page="118" end_page="118" type="metho">
    <SectionTitle>
COMPARISON
</SectionTitle>
    <Paragraph position="0"> Some of the queries request a comparison of numbers (e.g., &amp;quot;list flights under three hundred dollars,&amp;quot; &amp;quot;which airline has the most flights&amp;quot;). When a comparison keyword is located (e.g., under, more, less, last, earliest, next), the direction of the comparison (more vs. less) is noted, and the ensuing noun describes the item being measured (e.g., cost, number of flights, time of flight, etc.). In the case of more or less, the noun after the ensuing than is used for comparison to the subject of the query.</Paragraph>
  </Section>
  <Section position="11" start_page="118" end_page="118" type="metho">
    <SectionTitle>
WORDS TO IGNORE
</SectionTitle>
    <Paragraph position="0"> Many words are effectively ignored during the processing. For example, some and all (as in &amp;quot;show me some/all ...&amp;quot;) have no relevance, since the system shows all possibilities in any case.</Paragraph>
    <Paragraph position="1"> Similarly, expressions such as &amp;quot;please,&amp;quot; &amp;quot;OK,&amp;quot; and &amp;quot;I'm sorry&amp;quot; (while useful in a polite, user-friendly interface) are ignored here.</Paragraph>
    <Paragraph position="2"> Also, some users have a habit of saying letters followed by &amp;quot;as in X&amp;quot; (e.g., &amp;quot;class Q as in queen&amp;quot;). This brings words (e.g., 'queen') into the dialogue that are invariably outside the system vocabulary. When the system sees LETTER + &amp;quot;as in&amp;quot;, it ignores the &amp;quot;as in X&amp;quot;-phrase.</Paragraph>
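A short Python sketch of this filter follows; the function name drop_as_in is hypothetical, and the sketch assumes the query has already been split into a list of words.

```python
def drop_as_in(words):
    """Remove an 'as in X' phrase when it directly follows a single letter."""
    result = []
    i = 0
    while len(words) > i:
        result.append(words[i])
        is_letter = len(words[i]) == 1 and words[i].isalpha()
        following = [w.lower() for w in words[i + 1 : i + 3]]
        if is_letter and following == ["as", "in"]:
            i += 4   # skip the letter, 'as', 'in', and the example word X
        else:
            i += 1
    return result
```

The letter itself is kept, so 'class Q as in queen' is reduced to 'class Q' before further processing.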
    <Paragraph position="3"> Since the system does not attempt to make a complete parse for the text input, it can handle word repetitions by speakers (as found with interruptions and hesitation pauses). While immediate word repetitions are ignored (on the assumption of possible hesitations), when the repetition is a digit, the full resulting number is first tried in the table look-up. For example, in &amp;quot;twelve twelve ninety,&amp;quot; the number is assumed to be 121290; if no match is found in the tables, then the number 1290 is assumed.</Paragraph>
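The two-step treatment of a repeated number can be sketched in Python as follows. The input is assumed to be a list of digit strings, one per spoken number word, and the table is modelled as a plain set of known entries; both choices, and the name resolve_repeated_number, are assumptions for illustration.

```python
def resolve_repeated_number(groups, table):
    """Try the full digit sequence first; otherwise drop immediate repetitions."""
    full = "".join(groups)
    if full in table:
        return full
    deduped = [
        group for index, group in enumerate(groups)
        if index == 0 or group != groups[index - 1]
    ]
    return "".join(deduped)
```

For 'twelve twelve ninety' the groups are '12', '12', '90': the sketch returns '121290' if that entry exists and falls back to '1290' otherwise.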
  </Section>
  <Section position="12" start_page="118" end_page="119" type="metho">
    <SectionTitle>
DISCUSSION OF RESULTS
</SectionTitle>
    <Paragraph position="0"> The system was officially tested on February 6, 1991, with the results that 54 queries were correctly answered and 94 were not.</Paragraph>
    <Paragraph position="1"> At the time, the system was not set up to give a &amp;quot;no answer&amp;quot; in cases where it was not confident. The relatively poor performance can be largely explained by the fact that the system was prematurely tested, without having been properly debugged. As of March 6, most of the bugs had been removed and the system was tested again on the same sentence queries, with the results that 110 were answered correctly, with 19 false responses and two &amp;quot;no answers.&amp;quot; This second testing was done with the benefit of having examined the test data, to correct the program, both from the point of view of system bugs and inadequate coverage.</Paragraph>
    <Paragraph position="2"> The majority of the improvement was simply due to eliminating system bugs. A large majority of the remaining incorrect performance can also be readily eliminated with a little more effort.</Paragraph>
    <Paragraph position="3"> Thus, the approach described in this paper is certainly capable of correctly handling more than 90% of queries typical of the ATIS data.</Paragraph>
    <Paragraph position="4"> We examine now where the revised (debugged) system does well and not so well, and point out where the recent improvement is due to rule modification (to better cover more types of queries, as revealed by certain queries in the February 1991 test set) as opposed to simple system debugging. Since the system ignores words that it does not recognize, it continues to make a mistake on sentence cj0011sx, where &amp;quot;from Dallas Love Field&amp;quot; is interpreted as simply &amp;quot;from Dallas.&amp;quot; Sentences ci00hlsx and ci00clsx (noting December 14th as &amp;quot;121490&amp;quot;) are now covered, due to a rule addition permitting dates in pure digit form (this type of date representation was new in the test data). Sentence cp00plsx (&amp;quot;..earliest wide-body flight...&amp;quot;) is now easily handled, with the addition of a rule testing for the first flight of the day (if &amp;quot;earliest&amp;quot; or &amp;quot;first&amp;quot; appears just before the word &amp;quot;flight&amp;quot;), as well as the last one (keywords = &amp;quot;last,&amp;quot; &amp;quot;latest&amp;quot;).</Paragraph>
    <Paragraph position="5"> Looking at sentences which caused the most problems in the official February 1991 results (e.g., those for which at most two of the eight sites who submitted results were correct), we can see examples of where our system can correctly handle difficult queries. It is our system's ability to ignore irrelevant words and not require a full parse that allows it to accept syntactic and semantic structures that have not been seen in the training data.</Paragraph>
    <Paragraph position="6"> In sentence ci00klsx (&amp;quot;Can you please tell me what time zone Dallas would be on thank you&amp;quot;), both the initial five words and final five words are irrelevant to the message and are ignored by our system, which seizes upon the initial, subject keywords time zone and ensuing location name Dallas to produce the correct answer. The preposition on at the end of the sentence can cause problems for parsers that insist on accounting for every word and which involve a semantic module (for which a city should not be &amp;quot;on&amp;quot; a time zone). Similar comments hold for the final preposition  to in sentence cl00wlsx (&amp;quot;Please show all cities that Delta airlines flies to&amp;quot;), whereas our system sees cities as the subject keyword and Delta as the only other relevant information.</Paragraph>
    <Paragraph position="7"> In sentence ce00plsx (&amp;quot;How long does it take to drive from the airport to downtown Atlanta&amp;quot;), the subject keyword is how long, does it take to is ignored, drive specifies ground transportation, and the remaining words fill in the to/from slots. By ignoring distracting words such as menu and seat (in sentence ch00klsx - &amp;quot;...menu of departures...&amp;quot;; in sentence cl00dlsx - &amp;quot;how many persons does a 757 seat&amp;quot;), our system avoids mistaking the subject of some queries. Similar comments hold for the word major in sentence cj00ilsx (&amp;quot;closest major airport to San Francisco&amp;quot;). Our system correctly handles even most mentions of locations outside of the 11 cities of the database (although the case of &amp;quot;Love Field&amp;quot; above shows its limitations). In sentence cj0081sx (&amp;quot;...with a stop in Las Vegas&amp;quot;), the system looks for a stopover location after the keywords stop in; finding words there which are not in the dictionary, the system assumes a location outside the database.</Paragraph>
    <Paragraph position="8"> In sentence cp0021sx (&amp;quot;Does American flight 1010 leaving at 1303 have any stops enroute&amp;quot;), the initial keyword does cues a yes-no question; flight is not taken as the subject because number and digits ensue immediately; instead, stops is taken as crucial information and the potentially confusing word enroute (which may not have appeared in earlier training data) is ignored. In the stilted sentence cp00flsx (&amp;quot;How many engines does a D 10 equipment have&amp;quot;), the subject is identified by the first three words, the letter/number sequence D10 is noted as a model number, and equipment is treated as superfluous data.</Paragraph>
    <Paragraph position="9"> There are cases of ATIS queries in which a full parser can be useful, but there are many other cases such as these above where not performing a full parse and ignoring superfluous words can accomplish the task as well with less effort.</Paragraph>
  </Section>
  <Section position="13" start_page="119" end_page="119" type="metho">
    <SectionTitle>
DIFFERENCES BETWEEN TEXT AND
SPEECH PARSERS
</SectionTitle>
    <Paragraph position="0"> In recent years, there has been considerable work on parsing of general text in the context of natural language analysis [1].</Paragraph>
    <Paragraph position="1"> A parser in a speech recognition context, however, encounters problems that a text parser does not have [2]. For example, in determining syntactic structure, a text parser has access to punctuation (e.g., quotation marks, parentheses), capitalization, and other phrase-offsetting devices (e.g., italics, underlining). Major punctuation marks (periods, exclamation and question marks) denote the ends of sentences, and others (colons and semicolons) mark the ends of major clauses; a text parser can thus easily determine major syntax boundaries and need only operate on sets of words between such markers to parse each set into a logical clause or sentence. For a speech parser, on the other hand, gross segmentation cues may take the (unreliable) form of pauses (e.g., speakers often pause at major syntactic boundaries - but not consistently - and hesitation pauses can cause significant difficulties). Swings in vocal fundamental frequency which are often correlated with syntactic boundaries [3] also furnish (at best) unreliable indicators for a parser to use. In text, the appearance of capital letters (except, of course, at the start of a sentence) indicates a proper name (and thus usually a noun); such a fact can help a text parser distinguish such words which may alternatively (without capitalization) be used as other parts-of-speech. A speech parser has no access to such information found readily in texts.</Paragraph>
    <Paragraph position="2"> In the context of data entry via voice, one could envision requiring a user to pronounce aloud markers such as capitalization and punctuation. However, this forces a departure from natural speaking style (and slows the rate of data entry) and so is not preferred in most applications. In the application explored in this paper - an isolated-word system - only the actual words were pronounced (as in speaking naturally, except of course for the brief pause required after each word). In isolated-word systems, no durational information (i.e., from pauses or from word lengths), however unreliable, can be exploited to determine the syntactic function of individual words.</Paragraph>
  </Section>
</Paper>