<?xml version="1.0" standalone="yes"?>
<Paper uid="J85-4002">
  <Title>PHRED: A GENERATOR FOR NATURAL LANGUAGE INTERFACES</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> The PHRED (PHRasal English Diction) system is a language generation module for natural language interfaces. The generator operates from a declarative base of linguistic knowledge that it shares with PHRAN (PHRasal ANalyzer; Wilensky and Arens, 1980). PHRED and PHRAN together form an interface for analyzing natural language and producing natural language responses. This interface serves as the linguistic component of the UNIX Consultant system (UC) (Wilensky, Arens, and Chin 1984), a program for responding to inquiries about the UNIX operating system. As the entire UC system operates in several seconds of CPU time, it is an important feature of PHRED that it requires no more than two or three seconds to produce a complete sentence.</Paragraph>
    <Paragraph position="1"> The principal knowledge structure used by PHRAN and PHRED is the pattern-concept pair, which links a phrasal pattern to a conceptual template. This structure has proven particularly effective in the encoding of specialized linguistic knowledge, i.e., knowledge about particular phrases and their specialized meanings. Part of the theoretical basis of PHRED is the notion that such specialized constructs are an essential component of language use. This idea has among its advocates Chafe (1968), Harris (1968), and Kittredge and Lehrberger (1983), and is behind other generation systems such as Kukich's Ana (1983). [Copyright 1985 by the Association for Computational Linguistics. Permission to copy without fee all or part of this material is granted provided that the copies are not made for direct commercial advantage and the CL reference and this copyright notice are included on the first page. To copy otherwise, or to republish, requires a fee and/or specific permission. 0362-613X/85/040219-242503.00. Computational Linguistics, Volume 11, Number 4, October-December 1985, p. 219. Paul S. Jacobs, PHRED: A Generator for Natural Language Interfaces.]</Paragraph>
    <Paragraph position="2"> The shared linguistic knowledge base is an unusual feature of PHRED and PHRAN. Computer programs that can effectively communicate in natural language must be capable both of analyzing a range of utterances to derive their meaning or intent, and of producing appropriate and intelligible responses. Historically these two tasks have been treated independently, principally because some of the hard problems in language production differ from those of language analysis. In the MARGIE system, for example, the BABEL generator (Goldman 1975) employed a discrimination net as its principal data structure to facilitate the selection of an appropriate verb and an ATN grammar to apply syntactic constraints, while the ELI analyzer (Riesbeck 1975) in the same system attached routines to individual words to control the interpretations considered during the parsing process.</Paragraph>
    <Paragraph position="3"> Throughout the short history of natural language generation systems, programs that produce language have treated generation as a process of decision making (McDonald 1980), choice (Mann and Matthiessen 1983), or planning (Appelt 1982). These systems have employed knowledge structures specifically geared, to varying degrees, to the task of constraining the selection of lexical and grammatical elements. The design of analyzers, on the other hand, focuses on the problem of ambiguity in natural language and makes use of knowledge structures designed to constrain the consideration of alternative interpretations. While the tasks of analysis and generation are thus inescapably different, much of the same knowledge can be used in performing both tasks.</Paragraph>
    <Paragraph position="4"> Even in systems with both analysis and generation components, the knowledge used to derive meaning from language is not used to produce language from meaning.</Paragraph>
    <Paragraph position="5"> Such systems may be able to use a word or grammatical structure without being able to recognize the same structure, or vice versa, and must duplicate a great deal of information if the generator uses language similar to that understood by the analyzer. Intuitively, it seems that the knowledge used to constrain the interpretation of language can be used to constrain the choice of language.</Paragraph>
    <Paragraph position="6"> A natural language system with a parsimonious knowledge representation could encompass an interface capable of both analysis and production without excessive duplication, with knowledge about language as well as mechanisms for its analysis and generation.</Paragraph>
    <Paragraph position="7"> PHRED was conceived as the generation component of such an interface. PHRED, along with its companion analysis program PHRAN, embodies an approach to natural language processing founded on the principle that knowledge about language is in essence declarative, consisting of associations between linguistic and conceptual structures of varying degrees of specificity (Wilensky 1981). Such declarative knowledge should be used for both analysis and generation, and the form of the knowledge should be independent of the particular language.</Paragraph>
    <Paragraph position="8"> From its initial conception the &quot;English&quot; part of PHRED's acronym was anomalous, as both the analyzer and generator were envisioned as being able to utilize knowledge bases of multiple languages.</Paragraph>
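As a rough illustration of the idea that one declarative store of pattern-concept pairs can serve both analysis and generation, the following Python sketch may help. It is not PHRED's or PHRAN's actual representation (the systems are written in Franz LISP): the class PatternConceptPair, the functions analyze and generate, the concept names, the "?slot" convention, and the toy patterns are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternConceptPair:
    """Hypothetical stand-in for a pattern-concept pair."""
    pattern: tuple  # words of the phrase; a word starting with "?" is a slot
    concept: str    # name of the conceptual template (invented names)


# One declarative knowledge base, consulted by both directions below.
KNOWLEDGE_BASE = (
    PatternConceptPair(("remove", "the", "file", "?name"), "DELETE-FILE"),
    PatternConceptPair(("list", "the", "directory", "?name"), "LIST-DIRECTORY"),
)


def analyze(text):
    """Return (concept, bindings) for the first matching pair, else None."""
    words = text.split()
    for pair in KNOWLEDGE_BASE:
        if len(words) != len(pair.pattern):
            continue
        bindings = {}
        for slot, word in zip(pair.pattern, words):
            if slot.startswith("?"):
                bindings[slot[1:]] = word  # bind the slot to this word
            elif slot != word:
                break                      # literal word mismatch
        else:
            return pair.concept, bindings
    return None


def generate(concept, bindings):
    """Return text for a concept by filling the same pattern, else None."""
    for pair in KNOWLEDGE_BASE:
        if pair.concept == concept:
            return " ".join(
                bindings[w[1:]] if w.startswith("?") else w
                for w in pair.pattern
            )
    return None
```

In this sketch, neither function holds any linguistic knowledge of its own; each pair is stated once and read in opposite directions, which is the kind of duplication-free design the text argues for.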
    <Paragraph position="9"> The UNIX Consultant, designed to aid novice users of the UNIX operating system by conversing in natural language, is one application of the PHRED generator.</Paragraph>
    <Paragraph position="10"> The PHRAN analyzer receives the user's input and passes a conceptual interpretation to UC, which then passes its response in conceptual form to PHRED to be converted into natural language output. Below are some examples of questions to UC, with the responses generated by PHRED: User: How do I delete a file? UC: Typing 'rm filename' causes the file filename to be removed from the current directory.</Paragraph>
    <Paragraph position="11"> User: Cómo se borra un archivo? UC: Para borrar el archivo de nombre filename del espacio de trabajo, se escribe 'rm filename'.</Paragraph>
    <Paragraph position="12"> User: How do I delete all the files from a directory? UC: Typing 'rm *' will remove all files within the current directory from the current directory.</Paragraph>
    <Paragraph position="13"> Typing 'rm -r directoryname' will cause all files and directories recursively inside of directoryname to be deleted from directoryname and removes the directory directoryname from the working directory.</Paragraph>
    <Paragraph position="14"> The Spanish example above, a rough paraphrase of the first English question, illustrates the ability of UC to answer questions posed in either language, using the same knowledge about UNIX. The PHRAN/PHRED knowledge base includes the information that borrar and escribe used with certain objects refer to specific concepts in the UNIX world, the same concepts as deleting and typing, respectively. Also in the phrasal knowledge base is knowledge that espacio de trabajo and working directory have a specialized denotation in the UNIX world. Such specialized linguistic knowledge is common in this domain.</Paragraph>
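The point that words in different languages denote the same UNIX concepts can be pictured with a small Python sketch. The table layout, the concept names (DELETE, TYPE, CURRENT-DIRECTORY), and the functions concept_of, phrase_for, and rephrase are hypothetical and chosen only for illustration; the phrases themselves are the ones mentioned above.

```python
# Hypothetical phrasal lexicons, one per language. Only the data differs
# between languages; the lookup code below is language-independent.
LEXICONS = {
    "english": {
        "deleting": "DELETE",
        "typing": "TYPE",
        "working directory": "CURRENT-DIRECTORY",
    },
    "spanish": {
        "borrar": "DELETE",
        "escribe": "TYPE",
        "espacio de trabajo": "CURRENT-DIRECTORY",
    },
}


def concept_of(language, phrase):
    """Analysis direction: phrase to concept name, or None if unknown."""
    return LEXICONS[language].get(phrase)


def phrase_for(language, concept):
    """Generation direction: concept name to phrase, or None if unknown."""
    for phrase, name in LEXICONS[language].items():
        if name == concept:
            return phrase
    return None


def rephrase(phrase, source, target):
    """Go from one language to another by way of the shared concept."""
    concept = concept_of(source, phrase)
    if concept is None:
        return None
    return phrase_for(target, concept)
```

Under these assumptions, supporting a further language would mean adding another entry to LEXICONS rather than changing any function, which loosely mirrors the claim that a new language capability requires no modification to the program.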
    <Paragraph position="15"> While PHRAN and PHRED were originally tested using an English vocabulary used for various stories and news articles, it was a relatively easy task to accommodate linguistic knowledge bases for English and Spanish in order for the same programs to operate in the UC domain. Adding a new vocabulary or language capability to the UC system has required no modification to the program, although the system has not had extensive testing with many languages.</Paragraph>
    <Paragraph position="16"> PHRED is implemented in Franz LISP and runs compiled on a VAX 11/780. The English linguistic knowledge base of UC contains about 150 patterns, in addition to knowledge of the morphological characteristics of 30 verbs and 50 nouns commonly used in communicating UNIX information. The compiled program occupies about 100K bytes of memory, of which about 20K is code used also by PHRAN. Output from PHRED in the UC system requires 1-3 seconds of CPU time, roughly a third of the total time used by the system. For sentences of the length typically produced by the generator, the amount of time used is roughly proportional to the length of output. Experiments with larger knowledge bases have suggested that the time used by the generator is not heavily dependent on the size of the knowledge base.</Paragraph>
    <Paragraph position="17"> The next section describes the PHRED knowledge base and outlines its role in the generation process.</Paragraph>
    <Paragraph position="18"> Section 3 covers this process in more detail, and Section 4 traces a complete example of generation using PHRED.</Paragraph>
    <Paragraph position="19"> Section 5 compares the PHRED approach with other research. Section 6 discusses some current and future research directions.</Paragraph>
  </Section>
</Paper>