File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/87/p87-1009_intro.xml
Size: 3,316 bytes
Last Modified: 2025-10-06 14:04:38
<?xml version="1.0" standalone="yes"?> <Paper uid="P87-1009"> <Title>Phrasal Analysis of Long Noun Sequences</Title> <Section position="2" start_page="0" end_page="59" type="intro"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> In everyday language we routinely encounter noun phrases consisting of an article and a head noun, possibly modified by one or more adjectives.</Paragraph> <Paragraph position="1"> Noun-noun pairs, e.g., park bench, atom bomb, and computer programmer, are also common. It is rare, however, to encounter noun phrases consisting of three or more nouns in sequence. Consequently, research in natural language analysis has not concentrated on parsing such constructions.</Paragraph> <Paragraph position="2"> The situation in many technical fields is quite different. For example, when describing the specifications of electronic systems, designers commonly use expressions such as: bus request cycle; transfer block size; segment trap request; interrupt vector transfer phase; arithmetic register transfer instruction.</Paragraph> <Paragraph position="3"> During design specification such phrases are often constructed by the specifier in order to reference a particular entity: a piece of hardware, an activity, or a range of time. In most cases, the nouns preceding the last one are used as modifiers, and idiomatic expressions are very rare. In almost all cases the meaning of noun sequences can therefore be inferred largely from the last noun in the sequence*. (But see Finin (1980) for an in-depth treatment of the meaning of such constructions.) The process of recognizing the presence of these expressions is, however, complicated by the fact that many of the words used are syntactically ambiguous. Almost every word used in the examples above belongs to both syntactic categories, noun and verb. 
As a result, bus request cycle may conceivably be understood either as a command (to bus the request cycle) or as a noun phrase. * When a sequence has length three or more, the order of modification may vary. Consider: [engine damage] report; January [aircraft repairs]; [boron epoxy] [[rocket motor] chambers]; 1970 [[balloon flight] [[solar-cell standardization] program]]. But the last noun is still the modified one. These examples are from (Rhyne, 1976) and (Marcus, 1979).</Paragraph> <Paragraph position="4"> Considerable knowledge of the semantics of the domain is necessary to decide the correct interpretation of a nominal compound, and the natural language analyzer must ultimately have access to it. But before complete semantic interpretation of such a noun phrase can even be attempted, the analyzer must have a method of recognizing its presence in a sentence and determining its boundaries.</Paragraph> <Paragraph position="5"> 1.1. The Rest of this Paper The rest of this paper is structured as follows: In the next section, Section 2, we describe the phrasal analysis approach used by our system to process input sentences. In Section 3 we discuss the problems involved in the recognition of long noun sequences, and in Section 4 we present our proposed solution and describe its implementation. Sections 5 and 6 are devoted to related work and to our conclusions, respectively.</Paragraph> </Section> </Paper>