<?xml version="1.0" standalone="yes"?> <Paper uid="A88-1010"> <Title>EVALUATION OF A PARALLEL CHART PARSER</Title> <Section position="4" start_page="0" end_page="71" type="metho"> <SectionTitle> 2. Background </SectionTitle> <Paragraph position="0"> There have been a number of theoretical and algorithmic studies of parallel parsing, beginning well before the current availability of suitable experimental facilities.</Paragraph> <Paragraph position="1"> For general context-free grammars, it is possible to adapt the Cocke-Younger-Kasami (CYK) algorithm (Aho and Ullman 1972, p. 314 ff.) for parallel use.</Paragraph> <Paragraph position="2"> This algorithm, which takes time proportional to n^3 (n = length of input string) on a single processor, can operate in time n using n^2 processors (with shared memory and allowing concurrent writes). This parallel algorithm, and its extension to unification grammars, has been described by Haas (1987b). The matrix form of this algorithm is well suited to large arrays of synchronous processors. The algorithm we describe below is basically a CYK parser with top-down filtering (see footnote 1), but the main control structure is an event queue rather than iteration over a matrix. Because the CYK matrix is large and typically sparse (see footnote 2), we felt that the event-driven algorithm would be more efficient in our environment of a small number of asynchronous processors (far fewer than n^2 for our longest sentences) and grammars augmented by conditions which must be checked on each rule application and which vary widely in compute time.</Paragraph> <Paragraph position="3"> Cohen et al. (1982) present a general upper bound for speed-up in parallel parsing, based on the number of processors and properties of the grammar. Their more detailed analysis, and the subsequent work of Sarkar and Deo (1985), focus on algorithms and speed-ups for parallel parsing of deterministic context-free grammars. 
Most programming language grammars are deterministic, but most natural language grammars are not, so this work (based on shift-reduce parsers) does not seem directly applicable.</Paragraph> <Paragraph position="4"> Experimental data involving actual implementations is more limited. Extensive measurements were made on a parallel version of the Hearsay-II speech understanding system (R. Fennel and V. Lesser, 1977). However, the syntactic analysis was only one of many knowledge sources, so it is difficult to make any direct comparison between their results and those presented here. Bolt Beranek and Newman (BBN) is currently conducting experiments with a parallel parser quite similar to those described below (Haas, 1987a). BBN uses a unification grammar in place of the procedural restrictions of our system. At the time of this writing, we do not yet have detailed results from BBN to compare to our own.</Paragraph> <Paragraph position="5"> Footnote 1: We also differ from CYK in that we do not merge different analyses of the same string as the same symbol. As a result, our procedure would not operate in linear time for general (ambiguous) grammars.</Paragraph> <Paragraph position="6"> Footnote 2: For grammar #4 given below and a 15-word sentence, the matrix would have roughly 15,000 entries (one entry for each substring and each symbol in the equivalent Chomsky normal form grammar), of which only about 1000 entries are filled.</Paragraph> </Section> <Section position="5" start_page="71" end_page="71" type="metho"> <SectionTitle> 3. Environment </SectionTitle> <Paragraph position="0"> Our programs were developed for the NYU Ultracomputer (Gottlieb et al., 1983), a shared-memory MIMD parallel processor with a special instruction, fetch-and-add, for processor synchronization. 
The programs should be easily adaptable to any similar shared-memory architecture.</Paragraph> <Paragraph position="1"> The programs have been written in ZLISP, a version of LISP for the Ultracomputer which has been developed by Isaac Dimitrovsky (1987).</Paragraph> <Paragraph position="2"> Both an interpreter and a compiler are available. ZLISP supports several independent processes, and provides both global variables (shared by all processes) and variables which are local to each process. Our programs have used low-level synchronization operations, which directly access the fetch-and-add primitive. More recent versions of ZLISP also support higher-level synchronization primitives and data structures such as parallel queues and parallel stacks.</Paragraph> </Section> <Section position="6" start_page="71" end_page="72" type="metho"> <SectionTitle> 4. Algorithms </SectionTitle> <Paragraph position="0"> Our parser is intended as part of the PROTEUS system (Ksiezyk et al. 1987). PROTEUS uses augmented context-free grammars, that is, context-free grammars augmented by procedural restrictions which enforce syntactic and semantic constraints.</Paragraph> <Paragraph position="1"> The basic parsing algorithm we use is a chart parser (Thompson 1981, Thompson and Ritchie, 1984). Its basic data structure, the chart, consists of nodes and edges. For an n-word sentence, there are n + 1 nodes, numbered 0 to n. These nodes are connected by active and inactive edges which record the state of the parsing process. If A → W X Y Z is a production, an active edge from node n1 to n2 labeled by A → W X . Y Z indicates that the symbols W X of this production have been matched to words n1 + 1 through n2 of the sentence. An inactive edge from n1 to n2 labeled by a category Y indicates that words n1 + 1 through n2 have been analyzed as a constituent of type Y. The &quot;fundamental rule&quot; for extending an active edge states that if we have an active edge A → W X . 
Y Z from n1 to n2 and an inactive edge of category Y from n2 to n3, we can build a new active edge A → W X Y . Z from n1 to n3.</Paragraph> <Paragraph position="2"> If we also have an inactive edge of type Z from n3 to n4, we can then extend once more, creating this time an inactive edge of type A (corresponding to a completed production) from n1 to n4.</Paragraph> <Paragraph position="3"> If we have an active edge A → W X . Y Z from n1 to n2, and this is the first time we have tried to match symbol Y starting at n2 (there are no edges labeled Y originating at n2), we perform a seek on symbol Y at n2: we create an active edge for each production which expands Y, and try to extend these edges. In this way we generate any and all analyses for Y starting at n2. This process of seeks and extends forms the core of the parser. We begin by doing a seek for the sentence symbol S starting at node 0. Each inactive edge which we finally create for S from node 0 to node n corresponds to a parse of the sentence.</Paragraph> <Paragraph position="4"> The serial (uniprocessor) procedure (see footnote 3) uses a task queue called an agenda. Whenever a seek is required during the process of extending an edge, an entry is made on the agenda. When we can extend the edge no further, we go to the agenda, pick up a seek task, create the corresponding active edge and then try to extend it (possibly giving rise to more seeks). This process continues until the agenda is empty.</Paragraph> <Paragraph position="5"> Our initial parallel implementation was straightforward: a set of processors all execute the main loop of the serial program (get task from agenda / create edge / extend edge), all operating from a single shared agenda. Thus the basic unit of computation being scheduled is a seek, along with all the associated edge extensions. If there are many different ways of extending an edge (using the edges currently in the chart) this may involve substantial computation. 
We therefore developed a second version of the parser with more fine-grained parallelism, in which each step of extending an active edge is treated as a separate task which is placed on the agenda. We present some comparisons of these two algorithms below.</Paragraph> <Paragraph position="6"> There was one complication which arose in the parallel implementations: a race condition in the application of the &quot;fundamental rule&quot;. Suppose processor P1 is adding an active edge to the chart from node n1 to n2 with the label A → W X . Y Z and, at the same time, processor P2 is adding an inactive edge from node n2 to n3 with the label Y. Each processor, when it is finished adding its edge, will check the chart for possible application of the fundamental rule involving that edge. P1 finds the inactive edge needed to further extend the active edge it just created; similarly, P2 finds the active edge which can be extended using the inactive edge it just created. Both processors therefore end up trying to extend the edge A → W X . Y Z, and we create duplicate copies of the extended edge A → W X Y . Z. This race condition can be avoided by assigning a unique (monotonically increasing) number to each edge and by applying the fundamental rule only if the edge in the chart is older (has a smaller number) than the edge just added by the processor.</Paragraph> <Paragraph position="7"> As we noted above, the context-free grammars are augmented by procedural restrictions.</Paragraph> <Paragraph position="8"> These restrictions are coded in PROTEUS Restriction Language and then compiled into LISP.</Paragraph> <Paragraph position="9"> A restriction either succeeds or fails, and in addition may assign features to the edge currently being built. Restrictions may examine the sub-structure through which an edge was built up from other edges, and can test for features on these constituent edges. 
There is no dependence on implicit context (e.g., variables set by another restriction). As a result, the restrictions impose no complications on the parallel scheduling; they are simply invoked as part of the process of extending an edge.</Paragraph> </Section> <Section position="7" start_page="72" end_page="72" type="metho"> <SectionTitle> 5. Grammars </SectionTitle> <Paragraph position="0"> These algorithms were tested on four grammars: 2. A very small English grammar, taken from (Grishman, 1986) and used for teaching purposes. It has 23 nonterminal symbols and 38 productions.</Paragraph> <Paragraph position="1"> 3. Grammar #2, with four restrictions added. 4. The grammar for the PROTEUS question-answering system, which includes yes-no and wh- questions, relative and reduced relative clauses. It has 35 nonterminal symbols and 77 productions.</Paragraph> <Paragraph position="2"> 6. Method. The programs were run in two ways: on a prototype parallel processor, and in simulated parallel mode on a standard uniprocessor (the uniprocessor version of ZLISP provides for relatively efficient simulation of multiple concurrent processes). The runs on our prototype multiprocessor, the NYU Ultracomputer, were limited by the size of the machine to 8 processors. Since we found that we could sometimes make effective use of larger numbers of processors, most of our data was collected on the simulated parallel system. For small numbers of processors (1-4) we had good agreement (within 10%, usually within 2%) between the speed-ups obtained on the Ultracomputer and under simulation (see footnote 4).</Paragraph> </Section> </Paper>