<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-2121">
  <Title>Semantic Network Array Processor as a Massively Parallel Computing Platform for High Performance and Large-Scale Natural Language Processing*</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. SNAP Architecture
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1. Design Philosophy of SNAP
</SectionTitle>
      <Paragraph position="0"> The Semantic Network Array Processor (SNAP) is a highly parallel array processor fully optimized for semantic network processing with a marker-passing mechanism. The fundamental design decisions are (1) a semantic network as a knowledge representation scheme, and (2) parallel marker-passing as an inference mechanism.</Paragraph>
      <Paragraph position="1"> First, the use of a semantic network as a representation scheme can be justified from the fact that most of the representation schemes of current AI and NLP theories (such as frame, feature structure, sort hierarchy, systemic choice network, neural network, etc.) can be mapped onto semantic networks. Also, there are a number of systems and models which directly use semantic networks [Sowa, 1991].</Paragraph>
      <Paragraph position="2"> Second, the use of marker-passing can be justified from several aspects. Obviously, there are many AI and NLP models which use some form of marker-passing as the central computing principle. For example, a significant amount of research has been done on word-sense disambiguation, as seen in [Waltz and Pollack, 1985], [Hendler, 1988], [Hirst, 1986], [Charniak, 1983], [Tomabechi, 1987], etc. All of them assume the passing of markers or values among nodes interconnected via some types of links. There are studies to handle syntactic constraints using some type of networks which can be mapped onto semantic networks. Recent studies on Classification-Based Parsing [Kasper, 1989] and the Systemic Choice Network [Carpenter and Pollard, 1991] assume hierarchical networks to represent various linguistic constraints, and the search on these networks can be done by marker-passing. Also, there are more radical approaches to implement entire natural language systems using parallel marker-passing, as seen in [Norvig, 1986], [Riesbeck and Martin, 1985], [Tomabechi, 1987], and [Kitano, 1991]. There are, however, differences in the types of information carried in each marker-passing model. We will describe our design decisions later.</Paragraph>
      ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 / PROC. OF COLING-92, NANTES, AUG. 23-28, 1992
      <Paragraph position="5"> As reported in [Evett, et al., 1990], however, serial machines are not suitable for such processing because performance degrades as the size of the semantic network increases. There is a clear need for highly parallel machines. The rest of this section provides a brief overview of the SNAP architecture.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2. The Architecture
</SectionTitle>
      <Paragraph position="0"> SNAP consists of a processor array and an array controller (Figure 1). The processor array has processing cells which contain the nodes and links of a semantic network. The SNAP array consists of 160 processing elements, each of which consists of a TMS320C30 DSP chip, local SRAM, etc. Each processing element stores 1024 nodes which act as virtual processors. They are interconnected via a modified hypercube network. The SNAP controller interfaces the SNAP array with a SUN 3/280 host and broadcasts instructions to control the operation of the array. The instructions for the array are distributed through a global bus by the controller. Propagation of markers and the execution of other instructions can be processed simultaneously.</Paragraph>
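      The array organization above can be sketched numerically. The processing-element count (160) and virtual nodes per element (1024) come from the text; the node-to-PE assignment scheme below is an assumption for illustration, not SNAP's actual allocation policy.

```python
# Sketch of the SNAP array capacity and a hypothetical node-to-PE mapping.
# Figures from the paper: 160 physical PEs, 1024 virtual nodes per PE.
NUM_PES = 160
NODES_PER_PE = 1024

def node_to_pe(node_id: int) -> int:
    """Assign a virtual node to a physical processing element (assumed scheme)."""
    if not 0 <= node_id < NUM_PES * NODES_PER_PE:
        raise ValueError("node id out of range")
    return node_id // NODES_PER_PE

print(NUM_PES * NODES_PER_PE)  # total virtual node capacity: 163840
print(node_to_pe(2048))        # node 2048 lives on PE 2
```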
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3. Parallel Marker-Passing
</SectionTitle>
      <Paragraph position="0"> In SNAP, the contents of a marker are: (1) a bit-vector, (2) an address, and (3) a numeric value (integer or floating point). In SNAP, the size of the marker is fixed. According to the classification in [Blelloch, 1986], our model is a kind of Finite Message Passing. There are types of marker- (or message-) passing that propagate feature structures (or graphs), which are called Unbounded Message Passing. Although we have extended our marker-passing model from traditional bit marker-passing to complex marker-passing which carries bits, an address, and numeric values, we decided not to carry unbounded messages.</Paragraph>
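      The fixed-size marker can be sketched as a small record. The three field categories (bit-vector, address, numeric value) are from the text; the concrete field widths and representation are assumptions for illustration.

```python
from dataclasses import dataclass

# Sketch of SNAP's fixed-size marker. The paper fixes only the three
# categories of content; packing marker flags into an int is an assumption.
@dataclass(frozen=True)
class Marker:
    bits: int      # bit-vector of marker flags
    address: int   # node address (e.g. pointer to the source of activation)
    value: float   # numeric value (integer or floating point)

m = Marker(bits=0b0101, address=42, value=1.5)
print(m.bits & 0b0001)  # test flag 0 of the bit-vector
```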
      <Paragraph position="1"> This is because propagation of feature structures and heavy symbolic operations at each PE are not practical assumptions to make, at least on current massively parallel machines, due to processor power, memory capacity on each PE, and the communication bottleneck. Propagation of feature structures would impose serious hardware design problems since the size of the message is unbounded, which means that the designer cannot be sure whether the local memory size is sufficient until the machine actually runs some applications. Also, PEs capable of performing operations to manipulate these messages (such as unification) would be large in physical size, which causes assembly problems when thousands of processors are to be assembled into one machine. Since we decided not to support unbounded message passing, we decided to support the functionality attained by unbounded message passing by other means such as sophisticated marker control rules, dynamic network modifications, etc.</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.4. Instruction Sets
</SectionTitle>
      <Paragraph position="0"> A set of 30 high-level instructions specific to semantic network processing is implemented directly in hardware. These include associative search, marker setting and propagation, logical/arithmetic operations involving markers, creating and deleting nodes and relations, and collecting a list of nodes with a certain marker set. Currently, the instruction set can be called from the C language, so that users can develop applications with an extended version of C.</Paragraph>
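      Two of the instruction categories named above, marker setting and collecting marked nodes, can be sketched as data-parallel operations over per-node marker bit-vectors. The function names and data layout here are illustrative assumptions; they are not the real SNAP instruction set.

```python
# Hypothetical sketch of SNAP-style array instructions, modeled as
# operations over {node: marker_bitvector}. Names are illustrative.

def set_marker(markers, nodes, bit):
    """Set marker `bit` on every node in `nodes` (broadcast-style)."""
    for n in nodes:
        markers[n] = markers.get(n, 0) | (1 << bit)

def collect(markers, bit):
    """Collect the list of nodes with marker `bit` set."""
    return sorted(n for n, m in markers.items() if m & (1 << bit))

markers = {}
set_marker(markers, ["USC", "CMU"], bit=3)
print(collect(markers, bit=3))  # ['CMU', 'USC']
```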
      <Paragraph position="1"> From the programming level, SNAP provides a data-parallel programming environment similar to C* of the Connection Machine [Thinking Machines Corp., 1989], but specialized for semantic network processing with marker passing.</Paragraph>
      <Paragraph position="2"> Particularly important are the marker propagation rules. Several marker propagation rules are provided to govern the movement of markers. Marker propagation rules enable us to implement guided (or constrained) marker passing as well as unguided marker passing. This is done by specifying the types of links through which markers can propagate. The following are some of the propagation rules of SNAP: Seq(r1, r2): The Seq (sequence) propagation rule allows the marker to propagate through r1 once and then to r2.</Paragraph>
      <Paragraph position="3"> Spread(r1, r2): The Spread propagation rule allows the marker to travel through a chain of r1 links and then r2 links.</Paragraph>
      <Paragraph position="5"> rule allows the marker to propagate through all r1 and r2 links without limitation.</Paragraph>
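      The Seq and Spread rules can be simulated on a graph of typed links. The rule semantics follow the text (Seq crosses one r1 link then one r2 link; Spread follows a chain of r1 links then a chain of r2 links); the graph representation and helper names are assumptions, not SNAP code.

```python
# Minimal simulation of two SNAP marker-propagation rules over a graph
# stored as {node: [(link_type, dest), ...]}.

def step(graph, nodes, link):
    """One propagation step: cross every `link`-typed edge once."""
    return {d for n in nodes for (t, d) in graph.get(n, []) if t == link}

def chain(graph, nodes, link):
    """Transitive closure along `link`-typed edges (includes start nodes)."""
    seen = set(nodes)
    frontier = set(nodes)
    while frontier:
        frontier = step(graph, frontier, link) - seen
        seen |= frontier
    return seen

def seq(graph, start, r1, r2):
    """Seq(r1, r2): propagate through r1 once, then r2 once."""
    return step(graph, step(graph, {start}, r1), r2)

def spread(graph, start, r1, r2):
    """Spread(r1, r2): a chain of r1 links, then a chain of r2 links."""
    return chain(graph, chain(graph, {start}, r1), r2)

g = {"a": [("isa", "b")], "b": [("isa", "c"), ("role", "x")],
     "c": [("role", "y")]}
print(seq(g, "a", "isa", "role"))             # {'x'}
print(sorted(spread(g, "a", "isa", "role")))  # ['a', 'b', 'c', 'x', 'y']
```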
      <Paragraph position="6"> 2.5. Knowledge Representation on SNAP
      SNAP provides four knowledge representation elements: node, link, node color, and link value. These elements allow a wide range of knowledge representation schemes to be mapped onto SNAP. On SNAP, a concept is represented by a node. A relation can be represented either by a node called a relation node or by a link between two nodes. The node color indicates the type of node. For example, when representing USC is in Los Angeles and CMU is in Pittsburgh, we may assign a relation node for IN. The IN node is shared by the two facts. In order to prevent wrong interpretations such as USC in Pittsburgh and CMU in Los Angeles, we assign IN#1 and IN#2 to two distinct IN relations, and group the two relation nodes by a node color IN. Each link has assigned to it a link value which indicates the strength of inter-concept relations. This link value supports probabilistic reasoning and connectionist-like processing. These four basic elements allow SNAP to support virtually any kind of graph-based knowledge representation formalism such as KL-ONE [Brachman and Schmolze, 1985], Conceptual Graphs [Sowa, 1984], KODIAK [Wilensky, 1987], etc.</Paragraph>
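      The four representation elements applied to the USC/CMU example can be sketched as follows. The data layout is an illustration, not SNAP's actual memory format; the fact pair and the colored relation nodes IN#1 and IN#2 come from the text.

```python
# Node colors group the two distinct relation nodes under color IN; link
# values carry the relation strength. Using distinct IN#1/IN#2 nodes
# prevents the wrong readings "USC in Pittsburgh" / "CMU in Los Angeles".
nodes = {  # node -> node color
    "USC": "CONCEPT", "CMU": "CONCEPT",
    "Los Angeles": "CONCEPT", "Pittsburgh": "CONCEPT",
    "IN#1": "IN", "IN#2": "IN",
}
links = [  # (from, to, link value = relation strength)
    ("USC", "IN#1", 1.0), ("IN#1", "Los Angeles", 1.0),
    ("CMU", "IN#2", 1.0), ("IN#2", "Pittsburgh", 1.0),
]

def located_in(relation_node):
    """Follow a distinct IN relation node to its location."""
    return [d for (s, d, w) in links if s == relation_node]

print(located_in("IN#1"))  # ['Los Angeles'] -- no crosstalk with IN#2
```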
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3. The Memory-Based Natural
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Language Processing
</SectionTitle>
      <Paragraph position="0"> Memory-based NLP is the idea of viewing NLP as a memory activity. For example, parsing is considered as a memory-search process which identifies similar cases in the past from the memory, and provides an interpretation based on the identified case. It can be considered as an application of Memory-Based Reasoning (MBR) [Stanfill and Waltz, 1986] and Case-Based Reasoning (CBR) [Riesbeck and Schank, 1989] to NLP. This view, however, counters the traditional idea of viewing NLP as an extensive rule application process to build up a meaning representation. Some models have been proposed in this direction, such as Direct Memory Access Parsing (DMAP) [Riesbeck and Martin, 1985] and ΦDMDIALOG [Kitano, 1991]. For arguments concerning the superiority of the memory-based approach over the traditional approach, see [Nagao, 1984], [Riesbeck and Martin, 1985], and [Sumita and Iida, 1991].</Paragraph>
      <Paragraph position="1"> DMSNAP is a SNAP implementation of the ΦDMDIALOG speech-to-speech dialogue translation system, which is based, in part, on the memory-based approach. Naturally, it inherits basic ideas and mechanisms of the ΦDMDIALOG system, such as a memory-based approach to natural language processing and parallel marker-passing. A syntactic constraint network is introduced in DMSNAP, whereas ΦDMDIALOG has assumed a unification operation to handle linguistic processing.</Paragraph>
      <Paragraph position="2"> DMSNAP consists of the memory network, the syntactic constraint network, and markers to carry out inference. The memory network and the syntactic constraint network are compiled from a set of grammar rules written for DMSNAP.</Paragraph>
      <Paragraph position="3"> Memory Network on SNAP The major types of knowledge required for language translation in DMSNAP are: a lexicon, a concept type hierarchy, concept sequences, and syntactic constraints. Among them, the syntactic constraints are represented in the syntactic constraint network, and the rest of the knowledge is represented in the memory network. The memory network consists of various types of nodes such as concept sequence classes (CSC), lexical item nodes (LEX), concept nodes (CC), and others. Nodes are connected by a number of different links such as concept abstraction links (ISA), expression links for both source language and target language (ENG and JPN), role links (ROLE), constraint links (CONSTRAINT), contextual links (CONTEXT), and others. A part of the memory network is shown in Figure 2.</Paragraph>
      <Paragraph position="4"> Markers The processing of natural language on a marker-propagation architecture requires the creation and movement of markers on the memory network.</Paragraph>
      <Paragraph position="5"> The following types of markers are used: (1) A-Markers indicate activation of nodes. They propagate through ISA links upward, and carry a pointer to the source of activation and a cost measure. (2) P-Markers indicate the next possible nodes to be activated. They are initially placed on the first element nodes of the CSCs, and move through NEXT links when they collide with A-Markers at the element nodes. (3) G-Markers indicate activation of nodes in the target language. They carry pointers to the lexical node to be lexicalized, and propagate through ISA links upward. (4) V-Markers indicate the current state of the verbalization. When a V-Marker collides with a G-Marker, the surface string (which is specified by the pointer in the G-Marker) is verbalized. (5) C-Markers indicate contextual priming. Nodes with C-Markers are contextually primed. A C-Marker moves from the designated contextual root node to other contextually relevant nodes through contextual links. (6) SC-Markers indicate active syntactic constraints, and nodes primed and/or inhibited by currently active syntactic constraints. They also carry pointers to specific nodes. There are some other markers used for control and timing; they are not described here.</Paragraph>
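      The A-Marker / P-Marker collision on a concept sequence class can be sketched as follows. This is a deliberately simplified illustration (one CSC, no ISA climbing, no cost measures); the class and method names are assumptions, and the example concept sequence is invented.

```python
# Sketch of A-/P-Marker interaction on a concept sequence class (CSC):
# a P-Marker waits on the next expected element; when an A-Marker
# (activation) collides with it, the P-Marker advances via the NEXT link.
class CSC:
    def __init__(self, elements):
        self.elements = elements  # expected concept sequence
        self.p = 0                # P-Marker position (next expected element)

    def activate(self, concept):
        """Deliver an A-Marker; returns True when the sequence completes."""
        if self.p < len(self.elements) and self.elements[self.p] == concept:
            self.p += 1           # collision: move P-Marker through NEXT
        return self.p == len(self.elements)

csc = CSC(["PERSON", "BUILD", "ARTIFACT"])
print(csc.activate("PERSON"))    # False
print(csc.activate("BUILD"))     # False
print(csc.activate("ARTIFACT"))  # True -- the CSC is recognized
```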
      <Paragraph position="6"> The parsing algorithm is similar to a shift-reduce parser, except that our algorithm handles ambiguities, parallel processing of each hypothesis, and top-down prediction of possible next input symbols. The generation algorithm implemented on SNAP is a version of the lexically guided bottom-up algorithm described in [Kitano, 1990]. Details of the algorithm are described in [Kitano et al., 1991b].</Paragraph>
      <Paragraph position="7"> DMSNAP can handle various linguistic phenomena such as: lexical ambiguity, structural ambiguity, referencing (pronoun reference, definite noun reference, etc.), control, and unbounded dependencies. Linguistically complex phenomena are handled using the syntactic constraint network (SCN). The SCN enables DMSNAP to process sentences involving unbounded dependencies and control without passing feature structures. Details of the SCN are described in [Kitano et al., 1991b]. One notable feature of DMSNAP is its capability to parse and translate sentences in context. In other words, DMSNAP can store results of previous sentences and resolve various levels of ambiguity using the contextual information. Examples of sentences which DMSNAP can handle are shown below.</Paragraph>
      <Paragraph position="8"> It should be noted that each example consists of a set of sentences (not a single sentence isolated from the context) in order to demonstrate the contextual processing capability of DMSNAP.</Paragraph>
      <Paragraph position="9">  s4 Dan planned to develop a parallel processing computer.</Paragraph>
      <Paragraph position="10"> s5 Eric built a SNAP simulator.</Paragraph>
      <Paragraph position="11"> s6 Juntae found bugs in the simulator.</Paragraph>
      <Paragraph position="12"> s7 Dan tried to persuade Eric to help Juntae modify the simulator.</Paragraph>
      <Paragraph position="13"> s8 Juntae solved a problem with the simulator.</Paragraph>
      <Paragraph position="14"> s9 It was the bug that Juntae mentioned.</Paragraph>
      <Paragraph position="15">  These example sentences are not all the sentences which DMSNAP can handle. Currently, DMSNAP handles a substantial portion of the ATR conference registration domain (vocabulary 450 words, 329 sentences) and sentences from other corpora.</Paragraph>
      <Paragraph position="16"> The following are examples of translations into Japanese generated by DMSNAP for the first set.  DMSNAP completes the parsing on the order of milliseconds. Table 1 shows parsing times for some of the example sentences.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4. Classification-Based Parsing
</SectionTitle>
    <Paragraph position="0"> Classification-Based Parsing is a new parsing model proposed in [Kasper, 1989]. In classification-based parsing, feature structures are indexed in a hierarchical network, and the unifiability of two feature structures is tested by searching for the Most Specific Subsumer (MSS). Unification, a computationally expensive operation which is the computational bottleneck of many parsing systems, is replaced by search in the lattice of pre-indexed feature structures.</Paragraph>
    <Paragraph position="1"> For example, in Figure 3, the feature structure F3 is the result of a successful unification of the feature structures F1 and F2 (F3 = F1 ⊔ F2). All feature structures are pre-indexed in a lattice so that unification is replaced by an intersection search in the lattice with complex indexing. To carry out a search, we first set distinct markers on the feature structures F1 and F2; for example, set marker M1 on F1, and M2 on F2. Then markers M1 and M2 propagate upward in the lattice. M1 and M2 first co-exist at F3.</Paragraph>
    <Paragraph position="2"> The simplest program (without handling disjunctions and conjunctions) for this operation follows:</Paragraph>
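    The original program listing did not survive in this copy of the paper. The following Python sketch (not SNAP's extended C) reproduces the operation as described in the text: mark F1 with M1 and F2 with M2, propagate both markers upward through the subsumption lattice, and keep the most specific nodes where both markers co-exist. The lattice encoding and function names are assumptions.

```python
# MSS search via two-marker upward propagation over a subsumption lattice
# stored as {node: [more-general parents]}.
def mss(up, f1, f2):
    def reach(start):
        """Upward closure of `start`: all nodes a marker placed there visits."""
        seen, frontier = {start}, [start]
        while frontier:
            n = frontier.pop()
            for p in up.get(n, []):
                if p not in seen:
                    seen.add(p)
                    frontier.append(p)
        return seen
    common = reach(f1) & reach(f2)  # nodes where M1 and M2 co-exist
    # keep only the most specific of the common subsumers
    return {n for n in common
            if not any(n != m and n in reach(m) for m in common)}

lattice = {"F1": ["F3"], "F2": ["F3"], "F3": ["TOP"]}
print(mss(lattice, "F1", "F2"))  # {'F3'}
```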
    <Paragraph position="4"> Of course, the nodes for each feature structure may need to be found from a set of features, instead of by direct marking. In such a case, a set of markers is propagated from each node representing each feature, and disjunctions and conjunctions are taken at all nodes representing a feature structure root. This operation can be data-parallel.</Paragraph>
    <Paragraph position="5"> There are several motivations for using classification-based parsing, some of which are described in [Kasper, 1989]. The efficiency consideration is one of the major reasons. Since over 80% of parsing time has been consumed by unification operations, replacing unification with a faster and functionally equivalent method would substantially benefit the overall performance of the system. Classification-based parsing is efficient because (1) it maximizes structure sharing, (2) it utilizes indexing dependencies, and (3) it avoids redundant computations. However, these advantages of classification-based parsing have not been fully realized because it was implemented on a serial machine. This is because a search on a complex index lattice would be computationally expensive for serial machines. Actually, the time-complexity of the sequential classification algorithm is O(Mn^2), and that of the retrieval algorithm is O(R_rc log M), where M is the number of concepts, n is the average number of property links per concept, and R_rc is the average number of roleset relations for one concept. We can, however, circumvent this problem by using SNAP. Theoretically, the time-complexity of classification on SNAP is O(log_{F_out} M), and that of the parallel retrieval is O(F_in D_avg + F_out), where F_out is the average fan-out (average number of subconcepts for one concept), F_in is the average fan-in (average number of superconcepts for one concept), and D_avg is the average depth of the concept hierarchy [Kim and Moldovan, 1990].</Paragraph>
    <Paragraph position="6"> In our model, possible feature structures are pre-computed and indexed using our classification algorithms. While a large set of feature structures needs to be stored and indexed, SNAP provides sufficiently large memory/processor space to load an entire feature structure lattice. It is analogous to the idea behind memory-based parsing, which pre-expands all possible syntactic/semantic structures. Here again, we see the conversion of time-complexity into space-complexity. Figure 4 shows the performance of retrieval from the classification lattice with varying fan-out and size. The clock cycle is 10 MHz. It demonstrates that we can attain microsecond response for each search. Given the fact that the fastest unification algorithm, even on parallel machines, takes over a few milliseconds per unification, the performance obtained in our experiment promises a significant improvement in parsing speed for many unification-based parsers by replacing unification with the classification-based approach.</Paragraph>
    <Paragraph position="7">  Knowledge-Based Machine Translation (KBMT) is based on the premise that intensive use of linguistic and world knowledge would provide high quality automatic translation.</Paragraph>
    <Paragraph position="8"> One of the central knowledge sources of KBMT is the ontological hierarchy, which encodes abstraction hierarchies of concepts in the given domain, property information of each concept, etc. When a parser creates ambiguous parses or when some parts of the meaning representation (as represented in an interlingua) are missing, this knowledge source is accessed to disambiguate or to fill in missing information.</Paragraph>
    <Paragraph position="9"> However, as the size of the domain scales up, access time to the knowledge source grows to the extent that cost-effective bulk processing would not be possible.</Paragraph>
    <Paragraph position="10"> For example, [Evett, et al., 1990] reports that access to large frame systems on serial computers has a time-complexity of O(M × B^d), where M is the number of conjuncts in the query, B is the average branching factor in the network, and d is the depth of the network. Thus, even the simplest form of search takes over 6 seconds on a VLKB with 28K nodes, measured on a single-user-mode VAX super mini-computer. Since such a search on a VLKB must be performed several times for each parse, the performance issue would be a major concern. Considering the fact that VLKB projects such as CYC [Lenat and Guha, 1990] and EDR [EDR, 1988] aim at VLKBs containing over a million concepts, the performance of VLKB search would be an obvious problem in practical use of these VLKBs. On massively parallel machines such as SNAP, we should be able to attain a time-complexity of O(D + M) [Evett, et al., 1990].</Paragraph>
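    The gap between the two cost models can be illustrated numerically. The formulas O(M × B^d) and O(D + M) are quoted from [Evett, et al., 1990] via the text; the concrete values of M, B, d, and D below are assumptions chosen only to show the exponential-versus-additive growth, not figures from the paper.

```python
# Serial vs. parallel VLKB access cost models from the text.
def serial_cost(M, B, d):
    """Serial frame-system access: O(M * B**d), exponential in depth d."""
    return M * B**d

def parallel_cost(D, M):
    """Massively parallel access: O(D + M), additive in depth and query size."""
    return D + M

# Assumed illustrative values: M=3 conjuncts, branching B=4, depth d=8.
print(serial_cost(M=3, B=4, d=8))  # 196608 step units
print(parallel_cost(D=8, M=3))     # 11 step units
```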
    <Paragraph position="11"> We have carried out experiments to measure KB access time on SNAP. Figure 5 shows the search time for various sizes of VLKBs ranging from 800 to 64K nodes. Performance was compared with a SUN-4 and the CM-2 Connection Machine. SNAP-1 consistently outperformed the other machines (the performance curve of SNAP-1 is hard to see in the figure as it exhibited execution times far less than a second).</Paragraph>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
6. Other Approaches
</SectionTitle>
    <Paragraph position="0"> One clear extension of the currently implemented modules is to integrate classification-based parsing and the VLKB search. The classification-based parsing carries out high performance syntactic analysis, and the VLKB search would impose semantic constraints.</Paragraph>
    <Paragraph position="1"> Integration of these two would require SNAP-1 to have multiple controllers, because two different marker control processes need to be mixed and executed at the same time. Currently SNAP-1 has only one controller. This would be one of the major items for the upgrade of the architecture. However, the performance gain from this approach would be significant and its impact could be far reaching, because much current NLP research has been carried out in the framework of the unification-based grammar formalism and uses VLKBs as major knowledge sources.</Paragraph>
    <Paragraph position="2"> A more radical approach, however, rooted in the traditional model, is to fully map typed unification grammars [Emele and Zajac, 1990] onto SNAP. The typed unification grammar is based on the Typed Feature Structure (TFS) [Zajac, 1989] and HPSG [Pollard and Sag, 1987], and represents all objects in TFS.</Paragraph>
    <Paragraph position="3"> Objects include Phrasal Signs, Lexical Signs, general principles such as the &amp;quot;Head Feature Principle&amp;quot; and the &amp;quot;Subcat Feature Principle&amp;quot;, grammar rules such as the &amp;quot;Complement Head Constituent Order Feature Principle&amp;quot; and the &amp;quot;Head Complements Constituent Order Feature Principle&amp;quot;, and lexical entries. The lexical entries can be indexed under the lexical hierarchy. In this approach, all linguistic knowledge is precompiled into a huge network. Parsing and generation are carried out as a search on this network. We have not yet completed a feasibility study for this approach on SNAP. However, as of today, we consider this approach feasible and expect to attain single-digit-millisecond performance in an actual implementation. The dynamic network modification, address propagation, and marker propagation rules are especially useful in implementing this approach.</Paragraph>
    <Paragraph position="4"> Natural language processing models on semantic networks such as [Norvig, 1986], SNePS [Neal and Shapiro, 1987], and TRUMP, KING, ACE, and SCISOR at GE Laboratories [Jacobs, 1991] should fit well with the SNAP-1 architecture. For [Norvig, 1986], SNAP provides floating point numbers to be propagated. As for SNePS, the implementation should be trivial, yet we are not sure of the level of parallelism gained by the SNePS model. If the parallelism is found to be low, a coarse-grain processor may fit well with this model. Although we do not have space to discuss them in this paper, there are, of course, many other NLP and AI models which can be implemented on SNAP.</Paragraph>
  </Section>
</Paper>