<?xml version="1.0" standalone="yes"?>
<Paper uid="A92-1030">
  <Title>XTAG - A Graphical Workbench for Developing Tree-Adjoining Grammars*</Title>
  <Section position="3" start_page="223" end_page="226" type="metho">
    <SectionTitle>
2 XTAG Components
</SectionTitle>
    <Paragraph position="0"> The communication with the user is centralized around the interface manager window (see Figure 2), which gives the user control over the different modules of XTAG.</Paragraph>
    <Paragraph position="2"> This window displays the contents of the tree buffers currently loaded into the system. The different functions of XTAG are available by means of a series of pop-up menus associated with buttons, and by means of mouse actions performed on the mouse-sensitive items (such as the tree buffer names and the tree names).</Paragraph>
    <Paragraph position="3"> A tree editor for a tree contained in one of the tree buffers displayed in the window can be called up by clicking on its tree name. Each tree editor manages one tree, and as many tree editors as needed can run concurrently. For example, the window in Figure 2 holds a set of files (such as Tnx0Vs1.trees) 3, each of which contains trees (such as αnx0Vs1). When this tree is selected for editing, the window shown in Figure 3 is displayed. Files can be handled independently or in groups, in which case they form a tree family (flag F next to a buffer name).</Paragraph>
    <Paragraph position="4"> All the editing and visualizing operations are performed through this window (see Figure 3). Some of them are: * Add and edit nodes.</Paragraph>
    <Paragraph position="5"> * Copy, paste, move or delete subtrees.</Paragraph>
    <Paragraph position="6"> * Combine two trees with adjunction or substitution. These operations keep track of the derivational history and update attributes stated in the form of feature structures, as defined in the framework of unification-based tree-adjoining grammar \[Vijay-Shanker and Joshi, 1988\].</Paragraph>
    <Paragraph position="7"> * View the derivational history of a derived tree and its components (elementary trees).</Paragraph>
    <Paragraph position="8"> * Display and edit feature structures.</Paragraph>
    <Paragraph position="9"> * PostScript printing of the tree.</Paragraph>
    <Paragraph position="10">  reflect the structure of the trees and they can be ignored by the reader.</Paragraph>
    <Paragraph position="11"> XTAG uses a centralized clipboard for all binary operations on trees (all operations are either unary or binary). These operations (such as paste, adjoin or substitute) are always performed between the tree contained in XTAG's clipboard and the current tree. The contents of the clipboard can be displayed in a special view-only window.</Paragraph>
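    <Paragraph> The clipboard discipline described above can be sketched as follows. This is a minimal illustration in Python, not XTAG's Common Lisp implementation: the names Node, Workbench, copy_to_clipboard and substitute are hypothetical, and feature structures and the derivational history are left out. It shows only that a binary operation always takes the clipboard tree as one operand and the current tree as the other.
```python
import copy
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A tree node; `subst` marks a substitution site (hypothetical representation)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    subst: bool = False


class Workbench:
    """Holds one shared clipboard; binary operations combine it with the current tree."""

    def __init__(self) -> None:
        self.clipboard: Optional[Node] = None

    def copy_to_clipboard(self, tree: Node) -> None:
        # Store a deep copy so later edits to `tree` do not alter the clipboard.
        self.clipboard = copy.deepcopy(tree)

    def substitute(self, current: Node) -> bool:
        """Fill the first matching substitution site of `current` with the clipboard tree.

        Returns True if a site with the same label as the clipboard root was found."""
        if self.clipboard is None:
            return False
        return self._fill(current)

    def _fill(self, node: Node) -> bool:
        # Depth-first search for a substitution site labelled like the clipboard root.
        for i, child in enumerate(node.children):
            if child.subst and child.label == self.clipboard.label:
                node.children[i] = copy.deepcopy(self.clipboard)
                return True
            if self._fill(child):
                return True
        return False


bench = Workbench()
bench.copy_to_clipboard(Node("NP", [Node("John")]))
sentence = Node("S", [Node("NP", subst=True),
                      Node("VP", [Node("V", [Node("sleeps")])])])
filled = bench.substitute(sentence)
```
    Keeping a single clipboard means every binary operation has the same two operands, so an interface only ever needs to ask the user which tree is current.</Paragraph>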
    <Paragraph position="12"> A request to view the derivational history of a tree resulting from a combining operation opens a view-only window which displays the associated derivation tree. Each node in a derivation tree is mouse-sensitively linked to an elementary tree.</Paragraph>
    <Paragraph position="13"> Since the derivational history of a derived tree depends on the elementary trees which were used to build it, inconsistency in the information displayed to the user could arise if the user attempts to modify an elementary tree which is being used in a derivation. This problem is solved by ensuring that, whenever a modifying operation is requested, full consistency is maintained between all the views. For instance, editing a tree used in a derivation tree will break the link between those two. Thus consistency is maintained between the derived tree and the derivation tree.</Paragraph>
    <Paragraph position="14"> Figure 4 shows an example of a derived tree (leftmost window) with its derivation tree window (middle window) and an elementary tree participating in its derivation (rightmost window).</Paragraph>
    <Paragraph position="15"> As is shown in Figure 3, the tree display module handles the bracketed display of feature structures (in unification-based TAG, each node is associated with two feature structures, top and bottom; see Vijay-Shanker and Joshi \[1988\] for more details). The tree formatting algorithm guarantees that trees that are structural mirror images of one another are drawn such that their displays are reflections of one another \[Chalnick, 1989\]. A unification module handles the updating of feature structures for TAG trees.</Paragraph>
    <Paragraph position="16"> XTAG includes a predictive left-to-right parser for unification-based tree-adjoining grammar \[Schabes, 1991\]. The parser is integrated into XTAG, and derivations are displayed by the interface as illustrated in Figure 4. The parser achieves O(G^2 n^6)-time worst-case behavior, O(G^2 n^4)-time behavior for unambiguous grammars, and linear time for a large class of grammars. The parser uses the following two-pass parsing strategy (originally defined for lexicalized grammars \[Schabes et al., 1988\]), which improves its performance in practice \[Schabes and Joshi, 1990\]: * In the first step, the parser selects the set of structures corresponding to each word in the sentence. Each structure can be considered as encoding a set of 'rules'.</Paragraph>
    <Paragraph position="17"> * In the second step, the parser tries to see whether these structures can be combined to obtain a well-formed structure. In particular, it puts the structures corresponding to arguments into the structures corresponding to predicates, and adjoins, if needed, the auxiliary structures corresponding to adjuncts to what they select (or are selected for).</Paragraph>
    <Paragraph position="18"> This step is performed with the help of a chart in the fashion of Earley-style parsing.</Paragraph>
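    <Paragraph> The first pass of the strategy above can be sketched as a simple lexical filter. The following Python fragment is an illustration only: the lexicon entries, the tree names and the function select_structures are invented for this sketch, and the second pass (the Earley-style chart parser) is not shown.
```python
from typing import Dict, List, Tuple

# Toy lexicon: each word maps to the names of the elementary trees it anchors.
# The entries are invented for illustration only.
LEXICON: Dict[str, List[str]] = {
    "john": ["alpha_NP"],
    "sleeps": ["alpha_nx0V"],
    "often": ["beta_ADVvx", "beta_vxADV"],
}


def select_structures(words: List[str]) -> List[Tuple[int, str, str]]:
    """First pass: keep only the trees anchored by a word of the input.

    Each selected tree is returned together with the position of its anchor,
    which a second pass could use to constrain how the trees are combined."""
    selected = []
    for position, word in enumerate(words):
        for tree_name in LEXICON.get(word.lower(), []):
            selected.append((position, word, tree_name))
    return selected


selected = select_structures(["John", "often", "sleeps"])
```
    Here the ambiguous word contributes two candidate trees and every other tree of the grammar is ignored, which is the filtering effect the first step is meant to achieve.</Paragraph>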
    <Paragraph position="20"> Figure 4: left, a derived tree, middle, its derivation, right, an elementary tree participating in the derivation.</Paragraph>
    <Paragraph position="21"> The first step enables the parser to select a relevant subset of the entire grammar, since only the structures associated with the words in the input string are selected for the parser. The number of structures filtered during this pass depends on the nature of the input string and on characteristics of the grammar such as the number of structures, the number of lexical entries, the degree of lexical ambiguity, and the languages it defines. In the second step, since the structures selected during the first step encode the morphological value of their words (and therefore their position in the input string), the parser is able to use non-local bottom-up information to guide its search. The encoding of the value of the anchor of each structure constrains the way the structures can be combined. This information is particularly useful for a top-down component of the parser \[Schabes and Joshi, 1990\].</Paragraph>
    <Paragraph position="22"> XTAG provides all the utilities required for designing a lexicalized TAG structured as in Schabes et al. \[1988\].</Paragraph>
    <Paragraph position="23"> All the syntactic concepts of lexicalized TAG (such as the grouping of the trees in tree families which represents the possible variants on a basic subcategorization frame) are accessible through mouse-sensitive items. Also, all the operations required to build a grammar (such as load trees, define tree families, load syntactic and morphological lexicon) can be predefined with a macro-like language whose instructions can be loaded from a file (see Figure 5).</Paragraph>
    <Paragraph position="24"> The grammar writer also has the option to manually test a derivation by simulating adjoining or substitution of trees that are associated with words defined in the lexicon.</Paragraph>
    <Paragraph position="25"> The grammar consists of an English morphological analyzer and a syntactic lexicon, which is the domain of structural choice, subcategorization and selectional information. Lexical items are defined by the tree structure or the set of tree structures they select.</Paragraph>
    <Paragraph position="26">  defining a grammar.</Paragraph>
    <Paragraph position="27"> The morphological lexicons for English \[Karp et al., 1992\] were built with PC-KIMMO's implementation of two-level morphology \[Antworth, 1990\] and with the 1979 edition of Collins English Dictionary. They comprise 75 000 stems deriving 280 000 inflected forms.</Paragraph>
    <Paragraph position="28"> XTAG also comes with a tree-adjoining grammar for English \[Abeillé et al., 1990a\] which covers a large range of linguistic phenomena.</Paragraph>
    <Paragraph position="29"> The entries for lexical items of all types belong to the syntactic lexicon and are marked with features to constrain the form of their arguments. For example, a verb which takes a sentential argument uses features to constrain the form of the verb acceptable in the complement clause. An interesting consequence of TAG's extended domain of locality is that features imposed by a clausal lexical item can be stated directly on the subject node as well as on the object node. These features need not be percolated through the VP node as in context-free formalisms.</Paragraph>
    <Paragraph position="30"> When a word can have several structures, corresponding to different meanings, it is treated as several lexical items with different entries in the syntactic lexicon. Morphologically, such items can have the same category and the same entry in the morphological lexicon 4. Examples  The choice of a graphical package was motivated by considerations of portability, efficiency, homogeneity and ease of maintenance. XTAG was built using Common Lisp and its X Window interface CLX.</Paragraph>
    <Paragraph position="31"> We chose this rather low-level approach to realize the interface, as opposed to the use of a higher-level toolkit for graphic interface design, because the few available tools that fulfilled our requirements for portability, homogeneity and ease of maintenance were still under development at the beginning of the design of XTAG. The first package we considered was Express Window.</Paragraph>
    <Paragraph position="32"> It attracted our attention because it was created precisely to run programs developed on the Symbolics machine in other Common Lisp environments. It is an implementation of most of the Flavors and graphic primitives of the Symbolics system in terms of, respectively, the Common Lisp Object System (CLOS) \[Keene, 1988\] primitives and CLX \[Scheifler and Lamott, 1989\]. We did not use it mainly because it was known to work only with the PCL version from Xerox PARC (we wanted to remain as compatible as possible across the different dialects of Common Lisp) and was not robust enough.</Paragraph>
    <Paragraph position="33"> the following examples, a tree family is associated with each string of parts of speech (POS).</Paragraph>
    <Paragraph position="34"> Although WINTERP has many interesting points for our purpose, we did not choose it because we wanted to have a complete and efficient (i.e. compiled) Common Lisp implementation. WINTERP is an interpretive, interactive environment for rapid prototyping and application writing using the OSF Motif toolkit \[Young, 1990\]. It uses a mini-Lisp interpreter (called XLISP; it is not available as a compiler) to glue together various C-implemented primitive operations (Xlib \[Nye, 1988\] and Xtk \[Asente and Swick, 1990\]) in a Smalltalk-like \[Goldberg, 1983\] object system (widgets are a first-class type). WINTERP has no copyright restrictions.</Paragraph>
    <Paragraph position="35"> Initially we were attracted by GARNET \[Meyers et al., 1990\], mainly because it is advertised as look-and-feel independent and because it is implemented using only Common Lisp and CLX (but not CLOS, nor any existing X toolkit such as Xtk or Motif). The system is composed of two parts: (1) a toolkit offering objects (prototype-instance model \[Lieberman, 1986\]) and constraints, and (2) automatic run-time maintenance of properties of the graphic objects based on a semantic network. The different behavior of the interface components is specified by binding high-level interactor objects to the graphic objects. An interface builder tool (Lapidary) allows the drawing of the graphic aspects of an interface. However, we did not use GARNET because the version available at the time of the design of XTAG was very large and slow, and still subject to changes of design. Furthermore, Carnegie Mellon University retained the copyrights, which slowed the distribution.</Paragraph>
    <Paragraph position="36"> PICASSO \[Schank et al., 1990\], a more recent package from the University of California at Berkeley, offers functionality similar to that of application frameworks not based on Common Lisp, such as InterViews \[Linton et al., 1989\], MacApp \[Schmuker, 1986\] and Smalltalk \[Goldberg, 1983\], but is freely distributed. It is an object-oriented system implemented in Common Lisp, CLX and CLOS.</Paragraph>
    <Paragraph position="37"> With each type of PICASSO object is associated a CLOS class, the instances of which have different graphic and interactive properties. The widgets implementing those properties are automatically created during the creation of a PICASSO object. Unlike the two previous systems, PICASSO objects may be shared in an external database common to different applications (persistent classes); when this is enabled, PICASSO requires the use of the database management system INGRES \[Charness and Rowe, 1989\]. PICASSO was not available as a released package at the time the implementation of XTAG started.</Paragraph>
  </Section>
  <Section position="4" start_page="226" end_page="227" type="metho">
    <SectionTitle>
4 Programming considerations
</SectionTitle>
    <Paragraph position="0"> All the graphic objects of XTAG are defined as contacts and are implemented using only the structures of Common Lisp and their simple inheritance mechanism.</Paragraph>
    <Paragraph position="1"> Because of the relatively low computing cost associated with the contacts, we have been able to define every graphic object of XTAG (whatever its complexity) as a contact, without having to resort to a different, procedure-oriented implementation for simpler objects, as was done in InterViews with the painter objects \[Linton et al., 1989\].</Paragraph>
    <Paragraph position="2"> The programming difficulties we have encountered concern re-synchronizing XTAG with the server during a transfer of control between contacts (the active contact is the one containing the cursor). These difficulties stem from the asynchronous nature of the communication protocol of X and from the large number of events that mouse motion may generate when the cursor is moved over closely located windows. The fact that windows may be positioned anywhere and stacked in any order (overlapping windows) makes the handling of those transitions a nontrivial task. A careful choice of the event masks attached to the windows is by itself insufficient to solve the problem. To limit the number of queries made to the server, we use extensive caching of graphic properties. The structures implementing the contacts contain fields that duplicate server information. They are updated when the graphic properties of the object they describe are changed. We found this strategy to improve the performance noticeably. This feature can easily be turned off in case a particular X terminal or workstation provides hardware support for caching.</Paragraph>
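    <Paragraph> The caching strategy described above can be illustrated with a small sketch. It is written in Python rather than Common Lisp and does not use CLX: FakeServer is a stand-in that merely counts round trips, and the names Contact, move_resize, geometry and use_cache are assumptions made for this example.
```python
from typing import Dict, Tuple

Geometry = Tuple[int, int, int, int]  # x, y, width, height


class FakeServer:
    """Stand-in for an X server that counts queries. Not a real X binding."""

    def __init__(self) -> None:
        self.geometry: Dict[int, Geometry] = {}
        self.queries = 0

    def set_geometry(self, window_id: int, geom: Geometry) -> None:
        self.geometry[window_id] = geom

    def get_geometry(self, window_id: int) -> Geometry:
        self.queries += 1  # each call models one round trip to the server
        return self.geometry[window_id]


class Contact:
    """A graphic object that keeps a local duplicate of server-side properties."""

    def __init__(self, server: FakeServer, window_id: int,
                 geom: Geometry, use_cache: bool = True) -> None:
        self.server = server
        self.window_id = window_id
        self.use_cache = use_cache
        self._geom = geom
        server.set_geometry(window_id, geom)

    def move_resize(self, geom: Geometry) -> None:
        # Update the server and the cached field together so they never diverge.
        self.server.set_geometry(self.window_id, geom)
        self._geom = geom

    def geometry(self) -> Geometry:
        # With caching on, reads never reach the server.
        if self.use_cache:
            return self._geom
        return self.server.get_geometry(self.window_id)


server = FakeServer()
cached = Contact(server, 1, (0, 0, 100, 50))
cached.move_resize((10, 10, 100, 50))
geom = cached.geometry()                       # served from the local field
uncached = Contact(server, 2, (0, 0, 20, 20), use_cache=False)
uncached.geometry()                            # costs one server query
```
    The use_cache flag plays the role of the switch mentioned in the text: turning it off sends every read back to the server.</Paragraph>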
    <Paragraph position="3"> While we paid a lot of attention to the issue of portability, we did not worry about look independence, limiting the user's options in this domain to geometric dimension parameterization and font selection by means of a configuration file and a few menus.</Paragraph>
    <Paragraph position="4"> Our current implementation uses the twm window manager, but another window manager could also be used. We found a need for multi-font string support in XTAG because the tree names and node labels require a mix of at least two or three fonts (ASCII symbols, Greek symbols such as α, β and ε, and a font for subscripts). We could have used a font which contains all the characters (subscripts may use the same font as the normal characters, since they differ from those only by their location relative to the writing line), but we preferred to define a multi-font composed of several existing fonts (which can be customized by the user) for portability purposes and to leave open the way for future extensions.</Paragraph>
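    <Paragraph> One way to picture a multi-font string composed of several existing fonts is as a sequence of runs, each tagged with a font role that a user-editable table resolves to a concrete font. The Python sketch below is an assumption-laden illustration: the role names, the font names in FONT_TABLE and the helper functions are invented here and do not describe XTAG's actual data structures.
```python
from typing import Dict, List, Tuple

# Role-to-font table; the font names are placeholders a user could customize.
FONT_TABLE: Dict[str, str] = {
    "ascii": "fixed",
    "greek": "symbol",
    "subscript": "fixed-small",
}

Run = Tuple[str, str]  # (font role, text)


def plain_text(runs: List[Run]) -> str:
    """Concatenate the text of all runs, ignoring fonts."""
    return "".join(text for _, text in runs)


def fonts_needed(runs: List[Run]) -> List[str]:
    """Resolve each run's role to a concrete font name via FONT_TABLE."""
    return [FONT_TABLE[role] for role, _ in runs]


# A tree name mixing a Greek letter, ASCII characters and a subscript.
tree_name: List[Run] = [("greek", "\u03b1"), ("ascii", "nx0Vs"), ("subscript", "1")]
```
    Because only the table names concrete fonts, swapping a font on a different platform touches one entry rather than every string.</Paragraph>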
    <Paragraph position="5"> In order to be able to scroll over the trees when they are too big to be displayed in a window, every tree editor window is associated with an eight-direction touch-pad (inspired by the mover of InterViews \[Linton et al., 1989\]).</Paragraph>
    <Paragraph position="6"> The nodes displayed in the window of a tree editor are not sensitive to the presence of the cursor; they react only to mouse button clicks. In earlier versions of XTAG we highlighted the visited node with a border, but this required too much overhead because of the numerous cursor motions over the tree window which occur during editing.</Paragraph>
    <Paragraph position="7"> The text editing tasks we had to implement fall into two classes: * short line editing requiring multi-fonts (e.g. editing of node names); * text editing not requiring multi-fonts (e.g. multi-line comments, unification equations).</Paragraph>
    <Paragraph position="8"> For the former, we implemented all the editing functions ourselves because they do not require much processing and multi-font support was unavailable. For the latter, we used system calls to an external editor (emacs in our case).</Paragraph>
    <Paragraph position="9"> Concerning the programming task, we would have liked to have tools available to help us write an X application in Common Lisp at a level slightly higher than that of the CLX interface, without going up to the level of elaborate toolkits like GARNET or PICASSO, which imply the use of a complex infrastructure; perhaps something like an incremental or graded toolkit. Our next development efforts will be concerned with introducing parallelism in the interface (actors), adding new features like an undo mechanism (using the Item list data structure proposed by Dannenberg \[1990\]), and extending XTAG to handle meta-rules \[Becker, 1990\] and Synchronous TAGs \[Shieber and Schabes, 1990b\], which are used for automatic translation \[Abeillé et al., 1990b\] and semantic interpretation \[Shieber and Schabes, 1990a; Shieber and Schabes, 1991\].</Paragraph>
  </Section>
  <Section position="5" start_page="227" end_page="227" type="metho">
    <SectionTitle>
5 Requirements for Running XTAG
</SectionTitle>
    <Paragraph position="0"> XTAG has been tested on UNIX-based machines (R4.4) (SPARC station 1, SPARC station SLC, HP BOBCATs series 9000 and SUN 4, also with NCD X terminals) running X11R4 and CLX with Lucid Common Lisp (4.0) and Allegro Common Lisp (4.0.1).</Paragraph>
  </Section>
</Paper>