<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-2133">
  <Title>Shalom Wintner (1997) An Abstract Machine for Unification Grammars -with Application to an HPSG Grammar for Hebrew. Ph.D. thesis, the Technion - Israel Institute of</Title>
  <Section position="3" start_page="0" end_page="807" type="intro">
    <SectionTitle>
1 Motivation
</SectionTitle>
    <Paragraph position="0"> Inefficiency is the major reason why the HPSG formalism (Pollard and Sag, 1993) has not been used for practical applications. However, one can argue that HPSG may not be inherently inefficient; rather, an efficient implementation of HPSG has not been seriously pursued until now.</Paragraph>
    <Paragraph position="1"> We set a goal for the performance of our HPSG parser: an average parsing time of 100 milliseconds per sentence on real-world corpora. If our HPSG parser achieved this goal, it would be able to parse about 1,000,000 sentences in a day, and could be used for applications such as knowledge acquisition from corpora.</Paragraph>
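    <Paragraph position="2"> As a rough check of this target, the following sketch converts an average per-sentence parsing time into a daily throughput. The name sentences_per_day is a hypothetical helper introduced only for illustration, and the calculation assumes a single process parsing continuously for 24 hours.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def sentences_per_day(avg_parse_ms):
    """Sentences parsed in 24 hours at avg_parse_ms milliseconds each.

    Hypothetical helper: assumes one process parsing without pause.
    The result is rounded down to a whole number of sentences.
    """
    return (SECONDS_PER_DAY * 1000) // avg_parse_ms


if __name__ == "__main__":
    # The 100 ms goal stated above.
    print(sentences_per_day(100))
```

At 100 milliseconds per sentence this gives 864,000 sentences per day, which is the order of magnitude behind the figure of about 1,000,000 quoted above.</Paragraph>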
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.1 Existing Systems for Typed Feature
Structures (TFSs)
</SectionTitle>
      <Paragraph position="0"> Since Typed Feature Structures (TFSs) (Carpenter, 1992) are the basic data structures in HPSG, efficient handling of TFSs has been considered the key to improving the efficiency of an HPSG parser. There are two representative systems that handle TFSs (see footnote 1): ALE (Carpenter and Penn, 1994), a TFS interpreter written in Prolog, and ProFIT (Erbach, 1995), a TFS-to-Prolog-term compiler.</Paragraph>
      <Paragraph position="1"> Footnote *: This research is partially funded by the project of Japan Society for the Promotion of Science (JSPS-RFTF96P00502). Footnote 1: LIFE (Aït-Kaci et al., 1994) is also famous, but we do not discuss it because it does not follow Carpenter's TFS definition. Moreover, our separate experiments show that LIFE is more than 10 times slower than emulator-based LiLFeS. As for AMALIA (Wintner, 1997), we could not run experiments since it is not freely distributed. The experiments in his dissertation show that AMALIA is at most 15 times faster than ALE; it is close to emulator-based LiLFeS, and is outperformed by the native-code compiler of LiLFeS.</Paragraph>
      <Paragraph position="2"> However, as the comparison of these systems with our system (Section 3.2) shows, neither of these two systems achieves the efficiency we set as our goal. Moreover, both systems have serious disadvantages as frameworks for practical applications. The ProFIT approach, for example, tends to consume too much memory during execution. It is also difficult, if not impossible, to combine them with other techniques such as parallel parsing, because both systems are embedded in Prolog.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.2 Our Approach
</SectionTitle>
      <Paragraph position="0"> One promising direction for improving the efficiency of handling TFSs, while retaining the necessary flexibility, is to take up the idea of the AMAVL proposed in (Carpenter and Qu, 1995) and design a general programming system based on TFSs.</Paragraph>
      <Paragraph position="1"> LiLFeS is a logic programming system designed and developed in this way by our group, based on an AMAVL implementation. LiLFeS can be characterized as follows.</Paragraph>
      <Paragraph position="2"> * Architecture based on an AMAVL implementation, which compiles a TFS into a sequence of abstract machine instructions, and performs unification of the TFS by emulating the execution of those instructions. Although the proposal of such an AMAVL was already made in 1995, no serious implementation has been reported. We believe that LiLFeS is the first serious treatment of the proposal.</Paragraph>
      <Paragraph position="3"> * Rich language specification: We have adopted a language syntax similar to Prolog. LiLFeS as a programming language has almost the full capabilities of ordinary Prolog systems. Furthermore, we provide efficient built-in predicates that are often required in NLP applications, such as TFS copy, equivalence check, and associative arrays.</Paragraph>
      <Paragraph position="4"> * Independent language system: In order to develop an efficient and portable language system, we chose not to build the language on top of an existing high-level language such as Prolog. Instead, we programmed the LiLFeS system from scratch. This independence also allows us to provide various built-in predicates in efficient ways.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="807" type="sub_section">
      <SectionTitle>
1.3 Structure of This Paper
</SectionTitle>
      <Paragraph position="0"> Section 2 describes LiLFeS as a programming language. Section 3 gives a brief description of the AMAVL we implemented, the core inference engine of the LiLFeS system. In Section 4, we discuss the current status of the LiLFeS system and the results of experiments on the system performance. Section 5 describes a native-code compiler we are currently developing on the LiLFeS system, and discusses its performance.</Paragraph>
      <Paragraph position="1"> my_list &lt;- [bot].
e_list &lt;- [my_list].
ne_list &lt;- [my_list] + [FIRST\ bot, REST\ my_list].</Paragraph>
      <Paragraph position="2"> append(e_list, X, X).
append( (FIRST\ A &amp; REST\ X), Y, (FIRST\ A &amp; REST\ Z) ) :- append( X, Y, Z ).</Paragraph>
      <Paragraph position="3"> 2 LiLFeS as a Programming Language</Paragraph>
      <Paragraph position="4"> LiLFeS has basically the same syntax as Prolog, except that it uses TFSs instead of terms. Types and features must be defined before being used in TFS terms.</Paragraph>
      <Paragraph position="5"> Figure 2 shows the definition of the predicate append in LiLFeS. The first paragraph contains the type definitions of my_list, e_list, and ne_list. The type ne_list, for example, is a subtype of the type my_list, and has two appropriate features, FIRST and REST. The value of the feature REST is restricted to the type my_list or one of its subtypes. The type bot is the universal type that subsumes all types. The rest of the program consists of definite clauses. As one can see, the predicate append is represented by TFSs instead of Prolog first-order terms.</Paragraph>
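      <Paragraph position="6"> For readers more familiar with conventional languages, the type definitions and the two append clauses described above can be mirrored loosely in Python. This is only an analogy under stated assumptions, not LiLFeS code and not the system's implementation: the class names MyList, EList and NeList and the helpers from_items and to_items are hypothetical, and the sketch performs no unification, so it reads append in one direction only (two inputs, one output).

```python
# Hypothetical Python analogue of the my_list / e_list / ne_list types.
# It mirrors the shape of the definitions only; there is no unification.


class MyList:
    """Counterpart of my_list, the common supertype of both list types."""


class EList(MyList):
    """Counterpart of e_list, the empty list."""


class NeList(MyList):
    """Counterpart of ne_list, with the two features FIRST and REST."""

    def __init__(self, first, rest):
        # REST is restricted to my_list or one of its subtypes.
        if not isinstance(rest, MyList):
            raise TypeError("REST must be of type my_list or a subtype")
        self.first = first  # FIRST is of type bot, so any value is allowed
        self.rest = rest


def append(xs, ys):
    """One-directional reading of append(X, Y, Z): returns Z for X and Y."""
    if isinstance(xs, EList):
        # First clause: append(e_list, X, X).
        return ys
    # Second clause: keep FIRST and append the REST recursively.
    return NeList(xs.first, append(xs.rest, ys))


def from_items(items):
    """Build a typed list from a Python list (illustrative helper)."""
    result = EList()
    for item in reversed(items):
        result = NeList(item, result)
    return result


def to_items(lst):
    """Convert a typed list back to a Python list (illustrative helper)."""
    items = []
    while isinstance(lst, NeList):
        items.append(lst.first)
        lst = lst.rest
    return items


if __name__ == "__main__":
    print(to_items(append(from_items([1, 2]), from_items([3]))))
```

The main difference from the LiLFeS definition is that a definite clause states a relation solved by unification, whereas the Python function fixes which arguments are inputs.</Paragraph>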
    </Section>
  </Section>
</Paper>