<?xml version="1.0" standalone="yes"?>
<Paper uid="A00-1015">
  <Title>Javox: A Toolkit for Building Speech-Enabled Applications</Title>
  <Section position="4" start_page="0" end_page="105" type="intro">
    <SectionTitle>
2 Basic Operation
</SectionTitle>
    <Paragraph position="0"> JAVOX can be used as the sole location of NLP for an application; the application is written as a non-speech-enabled program and JAVOX adds the speech capability. The current implementation is written in Java and works with Java programs. The linkage between the application program and JAVOX is created by modifying - at load time - all constructors in the application to register new objects with JAVOX.</Paragraph>
    <Paragraph position="1"> For this reason, the application's source code does not need any modification to enable JAVOX. A thorough discussion of this technique is presented in Section 4. The schematic in Figure 1 shows a high-level overview of the JAVOX architecture.</Paragraph>
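    <Paragraph> The effect of this constructor modification can be sketched by hand. The sketch below does not rewrite bytecode at load time; it only shows what a constructor might look like after such a rewrite. The ObjectRegistry class, its methods, and the Account class are hypothetical names introduced for illustration and are not part of JAVOX.
<![CDATA[
```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical registry of live application objects (an assumption, not the JAVOX API).
final class ObjectRegistry {
    private static final List<Object> OBJECTS = new ArrayList<>();

    // Called from every modified constructor.
    static synchronized void register(Object o) {
        OBJECTS.add(o);
    }

    // Returns all registered objects that are instances of the given type.
    static synchronized <T> List<T> instancesOf(Class<T> type) {
        List<T> found = new ArrayList<>();
        for (Object o : OBJECTS) {
            if (type.isInstance(o)) {
                found.add(type.cast(o));
            }
        }
        return found;
    }
}

// Hypothetical application class, shown as it might look AFTER modification.
class Account {
    int balance;

    Account(int openingBalance) {
        this.balance = openingBalance;
        // A load-time rewrite would inject this call; it is written by hand here.
        ObjectRegistry.register(this);
    }
}
```
]]>
With a registry of this kind, a speech component could later look up the objects that a command refers to, for example with ObjectRegistry.instancesOf(Account.class), while the application source itself stays unchanged.</Paragraph>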
    <Paragraph position="2"> Issuing a voice command begins with a user utterance, which the speech recognizer processes and passes to the NLP component, TRANSLATOR. We are using the IBM implementation of Sun's Java Speech application program interface (API) (Sun Microsystems, Inc., 1998) in conjunction with IBM's VIAVOICE. The job of TRANSLATOR - or a different module conforming to its API - is to translate the utterance into a form that represents the corresponding program actions. The current implementation of TRANSLATOR uses a context-free grammar, with each rule carrying an optional JSL fragment.</Paragraph>
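    <Paragraph> The idea of a rule that carries a JSL fragment can be illustrated with a deliberately simplified sketch. A regular expression stands in for a grammar rule here, so this is not a context-free grammar and not the TRANSLATOR implementation. The Rule class, its fields, and the "%s" template convention are assumptions made for illustration.
<![CDATA[
```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical rule: an utterance pattern paired with a JSL template.
final class Rule {
    private final Pattern pattern;
    private final String jslTemplate; // "%s" marks where the captured text goes

    Rule(String regex, String jslTemplate) {
        this.pattern = Pattern.compile(regex);
        this.jslTemplate = jslTemplate;
    }

    // Returns the JSL for the utterance, or null when the rule does not match.
    String translate(String utterance) {
        Matcher m = pattern.matcher(utterance);
        if (!m.matches()) {
            return null;
        }
        return String.format(jslTemplate, m.group(1));
    }
}
```
]]>
A rule built as new Rule("add \\$(\\d+) to the account", "myBalance = myBalance + %s;") would map a spoken deposit request onto a single JSL statement; a real grammar would compose the fragments of several rules during parsing.</Paragraph>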
    <Paragraph position="3"> A typical bottom-up parser processes each utterance and produces a complete JSL program. The resulting JSL is forwarded to EXECUTER, where the JSL code is executed. For example, in a hypothetical banking application, the utterance "add $100 to the account" might be translated into the JSL command: myBalance = myBalance + 100; The job of EXECUTER - or a different module that conforms to EXECUTER's API - is to execute and monitor upcalls into the running application. The upcalls are the same function calls that the appropriate mouse clicks or menu selections would have made had the user not used speech. For this reason, we are currently concentrating our efforts on event-driven programs, the class to which most GUI applications belong. Their structure is usually amenable to this approach. Our implementation of EXECUTER performs the upcalls by interpreting and executing JSL, though the technology could be used with systems other than JSL.</Paragraph>
    <Paragraph position="4"> In the banking example, EXECUTER would identify the myBalance variable and increment it by $100.</Paragraph>
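    <Paragraph> The banking example can be sketched as a tiny interpreter that updates a field of a live object through reflection. This is only an illustration under strong assumptions: it accepts a single statement shape ("x = x + integer;"), whereas the full JSL syntax is not described here, and the BankAccount and MiniExecuter classes are hypothetical.
<![CDATA[
```java
import java.lang.reflect.Field;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in for an application object; class and field names are hypothetical.
class BankAccount {
    int myBalance;

    BankAccount(int openingBalance) {
        this.myBalance = openingBalance;
    }
}

// Hypothetical executer that understands only "x = x + <integer>;".
final class MiniExecuter {
    private static final Pattern INCREMENT = Pattern.compile(
            "\\s*(\\w+)\\s*=\\s*\\1\\s*\\+\\s*(\\d+)\\s*;\\s*");

    // Adds the integer in the statement to the named int field of the target.
    static void execute(String jsl, Object target) {
        Matcher m = INCREMENT.matcher(jsl);
        if (!m.matches()) {
            throw new IllegalArgumentException("unsupported statement: " + jsl);
        }
        try {
            Field field = target.getClass().getDeclaredField(m.group(1));
            field.setAccessible(true);
            int amount = Integer.parseInt(m.group(2));
            field.setInt(target, field.getInt(target) + amount);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot update field " + m.group(1), e);
        }
    }
}
```
]]>
Calling MiniExecuter.execute("myBalance = myBalance + 100;", account) on an account opened with 500 leaves its myBalance field at 600.</Paragraph>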
    <Paragraph position="5"> The main JAVOX components, TRANSLATOR and EXECUTER, are written to flexible APIs. Developers may choose to use their own custom components instead of these two. Those who want a different NLP scheme can implement a different version of TRANSLATOR and - as long as it outputs JSL - still use EXECUTER. Conversely, those who want a different scripting system can replace JSL and still use TRANSLATOR and even EXECUTER's low-level infrastructure.</Paragraph>
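    <Paragraph> One way to picture this separation is a pair of small interfaces joined by a pipeline. The interface names, method signatures, and the SpeechPipeline class below are assumptions made for illustration; the actual JAVOX APIs are not specified in this section.
<![CDATA[
```java
// Hypothetical contract for the NLP side: utterance text in, script text out.
interface Translator {
    String translate(String utterance);
}

// Hypothetical contract for the execution side: runs a script against the application.
interface Executer {
    void execute(String script);
}

// Wires any Translator to any Executer; either side can be swapped independently.
final class SpeechPipeline {
    private final Translator translator;
    private final Executer executer;

    SpeechPipeline(Translator translator, Executer executer) {
        this.translator = translator;
        this.executer = executer;
    }

    // Translates a recognized utterance and hands the result to the executer.
    void handle(String utterance) {
        executer.execute(translator.translate(utterance));
    }
}
```
]]>
Under this arrangement, a developer with a different NLP scheme would supply another Translator that still emits JSL, and a developer with a different scripting system would supply another Executer, without changing the other component.</Paragraph>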
  </Section>
</Paper>