<?xml version="1.0" standalone="yes"?> <Paper uid="H01-1049"> <Title>Listen-Communicate-Show (LCS): Spoken Language Command of Agent-based Remote Information Access</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. SYSTEM OVERVIEW </SectionTitle> <Paragraph position="0"> The LCS-Marine system consists of four major components: an SLS, a collection of agents for information access, real-world operational databases, and communications networks to connect the user to the SLS and the agents to the databases.</Paragraph> <Paragraph position="1"> The underlying architecture for the system is the MIT Galaxy II conversational architecture [3]. It is a distributed, component-based middleware product designed to be plug and play.</Paragraph> <Paragraph position="2"> Specialized servers handle specific tasks, such as translating audio data to text. All Galaxy II-compliant servers communicate with each other through a central server known as the Hub. The Hub manages flow control, handles traffic among distributed servers, and provides state maintenance.</Paragraph> <Paragraph position="3"> In the SLS, speech is sent from the Audio I/O server to the Recognizer. The top n recognitions are then parsed, combined with prior context, and processed by the Natural Language (NL) servers (Frame Construction and Context Tracking) to verify the new input's validity and context. The Turn Manager (TM) determines how to proceed with the conversation and generates a response. NL (Language Generation) converts it to text and the Synthesis server generates the verbal response.</Paragraph> <Paragraph position="4"> The audio server then speaks the waveform file to the user. 
We customize the various servers to work with domain-specific issues and application-specific information and training.</Paragraph> <Paragraph position="5"> Figure 1 shows our LCS architecture.</Paragraph> <Paragraph position="6"> We have integrated an additional server into the architecture to support information access: an Agent server. The Agent server manages a collection of agents that can be tasked to accomplish a variety of missions, including migration to distant machines with possibly different operating systems to gather information or to monitor and report events [2].</Paragraph> <Paragraph position="7"> Typically, the Agent server receives its tasking from the TM and supplies the TM with information from the data source(s).</Paragraph> <Paragraph position="8"> For persistent tasks, the Agent server becomes the initiator of a dialogue to inform the user of specific events by passing agent reports to the TM. When a visual display is present, the Agent server will dispatch an agent to pass the updated information to the display machine.</Paragraph> <Paragraph position="9"> For the LCS-Marine application, our agents had to interact with a logistics database that could be anywhere from one to one hundred miles away. We later describe how our agents were able to reach this live database over the tactical communication links available.</Paragraph> <Paragraph position="10"> Users interact with the LCS-Marine system using the voice capture device appropriate to their organization (telephone, cell phone, tactical radios, computer headsets, etc.).</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 3. 
MARINE COMBAT SERVICE SUPPORT PROBLEM </SectionTitle> <Paragraph position="0"> Marines work in a dynamic, fluid environment where requirements and priorities are constantly subject to change.</Paragraph> <Paragraph position="1"> Under current operations, it might take up to 72 hours before a Marine in a Combat Service Support Operations Center (CSSOC) can confirm with a requesting unit that their order is in the logistics system. This is due to a lack of resources available to the tactical units as well as the difficulty of turning logistics data into information that enables timely analysis and decision making. For Marines conducting tactical operations, these restrictions and limited visibility into the supply chain hamper logistics planning, decision, execution, and assessment. Figure 2 shows the various echelons involved in tactical Marine logistics operations. It is noteworthy that tactical units have no organic means of accessing the logistical databases other than via radio contact with personnel at the CSSOC.</Paragraph> <Paragraph position="2"> The focus of the LCS-Marine project is to provide Marines in the field with this missing visibility into the supply chain. By using standard radio protocols and a common form, Marines can now converse with a system that understands their task and end goal and can assist them in getting both the information and supplies they need. Figure 3 shows a sample of the Rapid Request form, used when placing an order.</Paragraph> <Paragraph position="3"> Supporting the LCS-Marine domain required understanding and using proper radio protocols to communicate. It required the system to understand call signs, military times, grid coordinates, and special ordnance nomenclature. Additionally, to fully support the dynamic environment, LCS-Marine needed the ability to understand and translate usages of the military phonetic alphabet. This alphabet is used to spell difficult or unusual words. 
For example, to give the point of contact for the request as Sergeant Frew, the user could say: P O C is Sergeant I spell Foxtrot Romeo Echo Whiskey over.</Paragraph> <Paragraph position="4"> LCS-Marine would convert the phonetic words to the proper letter combination. This way the vocabulary is potentially much larger than that used for system training.</Paragraph> <Paragraph position="5"> Supporting the dynamic aspects of the Marine environment, the system is speaker independent. This is critical in applications where the user may change and there is no additional time for training the system for a new operator.</Paragraph> <Paragraph position="6"> The recognizer is trained on the domain vocabulary, but not on individual operator voices. The system also fully supports natural, conversational dialogue, i.e., the recognizer expects utterances at a normal rate of speech and the speaker does not need to enunciate each syllable.</Paragraph> <Paragraph position="7"> It is important to note that the amount of time spent training personnel to use the LCS-Marine system is generally less than 10 minutes. After a short introduction, the user is shown a sample dialogue for familiarization. The user is also given information about meta-instructions, such as how to start over or how to clear their previous statement, before they begin operation.</Paragraph> </Section></Paper>
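<!--
Editorial sketch for Section 3: the conversion of spelled-out phonetic words into letters can be illustrated as follows. This is a minimal sketch under stated assumptions, not the LCS-Marine implementation. The function name expand_spelled_words, the lookup table NATO_ALPHABET, and the rule that a spelling run starts after the words "I spell" and ends at the first non-phonetic word are all hypothetical choices made for illustration; the paper only states that phonetic words are converted to the proper letter combination.

```python
# Hypothetical illustration only; names and rules are assumptions,
# not taken from the LCS-Marine system.

# Standard radiotelephony spelling alphabet, with common variant spellings.
NATO_ALPHABET = {
    "alfa": "a", "alpha": "a", "bravo": "b", "charlie": "c", "delta": "d",
    "echo": "e", "foxtrot": "f", "golf": "g", "hotel": "h", "india": "i",
    "juliett": "j", "juliet": "j", "kilo": "k", "lima": "l", "mike": "m",
    "november": "n", "oscar": "o", "papa": "p", "quebec": "q", "romeo": "r",
    "sierra": "s", "tango": "t", "uniform": "u", "victor": "v",
    "whiskey": "w", "xray": "x", "x-ray": "x", "yankee": "y", "zulu": "z",
}


def expand_spelled_words(utterance: str) -> str:
    """Replace each 'I spell <phonetic words>' run with the spelled word."""
    tokens = utterance.split()
    out = []
    i = 0
    while i < len(tokens):
        # Assumed marker: the two words "I spell" announce a spelling run.
        is_marker = (
            tokens[i].lower() == "i"
            and i + 1 < len(tokens)
            and tokens[i + 1].lower() == "spell"
        )
        if is_marker:
            j = i + 2
            letters = []
            # Consume phonetic words until the first word not in the table.
            while j < len(tokens) and tokens[j].lower() in NATO_ALPHABET:
                letters.append(NATO_ALPHABET[tokens[j].lower()])
                j += 1
            if letters:
                out.append("".join(letters).capitalize())
                i = j
                continue
        # Not a spelling run (or an empty one): keep the token unchanged.
        out.append(tokens[i])
        i += 1
    return " ".join(out)


if __name__ == "__main__":
    print(expand_spelled_words(
        "P O C is Sergeant I spell Foxtrot Romeo Echo Whiskey over"
    ))
```

Because the table maps words to letters rather than listing whole names, a sketch like this can produce words that never appeared in any training vocabulary, which is the property the section highlights.
-->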