<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-4011">
  <Title>AUTOMATED QUALITY MONITORING FOR CALL CENTERS USING SPEECH AND NLP TECHNOLOGIES</Title>
  <Section position="7" start_page="293" end_page="294" type="evalu">
    <SectionTitle>
4. END-TO-END SYSTEM PERFORMANCE
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="293" end_page="293" type="sub_section">
      <SectionTitle>
4.1. Application
</SectionTitle>
      <Paragraph position="0"> This section describes the user interface of the automated quality monitoring application. As explained in Section 1, the evaluator scores calls with respect to a set of quality-related questions after listening to the calls. (Fig. 2. Interface to listen to audio and update the evaluation form.)</Paragraph>
      <Paragraph position="1"> To aid this process, the user interface provides an efficient mechanism for the human evaluator to select calls, e.g.</Paragraph>
      <Paragraph position="2">  The automated quality monitoring user interface is a J2EE web application that is supported by back-end databases and content management systems 1 . The displayed list of calls provides a link to the audio, the automatically filled evaluation form, the overall score for this call, the agent's name, server location, call id, date and duration of the call (see Figure 1). This interface now gives the agent the ability to listen to interesting calls and update the answers in the evaluation form if necessary (audio and evaluation form illustrated in Figure 2). In addition, this interface provides the evaluator with the ability to view summary statistics (average score) and additional information about the quality of the calls. The overall system is designed to automatically download calls from multiple locations on a daily basis, transcribe and index them, thereby making them available to the supervisors for monitoring. Calls spanning a month are available at any given time for monitoring purposes.</Paragraph>
    </Section>
    <Section position="2" start_page="293" end_page="294" type="sub_section">
      <SectionTitle>
4.2. Precision and Recall
</SectionTitle>
      <Paragraph position="0"> This section presents precision and recall numbers for the identification of &amp;quot;bad&amp;quot; calls. The test set consists of 195 calls that were manually evaluated by call center personnel. Based on these manual scores, the calls were ordered by quality, and the bottom 20% were deemed to be &amp;quot;bad.&amp;quot; To retrieve calls for monitoring, we sort the calls based on the automatically assigned quality score and return the worst. In our summary figures, precision and recall are plotted as a function of the number of calls that are selected for monitoring. This is important because in reality only a small number of calls can receive human attention. Precision is the ratio of bad calls retrieved to the total number of calls monitored, and recall is the ratio of the number of bad calls retrieved to the total number of bad calls in the test set. Three curves are shown in each plot: the actually observed performance, the performance of random selection, and oracle or ideal performance. Oracle performance shows what would happen if a perfect automatic ordering of the calls were achieved.</Paragraph>
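The precision and recall definitions above can be sketched as a short computation. This is a minimal illustration, not the authors' code: the function name and the synthetic scores are hypothetical, and it assumes each call carries a manual quality score and an automatically assigned score, with lower meaning worse.

```python
def precision_recall_at_k(manual_scores, auto_scores, k, bad_fraction=0.2):
    """Precision/recall for retrieving "bad" calls when the k worst
    automatically scored calls are selected for monitoring."""
    n = len(manual_scores)
    # Ground truth: the bottom 20% of calls by manual score are "bad".
    order_manual = sorted(range(n), key=lambda i: manual_scores[i])
    bad = set(order_manual[: int(n * bad_fraction)])
    # Retrieve the k calls with the worst automatic quality score.
    order_auto = sorted(range(n), key=lambda i: auto_scores[i])
    retrieved = order_auto[:k]
    hits = sum(1 for i in retrieved if i in bad)
    precision = hits / k        # bad calls retrieved / calls monitored
    recall = hits / len(bad)    # bad calls retrieved / total bad calls
    return precision, recall

# Oracle curve: the automatic ordering matches the manual ordering,
# so monitoring the 39 worst of 195 calls (the 20% deemed bad)
# retrieves exactly the bad calls.
scores = list(range(195))
p, r = precision_recall_at_k(scores, scores, k=39)
```

Sweeping `k` from 1 to `n` with real automatic scores would trace out the observed precision and recall curves of Figures 3 and 4.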
      <Paragraph position="1"> Figure 3 shows precision performance. We see that in the monitoring regime where only a small fraction of the calls are monitored, we achieve over 60% precision. (Further, if 20% of the calls are monitored, we still attain over 40% precision.) Figure 4 shows the recall performance. In the regime of low-volume monitoring, the recall is midway between what could be achieved with an oracle and the performance of random selection.</Paragraph>
      <Paragraph position="2"> Figure 5 shows the ratio of the number of bad calls found with our automated ranking to the number found with random selection.</Paragraph>
      <Paragraph position="3"> This indicates that in the low-monitoring regime, our automated technique triples efficiency.</Paragraph>
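The tripling claim follows from the random baseline: monitoring k of N calls at random finds an expected k·(B/N) bad calls, where B is the total number of bad calls. A minimal illustration (the function name is hypothetical; the numbers use the 195-call test set with 20% bad, and the roughly 60% precision figure from the text):

```python
def efficiency_ratio(bad_found_by_ranking, calls_monitored,
                     total_bad, total_calls):
    """Ratio of bad calls found by the automated ranking to the
    expected number found by monitoring the same number of
    randomly selected calls."""
    expected_random = calls_monitored * total_bad / total_calls
    return bad_found_by_ranking / expected_random

# 195 calls, bottom 20% (39) deemed bad. Monitoring 10 calls at
# ~60% precision finds ~6 bad calls, versus 10 * 39/195 = 2
# expected under random selection: a threefold gain.
ratio = efficiency_ratio(6, 10, 39, 195)
```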
      <Paragraph position="4"> 4.3. Human vs. Computer Rankings As a final measure of performance, in Figure 6 we present a scatter plot comparing human to computer rankings. We do not have calls that are scored by two humans, so we cannot present a human-human scatter plot for comparison.</Paragraph>
    </Section>
  </Section>
</Paper>