<?xml version="1.0" standalone="yes"?>
<Paper uid="P02-1002">
  <Title>Sequential Conditional Generalized Iterative Scaling</Title>
  <Section position="6" start_page="3" end_page="3" type="concl">
    <SectionTitle>
5 Discussion
</SectionTitle>
    <Paragraph position="0"> There are many reasons that maxent speedups are useful. First, in applications with active learning, parameter optimization, or feature set selection, it may be necessary to run many rounds of maxent, making speed essential. Other fast algorithms, such as Winnow, are available, but in our experience there are some problems where smoothed maxent models are better classifiers than Winnow.</Paragraph>
    <Paragraph position="1"> Furthermore, many other fast classification algorithms, including Winnow, do not output probabilities, which are useful for precision/recall curves, when there is an unequal tradeoff between false positives and false negatives, or when the output of the classifier is used as input to other models. Finally, there are many applications of maxent, such as language modeling, where huge amounts of data are available. Unfortunately, it has previously been very difficult to use maxent models for these types of experiments. For instance, in one language modeling experiment we performed, it took a month to learn a single model. Clearly, for models of this type, any speedup will be very helpful.</Paragraph>
    <Paragraph position="2"> Overall, we expect this technique to be widely used. It leads to very significant speedups, up to an order of magnitude or more. It is very easy to implement: other than the need to transpose the training data matrix and store an extra array, it is no more complex than standard GIS. It can be easily applied to any model type, although it leads to the largest speedups on models with more feature types. Since models with many interacting features are the type for which maxent models are most interesting, this is the typical case. It requires very few additional resources: unless there are a large number of output classes, it uses about as much space as standard GIS, and when there are a large number of output classes, it can be combined with our clustering speedup technique (Goodman, 2001) both to get additional speedups and to reduce the space requirements. Thus, there appear to be no real impediments to its use, and it leads to large, broadly applicable gains.</Paragraph>
  </Section>
</Paper>