<?xml version="1.0" standalone="yes"?>
<Paper uid="C73-2010">
  <Title>PROPORTION BETWEEN NUMBER OF WORDS AND NUMBER OF FORMS IN THE WORKS OF THE &quot;INDEX THOMISTICUS&quot;</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
FRANCO MELETTI S. J.
PROPORTION BETWEEN NUMBER OF WORDS
AND NUMBER OF FORMS IN THE WORKS
OF THE &quot;INDEX THOMISTICUS&quot;
1. THE PROBLEM
</SectionTitle>
    <Paragraph position="0"> At the conclusion of his long work on the Index Thomisticus (the concordance of all the works of St. Thomas Aquinas), Father ROBERTO BUSA S.J. proposed the following problem to me. For each of the 179 works processed, we know the exact number of words and the exact number of graphic forms. As expected, these data show that the ratio between the number of forms and the number of words decreases as the number of words increases.</Paragraph>
    <Paragraph position="1"> Does a formula exist that allows two works of different length to be compared, so that one can affirm that one work has comparatively more forms than the other? For instance, the work with code number 017-QDI contains 10,238 words and 1,471 forms. Does this work have comparatively more or fewer forms than work 012-QDV, which has 114,991 words and 9,016 forms? Father Busa asked me for the purely quantitative data. He also told me that he would not interpret or evaluate these data in terms of &quot;lexical wealth&quot;, as he doubts that anything can be concluded in this regard on the basis of these data alone. Indeed, he believes that lexical wealth must be measured on the basis of a much wider range of parameters.</Paragraph>
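    <Paragraph> The difficulty can be made concrete with the two works quoted above. The following is only an illustrative sketch: the counts come from the text, while the function name and the dictionary layout are assumptions made for the example. It computes the raw ratio F/W, which comes out at roughly 0.14 for the short work and roughly 0.08 for the long one, so the raw ratio alone cannot say which work is comparatively richer in forms.

```python
# Word and form counts quoted in the text for the two example works.
works = {
    "017-QDI": {"words": 10_238, "forms": 1_471},
    "012-QDV": {"words": 114_991, "forms": 9_016},
}


def form_word_ratio(words: int, forms: int) -> float:
    """Return the raw ratio F/W of graphic forms to running words."""
    return forms / words


# Raw ratios: the longer work necessarily shows the smaller value.
ratios = {
    code: form_word_ratio(counts["words"], counts["forms"])
    for code, counts in works.items()
}

for code, ratio in ratios.items():
    print(f"{code}: F/W = {ratio:.4f}")
```

Because the ratio falls as W grows, a fair comparison has to be made against the number of forms expected for a text of the same length, which is what the relation derived in the next section provides.</Paragraph>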
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. THE ANSWER
</SectionTitle>
    <Paragraph position="0"> Following the procedure reported below, a relation between the number of words (W) and the number of graphic forms (F) has been found. This relation can be represented by a straight line on a bi-logarithmic diagram:  (1) $\log F = \log A + B \cdot \log (W/10{,}000)$</Paragraph>
    <Paragraph position="1"> where log A and B are the two constants which characterize the linear function. Therefore the function F = f(W) is expressed by a generalized parabola: (2) $F = A \cdot (W/10{,}000)^{B}$  The values of A and B are the following for the set of the works of the entire Index Thomisticus Corpus and for the various groups of works according to the subdivisions indicated below:</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3. GRAPHIC AND CALCULUS PROCEDURE
</SectionTitle>
    <Paragraph position="0"> To establish first of all whether a relationship exists between the values of W and F in the works of the Index Thomisticus Corpus, the points corresponding to the pairs (W, F) were plotted on a bi-logarithmic diagram, that is, with scales log W and log F. The choice of logarithmic scales was suggested directly by the distribution of the values of W and F arranged in increasing order: on a logarithmic scale these values are well distributed along the entire W and F intervals.</Paragraph>
    <Paragraph position="1"> From the log W-log F diagram it was immediately clear that a straight line could well approximate the distribution of the points over the whole W interval, from minimal to maximal values (see Figure 1).</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
PROPORTION BETWEEN WORDS AND FORMS &quot;INDEX THOMISTICUS&quot;
</SectionTitle>
    <Paragraph position="0"> Therefore the relationship between log W and log F is linear as expressed in (1) and the function F=f(W) is as expressed in (2).</Paragraph>
    <Paragraph position="1"> The expression (2) has been preferred to the expression $F = A \cdot W^{B}$, because in (2) the constant A has a meaningful value: it is the number of forms corresponding to 10,000 words, a length close to the middle of the log W interval. Expressions (1) and (2) are evidently valid only within the intervals of W covered by the respective works. For lower values of W the function F = f(W) obviously joins the straight line F = W, which is valid for very short texts.</Paragraph>
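    <Paragraph> The role of the 10,000-word normalisation can be checked numerically. The sketch below is illustrative only: the constants A_DEMO and B_DEMO are hypothetical values chosen for the example, not values taken from the tables, and the function names are assumptions. It evaluates expression (2), confirms that it agrees with the logarithmic form (1), and converts A into the constant A' of the unnormalised form F = A' * W ** B.

```python
import math

REFERENCE_WORDS = 10_000  # normalising length used in expressions (1) and (2)


def forms_power(words: float, a: float, b: float) -> float:
    """Expression (2): F = A * (W / 10,000) ** B."""
    return a * (words / REFERENCE_WORDS) ** b


def forms_log(words: float, a: float, b: float) -> float:
    """Expression (1) solved for F: log F = log A + B * log(W / 10,000)."""
    return 10 ** (math.log10(a) + b * math.log10(words / REFERENCE_WORDS))


def unnormalised_constant(a: float, b: float) -> float:
    """Constant A' of the alternative form F = A' * W ** B."""
    return a / REFERENCE_WORDS ** b


# Hypothetical constants, for illustration only (NOT values from the tables).
A_DEMO, B_DEMO = 1_500.0, 0.75

f_at_reference = forms_power(REFERENCE_WORDS, A_DEMO, B_DEMO)
a_prime = unnormalised_constant(A_DEMO, B_DEMO)

print(f_at_reference)  # equals A: the number of forms at W = 10,000
print(a_prime)         # a far smaller constant with no direct reading
```

Both parameterisations describe the same curve; the normalised one simply makes A readable as a form count at a typical text length.</Paragraph>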
    <Paragraph position="2"> The constants log A and B of (1) were computed by least-squares polynomial approximation on the logarithmic diagram.</Paragraph>
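    <Paragraph> A least-squares straight-line fit on the logarithmic diagram can be sketched as follows. This is a minimal illustration using the ordinary closed-form solution for a first-degree fit, not a reconstruction of the original computation. The function name fit_log_line is an assumption, and the input data are synthetic points generated from hypothetical constants (A = 1500, B = 0.75), not measurements from the Corpus.

```python
import math


def fit_log_line(words, forms, reference=10_000):
    """Least-squares straight line for log10(F) against log10(W / reference).

    Returns (log_a, b) such that log10(F) is approximated by
    log_a + b * log10(W / reference), as in expression (1).
    """
    xs = [math.log10(w / reference) for w in words]
    ys = [math.log10(f) for f in forms]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form slope and intercept of the simple linear regression.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    log_a = mean_y - b * mean_x
    return log_a, b


# Synthetic data from hypothetical constants (A = 1500, B = 0.75);
# these are NOT measurements from the Index Thomisticus.
demo_words = [200, 1_000, 10_000, 100_000, 600_000]
demo_forms = [1_500 * (w / 10_000) ** 0.75 for w in demo_words]

log_a, b = fit_log_line(demo_words, demo_forms)
print(f"A = {10 ** log_a:.1f}, B = {b:.3f}")
```

Since the synthetic points lie exactly on a power law, the fit should recover the constants used to generate them up to rounding error; real data would scatter around the line.</Paragraph>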
    <Paragraph position="3"> The complete results of the computations are reported in Tables 3 and 4.</Paragraph>
    <Paragraph position="4"> They are given for the entire Index Thomisticus Corpus; for the two major divisions of the Corpus: Thomistic Works and Parathomistic Works (works of other authors included in the Corpus); and for the four divisions of the Thomistic works: Proper Works (OPERA PROPRIA), Commentaries (COMMENTARIA: on Aristotle, on the Bible, etc.), Stenographic reports (REPORTATIONES), and works of dubious authenticity (OPERA DUBIAE AUTHENTICITATIS).</Paragraph>
    <Paragraph position="5"> In Tables 3 and 4 the following values are reported: the constants log A, A and B; the log F RMS deviation, that is, the RMS deviation between the true values of log F and the mean values of log F computed by (1); the max. upper deviation, that is, the ratio $(F_{\mathrm{true}}/F_{\mathrm{mean}})_{\max}$ for values $F_{\mathrm{true}} &gt; F_{\mathrm{mean}}$; the max. lower deviation, that is, the ratio $(F_{\mathrm{true}}/F_{\mathrm{mean}})_{\min}$ for values $F_{\mathrm{true}} &lt; F_{\mathrm{mean}}$.</Paragraph>
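    <Paragraph> The deviation measures listed above can be sketched in code. This is an illustration under stated assumptions: the RMS is taken here as the square root of the mean squared residual (dividing by the number of works, which may differ from the convention used for the tables), the function name is hypothetical, and the demonstration data are synthetic points scattered by known factors around a line with hypothetical constants.

```python
import math


def deviation_summary(words, forms, log_a, b, reference=10_000):
    """Summarise how observed form counts scatter around the fitted line (1)."""
    residuals = []  # log10(F_true) - log10(F_mean) for each work
    ratios = []     # F_true / F_mean for each work
    for w, f_true in zip(words, forms):
        log_f_mean = log_a + b * math.log10(w / reference)
        residuals.append(math.log10(f_true) - log_f_mean)
        ratios.append(f_true / 10 ** log_f_mean)
    # Assumption: RMS divides by the number of works, not by degrees of freedom.
    rms = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return {"rms_log_f": rms, "max_upper": max(ratios), "max_lower": min(ratios)}


# Hypothetical line and synthetic observations (NOT data from the tables):
# one point 10% above the line, one on it, one below by the same factor.
LOG_A_DEMO, B_DEMO = math.log10(1_500), 0.75
demo_words = [1_000, 10_000, 100_000]
scatter = [1.10, 1.00, 1 / 1.10]
demo_forms = [
    s * 1_500 * (w / 10_000) ** 0.75 for s, w in zip(scatter, demo_words)
]

summary = deviation_summary(demo_words, demo_forms, LOG_A_DEMO, B_DEMO)
print(summary)
```

Expressing the extreme deviations as ratios rather than differences keeps them comparable across works of very different length, which matches the use of a logarithmic scale for the fit.</Paragraph>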
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4. NOTES, RESERVATIONS AND COMMENTS
</SectionTitle>
    <Paragraph position="0"> The values of the constants A and B for the four divisions of the 118 Thomistic works (Tables 2 and 4) show no significant differences: the differences fall within random variation. On the other hand, the constants A and B of the Thomistic works, compared with those of the Parathomistic works (Tables 1 and 3), show significant differences according to the Student-Fisher test (see Figure 2).</Paragraph>
    <Paragraph position="1">  In Table 4 one can notice that, among the Thomistic works, the Stenographic reports (REPORTATIONES), which are distinct from the truly dictated works (these are among the Proper Works), present an RMS deviation of F which is about half that of the other groups of Thomistic works. Is this a significant fact?</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"> The straight-line behaviour of the function F = f(W) on the logarithmic diagram is seen along the whole interval of W, from the minimum of about W = 200 up to a maximum of about 600,000 words, without showing a tendency to flatten with regard to the W axis.</Paragraph>
    <Paragraph position="1"> However, the sum of the words of the entire Thomistic Corpus (W = 10,600,085) corresponds to a number of graphic forms, F = 132,125, which is lower than the mean (extrapolated) value computed by expression (1); even so, the deviation of this point does not much exceed twice the RMS deviation (see Figure 1).</Paragraph>
  </Section>
</Paper>