<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2801">
  <Title>Text Linkage in the Wiki Medium - A Comparative Study Alexander Mehler Department of Computational Linguistics &amp; Text Technology Bielefeld University</Title>
  <Section position="3" start_page="0" end_page="1" type="intro">
    <SectionTitle>
2 Network Analysis
</SectionTitle>
    <Paragraph position="0"> For the time being, the overall structure of complex networks is investigated in terms of Small Worlds (SW) (Newman, 2003). Since its invention by Milgram (1967), this notion awaited formalization as a measurable property of large complex networks which allows distinguishing small worlds from random graphs. Such a formalization was introduced by Watts &amp; Strogatz (1998), who characterize small worlds by two properties: First, unlike in regular graphs, any randomly chosen pair of nodes in a small world has, on average, a considerably shorter geodesic distance.1 Second, compared to random graphs, small worlds show a considerably higher level of cluster formation.</Paragraph>
    <Paragraph position="1"> In this framework, cluster formation is measured by means of the average ratio of the number ▽(v_i) of triangles connected to vertex v_i to the number ⊻(v_i) of triples centered on v_i (Watts</Paragraph>
    <Paragraph position="3"> Alternatively, the cluster coefficient C1 computes the fraction of the number of triangles in the whole network and the number of its connected vertex triples. Further, the mean geodesic distance l of a network is the arithmetic mean of the lengths of the shortest paths between all pairs of vertices in the network. Watts and Strogatz observe high cluster values and short average geodesic distances in small worlds, which apparently combine cluster formation with shortcuts as prerequisites of efficient information flow. In the area of information networks, this property has been demonstrated for the WWW (Adamic,</Paragraph>
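The two cluster coefficients and the mean geodesic distance described above can be illustrated with a minimal sketch for small undirected graphs. All function names are hypothetical and not taken from the paper. Two assumptions are made: vertices without any triple contribute 0 to the vertex-wise average, and the network-wide coefficient counts each triangle once per corner (that is, three times), a common normalization that the text itself does not spell out.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Return an undirected adjacency map {vertex: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangles_and_triples(adj, v):
    """Count triangles through v and connected triples centered on v."""
    neighbours = adj[v]
    triples = len(neighbours) * (len(neighbours) - 1) // 2
    triangles = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return triangles, triples


def mean_local_clustering(adj):
    """Average over all vertices of triangles(v) / triples(v).
    Assumption: a vertex with no triple contributes 0."""
    total = 0.0
    for v in adj:
        tri, trip = triangles_and_triples(adj, v)
        if trip > 0:
            total += tri / trip
    return total / len(adj)


def global_clustering(adj):
    """Sum of triangles(v) over sum of triples(v) for the whole network.
    Assumption: each triangle is counted once per corner (three times)."""
    counts = [triangles_and_triples(adj, v) for v in adj]
    tri = sum(c[0] for c in counts)
    trip = sum(c[1] for c in counts)
    return tri / trip if trip > 0 else 0.0


def mean_geodesic_distance(adj):
    """Mean breadth-first distance over all reachable pairs of distinct vertices."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs > 0 else 0.0


# A triangle a-b-c with one pendant vertex d attached to c.
example = build_graph([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
```

On this example graph the vertex-wise average and the network-wide ratio differ, which shows that the two coefficients are not interchangeable even on very small networks.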
    <Paragraph position="5"> In addition to the SW model of Watts &amp; Strogatz, link distributions were also examined in order to characterize complex networks: Barabási &amp; Albert (1999) argue that the vertex connectivity of social networks is distributed according to a scale-free power law. They recur to the observation, confirmed by many social-semiotic networks but not by instances of the random graph model of Erdős &amp; Rényi (Bollobás, 1985), that the number of links per vertex can be reliably predicted by a power law. Thus, the probability P(k) that a randomly chosen vertex interacts with k other vertices of the same network is approximately $P(k) \sim k^{-\gamma}$.</Paragraph>
    <Paragraph position="7"> Successfully fitting a power law to the distribution of out degrees of vertices in complex net- other. Note that all coefficients presented in the following sections relate by default to undirected graphs.</Paragraph>
    <Paragraph position="8"> poorly connected, while a select minority of hubs will be very highly connected." (Watts, 2003, p.107). Thus, for a fixed number of links, the smaller the value of the exponent γ, the shallower the slope of the curve in a log-log plot and the higher the number of edges to which the most connected hub is incident.</Paragraph>
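The relation between the exponent and the slope in a log-log plot can be made concrete with a small sketch. It assumes the usual power-law form, in which P(k) is proportional to k raised to the power of minus γ, and estimates γ as the negated slope of a least-squares line through the points (log k, log P(k)). The function names are hypothetical, and this simple regression is only a rough estimator chosen for illustration; it is not claimed to be the fitting procedure used in the paper.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Return {k: P(k)} as relative frequencies of the positive degrees."""
    positive = [k for k in degrees if k > 0]
    counts = Counter(positive)
    n = len(positive)
    return {k: c / n for k, c in counts.items()}


def fit_power_law_exponent(distribution):
    """Least-squares line through (log k, log P(k)); returns gamma = -slope.
    Needs at least two distinct degrees, otherwise the slope is undefined."""
    xs = [math.log(k) for k in distribution]
    ys = [math.log(p) for p in distribution.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Synthetic, noise-free distribution with a known exponent of 2.5.
synthetic = {k: k ** -2.5 for k in range(1, 50)}
estimated_gamma = fit_power_law_exponent(synthetic)
```

Because the synthetic values lie exactly on a power law, the estimate recovers the exponent up to floating-point error; empirical degree distributions are noisy in the tail, so the same fit would be less reliable there.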
    <Paragraph position="9"> A limit of this model is that it views the probability of linking a source node to a target node as depending solely on the connectivity of the latter. In contrast to this, Newman (2003) proposes a model in which this probability also depends on the connectivity of the former. This is done in order to account for social networks in which vertices tend to be linked if they share certain properties (Newman and Park, 2003), a tendency which is called assortative mixing. According to Newman &amp; Park (2003), it allows distinguishing social networks from non-social (e.g. artificial and biological) ones even if they are uniformly classified as small worlds according to the model of Watts &amp; Strogatz (1998). Newman &amp; Park (2003) analyze assortative mixing of vertex degrees, that is, the correlation of the degrees of linked vertices.</Paragraph>
    <Paragraph position="10"> They confirm that this correlation is positive in the case of social networks, but negative in the case of technical networks (e.g. the Internet), which thus exhibit disassortative mixing (of degrees).</Paragraph>
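The correlation of the degrees of linked vertices can be sketched as a Pearson correlation over the two ends of every edge. This is a minimal illustration for undirected graphs without self-loops; the function name is hypothetical, and counting each edge in both directions is an assumption made here so that the measure is symmetric.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees found at both ends of each edge.
    Undefined (division by zero) if all vertices have the same degree."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count every edge in both directions so that the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean = sum(xs) / n  # identical for xs and ys by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return cov / var


# A star: one hub linked to three leaves (high degree meets low degree).
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]

# An isolated edge plus a triangle: equal degrees are linked to each other.
like_with_like = [("a", "b"), ("c", "d"), ("d", "e"), ("c", "e")]
```

The star yields a negative coefficient (disassortative mixing), while the second graph, in which every edge joins vertices of equal degree, yields a positive one (assortative mixing).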
    <Paragraph position="11"> Although these SW models were applied to citation networks, WWW graphs, semantic networks and co-occurrence graphs, and thus to a variety of linguistic networks, a comparative study which focuses on wiki-based structure formation in comparison to other networks of textual units is missing so far. In this paper, we present such a study. That is, we examine SW coefficients which allow distinguishing wiki-based systems from more "traditional" networks. In order to do that, a generalized web document model is needed to uniformly represent the document networks to be compared.</Paragraph>
    <Paragraph position="12"> In the following section, a web genre model is outlined for this purpose.</Paragraph>
  </Section>
</Paper>