<?xml version="1.0" standalone="yes"?> <Paper uid="W06-2801"> <Title>Text Linkage in the Wiki Medium - A Comparative Study Alexander Mehler Department of Computational Linguistics &amp; Text Technology Bielefeld University</Title> <Section position="3" start_page="0" end_page="1" type="intro"> <SectionTitle> 2 Network Analysis </SectionTitle> <Paragraph position="0"> At present, the overall structure of complex networks is investigated in terms of Small Worlds (SW) (Newman, 2003). Since its invention by Milgram (1967), this notion awaited formalization as a measurable property of large complex networks that allows small worlds to be distinguished from random graphs. Such a formalization was introduced by Watts &amp; Strogatz (1998), who characterize small worlds by two properties: First, unlike in regular graphs, any randomly chosen pair of nodes in a small world has, on average, a considerably shorter geodesic distance.1 Second, compared to random graphs, small worlds show a considerably higher level of cluster formation.</Paragraph> <Paragraph position="1"> In this framework, cluster formation is measured by means of the average ratio of the number ▽(v_i) of triangles connected to vertex v_i to the number ⊻(v_i) of triples centered on v_i (Watts</Paragraph> <Paragraph position="3"> Alternatively, the cluster coefficient C1 computes the ratio of the number of triangles in the whole network to the number of its connected vertex triples. Further, the mean geodesic distance l of a network is the arithmetic mean of the lengths of the shortest paths between all pairs of vertices in the network. Watts and Strogatz observe high cluster values and short average geodesic distances in small worlds, which apparently combine cluster formation with shortcuts as prerequisites of efficient information flow. 
In the area of information networks, this property has been demonstrated for the WWW (Adamic,</Paragraph> <Paragraph position="5"> In addition to the SW model of Watts &amp; Strogatz, link distributions were also examined in order to characterize complex networks: Barabási &amp; Albert (1999) argue that the vertex connectivity of social networks is distributed according to a scale-free power law. They draw on the observation, confirmed by many social-semiotic networks but not by instances of the random graph model of Erdős &amp; Rényi (Bollobás, 1985), that the number of links per vertex can be reliably predicted by a power law. Thus, the probability P(k) that a randomly chosen vertex interacts with k other vertices of the same network is approximately P(k) ∼ k^{-γ}.</Paragraph> <Paragraph position="7"> Successfully fitting a power law to the distribution of out degrees of vertices in complex net- other. Note that all coefficients presented in the following sections relate by default to undirected graphs.</Paragraph> <Paragraph position="8"> poorly connected, while a select minority of hubs will be very highly connected.&quot; (Watts, 2003, p.107). Thus, for a fixed number of links, the smaller the value of γ, the shallower the slope of the curve in a log-log plot and the higher the number of edges to which the most connected hub is incident.</Paragraph> <Paragraph position="9"> A limitation of this model is that it treats the probability of linking a source node to a target node as depending solely on the connectivity of the latter. In contrast, Newman (2003) proposes a model in which this probability also depends on the connectivity of the former. This is done in order to account for social networks in which vertices tend to be linked if they share certain properties (Newman and Park, 2003), a tendency called assortative mixing. According to Newman &amp; Park (2003), it allows social networks to be distinguished from non-social (e.g. artificial and biological) ones even if both are uniformly classified as small worlds according to the model of Watts &amp; Strogatz (1998). 
Newman &amp; Park (2003) analyze assortative mixing of vertex degrees, that is, the correlation of the degrees of linked vertices.</Paragraph> <Paragraph position="10"> They confirm that this correlation is positive in the case of social networks, but negative in the case of technical networks (e.g. the Internet), which thus exhibit disassortative mixing (of degrees).</Paragraph> <Paragraph position="11"> Although these SW models have been applied to citation networks, WWW graphs, semantic networks and co-occurrence graphs, and thus to a variety of linguistic networks, a comparative study that focuses on wiki-based structure formation in comparison to other networks of textual units is still missing. In this paper, we present such a study. That is, we examine SW coefficients that allow wiki-based systems to be distinguished from more &quot;traditional&quot; networks. In order to do that, a generalized web document model is needed to uniformly represent the document networks to be compared.</Paragraph> <Paragraph position="12"> In the following section, a web genre model is outlined for this purpose.</Paragraph> </Section> </Paper>