<?xml version="1.0" standalone="yes"?>
<Paper uid="I05-2005">
  <Title>A Novel Method for Content Consistency and Efficient Full-text Search for P2P Content Sharing Systems</Title>
  <Section position="4" start_page="26" end_page="27" type="metho">
    <SectionTitle>
3 System Architecture
</SectionTitle>
    <Paragraph position="0"> Figure 1 shows the architecture of our system. As described earlier, we chose a central server architecture to provide a full-text search of the contents.</Paragraph>
    <Paragraph position="1"> The public keys of the clients are stored on the central server. By sending a request to the server, a client can obtain the public key of any other client connected to the central server. The central server also has its own private and public keys; its public key is available to all clients.</Paragraph>
    <Paragraph position="2"> Each client has a unique ID. When a client connects to the central server, it sends its own IP address. Another client can then obtain that IP address by querying the server with the client's ID. The central server provides a content consistency maintenance mechanism and a full-text search engine. These mechanisms are described in the following sections.</Paragraph>
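    <Paragraph> The key and address lookups described above can be illustrated with a minimal Python sketch. It is only an in-memory model under stated assumptions: the names CentralDirectory, ClientRecord, register_key, connect, get_public_key and get_ip_address are hypothetical, keys are treated as opaque bytes, and the step at which a client registers its public key is not specified in the text.</Paragraph>
    <Paragraph><![CDATA[
```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ClientRecord:
    public_key: bytes          # opaque key material (format is an assumption)
    ip_address: Optional[str]  # None until the client connects


class CentralDirectory:
    """Hypothetical in-memory directory kept by the central server."""

    def __init__(self, server_public_key: bytes) -> None:
        # The server's own public key is available to every client.
        self.server_public_key = server_public_key
        self._clients: Dict[str, ClientRecord] = {}

    def register_key(self, client_id: str, public_key: bytes) -> None:
        # Store a client's public key under its unique ID.
        self._clients[client_id] = ClientRecord(public_key, None)

    def connect(self, client_id: str, ip_address: str) -> None:
        # A connecting client reports its current IP address.
        self._clients[client_id].ip_address = ip_address

    def get_public_key(self, client_id: str) -> bytes:
        return self._clients[client_id].public_key

    def get_ip_address(self, client_id: str) -> Optional[str]:
        return self._clients[client_id].ip_address


directory = CentralDirectory(server_public_key=b"server-key")
directory.register_key("client-a", b"key-a")
directory.connect("client-a", "192.0.2.10")
```
]]></Paragraph>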
    <Section position="1" start_page="26" end_page="26" type="sub_section">
      <SectionTitle>
4 Content Consistency Maintenance
4.1 Data Structure for Content Management
</SectionTitle>
      <Paragraph position="0"> In this system, the publisher of a document digitally signs the document with its private key and registers the signature, together with its unique ID, with the central search server.</Paragraph>
      <Paragraph position="1"> When a document is a text document, the client performs morphological analysis to generate search keywords from it.</Paragraph>
      <Paragraph position="2"> The content IDs and the digital signatures corresponding to the different versions are managed on the central search server. A client can obtain the digital signature for a document by querying the central server with the document's ID and version. Verifying the digital signature ensures that a malicious client has not tampered with the document.</Paragraph>
      <Paragraph position="3"> A search result obtained from the central server is also digitally signed to ensure that a client does not tamper with it. As described in detail in Section 5, a search result is cached on a client, where it could be modified. To prevent this, a search result comprises the ID of the contents and a digital signature.</Paragraph>
      <Paragraph position="4"> In this system, when a document is updated, a client can obtain its latest version by querying the central server with the document's ID. A limitation of this method is that only the latest version of a document can be obtained. By contrast, with indirect files and hash values of contents as in Freenet, a previous version of a document can be obtained by directly specifying the hash value of that earlier version. However, Freenet guarantees neither that the latest version is always obtained nor that a particular earlier version can be obtained, because a previous version may be deleted if it is not requested for a certain period. In our system, we consider only the latest version of a document, which can be obtained at any time. Thus, we define our document query protocol so that it obtains the latest version.</Paragraph>
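      <Paragraph> As an illustration of how signatures registered per document ID and version allow tampering to be detected, consider the following Python sketch. It rests on several assumptions that are not in the text: the signature scheme is toy textbook RSA over a SHA-256 digest with a small fixed key (chosen only so the example runs with the standard library; it is not secure), a single key pair stands in for the publisher, and the names SignatureRegistry, register, get_signature, latest_version, sign and verify are hypothetical.</Paragraph>
      <Paragraph><![CDATA[
```python
import hashlib
from typing import Dict, Tuple

# Toy textbook-RSA key built from two Mersenne primes.
# Illustration only: far too small and unpadded to be secure.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest(document: bytes) -> int:
    # Reduce the SHA-256 digest modulo N so it fits the toy key.
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    # Publisher side: sign the digest with the private exponent.
    return pow(digest(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    # Client side: check the signature with the public exponent.
    return pow(signature, E, N) == digest(document)


class SignatureRegistry:
    """Hypothetical server-side table: (document ID, version) -> signature."""

    def __init__(self) -> None:
        self._signatures: Dict[Tuple[str, int], int] = {}

    def register(self, doc_id: str, version: int, signature: int) -> None:
        self._signatures[(doc_id, version)] = signature

    def get_signature(self, doc_id: str, version: int) -> int:
        return self._signatures[(doc_id, version)]

    def latest_version(self, doc_id: str) -> int:
        return max(v for (d, v) in self._signatures if d == doc_id)


registry = SignatureRegistry()
original = b"first draft"
revised = b"second draft"
registry.register("doc-1", 1, sign(original))
registry.register("doc-1", 2, sign(revised))

latest = registry.latest_version("doc-1")
is_authentic = verify(revised, registry.get_signature("doc-1", latest))
is_tampered_accepted = verify(b"tampered", registry.get_signature("doc-1", latest))
```
]]></Paragraph>
      <Paragraph> A client that receives a document from another peer would fetch the signature for the document's ID and version from the server and accept the document only if verification succeeds. A real deployment would use a standard signature scheme from a vetted cryptographic library instead of this toy key.</Paragraph>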
      <Paragraph position="5"> To prevent download requests from concentrating on a single client, our system maintains a list of the clients that have downloaded the latest version of a document and uses this list to distribute download requests among them.</Paragraph>
      <Paragraph position="6"> In this method, the ID of a client that downloads the latest version of a document is added to a list associated with the ID of that document. When a client sends a request to the central server to download a document, the central server selects an appropriate client from the document's downloader list and returns that client's ID to the requester. When the publisher updates a document, the list corresponding to that document is emptied.</Paragraph>
      <Paragraph position="7"> We describe this procedure with the following pseudocode, where download is a function that requests the download of a document, nodeId is the ID of the client that requests the download, update is a function that requests the update of a document, and getNodeId is a function that gets the ID of a client that has downloaded the document whose ID is docId.</Paragraph>
      <Paragraph position="8"> nodeIdList: document ID x node ID list</Paragraph>
      <Paragraph position="10"/>
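      <Paragraph> The procedure described above can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the original pseudocode: the names download, update, getNodeId, nodeIdList, nodeId and docId come from the text, while DownloadCoordinator, publish and the publisher table are hypothetical. The text does not say how an appropriate client is selected, so the sketch picks one at random and falls back to the publisher when the list is empty; it also assumes every requested download completes.</Paragraph>
      <Paragraph><![CDATA[
```python
import random
from typing import Dict, List, Optional


class DownloadCoordinator:
    """Hypothetical server-side bookkeeping for download sources."""

    def __init__(self, rng: Optional[random.Random] = None) -> None:
        # nodeIdList: document ID -> IDs of clients holding the latest version
        self.nodeIdList: Dict[str, List[str]] = {}
        # publisher: document ID -> publisher's client ID (assumed fallback)
        self.publisher: Dict[str, str] = {}
        self._rng = rng or random.Random()

    def publish(self, docId: str, publisherId: str) -> None:
        self.publisher[docId] = publisherId
        self.nodeIdList[docId] = []

    def getNodeId(self, docId: str) -> str:
        # Pick a client that already holds the latest version of docId.
        holders = self.nodeIdList[docId]
        if not holders:
            return self.publisher[docId]  # assumption: publisher serves first
        return self._rng.choice(holders)  # assumption: uniform random choice

    def download(self, docId: str, nodeId: str) -> str:
        # Return the source the requester should contact, then record the
        # requester as a holder (assumes the download succeeds).
        source = self.getNodeId(docId)
        if nodeId not in self.nodeIdList[docId]:
            self.nodeIdList[docId].append(nodeId)
        return source

    def update(self, docId: str) -> None:
        # A new version invalidates every previously recorded copy.
        self.nodeIdList[docId] = []


server = DownloadCoordinator()
server.publish("doc-1", "publisher-1")
first_source = server.download("doc-1", "node-1")
second_source = server.download("doc-1", "node-2")
holders_before_update = list(server.nodeIdList["doc-1"])
server.update("doc-1")
source_after_update = server.getNodeId("doc-1")
```
]]></Paragraph>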
    </Section>
    <Section position="2" start_page="26" end_page="27" type="sub_section">
      <SectionTitle>
4.2 Tracing How Contents are Exchanged
</SectionTitle>
      <Paragraph position="0"> In a P2P content sharing system that uses a simple download protocol, such as Napster, when a service</Paragraph>
      <Paragraph position="1"> [Figure labels: Client public keys; Contents certificate; Links to contents; Full-text search index; Contents]</Paragraph>
    </Section>
  </Section>
</Paper>