P2P

From Meta, a Wikimedia project coordination wiki

This page is a first draft of a system that provides articles (from Wikipedia) over peer-to-peer networks.

P2Pedia

The idea behind P2Pedia (P2P + Wikipedia) is to use the technical strengths of existing, and perhaps future, peer-to-peer networks to deliver the collected knowledge to everyone who cannot or will not reach the official main servers. Download speeds could also increase, especially for large media files or preloading, and the official HTTP servers might carry less load, which could improve their reachability. This also fits the philosophy of a wiki: just as editing is decentralized, open to all and without an editorial board, publishing the data would be decentralized too.

Client

The aim is to use existing open-source P2P clients to download articles, although this may only be possible by modifying that software. Modules or patches could give these clients the ability to download articles. It is also advisable to contact the developer communities of such software.

- Alternatively, new software could be written (best method?) that connects both to the main servers (to identify the current article, search, edit and download text) and to P2P networks (to download media and for preloading).

Editing

Not (easily) possible. The main servers must hold the current version, so editing is only possible at the main servers. P2P networks that support it, or combined clients, might proxy edits to the main servers; this is useful for accessing Wikipedia from hostile environments.

File format

- Situation: Existing P2P networks identify a file by a content-based hash code, and most of them are searchable by filename (including extension) but not by content. Some software (Shareaza, Trid, ...?) can identify a binary format and/or extract file-embedded metadata (ID3 tags, movie length, etc.), index that metadata and make it searchable.

Needs

  • wiki identifier (en, de, meta)
  • title
  • version/date
  • keywords, categories
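
As a minimal sketch of how the fields above could travel with a file, the record below packs the wiki identifier, title and version/date into a filename, since filename search is what most networks offer. The class name `ArticleFile`, the `.p2pedia` extension, the dot-separated layout and the compact UTC timestamp are all assumptions made for illustration; keywords and categories are kept in the record but would need embedded metadata to be searchable.

```python
from dataclasses import dataclass
from urllib.parse import quote, unquote

SUFFIX = ".p2pedia"  # hypothetical file extension, not an existing format


@dataclass(frozen=True)
class ArticleFile:
    """Hypothetical record for the fields listed under "Needs"."""
    wiki: str             # wiki identifier, e.g. "en", "de", "meta" (assumed to contain no dot)
    title: str            # article title
    revision: str         # version/date as a compact UTC timestamp (no dot)
    keywords: tuple = ()  # keywords and categories, not encoded in the filename

    def filename(self) -> str:
        # Percent-encode the title so characters such as "/" are filename-safe.
        safe_title = quote(self.title, safe="")
        return f"{self.wiki}.{safe_title}.{self.revision}{SUFFIX}"

    @classmethod
    def from_filename(cls, name: str) -> "ArticleFile":
        if not name.endswith(SUFFIX):
            raise ValueError(f"not a {SUFFIX} file: {name!r}")
        stem = name[: -len(SUFFIX)]
        # The first dot ends the wiki identifier and the last dot starts the
        # revision, so dots inside the title itself are preserved.
        wiki, rest = stem.split(".", 1)
        safe_title, revision = rest.rsplit(".", 1)
        return cls(wiki, unquote(safe_title), revision)


article = ArticleFile("en", "Peer-to-peer", "20060102T150405Z")
print(article.filename())
```

Percent-encoding keeps the name unambiguous but hides some title words from keyword search, so a real format would have to weigh the two.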

Information used to identify the current article

When you request an article from Wikipedia, you need only its title. The server returns the current version of the article.

- Situation: in P2P networks, many different versions of the same article (with the same title) exist side by side.
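
One possible way for a client to cope with several versions is to pick the newest one among the search results it has seen. The sketch below assumes each result carries a wiki identifier, a title, a revision timestamp in a format that sorts lexicographically, and the network's content hash; the `Candidate` type and `pick_newest` function are hypothetical names. Note that this only finds the newest version the client happened to discover, which is not necessarily the current one on the main servers.

```python
from typing import Iterable, NamedTuple, Optional


class Candidate(NamedTuple):
    """Hypothetical search result for one version of an article."""
    wiki: str
    title: str
    revision: str      # compact UTC timestamp, e.g. "20060102T150405Z"
    content_hash: str  # hash the network uses to identify the file


def pick_newest(results: Iterable[Candidate], wiki: str, title: str) -> Optional[Candidate]:
    """Return the latest known revision of the article, or None if none matched."""
    matching = [r for r in results if r.wiki == wiki and r.title == title]
    # Timestamps in this format compare correctly as plain strings.
    return max(matching, key=lambda r: r.revision, default=None)


results = [
    Candidate("en", "Peer-to-peer", "20060102T150405Z", "hash-a"),
    Candidate("en", "Peer-to-peer", "20060301T090000Z", "hash-b"),
    Candidate("de", "Peer-to-peer", "20060401T000000Z", "hash-c"),
]
print(pick_newest(results, "en", "Peer-to-peer"))
```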

...

Links

  • may have to be modified
  • must contain the identifier of the target article (see above)
  • a keyspace must be reserved for each article
    • one article --> one keyspace
    • one article revision --> one key (sequential?) in the keyspace
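
The keyspace idea can be sketched as follows, assuming the keyspace is a SHA-1 hash of the wiki identifier and title and that revisions are numbered sequentially (the list leaves the numbering open). The function names and the `keyspace/sequence` layout are illustrative assumptions, not part of any existing network.

```python
import hashlib


def article_keyspace(wiki: str, title: str) -> str:
    """One article -> one keyspace: a hash of wiki identifier and title."""
    # A newline separates the fields; titles are assumed not to contain one.
    return hashlib.sha1(f"{wiki}\n{title}".encode("utf-8")).hexdigest()


def revision_key(wiki: str, title: str, seq: int) -> str:
    """One article revision -> one key inside the article's keyspace."""
    return f"{article_keyspace(wiki, title)}/{seq}"


print(revision_key("en", "Peer-to-peer", 1))
print(revision_key("en", "Peer-to-peer", 2))
```

With sequential keys a client can probe `seq + 1` to check whether a newer revision exists, at the cost of one lookup per probe.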

Existing P2P Wiki Projects

Bouillon purports to be a P2P-oriented wiki implemented over Jabber. There is no single global version of any page: each user's agent retrieves and assembles pieces of a page from the user's friends, friends of friends, etc.

Networks

eDonkey eMule

PROs

  • Bigger network population
  • Long availability

CONs

  • It cannot host many files using the server infrastructure
  • It could require a dedicated server to index the available articles

Kademlia

  • DHT
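
For orientation, Kademlia assigns nodes and keys identifiers from the same space and measures the distance between two identifiers as their bitwise XOR; a key is stored on the nodes closest to it. The sketch below illustrates only that lookup rule on an in-memory list. The helper names and the use of SHA-1 to derive an identifier from an article name are assumptions for illustration, and no networking or routing table is modelled.

```python
import hashlib
from typing import List


def make_id(name: str) -> int:
    """Derive a 160-bit identifier from a string (illustrative choice of SHA-1)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def closest_nodes(key: int, nodes: List[int], k: int = 3) -> List[int]:
    """Return the k node IDs with the smallest XOR distance to the key."""
    return sorted(nodes, key=lambda node: node ^ key)[:k]


# Small 4-bit IDs make the distances easy to follow by hand.
nodes = [0b1000, 0b0010, 0b1011, 0b0101]
print(closest_nodes(0b1010, nodes, k=2))  # distances are 2, 8, 1 and 15

# The same rule applies to full-size identifiers.
article_key = make_id("en/Peer-to-peer")
peers = [make_id(f"peer-{i}") for i in range(10)]
print(len(closest_nodes(article_key, peers)))
```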

Freenet

  • slow (should be somewhat faster in 0.7)
  • anonymous
  • hostile environment capable (with darknet)
  • naming is easy - SSK@pubkey,cryptokey,extra/wikipedia/edition#/articleName-language
  • linking is likewise easy (relative links within the SSK)
  • good match to web-style stuff like this
  • 0.8 may have Tor-style internal servers, which could be used for editing

Gnutella 2

PROs

  • fast
  • good keyword searches
  • good metadata searches
  • list of available articles is updated immediately

CONs

  • the QHT is limited to about 1 million entries (20 bits, i.e. 2^20 = 1,048,576);
  • QHTs larger than 24 bits are improbable for now;
  • this probably needs a dedicated QHT (1 entry per article file), as a QHT should be filled to no more than 5-10%
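
To make the limits above concrete, the following worked arithmetic computes how many article files fit into a QHT of a given size at the stated 5-10% fill. It assumes one entry per article file, as the list suggests; the function name is hypothetical.

```python
def qht_capacity(bits: int, fill_percent: int) -> int:
    """Number of entries a QHT with 2**bits slots holds at the given fill level."""
    slots = 1 << bits
    return slots * fill_percent // 100  # integer arithmetic, rounded down


for bits in (20, 24):
    low = qht_capacity(bits, 5)
    high = qht_capacity(bits, 10)
    print(f"{bits}-bit QHT: {1 << bits} slots, {low} to {high} article files")
```

Under these assumptions a 20-bit table holds roughly 52,000 to 105,000 article files, and even a 24-bit table holds about 839,000 to 1.68 million, so a single table would be tight for a large Wikipedia edition.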

BitTorrent

PROs

  • fast for big files

CONs

  • probably not so fast for small ones, e.g. articles?
  • what exactly is distributed-tracker BitTorrent capable of?
    • the DHT needs keys inserted into the network one key/article at a time, so inserting many articles is problematic for a short-lived client

Distributed revision control systems

I think P2Pedia should work like a P2P distributed revision control system.

See also

External links