User:Leucosticte/Inclupedia/Proposal

From Meta, a Wikimedia project coordination wiki

This is a proposal for a new wiki, Inclupedia. It exists in several different places in the Wikisphere, but here it is again.

Problem

Hundreds of useful or interesting articles are deleted from Wikipedia every day for failing to meet notability criteria. Examples include articles describing the ICapella business application software, speculating on the possibly upcoming Xbox 360 Slim, and profiling data conversion innovator Daniel Anthony O'Hara; there are many others. The deletion debates on these articles consume a great deal of time and cause bad blood among editors. Various ideas for deletion reform have been proposed, but the community, fearing harmful consequences of untested ideas, has balked at implementing them.

There is a website, Deletionpedia, that uses a bot to harvest some articles whose deletion is pending. These articles are posted on Deletionpedia, but the wiki is locked from editing, so that further collaboration to improve the articles is impossible. Furthermore, the site's wikilinks to existing Wikipedia articles are non-functional.

Another problem is that Wikimedia splits its content among many wikis, rather than putting it in one place. None of its wikis except Wikipedia has been particularly successful. Wikiquote, Wiktionary, and Wikisource, for instance, are superfluous; it would be better to simply have Wikipedia articles entitled, say, quotes related to anime, definitions of anime, and text of Romeo and Juliet, rather than putting those subjects on separate wikis. The fragmentation of content among many wikis would not matter so much if there were other methods of integration, such as multi-wiki watchlists and cross-wiki page existence detection, but these capabilities have not been developed yet.

Solution

Create a new MediaWiki-based wiki that imports all Wikipedia content, then collects recent changes from Wikipedia as they are made and imports them into its database, so that the mirror stays up to date in real time. Implement Pure Wiki Deletion to ensure that some of the same deletion-related problems that arise on Wikipedia do not arise on this wiki. Implement other ideas, such as forced wikilinks and front and back matter, which are needed to ensure editorial independence from Wikipedia for articles mirrored to the wiki, and which will be explained in more detail later.
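As a rough illustration of the "collect recent changes" step, here is a minimal Python sketch. It assumes the changes are polled through the MediaWiki Action API's list=recentchanges module; the endpoint URL, the function names, and the sample response are illustrative assumptions, not part of the proposal. The sketch only builds the request URL and reduces a decoded response to the events the mirror would care about; it sends no requests.

```python
from urllib.parse import urlencode

# Assumption: English Wikipedia's public endpoint; any MediaWiki api.php would do.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_recentchanges_url(limit=50, rccontinue=None):
    """Build a list=recentchanges query URL (no request is sent here)."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcprop": "title|ids|timestamp|loginfo",
        "rctype": "edit|new|log",
        "rclimit": limit,
        "format": "json",
    }
    if rccontinue is not None:
        # Continuation token returned by the previous batch, if any.
        params["rccontinue"] = rccontinue
    return API_URL + "?" + urlencode(params)


def extract_changes(response):
    """Reduce a decoded API response to (kind, title) pairs.

    kind is "edit", "new", or "delete"; log entries other than page
    deletions are ignored in this sketch.
    """
    changes = []
    for rc in response.get("query", {}).get("recentchanges", []):
        if rc.get("type") == "log":
            if rc.get("logtype") == "delete" and rc.get("logaction") == "delete":
                changes.append(("delete", rc["title"]))
        elif rc.get("type") in ("edit", "new"):
            changes.append((rc["type"], rc["title"]))
    return changes


# Hand-written sample in the general shape of an API response (not real data).
SAMPLE_RESPONSE = {
    "query": {
        "recentchanges": [
            {"type": "new", "title": "Example article"},
            {"type": "edit", "title": "Anime"},
            {"type": "log", "logtype": "delete", "logaction": "delete",
             "title": "Example article"},
            {"type": "log", "logtype": "block", "logaction": "block",
             "title": "User:Example"},
        ]
    }
}

if __name__ == "__main__":
    print(build_recentchanges_url(limit=10))
    print(extract_changes(SAMPLE_RESPONSE))
```

A real importer would also have to fetch the revision text for each change, handle moves and undeletions, and respect Wikimedia's rate limits, none of which is attempted here.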

Specifics

There will be two types of articles on Inclupedia.

  1. Articles that exist on Wikipedia and are mirrored on Inclupedia. Clicking to edit one of these articles takes you to the edit screen on Wikipedia. However, the History tab shows the history as it has been mirrored on Inclupedia.
  2. Articles that do not exist on Wikipedia, but exist on Inclupedia.

Articles of the first type can become articles of the second type, if they are deleted from Wikipedia. Articles of the second type can likewise become articles of the first type, if they are created on Wikipedia. For this reason, Inclupedia will automatically track new and deleted pages via Wikimedia's RecentChanges IRC channels and API and update its database accordingly. There will be a field in the page table to track page type (1 or 2).
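The type transitions described above can be sketched as follows. This is a minimal Python illustration using an in-memory SQLite table; the column name page_type, the simplified table layout, and the function names are assumptions made for the example, not MediaWiki's actual schema.

```python
import sqlite3

MIRRORED = 1  # exists on Wikipedia and is mirrored on Inclupedia
LOCAL = 2     # exists on Inclupedia only


def open_db():
    """Create an in-memory stand-in for the page table (hypothetical layout)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE page ("
        " page_title TEXT PRIMARY KEY,"
        " page_type INTEGER NOT NULL CHECK (page_type IN (1, 2)))"
    )
    return db


def on_wikipedia_create(db, title):
    """A page created on Wikipedia becomes, or starts out as, type 1."""
    db.execute(
        "INSERT OR REPLACE INTO page (page_title, page_type) VALUES (?, ?)",
        (title, MIRRORED),
    )


def on_wikipedia_delete(db, title):
    """A page deleted from Wikipedia is kept on Inclupedia as type 2."""
    db.execute(
        "UPDATE page SET page_type = ? WHERE page_title = ?",
        (LOCAL, title),
    )


def create_local(db, title):
    """A page written directly on Inclupedia starts out as type 2."""
    db.execute(
        "INSERT INTO page (page_title, page_type) VALUES (?, ?)",
        (title, LOCAL),
    )


def page_type(db, title):
    """Return 1, 2, or None if Inclupedia has no such page."""
    row = db.execute(
        "SELECT page_type FROM page WHERE page_title = ?", (title,)
    ).fetchone()
    return row[0] if row else None
```

For example, calling on_wikipedia_create and then on_wikipedia_delete for the same title leaves the page in the table with type 2, so it remains available for editing on Inclupedia.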

Opportunities

Wikimedia regularly makes database dumps, albeit outdated ones, available to the public; these can provide a starting point for the wiki database. The database can then be brought up to date within a month or two of collecting recent changes. Most Wikimedia content is Creative Commons-licensed, which obviates most copyright issues. The MediaWiki codebase, including extensions, is open source and available under the GNU General Public License. Collaboration with other programmers is possible via MediaWiki.org and the MediaWiki Bugzilla. Use of shared web hosting services (including virtual private servers) could help minimize costs in the early stages of the project. There are large numbers of Wikipedia editors and ex-editors who are disgruntled at current deletion policy. The major for-profit competition, Wikia, founded by Wikipedia founder Jimbo Wales, is poorly managed and has been mocked by the media and criticized by its users for its shortcomings. First-mover advantages could be significant, given the network effect inherent in wikis. In some ways, the trail has already been blazed for us, because there are already massive MediaWiki installations on the Internet (most notably, Wikimedia) that have figured out the methods of load-balancing, Squid caching, and the like, and have made those methods publicly available.

Challenges

Technical

Many parts of the MediaWiki code are inadequately documented, which increases the time required to figure out how to work with it. Creativity will be needed to come up with ways of implementing the functionality we need without creating a fork of the code, which would cause us to lose many of the advantages of being part of the MediaWiki development community. The OAI-PMH Wikimedia update feed service apparently is no longer publicly available, and the feasibility of other methods, such as RSS, has not yet been examined in detail. When we hit our stride, we will probably be the largest wiki in the world, managing file systems containing millions of files and massive databases, some of whose tables will have millions of rows. The sheer quantity of data, which will amount to several terabytes, will present challenges, including the financial cost of storing, manipulating and serving it.

Legal

Some content is provided under "fair use"; the legal issues involved there will need to be examined in greater detail. Wikimedia is sensitive to the legal issues involved in resurrecting deleted content, due to the possible presence of copyright violations, libel, threats, etc., and may oppose this project as a potential legal liability for them. Wikimedia monitors traffic to its sites and blocks websites that cause excessive load on its server resources, especially commercial sites retrieving data from Wikipedia for what it considers unacceptable purposes. Jimbo Wales still serves on the seven-member Wikimedia Board of Trustees, and may not be very pleased to see competition overtake Wikia.

Competitive

On the other hand, because all of our project's software will be open-source, Wikia and other sites will also be able to readily copy and use it. Thus, we could find ourselves taking on the risk while other sites reap the rewards. It is also possible, especially if we don't stay on our toes in keeping the site the best it can be, that another wiki could do to our wiki the same thing that we are planning to do to Wikipedia: mirror the content, improve upon site implementation, and integrate the material into a larger system. That result might not be inconsistent with our mission, however.