Reliable Wiki and Structured Information

From Meta, a Wikimedia project coordination wiki
This is a proposal for a new WMF sister project.
Status of the proposal
Reason: Inactive proposal. --Sannita (talk) 13:09, 18 September 2013 (UTC)

Disclaimer: This is a vision and an improvement suggestion for Wikipedia (and all related sister projects that use similar infrastructure). While I am very eager to see this happen and will be happy to devote significant volunteer effort if it is undertaken, at present I am not in a position to start the project (and, say, come up with a demo) alone. To make this project happen we need volunteers with PHP and DBMS expertise, as well as the initiative of the core members of the Wikimedia Foundation. So, if you are interested in joining the project, please list your expertise and describe how you think you will be able to contribute by signing here.

  1. To increase the reliability of Wikipedia's content while remaining consistent with the core Wikipedia philosophies, so as to make Wikipedia an acceptable and reliable source of knowledge in mainstream academia.
  2. To organize information, data and facts in Wikipedia in a more structured manner, so as to enable easier access, presentation and adaptation to future technologies.

The Proposal: Reliable Wiki and Structured Information - The future of Wikipedia

I personally believe, and I think most of us will agree, that Wikipedia has been one of the biggest achievements on the internet over the last decade or so. Besides being the largest encyclopaedia ever created by humans, I think it is fair to call it the biggest, most comprehensive and best-organized library in history. With technology progressing at an amazing rate, the future potential of Wikipedia is beyond our imagination. However, at some point we have all realized that the very strengths behind Wikipedia's amazing success have often backfired as potential pitfalls. We have also sometimes wondered how Wikipedia will adapt as technology progresses. It is important that something like Wikipedia, which has had the effort of hundreds of thousands of people put into its development and which has so much potential, be sustainable and live up to the expectations and popularity it has gained over the long term. Here are a few thoughts that the Wikimedia Foundation may consider implementing in future for Wikipedia's sustainable growth and usability.

Reliability - slow and steady (growth) will win the race (in the long run)

The lack of reliability caused by the "open structure" of Wikipedia has plagued it for a long time. I know students who have cited excellent Wikipedia articles in their school/college projects, only to have them rejected because Wikipedia can be edited by anyone and hence isn't considered reliable! Moreover, I came across some recent news articles that talked about the decreasing number of expert volunteers for Wikipedia, and hence the declining quality of articles. Let's face it - this strength of Wikipedia, which helped it grow exponentially, has backfired! If we take a moment to look at the bigger picture, rather than blindly sticking to the Wikipedia religion of "anybody can edit", we should be able to make some compromises and revise the methods, rules & policies of editing Wikipedia. I think it's high time that Wikipedia became more technical and relied more on experts for content, rather than on random volunteers who may or may not be mainstream experts in the field. For an encyclopaedia, it's important to present the mainstream, widely accepted concepts accurately, while also covering the alternative perspectives. So it's important that it gets input from mainstream "experts" on the subjects, with different perspectives. Here are a few thoughts towards implementing such a structure:

  1. The campaign for recruiting experts as volunteers: This is the first step that should be taken towards implementation. Much of the attention of the core volunteers and employees of Wikipedia should go towards recruiting experts and asking them to check newly added content, rather than maintaining/editing the articles themselves. Wikipedia has now grown to such a level of popularity that many experts will appreciate its importance & potential in academia and will volunteer to develop & maintain it. For example, students may ask their professors (experts in certain fields) to volunteer for Wikipedia, who in turn will spend some of their valuable time developing & approving content for Wikipedia. Wikipedia should gradually build a database of experts, their respective fields of expertise and how much time they have volunteered to devote to Wikipedia. An expert may be chosen by assessing his/her publications in the field in mainstream journals, his/her experience working in that field, his/her affiliations, and references.
  2. Holding new content & edits for approval: This is a technical feature that needs to be added. Instead of letting random volunteers edit the content directly, any edit made by a non-expert volunteer will be held for scrutiny by an expert or by core volunteers. Only experts will be able to approve major edits or the addition of new content (possibly after making their own edits, if required), and when an expert approves it, his/her signature will be added to the new edit/content so that users can reliably refer to it. The core group of volunteers will be able to approve only minor edits. Upon submitting an edit or new content to an article, the non-expert volunteer should be able to appeal to a list of experts for approval. Otherwise the review request will be sent to all relevant experts, who will see that edit/addition as an item on the "pending reviews" page of their Wikipedia account. Once one of the experts approves or rejects it, the request will be removed from the "pending reviews" pages of all the experts, or will be moved to a "rebuttal request" page (say, if the original submitter of the content disputes a rejection). Note that this feature needs to be in addition to the present revision comparison/revert feature. Only experts should have the privilege of reverting revisions. There will be two talk pages - the general talk page editable by anyone, and an expert discussion page editable by experts only.
  3. Roles of experts: The primary roles of the experts will be:
  • Reviewing, correcting and endorsing the existing content in Wikipedia that has been developed so far.
  • Assigning/requesting the tasks of creating, correcting and reviewing specific content to individuals/volunteers/experts who they think will be able to do justice to the topic.
  • Helping to recruit new experts by referring them.
  • Creating content (under a Creative Commons license) with proper references and following the present Wikipedia quality standards.
  • Reviewing, approving/rejecting and passing on (to other experts) new content and edits made by non-expert volunteers.
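
The hold-for-approval workflow above could be modelled roughly as follows. This is only a minimal sketch in Python for illustration (the proposal itself targets MediaWiki, which is written in PHP); the names `ReviewQueue`, `Edit`, `Status`, `pending_for`, `decide` and `dispute`, as well as the user names, are hypothetical and not part of any existing MediaWiki feature. It assumes that every review request goes to all experts, and leaves out the option of appealing to a chosen list of experts.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Set


class Status(Enum):
    PENDING = "pending review"
    APPROVED = "approved"
    REJECTED = "rejected"
    REBUTTAL = "rebuttal requested"


@dataclass
class Edit:
    edit_id: int
    author: str
    is_major: bool
    status: Status = Status.PENDING
    signed_by: Optional[str] = None  # reviewer's signature once approved


class ReviewQueue:
    """Hypothetical shared queue; each reviewer's page is a view of it."""

    def __init__(self, experts: Set[str], core_volunteers: Set[str]) -> None:
        self.experts = set(experts)
        self.core_volunteers = set(core_volunteers)
        self.edits: Dict[int, Edit] = {}

    def submit(self, author: str, is_major: bool) -> Edit:
        # New edits are held as PENDING instead of going live.
        edit = Edit(edit_id=len(self.edits) + 1, author=author, is_major=is_major)
        self.edits[edit.edit_id] = edit
        return edit

    def may_review(self, reviewer: str, edit: Edit) -> bool:
        # Experts may decide any edit; core volunteers only minor ones.
        if reviewer in self.experts:
            return True
        return reviewer in self.core_volunteers and not edit.is_major

    def pending_for(self, reviewer: str) -> List[Edit]:
        # The "pending reviews" page: undecided edits this reviewer may act on.
        return [e for e in self.edits.values()
                if e.status is Status.PENDING and self.may_review(reviewer, e)]

    def decide(self, reviewer: str, edit_id: int, approve: bool) -> None:
        edit = self.edits[edit_id]
        if edit.status is not Status.PENDING:
            raise ValueError("edit has already been decided")
        if not self.may_review(reviewer, edit):
            raise PermissionError(f"{reviewer} may not review this edit")
        edit.status = Status.APPROVED if approve else Status.REJECTED
        edit.signed_by = reviewer if approve else None

    def dispute(self, author: str, edit_id: int) -> None:
        # Moves a rejected edit to the "rebuttal request" state.
        edit = self.edits[edit_id]
        if edit.author != author or edit.status is not Status.REJECTED:
            raise ValueError("only the submitter can dispute a rejected edit")
        edit.status = Status.REBUTTAL


queue = ReviewQueue(experts={"alice", "bob"}, core_volunteers={"carol"})
edit = queue.submit(author="dave", is_major=True)
queue.decide("bob", edit.edit_id, approve=False)
print(queue.pending_for("alice"))  # prints [] : bob's decision cleared it for everyone
queue.dispute("dave", edit.edit_id)
```

Because every "pending reviews" page is computed from the same shared state, a single decision removes the request for all experts at once, which is the behaviour described in point 2.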

Structured information & dynamically generated contents - a vision for the future

While text content, organized in paragraphs and following the structure of the English language, is good for readability, it is probably not the best as far as maintainability is concerned. For many items Wikipedia maintains the same or similar content in multiple places, all of which need to be updated whenever the information changes. For example, a statistic (say, the population of a certain country) gets updated periodically. There may be multiple articles, multiple places in a single article, multiple figures/graphs, etc. that use the same data. Every time the data is updated, all of these need to be updated individually as well. Besides involving unnecessary, repetitive manual work, this may eventually lead to missed updates and old & inconsistent information. This is a challenge that can possibly be solved using increasingly powerful IT tools. Some thoughts follow.

  1. Structured database of information & data: Wikipedia can maintain a structured database of information and statistical data that are prone to change. Items in this database should be accessible through simple wiki markup. For example, the present population of the USA can be referenced as <<>> in any wiki content, and every time the wiki page is displayed the database is queried for the corresponding value and the text is generated. Thus, all the pages in Wikipedia that reference the present population of the USA will be updated automatically as soon as the corresponding database entry "" is updated, removing the burden of updating each page by hand. Editors should be able to search the database to find the appropriate items to reference in new content. Updating the database and adding new items can be done or approved by experts, using a method & policy similar to those described before.
  2. Derived items in database: The database described above may contain derived items computed using complex formulae. For example, say <<>> gives the present population of the state of Pennsylvania (and similarly for other states), and <<>> returns an array of the state codes of the USA. Then <<>> in the database may be a derived quantity given by "%v=0; for (each %sc in {; } return %v;".
  3. Dynamically generated media: Graphs, charts, etc. may be generated dynamically using items from the aforesaid database. One may imagine APIs (functions) that generate vector graphics from items in the database. So, a pie chart showing the distribution of population among the countries will get updated as soon as the database entry for "" is updated. Creating such media from a user's perspective is a different story, though - some user-friendly GUI tools may be developed for that purpose. To avoid the computationally expensive process of regenerating the graphics every time the page/content is requested, it is always possible to maintain a cache where the generated graphics are stored as an image file, which in turn can be refreshed periodically by re-running the API that generates the graph from the data in the database.
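
The three ideas above (database lookups from wiki markup, derived items, and cached generated media) might be sketched as follows. This is a rough illustration in Python, not a proposed implementation: `FactStore`, `render` and `CachedChart` are hypothetical names, the keys such as `population/USA` are invented for the example, and the population figures are placeholders rather than real data. The cache here is refreshed whenever the store changes, which is an assumption of this sketch; the text above suggests periodic refreshing instead.

```python
import re
from typing import Any, Callable, Dict


class FactStore:
    """Hypothetical in-memory stand-in for the proposed structured database."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._derived: Dict[str, Callable[["FactStore"], Any]] = {}
        self.version = 0  # bumped on every change, used for cache invalidation

    def set(self, key: str, value: Any) -> None:
        self._values[key] = value
        self.version += 1

    def define_derived(self, key: str, formula: Callable[["FactStore"], Any]) -> None:
        self._derived[key] = formula
        self.version += 1

    def get(self, key: str) -> Any:
        # Derived items are recomputed from their inputs on each lookup.
        if key in self._derived:
            return self._derived[key](self)
        return self._values[key]


_REFERENCE = re.compile(r"<<([^<>]+)>>")


def render(wikitext: str, store: FactStore) -> str:
    """Replace every <<key>> reference with the current database value."""
    return _REFERENCE.sub(lambda m: str(store.get(m.group(1).strip())), wikitext)


class CachedChart:
    """Regenerates a chart only when the store has changed since the last build."""

    def __init__(self, store: FactStore, build: Callable[[FactStore], str]) -> None:
        self._store = store
        self._build = build
        self._built_at = -1
        self._image = ""

    def get(self) -> str:
        if self._built_at != self._store.version:
            self._image = self._build(self._store)
            self._built_at = self._store.version
        return self._image


store = FactStore()
store.set("state_codes", ["PA", "NY"])  # placeholder subset of states
store.set("population/PA", 100)         # placeholder figures, not real data
store.set("population/NY", 250)
store.define_derived(
    "population/USA",
    lambda s: sum(s.get(f"population/{code}") for code in s.get("state_codes")),
)

page = "The present population is <<population/USA>>."
print(render(page, store))  # The present population is 350.
store.set("population/PA", 120)
print(render(page, store))  # The present population is 370.
```

The page text is never edited in this example; only the database entry changes, and every page that references it picks up the new value the next time it is rendered.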

In future, information will be even more easily accessible & navigable than by clicking on links while reading through paragraphs. It may be integrated with our brain and senses for fast retrieval. It would be like having the whole encyclopaedia in one's head - the third super-hemisphere of the brain. The way to structure information so that it is accessible through such devices is not as paragraphs of text, but as a graph structure of information. The paragraphs and text in English or any other language may be a layer on top of the underlying information graph, just for the purpose of presentation & readability. One good initiative in this direction can be found in the Visual Thesaurus or Google Image Swirl. It is important that Wikipedia (and in fact the whole of the internet) starts organizing its content & information in such a structure, so that it can adapt to the new technologies of the future with the fewest hiccups. In fact, in this process Wikipedia may set new standards and methods for sharing information over the internet, and hence take a lead in advancing information technology as a whole.

Wikipedia - the free, ever-growing, most informative and largest Encyclopaedia - that's the way we want it to be. It has come a long way towards its goals - thanks to the great initiatives and efforts of some extraordinary people who have taken action to realize the dream of a world where knowledge & information are free and accessible to all. Now it's time to make it stronger and let it grow in a sustainable way. My dream, and that of many others, is to see Wikipedia as the central library of all human knowledge, one that connects each of us and helps humanity progress towards a better future.

Subh83 20:09, 26 November 2009 (UTC)

People Interested in joining

Besides your name/userid, please include your expertise and how you think you will be able to contribute to the project. We will start a mailing list once we have a significant number of volunteers for this project.

  • Subhrajit Bhattacharya, Subh83 - I am willing to develop the framework for MediaWiki that will enable the implementation of the above ideas. I hope my PHP and SQL expertise will be of some help.


Add your comments in the talk page