The Wikipedia Library/Partners

From Meta, a Wikimedia project coordination wiki

The Wikipedia Library

Donor outreach:
Contacting high-quality publishers for donations of free access

The Wikipedia Library aspires to have access to the best available academic and scholarly reference sources in the world. Here is an overview of our existing relationships and our first targets for future partnerships.

TWL has a strong track record of working with database providers. We have worked with over 40 organizations that have donated accounts free of charge to selected, vetted, top Wikipedians. The program is very popular among our volunteer editor base, many of whom do not have access to large, professional databases.

The English Wikipedia sees millions of page views every month, and many of our entries are the top search engine result on their topic. Our content greatly benefits from the research of publishers and databases, and outgoing citation links from articles would benefit databases and publishers by exposing them to our very large readership.

Direct benefits include:

  • Links through Wikipedia's citations to databases, journals, and articles
  • Greater name recognition for partners through being associated with our articles and our donation signups
  • Good publicity from partnering with a very popular free online resource that serves an inspiring global nonprofit mission

The Wikipedia Library takes care of all account management, including vetting suitable recipients for accounts (our criteria include tenure, track record of content work, and account stability), dispensing logins and passwords, and maintaining metrics for how the accounts are used. Publishers and databases simply provide the login information, and TWL takes it from there.

We are available to discuss these collaborations further by phone, email, or video chat with anyone interested in pursuing a partnership. Such a partnership has the potential to benefit both our organization and theirs, as well as helping spread properly researched, high-quality encyclopedic knowledge to the world.

Mutual benefit

Collaborating with Wikipedia provides great benefit to our editors and readers as well as to the publishers and databases. We like to work in a way that helps everyone achieve their goals: a win-win.

Background: The scope and scale of Wikipedia

  • Wikipedia is huge: Wikipedia has over 30 million articles, 4 million of them in English. There are over 16 million images. The site receives 8,000 views per second and 500 million unique visitors per month. It's the 6th most popular website in the world. Wikipedia has been built from 2.1 billion edits. It's over 2,000 times as large as Encyclopedia Britannica's 2002 deluxe edition.
  • Wikipedia is voluntary: Wikipedia is created almost entirely by volunteers. There are 20 million registered users, but participation follows an internet 'power law': only 130,000 are active each month, and English Wikipedia has only 1,400 administrators, all working for free, with no central control.
  • Wikimedia is global: Wikipedia exists in 286 languages. It is part of the Wikimedia movement, which includes 15 projects covering images, data, a dictionary, a travel guide, species, quotes, books, source material, and wiki software: the sum of all human knowledge in all its parts. All of this content is free for anyone to use, reuse, or even sell. This means that all of the content incorporated into these projects must also be free (with a few exceptions for fair use).
  • Wikipedia is nonprofit: Wikipedia is facilitated by the Wikimedia Foundation in San Francisco, a donor-funded nonprofit with under 200 employees. This is a minuscule staff and budget compared to other top 10 web companies. We will never accept advertisements.
  • Wikipedia is principled: Wikipedia is built on a foundation of summarizing rather than engaging in debate, with the infamous Neutral point of view as its guiding star. High quality sources are weighed and incorporated in proportion to the reliability and prominence of their content and views. The site relies on information being Verifiable, meaning an editor or reader must be able to ‘look it up’ in one of those reliable sources. Wikipedia's community is driven by a Consensus model of discourse. Rather than voting in a democratic fashion, or deferring to an oligarchy of judges, Wikipedia almost always relies on extensive discussion and compromise to reach agreement. The community places a high value on civility, the ability to treat collaborators with decency and respect.
  • Wikipedia is reliable: In a 2005 study in the journal Nature, researchers found that Wikipedia was at least as accurate as Encyclopedia Britannica. Follow-up studies have had similar results. Both encyclopedias were found to have minor flaws. Britannica was a bit better organized. Wikipedia's errors were fixed more quickly, however. Wikipedia maintains reliability with a virtual filter of automated, semi-automated, and human review. Edits are informally peer reviewed post-publication, in a rolling process of vandalism hunting, source-checking, and content verification. The encyclopedia embodies the coders' motto: "Many eyeballs make all bugs shallow". With enough good people involved, anything becomes possible.
  • Wikipedia's use is ubiquitous: If there were any remaining debate about Wikipedia's ubiquitous usage, consider surveys about Wikipedia's medical content. Surveys found that 50% to 90% of physicians, 35% to 70% of pharmacists, and 94% of medical students use Wikipedia. Rates of usage are even higher for the lay public, and especially for high school and college students.

Principles

  • Access to high quality published sources enhances the encyclopedia's mission, improves our reliability, and enhances the efficiency of vital research.
  • A variety of free sources are available in local libraries, university libraries and through Google (search, news, archives, books, scholar).
  • Free and universally accessible sources are preferable to use on Wikipedia.
  • Many sources are not free or not accessible, requiring one to be in physical proximity to a building or have a subscription to view content.
  • Paywalled databases provide access to a variety of sources that Wikipedians would find useful in their regular content work.
  • Paywalled databases are expensive and would be unaffordable to a majority of the volunteer editors who work on the encyclopedia.
  • A collaboration between research databases and Wikipedia would be mutually beneficial.

What's in it for Wikipedia?

  • Access to publications without paying a subscription fee
  • Enhanced community relations with a provider of education resources
  • Motivation for expanding programs of similar collaborations with research databases
  • Another tool in the community's and editors' bag for improving articles

What's in it for publishers?

Mission advancement
  • Opportunity to improve the content on the largest encyclopedia in the history of the world
  • Deep altruistic motivation to improve a public good which benefits everyone
  • Alignment with organizational goals to spread high-quality information to the public
Exposure and promotion
  • Visibility within the community as having helped out with an essential aspect of site operations
  • Broad promotion of account sign-up opportunities
  • Social media, blog, and newsletter mentions
  • Opportunity to announce partnerships through internal press releases
  • Prominent placement in the Wikipedia Library navigation header
  • Direct links within references back to their websites and articles
  • Publisher credit using the |via parameter of our citation templates
  • Customized userboxes for editors to announce their subscription
High impact on viewership and readers
  • Tremendous leverage of highly trafficked Wikipedia articles, often the most read source on that subject in the world
  • Increased usage of their resources on Wikipedia (prior partnerships have seen 500–600% increases)
  • Wikipedia conducts metrics analysis of entire site data dumps to determine resource usage increases
  • Wikipedia editors provide global exposure to resources beyond the value of purchasing an individual license
  • Greater awareness among readers who follow links that the resources exist and are of useful quality
Security and predictability
  • Only highly active, experienced editors receive accounts
  • Because of strict signup requirements, very little risk of cannibalizing primary revenue streams
  • Wikipedia editors respect copyright and do not plagiarize from articles
  • Wikipedia editors do not share their account logins
  • Account recipients can agree to appropriate and necessary terms of use
  • Pilot programs with some of the top databases in the world have been tried and tested as successes
Easy and flexible implementation
  • Wikipedia handles the entire signup, distribution, and account management process
  • No contract or formal agreement required
  • Wikipedia handles all customer service issues directly with participants
  • Ability to trial programs with a limited number of accounts
  • Freedom to select whatever number of donated accounts works for the organization
  • Opportunity to review metrics before expanding or renewing partnerships

What it's not

  • A formal partnership or contractual relationship
  • A formal endorsement of one resource over other similar and competing research services
  • An agreement to advertise the resource services beyond what is normally done for the use of any source
  • An agreement to use a paywalled source where free versions of the same publications are available elsewhere

Metrics

Credo

Credo links grew nearly fivefold (an increase of about 370%), from 64 outgoing links to 302 outgoing links, between January 2009 and November 2013.

External Link Growth Resulting from Wikipedia Library & CREDO Partnership
  • The results below include all links in articles of the form credoreference.com/something, www.credoreference.com/something, or corp.credoreference.com/something.
All links to credoreference.com/something in articles
Date Unique articles Unique links Total links
2009-01-01 48 51 64
2009-03-01 50 53 66
2009-06-01 55 59 73
2009-09-01 56 59 73
2009-12-01 58 61 75
2010-01-01 58 61 75
2010-03-01 59 62 76
2010-06-01 71 77 93
2010-09-01 76 85 102
2010-12-01 88 100 119
2011-03-01 88 100 119
2011-06-20 129 137 163
2011-07-22 137 145 172
2011-09-01 145 152 180
2012-02-11 159 175 209
2012-03-07 158 175 209
2012-04-03 159 174 207
2012-05-02 160 174 208
2012-06-01 165 178 213
2012-07-02 166 181 215
2012-08-02 168 175 206
2012-09-02 167 174 205
2012-10-01 189 195 229
2012-11-01 214 226 260
2012-12-01 218 232 267
2013-01-02 221 235 270
2013-02-04 224 239 274
2013-03-04 222 238 273
2013-04-03 224 237 274
2013-05-03 224 236 273
2013-06-04 225 237 274
2013-07-08 225 238 274
2013-08-05 225 238 273
2013-09-04 226 238 273
2013-10-01 241 258 293
2013-11-04 248 267 302

Notes: The earliest enwiki external links dump available in August 2012 is for July 2011. I already had the June 2011 dump, but none prior to that date. The history method results were found by making an assumption and doing some tricky stuff. I listed the articles containing Credo links in August 2012, then downloaded the wikitext for old revisions of those articles, at the dates shown above; the results were obtained from counting links in those old revisions.

The table method results were found by extracting the information from dumps of the external links table at the dates shown above (same as done for the Highbeam links).

For example, the row for 2012-07-02 shows results from the external links dump for July 2, 2012. There were links in 166 different articles, and there were 181 different links, giving a total of 215 links.

LinkSearch shows there are currently 355 *.credoreference.com links in all namespaces (and some of those links would be just credoreference.com without a "something" following, and so are not included in the above data).

HighBeam

References to HighBeam rose by over 6,000 links in under 2 years (from 11,308 to 17,773). Incoming traffic doubled.

External Link Growth Resulting from Wikipedia Library & HighBeam Partnership
Traffic from Wikipedia to HighBeam Research after The Wikipedia Library partnership
All links to highbeam.com/doc/something in articles
Date Unique articles Unique links Total links Increase
2012-02-11 8462 10277 11308
2012-03-07 8527 10349 11388 80
2012-04-03 8579 10399 11444 56
2012-05-02 8818 10853 11943 499
2012-06-01 9104 11321 12455 512
2012-07-02 9295 11712 12883 428
2012-08-02 9409 12007 13190 307
2012-09-02 9757 12627 13913 723
2012-10-01 10090 13164 14555 642
2012-11-01 10312 13647 15096 541
2012-12-01 10548 14025 15537 441
2013-01-02 10799 14435 16009 472
2013-02-04 11116 14948 16607 598
2013-03-04 11352 15351 17063 456
2013-04-03 11497 15617 17383 320
2013-05-03 11666 15966 17773 390
Notes from the data guru

For example, the first row shows results from the external links dump for February 11, 2012, counting only links of the format shown (or with "www." in front) in articles. There were links in 8462 different articles, and there were 10277 different links. Some links have been used more than once, giving a total of 11308 links in articles. The final column shows the increase in the total from the previous period.

Since I'm recording facts, the new data required downloading 11 files with a total size of 15.8 GB; those files expanded to a total size of 99.2 GB. I noticed a big jump up and down in the size of the files with external links for highbeam.com (up in August 2012 and down in October 2012). My curiosity then led to making the graph shown above, but I haven't tried to find out what was responsible. Johnuniq (talk) 05:03, 18 May 2013 (UTC)
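The counting described in these notes can be illustrated with a short sketch. The original scripts are not shown on this page, so everything here is an assumption for illustration: the input is taken to be (page id, URL) pairs already extracted from an external links dump and restricted to articles, two links count as "different" when their URL strings differ, and the names `LinkStats`, `count_links`, `increases`, and `HIGHBEAM_DOC` are hypothetical.

```python
import re
from typing import Iterable, List, NamedTuple, Pattern, Tuple


class LinkStats(NamedTuple):
    unique_articles: int  # articles containing at least one matching link
    unique_links: int     # distinct matching URLs
    total_links: int      # every matching (article, URL) occurrence


# Assumed pattern: highbeam.com/doc/<something>, with or without "www." in front.
HIGHBEAM_DOC = re.compile(
    r"^(?:https?:)?//(?:www\.)?highbeam\.com/doc/.+", re.IGNORECASE
)


def count_links(rows: Iterable[Tuple[int, str]], pattern: Pattern[str]) -> LinkStats:
    """Count matching links in (page_id, url) rows from one dump."""
    articles = set()
    links = set()
    total = 0
    for page_id, url in rows:
        if pattern.match(url):
            articles.add(page_id)
            links.add(url)
            total += 1
    return LinkStats(len(articles), len(links), total)


def increases(totals: List[int]) -> List[int]:
    """Difference between each total and the one before it (the final column)."""
    return [later - earlier for earlier, later in zip(totals, totals[1:])]


if __name__ == "__main__":
    # Made-up sample rows, not real dump data.
    sample = [
        (1, "http://www.highbeam.com/doc/1G1-111.html"),
        (1, "http://www.highbeam.com/doc/1G1-222.html"),
        (2, "http://www.highbeam.com/doc/1G1-111.html"),  # same link, other article
        (3, "http://example.org/doc/1G1-111.html"),       # different site, ignored
        (4, "http://highbeam.com/"),                      # no /doc/ path, ignored
    ]
    print(count_links(sample, HIGHBEAM_DOC))
    # First four totals from the HighBeam table above.
    print(increases([11308, 11388, 11444, 11943]))  # [80, 56, 499]
```

A link used in two articles adds one to the unique links but two to the total, which is why the total column is always at least as large as the unique links column.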

  • In the two months before the partnership began there were an average of 68 references per month to HighBeam
  • In the months following the donation, a total of 5,700 links to HighBeam were added, an average of 487 references per month
  • That's an average increase of 419 references to HighBeam per month, a percent change of 616% in the monthly rate
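The arithmetic behind these figures can be checked with a few lines. This is a minimal sketch, assuming the "before" average comes from the first two monthly increases in the table (80 and 56) and taking the stated 487 per month as given; `percent_change` is a hypothetical helper name.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed as a percentage."""
    if before == 0:
        raise ValueError("percent change is undefined when the baseline is zero")
    return (after - before) / before * 100.0


# Average monthly additions in the two months before the partnership,
# taken from the Increase column of the table (80 and 56).
before_avg = (80 + 56) / 2                   # 68.0
after_avg = 487                              # stated monthly average afterwards

monthly_increase = after_avg - before_avg    # 419.0
change = percent_change(before_avg, after_avg)

print(monthly_increase)   # 419.0
print(round(change))      # 616
```

Note that a percent change measures the increase relative to the baseline, so a rate that grows from 68 to 487 per month is about 7.2 times the old rate but a change of about 616%.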

JSTOR

Links to JSTOR went from 33,000 in 2011 to 138,000 in 2013.

External Link Growth Resulting from Wikipedia Library & JSTOR Partnership

I started by looking at all the JSTOR links in articles in December 2013, and saw no reason to exclude some of them, so I ended up searching for all links of form "jstor.org/something" (not case sensitive, but virtually all occurrences are lowercase).

The following shows all external links of form "jstor.org/something" (not case sensitive) in articles, found by examining external links dumps such as enwiki-20131202-externallinks.sql.gz. Links in pages other than articles are not included.

All links to jstor.org/something in articles
Date Unique articles Unique links Total links
2011-06-20 19,953 27,727 33,001
2011-07-22 20,263 28,121 33,486
2011-09-01 20,696 28,825 34,359
2012-02-11 22,808 29,886 36,294
2012-03-07 23,471 30,600 37,115
2012-04-03 23,733 30,963 37,588
2012-05-02 24,318 31,619 38,400
2012-06-01 24,668 32,598 60,531
2012-07-02 25,698 33,290 40,313
2012-08-02 26,089 33,783 40,897
2012-09-02 26,412 34,164 41,431
2012-10-01 26,756 34,614 41,981
2012-11-01 27,070 35,103 42,578
2012-12-01 27,607 35,803 43,402
2013-01-02 28,055 36,460 44,187
2013-02-04 28,496 37,111 45,016
2013-03-04 82,983 91,903 100,024
2013-04-03 83,679 92,826 100,984
2013-05-03 84,666 94,049 102,352
2013-06-04 84,881 94,304 102,644
2013-07-08 86,430 96,285 104,920
2013-08-05 87,253 97,268 106,059
2013-09-04 88,258 98,458 107,415
2013-10-01 89,031 101,316 121,514
2013-11-04 90,125 101,643 137,188
2013-12-02 90,843 102,537 138,574

The number of pages and links rose dramatically during February 2013: the number of articles tripled, and the number of links more than doubled. I investigated to see if that was some bug in my scripts, but it appears to be correct. It looks as if a tremendous amount of wikignoming went on in that month, but the most significant factor is that the number of "doBasicSearch" links rose from 1300 to 55,300, and most of them appeared to have been placed in articles which had not previously had a JSTOR link. So, in February 2013, 54,000 JSTOR "doBasicSearch" links were added to articles!

Looking at a couple of examples shows that doBasicSearch is coming from {{notability}}, which generates "Find sources" with a JSTOR search. Confusingly, "Find sources" is not visible on some articles, apparently due to {{notability}} being embedded in {{multiple issues}}; however, the JSTOR link is present in the HTML source of the article, and it is counted as an external link. On other articles the JSTOR search is visible.

All that suggests that I should run my scripts again, but adjust the search to eliminate doBasicSearch links. On the other hand, the fact that a search to JSTOR is visible at the top of thousands of articles could well be seen as a bonus for JSTOR, so for the moment I'll just report what I have found, despite the fact that a significant number of the links are not visible. While comparing the February and March 2013 results it was clear that hundreds of standard reference links were added, as well as the 54,000 doBasicSearch links.
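The adjustment suggested in this note, counting JSTOR links while eliminating doBasicSearch links, might look something like the sketch below. It is not the script used for the table; the function name `is_jstor_reference` and the example URLs are hypothetical, and it assumes the template-generated search links can be recognised by "doBasicSearch" appearing in the URL path.

```python
from urllib.parse import urlsplit


def is_jstor_reference(url: str) -> bool:
    """True for jstor.org/<something> links that are not doBasicSearch searches."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Accept jstor.org itself and any subdomain such as www.jstor.org.
    if host != "jstor.org" and not host.endswith(".jstor.org"):
        return False
    # A bare domain with no "something" after it is not counted.
    if parts.path in ("", "/"):
        return False
    # Assumption: template-generated searches contain "doBasicSearch" in the path.
    return "dobasicsearch" not in parts.path.lower()


if __name__ == "__main__":
    # Illustrative URLs only.
    examples = [
        "http://www.jstor.org/stable/2118145",
        "http://www.jstor.org/action/doBasicSearch?Query=%22Some+topic%22",
        "http://www.jstor.org/",
        "http://example.org/stable/2118145",
    ]
    for url in examples:
        print(is_jstor_reference(url), url)
```

Filtering this way would separate citation-style links from the search links generated by maintenance templates, at the cost of no longer counting the visible "Find sources" exposure described above.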

Questia

Incoming traffic from Wikipedia to Questia doubled.

Traffic from Wikipedia to Questia Online Library after The Wikipedia Library partnership