Wikimedia Enterprise is a service of the Wikimedia Foundation available via enterprise.wikimedia.com. Its goal is to build services for high-volume commercial reusers of Wikimedia content. The service was announced in March 2021 (blogpost, WIRED article) and launched in October 2021 (Press release, OpenFutures article).
The focus is on organizations that want to repurpose Wikimedia content in other contexts, providing data services at a large scale that are faster, more comprehensive, reliable, and secure. Wikimedia Enterprise aims to improve the user experience of Wikimedia readers beyond our own websites; to increase the reach and discoverability of content; and to improve awareness and ease of attribution and verifiability among the organizations that reuse Wikimedia project data the most, all through self-funding services.
There is a very high barrier to entry for using Wikimedia data outside the common use cases of reading or editing. This is because the content is hard for machines to parse and understand, which limits how far Wikimedia project data reaches beyond our own ecosystem and the scale of impact it can have.
In the Movement Strategy recommendations to increase the sustainability of our movement and improve user experience there are the recommendations to, respectively: "Explore new opportunities for both revenue generation and free knowledge dissemination through partnerships and earned income—for example...Building enterprise-level APIs," and "Make the Wikimedia API suite more comprehensive, reliable, secure and fast, in partnership with large scale users.... and improve awareness of and ease of attribution and verifiability for content reusers."
It is well known that a few massive companies use our projects' data. Those companies recognize that without the Wikimedia projects, they would not be able to provide as rich or reliable an experience to their own users. There has long been a feeling among community members that these companies should do more to reinvest in the Wikimedia communities for the benefits they gain from the content and resources they use.
This led to the idea of developing a new approach that is more sustainable in the long term and provides a much clearer relationship between Wikimedia and enterprise users. Most financial benefit for Wikimedia would likely only come from a very small handful of heavy for-profit users, and would feed back into the Wikimedia movement.
As this idea developed, it became clear there is a responsibility to democratize our data for organizations that do not possess the resources of these largest users, to ensure we are leveling the playing field and helping to foster a healthy internet without reinforcing monopolies. The benefits of such a service shouldn't just be for startups or alternatives to the internet giants, but also for universities and university researchers; archives and archivists; along with the wider Wikimedia movement.
The focus of Wikimedia Enterprise is on businesses that reuse our content, typically at a large scale: for example, integrating it into knowledge graphs, search, voice assistants, maps, news reporting, community tools, third-party applications, and full-corpus research studies. Improving Wikimedia's many datasets to put structure behind our unstructured content will allow all users of our content to meet their individual requirements, while also setting us up to build new tools and services in the future that are available to everyone. Our content reusers are looking for three key components:
- Frequency: Regular current snapshots of Wikimedia projects
- Reliability: Dependable, accessible infrastructure
- Quality: a “best last revision”
Emphasizing a self-funding set of specific use cases allows the Wikimedia API team to focus on volunteers, teams, and organizations looking to access (and, most importantly, interact with) our data sets. This includes the majority of community editing tools, which will be out of scope for this service. For more information on improvements to the existing Wikimedia APIs see the service page on the "API Gateway" initiative.
Program goals:
- Content: Make more of our movement's content available in consistent machine-readable formats, freely available to all researchers and reusers.
- Resources: Reduce the need for high-intensity site scraping by high-frequency and high-volume reusers, which currently targets our production servers.
- Fundraising: Provide a clearer and more consistent way for reusers to reinvest the benefits they gain back into the movement, rather than making occasional altruistic donations that vary in size.
Contact the team if you would like to arrange a conversation about this service with your community, at a time and meeting software platform of your choice.
Past public meetings:
2021 March #1
2021 March #2
2022 June
2023 February
Following are the introduction paragraphs for a much more detailed Community essay.
Libre and Gratis are the two meanings of “free,” commonly phrased as free as in speech, or free as in beer.
Wikimedia projects are, have always been, and will always remain libre. The principles of free cultural works mean that anyone can use Wikimedia without restriction, including commercially. As a movement, we embrace this. It is why we reject ‘non-commercial’ licenses, as they would limit the kinds of reuse possible. And it is why we consider commercial reuse an important means of distributing knowledge to audiences.
Equally, Wikimedia projects are, have always been, and will always remain gratis. The ability to freely access the knowledge available across all Wikimedia projects has always been core to the mission of the Foundation and the movement. We provide this access not only to individuals visiting our websites but also programmatically to machines so that our content can be repurposed in other environments. The full corpus of Wikimedia content always has been, and will continue to be, made available for reuse in various forms (including but not limited to database dumps, APIs, and scraping) at no cost.
As a result, our content is often repurposed by for-profit organizations that rely on it to support their business models, and which consequently earn revenue from it. Outside of voluntary corporate donations to the Wikimedia Foundation, the movement has never received benefits from any of this revenue through return investment. In acknowledgement of this, under the heading of Increase the sustainability of our movement the Movement Strategy process asked the Wikimedia Foundation to explore, among other things, “enterprise-level APIs...models for enterprise-scale for-profit reusers, taking care to avoid revenue dependencies or other undue external influence in product design and development.” Furthermore, under the heading Improve User Experience, a further recommendation stated, "Make the Wikimedia API suite more comprehensive, reliable, secure, and fast, in partnership with large scale users where that aligns with our mission and principles, to improve the user experience of both our direct and indirect users, increase the reach and discoverability of our content and the potential for data returns, and improve awareness of and ease of attribution and verifiability for content reusers."
The Enterprise project team is developing a new resource aimed at for-profit content reusers, who have product, service, and system requirements that go beyond what we freely provide. Use of this offering will not be required for for-profit content reuse; companies can continue to use the current tools available at no cost. All Enterprise API revenue will unequivocally be used to support the Wikimedia mission—for example, to fund Wikimedia programs or help grow the Wikimedia Endowment.
This project represents a new kind of activity at the Foundation. The project is at a very early stage that should be considered a learning period. We will have successes, we will make mistakes, and we will need to adapt our strategies. The team is committed to listening, engaging, and where possible, integrating the feedback we get on our work. This document is organic and is reflective of the team's current thinking; we are attempting to document as much work as possible in the open. Up until now, our work has been shaped by a series of initial interviews with community members, Wikimedia Foundation Board and staff, researchers, and reusers.
Given the nature of the service, primary decision making for it will rest with the Wikimedia Foundation. We are seeking community input, in particular from the technical community and those who have been involved in the strategy process, throughout the lifetime of the service. Technical feedback has been gathered from colleagues at the Wikimedia Foundation, industry and research partners, technical partners across the movement, and the broader technical communities via Phabricator. Input into the funding development side of the service will follow a similar pattern. We will continue gathering input via research interviews and focus groups, as well as feedback here on Meta as per our principles.
Over time, the "product" being offered will grow and improve. This information is accurate as of February 2023.
High-volume reusers whose infrastructure relies on the EventStream platform depend on services like RESTBase to pull HTML from page titles and current revisions to update their products. These reusers have requested a reliable means to gather this data, as well as structures other than HTML, when incorporating our content into their knowledge graphs and products.
Wikimedia Enterprise On-demand API, at release, will contain:
- A commercial schema
High-volume reusers currently rely heavily on the changes that are pushed from our community to update their products in real time, using EventStream APIs to access such changes. They are interested in a service that will allow them to filter the changes they receive to limit their processing, guarantee stable HTTP connections to ensure no data loss, and supply a more useful schema to limit the number of API calls they need to make per event.
Enterprise Realtime API, at release, will contain:
- Filtering of events by Project or Revision Namespace
- Guaranteed connections
- Commercially useful schema similar* to those that we are building in our On-demand API and Snapshot API
*We are still in the process of mapping out the technical specifications to determine the limitations of schema in event platforms and will post here when we have finalized our design.
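To illustrate what filtering by project or namespace means in practice, here is a minimal client-side sketch. It assumes events arrive as newline-delimited JSON and that each event carries `project` and `namespace` fields; those field names, the function `filter_events`, and the sample data are hypothetical and are not taken from the published Enterprise schema, which was still being finalized at the time of writing.

```python
import json
from typing import Iterable, Iterator, Optional, Set


def filter_events(
    lines: Iterable[str],
    projects: Optional[Set[str]] = None,
    namespaces: Optional[Set[int]] = None,
) -> Iterator[dict]:
    """Yield decoded events matching the requested projects and namespaces.

    "project" and "namespace" are hypothetical field names used only for
    illustration; a filter set to None means "do not filter on this field".
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between events
        event = json.loads(line)
        if projects is not None and event.get("project") not in projects:
            continue
        if namespaces is not None and event.get("namespace") not in namespaces:
            continue
        yield event


# Hypothetical sample events, one JSON object per line.
sample = [
    '{"project": "enwiki", "namespace": 0, "title": "Earth"}',
    '{"project": "enwiki", "namespace": 1, "title": "Talk:Earth"}',
    '{"project": "dewiki", "namespace": 0, "title": "Erde"}',
]

kept = list(filter_events(sample, projects={"enwiki"}, namespaces={0}))
print([event["title"] for event in kept])  # prints ['Earth']
```

A server-side filter, as described for the Realtime API, would spare the reuser from receiving and decoding the discarded events at all; the sketch only shows the selection logic.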
For high-volume reusers that currently rely on the Wikimedia Dumps to access our information, we have created a solution to ingest Wikimedia content in near real time without excessive API calls (On-demand API) or maintaining hooks into our infrastructure (Realtime).
Enterprise Snapshot API, at release, will contain:
- 24-hour JSON, Wikitext, or HTML compressed dumps of "text-based" Wikimedia projects
- An hourly update file with revision changes of "text-based" Wikimedia projects
- JSON dumps will contain the same schema per page as the On-demand API.
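As a rough sketch of how a reuser might consume such a dump, the example below streams a compressed file of newline-delimited JSON one page at a time instead of loading it into memory. It assumes gzip compression and one JSON object per line; the compression format, the function `iter_snapshot_pages`, and the `name` and `html` fields are assumptions made for illustration, not a description of the actual file layout.

```python
import gzip
import io
import json
from typing import BinaryIO, Iterator


def iter_snapshot_pages(compressed: BinaryIO) -> Iterator[dict]:
    """Yield one decoded page record per non-empty line of a gzip stream.

    Assumes (hypothetically) that the dump is gzip-compressed NDJSON.
    """
    with gzip.open(compressed, mode="rt", encoding="utf-8") as text:
        for line in text:
            if line.strip():
                yield json.loads(line)


# Build a tiny in-memory stand-in for a dump file (hypothetical fields).
records = [
    {"name": "Earth", "html": "<p>Earth</p>"},
    {"name": "Moon", "html": "<p>Moon</p>"},
]
raw = "\n".join(json.dumps(record) for record in records).encode("utf-8")
buffer = io.BytesIO(gzip.compress(raw))

names = [page["name"] for page in iter_snapshot_pages(buffer)]
print(names)  # prints ['Earth', 'Moon']
```

Because the JSON dumps are described as sharing the per-page schema of the On-demand API, the same record-handling code could in principle serve both sources.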
There are several methods to obtain access to the Enterprise API datasets:
- Realtime API (Batch and Streaming) and daily dump file in NDJSON format through the dedicated Enterprise API product website https://enterprise.wikimedia.com/
- Creating an account via the Enterprise API product website includes 10,000 On-demand API requests and a monthly Snapshot API file in NDJSON format at no cost.
- An update of the Enterprise API data is provided to everyone every two weeks at https://dumps.wikimedia.org/other/enterprise_html/
- Snapshot API + Realtime (Batch) via Data services, available to anyone with a Wikimedia cloud services account.
- Those who have a non-commercial and mission-relevant use case that cannot be fulfilled by existing free-access APIs, dumps, and similar resources can request ongoing access to the paid service at no cost.
The Wikimedia Foundation staff who work specifically on this project:
Names in bold indicate management.
Many people from different teams also contribute significantly, including from the WMF Legal, Engineering, Partnerships, Design, and Communications teams. Additional contract work is provided by Speed & Function (engineering support), PartnerHero (customer support services), Vuurr (assistance with our sales process), and Super Natural Design (designers of the project website).
The members of the board of the LLC overseeing the project serve ex officio from Wikimedia Foundation leadership, representing their Wikimedia Foundation staff roles. This includes the Chief Advancement Officer Lisa Seitz-Gruwell; General Counsel Stephen LaPorte; Chief Product and Technology Officer Selena Deckelmann; and Lane Becker, who serves as the LLC's president. The LLC is subject to the governance of the Wikimedia Foundation Board of Trustees as described at the Wikimedia Foundation Board Statement on Wikimedia Enterprise revenue principles.
Documents covering the legal relationship of the LLC to the Wikimedia Foundation are published on the Governance Wiki under "Category:Wikimedia Enterprise". Specifically, these are the operating, cost-sharing, and inter-company licensing agreements. The LLC's legal registration can be found at the State of Delaware, Division of Corporations, Entity name: Wikimedia, LLC, File number: 7828447.
Commercial launch - October 2021
Wikimedia Foundation Press release
Of particular note:
First customers - June 2022
Of particular note:
- API: Main page – a central list of all Wikimedia APIs.
- Wikitech: Data services portal – a list of community-focused services that allow direct access to databases and dumps, as well as web interfaces for querying and programmatic access to the data stores.
- Enterprise hub – a page for those interested in using MediaWiki software in corporate contexts.