Ask a question/FAQ
- 1 Business
- 1.1 Where can I learn more about advertising on Wikimedia Foundation projects?
- 1.2 How can I request Wikimedia support for my charity?
- 1.3 How can I propose a new project or a new way to do things on an existing project?
- 1.4 May I use your logos or trademark?
- 1.5 May I use your software on my own site?
- 1.6 May I reuse text or images from your sites?
- 1.7 May I mirror or copy your sites?
- 2 Wikimedia Foundation
- 3 Content
- 4 Participation
- 5 Research
- 5.1 I have a question about Wikimedia projects that I want answered
- 5.1.1 Where do I find general statistics?
- 5.1.2 Where do I find statistics about a specific product audience?
- 5.1.3 Where do I find official numbers to prepare a talk or a press release?
- 5.1.4 I have an urgent press query and I need some data, who do I talk to?
- 5.1.5 Where can I find the exact definition of an official metric used by WMF?
- 5.1.6 What does the Wikimedia Research team do? Can it support my team’s data analysis needs?
- 5.1.7 I want to pitch a new project to Research and Data, what should I do?
- 5.1.8 How do newly designed features and products go through usability testing?
- 5.2 I want to conduct my own research on Wikimedia projects
- 5.2.1 Does my project need approval?
- 5.2.2 How can I propose a formal collaboration between my team and the Wikimedia Foundation?
- 5.2.3 Can the Wikimedia Foundation financially support my research?
- 5.2.4 Can the Wikimedia Foundation write a letter of support for my grant proposal?
- 5.2.5 Does the Wikimedia Foundation share private data under NDAs for research purposes?
- 5.2.6 Where do I find a list of open datasets released by WMF for research purposes?
- 5.2.7 Can the Wikimedia Foundation help me collect data for my study?
- 5.2.8 How do I get special API privileges for my research?
- 5.2.9 How do I release a dataset?
- 5.2.10 What kind of traffic data is collected by the Wikimedia Foundation?
- 5.2.11 What instrumentation data is collected for product analytics?
- 5.2.12 How do I access instrumentation data (EventLogging)?
- 5.2.13 I wrote a schema and I need to verify the quality of the data collected, who can help me?
- 5.2.14 Can I get access to Wikimedia's production databases to run some analysis?
- 5.2.15 I want to run a survey, how do I get started?
- 5.3 I want to learn about current research findings related to Wikimedia projects
- 5.4 I want a job researching Wikimedia projects
Where can I learn more about advertising on Wikimedia Foundation projects?
The Wikimedia Foundation does not accept advertising.
The Wikimedia Foundation is not against the world of online advertising or against other organizations that host ads, but it does not believe that advertising belongs in a project devoted to education, particularly one that is driven by the values consistent with a balanced, neutral encyclopedia. The global volunteer community has always felt that advertising would have a major effect on our ability to stay neutral and that ultimately ads would weaken the readers' overall confidence in the articles they are reading. Even if advertisers put no pressure on us to slant articles to their favor, readers may fear that they exert an influence, consciously or otherwise.
In addition, the Foundation has strong views about reader privacy. Current models for web advertising are inconsistent with these, particularly contextual advertising, which reads the content you are viewing. The Foundation also thinks it intrusive to deliver ads to readers based on their geography.
If you'd like to read more about the history of discussions about advertising on Wikipedia, including both pros and cons, the volunteer community has written a page about it here.
How can I request Wikimedia support for my charity?
As you may already know, the Wikimedia Foundation is a nonprofit charitable organization with a very specific goal, which is to develop and maintain our suite of online open-content educational resources in all the languages of the world, to be distributed free of charge to the public. As we are a nonprofit organization incorporated in Florida, United States, local and national laws prohibit us from using our funds for anything but this purpose.
How can I propose a new project or a new way to do things on an existing project?
We appreciate that you're spending time thinking about ways to improve Wikimedia.
Because the Wikimedia Foundation does not create or curate the contents on Wikipedia or the other sites we manage —this work is done by a vast community of volunteers—we are not able to implement most suggestions directly by email request. Changes to existing projects, like Wikipedia, and new project approval (like new languages for existing projects or entirely new concepts for sites) come from this community of volunteers.
If you want to propose an entirely new project, please visit Project proposals. If you'd like to propose that we create a new language version of an existing project, please visit Meta's "Requests for new languages" page. If what you'd like to propose is a new feature for one of our existing projects, please share your thoughts with the volunteers at that project. There are community forums for ideas and suggestions. Most of our projects have a link on the right hand side to community discussion points, sometimes given titles like "village pump" or "cafe" or "travelers pub." For example, the English Wikipedia discusses ideas and proposals here.
Just as article pages can be edited by anyone, so can these discussion pages. If you're not already familiar with how to edit our pages, the MediaWiki guide on editing can be useful. If you need further assistance, either with editing or with finding a proper home for your discussion, the community maintains a volunteer email response team who should be able to assist. They can be reached at info@wikimedia.org. Please be sure to tell them specifically which project and language you are working on; if you are more comfortable corresponding in languages other than English, they can in many cases communicate with you in your native tongue.
We hope that the community will be receptive to your idea.
If you have a business proposal that is not a new project, a new language version of an existing project, or a modification of practices on an existing project, please contact business@wikimedia.org.
May I use your logos or trademark?
To find out more about using Wikimedia's trademarks, including whether your intended use is permitted, please see our Trademark policy.
Please do not hesitate to contact us at trademarks@wikimedia.org if you are not sure whether your use is in compliance with that policy or local trademark laws.
May I use your software on my own site?
We welcome and encourage you to do so! MediaWiki is free server-based software licensed under the GNU General Public License (GPL). It is extremely powerful, scalable software and a feature-rich wiki implementation that uses PHP to process and display data stored in a database, such as MySQL.
The MediaWiki homepage should have all the information you need to install and use the software for your own purposes. While the Wikimedia Foundation encourages people to use the software for their own purposes, we are not able to provide training in the use of the software or answer questions about it, but the volunteer community is frequently able to help. If you cannot find the answer you seek about using MediaWiki software at the help page on the website, you can ask specific usage questions at the Support Desk. You can also reach assistance through IRC or email via the avenues listed on the "Communication" page.
May I reuse text or images from your sites?
Note: If you are interested in hosting a mirror or downloading considerable content from one or more of our projects, see "May I mirror or copy your sites?" instead.
May I mirror or copy your sites?
Note: If you are interested in using only a small amount of content from one or more of our projects, see "May I reuse text or images from your sites?" instead.
Since sharing information is a key part of our mission, we welcome and try to facilitate the spread of our content. You are, of course, welcome to link to our website in the traditional sense that you inform your readers of its existence and offer them the option to visit our website for further information. However, we do not support live mirroring (remote loading or hotlinking) of our websites. Given our unique copyright situation, it's actually better for you to host a text dump of earlier versions or locally hosted copies of our articles (see this section). For downloadable dumps of the Wikipedia article database, see the database downloads page. You may also be interested in the MediaWiki API.
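For readers curious what a MediaWiki API request looks like, the following is a minimal sketch in Python. It only builds the request URL and makes no network call. The endpoint shown is the English Wikipedia's; the helper name build_query_url and the example title are illustrative assumptions, and the parameters are one common way to ask for the current wikitext of a page, not the only one.

```python
from urllib.parse import urlencode

# Action API endpoint of the English Wikipedia; other projects use
# their own host name with the same /w/api.php path.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_query_url(title: str) -> str:
    """Return a URL asking for the latest revision text of one page.

    build_query_url is a hypothetical helper written for this example.
    """
    params = {
        "action": "query",      # read-only query module
        "prop": "revisions",    # ask for revision data
        "rvprop": "content",    # include the revision text
        "rvslots": "main",      # main content slot of the revision
        "titles": title,        # page title to look up
        "format": "json",       # JSON response
        "formatversion": "2",   # newer, simpler JSON layout
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


url = build_query_url("Albert Einstein")
print(url)
```

A request like this is suited to fetching a copy that you then host locally. It is not a substitute for live mirroring, which is not supported, so avoid issuing it on every page view of your own site.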
How can I tell if a "wiki" or other website belongs to you?
Because "wiki" is a generic term that may be used by anyone, there is sometimes confusion about which websites the Wikimedia Foundation hosts. Beyond this, other sites may use MediaWiki software—which is free and open for anyone—and therefore look similar to our projects. The only projects which are part of the Wikimedia Foundation are those listed at the "Our Projects" page.
If you're trying to identify who owns a website that does not belong to us, you may need to look for a link on the site that indicates it is "About" the site or use a WHOIS tool.
If you should happen upon a website that uses the Wikimedia Foundation's marks or indicates it is hosted by the Wikimedia Foundation but that is not on the list, we would appreciate you letting us know at answers@wikimedia.org.
How can I learn more about job opportunities at the Wikimedia Foundation?
Thank you for your interest in working for the Wikimedia Foundation!
Job openings are listed here. If you see a job that looks like a good match for you, please click on it to read a more complete description. Information on where to send your resume will be included in the description. Usually, there is a link at the bottom that says "Apply Now."
If none of the jobs currently listed there seem suitable to you, please keep an eye on that page. New job openings are listed there routinely as they become available.
What are the Wikimedia Foundation's major values and beliefs?
The Wikimedia Foundation believes that all people everywhere should be afforded equal access to information. It supports network neutrality and the free culture movement. It believes in the need to conquer the digital divide, which results in the economic or cultural marginalization of individuals with limited access to technology. It respects the rights of human beings to basic privacy and dignity. The Wikimedia Foundation also believes that the environment is important; it strives for sustainable business practices.
The Foundation holds that censorship is incompatible with its mission. In May 2011, when the Board of Trustees passed its resolution on dealing with controversial content, it affirmed that "Wikimedia projects are not censored." Curating knowledge for an international community of all ages will certainly mean the display of materials that some may find offensive or upsetting. The Board supported the principle that users should be able to choose what content to access and encouraged the responsible curating of content so users might reasonably expect what they will encounter when viewing a page or using a feature, but continued in its explicit support of access to information for all.
How do I contact the people or entities that are written about on your sites?
While the websites we maintain host articles about a wide number of people, companies and corporations, we do not hold contact details for them. Contact details may be included in the article about the subject or in links provided within the article. If not, you may wish to use a search engine or other resource to try to locate this information. As an organization that relies almost entirely on the good will of volunteers, we do not have the resources to research inquiries of this nature.
How do I find out more information about subjects on your sites?
Since content on our educational projects is not created, reviewed, or controlled by a central authority, but by members of the public volunteering to help out, we are not able to offer more information about the subjects on our sites via email. As an organization that relies almost entirely on the good will of volunteers, we do not have the resources to research such material.
However, you may be able to get additional information on the site itself. Many languages of Wikipedia host a "reference desk"—an online resource where various volunteers do try to answer knowledge-based questions. The English Wikipedia's, for instance, is located here. You can find a long list of reference desks in other languages in the bottom of the toolbar on the left side of the page. Though there is no guarantee that they can provide an answer, they are often able. Please be specific in your question so that others can better assist you.
Anything you post to the Reference Desk will become public. Therefore, we do not recommend that you post personal information such as email addresses or phone numbers. Generally, once you post your question you can check back on the webpage in a day or two to see if volunteers have been able to answer you.
How can I report misuse of my copyrighted content on your sites?
We're sorry to hear that you've encountered this problem on our site. Unfortunately, the Community Advocacy team, which answers questions here at the Hub, cannot assist directly in content takedowns. The Wikimedia Foundation does not create or curate content on our sites; rather, this work is done by a vast community of volunteers.
There are several approaches to having content removed if it infringes your copyright.
First, you can reach out to our designated agent for a DMCA takedown request (see this page for more details). You can reach our designated agent via email at legal@wikimedia.org. If you'd like to review our DMCA policy, it is located here.
Alternatively, you can reach out to the community of volunteers directly to request content removal. They are available via email at info-en-c@wikimedia.org. Wikimedia sites, including Wikimedia Commons, do not have a central authority, but the volunteers who work at these email addresses are experienced users who know policies and processes and can assist you with such requests.
Please do not send requests to both addresses, as this may delay handling of your request. You should either process your request through the legal team at the first address or through the courtesy queue staffed by volunteers.
How do I report inaccuracies in content on one of your projects?
Our projects are "wikis", which means that anyone visiting the site can edit or add to most pages. In most cases, if you believe that content could be improved, we ask you to address it on the site yourself.
First, you can edit almost any page directly. You don't need to apply or get special permission to join us. At the top of each page is an "edit" label. Try it for example at the sandbox on the English Wikipedia. You don't even need to log in to edit, although creating an account gives you more options and helps you keep track of your contributions. You can create an account on our educational projects by pressing "create account" in the top right corner.
Our projects are open to volunteers and encourage people to pitch in. You can generally find information on how in the sidebar of each project. The English Wikipedia's Introduction and Tutorial are useful reading for how to edit MediaWiki software, if you choose to contribute directly.
Please note that while contributions are welcome, the volunteer communities who create and curate content do have policies and guidelines which they have crafted to which content must adhere. These will vary according to the project you are editing and can generally be read by following the links on the left side of the page on a given project. For instance, the policies that govern Wikipedia ask that you remain neutral in your prose and provide reliable sources to substantiate the information you add. Content that does not meet local policies may be modified or removed.
If you do not wish to correct the issue yourself, you can raise your concern for review by members of the community. Each page on our projects has an associated "discussion page" or "talk page"; you can access this by clicking the "discussion" link at the top of the page. You can then voice your concerns by selecting the "new section" link in the tabs at the top of the page. You will see two text boxes for you to write in: one for a title for your note and one for the note itself. (See the MediaWiki help page on talk pages if you would like more information on using them).
If other contributors are not receptive to your note or edits, there are dispute resolution processes you can follow on the sites. You can frequently find more information about these by pressing the "help" link found on most projects in the sidebar on the left. If you cannot find the dispute resolution processes on a given project, you should be able to visit the help desk or community portal to ask local volunteers on that project how to proceed.
In addition to dispute resolution processes within a specific project (like the Wikipedia project that is concerning you), there is a cross-wiki discussion point called "Meta" which is intended to coordinate work across projects. If a particular project is having internal issues that the local community cannot overcome, it may be possible to reach out to other Wikimedians around the world for assistance there. The process used for this is called "Requests for Comment". We recommend being as concise as possible in explaining the issue and offering clear "diffs" or "links" to pages and edits that exemplify the issue. It will be helpful to show the Meta community where members of the local community have tried to resolve the problem and failed.
In some cases, it may be appropriate to reach out for information or assistance to the volunteer email response team at info@wikimedia.org. The volunteer email response team receives a large number of emails every day, and they do not have the capacity or the mandate to help with most minor corrections or standard content disputes. They may be able to assist people in special circumstances, however. Before writing, it's a good idea to check on the project where you are encountering difficulties to see if there are specific instructions for contacting volunteers on that project or specific information on how they may be able to help. For instance, the Dutch Wikipedia page on their volunteer email response system includes specific details for what to do in various circumstances. (You may be able to locate this information on other projects by pressing the magnifying glass in the search bar, typing "OTRS" in the box, and pressing "Help and Project Pages" beneath the box.) Some projects include that information in the link on the left labeled "Contact page". (See, for example, the English Wikipedia's "Contact us" page.)
If you do choose to reach out to the volunteer email response team, please keep in mind that our projects have no central editorial board. While volunteer responders are chosen from among the volunteer community by other volunteers for their experience on the projects, they can only act in accordance with the community-created policies and processes of the projects they serve. In some very exceptional circumstances, they may be able to help you directly, but, if not, should often be able to help you determine the best way to proceed.
If contacting the volunteer email response team, please clearly explain the issues you are encountering and, if you are writing the general address, please specify the language and project where you are experiencing the issue (for example, French Wiktionary; Russian Wikipedia).
How can I donate my own copyrighted content to your sites?
Thank you for your interest in donating content to our projects!
The volunteer community who create and curate the bulk of the content on our projects have crafted processes for facilitating such donations. Recommended steps on the English Wikipedia can be found at Wikipedia:Donating copyrighted materials. Several other language Wikipedias have pages describing the process; you can see the list of languages in which it is available and access those by clicking "languages" in the left toolbar. Specific information on donating images and other media files can be found at Commons:Email templates. If you have any questions on donating copyrighted content not answered at those pages, you may wish to consult the web-based "help desk" on Commons or use the "Help" link in the left toolbar to locate a help forum on the project to which you wish to donate. You can also write to permission@wikimedia.org.
How can I help with translating content to other languages?
Wikipedia relies on volunteers who generate and maintain all content as well as creating policies and guidelines to govern the site. It is a collaborative project, with people from all over the world bringing their skills and interests to join in the compilation and dissemination of knowledge to everyone, everywhere, free of charge. The other projects we maintain are also collaborative, crowd-sourced projects that rely on volunteers. Translation is a volunteer-driven activity on our websites, just like content creation.
A general approach to translation from English Wikipedia to other projects is provided here. This approach is likely to succeed on most projects with most languages. If you want specific advice from other volunteers, you can reach out to the "help" or community discussion forum on the project where you want to place the translation. These are generally linked from the side of every page. If you can't find it, you can write to info@wikimedia.org for more information. Please, in that case, specify the language project where you want to work (for instance, Italian Wiktionary; French Wikipedia).
If you are interested in helping to translate official documents used for management of Wikimedia projects, this work is also done by volunteers. Meta's "Babylon" page is a good place to begin. There is a section there on getting started which includes some important links, and there is also a section on communication that tells you some of the best places to get in touch with other translators, who may be able to give you specifics about the work. We recommend reading the tutorial linked from the "getting started" section before beginning, if you choose to pitch in, as the system actually looks more complex than it is.
The following information is transcluded from Research:FAQ.
I have a question about Wikimedia projects that I want answered
Where do I find general statistics?
Most reports and dashboards used by WMF are maintained by the Analytics Engineering team in coordination with individual audience teams. The most frequently used reports are:
- Wikistats V2
- Meta:Statistics (includes reports by department)
- Pageviews Analysis
- Vital Signs
- Browser Stats
In particular, for more information about the Pageviews Analysis tool, see this meta page. For Wikistats V2, see this page as well as this FAQ about how it differs from the now-legacy Wikistats. The new Wikistats V2 is run by the Analytics Engineering team. Analytics Engineering is responsible for the Wikimedia Foundation's analytics and data computing infrastructure, but as a general rule the team is not responsible for defining or providing ad hoc analyses of specific metrics, which are owned by the respective Product Audience teams. You can contact the Analytics Engineering team via the Analytics-l mailing list: analytics@lists.wikimedia.org.
Where do I find statistics about a specific product audience?
Looking for data, metrics and statistics about editors, readers, donors etc? You may be able to find an answer at the Audiences hub, where the Wikimedia Foundation's Audiences (formerly Product) group provides metrics for each audience area that are reported on a monthly basis to the Foundation's Board of Trustees. Separately, you may also want to check this extensive list of stats resources. In many cases, a search on Wikistats may give you the answer you need. For further information, you can reach out to the appropriate audience team.
The following is a list of audience teams and examples of the key metrics they report on:
- User-perceived load time, zero results rate, API usage, search user engagement, search engine ranking, referred-traffic, maps and WDQS usage
- Active editors, new active editors, Wikipedia article edits, Wikipedia article edits via mobile
- Pageviews by desktop and mobile web, usage breakdown by global region, web unique devices, Android and iOS app uniques and installs
As a general rule, the best way to issue a request is not to directly contact a data analyst but to direct your query to the product manager.
Teams in the Audiences group use Phabricator tags to track analysis-related requests:
Similarly, the Fundraising Tech team uses the fundraising-analysis tag.
The best way to formulate a request in Phabricator is to specify:
- What's requested. If you know what you want, be specific! For example, don't just ask for "data about multilingual Wiktionary editors", ask for "the number of contributors who edited more than one Wiktionary in the past month". If you have a question but don't know how it can be answered, say what you've already tried.
- Why it's requested. You don't have to write an essay, but give us enough context that we can interpret, adapt, and prioritize your request. For example, "the number of multilingual Wiktionary users will help us decide whether to give a developer a $10,000 grant to write a tool for them."
- When it's requested. If you have a deadline, explain what it is and what it's tied to. For example, "the Wiktionary tool developer needs to make summer plans, so we need this information by 15 May." Don't just say "as soon as possible." If we drop everything we're doing and work all night, tomorrow is probably possible; is your request so urgent that we need to do that? :)
- Any other helpful information, like relevant documentation.
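To illustrate why the specific phrasing matters, here is a minimal sketch in Python of what "the number of contributors who edited more than one Wiktionary in the past month" could mean as a computation. The edit records, contributor names, the 30-day window, and the function name multilingual_contributors are all made up for this example; they do not reflect how any Wikimedia team stores or queries its data.

```python
from datetime import datetime, timedelta

# Hypothetical edit records: (contributor, wiki, time of edit).
edits = [
    ("Alice", "en.wiktionary", datetime(2024, 5, 5)),
    ("Alice", "fr.wiktionary", datetime(2024, 5, 9)),
    ("Bob", "en.wiktionary", datetime(2024, 5, 3)),
    ("Bob", "en.wiktionary", datetime(2024, 5, 20)),
    ("Carol", "de.wiktionary", datetime(2024, 3, 1)),   # outside the window
    ("Carol", "it.wiktionary", datetime(2024, 5, 11)),
]


def multilingual_contributors(edits, end, days=30):
    """Return contributors who edited more than one wiki in the window.

    The window covers the `days` days before `end` (start inclusive,
    end exclusive). Treating "past month" as 30 days is an assumption.
    """
    start = end - timedelta(days=days)
    wikis_by_user = {}
    for user, wiki, timestamp in edits:
        if start <= timestamp < end:
            wikis_by_user.setdefault(user, set()).add(wiki)
    return sorted(user for user, wikis in wikis_by_user.items() if len(wikis) > 1)


result = multilingual_contributors(edits, end=datetime(2024, 6, 1))
print(len(result), result)
```

Even this small sketch has to settle details the vague request leaves open, such as how long "the past month" is and whether repeated edits to the same wiki count. Stating them in the request saves a round of clarification.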
Where do I find official numbers to prepare a talk or a press release?
The Communications team maintains a press kit. You can reach the Communications team via communications@wikimedia.org. For official monthly metrics (vetted by the respective audience teams, who report these to the Wikimedia Foundation board every month), see mw:Wikimedia Audiences. For other statistics, please look up the list of reports mentioned above or get in touch with the appropriate audience team.
I have an urgent press query and I need some data, who do I talk to?
As a general rule, press queries about specific audiences, related metrics and products should be directed to the Wikimedia Foundation's Communications team, who will in turn connect with the appropriate team. Please contact the Communications team if you receive a query from a reporter and they will help you handle the response: communications@wikimedia.org.
Where can I find the exact definition of an official metric used by WMF?
Metrics used in WMF's official reports maintained by the Analytics team are defined here. If you have questions about the definition of metrics relevant to a specific audience, please contact the corresponding team.
What does the Wikimedia Research team do? Can it support my team’s data analysis needs?
The Wikimedia Research team's mandate is to help design and test technology informed by qualitative and quantitative research methods and produce scientifically rigorous knowledge about Wikimedia's users and projects. Examples of projects led by the Research team include:
- models to detect missing citations (blog)
- recommender systems for expanding Wikipedia across languages
- multi-faceted approaches to characterizing reader behavior (blog)
- guidance of how ethics and human-centered AI should be incorporated within Wikimedia
The team can provide guidance on metric definitions and experimental design, as well as statistical and methodological support, on an ad hoc basis. Individual Audiences teams are responsible for data analysis and metric definition for their corresponding audiences. You can contact the R&D team via our (internal) department mailing list research-wmf@lists.wikimedia.org.
I want to pitch a new project to Research and Data, what should I do?
The Research and Data team partners with other teams in the organization, community members and academic researchers to design and run projects that typically span multiple months of work. In order to engage with the team, your project will likely be:
- a minimum of one or two quarters in projected time frame
- ahead of specific products or interventions being designed or tested
If you think your project meets these requirements, you can contact the team via this mailing list: research-wmf@lists.wikimedia.org or by creating a Phabricator task in the backlog of the Research and Data board. If you are looking for audience-specific metrics and statistics, please get in touch with the respective team's product owner.
How do newly designed features and products go through usability testing?
The Design Research team (DR) supports iteration of concepts and functionality toward a usable and intuitive experience for users. It also provides guidance to other WMF teams via a range of qualitative methods including, but not limited to, usability testing. Requests for the team can be submitted via Phabricator. The team also conducts generative research and collaborates with Research and Data and other teams in order to help define, at a high level, what products and user experiences should be built (and why) for specific types of users, based on their needs. You can contact the DR team via our (internal) department mailing list research-wmf@lists.wikimedia.org.
I want to conduct my own research on Wikimedia projects
Does my project need approval?
Most research is conducted independently, without knowledge by or approval from the Wikimedia Foundation. Rarely, the Wikimedia Foundation will provide practical support for certain research projects, such as projects that require access to non-public data. Researchers may not claim any support, approval, or special privileges from the Wikimedia Foundation unless they have a signed, written agreement with the Wikimedia Foundation that says they do.
Observational research generally does not require approval from anyone. Interventional research projects may require cooperation from the affected communities. Before beginning an interventional research project, we recommend disclosing it at a community forum, such as the local community's village pump. You should be prepared to engage in discussion with community members and, if necessary, to modify your research plan based on their feedback. Some communities require such disclosure and discussion.
How can I propose a formal collaboration between my team and the Wikimedia Foundation?
As of March 2015, all formal collaborations with external researchers – both in academia and in industry – that receive support by Wikimedia Foundation staff are subject to our open access policy, which secures the openness and immediate reusability of the output of these projects: data, code and scholarly publications. In most cases, a formal collaboration will require the partner team to sign a Memorandum of Understanding (MOU) acknowledging the terms of the policy. Additional agreements may be required, particularly when the collaboration includes access to private data. For frequently asked questions about this policy, you can read this page. A list of current formal collaborations with the Research team – as well as the process followed by the team to set up new ones – can be found on this page.
Can the Wikimedia Foundation financially support my research?
The Wikimedia Foundation sponsors research projects of strategic importance in the form of grants. Grants can be issued to individuals and organizations alike and can be awarded via calls for participation or directly allocated in the case of research commissioned by the Foundation. More information on different types of grant, and the corresponding requirements, can be found on this page. Research sponsored by a grant from the Wikimedia Foundation is subject to the terms of our open access policy.
Can the Wikimedia Foundation write a letter of support for my grant proposal?
Except in exceptional circumstances, the Wikimedia Foundation does not directly participate in grant applications or research consortia as a partner institution, due to the legal and financial constraints that come with restricted funding. However, we are happy to support individual research projects of particular strategic importance by providing formal endorsements. Letters of endorsement are signed by a C-level executive or by their delegate. They form part of a formal collaboration and are subject to the terms of the Wikimedia Foundation's open access policy.
Does the Wikimedia Foundation share private data under NDAs for research purposes?
The Wikimedia Foundation can issue non-disclosure agreements (NDAs) to allow researchers to access private data under the terms of the open access policy and subject to the organization's priorities and policies. To request access to private data, you need to write a research proposal describing the type of data requested, why it is needed, and the expected outcomes of the project. More details can be found in the Wikimedia Research team's formal collaboration instructions.
Where do I find a list of open datasets released by WMF for research purposes?
A comprehensive list of open datasets published by WMF for research purposes can be found at m:Research:Data. You can also search the Wikimedia Foundation's entry on the DataHub or Wikimedia-related datasets available on Figshare. For access to private data, see this question.
Can the Wikimedia Foundation help me collect data for my study?
As a general rule, researchers at the Wikimedia Foundation have little bandwidth to provide data collection / data analysis as a service, outside of the scope of formal collaborations. We are always happy to provide guidance and recommend the appropriate tools, data sources and libraries for a given study on an informal basis. The best way to get support is to post a request to wiki-research-l (for anything related to research design, methods, state of the art on a specific research topic) or to analytics-l (for data sources and APIs maintained by the Wikimedia Analytics Engineering team). You can also get support via the corresponding IRC channels, irc:wikimedia-research and irc:wikimedia-analytics. If your request is about recruiting participants for a survey or study, see the corresponding question.
How do I get special API privileges for my research?
You can access the MediaWiki API to retrieve data from Wikimedia projects with the standard permissions that are granted to your registered username. For most types of data you will not need any kind of special privilege. In some cases the Wikimedia Foundation can grant special permissions (such as high API request limits) on a temporary basis to individual users for research purposes. When these privileges are granted by WMF staff, they form part of a formal collaboration and are subject to the terms of the Wikimedia Foundation's open access policy.
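As an illustration of the kind of request that needs no special privileges, here is a minimal Python sketch that builds a MediaWiki API query for a page's most recent revision metadata. The endpoint (English Wikipedia's api.php), the helper names build_revisions_url and fetch_json, and the User-Agent string are assumptions made for this example, not something prescribed by the Wikimedia Foundation.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia's api.php endpoint; other projects use their own host.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_revisions_url(title, limit=5, endpoint=API_ENDPOINT):
    """Return a URL asking for metadata about the latest revisions of a page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|user|comment",
        "rvlimit": limit,
        "format": "json",
        "formatversion": 2,
    }
    return endpoint + "?" + urllib.parse.urlencode(params)


def fetch_json(url, user_agent="ExampleResearchBot/0.1 (researcher@example.org)"):
    """Fetch the URL and decode the JSON body (this performs a network request).

    The User-Agent value is a hypothetical placeholder; replace it with
    something that identifies your own project and contact details.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling fetch_json(build_revisions_url("Albert Einstein")) should return the parsed response as a Python dictionary. Keeping URL construction separate from the network call makes it easy to inspect or log a request before sending it.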
How do I release a dataset?
Releasing open data about Wikimedia projects for research purposes, while respecting our privacy and data retention policies, is in line with Wikimedia's values and mission to disseminate open knowledge. The Wikimedia Research team maintains an open data repository via the DataHub that anyone can contribute to. We also register and host open datasets for research purposes on Figshare, for citability and discoverability. If you are on a team at WMF that deals with sensitive data, you must consult with the Legal and Security teams before releasing a new dataset, particularly one obtained from private sources and/or containing personally identifiable information. Once publication has been cleared by these two teams, the Research and Data team can advise on best practices for publishing and documenting the dataset. The release of data from Fundraising is subject to additional restrictions due to our donor policies: before publishing any reports that include anonymized or aggregate data from Online Fundraising, please review these guidelines and obtain explicit approval from the team.
What kind of traffic data is collected by the Wikimedia Foundation?
The Wikimedia Analytics Engineering team maintains several large-scale datasets on Wikimedia traffic, stored and processed via a Hadoop cluster, from unsampled HTTP request data to aggregated pageview data. Detailed documentation on these datasets can be found at: wikitech:Category:Data_stream. As a general rule, traffic data hosted on Hadoop is considered private and accessing it is restricted to WMF staffers and people covered by an NDA. If you are part of a team at the Wikimedia Foundation interested in analyzing traffic data, you can get in touch with Analytics Engineering to request access to the corresponding data stores. Article-level pageview data can be publicly accessed by anyone via the Pageview API.
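For the publicly accessible article-level data, a request to the Pageview API can be sketched in Python as follows. The URL layout (project, access method, agent type, article, granularity, start and end dates in YYYYMMDD form) reflects the public REST endpoint as we understand it and should be checked against the API's own documentation. The helper name build_pageviews_url and its default values are assumptions made for this example.

```python
import urllib.parse

# Assumption: base path of the public per-article pageviews endpoint.
PAGEVIEWS_BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def build_pageviews_url(article, start, end, project="en.wikipedia",
                        access="all-access", agent="user", granularity="daily"):
    """Return the request URL for an article's pageview counts.

    start and end are date strings in YYYYMMDD form. Spaces in the title
    become underscores, and the title is percent-encoded so that characters
    such as "/" do not break the path.
    """
    title = urllib.parse.quote(article.replace(" ", "_"), safe="")
    return "/".join([PAGEVIEWS_BASE, project, access, agent,
                     title, granularity, start, end])
```

For example, build_pageviews_url("Albert Einstein", "20160101", "20160131") produces a URL that can be fetched with any HTTP client to retrieve daily counts for that month as JSON.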
What instrumentation data is collected for product analytics?
A comprehensive list of instrumentation data collected via EventLogging (and their respective owner) can be found at m:Research:Schemas. To inquire about a specific schema, please contact the owner on the schema's talk page.
How do I access instrumentation data (EventLogging)?
Instrumentation data used for testing new products and features and measuring how users interact with our sites is provided by a platform maintained by the Analytics Engineering team called EventLogging. As a general rule, instrumentation data is private and access is restricted to WMF staffers and people covered by a non-disclosure agreement. If you are part of a team at the Wikimedia Foundation interested in producing or analyzing instrumentation data, you can get in touch with Analytics Engineering to request access to the corresponding data stores.
I wrote a schema and I need to verify the quality of the data collected, who can help me?
Product and engineering teams at the Wikimedia Foundation use a platform called EventLogging to collect data and measure user interactions with Wikimedia sites. If you are a user of this platform, you'll find yourself asking not only where the data lives but also if the data collected matches the specification and if the instrumentation captures the data as intended. Extensive documentation on EventLogging, its architecture and the data stores it uses is available on this page. Schemas defining data that is collected via EventLogging can be found on this page. The responsibility to audit the quality of data collected lies with the engineers who wrote the instrumentation, in coordination with analysts within their team and product managers who are familiar with feature design and workflows. The Analytics Engineering team can provide guidance about the collection of high-throughput data and help identify appropriate sampling rates, when applicable, as well as providing information on the retention window for sensitive data, subject to the Wikimedia Foundation's privacy and retention policies.
Can I get access to Wikimedia's production databases to run some analysis?
The Analytics Engineering team maintains real-time replicas of Wikimedia's production databases for analysis purposes via an internal SQL cluster. Production databases contain private data, and access to them is restricted to WMF staffers and people covered by a non-disclosure agreement. If you work for a team at the Wikimedia Foundation interested in analyzing this data, you can get in touch with the Analytics Engineering team to request access to the cluster. Alternatively, if your request does not involve private data, you can use Quarry or PAWS to run and save queries against a sanitized copy of Wikimedia's production databases, from which private data has been removed.
I want to run a survey, how do I get started?
The Learning and Evaluation team maintains the Survey Support Desk, a one-stop shop for anything related to surveys in the Wikimedia context for Wikimedia Foundation staff, Wikimedia affiliates, and volunteers. The team also maintains and provides access to the survey platforms used by WMF. The Design Research team can provide overall guidance and support to other teams at WMF on survey design. The Research and Data team can advise on best practices for on-wiki participant recruitment. All WMF-run surveys must be reviewed by the Legal team; see this internal page for more information.
Surveys run by academic researchers need to meet community expectations before participant recruitment can begin. Creating a research project and discussing the proposed recruitment strategy on wiki-research-l are good, preliminary steps towards successful recruitment of participants for a study. There aren't any global policies regulating third-party research or mechanisms for large-scale subject recruitment, but best practices have been discussed in a number of contexts. en:WP:Research and en:WP:SRAG are the product of the joint efforts of the research community and the English Wikipedia community to try and satisfy two goals:
- Create a mechanism for mass subject recruitment
- Protect the community (and individuals) from the disruption that mass recruitment could cause
Along with these two documents, a few essays are available as tools for educating Wikipedians about research:
Where can I learn about current research projects at WMF?
We run a weekly, cross-departmental research group every Thursday at 9:30am PT to discuss research in progress, present early results or get feedback on the design of new projects. The meeting is regularly attended by members of the Research and Data and Design Research teams and by analysts from various Product teams and from Learning and Evaluation, but it's open to anyone in the organization interested in participating. We also host more formal, public presentations on a monthly basis via our Research Showcase and at Monthly Metrics meetings, which you can attend in person if you're in the SF office or watch online via YouTube.
Where can I learn about existing research on a specific topic?
There are several places where you can learn about previous and current research. The most comprehensive resource covering research on Wikimedia projects is the Research Newsletter. The newsletter is a collaboratively maintained monthly overview of new research, edited by Tilman Bayer and Dario Taraborelli with contributions by several volunteer reviewers. It has been published monthly since 2011 and has a fully searchable archive. You can also follow the latest research updates via the @WikiResearch handle on Twitter, by subscribing to wiki-research-l or by attending the monthly Wikimedia Research Showcase (also available on YouTube). The Wikimedia Research Codex is a complementary effort to summarize past research by organizing it by topic instead of by date; it is currently in progress, and topics are prioritized depending on team needs.
What conferences should I attend to learn about academic research on Wikimedia projects?
There are several scholarly conferences with dedicated tracks on Wikimedia research and/or a long record of publications in the field. Much of the leading research on Wikipedia and other Wikimedia projects is presented at conferences such as CSCW, ICWSM, OpenSym, and WWW. Wikimania also has regular tracks dedicated to research on our projects.
I want a job researching Wikimedia projects
Are there any research and analytics jobs at the Wikimedia Foundation?
Current openings for part-time and full-time positions in Research, Analytics Engineering and Product are listed on the Wikimedia Foundation's jobs website.