Research:Data/FAQ

From Meta, a Wikimedia project coordination wiki

What are your open access policies?

The Wikimedia Foundation (WMF) encourages open access. As an incentive for open-access projects, it ties its technical, financial, and data collection support for research to openness requirements. See Wikimedia Foundation support for details.

Where can I download images and media files?

For more information about downloading images and media files, see "Where are images and uploaded files".

Where can I get RDF data?

Wikipedia data is available in RDF format on the third-party website DBpedia.

Where can I get page view data?

The best source of page view data is the Pageview statistics collected from Squid logs. WikiStats offers some information about page views based on the same source. Note that these figures count requests, not unique visitors.

Do not use the API to get page view statistics: many requests are answered by the Squid cache and never get past it, so they are only recorded in the Squid logs.
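As an illustration of working with page view files rather than the API, here is a minimal sketch that sums request counts per page. It assumes each line holds four space-separated fields (project code, page title, request count, bytes transferred); that layout and the function name `sum_page_views` are assumptions, so check the documentation of the files you actually download.

```python
from collections import Counter
from typing import Iterable


def sum_page_views(lines: Iterable[str], project: str) -> Counter:
    """Sum request counts per page title for one project code.

    Assumed line layout: "<project> <title> <count> <bytes>".
    """
    totals: Counter = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip lines that do not match the assumed layout
        proj, title, count, _size = parts
        if proj == project and count.isdigit():
            totals[title] += int(count)
    return totals


# Made-up sample lines, for illustration only.
sample = [
    "en Main_Page 120 345678",
    "en Main_Page 30 90000",
    "de Hauptseite 50 123456",
    "broken line",
]
totals = sum_page_views(sample, "en")
```

Remember that the totals are request counts, not unique visitors.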

Where can I get unique visitors data?

See https://analytics.wikimedia.org/dashboards/vital-signs/

Where can I get editor data?

Edit counters:

You can also get a variety of editor data through the API, the IRC recent changes feeds, and WikiStats.
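As one example of getting editor data through the API, the sketch below builds a MediaWiki API request for user edit counts (`action=query`, `list=users`, `usprop=editcount`) and parses a response. The endpoint URL, the helper names, and the sample user names are assumptions chosen for illustration; substitute the wiki you are studying.

```python
import json
from typing import Dict, List, Optional
from urllib.parse import urlencode

# Assumed endpoint; each Wikimedia wiki exposes its own api.php.
API_URL = "https://en.wikipedia.org/w/api.php"


def edit_count_url(usernames: List[str]) -> str:
    """Build a query URL asking for the edit counts of the given users."""
    params = {
        "action": "query",
        "list": "users",
        "ususers": "|".join(usernames),  # the API separates values with "|"
        "usprop": "editcount",
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def parse_edit_counts(response_text: str) -> Dict[str, Optional[int]]:
    """Map user name to edit count (None if the API returned no count)."""
    data = json.loads(response_text)
    return {u["name"]: u.get("editcount") for u in data["query"]["users"]}


url = edit_count_url(["Alice", "Bob"])

# Hand-written sample response, for illustration only.
sample_response = (
    '{"query": {"users": ['
    '{"userid": 1, "name": "Alice", "editcount": 42},'
    '{"name": "Bob", "missing": ""}]}}'
)
counts = parse_edit_counts(sample_response)
```

The URL can be fetched with any HTTP client. Send a descriptive User-Agent header and keep the request rate low.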

Is it OK to spider the website?

It's a very bad idea to spider Wikipedia projects' websites. It puts load on the servers and will take you longer than downloading the project's dump. Use the dumps when possible. If you have to spider the website, spread your requests in time and use low-traffic hours.
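If you truly cannot use the dumps, spreading requests in time can be sketched as follows. The helper `fetch_politely`, its parameters, and the delay value are hypothetical; the right delay depends on current server load and site policy, which this page does not specify.

```python
import time
from typing import Callable, Iterable, List


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Fetch URLs one at a time, pausing between consecutive requests.

    `fetch` is any function that takes a URL and returns the page body;
    `sleep` can be swapped out for testing.
    """
    results: List[str] = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_seconds)  # wait before every request except the first
        results.append(fetch(url))
    return results


# Dry run with stand-in functions: no network access, no real waiting.
recorded_sleeps: List[float] = []
pages = fetch_politely(
    ["a", "b", "c"],
    fetch=lambda u: u.upper(),
    delay_seconds=2.0,
    sleep=recorded_sleeps.append,
)
```

Requests are deliberately sequential: parallel fetching would defeat the purpose of throttling. Schedule real runs during low-traffic hours.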