Research:Social media traffic report pilot
Social media platforms direct a high volume of reader traffic to Wikipedia articles. Sudden spikes in this traffic may lead to increases in vandalism or other disruptive behavior. However, Wikipedia editors currently have no reliable source of information about which articles are receiving high volumes of traffic from social media at any given time.
We propose an experiment in which we will publish a daily report of articles that have recently received a high volume of traffic from social media platforms, to help editors monitor and maintain the quality of these articles. The platforms we currently plan to focus on for the experiment are Facebook, Twitter, Reddit, and YouTube. The experiment will last for 1-3 months, after which we will evaluate the impact of the intervention both by analyzing editing activity on those articles and by eliciting feedback from editors who used the report.
Background and motivation
Wikipedia articles are shared widely on social media platforms, both by platform users and by the platforms themselves. Wikipedia is huge and covers a wide variety of topics, any of which could be relevant to current events happening anywhere in the world. Because of the viral nature of information propagation on the internet, Wikipedia articles shared on these platforms can experience huge, sudden traffic spikes. External events can cause an apparently uncontroversial, low-traffic article to go from languishing in obscurity to being in the global spotlight in a matter of days or even hours. Furthermore, some social media platforms link to Wikipedia articles to “fact check” controversial content shared by their users. The potential unintended consequences for Wikipedia of social media platforms using it as a free fact-checking service in this way are not known.
Currently, the Wikimedia Foundation makes aggregate counts of article views publicly available via the REST API and the PageViews tool. However, these counts represent total traffic from all sources combined (e.g. external traffic from search engines and news articles, and internal traffic from other Wikipedia articles), so they cannot show how much of an article's traffic comes from social media. The Wikimedia Foundation does log information about the platform that traffic comes from (e.g. google.com, facebook.com), but this data has traditionally not been made publicly available because of sensitivity and privacy concerns. A highly aggregated form of this referral data can be found in the monthly clickstream datasets.
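To illustrate what the public pageview data does and does not contain, here is a minimal Python sketch of querying the per-article endpoint of the Wikimedia Pageviews REST API. The helper names (`pageviews_url`, `daily_views`, `fetch_daily_views`) and the User-Agent string are assumptions made for this example; they are not part of the proposal or of any existing tool.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Per-article endpoint of the Wikimedia Pageviews REST API.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(article, start, end, project="en.wikipedia",
                  access="all-access", agent="user"):
    """Build a daily per-article pageviews URL; dates are YYYYMMDD strings."""
    # Titles use underscores instead of spaces and must be percent-encoded.
    title = quote(article.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/{access}/{agent}/{title}/daily/{start}/{end}"


def daily_views(payload):
    """Map each day (YYYYMMDD) in a decoded JSON response to its view count."""
    # Daily timestamps arrive as YYYYMMDDHH; keep only the date part.
    return {item["timestamp"][:8]: item["views"]
            for item in payload.get("items", [])}


def fetch_daily_views(article, start, end):
    """Fetch and parse daily views (needs network access; not called here)."""
    # Hypothetical User-Agent; replace with real contact details before use.
    request = Request(pageviews_url(article, start, end),
                      headers={"User-Agent": "traffic-report-example/0.1"})
    with urlopen(request) as response:
        return daily_views(json.load(response))
```

Each item in the response carries a single `views` total for the requested project, access method, and agent type. There is no field identifying the referring site, so traffic from social media cannot be separated from other traffic using this API alone.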
Despite the lack of granularity in the current public pageview data, editors still rely on this information extensively to monitor reader activity and support patrolling work. On English Wikipedia, the top 25 report and the top 5000 report together received close to 500 views per day in 2019, which approaches the daily traffic of popular editor-facing pages like the Village Pump.
- Assess user acceptance and usefulness of the traffic reports
- Elicit design requirements for improving the traffic reports
- Assess the impact of traffic reports on editor behavior
- Characterize the kinds of articles that receive social media traffic spikes
- Evaluate the impact of social media traffic (vs. other traffic sources) on article quality
- Identify potential disinformation campaigns coordinated via social media