Hashtags

From Meta, a Wikimedia project coordination wiki

Screenshot of the homepage of the tool.

Hashtags is a tool for monitoring and searching the usage of hashtags in Wikimedia edit summaries. The tool monitors edits to all Wikimedia projects (except Wikidata, see below) and lets users search those edits by hashtag. It was created in 2016 and rewritten in 2018.

Usage

Example search results from the tool (https://hashtags.wmflabs.org/?query=1lib1ref).

The tool is hosted on a Cloud VPS instance at https://hashtags.wmflabs.org/. The 'Trending Tags' section on the right lists the most commonly used hashtags from the past 30 days.

To search for a specific hashtag, enter it in the Hashtag field and click Submit. Results can be limited by Project (e.g. `fr.wikisource.org`) and by date range, using a start date, an end date, or both. To search for multiple hashtags at once, separate them with commas, e.g. 1lib1ref, 1bib1ref. An edit whose summary contains more than one of the searched hashtags is returned only once.

URLs take the form https://hashtags.wmflabs.org/?query=<hashtag>&project=<project>&startdate=<YYYY-MM-DD>&enddate=<YYYY-MM-DD>. Any parameter except query can be omitted.
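As a rough illustration of the URL format above, the following Python sketch assembles a search URL from its parts. The function name `build_search_url` and its keyword arguments are hypothetical helpers chosen for this example; only the base URL and the four parameter names come from the format described above.

```python
from datetime import date
from urllib.parse import urlencode

# Base URL of the tool, as given on this page.
BASE_URL = "https://hashtags.wmflabs.org/"


def build_search_url(query, project=None, startdate=None, enddate=None):
    """Build a search URL; every parameter except `query` is optional.

    `startdate` and `enddate` are datetime.date objects and are
    rendered as YYYY-MM-DD. (Hypothetical helper, not part of the tool.)
    """
    params = {"query": query}
    if project is not None:
        params["project"] = project
    if startdate is not None:
        params["startdate"] = startdate.isoformat()
    if enddate is not None:
        params["enddate"] = enddate.isoformat()
    # urlencode percent-encodes any characters that are unsafe in a URL.
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("1lib1ref"))
    print(build_search_url("1lib1ref", project="fr.wikisource.org",
                           startdate=date(2019, 1, 1)))
```

Omitted parameters are left out of the query string entirely rather than sent as empty values, which matches the statement that any parameter except query can be omitted.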

To see more detail for a particular search, click the Show statistics button. Three graphs are currently displayed: Top projects, Top users and Edits over time. The Top projects and Top users graphs show the top 10 Wikimedia projects and the top 10 users respectively, both sorted by decreasing number of edits. To view the full list of projects and users, click the View full stats button. The Edits over time graph shows edits per day if the search results span less than 90 days, edits per month if they span more than 90 days but less than 3 years, and edits per year otherwise. Each of these statistics can also be downloaded as a CSV file by clicking the Download CSV button.
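The rule for the Edits over time graph can be sketched as a small function. This is an illustration only, not the tool's code: the function name is hypothetical, and two details are assumptions because the description above leaves them open, namely that a span of exactly 90 days is grouped by month and that "3 years" means 3 × 365 days.

```python
from datetime import date


def edits_over_time_granularity(first_edit, last_edit):
    """Return "day", "month" or "year" for a range of search results.

    Assumptions (not stated on this page): exactly 90 days falls in
    the monthly bucket, and 3 years is treated as 3 * 365 days.
    """
    span_days = (last_edit - first_edit).days
    if span_days < 90:
        return "day"
    if span_days < 3 * 365:
        return "month"
    return "year"


if __name__ == "__main__":
    print(edits_over_time_granularity(date(2019, 1, 1), date(2019, 1, 31)))
    print(edits_over_time_granularity(date(2018, 1, 1), date(2019, 1, 1)))
    print(edits_over_time_granularity(date(2015, 1, 1), date(2019, 1, 1)))
```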

Until 2018 the tool was hosted at https://tools.wmflabs.org/hashtags. Due to high resource usage and a very large database, the tool was rewritten and moved to a dedicated Cloud VPS instance. Most URLs from the old tool now redirect to the new one, including any search queries.

Hashtags

The Hashtags tool monitors hashtag usage on all Wikimedia projects except Wikidata via the recentchanges EventStream. Hashtags are matched using the regex `(?:^|\s)[##]{1}(\w+)`; hashtags that consist only of numbers are ignored.
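To show how the matching rule behaves, here is a minimal Python sketch that applies the regex quoted above to an edit summary. The function name `extract_hashtags` is hypothetical. Two points are assumptions: `\w+` on its own also captures digit-only tags, so the sketch drops those in a separate step to reflect the numbers-only exclusion; and the repeated `#` in the character class is copied as it appears here, where it may stand for a second hash character (for example a fullwidth form) that was lost in transcription.

```python
import re

# Regex copied verbatim from this page. A hashtag must sit at the start
# of the summary or follow whitespace, so "issue#5" does not match.
HASHTAG_RE = re.compile(r"(?:^|\s)[##]{1}(\w+)")


def extract_hashtags(summary):
    """Return the hashtags in an edit summary, without the leading '#'.

    Digit-only tags are skipped in a separate step (an assumption about
    how the numbers-only exclusion is applied).
    """
    tags = []
    for tag in HASHTAG_RE.findall(summary):
        if tag.isdigit():
            continue
        tags.append(tag)
    return tags


if __name__ == "__main__":
    print(extract_hashtags("added citation #1lib1ref #2"))
    print(extract_hashtags("see issue#5 for context"))
```

The first call prints `['1lib1ref']`, because `#2` is digit-only, and the second prints `[]`, because the `#` is not preceded by whitespace.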

Wikidata is currently excluded from data collection due to the large volume of data it would generate. As an example, nearly 7 million monthly edits are tagged with #quickstatements there. See T207029 for details.

Bot edits are also excluded from data collection because of their high edit rates. A bot's edits can instead be tracked through the contribution history of the bot account.
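The two exclusions described above (Wikidata and bot edits) could be expressed as a simple filter over events from the recentchange stream. This is a sketch under assumptions, not the tool's actual logic: the function name is hypothetical, and it assumes each event is a dictionary with a boolean `bot` field and a `server_name` field, with Wikidata identified by the host name `www.wikidata.org`.

```python
# Host name assumed to identify Wikidata in stream events.
EXCLUDED_SERVERS = {"www.wikidata.org"}


def is_collectable(event):
    """Return True if an event passes both exclusions described above.

    `event` is assumed to be a dict decoded from one stream message;
    the field names used here are assumptions, not taken from this page.
    """
    if event.get("bot"):
        return False  # bot edits are excluded
    if event.get("server_name") in EXCLUDED_SERVERS:
        return False  # Wikidata is excluded
    return True


if __name__ == "__main__":
    sample = {"bot": False, "server_name": "en.wikipedia.org",
              "comment": "added citation #1lib1ref"}
    print(is_collectable(sample))
```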

From 2016 to 2018 data was collected from various language Wikipedia projects, as each was requested by editors. Therefore data earlier than August 2018 does not include other Wikimedia projects or many Wikipedia languages. There is a gap in the data from August 8th to early September 2018 as a result of the time between the old tool being taken down and the new one starting up.

Contributing

Contributions to the Hashtags tool's development are welcome. The source code is available on GitHub and open tasks are listed on Phabricator. The tool runs on Django in Docker containers. Instructions for local setup can be found in the GitHub README.