Wikimedia Suomi software project ideas
This page documents potential project ideas for Google Summer of Code (2026), Outreachy (round 31), thesis work, internships, etc.
Ukbot Development
Ukbot is a bot that counts scores for editing competitions. It was originally developed by Danmichaelo and is now managed by WMNO (Jon Harald Søby) and WMFI (Zache). WMFI and WMNO are currently focusing on more structured development for it. The primary focus is on bug fixes and cross-wiki support.
- Project repository: https://github.com/WikimediaNorge/UKBot
- Example competition: https://fi.wikipedia.org/wiki/Wikipedia:Viikon_kilpailu/Viikon_kilpailu_2025-07
- Programming language: Python
- Network/hardware requirements: Low
- Existing coding skill requirement: Low (tasks range from documentation to refactoring)
- Ongoing work
- Finalize the slow SPARQL query fix by moving the backend API helper script into the Python web UI (it is currently a separate PHP helper script)
Cat-a-lot development
Cat-a-lot is a JavaScript gadget mainly used in Wikimedia Commons for categorizing photos. About 5% of daily Wikimedia Commons users use it. Zache from WMFI fixed some bugs in autumn 2024, and it was our Outreachy project in round 30, where more bugs were fixed. However, it doesn't have an active developer, and some features from round 30 remain to be implemented, so this could be suitable if somebody is interested in adopting it.
- Documentation: https://commons.wikimedia.org/wiki/Help:Gadget-Cat-a-lot
- Programming language: JavaScript / HTML / CSS
- Network/hardware requirements: Low
- Existing coding skill requirement: Moderate
- Phabricator tickets
Imagehash project
Wikimedia Finland has a project where we are indexing all Wikimedia Commons images using perceptual hashes. The dataset is used for matching images between repositories and for detecting whether an image has already been uploaded to Wikimedia Commons. The current main task would be to write a Java version of the phash/dhash algorithms that produces the same results as the Python imagehash library. An alternative task could be implementing the image hashing function used by ISCC in Python's imagehash library.
- https://github.com/Wikimedia-Suomi/ImageHash-Toolforge
- https://github.com/JohannesBuchner/imagehash/issues/212
- Programming language: Python / Java
- Network/hardware requirements: Moderate
- Existing coding skill requirement: Moderate to high
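To illustrate what a Java port would need to reproduce, here is a minimal pure-Python sketch of the difference-hash (dhash) bit step and of Hamming-distance matching. It assumes the image has already been converted to grayscale and resized to a grid with one more column than the hash width, and it assumes a bit is set when a pixel is brighter than its left neighbour; both assumptions should be checked against the imagehash source. The function names are hypothetical and are not part of the imagehash API. The resize and grayscale steps are omitted here, and they are likely where most cross-language differences would appear.

```python
def dhash_bits(gray_rows):
    """Return row-major difference bits for a grid of grayscale values.

    A bit is 1 where a pixel is brighter than its left neighbour
    (assumed comparison direction; verify against the reference library).
    """
    bits = []
    for row in gray_rows:
        for left, right in zip(row, row[1:]):
            bits.append(1 if right > left else 0)
    return bits


def bits_to_int(bits):
    """Pack a list of 0/1 bits into an integer, first bit most significant."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def hamming_distance(hash_a, hash_b):
    """Count the bit positions in which two integer hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


# Tiny 2x3 grid, giving a 2x2 = 4-bit hash (real hashes typically use 8x9 grids).
example_hash = bits_to_int(dhash_bits([[1, 2, 3], [3, 2, 1]]))
```

Two images would then be treated as likely matches when the Hamming distance between their hashes falls below a chosen threshold.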
Add RDF as export option to Overpass-turbo
Currently, there is an option to download OpenStreetMap data in GeoJSON format, which can be converted to RDF. However, it would be useful to be able to download it directly in an RDF format suitable for importing into local SPARQL tools that support GeoSPARQL, such as Apache Jena. The task would involve modifying the Overpass-turbo code to add an RDF export option, configuring Apache Jena to support GeoSPARQL 1.1, and then writing a tutorial on how to import data from Overpass-turbo and Wikidata into Jena-Fuseki and query the combined data using GeoSPARQL.
- https://github.com/tyrasd/overpass-turbo
- https://jena.apache.org/documentation/fuseki2
- Programming language: JavaScript (Overpass), Java (Jena), SPARQL
- Network/hardware requirements: Linux
- Existing coding skill requirement: High
- https://overpass-turbo.eu
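The GeoJSON-to-RDF conversion mentioned above can be sketched roughly as follows. This is an illustration in Python rather than the JavaScript an actual Overpass-turbo patch would use. It assumes that each exported feature carries an `id` such as `node/123` and a flat `properties` object with a `name` tag, it handles only Point geometries, and the choice of subject IRIs and of `rdfs:label` for names is our own assumption rather than an established mapping. The function names are hypothetical.

```python
GEO = "http://www.opengis.net/ont/geosparql#"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OSM = "https://www.openstreetmap.org/"  # assumed base for subject IRIs


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def point_feature_to_ntriples(feature):
    """Convert one GeoJSON Point feature into a list of N-Triples lines."""
    if feature["geometry"]["type"] != "Point":
        raise ValueError("only Point geometries are handled in this sketch")
    # GeoJSON stores coordinates as (longitude, latitude), which matches
    # the axis order of the default GeoSPARQL CRS for WKT literals.
    lon, lat = feature["geometry"]["coordinates"][:2]
    subject = f"<{OSM}{feature['id']}>"
    geometry = f"<{OSM}{feature['id']}#geometry>"
    triples = [
        f"{subject} <{GEO}hasGeometry> {geometry} .",
        f'{geometry} <{GEO}asWKT> "POINT({lon} {lat})"^^<{GEO}wktLiteral> .',
    ]
    name = feature.get("properties", {}).get("name")
    if name is not None:
        triples.append(f'{subject} <{RDFS_LABEL}> "{escape_literal(name)}" .')
    return triples


sample_feature = {
    "type": "Feature",
    "id": "node/1",
    "properties": {"name": "Kahvila"},
    "geometry": {"type": "Point", "coordinates": [24.94, 60.17]},
}
sample_triples = point_feature_to_ntriples(sample_feature)
```

The resulting lines can be written to an `.nt` file and loaded into a GeoSPARQL-enabled triple store.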
FlaggedRevs PendingChangesBot
Some Wikipedias (dewiki, plwiki, huwiki, fiwiki, ruwiki; see the full list) use an extension named mw:FlaggedRevisions for tracking changes to articles. There are two different modes. In the first mode, edits need to be approved before they are shown by default to unregistered users. In the second mode, edits are directly visible to all users, and FlaggedRevs is used for approving changes. In most configurations, edits by regular users are approved automatically, while edits from unregistered and new users are reviewed via FlaggedRevs. This system tends to generate a huge backlog, which is handled in Finnish Wikipedia by SeulojaBot, originally developed as a proof of concept at a hackathon in 2016 using PHP. The world has moved forward, and notably there are now LLMs that can be used for analyzing edits. Therefore, it is time to rewrite it in Python, with a proper end-user web interface and support for multiple Wikipedias.
- Historical Python version: https://github.com/zache-fi/PendingChangesBot
- Outreachy round 31 version: https://github.com/wikimedia-suomi/PendingChangesBot-ng
- Phabricator board: https://phabricator.wikimedia.org/tag/pendingchangesbot/
- Programming language: Python, Pywikibot, Django; machine learning and LLM experience is a plus
- Network/hardware requirements: Low
- Existing coding skill requirement: Low to moderate
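As a rough illustration of where such a bot would start, the sketch below builds a MediaWiki API request for the FlaggedRevs review backlog and orders the results so that the longest-pending pages come first. It assumes the `list=oldreviewedpages` query module with the `ornamespace` and `orlimit` parameters and a `pending_since` field in the response; these names should be verified against the API documentation of the target wiki. The helper functions and the sample response are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode


def build_backlog_url(api_url, namespace=0, limit=50):
    """Build a query URL for pages with pending (unreviewed) changes."""
    params = {
        "action": "query",
        "list": "oldreviewedpages",  # assumed FlaggedRevs query module
        "ornamespace": namespace,
        "orlimit": limit,
        "format": "json",
    }
    return f"{api_url}?{urlencode(params)}"


def oldest_pending_first(response):
    """Sort backlog entries by how long they have been pending.

    ISO 8601 timestamps in the same format sort correctly as strings.
    """
    pages = response.get("query", {}).get("oldreviewedpages", [])
    return sorted(pages, key=lambda page: page["pending_since"])


# Hypothetical response shaped like the assumed API output.
sample_response = {
    "query": {
        "oldreviewedpages": [
            {"title": "Helsinki", "pending_since": "2025-03-02T10:00:00Z"},
            {"title": "Turku", "pending_since": "2025-01-15T08:30:00Z"},
        ]
    }
}
backlog_url = build_backlog_url("https://fi.wikipedia.org/w/api.php")
ordered_titles = [page["title"] for page in oldest_pending_first(sample_response)]
```

In a real bot, Pywikibot would normally handle the request, login, and continuation, so this only shows the shape of the data being processed.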
Wikikysely
Prototype implementation of a multilingual wiki survey tool built with Django, with Wikimedia OAuth login. All questions belong to a single main survey. Administrators can edit the survey description, manage questions, and change the state of the survey.
- Source code: https://github.com/Wikimedia-Suomi/wikikysely
- Homepage: https://wikikysely.toolforge.org
- Dev / testing version: https://wikikysely-dev.toolforge.org
- Project page: https://fi.wikipedia.org/wiki/Wikiprojekti:Wikikysely_2025
Pywikibot development
PendingChangesBot and UkBot include features that could be upstreamed to Pywikibot, such as FlaggedRevs API support, Liftwing/ORES interfaces, and page load statistics. As UkBot is a stable, widely-used project, it could also be suitable as a GSoC or Outreachy project.
Other projects
These are projects that we develop because we use the tools ourselves. However, they are not considered suitable as GSoC, Outreachy, or student projects, because they would either require preparation work first or are too tightly tied to the Finnish language.
FinnaUploadBot
This is a generic topic for all Finna-related work on Wikimedia Commons and Wikidata. Finna is the central aggregator for Finland's universities, libraries, museums, and archives.
Work is linked to the imagehash database, as Finna image uploading and image metadata updates require methods for matching images between Commons and external repositories. This work is also linked to Wikidata imports, as image metadata requires Wikidata items and properties. Finna also contains a vast amount of CC0-licensed metadata, which is imported to Wikidata to be used independently. We are trying to design a system where users can select photos using a web UI and upload them to Commons independently, instead of making import requests. The current bottleneck is that Finna metadata is not unified, so it will need human curation at the import phase.
- Main account with bot info: c:user:FinnaUploadBot
- Source code: https://github.com/Wikimedia-Suomi/Finna-uploader
- Images uploaded: c:Category:Files uploaded by FinnaUploadBot
WikiShootMe (fork)
A generic use case for photography competitions, photowalks, and POI-based crowdsourcing tasks is that we have tools which show searchable, filtered data on a map. Currently, we are using a modified WikiShootMe and Toolforge wrappers for this. However, the code is on GitHub and should be cleaned up, and the suitable parts (GeoJSON layers, etc.) should be upstreamed.
Examples:
Fiwiki local support
Helping with maintaining templates, help pages, Lua modules, TemplateData, and gadgets, and with tracking Phabricator tickets, etc.
Accessibility improvements: converting Finnish Wikipedia's inline CSS styles in articles and templates to TemplateStyles. The aim is to ensure compatibility with MediaWiki's new dark mode and with WCAG 2.1 accessibility standards. This work started in 2024 and will continue in 2025.
Project pages:
- w:fi:Wikipedia:Ulkoasun ja mallineiden ylläpito ja korjaukset
- w:fi:Wikiprojekti:Wikidata/Ranskankielisen Wikipedian Wikidata-moduuli
Misc ideas
Miscellaneous ideas that are in the project plans but do not have a timetable.
- How-to guide for querying Toolforge replica databases using an Ontop SPARQL server, plus performance analysis (Java, SQL, SPARQL)
- Ajapaik Flutter app development (Flutter or Python)
- Importing Ajapaik photos and metadata to Wikimedia Commons (Python, Django)
- WikiScore development (editing competition tool by Wikimedia Brazil)
- Development of the Wiki Loves Monuments web app by Wikimedia Italy (source code; backend is Django, frontend is React)
- Montage is used in WLM Jury, but it has bugs and is unmaintained.