Wikimedia Suomi software project ideas

From Meta, a Wikimedia project coordination wiki

This page documents potential project ideas for Google Summer of Code (2026), Outreachy (round 31), thesis work, internships, etc.

Ukbot Development

Ukbot is a bot that counts scores for editing competitions. It was originally developed by Danmichaelo and is now managed by WMNO (Jon Harald Søby) and WMFI (Zache). WMFI and WMNO are currently focusing on more structured development for it. The primary focus is on bug fixes and cross-wiki support.

Ongoing work

Cat-a-lot development

Cat-a-lot is a JavaScript gadget used mainly on Wikimedia Commons for categorizing photos. About 5% of daily Wikimedia Commons users use it. Zache from WMFI fixed some bugs in autumn 2024, and it was our project in Outreachy (round 30), where more bugs were fixed. However, it does not have an active developer, and some features from round 30 remain to be implemented, so it could be a suitable project for somebody interested in adopting it.

Phabricator tickets

Imagehash project

Wikimedia Finland has a project in which we are indexing all Wikimedia Commons images using perceptual hashes. The dataset is used for matching images between repositories and for detecting whether an image has already been uploaded to Wikimedia Commons. The current main task would be to write a Java version of the phash/dhash algorithms that produces the same results as the Python imagehash library. An alternative task could be adding the image hashing function used by ISCC to Python's imagehash library.
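To illustrate what a port has to reproduce, here is a minimal sketch of the core of a difference hash (dhash) in plain Python. It assumes the image has already been resized to a small grayscale grid (8 rows of 9 values for a 64-bit hash). The resizing and grayscale conversion are left out, and a port would need to match those steps as well to get identical hashes. The function names are illustrative, and the bit order shown here is an assumption that should be checked against the imagehash library rather than taken as its exact output format.

```python
def dhash_bits(gray_rows):
    """Return difference-hash bits for a grid of grayscale values.

    Each row must be one value wider than the number of bits it yields,
    e.g. 8 rows of 9 values for a 64-bit hash.
    """
    bits = []
    for row in gray_rows:
        # Compare each pixel with its right-hand neighbour.
        for left, right in zip(row, row[1:]):
            bits.append(1 if right > left else 0)
    return bits


def bits_to_hex(bits):
    """Pack bits (most significant first) into a zero-padded hex string."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    width = (len(bits) + 3) // 4  # hex digits needed
    return format(value, "0{}x".format(width))


def hamming_distance(bits_a, bits_b):
    """Count differing bit positions; small distances suggest similar images."""
    return sum(a != b for a, b in zip(bits_a, bits_b))
```

Two images are then compared by the Hamming distance between their hashes, so a Java implementation can be validated by hashing the same files with both libraries and checking that the distance is zero.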

Add RDF as export option to Overpass-turbo

Currently, there is an option to download OpenStreetMap data in GeoJSON format which can be converted to RDF. However, it would be useful to be able to directly download it in RDF format suitable for importing to local SPARQL tools that support GeoSPARQL, such as Apache Jena. The task would involve modifying Overpass-turbo code to add an RDF export option, configuring Apache Jena to support GeoSPARQL 1.1, and then writing a tutorial about how to import data from Overpass-turbo and Wikidata to Jena-Fuseki and querying combined data using GeoSPARQL.
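As a rough illustration of the mapping such an export would perform, the sketch below turns one GeoJSON point feature into Turtle with a GeoSPARQL WKT literal. It is written in Python for brevity and is not Overpass-turbo code. It assumes each feature carries an `id` such as `node/123`, and the `https://example.org/osm/tag/` namespace for tag keys is a hypothetical placeholder that a real export would replace with an agreed vocabulary.

```python
from urllib.parse import quote

GEO = "http://www.opengis.net/ont/geosparql#"
TAG_BASE = "https://example.org/osm/tag/"  # hypothetical namespace for OSM tag keys


def escape_literal(text):
    """Escape a value for use inside a double-quoted Turtle string."""
    return str(text).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def point_feature_to_turtle(feature):
    """Serialize one GeoJSON Point feature as a Turtle statement block."""
    if feature["geometry"]["type"] != "Point":
        raise ValueError("this sketch only handles Point geometries")
    # GeoJSON stores coordinates as [longitude, latitude].
    lon, lat = feature["geometry"]["coordinates"][:2]
    subject = "<https://www.openstreetmap.org/{}>".format(feature["id"])
    pairs = [
        '<{0}hasGeometry> [ <{0}asWKT> "POINT({1} {2})"^^<{0}wktLiteral> ]'.format(
            GEO, lon, lat
        )
    ]
    # One plain-literal statement per property, with the key percent-encoded.
    for key, value in sorted(feature.get("properties", {}).items()):
        pairs.append(
            '<{}{}> "{}"'.format(TAG_BASE, quote(key, safe=""), escape_literal(value))
        )
    return subject + "\n    " + " ;\n    ".join(pairs) + " .\n"
```

Lines, polygons and relations would need their own WKT serialization, which this sketch deliberately leaves out.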

FlaggedRevs PendingChangesBot

Some Wikipedias (dewiki, plwiki, huwiki, fiwiki, ruwiki; see the full list) use an extension named mw:FlaggedRevisions for tracking changes to articles. There are two different modes. In the first mode, edits need to be approved before they are shown by default to unregistered users. In the second mode, edits are directly visible to all users, and FlaggedRevs is used for approving changes. In most configurations, edits by regular users are approved automatically, while edits from unregistered and new users are reviewed via FlaggedRevs. This system tends to generate a huge backlog, which is handled in Finnish Wikipedia by SeulojaBot, originally developed in PHP as a proof of concept at a hackathon in 2016. The world has moved forward, and notably there are now LLMs that can be used for analyzing edits. Therefore, it is time to rewrite the bot in Python, with a proper end-user web interface and support for multiple Wikipedias.
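As a small illustration of where such a bot could start, the sketch below builds a MediaWiki API request for the list of pages with pending changes, using the `oldreviewedpages` list module provided by FlaggedRevs. The function name and defaults are illustrative assumptions, not part of SeulojaBot or PendingChangesBot, and the parameters should be checked against the target wiki's API help before use.

```python
from urllib.parse import urlencode


def pending_changes_url(api_url, namespace=0, limit=50):
    """Build a query URL listing pages that have unreviewed (pending) edits."""
    params = {
        "action": "query",
        "list": "oldreviewedpages",  # FlaggedRevs list module
        "ornamespace": namespace,    # 0 = article namespace
        "orlimit": limit,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


# Example: the backlog query for Finnish Wikipedia.
url = pending_changes_url("https://fi.wikipedia.org/w/api.php", limit=10)
```

Because the wiki's API endpoint is a parameter, the same function covers every wiki that has FlaggedRevs enabled, which is the multi-wiki support the rewrite aims for.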

Wikikysely

Wikikysely is a prototype implementation of a multilingual wiki survey tool built with Django and using Wikimedia OAuth login. All questions belong to a single main survey. Administrators can edit the survey description, manage questions and change the state of the survey.

Pywikibot development

PendingChangesBot and UkBot include features that could be upstreamed to Pywikibot, such as FlaggedRevs API support, Liftwing/ORES interfaces, and page load statistics. As UkBot is a stable, widely-used project, it could also be suitable as a GSoC or Outreachy project.
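For context, a Lift Wing interface would essentially wrap HTTP calls like the one sketched below. This is an assumption-laden illustration and not existing Pywikibot or bot code: the function name is hypothetical, and the endpoint and payload follow the public Lift Wing pattern for the `revertrisk-language-agnostic` model, which should be verified against the current Lift Wing documentation. The request is only constructed here, not sent.

```python
import json
from urllib.request import Request

LIFTWING_BASE = "https://api.wikimedia.org/service/lw/inference/v1/models/"


def build_revertrisk_request(rev_id, lang, user_agent):
    """Prepare (but do not send) a revert-risk scoring request for one revision."""
    payload = json.dumps({"rev_id": rev_id, "lang": lang}).encode("utf-8")
    return Request(
        LIFTWING_BASE + "revertrisk-language-agnostic:predict",
        data=payload,
        headers={"Content-Type": "application/json", "User-Agent": user_agent},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call. An upstreamed version would more likely reuse Pywikibot's own HTTP layer and user-agent handling.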

Other projects

These are projects that we develop because we use the tools ourselves. However, they are not considered GSoC, Outreachy or student projects, as they would either require preparation work first or are too tightly linked to the Finnish language.

FinnaUploadBot

This is a generic topic for all Finna-related work on Wikimedia Commons and Wikidata. Finna is the central aggregator for Finland's universities, libraries, museums and archives.

The work is linked to the imagehash database, as Finna image uploading and image metadata updates require methods for matching images between Commons and external repositories. It is also linked to Wikidata imports, as image metadata requires Wikidata items and properties. Finna also contains a vast amount of CC0-licensed metadata, which is imported to Wikidata to be used independently. We are trying to design a system where users can select photos in a web UI and upload them to Commons themselves instead of making import requests. The current bottleneck is that Finna metadata is not unified, so it needs human curation at the import phase.

WikiShootMe (fork)

A generic use case for photography competitions, photowalks, and POI-based crowdsourcing tasks is that we have tools which show searchable and filterable data on a map. Currently, we are using a modified WikiShootMe and Toolforge wrappers for this. However, the code is on GitHub and should be cleaned up, and the suitable parts (GeoJSON layers etc.) should be upstreamed.

Examples:

Fiwiki local support

Helping to maintain templates, help pages, Lua modules, TemplateData and gadgets, and tracking Phabricator tickets.

Accessibility improvements: converting Finnish Wikipedia's inline CSS styles in articles and templates to TemplateStyles. The aim is to ensure compatibility with MediaWiki's new dark mode and with WCAG 2.1 accessibility standards. This work started in 2024 and continues in 2025.

Project pages:

Misc ideas

Miscellaneous ideas that are in the project plans but do not have a timetable.