Wikimedia CH/Software Development Guidelines

From Meta, a Wikimedia project coordination wiki

Software Development Guidelines

Welcome to the software development guidelines, flavored by Wikimedia CH (Switzerland).

The Software Development Guidelines by WMCH can help you quickly discover how to join the technical Wikimedia movement, and how to have an impact, in a relaxed, fun, and effective way.

Target of this Document

This document is designed to cover the needs of:

  • enthusiastic volunteer technical participants;
    including technical volunteers, technical hackathon participants, etc. - to explore the available Wikimedia code and hosting platforms, best practices, etc.
  • a curious audience with little free time;
    including you, me, etc. to quickly scroll tech buzzwords and say "uh! interesting".
  • technical staff in Wikimedia CH;
    including Wikimedia CH contractors, consultants, etc. - to speed up their onboarding, etc.
    including projects under the umbrella of the WMCH innovation program, to avoid pitfalls, understand expected quality standards, and support the projects and their communities as best we can.

Welcome, Tech Contributor, to the Wikimedia Movement

Welcome to the Wikimedia family!

This section is just to welcome you, new technical contributor!

Think about Wikipedia.

Think about being a top-10 website on the planet.[1]

Think about Wikidata, Wikimedia Commons, Wikivoyage, Wikibooks, Wiktionary, Wikiversity, Wikiquote, Wikinews, Wikisource, Wikifunctions, Wikispecies, and counting.[2]

Think about writing content in 340+ language communities.[3]

Think about a website made by users, for users, that never sells your data. No third-party cookies. No spyware. No advertisements. No user profiling. No DRM. No "register to read more". No email required. No app required. No "supported only by Google Chrome". No web trackers. No fees.

Think about legally enabling massive human collaboration, to produce Open Content, while the obsolete world out there is stuck on "all rights reserved".

Think about 100% Open Source software, while the obsolete world out there creates social network apps that have access to your every sensor, and you can be prosecuted if you try to study them.

You can rely on this fact: there are millions of lines of source code out there, and you won't have difficulty finding things to do. 😎 Thanks for your help in the Wikimedia Infrastructure!

Glossary

Wiki
quick collaborative website.
Wikipedia
Wikipedia is the biggest Free collaborative encyclopedia. It's "just" one of our projects.

Wikimedia
Wikimedia is the cultural movement that collaborates on the projects listed in https://wikimedia.org/ including Wikipedia.
Wikimedia Foundation
the non-profit organization that hosts Wikipedia, and all Wikimedia projects, from the United States.
Wikimedia CH
the official non-profit organization that is active in Switzerland to help Wikimedia projects.

MediaWiki
the Open Source software that Wikipedia uses under its hood.

For the purpose of this document, some common words have specific meanings, as per RFC 2119:

  • MAY: means that an item is truly optional.
  • MUST: means that the definition is an absolute requirement.
  • MUST NOT: means that the definition is an absolute prohibition.
  • SHOULD: means that the definition can be ignored only under specific, well-understood exceptions.
  • SHOULD NOT: means that the behavior is discouraged, and is acceptable only under specific, well-understood exceptions.

Onboarding

Although Wikimedia projects are designed to also work anonymously, when you start contributing edits, and especially for technical contributions, you SHOULD have a lovely Wikimedia account.

Onboarding in Wikimedia Foundation Infrastructure

The Wikimedia Foundation already provides a structured, modern infrastructure, at no cost to end users, that minimizes fragmentation, costs, complexity, and gatekeeping. This infrastructure also provides effective user management and, therefore, more efficient collaboration.

Here is an overview of the two separate types of access, and what they provide:

Meta-wiki provides a global account to access collaborative public access tools, including:

Wikipedia
the Free encyclopedia
Wikidata
database of structured data
MediaWiki.org
documentation for MediaWiki (the software running Wikipedia)
Wikimedia Phabricator
bug tracker and more
Quarry
SQL query service
and more

Wikitech provides a separate account, to access restricted technical tools, including:

Wikimedia GitLab
code hosting with git
Wikimedia Cloud services
virtual private servers (VPS) on OpenStack (IaaS)
Wikimedia Horizon
OpenStack administration panel
Wikimedia Toolforge
shared hosting based on Kubernetes (PaaS)
and more

All these services are privacy-friendly. All are hosted on machines physically controlled by the Wikimedia Foundation. Adopting third-party services, such as Service as a Software Substitute (SaaSS), is normally not required in any way. For example, when "GitLab" is mentioned, we normally mean https://gitlab.wikimedia.org/ and not https://gitlab.com/ etc.

Make sure to have a Meta-wiki global account.

Meta-wiki: your Wikimedia Global Account (also for Phabricator!)

This section in short:
  1. Try to see your Meta-Wiki Preferences
  2. Try to click on "Subscribe" in this example page: Phabricator: C4

Problems logging in? Just read this section! ⚡ 2-minute read.

With a Wikimedia Global Account you can use talk pages in Wikipedia, receive notifications, and more. With a Wikimedia Global Account you can also log in to Wikimedia Phabricator to follow technical discussions.

All users SHOULD follow these steps, to communicate with others on Meta-wiki.

Developers MUST follow these steps, so as not to be cut off from tasks on Phabricator.

Paid developers MUST follow these steps twice, to have both a volunteer account and an organization account. If in doubt, contact your organization to follow its naming conventions.

  1. Meta-wiki: global Wikimedia account
    (account registration)

    Your account in Meta-wiki gives you a global identity to log in to many websites:
    • meta.wikimedia.org - the community of coordination and planning
    • wikipedia.org - the worldwide Free encyclopedia
    • mediawiki.org - the documentation about core software
    • quarry.wmcloud.org - community query service
    • and additional collaborative websites for the general community. ✨
    • and phabricator.wikimedia.org - community bug tracker (see below)
  2. Wikimedia Phabricator: bug tracking
    (OAuth login)

    Please try your first login to Wikimedia Phabricator, to join technical discussions, follow tasks, etc.
    Visit the page phabricator:auth/ and use the button on the left to log in (the one called "Log in or Register MediaWiki").
    If you are redirected to a login page on "mediawiki.org", just enter the credentials of your Meta-wiki account.

Wikitech: your Wikimedia Developer Account

The next step is obtaining a developer account for Wikitech.

Remember: Wikitech and Meta-wiki are completely separate accounts (mostly for security reasons).

All users MAY follow these steps (to put stars on GitLab repositories, etc.)

Developers MUST follow these steps (to be able to create repositories, push changes, etc.)

Paid developers MUST follow these steps twice, to have both a volunteer account and an organization account. If in doubt, contact your organization to follow its naming conventions.

  1. Wikitech: technical wiki (account registration)
    Your registration in Wikitech lets you use important technical resources, including:
    • wikitech.wikimedia.org - the technical wiki
    • horizon.wikimedia.org - the OpenStack panel
    • and other resources for the technical community.
    • and gitlab.wikimedia.org (see below)
  2. Wikimedia GitLab: source code (login)
    Just enter your Wikitech credentials to do your first login to Wikimedia GitLab.

Onboarding your Team

Once you have created the accounts mentioned above, it is good practice to customize your own user pages in a way that is meaningful and helpful for others.

A good public user page is short, nice, useful and up to date.
The first principle is that your public user page is not intended for self-promotion, but to communicate your background and purposes, and to declare your conflicts of interest, your role, and your paid assignments. This helps others understand in which areas you can contribute, and why.

Here's a handy checklist to follow when starting one's assignment or when a role changes:

  1. https://meta.wikimedia.org/
    Log in. Visit your user page (usually linked in the top-right corner)
    1. add info about your role, in English
      Example: "Hello! In YEAR I'm paid to work on project ABC for Wikimedia CH. My specific role is: developer"
    2. mention your Phabricator username
      Example: "On Phabricator you can find me as @supermario-wmch"
    3. mention your Wikitech username
      Example: "On Wikitech I'm this user: https://wikitech.wikimedia.org/wiki/User:SuperMario-WMCH"
  2. https://wikitech.wikimedia.org/
    1. Login. Visit your user page
    2. add a link pointing to your Meta-wiki user page
  3. https://gitlab.wikimedia.org/
    1. Login. Select your profile picture > Edit profile.
    2. add a link to your Meta-wiki user page

You have succeeded if your Meta-wiki user page helps others find you on all other services, and if people coming from somewhere else (for example Wikitech) can easily find your Meta-wiki account.

Please help your coworkers do the same.

Bonus point: on your Meta-wiki user pages you can mention each other, to help others find your coworkers and better understand how your team is organized. Creating a template for your team is an easily manageable way to keep everyone connected, without having to review many user pages whenever the team changes.

Extra notes for Paid Work

If your time is financed to technically help Wikimedia volunteers, that's a great responsibility! Your boss is: humanity, donors, volunteers, and your organization. This means your boss may also be a volunteer. So, extra care in proactively communicating your progress and any doubts will surely be appreciated.

Remember that your organization's priorities MUST be compatible with community expectations.

Try to help the community at least in the bug triage process. Talk to your superiors if your working hours cannot cover the community's expectations. If your team lacks a person to help the community, try to describe the missing technical profile to your superior.

While we care about your burnout, please also do not put the community in burnout. Do not assign Tasks to the community (that's not "community management"). Do not bother the community with unnecessary pings. Avoid large numbers of consecutive actions (such as "edits") that may cause a lot of notifications, etc.

Thanks for your extra help in documenting this extra information on your user pages:

  • your working role
  • the start date of your working contributions, and expected end date (if any / known)
  • how to contact you, also in case of problems with your edits
  • how to contact your superior

Sharing the above information takes 10 minutes, but it's quite important, also for temporary contractors.

Thank you so much for your tech professional contributions to the Wikimedia community!

Possible Tech Contributions

Now that you have the necessary accounts to act, where should you start contributing?

The first entry point to Wikimedia CH software should be Phabricator: this tool can be used to track projects, act on software, and foster communication among members.
There are several projects affiliated to WMCH:

By visiting the links above you can figure out what the current projects are and what the community looks like.
Please don't start contributing just yet: first finish reading the rest of this document.

Software Guidelines

The Wikimedia technical community maintains hundreds of thousands of software projects.[4][5] Your help in following basic best practices is important.

Operating System

A happy web application supports Debian GNU/Linux servers.

Software SHOULD be designed in a way that it executes successfully on Debian GNU/Linux stable.[6][7]

One reason, among others, is that Debian stable is the only operating system supported by Wikimedia Cloud Services. This is true for both OpenStack and Kubernetes.[6]

To install your dependencies you SHOULD adopt these sources, in this order of preference:

  1. Debian apt with repository "main" or "security"
  2. Debian apt with repository "backports"
  3. Flatpak / snap
  4. Docker

To install your dependencies you SHOULD NOT use these sources:

  • manual download of ".deb" packages from the web
  • download of generic software from the web
  • execution of installation scripts, such as piping the output of "wget" into a shell run with sudo

Things that MUST NOT be used:

  • adoption of the Debian apt repository "non-free"
  • download of any software that is not released under a Free/Libre and Open Source license
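
As an illustration only, the installation sources listed above can be encoded in a small lookup that tells which RFC 2119 keyword applies to a source and which of the available sources is preferred. This is a minimal sketch: the source labels (such as "apt-main" or "piped-install-script") and the function names are hypothetical and are not part of any Wikimedia tool.

```python
from typing import List, Optional, Tuple

# Hypothetical labels for the sources listed above; a lower rank is preferred.
PREFERRED_RANK = {
    "apt-main": 1,
    "apt-security": 1,
    "apt-backports": 2,
    "flatpak": 3,
    "snap": 3,
    "docker": 4,
}
SHOULD_NOT = {"manual-deb", "generic-download", "piped-install-script"}
MUST_NOT = {"apt-non-free", "non-foss-download"}


def classify(source: str) -> Tuple[str, Optional[int]]:
    """Return the keyword that applies to a source, plus its rank if allowed."""
    if source in PREFERRED_RANK:
        return ("SHOULD", PREFERRED_RANK[source])
    if source in SHOULD_NOT:
        return ("SHOULD NOT", None)
    if source in MUST_NOT:
        return ("MUST NOT", None)
    raise ValueError(f"unknown source label: {source!r}")


def pick_best(available: List[str]) -> Optional[str]:
    """Return the most preferred allowed source, or None if none is allowed."""
    allowed = [s for s in available if s in PREFERRED_RANK]
    if not allowed:
        return None
    return min(allowed, key=PREFERRED_RANK.__getitem__)
```

For example, if a dependency is available from Docker, snap, and Debian backports, `pick_best(["docker", "snap", "apt-backports"])` selects the backports package.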

Examples of solutions that MUST NOT be used:

  • adoption of proprietary dependencies (both in content and in software)
  • adoption of proprietary user trackers (such as Google Analytics etc.) - instead you can adopt WMCH Matomo
  • adoption of external web resources (such as Google Fonts etc.) - instead you can serve them as local resources

Examples of solutions that SHOULD NOT be used:

  • adoption of unstable / nightly software as base
  • adoption of esoteric programming languages

Since the operating system adopted in the Wikimedia movement is Debian stable, there are well-known stable versions that SHOULD be supported.

Your server-side software SHOULD NOT be incompatible with these versions (updated with Debian GNU/Linux current stable, that is: Debian bullseye):

Your client-side software MUST NOT be incompatible with these versions:

In addition to the above software versions, please note that further software versions, officially supported at least by Wikimedia Toolforge, are listed on the following page:

If these versions are too outdated for you, as already mentioned, check the next section for some acceptable workarounds.

Workarounds for Missing Software Versions

If the indicated versions are too obsolete for your needs, you can still follow one of these accepted workarounds:

These and similar workarounds allow flexibility while keeping compatibility with Debian stable, still using sources that are trusted and receive security updates.

Please refrain from using esoteric versions, unstable versions, or anything that may be very difficult for most other people to maintain. Thanks!

Editors

To develop and write code efficiently, tools such as basic code editors or full-fledged IDEs are an absolute necessity. There are hundreds of tools out there, but we can suggest a couple of industry-standard ones:

Infrastructures Comparison

This table helps in finding the infrastructure best suited for your projects.

|                                                         | #Wikimedia Foundation Infrastructure                | #Wikimedia CH Infrastructure                     |
| In short, suitable for                                  | Community projects                                  | Approved WMCH projects                           |
| Super-easy to add and remove users and set access roles | Yes                                                 | No                                               |
| Available to the entire tech community                  | Yes                                                 | No                                               |
| OpenStack and Kubernetes                                | Yes                                                 | No                                               |
| Supports different GNU/Linux distros than Debian        | No (only Debian)                                    | Yes (all distributions)                          |
| Renewable energy                                        | 74% renewable energy in data centers (2022).[8][9]  | 60% renewable energy in data centers (2023).[10] |
| Primary Data Center                                     | United States                                       | Switzerland                                      |

Hosting

This table summarizes suitable hosting solutions according to your need:

| Need           | Implemented with | Platform to be adopted | Platform owner                       |
| shared hosting | Kubernetes       | Wikimedia Toolforge    | #Wikimedia Foundation Infrastructure |
| VPS            | OpenStack        | Wikimedia Cloud        | #Wikimedia Foundation Infrastructure |
| VPS            | hypervisor       | WMCH internal cloud    | #Wikimedia CH Infrastructure         |

To have a more comprehensive overview, check these resources:

Here are some use cases:

  • hosting a bot operating in read/write mode on wiki contents:
    → Toolforge
  • hosting a web tool useful for Wikimedia purposes with stable versions of PHP/Python/Ruby on Debian:
    → Toolforge
  • hosting a web tool useful for Wikimedia purposes with recent software versions on Debian:
    → Cloud VPS
  • hosting a tool with custom software on custom GNU/Linux:
    → WMCH internal cloud

Coding Conventions

Developers may appreciate these coding conventions when writing source code:

A new project SHOULD adopt simple and libre tools to help newcomers follow the Coding Conventions successfully and without friction. For example, PHP CodeSniffer.[11]

Additional useful resources:

Documentation

People in the development team are the ultimate experts in their own creation. This is why we need your help to take care of the documentation.

The purpose of documentation is not to write it but to read it.

Good documentation helps to avoid abandoning a project, or rewriting it from scratch, due to a lack of shared knowledge.

The documentation must be released under a free license. Suggestions:

Sysadmin Documentation

We need your help to create a Sysadmin Documentation. This documentation will be useful to future GNU/Linux system engineers handling your service. The sysadmin documentation should be in English.

The goal is understanding how to (re)create a testing and a production environment and how to update, backup and restore.

Usually useful information to mention:

  • software dependencies (packages to be installed in a new minimal Debian GNU/Linux stable)
  • installation instructions (what apt command, etc.)
  • configuration instructions (which system configuration files should be changed, etc.)
  • references to other applications, repositories and documentation related to or useful for the deployment
  • log file paths (those relevant for investigating application issues)
  • Unix users in play (perhaps the app uses system users such as www-data or has custom users)
  • system daemons related to the application (e.g. own custom systemd units, mention other services in use such as mariadb, apache2, nginx, etc.)
  • listening ports (TCP/UDP) and from which app component
  • security instructions
    • which paths should not be externally exposed (example: ./my-temp etc.)
  • hardening instructions
    • paths that must be writable by the app (and thus assigned to a possible my-app user) during its normal operation (example: ./my-upload, ./my-tmp, etc.)
    • paths that should not be writable by the app (and therefore assignable to root) (example: all executable files, ./my-conf, etc.).
  • update notes
    • application update procedure
  • backup and restore procedure
    • where on the filesystem are the user data to be preserved (example: ./my-upload etc.)
    • which databases should be preserved
    • how to enable any maintenance mode
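
The hardening notes in the list above can be spot-checked with a few lines of code. The following is a minimal sketch, not a complete audit: it only reports paths whose permission bits grant write access to the group or to others, and it does not check file ownership. The function name `loosely_writable` is hypothetical.

```python
import os
import stat


def loosely_writable(paths):
    """Return the paths whose mode grants write access to group or others."""
    flagged = []
    for path in paths:
        mode = os.stat(path).st_mode
        # S_IWGRP / S_IWOTH are the "group can write" / "others can write" bits.
        if mode & (stat.S_IWGRP | stat.S_IWOTH):
            flagged.append(path)
    return flagged
```

Run against the paths that the documentation declares as not writable by the app (for example ./my-conf), an empty result means no group or world write bits were found on them.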

Not all of these points may be relevant. If you feel something is missing, better add it.

The documentation must not contain any secret or password.
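
One way to catch obvious slips before publishing is a naive scan for lines that look like a credential assignment. This is only a heuristic sketch: the keyword list and the function name `suspicious_lines` are assumptions, it will miss many real secrets, and it does not replace a human review.

```python
import re

# Hypothetical keyword list; matches e.g. "DB_PASSWORD=..." or "token: ...".
SUSPICIOUS = re.compile(
    r"(password|passwd|secret|token|api[_-]?key)\s*[:=]\s*\S+",
    re.IGNORECASE,
)


def suspicious_lines(text):
    """Return (line number, line) pairs that look like a credential assignment."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if SUSPICIOUS.search(line)
    ]
```

A sentence that merely mentions the word "password" is not reported; only a keyword followed by ":" or "=" and a value is.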

The Sysadmin Documentation can be a section in the README file in the project repository.

Development Documentation

A short document in English describing the structure of the project helps other technical people to approach, navigate, and contribute to the software.

Information that should be covered:

  • purpose of the project/challenges encountered
  • structure of the project (to navigate the directories)
  • how to test the code locally
  • how to configure the application (testing / production)
  • where to find application logs, how to examine

A section called "Development Documentation" in your README file of your project can be a good starting point to help other developers.
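
A project could check in a few lines that its README actually contains the sections suggested in this document. The sketch below is an assumption about how such a check might look: it accepts a Markdown heading ("## Title"), a wikitext heading ("== Title ==") or a bare title line, and the function name `missing_sections` is hypothetical.

```python
REQUIRED_SECTIONS = ["Sysadmin Documentation", "Development Documentation"]


def missing_sections(readme_text, required=REQUIRED_SECTIONS):
    """Return the required section titles that never appear as a line of their own."""
    titles = set()
    for line in readme_text.splitlines():
        # Drop heading markers such as "##" or "==" around the title.
        title = line.strip().strip("#=").strip()
        if title:
            titles.add(title.lower())
    return [name for name in required if name.lower() not in titles]
```

An empty list means both sections are present; any returned title is a section still to be written.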

User Documentation

Any good software has good User Documentation. The User Documentation allows end-users to master the software.

Tips:

  • start the draft in English, especially if you plan to make it multi-language in the future
  • it is also okay to write it just in the language of your main destination community
  • start the draft without paying too much attention to formatting (example: an Etherpad, a wiki, or a README is fine)
  • avoid proprietary tools from the beginning (example: avoid Google Docs)

Note: the user documentation is usually improved by non-technical users. So, it's probably better to adopt a wiki, than a README on git.

You can omit obvious details. For example you can omit steps already covered by in-application wizards, etc.

Server Inventory

Any server that is not sufficiently inventoried risks being deleted by the Wikimedia Foundation.[12]

  • verify that adopted servers are well-known and mentioned in the documentation
  • check that any custom domain is mentioned as well
  • check that the documentation sufficiently covers "The Cloud VPS Instance lifecycle"

Maintenance

Until the end of its assignment, your team should take care of the application maintenance. Examples:

  • take care of the initial setup
  • perform routine maintenance to keep it running
  • apply security updates on application dependencies
  • apply security operating system updates (when applicable)
  • verify that the #Backup procedures are working

Insights:

If the application is already in production:

  • schedule and communicate the intervention windows required to perform maintenance activities that may cause downtime (polite 6-hour notice)

Backup

At the beginning of the assignment, the team sets up a simple automatic backup plan, first of all storing a copy on the same server where the application is located, to implement a first on-site backup.

Purpose of on-site backup: to allow developers to restore data from a point in time before a (their?) mistake, independently and quickly.

Minimum and recommended on-site backup parameters:

  • Data to be saved: the minimum data needed to do a project restore
    • Examples you can include: databases, private application configurations, user uploads, application logs of the day, ...
    • Examples you can exclude: operating system files, files already published elsewhere (git), caches, logs already old, ...
  • Frequency: every night
  • Time of day: at night, between 01:00 and 04:00 in UTC+1 (Switzerland, Geneva time)
  • Data retention: 24 hours (one copy only; each new backup overwrites the previous one)

Once implemented, the backup should be supervised at least until the end of the assignment.

Tip: if it's useful, a simple on-site backup can be done with a single crontab line, using simple tools like mysqldump and/or rsync etc.

The backup directory must not be accessible to the public or to users that are not trusted.

Suggested destination for your on-site backup in your VPS:

/var/backups/wmch/$HOSTNAME/daily/files...

Suggested permissions: chown app:app with chmod o= ....
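
The parameters above (one nightly copy, overwritten each time, not readable by others) can be sketched as a small script to be run by a scheduler such as cron. This is a minimal sketch under stated assumptions, not a reference implementation: the archive name "files.tar.gz", the function name `run_daily_backup` and the numeric modes are hypothetical choices, and a database dump (for example one produced by mysqldump) is assumed to be created separately and passed in as one of the sources.

```python
import os
import socket
import tarfile
from pathlib import Path


def run_daily_backup(sources, backup_root="/var/backups/wmch", hostname=None):
    """Write one compressed archive of `sources`, replacing the previous copy."""
    hostname = hostname or socket.gethostname()
    daily_dir = Path(backup_root) / hostname / "daily"
    daily_dir.mkdir(parents=True, exist_ok=True)
    # Similar in effect to "chmod o=": no access at all for other users.
    os.chmod(daily_dir, 0o750)

    archive = daily_dir / "files.tar.gz"  # hypothetical archive name
    partial = daily_dir / "files.tar.gz.partial"
    with tarfile.open(partial, "w:gz") as tar:
        for source in sources:
            source = Path(source)
            # Directories are added recursively, under their own base name.
            tar.add(source, arcname=source.name)
    os.chmod(partial, 0o640)
    # Rename last: the previous backup stays intact until the new one is complete.
    os.replace(partial, archive)
    return archive
```

Writing to a temporary name and renaming at the end keeps a single copy on disk after each run, while avoiding a half-written archive if the backup is interrupted.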

An on-site backup is not sufficient. It is just the first step towards quickly setting up an off-site backup.

Verify that both the backup procedure and its restore are well-defined in the #Sysadmin Documentation.

Code of Conduct

You (and your team) accept the following codes of conduct:

In short: be nice to others.

Terms of Service

You (and your team) accept the Wikimedia Foundation Terms of Service:

Software License

Any new software created for Wikimedia CH and to be used in Wikimedia projects must be released with a Free/Libre and Open Source license.

If you work in a company, be sure that the person in charge of your company allows you to release such software. Get it written down. Details:

Offboarding

Prior to the conclusion of a team member's assignment, that person follows this checklist:

  1. update the Team Documentation to reflect the role change
  2. communicate the accounts that need to be deactivated
  3. communicate the list of personal information that should be removed (not guaranteed to be removed)
  4. contribute to the relevant beautiful #Documentation

Communication

Good communication helps users to be aware of development directions. Optimal communication helps the team avoid design mistakes.

Let's start by saying that this is not that easy: some volunteers may raise a controversy if they are not involved in the early phases, while other people simply choose not to be involved.

A communication compromise is necessary since the community is big. Some volunteers are conservative: the Wikimedia platforms they use are almost like their working desk, and any change can waste their time and create frustration.

The development team, on the other hand, should be able to have creative and positive space to deliver something new.

Communicate Early

Don't wait for the final conclusion of the project to communicate progress. This is also important because many projects will probably never be declared completed. You might be surprised how long your software might last (as an example, let's mention Wikipedia).

Do you have a new project challenge? Do you have new progress? Do you have a new important bug? That can be a good moment to propose a date for a quick meeting to show it. It is not required to plan a one-hour presentation each day or each week. However, it would be a mistake not to dedicate five minutes to sharing your screen from time to time.

Online meetings should be open to the public, so that at least some people can join and listen. For this reason, the video conferencing platform should be libre (example: Jitsi, BigBlueButton, etc.); otherwise, technical contributors may be excluded and there wouldn't be much benefit.

Contact / Questions

If something is not clear, please share your opinion on the talk page or contact us:

https://wikimedia.ch/en/contact/

Note
