Small wiki audit

From Meta, a Wikimedia project coordination wiki

The small wiki audit is a proposal to check the content of various small Wikimedia projects for linguistic accuracy and other issues. Ideally this would be performed on a regular basis by a team with relevant skills. This differs from the Small Wiki Monitoring Team, which mainly focuses on reverting vandalism and spam in real time.

The idea was proposed by RexSueciae during a 2020 discussion about the Scots Wikipedia.

Here, "small" is defined as having a low number of active users. Some Wikipedias may contain a large number of bot-generated articles but still have a small community.
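
As a rough illustration, this definition could be applied to the statistics a wiki reports about itself. The sketch below assumes a dictionary shaped like the `statistics` object returned by the MediaWiki API query `meta=siteinfo&siprop=statistics` (which includes `articles` and `activeusers` fields). The function name, the threshold of 20 active users, and the sample numbers are hypothetical and are not part of the proposal.

```python
def is_small_wiki(stats, max_active_users=20):
    """Return True when a wiki counts as "small" by active users.

    The article count is deliberately ignored: a wiki can hold many
    bot-generated articles and still have a small community.
    max_active_users is an illustrative threshold, not a proposed one.
    """
    return stats["activeusers"] <= max_active_users


# Made-up numbers, for illustration only.
bot_heavy_wiki = {"articles": 500000, "activeusers": 12}
busy_wiki = {"articles": 40000, "activeusers": 350}

print(is_small_wiki(bot_heavy_wiki))  # True
print(is_small_wiki(busy_wiki))       # False
```

Any real cut-off would need to be agreed by the people running the audits; the point is only that the test looks at community size rather than article count.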

As of writing, "small wiki audit" refers to an informal, non-binding process that produces reports, which can later be brought over to a local or Meta-Wiki RfC to discuss solutions.

Ideal skillset for auditors

  • Trained or amateur linguists with good general research ability
  • People (not necessarily linguists) with knowledge of a specific language or language family


  1. Decide on a language to audit.[1] Smaller wikis should be prioritized in auditing, as they have fewer contributors and smaller communities. One can choose a project based on:
    1. Wikis with many articles created by one person who isn't a native speaker of/fluent in the language
    2. Romance creoles and Germanic creoles/pidgins (because, as with the Scots Wikipedia, people may be more likely to contribute while mistakenly thinking they know the language)
    3. Languages that are supported by Google Translate or another machine translation service but have a small community (e.g. Xhosa, Shona, Hawaiian)
    4. Minor dialects/orthographies (yet enough for a separate Wikipedia) related to a major language (e.g. Western Armenian, Aromanian, Pennsylvania German, Wu Chinese, Nynorsk Norwegian)
    5. Constructed languages (e.g. Interlingue, Volapük) and extinct/liturgical languages (e.g. Old Church Slavonic)
    6. Languages or projects with large amounts of bot-created content
  2. Find an auditor to organize the review of the project. Ideally, the auditor would know the language, but could also be a linguist or other person willing to learn the basics of the language.
  3. Search Meta-Wiki archives to see if the project has been discussed before, especially in a request for comment or project closure proposal. If any significant issues were highlighted in past discussions, keep them in mind.
  4. The auditor performs a basic check on the project, for example by checking the most common language of discussion (possibly problematic if it is not the language of the wiki) and the Babel status of the users creating recent articles. At this stage, the auditor should be able to identify obvious inaccuracies from their study. This step, while recommended to save time, is not strictly necessary to proceed to the next steps. If only this step and the first are completed, the wiki may pass an audit but will not be recognized as entirely audited.
  5. If the auditor has further concerns and/or notices language inaccuracies, they should contact native/fluent speakers. This could be done by looking at the relevant language categories of Wikipedians on another project, reaching out on the wiki itself, or reaching out to experts/speakers of the language off-wiki.
  6. This step involves the comprehensive audit of articles that native language speakers would undertake. It would involve checking:
    1. The recent changes and newly created pages.
    2. A random sample of articles.
    3. (?) Special:LongPages, Special:ShortPages, Special:UnconnectedPages, maybe some other statistics like that.
    4. That the content passes a basic sanity test
    5. That the content is in the correct language
    6. That the content is formatted sanely
    7. That the content does not consist of copyright violations
    8. Alternatively, try XTools, which can automatically check many (but not all) of the problematic pages on any single wiki
  7. Determine whether there are any obvious problems under cross-project policies, such as those on biographies of living persons (BLP) and neutral point of view, which apply to many but not all Wikimedia projects. This also involves checking against global policies and the founding principles, and reviewing the main policies of each wiki to determine their compatibility with any global policies.
  8. If there are any problems, attempt to fix them. Continuously report back to the report page, which may function as a centralized discussion, and try to engage the broader language community outside of the wiki to fix the issues. Report your findings in the Community Portal of each Wikimedia project in that language, and bring them to the attention of the broader Wikimedia community if necessary. In general, try to be active in bringing attention to the problem and helping that language community fix it.
  9. Prepare a final report for the centralized page on Meta-Wiki. The audit and the final report could be done on a subpage entitled Small wiki audit/language/year. For example, it could be called Small wiki audit/Malagasy Wiktionary/2020 and collapsed on the subpage Small wiki audit/audits. Highlight the report by linking from the Community Portal of the wiki in question, and possibly also from those of each Wikimedia project in that language. If the auditors encounter problems, the final report should include a recommendation, such as opening an RfC on the matter.
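
The page lists for the comprehensive check in step 6 (recent changes, new pages, a random sample, and the long and short page reports) can be pulled from the MediaWiki Action API. The following is a minimal sketch, not a prescribed tool. It only builds the query URLs and does not fetch them. The helper names `api_url` and `audit_queries`, the default sample size of 10, and the `/w/api.php` path (the usual layout on Wikimedia-hosted wikis) are assumptions made for illustration.

```python
from urllib.parse import urlencode


def api_url(host, **params):
    """Build a MediaWiki Action API query URL for the given wiki host."""
    query = {"action": "query", "format": "json", **params}
    return f"https://{host}/w/api.php?{urlencode(query)}"


def audit_queries(host, sample_size=10):
    """Return API URLs for the page lists an auditor might review."""
    return {
        # Latest edits in the article namespace.
        "recent_changes": api_url(
            host, list="recentchanges", rcnamespace=0,
            rclimit=sample_size, rcprop="title|user|timestamp",
        ),
        # Newly created articles only.
        "new_pages": api_url(
            host, list="recentchanges", rctype="new",
            rcnamespace=0, rclimit=sample_size,
        ),
        # A random sample of articles.
        "random_sample": api_url(
            host, list="random", rnnamespace=0, rnlimit=sample_size,
        ),
        # The reports behind Special:LongPages and Special:ShortPages.
        "long_pages": api_url(
            host, list="querypage", qppage="Longpages",
            qplimit=sample_size,
        ),
        "short_pages": api_url(
            host, list="querypage", qppage="Shortpages",
            qplimit=sample_size,
        ),
    }


for name, url in audit_queries("sco.wikipedia.org").items():
    print(name, url)
```

The resulting lists only tell the auditor which pages to read. Judging whether the content is sane, in the correct language, and free of copyright violations still requires a native or fluent speaker.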


Ideally, each small wiki should be audited, and once it has been audited it should not need to be revisited for some time. Each audit could last from a few days of focused work, where contributors are easy to reach, to months for wikis and communities that are harder to reach.

See also

  1. This process is designed for Wikipedias. For other projects, such as Wiktionaries, this process can be adapted to fit the specific situation.