Project/in situ

From Meta, a Wikimedia project coordination wiki
Status: experimental
Project summary: in situ. Experimenting with a Wikibase that responds to the requirements of wiktionarian projects
Creator: psychoslave
Volunteer: Csisc
This project needs: volunteer, developer, template guru, sysop, ontologist, communication facilitator
Created on: 15:01, 31 May 2021 (UTC)

This project aims to explore what a Wikibase instance could bring to wiktionarian projects in terms of cohesively structured data.

Rationale

The various language versions of Wiktionary collect a lot of redundant information, because they share very little data with each other. Even within a single version, information such as a quotation or a definition is often manually duplicated in full on another page, and nothing prevents the copies from diverging over time. Furthermore, the data is not structured in a way that eases fine-grained querying or simplifies cross-referencing.

On the other hand, projects like Wikidata and its Lexicographical data follow a path toward more cohesively structured data that eases these points. But they currently offer little to build on for tackling Wiktionary-specific needs, as they are oriented toward very different goals and priorities. Furthermore, Wiktionary is licensed under CC BY-SA while Wikidata uses CC0, which makes any significant transfer of information from Wiktionary to Wikidata and its Lexeme extension legally impossible.

Of course, Wiktionary as it stands does have many conveniences, such as the flexibility of structuring data through simple wikicode, templates, modules and so on. It has several solid linguistic communities with over a decade of common work and an international user group, the Tremendous Wiktionary User Group (TWUG).

No obvious, simple, quick path is known to get the best out of these two approaches, so this project does not come with any grand scheme to achieve that. Instead, it will proceed in small steps: experiment, gather feedback, improve, repeat.

Contributors wanted

This project specifically intends to help wiktionarian communities, so contributors from the different language versions would be warmly welcome. A simple hello on the talk page would already be greatly appreciated, and more thorough comments are encouraged.

We also specifically need people with:

  • knowledge to tweak the {{Probox}} used on this page to include this list (template gurus)
  • skills to spread the word both within Wikimedia circles and beyond (communication facilitators)
  • the will to formalize lexicological/lexicographic data models (ontologists)
  • interest in developing MediaWiki/Wikibase extensions (developers)
  • experience with Wikibase deployment and maintenance, especially of tools in Wikimedia Cloud Services (sysops)

Current focus

The project currently focuses on setting up a Wikibase instance on Wikimedia Cloud Services (WCS) and filling it with some quotes imported from wiktionarian projects. Quantitatively, the import is not expected to go beyond a few thousand items at most, if bots are used.

Please note that this first experiment will deliberately not include material such as definitions, grammatical classes, and so on. Focusing on quotations is a way to get something going, build a team with experience in deploying and maintaining a Wikibase instance, and transfer some wiktionarian data into it. That way, the whole project won't be stuck on the data modeling part before anything browsable can be shown. This approach nonetheless already requires a proper model for quotations. Luckily, the Structured Wikiquote project has already paved the way in this regard.
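As a rough illustration of what a first quotation import could look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Quotation` fields, the `MAX_ITEMS` cap and the `merge_duplicates` helper are hypothetical names, and they are not taken from the Structured Wikiquote model or from any Wikibase API. The sketch only shows how the same quotation found on several pages could be stored once, with a list of the pages that use it.

```python
from dataclasses import dataclass, field

# Hypothetical upper bound, echoing the "few thousand items" mentioned above.
MAX_ITEMS = 3000


@dataclass
class Quotation:
    """Illustrative quotation record; all field names are assumptions."""
    text: str
    language: str      # language code of the quoted text, e.g. "fr"
    source_wiki: str   # Wiktionary version the quote was first seen on
    used_on: set = field(default_factory=set)  # page titles citing the quote


def normalize(text):
    """Collapse whitespace and case so trivially different copies compare equal."""
    return " ".join(text.split()).casefold()


def merge_duplicates(raw_quotes, limit=MAX_ITEMS):
    """Merge (text, language, source_wiki, page) tuples into unique records.

    A quotation found on several pages becomes a single record listing
    every page that uses it, instead of several independent copies.
    """
    merged = {}
    for text, language, source_wiki, page in raw_quotes:
        key = (normalize(text), language)
        if key not in merged:
            if len(merged) >= limit:
                continue  # stay under the import cap: skip new quotations
            merged[key] = Quotation(" ".join(text.split()), language, source_wiki)
        merged[key].used_on.add(page)
    return list(merged.values())
```

Keeping one record per quotation, rather than one per page, is what would prevent the divergent copies described in the rationale; how such records map onto actual Wikibase items and properties is left open here.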

Roadmap

This section gathers some data on what has already been done and what is expected over the course of this project.

  • Done: data gathering about the possibility of hosting a Wikibase in WCS during Wikimedia Hackathon 2021
  • State of the art
    • Done: find out whether other initiatives have already made something around Wikibase and quotations
    • Done: fill the See also section below with related links
  • Structuring the project
    • Done: Meta page
    • Create some instant messaging room to discuss informally
  • Team building and community involvement
    • making wikimedians aware of the project
      • on wiki calls to join the project
      • spread the word on instant messaging platform and social media
        • Discord
        • Facebook
        • Telegram
        • Matrix
        • Twitter
        • Zulip
    • determine and announce needed skills and resources
  • Wikibase instance
    • deployment with required ontology to test import of quotes extracted from wiktionarian projects
  • Lexical data model
    • animate conversations around what is needed and ideas to match these requirements
    • work out at least one specific proposal, build a consensual proposal, refine into a data model
    • implement the data model, deploy on the Wikibase instance
    • test the model, specify what should import the data
    • call for more tests from community
  • Assessment of obtained results, determination of next steps

Data model proposals

Several data models (ontologies) and approaches might be envisioned to meet the requirements of in situ. Two main roads have already been identified:

  • a single Wikibase instance to be used by all Wiktionary versions and other projects, hence called trans situ;
  • one Wikibase instance for each language version of Wiktionary, hence called per situ.

Other approaches are still warmly welcome for now.
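To make the difference between the two roads concrete, the following Python sketch groups Wiktionary language versions by the Wikibase instance that would host their data under each approach. The instance labels (`in-situ-shared`, `in-situ-fr`, and so on) and the function names are purely hypothetical placeholders; no such instances are described on this page.

```python
from enum import Enum


class Approach(Enum):
    TRANS_SITU = "trans situ"  # one shared instance for all versions
    PER_SITU = "per situ"      # one instance per language version


def target_instance(language_version, approach):
    """Return a label for the Wikibase instance that would store the data.

    The labels are hypothetical placeholders, not real deployments.
    """
    if approach is Approach.TRANS_SITU:
        return "in-situ-shared"
    if approach is Approach.PER_SITU:
        return f"in-situ-{language_version}"
    raise ValueError(f"unknown approach: {approach!r}")


def plan_instances(language_versions, approach):
    """Group language versions by the instance that would host them."""
    plan = {}
    for version in language_versions:
        plan.setdefault(target_instance(version, approach), []).append(version)
    return plan
```

Under trans situ every version lands in the same group, so shared data only has to be entered once but the model must suit all communities; under per situ each version gets its own group and can evolve its model independently.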

Notes

References

Related discussions

See also

Participants

  • Csisc (talk) : volunteer. Interested to involve in Wikibase Installation and Lexical Data Modelling. 17:52, 31 May 2021 (UTC)
  • Psychoslave (talk) : coordinator, creator and volunteer. Interested in Wikibase Installation, Lexical Data Modelling, roadmap and team building. 10:14, 1 June 2021 (UTC)