Proposal for a Wiktionary proof of concept

From Meta, a Wikimedia project coordination wiki

As discussed in The need for XML re: wiktionary and Tables for Wiktionary, the Wiktionary content must be structured before it can be shared. This is a huge and complex task. We need to describe all kinds of data so that they can be stored in our database, we must keep contributing easy enough for a would-be contributor, and we have to produce XML or something like it to publish our content.

Doing all this in one go is a bit much, so I propose starting with something simpler: including the GEMET data in Wikimedia. This is a rich and important body of knowledge that we can use not only in Wiktionary but also in Wikispecies. We have been given the SQL definitions of the GEMET relational database, so we can adapt them to fit Wikimedia. GEMET also has its own XML definition.

We therefore have these important components:

  • We have an SQL definition that fits the data
  • We have the complete GEMET data available in XML format
  • It provides a subset of what is required to structure Wiktionary
  • The GEMET data is an important resource in its own right
  • It gives us the ability to handle open content glossaries and thesauri
  • It gives us an idea of how Wiktionary could or should evolve
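
To make the idea of pairing a relational definition with an XML export more concrete, here is a minimal sketch. It does not use the actual GEMET schema or the GEMET XML definition: the tables (concept, term, relation), the element names, and the sample rows are all hypothetical and only illustrate how multilingual thesaurus data might be stored in SQL and then published as XML.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema for illustration only; this is NOT the GEMET definition.
SCHEMA = """
CREATE TABLE concept (id INTEGER PRIMARY KEY);
CREATE TABLE term (
    concept_id INTEGER NOT NULL REFERENCES concept(id),
    lang       TEXT    NOT NULL,
    label      TEXT    NOT NULL,
    PRIMARY KEY (concept_id, lang)
);
CREATE TABLE relation (
    source_id INTEGER NOT NULL REFERENCES concept(id),
    kind      TEXT    NOT NULL,
    target_id INTEGER NOT NULL REFERENCES concept(id)
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO concept (id) VALUES (?)", [(1,), (2,)])
    conn.executemany(
        "INSERT INTO term (concept_id, lang, label) VALUES (?, ?, ?)",
        [
            (1, "en", "water"),
            (1, "nl", "water"),
            (2, "en", "groundwater"),
            (2, "nl", "grondwater"),
        ],
    )
    # Concept 2 has concept 1 as its broader concept.
    conn.execute(
        "INSERT INTO relation (source_id, kind, target_id) VALUES (?, ?, ?)",
        (2, "broader", 1),
    )
    conn.commit()
    return conn


def export_xml(conn):
    """Serialize every concept, with its terms and relations, as an XML string."""
    root = ET.Element("thesaurus")
    concept_ids = conn.execute("SELECT id FROM concept ORDER BY id").fetchall()
    for (cid,) in concept_ids:
        concept = ET.SubElement(root, "concept", id=str(cid))
        terms = conn.execute(
            "SELECT lang, label FROM term WHERE concept_id = ? ORDER BY lang",
            (cid,),
        ).fetchall()
        for lang, label in terms:
            ET.SubElement(concept, "term", lang=lang).text = label
        relations = conn.execute(
            "SELECT kind, target_id FROM relation WHERE source_id = ? "
            "ORDER BY kind, target_id",
            (cid,),
        ).fetchall()
        for kind, target in relations:
            ET.SubElement(concept, "relation", kind=kind, target=str(target))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(export_xml(build_demo_db()))
```

Keeping one row per concept and language in the term table means that adding a translation is a single insert, which is the kind of simplicity a contributor-facing interface would need. A real import would have to follow the definitions posted on Meta rather than this sketch.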

The SQL definitions are posted on Meta. I do not see how to add them to Bugzilla.