OmegaWiki modules

From Meta, a Wikimedia project coordination wiki

The OmegaWiki data design defines all the functionality that we would like to have in OmegaWiki. For many, however, it is a rather daunting picture to look at. This article has two purposes. The first is to define modules that can be implemented separately as part of the build-up to the full functionality of the UW. The second is to aid in understanding the data design and its functionality.

The core and the thesaurus functionality will be the first ones to be implemented.

Core functionality

This is the absolute bare minimum.

The core consists of the tables Language, Expression, Word, SynTrans, Meaning, and MeaningText. There is a relation from Language to Language to indicate orthographies and dialects.
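As a rough illustration of how the core tables might fit together, the sketch below models them as Python dataclasses. Every field name here (`parent_id`, `expression_id`, `spelling`, and so on) is an assumption made for the example and is not taken from the data design; the same goes for the idea that a Word points at an Expression and that SynTrans links a Word to a Meaning. The Meaning table is represented only by its identifier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Language:
    language_id: int
    english_name: str                # languages are initially defined in English
    parent_id: Optional[int] = None  # assumed Language-to-Language link (orthography or dialect)


@dataclass(frozen=True)
class Expression:
    expression_id: int
    language_id: int
    spelling: str


@dataclass(frozen=True)
class Word:
    word_id: int
    expression_id: int  # assumed link from Word to Expression


@dataclass(frozen=True)
class SynTrans:
    word_id: int
    meaning_id: int     # assumed link to a Meaning record


@dataclass(frozen=True)
class MeaningText:
    meaning_id: int
    language_id: int
    text: str           # definition of the Meaning in one language


def spellings_for_meaning(meaning_id, syntrans, words, expressions):
    """Return the spellings of every Word linked to a Meaning via SynTrans."""
    word_by_id = {w.word_id: w for w in words}
    expr_by_id = {e.expression_id: e for e in expressions}
    return [
        expr_by_id[word_by_id[s.word_id].expression_id].spelling
        for s in syntrans
        if s.meaning_id == meaning_id
    ]
```

In this reading, synonyms and translations are simply all the Words that share a Meaning, which is why a single lookup over SynTrans is enough to list them.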

Defining languages is needed from the very start. Initially languages will be defined in English. Priority will be given to the languages that have active Wikimedia projects. Translations can be added in the languages that have been defined in English.

When fields in tables require functionality that has not yet been developed, these fields will not be editable in the user interface.
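One way to picture this rule is a lookup from each field to the module it depends on. The sketch below is only an illustration: the mapping `FIELD_REQUIRES_MODULE`, the module identifiers, and the `expression` field name are hypothetical, with Gender and PartOfSpeech chosen because they belong to the Additional Word information module.

```python
# Hypothetical mapping: Word field -> module that must exist before it is editable.
FIELD_REQUIRES_MODULE = {
    "gender": "additional_word_information",
    "part_of_speech": "additional_word_information",
}


def editable_fields(fields, implemented_modules):
    """Return the fields the user interface should allow editing."""
    editable = []
    for field in fields:
        required = FIELD_REQUIRES_MODULE.get(field)
        # A field with no listed requirement is always editable.
        if required is None or required in implemented_modules:
            editable.append(field)
    return editable
```

With only the core implemented, `editable_fields(["expression", "gender"], {"core"})` would leave `gender` read-only until the Additional Word information module is added.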

Thesaurus functionality

This functionality accommodates the GEMET data. It uses the tables CollectionLanguage, Collection, CollectionMeaning, Relation, and RelationType.

The Relation and the RelationType table may be used in combination with the Table table.

The initial RelationTypes to be defined will be the ones used by GEMET.
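A minimal sketch of how Relation and RelationType could work together, assuming that a Relation connects two Meanings and carries a RelationType. The field names and the relation type names used in the usage note are placeholders for illustration and are not taken from the GEMET data or the data design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationType:
    relation_type_id: int
    name: str


@dataclass(frozen=True)
class Relation:
    relation_type_id: int
    from_meaning_id: int  # assumed: relations connect Meanings
    to_meaning_id: int


def related_meanings(meaning_id, type_name, relations, relation_types):
    """Return the Meanings reachable from meaning_id by one relation of the given type."""
    type_ids = {t.relation_type_id for t in relation_types if t.name == type_name}
    return [
        r.to_meaning_id
        for r in relations
        if r.from_meaning_id == meaning_id and r.relation_type_id in type_ids
    ]
```

For example, with a placeholder type named `"broader"`, `related_meanings(10, "broader", relations, types)` would list the Meanings recorded as broader than Meaning 10.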

User functionality

LanguageSelection and User

The functionality of showing content only for the languages defined in the LanguageSelection can only be built once these tables exist. The user preferences will be changed to allow users to select languages and to indicate their proficiency in them.
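The filtering described here can be sketched as follows. The representation is an assumption: expressions are shown as `(language_code, spelling)` pairs and the LanguageSelection as a plain list of language codes, neither of which comes from the data design.

```python
def visible_expressions(expressions, language_selection):
    """Keep only (language_code, spelling) pairs whose language the user selected."""
    selected = set(language_selection)
    return [(lang, spelling) for lang, spelling in expressions if lang in selected]
```

A user who selected English and Dutch would, under this sketch, see the `en` and `nl` entries of a page and nothing else.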

Additional Word information

Gender, PartOfSpeech, Label, LabelType, Table

The values needed in these tables will be initially filled in English. Translations can be added in the languages that have been defined in English.

This will allow the definition of the fields Gender and PartOfSpeech in the Word table. Label will be used to label verbs as being transitive or intransitive.

Inflection information

Inflection, InflectionBox, Inflection-Word

In Inflection, the possible inflections for a part of speech are defined. All possible variations for a particular part of speech have to be defined. The InflectionBox is used to define where a particular inflection is located in a grid. This table also allows for the definition of labels in this grid. The InflectionBox information is used to generate a box that shows the inflections for a verb.
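To make the grid idea concrete, the sketch below builds a simple box from cell positions and labels. It assumes that each inflection is stored with a zero-based row and column, and that row and column labels are available as lists; the function name and data layout are invented for the example.

```python
def render_inflection_box(cells, row_labels, column_labels):
    """Build the box as a list of rows.

    cells maps (row, column) -> inflected form; missing cells are left empty.
    """
    grid = [[""] + list(column_labels)]  # header row, top-left corner left blank
    for r, row_label in enumerate(row_labels):
        row = [row_label]
        for c in range(len(column_labels)):
            row.append(cells.get((r, c), ""))
        grid.append(row)
    return grid
```

For the present tense of the English verb "to be", rows could be labelled by person and columns by number, so that the cell at row 2, column 0 holds "is".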

When an inflection is entered, and it does not exist as a Word, an Expression, a Word and a SynTrans record are created. The SynTrans record will be related to the Meaning for the "headword". There will also be a need for a meaning for the inflection itself as it does translate to other languages in a specific way.
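The record creation described above might look like the following sketch. The storage is deliberately simplified and hypothetical: `db` is a dictionary of sets, and a Word is identified by its `(language, spelling)` pair. The separate Meaning for the inflection itself is not covered here.

```python
def add_inflection(db, language, spelling, headword_meaning_id):
    """Register an inflected form, creating records only if no Word exists for it yet.

    Returns True when new records were created.
    """
    key = (language, spelling)
    if key in db["words"]:
        return False  # the inflection already exists as a Word
    db["expressions"].add(key)
    db["words"].add(key)
    # The new SynTrans record is related to the Meaning of the headword.
    db["syntrans"].add((key, headword_meaning_id))
    return True
```

Calling the function twice with the same form leaves the database unchanged the second time, which matches the condition that records are only created when the inflection does not yet exist as a Word.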

NB: This module requires the Additional Word information module.

Validated expression

ValidExpression, Authority, Misspelling

For languages where an authority defines how a word is to be spelled, it is important to register which authority defined an expression as correct, and from what date until what date the expression was correct. Consequently, expressions were at times considered to be wrong; these misspellings were often rehabilitated later.
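A validity period of this kind could be checked as in the sketch below. The field names, the inclusive end date, and the use of `None` for "still valid" are assumptions; the spelling, authority, and dates in any usage are placeholders rather than real spelling decisions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ValidExpression:
    spelling: str
    authority: str
    valid_from: date
    valid_until: Optional[date] = None  # None means the spelling is still valid


def is_valid_on(record, day):
    """Return True if the authority considered the spelling correct on the given day."""
    if day < record.valid_from:
        return False
    # The end date is treated as inclusive in this sketch.
    return record.valid_until is None or day <= record.valid_until
```

Keeping both dates on the record makes it possible to ask whether a spelling was correct at the time a text was written, even if it has since been replaced or rehabilitated.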

The Misspelling table can be used to indicate frequent misspellings for languages that do not have a spelling authority.

This functionality is particularly important for the Dutch spelling changes that will be made public on 1 October 2005.

NB: This module requires the Additional Word information module.
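The module requirements noted on this page can be turned into an implementation order with a topological sort. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The module identifiers are invented for the example, and the entry stating that Additional Word information requires the core is an assumption based on the core being the bare minimum.

```python
from graphlib import TopologicalSorter

# module -> modules it requires
REQUIRES = {
    "core": set(),
    "additional_word_information": {"core"},  # assumption: everything builds on the core
    "inflection": {"additional_word_information"},
    "validated_expression": {"additional_word_information"},
}


def implementation_order(requires):
    """Return the modules in an order where requirements always come first."""
    return list(TopologicalSorter(requires).static_order())
```

The exact position of modules that do not depend on each other is not fixed, but any order returned places Additional Word information before both Inflection information and Validated expression.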

Etymology

Sound files

Mediafile

With all of the above modules in place, there should be enough information to start testing the migration of Wiktionary data into the UW database.