Community Wishlist Survey 2020/Wiktionary

From Meta, a Wikimedia project coordination wiki

Community Wishlist Survey 2020

Wiktionary
20 proposals, 30 contributors

The proposal phase has ended.
Come back on November 20 to vote on proposals!


What's in the newspaper today?

  • Problem: Wiktionarians cannot detect every newly used word in real time and add it as soon as it appears, even though examples of its use are accessible online.
  • Who would benefit: Contributors and readers
  • Proposed solution: Develop a tool that harvests online newspapers and records words that are missing from the Wiktionary databases.
  • More comments: This tool would have to be adapted for each language and/or resource. Darkdadaah created a similar tool and ran it from 2010 to 2013 for French.
  • Phabricator tickets:
  • Proposer: DaraDaraDara (talk) 14:54, 8 November 2019 (UTC)
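
As a rough illustration of the proposed harvesting step, here is a minimal Python sketch. It assumes the article text has already been downloaded and that the set of existing entries is available as a plain set of lowercase words; the function name find_missing_words, the sample lexicon, and the sample article are all hypothetical. A real tool would need language-specific tokenization, as the proposer notes.

```python
import re

def find_missing_words(article_text, known_words):
    """Map each word absent from known_words to the first sentence using it."""
    missing = {}
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", article_text.strip()):
        # Words are runs of letters, optionally joined by hyphens or apostrophes.
        for word in re.findall(r"[^\W\d_]+(?:[-'][^\W\d_]+)*", sentence):
            key = word.lower()
            if key not in known_words and key not in missing:
                missing[key] = sentence
    return missing

# Hypothetical data: a tiny lexicon and a one-paragraph article.
lexicon = {"the", "minister", "called", "plan", "a", "it", "is", "new"}
article = "The minister called the plan a brexiternity. It is new."
candidates = find_missing_words(article, lexicon)
```

Each candidate keeps the sentence it was found in, so that a contributor can review it as a possible example of use.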

Discussion

  • Harvesting newspapers is a great way to detect new words, and it helps to have selected sentences to add as examples, after some manual selection, since some sentences are correct but too long or too dependent on context. Thematic labelling may also help Wikinewsies and Wikipedians find more sources. Noé (talk) 11:27, 15 November 2019 (UTC)

Fix the AddAudio script to upload audio to Commons seamlessly

  • Problem: A great user script that allowed users to easily record audio of words on Wiktionary and upload it to Commons is broken. I have posted to phab:T206942 and asked for assistance at Commons and Wiktionary, to no avail.
  • Who would benefit: Anyone who wants to hear the audio at Wiktionary, particularly blind users or those who are learning foreign languages.
  • Proposed solution: My understanding is that there is a problem with the user script interacting with Commons' API. The script has already been written, so this would not mean starting from scratch, only tweaking how it saves files at Commons.
  • More comments:
  • Phabricator tickets: phab:T206942
  • Proposer: —Justin (koavf)TCM 19:49, 2 November 2019 (UTC)

Discussion

  • @Yair rand: courtesy ping —TheDJ (talkcontribs) 12:54, 4 November 2019 (UTC)
  • What about using Lingua Libre instead? You "just" need to write the code to support enwikt, and all recordings will be automatically uploaded to Commons and also to your wiki. Pamputt (talk) 17:51, 7 November 2019 (UTC)
    Lingua Libre bot already adds recordings to English Wiktionary. But it would be great to be able to use the Lingua Libre recording tool, and its metadata support, directly in Wiktionaries. Maybe this could be related to the proposal to develop a better display of recordings. Noé (talk) 10:01, 13 November 2019 (UTC)

Search in a lexicon

  • Problem: The search engine is not designed for dictionary needs.
  • Who would benefit: Contributors and readers
  • Proposed solution: Provide an internal anagram and advanced search, such as the one on French Wiktionary, to find anagrams but also words with a specific sound, or with a given sequence of letters and a given grammatical class, for example. It could be based on a parsed dump so that part of speech can be distinguished as well (only nouns, only verbs, etc.).
  • More comments:
  • Phabricator tickets:
  • Proposer: Lyokoï (talk) 16:07, 10 November 2019 (UTC)
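
To make the anagram part of the idea concrete, here is a minimal Python sketch. It assumes a parsed dump has already been reduced to (word, part of speech) pairs; the function names and the sample entries are hypothetical, and this is not how the French Wiktionary search is implemented.

```python
from collections import defaultdict

def build_anagram_index(entries):
    """Group (word, part_of_speech) pairs by their sorted-letter signature."""
    index = defaultdict(set)
    for word, pos in entries:
        signature = "".join(sorted(word.lower()))
        index[signature].add((word, pos))
    return index

def find_anagrams(index, word, pos=None):
    """Return anagrams of word, optionally restricted to one part of speech."""
    signature = "".join(sorted(word.lower()))
    matches = {
        w for w, p in index.get(signature, ())
        if w.lower() != word.lower() and (pos is None or p == pos)
    }
    return sorted(matches)

# Hypothetical entries, as they might come out of a parsed dump.
entries = [("listen", "verb"), ("silent", "adjective"),
           ("enlist", "verb"), ("tinsel", "noun")]
index = build_anagram_index(entries)
all_anagrams = find_anagrams(index, "listen")
verb_anagrams = find_anagrams(index, "listen", pos="verb")
```

Building the index once from a dump makes each lookup a single dictionary access, which is why a parsed dump is a plausible basis for this kind of search.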

Discussion

  • I feel this proposal would mainly target readers, as there are plenty of websites that reuse the Wiktionary database to offer this kind of adapted search tool. It is a real need. A UX study should provide more evidence of this. Noé (talk) 11:47, 15 November 2019 (UTC)

Two options for displaying categories

  • Problem: In dictionaries, words are normally displayed in alphabetical order. In Wiktionaries, when a category is displayed, only some of its words are listed in alphabetical order: those in subcategories are not displayed. It is often useful to see all words of the category.
  • Who would benefit: Everyone using categories.
  • Proposed solution: When displaying a category, add a button to be used when the user wants all words, including words in subcategories (and subsubcategories, etc.).
  • More comments:
  • Phabricator tickets:
  • Proposer: Lmaltier (talk) 20:49, 8 November 2019 (UTC)
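
The "all words" view amounts to walking the category tree and merging the members. A minimal Python sketch follows, assuming the category graph is available as two plain dictionaries; the names deep_category_listing, subcategories, and members, and the sample categories, are hypothetical. The case-insensitive sort is only a stand-in for whatever collation the wiki actually uses.

```python
def deep_category_listing(root, subcategories, members):
    """List every word in root and in all nested subcategories, sorted."""
    seen, stack, words = set(), [root], set()
    while stack:
        category = stack.pop()
        if category in seen:
            # Category graphs can contain cycles; visit each category once.
            continue
        seen.add(category)
        words.update(members.get(category, ()))
        stack.extend(subcategories.get(category, ()))
    return sorted(words, key=str.casefold)

# Hypothetical category data.
subcategories = {
    "French nouns": ["French proper nouns", "French countable nouns"],
    "French proper nouns": ["French nouns"],  # deliberate cycle
}
members = {
    "French nouns": ["chat", "pain"],
    "French proper nouns": ["Paris"],
    "French countable nouns": ["arbre", "chat"],
}
listing = deep_category_listing("French nouns", subcategories, members)
```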

Discussion

  • It may be similar to Deepcat, a script described as supported by some Wiktionaries. A good opportunity to make it core? Noé (talk) 16:06, 10 November 2019 (UTC)

Add breadcrumb/breadcrumb trail (graphical control element)

Discussion

  • Content in Wiktionaries is not often hierarchically organized, so the breadcrumb may be very short if it only considers the page in its environment. Do you think it should act as a page summary and be updated dynamically while scrolling down? Noé (talk) 15:46, 6 November 2019 (UTC)
    • Maybe while scrolling down; it seems a good idea, thanks! Fractaler (talk) 18:02, 7 November 2019 (UTC)

Display definitions from Wikisource dictionaries

  • Problem: Wiktionaries offer some definitions, but there are many ways to describe a meaning, and the current Wiktionary interface doesn't make it easy to display definitions from other dictionaries. Some are cited as references but are not accessible within Wiktionary.
  • Who would benefit: Wiktionary readers
  • Proposed solution: Many dictionaries are already in Wikisource, and we can use them to offer more definitions. A dedicated transclusion could automatically harvest entries that carry specific tagging in the dictionaries hosted on Wikisource. They could come from several Wikisources and be displayed in several Wiktionaries. This could be a new tab next to "Article" and "Talk", named "Dictionaries".
  • More comments: Some dictionaries are already properly tagged; for the others, it could be a good opportunity to do it, so that they can more easily be reused in open source projects.
  • Phabricator tickets:
  • Proposer: DaraDaraDara (talk) 14:32, 8 November 2019 (UTC)

Discussion

Provide full translatability of the Cognate dashboard

  • Problem: Currently, the Cognate dashboard interface can be translated manually. The community can provide a translation on wiki, that can be added by the developer on the dashboard.
Two problems exist:
  • the translation is not automated, an action has to be done by the developer to integrate or modify a translation
  • not all parts of the interface are translatable: the "dynamic" parts, such as column titles or action buttons, remain in English
  • Who would benefit: all users using the dashboard and who are not comfortable with English
  • Proposed solution: develop the Cognate dashboard to make it translatable
  • More comments:
  • Phabricator tickets: T202613
  • Proposer: Pamputt (talk) 18:11, 7 November 2019 (UTC)

Discussion

Find more active users

  • Problem: Small user communities (Wiktionary as well as others) have trouble finding and welcoming new users.
  • Who would benefit:
  • Proposed solution: Solve bug T234798 to improve MediaWiki's existing mechanism for finding "active" users (with 1 contribution in the last 30 days) by adding a threshold of at least N contributions.
  • More comments: When created in early October 2019, T234798 was categorized as a low priority. Hopefully, this wishlist survey could change that.
  • Phabricator tickets: phab:T234798
  • Proposer: LA2 (talk) 20:57, 23 October 2019 (UTC)
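
The requested filter can be sketched in a few lines of Python. This assumes the recent edits are available as (user, timestamp) pairs; the function name active_users, the parameter names, and the sample data are hypothetical and do not reflect how MediaWiki stores or queries this information.

```python
from collections import Counter
from datetime import datetime, timedelta

def active_users(edits, now, min_edits=10, days=30):
    """Return users with at least min_edits edits in the last `days` days."""
    cutoff = now - timedelta(days=days)
    counts = Counter(user for user, timestamp in edits if timestamp >= cutoff)
    return sorted(user for user, n in counts.items() if n >= min_edits)

# Hypothetical edit log: a regular newcomer, a one-edit vandal, a lapsed user.
now = datetime(2019, 10, 27)
edits = (
    [("Alice", now - timedelta(days=d)) for d in range(12)]
    + [("Vandal", now - timedelta(days=1))]
    + [("Bob", now - timedelta(days=40 + d)) for d in range(15)]
)
to_welcome = active_users(edits, now, min_edits=10)
current_behaviour = active_users(edits, now, min_edits=1)
```

With min_edits=1 the result matches the current definition of "active"; raising the threshold drops accounts with only one or two edits.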

Discussion

  • This is definitely a bug (and has been logged as such), but I'm curious as to how we could use the function for the purposes of the project. Could it be used to encourage further editing, for instance? Yannis | 13:09, 27 October 2019 (UTC)
It's not a bug, but a feature request. Today, when I look at the list of "active" users on Swedish Wiktionary, I get a list of 79 people who made at least one edit in the last 30 days. But many of these have done only one or two edits as part of vandalism. I want to find the people who made at least 10 edits to find newcomers that should be welcomed. It is possible to scroll through the 79 names and pick those with at least 10 edits. But the software could provide that filtering. --LA2 (talk) 20:18, 27 October 2019 (UTC)

Multiple collations per site

  • Problem: It is extremely common, on Wiktionary projects, to display entries of multiple languages on the same page. But only one collation can be used on a particular Wikimedia project. That means: if a site uses a language-specific collation, e.g. uca-default, which is an English- and Portuguese-friendly collation, all categories of e.g. Swedish words will sort words starting with Å under A, because in English Å is considered to be the letter A with a diacritic, while in Swedish it is a separate letter (sorted near the end of the alphabet). Category headers are therefore incorrect for many languages with the current solution used on Wiktionary projects.
    Currently, a way to circumvent the problem is to use the default MediaWiki collation (namely uppercase), but this requires sort keys in all English/French/etc. entries with a diacritic in the title, such as Å, É, etc., because every letter with a diacritic gets its own header in categories. The result is a huge number of sort keys in pages to bypass this behavior (and thus sort Å under A for e.g. English), which makes Wiktionary projects less readable and editable for newcomers.
  • Who would benefit: users of Wiktionary categories, and new editors to all Wiktionary projects
  • Proposed solution: allow multiple collations per site, and therefore collation to be specified per category: uca-sv should be used for Swedish-related categories, uca-es for Spanish cats, uca-default for English (and similar), etc.
  • More comments: Liangent and Bawolff have worked on this in the past, but feasibility also seems to depend on sysadmins (because of increased system load).
  • Phabricator tickets: phab:T30397
  • Proposer: Automatik (talk) 21:58, 23 October 2019 (UTC)
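
The difference between the two orderings can be shown with a small Python sketch. The key functions below are rough approximations written for illustration only: they are not the UCA collations used by MediaWiki, and the names default_key, swedish_key, COLLATIONS, and sort_category are hypothetical. Only the collation names uca-default and uca-sv come from the proposal.

```python
import unicodedata

def default_key(title):
    """Approximate an English-style collation: diacritics are ignored."""
    decomposed = unicodedata.normalize("NFD", title.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"

def swedish_key(title):
    """Approximate Swedish order: å, ä, ö are separate letters after z."""
    text = unicodedata.normalize("NFC", title.casefold())
    # Characters outside the alphabet get -1 and therefore sort first.
    return [SWEDISH_ALPHABET.find(ch) for ch in text]

# One key function per collation name; a category would pick one of them.
COLLATIONS = {"uca-default": default_key, "uca-sv": swedish_key}

def sort_category(titles, collation):
    return sorted(titles, key=COLLATIONS[collation])

titles = ["zebra", "åka", "apa", "öra"]
english_order = sort_category(titles, "uca-default")
swedish_order = sort_category(titles, "uca-sv")
```

Under the default-style key, åka sorts with the A words; under the Swedish-style key it moves after zebra, which is the behavior the proposal asks to select per category.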

Discussion

  • This proposal is a rerun of the 2019 proposal, still topical. — Automatik (talk) 21:58, 23 October 2019 (UTC)
  • It's not up to me to decide (so this is not official in any way, shape or form) but, in my opinion, I don't think there are scalability concerns with allowing collations to be set on a per-category basis, provided any individual category only has one collation (e.g. there is a magic word to say that this category is French or German or whatever. You can only specify one; you don't, for example, have a drop-down where you can view a category with different collations on the fly (like is wanted in zh)). Bawolff (talk) 04:24, 24 October 2019 (UTC)
  • This should be merged with Community Wishlist Survey 2020/Wiktionary/Context-dependent sort key. Urhixidur (talk) 14:13, 25 October 2019 (UTC)
    • Note, this proposal is probably a lot easier than the other proposal and seems to be solving a different problem. I would recommend against merging them. Bawolff (talk) 19:25, 25 October 2019 (UTC)
  • This feature is sorely needed. Currently in the English Wiktionary we use a sort_key value for each language in our language data modules that describes how to generate sortkeys from page titles, and is used by the makeSortKey method of our Language objects. The sortkeys are generated inside many different templates, and are used in category links, and to sort lists of links to entries (for instance in Template:col3). The generated sortkeys are not always able to make categories sort correctly, as described in the proposal.

    An extension of this proposal would be allowing definition of custom collations. Some languages probably do not have a collation system (not sure of the correct terminology) available, such as Egyptian (which in Wiktionary mostly uses a transliteration system rather than hieroglyphs). The desired sort order for the Egyptian transliterations (ꜣ j y ꜥ w b p f m n r h ḥ ḫ ẖ z s š q k g t ṯ d ḏ) is so different from the order of code point values (b d f g h j k m n p q r s t w y z š ḏ ḥ ḫ ṯ ẖ ꜣ ꜥ), which is presumably used in Category:Egyptian lemmas, that a custom sortkey cannot work. We can sort lists of links by generating a sortkey for each link with a module (Module:egy-utilities), but the Egyptian module cannot be used in categories because the sortkeys would put nonsensical code points in the category headers. (The sortkey-generating function works by replacing the characters in the transliteration with arbitrary code points that have the correct sort order.) So getting Egyptian categories to sort correctly requires a custom collation system.

    Another idea would be to make collation available in a Lua library for Scribunto. At minimum what would be required is a function to compare two strings using a collation and yield values indicating "greater than", "less than", or "equal" (like strcmp in C), which could be adapted for use by table.sort (which requires a function returning a truthy value that indicates whether argument 1 is less than argument 2). Then we could sort lists of links using the same collations used in categories whenever possible, rather than using module-generated sortkeys. This might not depend on the implementation of the "multiple collations" proposal, but if custom collations were implemented, ideally they would be available in the Lua library.

    I haven't submitted the first idea as a separate proposal because it depends on the "multiple collations" proposal, but perhaps I should submit the second one. — Erutuon (talk) 20:53, 7 November 2019 (UTC)
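
    Both ideas in this comment can be illustrated with a short sketch. It is written in Python rather than Lua purely for illustration, and the names EGYPTIAN_ORDER, RANK, and compare are hypothetical. The alphabet order is the one quoted above; the sample words are illustrative transliterations. The comparison function returns a negative, zero, or positive value, like strcmp in C, and is then adapted for sorting.

```python
import unicodedata
from functools import cmp_to_key

# Desired order for Egyptian transliteration, as quoted in the comment above.
EGYPTIAN_ORDER = unicodedata.normalize("NFC", "ꜣjyꜥwbpfmnrhḥḫẖzsšqkgtṯdḏ")
RANK = {ch: i for i, ch in enumerate(EGYPTIAN_ORDER)}

def compare(a, b):
    """Return -1, 0 or 1 depending on how a and b order in the custom collation."""
    # Characters outside the alphabet sort after all known letters.
    key_a = [RANK.get(ch, len(RANK)) for ch in unicodedata.normalize("NFC", a)]
    key_b = [RANK.get(ch, len(RANK)) for ch in unicodedata.normalize("NFC", b)]
    return (key_a > key_b) - (key_a < key_b)

words = ["ḏd", "nfr", "ꜣḫt", "jb"]
custom_order = sorted(words, key=cmp_to_key(compare))
code_point_order = sorted(words)
```

    The code point order puts ꜣḫt last, while the custom order puts it first, which is the mismatch the comment describes for Category:Egyptian lemmas.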

Adopt Lingua Libre Bot service as a tool

  • Problem: Lingua Libre is a great service for recording the pronunciation of words and, now, of Lexemes at Wikidata. When you record them, they are uploaded to Commons, and Lingua Libre Bot adds them to the corresponding word/lexeme. But this bot is maintained by a volunteer, and it seems it can sometimes be stopped for weeks. This service should be adopted as a WMF tool to make it more stable and not dependent on one user's availability.
  • Who would benefit: People wanting to record pronunciations
  • Proposed solution: Add it to Toolserver and run it independently.
  • More comments:
  • Phabricator tickets:
  • Proposer: Theklan (talk) 11:01, 27 October 2019 (UTC)

Discussion

IMHO, the problem is more general. In the long term, LinguaLibreBot and the Lingua Libre website itself have to be maintained, to add more features and to fix bugs. I think Wikimédia France is starting to think about this problem (Eavqwiki may comment more). Pamputt (talk) 17:20, 1 November 2019 (UTC)

There is already a grant, although the solution is different. It will use OAuth, so no bot will be needed. On the talk page, the developer writes: "the files will be transferred from LinguaLibre to Commons using an OAuth authorization". So in the future a bot will not be used, and this issue will be fixed in a different way than the proposer envisaged. (Grants have a lifetime of 12 months)--Snaevar (talk) 12:37, 11 November 2019 (UTC)

Change the color of the interwiki link according to the sections of the linked page

  • Problem: Cognate is now deployed on all Wiktionaries to manage interwiki links. Currently, all links are blue. The suggestion is to color an interwiki link differently depending on whether the linked page has a section for the current wiki's language.
Example:
I'm on the French Wiktionary, on the page "pain". There is an interwiki link to the English Wiktionary because a page "pain" also exists there.
On this en:pain, there are several sections for several definitions in several languages. There is a section about French word "pain".
Then, on the French page, the link will be blue.
Otherwise, if no French section exists on the English page, then on the French page, the link to the English page will be (for example) green.
  • Who would benefit: this informs users that even though a page exists on another Wiktionary, it has no section about their own language, so they could go there and add it. It would encourage contribution, including cross-wiki contribution.
  • Proposed solution: improve the Cognate extension or develop a new gadget.
  • More comments: a list with the formatting of all the Wiktionaries has already been built and is available here.
  • Phabricator tickets: T150841
  • Proposer: Pamputt (talk) 18:07, 7 November 2019 (UTC)
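
The rule in the example can be sketched in Python. The sketch assumes the linked page marks languages with level-2 headings such as ==French==, which is only one of the formats in use across Wiktionaries; the function names, the sample wikitext, and the color green (given in the proposal only as an example) are illustrative.

```python
import re

def language_sections(wikitext):
    """Collect the titles of level-2 headings, assumed to be language names."""
    headings = re.findall(r"^==([^=].*?)==\s*$", wikitext, flags=re.MULTILINE)
    return {heading.strip() for heading in headings}

def interwiki_link_color(local_language, target_wikitext):
    """Blue if the linked page covers the local language, green otherwise."""
    if local_language in language_sections(target_wikitext):
        return "blue"
    return "green"

# Hypothetical wikitext of two linked pages on English Wiktionary.
page_with_french = "==English==\n===Noun===\n# ache\n==French==\n===Noun===\n# bread"
page_without_french = "==English==\n===Noun===\n# ache"
color_1 = interwiki_link_color("French", page_with_french)
color_2 = interwiki_link_color("French", page_without_french)
```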

Discussion

  • Hi! I don't understand when the link is supposed to be blue. Is it only when there are, on both sides, the same language sections? Lepticed7 (talk) 09:01, 11 November 2019 (UTC)
    Well, when there is at least a section for the base language of the source version; i.e. on French Wiktionary, the link should be blue when French is among the languages described on the linked page. If it isn't, a red link may be used instead of a new color. Noé (talk) 11:54, 15 November 2019 (UTC)
This is an interesting functionality. If possible, we could also customize the user's preferred language, e.g. German, so a user editing the English/French Wiktionary would know if other Wiktionaries, e.g. Italian have the German word or not. KevinUp (talk) 11:54, 19 November 2019 (UTC)

Wikicode variables usable in and between templates

  • Problem: Wiktionaries organize their pages using a section for each language. However, in each template of each language section, we must add the language code, which leads to errors and a lot of repetition.
  • Who would benefit: All projects needing to define a value and to use it several times. Also it would make contribution for beginners much easier.
  • Proposed solution: Be able to define variables, to assign and reassign values and to use them in templates.
  • More comments:
    • As an example, here's what I thought for French Wiktionary:
      1. Define and assign the language code to the variable lang at the beginning of the section.
      2. For each template which needs the language code, use the variable lang in the template's source code.
    • The templates of the French Wiktionary which would benefit from this proposition are:
      • the pronunciation templates pron, phon and phono
      • the stub templates such as ébauche, ébauche-étym, ébauche-exe, etc...
      • the domain and lexicon templates such as poissons, apiculture, mathématiques, etc...
  • Phabricator tickets:
  • Proposer: Lepticed7 (talk) 15:15, 10 November 2019 (UTC)

Discussion

  • Having this kind of variable, or being able to identify a section of a page as describing a specific language, could help a lot in dealing with multilingualism. The challenge here is not to wrap the whole section in a template, which would make it harder to edit, but to tag a segment where a language code is valid. It could clarify the wikicode a lot. Noé (talk) 15:39, 10 November 2019 (UTC)

Sections reorder tools

  • Problem: When I work on a very long page containing many sections, such as a thesaurus, it is not easy to change the section order to improve the page layout. The whole page has to be edited, and meanwhile anybody can edit a section and cause an edit conflict.
  • Who would benefit: contributors
  • Proposed solution: It would be useful to have, in graphical editing mode, two buttons on each section (h2, h3, h4...): up and down. The advantage is a simple way to improve page layout, which also limits the history entry to a simple comment such as "section reorder".
  • More comments:
  • Phabricator tickets:
  • Proposer: Jpgibert (talk) 20:30, 10 November 2019 (UTC)
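
What an "up" button would do to the wikitext can be sketched as follows. This is a minimal Python illustration working on a list of lines, not a description of any existing editor feature; the names HEADING, section_end, and move_section_up, and the sample page, are hypothetical. A section is moved together with its subsections, above the previous section of the same level.

```python
import re

# A heading line such as "==Title==" or "===Title===".
HEADING = re.compile(r"^(=+)[^=].*?\1\s*$")

def section_end(lines, start, level):
    """Index of the first line after the section that starts at `start`."""
    for i in range(start + 1, len(lines)):
        match = HEADING.match(lines[i])
        if match and len(match.group(1)) <= level:
            return i
    return len(lines)

def move_section_up(lines, start):
    """Swap the section at `start` with its previous sibling, if it has one."""
    level = len(HEADING.match(lines[start]).group(1))
    end = section_end(lines, start, level)
    for prev in range(start - 1, -1, -1):
        match = HEADING.match(lines[prev])
        if match and len(match.group(1)) < level:
            break  # already the first section under its parent
        if match and len(match.group(1)) == level:
            return lines[:prev] + lines[start:end] + lines[prev:start] + lines[end:]
    return list(lines)

page = ["==Synonyms==", "* a", "==Antonyms==", "===Notes===", "* b", "==See also=="]
reordered = move_section_up(page, 2)
```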

Discussion

  • Jpgibert, when you say "graphic modification", do you mean VisualEditor? Noé (talk) 14:12, 15 November 2019 (UTC)
    Yes, Noé. I forgot the suitable term when I wrote. Thanks for your help. Jpgibert (talk) 14:40, 15 November 2019 (UTC)
    You're welcome. Note that French Wiktionary is the only Wiktionary that offers VisualEditor; the other Wiktionaries do not use it. So your suggestion should be tied to the proposal to adapt VisualEditor to Wiktionaries. (This reply was also given in French, as both participants are French speakers.) Noé (talk) 15:24, 15 November 2019 (UTC)

Create memory games for words in watchlist

  • Problem: The watchlist has a fairly limited function.
  • Who would benefit: Those who seek to improve word recollection of "favorited" entries.
  • Proposed solution: Send the watchlist to another site that draws on the list and randomly selects words for memory games.
  • More comments:
  • Phabricator tickets:
  • Proposer: Clicero (talk) 21:28, 24 October 2019 (UTC)

Discussion

@Clicero: it looks like a 2019 proposal. Would this proposal fit your needs (IMHO, the 2019 proposal is broader)? If so, I will update your proposal to take into account information given last year. Pamputt (talk) 17:11, 1 November 2019 (UTC)
    • Yes, but as you said, that proposal seems a bit broader, so I'd like to add that it would be nice if memory games that randomize words from the watchlist were included. Clicero (talk) 18:34, 1 November 2019 (UTC)

@Clicero: the English name for this is flashcards, perhaps people will understand you better if you use it. MaxSem (WMF) (talk) 19:25, 8 November 2019 (UTC)

Context-dependent sort key

  • Problem: In most Wiktionary projects, words of different languages share a page if their spellings are identical. Currently, the magic word DEFAULTSORT works for an entire page, which means we cannot define a default sort key for each language in the same page. That is an issue especially for Chinese, Japanese and Korean (hanja). They share characters but their sort keys are totally different (radicals or pinyin for Chinese, kana for Japanese, hangeul for Korean). If it is allowed to define a default sort key for each section, it will be much easier to correctly categorize pages.
  • Who would benefit: Editors of Wiktionary, especially those who edit Chinese and Japanese entries.
  • Proposed solution: Introduction of a new magic word, say, SECTIONSORT, that works for all categories after it up to the next usage of the same magic word. SECTIONSORT should override DEFAULTSORT if both are defined. The use of SECTIONSORT without a sort key should clear the previous sort key (and should not define an empty sort key).
  • More comments: see Community Wishlist Survey 2017/Wiktionary/Context-dependent sort key for a discussion in 2017. It is still a problem.
  • Phabricator tickets: phab:T183747
  • Proposer: TAKASUGI Shinji (talk) 12:19, 11 November 2018 (UTC)
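
The proposed rules can be sketched as a small Python simulation. SECTIONSORT is the magic word proposed here and does not exist in MediaWiki; the token format, the function name resolve_sort_keys, and the category names are hypothetical. The sketch assumes that DEFAULTSORT applies to the whole page, that the last DEFAULTSORT on the page wins, and that the page title is the fallback key.

```python
def resolve_sort_keys(page_title, tokens):
    """Return (category, sort key) pairs under the proposed SECTIONSORT rules."""
    # DEFAULTSORT applies to the whole page; assume the last one wins.
    default_key = None
    for kind, value in tokens:
        if kind == "DEFAULTSORT":
            default_key = value
    section_key = None
    result = []
    for kind, value in tokens:
        if kind == "SECTIONSORT":
            # A SECTIONSORT without a key clears the previous section key.
            section_key = value or None
        elif kind == "CATEGORY":
            # SECTIONSORT overrides DEFAULTSORT, which overrides the title.
            result.append((value, section_key or default_key or page_title))
    return result

# Hypothetical page content for the entry used as an example in the discussion.
tokens = [
    ("SECTIONSORT", "にほん"),
    ("CATEGORY", "Japanese proper nouns"),
    ("SECTIONSORT", "일본"),
    ("CATEGORY", "Korean proper nouns"),
    ("SECTIONSORT", None),
    ("CATEGORY", "Translingual terms"),
]
sort_keys = resolve_sort_keys("日本", tokens)
```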

Discussion

How will it be visible in the category? Sections can't be added to a category. --Wargo (talk) 21:48, 16 November 2018 (UTC)

Currently, one adds a sort key to an entire page. The goal of this proposal is to allow more than one sort key per page: one per section; e.g. one sort key for the Chinese section of an entry, one sort key for the Japanese section of the same entry, etc. This is because the same word may not be sorted the same way in different languages, and Wiktionaries often have entries from multiple languages on the same page, as a page corresponds to a specific spelling (which may occur in multiple languages). — Automatik (talk) 14:06, 20 November 2018 (UTC)
Notifying Wargo. Automatik (talk) 14:07, 20 November 2018 (UTC)

See also my somewhat related proposal (I keep missing the deadline) Community Wishlist Survey 2017/Archive/Allow multiple entries within each category. Urhixidur (talk) 13:30, 17 November 2018 (UTC)

  • I've been thinking a bit about this. The problem here is that you have multiple types (languages) of content inside a single page, with a single title. The page https://en.wiktionary.org/wiki/日本#References for instance (quoted as an example in the ticket) is English. And therefore all categorisation of the page is based on the English title of the page (even though the title is not in the English language). This is a fundamental problem (a mismatch with the wiki page concepts). It really means that the entire system should be changed to make use of MCR and specialised MW content handlers, so that more semantic info can be extracted from the page. (Like how Wikidata deals with different types of information in a single page.) And then on top of that, you could have a category be in a certain language, and the category could use the correct sort key for a page, by referring to the information of the applicable 'language section' inside the page. —TheDJ (talkcontribs) 11:25, 6 November 2019 (UTC)
    • To further clarify, the community has laid meaning (a convention) into some of the content, which MW cannot capture for them. When you want software features that make use of those meanings, that meaning first has to be machine-extractable (at scale) before we can do things with it that go beyond 'a simple wiki page that complies with the assumptions of the original Wikipedia'. —TheDJ (talkcontribs) 11:28, 6 November 2019 (UTC)
      If I got your idea right, you are saying that "Page content language" in Page information should be able to deal with more than one language, through specific tagging in the page or by using a template for language section titles. Then, the ordering for each language could be fixed in MediaWiki. I think this is another way to solve the same issue, and maybe a more MediaWiki-centered one. Noé (talk) 10:05, 9 November 2019 (UTC)
      This is not what I understand. For en.wikt, the "Page content language" is always English (for apple as well as for pomme or Apfel), for fr.wikt, it's always French, etc. Anyway, there is no such issue with the "multiple collations" proposal. Lmaltier (talk) 13:57, 10 November 2019 (UTC)
  • This proposal seems to become useless if the "Multiple collations per site" proposal is adopted (i.e. a magic word stating the language for each category). Or am I missing something? Lmaltier (talk) 20:27, 8 November 2019 (UTC)
    It is mainly for Japanese and optionally for Chinese and Korean (hanja). You cannot generate a correct sortkey for each language in a page of Chinese characters. In the example above, the correct sortkey for 日本 is “にほん” for Japanese and “일본” for Korean. You can have only one default sortkey now. — TAKASUGI Shinji (talk) 23:08, 10 November 2019 (UTC)

Insert attestation using Wikisource as a corpus

  • Problem: Wiktionary definitions rely on attestations: sentences from corpora that illustrate the usages and meanings of words. Wikisource is an excellent corpus for Wiktionaries, especially for classic uses, but it is not easy to search its texts for a specific word. At present, the reference for the sentence has to be copied and pasted by hand, which is a long and tedious way to contribute; as a result, few quotations come from Wikisource (less than 3% for French Wiktionary).
  • Who would benefit: Readers of Wiktionaries would find more examples of usage and a way to access the whole source directly in Wikisource. Contributors to Wiktionaries would have an enjoyable way to add attestations, similar to the Insert media tool that digs into Wikimedia Commons, and the community may grow with new people who like to add sentences from their reading. Editors of Wikisource would have a new way to shed light on their Sisyphean work. The visibility of both projects in search engines would increase with more links between them. The global audience of both projects may increase with more connectivity. Other projects may also benefit from this feature, such as Wikipedia, to add quotations to authors' pages.
  • Proposed solution: This feature is inspired by Insert media but targets Wikisource instead of Wikimedia Commons. So, instead of a snippet search offering pictures, Insert attestation would display a list of sentences from a targeted Wikisource (in the same language as the source project or another one) that include the targeted sequence of characters. There is no matching on meaning or proximity: exact results only, to keep it simple. In the displayed snippet of results, an editor would just grab a sentence with a single click, and it would be added with the adequate sources picked from the Wikidata item associated with the Wikisource page. The feature would copy the sentence (no transclusion) and the source of the sentence (optimally adding the page number in the original manuscript, i.e. "page 35."). This feature may need a specific parser to identify sentence boundaries and to bold the targeted sequence of characters.
  • More comments: This feature/tool/functionality should be accessible through the WikiText editor and VisualEditor. It may be interesting to keep track of the reuse of Wikisource content in other projects with a specific What links here from Wiktionary to Wikisource, similar to the Wikimedia Commons indication of reuse in other projects, but this could be part of another development. This idea was suggested in 2019 with 36 supports, in 2017 and supported by 32 people, a draft was suggested in 2016 with 19 supports and this idea was coined first in a MediaWiki discussion.
  • Phabricator tickets: T139152, T157802
  • Proposer: Noé (talk) 07:33, 22 October 2019 (UTC)
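
The search-and-bold step described in the proposed solution can be sketched in Python. This is a simplified illustration that assumes the Wikisource text is already available as a plain string; the function name find_attestations and the sample text are hypothetical, the sentence splitter is naive, and the lookup of source information from Wikidata is left out. Matches are whole-word and exact, and the target is bolded with wikitext triple apostrophes.

```python
import re

def find_attestations(text, target):
    """Return sentences containing the exact word `target`, with it bolded."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Whole-word, exact match: searching "teeth" must not return "tooth".
    pattern = re.compile(r"\b" + re.escape(target) + r"\b")
    return [
        pattern.sub(lambda m: "'''" + m.group(0) + "'''", sentence)
        for sentence in sentences
        if pattern.search(sentence)
    ]

# Hypothetical source text.
text = "He lost two teeth. A tooth is hard. Her teeth were white!"
attestations = find_attestations(text, "teeth")
```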

Discussion

  • I support the idea. If it becomes possible to add examples from Wikisource to Wiktionary, it will be great! It will be a real integration of two projects: Wiktionary and Wikisource. Solving this problem requires splitting Wikisource text into sentences, lemmatizing words, and creating tables with links from lemmas and word forms to texts. --Andrew Krizhanovsky (talk) 10:34, 24 October 2019 (UTC)
    In my opinion, it is not necessary to lemmatize, as inflected forms can be good examples for inflected-form entries. There is an entry for each form in Wiktionaries, so if you look for teeth, you don't want tooth in the results. Noé (talk) 13:29, 24 October 2019 (UTC)
    Agree with Noé--So9q (talk) 20:51, 24 October 2019 (UTC)
    I want the user to be able to choose: (1) search for sentences with all word forms of the lemma, or (2) search only this word form (strict search). --Andrew Krizhanovsky (talk) 18:22, 1 November 2019 (UTC)
  • I would prefer a tool like [1] for this task. Anyone can adapt or improve it by adding the Wikisource Search API as a backend. Example search for "meaning".--So9q (talk) 20:51, 24 October 2019 (UTC)
    Would you accept making a tool like that official and integrated by default, as suggested in the proposal? That's what the proposal is really about for me—making it easily accessible to all and not just custom JS users (who are a small fraction of Wiktionary editors, let alone readers). Yannis | 13:15, 27 October 2019 (UTC)
  • Strong support. I have done this manually many times, especially for words I first encounter at en.ws. This is a great proposal. —Justin (koavf)TCM 19:41, 2 November 2019 (UTC)
  • My idea is: on page Foo (1), clicking on something runs a search in Wikisource for sentences containing the word foo (2). Then the editor must check whether the word is used in the correct context/sense and select the part to copy to some input field (3). Sometimes corrections are needed: (…), shortening a long sentence, adding a missing subject from the previous sentence... Then click OK, and there is the example (4) with a reference.
    1. I have word cs:wikt:pitel,
    2. Search on Wikisource gives me some examples
    3. I select one of them - sentence Od dávna trvající věrný pitel vína dobrého. from Paměti
    4. I got #* {{Příklad|cs|Od dávna trvající věrný pitel vína dobrého.}}<ref>Mikuláš Dačický z Heslova: [[s:Paměti/1601–1605|Paměti]]</ref> for copying to Wiktionary.

JAn Dudík (talk) 20:52, 7 November 2019 (UTC)

  • I agree with your description, but I also think it could be done in VisualEditor without even seeing any wikicode. It could be user-friendly and easily accessible for new users, like "Add translation" in some Wiktionaries, or like "Insert Media", which is very easy to use in Wiktionary. - Noé (talk) 16:47, 8 November 2019 (UTC)
    But because Wiktionary pages are mostly from various templates, VE is hardly usable in Wiktionary [2]. JAn Dudík (talk) 10:06, 10 November 2019 (UTC)
    French Wiktionary uses it. It requires documenting every template with TemplateData, and it still adds several unnecessary line breaks, but it is possible to use it. Noé (talk) 11:00, 11 November 2019 (UTC)
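
The workflow discussed in this thread (search Wikisource for a word, pick a sentence, produce a citation line) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Passage` class, the function names and the naive sentence splitter are hypothetical, the texts are assumed to be already fetched, and no real Wikisource or MediaWiki API is called. Passing a single word form gives the strict search; passing all forms of a lemma gives the broader search, so the caller can choose between the two options raised above.

```python
import re
from dataclasses import dataclass


@dataclass
class Passage:
    """A piece of Wikisource text with the metadata needed for a reference (hypothetical)."""
    author: str
    page: str   # Wikisource page title
    label: str  # link text shown in the reference
    text: str


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_examples(passages, forms):
    """Yield (passage, sentence) pairs where the sentence contains one of `forms` as a whole word.

    One form = strict search; all forms of a lemma = lemma search.
    """
    alternatives = "|".join(re.escape(form) for form in forms)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    for passage in passages:
        for sentence in split_sentences(passage.text):
            if pattern.search(sentence):
                yield passage, sentence


def format_example(passage, sentence, lang="cs", template="Příklad"):
    # Builds a line shaped like the one in step 4; template name and language are parameters.
    quote = "#* {{" + template + "|" + lang + "|" + sentence + "}}"
    ref = "<ref>" + passage.author + ": [[s:" + passage.page + "|" + passage.label + "]]</ref>"
    return quote + ref


if __name__ == "__main__":
    source = Passage(
        author="Mikuláš Dačický z Heslova",
        page="Paměti/1601–1605",
        label="Paměti",
        text="Od dávna trvající věrný pitel vína dobrého. Jiná věta bez hledaného slova.",
    )
    for passage, sentence in find_examples([source], ["pitel"]):
        print(format_example(passage, sentence))
```

The manual step stays with the editor: the sketch only lists candidate sentences, and a person still has to check the sense and shorten or correct the quotation before saving.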

Adapt visual editor to wiktionaries

  • Problem: The visual editor is the same as on Wikipedia, but Wiktionary pages are structured in a totally different way
  • Who would benefit: Anyone who wants to create or edit wiktionary pages
  • Proposed solution: Install a visual editor which understands the structure of wiktionary pages and proposes tools that handle the wiktionary templates. This will help editors a lot, especially the beginners. The most obvious features are:
  1. Handle multiple languages in the same page
  2. Standard sections that are present in all pages (e.g. the French wiktionary always has 1. Etymology 2. Form and pronunciation 3. Definitions and example sentences 4. translations)
  3. Definitions start with one or more templates
  4. Optional sections have a predefined order
  5. Configurable character sets are needed to build pronunciations
  6. A tool to build sources from one of the existing models, to put at the end of example sentences
  7. A tool to add links and lists of links to a section, many sections contain structured and ordered lists of links to other pages (words)
  8. Links to Wikipedia and other sister projects
  • More comments: All those features need to be configurable as each wiktionary uses different structures and conventions
  • Phabricator tickets:
  • Proposer: Romainbehar (talk) 21:44, 8 November 2019 (UTC)

Discussion

  • The solution for this problem may also involve some UX work, to frame a branch of VisualEditor that fits the specific needs of Wiktionarians. Also, in French Wiktionary we have a gadget that uses a form for the creation of new entries; it could be a good indication of our needs for this kind of contribution. It's CreerNouveauMot. A related proposal, Visual entry form, suggests also using more preloads to help beginners get the template of the page. It could also be done through a dynamic questionnaire asking questions such as "Is it for English?" Yes/No "Is it for a noun, verb, adverb, etc.?" "Is it still in use?" (if not, add "old" at the beginning of a definition), etc. Noé (talk) 10:00, 9 November 2019 (UTC)
  • @Romainbehar: I am sorry for contacting you last minute... voting starts tomorrow! I believe there may be a problem with this proposal. As you say, each Wiktionary uses a different format so it will be very difficult if not impossible to make a tool that works for all of them. Additionally, VisualEditor in particular is a behemoth of software, and customizing it for Wiktionary is probably outside our scope. However, we could make a gadget that allows for easy addition/modification of Wiktionary entries, but this would have to be specifically for your wiki. Would you like to reword the proposal to be solely for French Wiktionary? MusikAnimal (WMF) (talk) 03:17, 20 November 2019 (UTC)
  • @MusikAnimal: Thanks for answering my request. I understand that VisualEditor is a large piece of software and doesn't fit in your current scope, but creating a gadget is far from my initial idea: using VisualEditor in the French Wiktionary is almost useless, as it just allows modifying text (sometimes it's more difficult than expected) and doesn't help much with Wiktionary-specific templates and structures. The French Wiktionary already has gadgets, such as the one named CreerNouveauMot, widely used to ease the creation of new pages. I'll create another entry to improve that gadget. Out of curiosity, could you give me links to the VisualEditor community and code? Romainbehar (talk) 08:05, 20 November 2019 (UTC)

Allow searching using ^ and $ anchors at least for intitle: searches

  • Problem: For some reason, CirrusSearch doesn't support ^ and $ anchors, so it is not possible to search for strings beginning or ending with a given sequence. While I understand it's probably not needed for whole-document searches, it would find practical uses in the context of intitle: searches.
  • Who would benefit: On Wiktionary it would be possible to search for words (entries) that start with a specific prefix / end with a specific suffix. But clearly there would be many other uses outside Wiktionary.
  • Proposed solution:
  • More comments:
  • Phabricator tickets:
  • Proposer: Zabavuju flašku chlastu maskovanou jako zubní pastu (talk) 15:29, 25 October 2019 (UTC)

Discussion

  • There is an intitle:^ search at Special:PrefixIndex. If you are searching inside the document as well, then my suggestion is not so helpful.--Snaevar (talk) 02:06, 4 November 2019 (UTC)
  • What is your definition of start? Line, page, table cell, a wrapping template? —TheDJ (talkcontribs) 09:29, 6 November 2019 (UTC)
  • I've often wondered why intitle: doesn't support ^ and $. It's obvious why insource: doesn't, so did someone decide that both should use the exact same flavor of regex? Erutuon (talk) 00:19, 11 November 2019 (UTC)
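
To make the request concrete, here is a small sketch of the matching behaviour being asked for, applied to a plain list of page titles. It uses Python's own `re` module and a hypothetical `search_titles` helper; it says nothing about how CirrusSearch works internally or how such a feature would be implemented there.

```python
import re


def search_titles(titles, pattern):
    """Return the titles matched by `pattern`.

    The pattern may match anywhere in a title unless it is anchored:
    ^ ties it to the start of the title and $ to the end.
    """
    regex = re.compile(pattern)
    return [title for title in titles if regex.search(title)]


titles = ["unhappy", "happiness", "sunup", "happy"]

print(search_titles(titles, r"un"))     # unanchored: matches anywhere in the title
print(search_titles(titles, r"^un"))    # titles starting with the prefix "un"
print(search_titles(titles, r"ness$"))  # titles ending with the suffix "ness"
```

With the title as the unit of matching, "start" and "end" are unambiguous, which is why the anchors seem more natural for intitle: than for whole-document searches.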

More Lua memory for Wiktionary

  • Problem: Lack of Lua memory for basic words. See wikt:CAT:E for currently affected words.
  • Who would benefit: All users and readers of Wiktionary.
  • Proposed solution: More Lua memory for Wiktionary.
  • More comments:
    • Pages that lack memory are not being properly categorized and the sortkey is not working properly.
    • Standard information, such as citations and semantically related terms, is being removed as a temporary solution, and this is a loss of information for our readers.
  • Phabricator tickets: phab:T188492
  • Proposer: KevinUp (talk) 16:20, 11 November 2019 (UTC)

Discussion

Alternatively, consider implementing a tool so that the source of each language can have its own separate page, like how each proposal has its own individual page. KevinUp (talk) 16:20, 11 November 2019 (UTC)

The most parsimonious solution is to raise the cap on Lua memory. This cap seems arbitrarily placed, and it is crippling for long pages. Metaknowledge (talk) 18:04, 11 November 2019 (UTC)
Indeed. If all were like in a company we would already have more memory, because the time spent to circumvent the cap is dearer than the trifling amount of more memory that in total would be used (since it concerns but some dozens of pages, for which much dust has been raised). Fay Freak (talk) 21:52, 15 November 2019 (UTC)
@Noé: Hi. Does the French Wiktionary have the same issue (lack of Lua memory) for entries with short words? I think if we were to migrate information from English Wiktionary to French Wiktionary, the same situation would occur. KevinUp (talk) 23:32, 16 November 2019 (UTC)
Local community discussion regarding lack of Lua memory can be found here, here and here. If only the Community Tech team would be gracious enough to inform us of the actual memory that is needed by wikt:do, wikt:一, wikt:人, wikt:水, wikt:月, wikt:生, wikt:我, which are basic words. KevinUp (talk) 20:15, 18 November 2019 (UTC)
@Pamputt: Hi. Does the French Wiktionary currently have issues with lack of Lua memory? Do you think the same issue would occur if information from English Wiktionary for entries such as wikt:do, wikt:一, wikt:人, wikt:水 were copied to the French Wiktionary? KevinUp (talk) 11:56, 19 November 2019 (UTC)
@KevinUp: Actually, we do not use that many Lua modules on the French Wiktionary, so I am not aware of such limitations. Still, maybe JackPotte or Darkdadaah can say more. Pamputt (talk) 18:50, 19 November 2019 (UTC)

Statistics for Wiktionaries

  • Problem: MediaWiki statistics are not adapted to Wiktionary projects. We don't need the number of pages but the number of lemmas, examples, definitions with illustrations, thesauri, etc.
  • Who would benefit: People who want to communicate about Wiktionary, Contributors
  • Proposed solution: Having better metrics, such as the ones we have in French Wiktionary: number of examples, number of pictures, number of nouns, adjectives, etc., how many people have contributed to the thesaurus in the past months, and so on.
  • More comments:
  • Phabricator tickets:
  • Proposer: Lyokoï (talk) 16:04, 10 November 2019 (UTC)

Discussion