Community Wishlist Survey 2022/Wiktionary

From Meta, a Wikimedia project coordination wiki
Wiktionary
7 proposals, 97 contributors, 182 support votes
The survey has closed. Thanks for your participation :)



Automatically sync audio/video/image descriptions of words to all language editions of Wiktionary

  • Problem: Currently, when a user uploads an audio/video/image description of a word, the file is shown only on the wiki where the user inserted it. However, in most cases such a description applies to the same word in the same language regardless of which language edition of Wiktionary is describing that word, and the lack of an automated process for adding such files to all relevant wikis makes them less visible than they could be.
  • Proposed solution: Automatically link any descriptive audio/image/video of a word to the Wikidata lexeme entry, and then automatically display it in the relevant entries on all language editions of Wiktionary.
  • Who would benefit: All Wiktionary users
  • More comments:
  • Phabricator tickets:
  • Proposer: C933103 (talk) 08:34, 17 January 2022 (UTC)

Discussion

Isn't this doable already with Wikidata? It should be queryable via P443. --TheDJ (talk • contribs) 15:21, 17 January 2022 (UTC)

The key word here is "automatic": anything a user uploads to any Wiktionary would be automatically linked to Wikidata and then automatically linked from, and shown on, all other Wiktionaries. C933103 (talk) 21:42, 17 January 2022 (UTC)
Note that per your talk page, one user should have a maximum of 5 CWS proposals, which means you have to withdraw at least 5 extra proposals, either by archiving them or by asking other proposers to take them over. Are you familiar with other people who are interested in this? Liuxinyu970226 (talk) 02:28, 19 January 2022 (UTC)

If it would hit the right place, it would be very good. Let's take pronunciation and pictures. German Band has a lot of entries, with different pronunciations and, of course, different meanings. An .oga or .jpg file can't be placed randomly. Which means, in my world (if I think through the whole idea), that (1) I contribute a picture from Commons and then (2) I should be asked whether I want to copy it to the following list of other wikis that have the same entry I just illustrated. You have to ask whether it's right; you can't copy automatically, as that would produce incorrect information.
And generally speaking: I like the idea that content has one place where it sits, where it is physically located (text data), and then just shows up in the places where showing it is a plus. mlg Susann Schweden (talk) 19:13, 28 January 2022 (UTC)

Voting

Insert attestation using Wikisource as a corpus

  • Problem: Wiktionaries' definitions rely on attestations: sentences from corpora illustrating the usages and meanings of words. Wikisource is an excellent corpus for Wiktionaries, especially for classic usage, but it is hard to search its texts for a specific word. At present, the reference for a sentence has to be copied and pasted by hand, which is a long and tedious way to contribute; as a result, few quotations come from Wikisource (less than 3% on French Wiktionary).
  • Proposed solution: This feature is inspired by Insert media but targets Wikisource instead of Wikimedia Commons. So, instead of a snippet search offering pictures, Insert attestation would display a list of sentences from a chosen Wikisource (in the same language as the source project or another) that include the targeted sequence of characters. There is no requirement on meaning or proximity; results are exact matches only, to keep it simple. From the displayed snippets, an editor would pick a sentence with a single click, and it would be added with the appropriate source details taken from the Wikidata item associated with the Wikisource page. The feature would copy the sentence (no transclusion) together with its source (ideally including the page number in the original manuscript, e.g. "page 35."). This feature may need a specific parser to identify sentence boundaries and to bold the targeted sequence of characters.
  • Who would benefit: Readers of Wiktionaries would find more usage examples and a way to access the whole source directly in Wikisource. Contributors to Wiktionaries would have a pleasant, enjoyable way to add attestations, similar to the Insert media tool that digs into Wikimedia Commons, and the community may grow with new people who like to add sentences from their reading. Editors of Wikisource would have a new way to shed light on their Sisyphean work. Both projects' visibility in search engines would increase with more links between them. The global audience of both projects may increase with more connectivity. Other projects may also benefit from this feature, such as Wikipedia, to add quotations to authors' pages.
  • More comments: This feature/tool/functionality should be accessible through both the WikiText editor and VisualEditor. It may be interesting to keep track of reuses of Wikisource content in other projects with a specific What links here from Wiktionary to Wikisource, similar to Wikimedia Commons' indication of reuses in other projects, but this could be part of another development. This idea placed #5 in 2020 with 57 votes but was not done, with this long explanation: "We unfortunately ran out of time and were unable to work on this. It can be re-proposed in a future survey." It was suggested in 2018 with 36 supports and in 2017 with 32 supports, a draft was suggested in 2016 with 19 supports, and the idea was first coined in a MediaWiki discussion.
  • Phabricator tickets: T139152, T157802
  • Proposer: Noé (talk) 07:33, 22 October 2019 (UTC)
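
The exact-match search and bolding step described in the proposed solution could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `find_attestations` is hypothetical, the sentence splitter is deliberately naive (a real parser would need to handle abbreviations, quotations, and so on), and the bolding uses the triple-apostrophe wikitext markup.

```python
import re


def find_attestations(text, target):
    """Return the sentences of `text` that contain `target`, with it bolded.

    Hypothetical helper: exact character-sequence matching only,
    with no notion of meaning or proximity.
    """
    # Naive sentence boundary: ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    results = []
    for sentence in sentences:
        if target in sentence:
            # Wrap every occurrence in wikitext bold markup.
            results.append(sentence.replace(target, "'''" + target + "'''"))
    return results


sample = "The cat sat. A dog barked! The cat ran away."
for hit in find_attestations(sample, "cat"):
    print(hit)
```

Because the match is on a raw sequence of characters, a search for "cat" would also hit "concatenate"; whether that is desirable is left to the editor who picks the sentence.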

Discussion

Hello there, thanks so much for re-submitting! Accepting for voting phase and marking this for translation now. Regards, NRodriguez (WMF) (talk) 13:59, 17 January 2022 (UTC)

Voting

Allow list of pages with increased memory allowance

  • Problem: The single biggest issue currently plaguing the English Wiktionary is Lua out-of-memory errors on a small number of pages, such as a, C, in, la, si, se, etc. Wiktionary users have spent hundreds of hours trying to work around MediaWiki limits, but have not been able to solve the issue in all cases. In the future, this issue will only get worse as more content is added to Wiktionary.
  • Proposed solution: Add an allow list (aka whitelist) of pages that are given memory above the current 50MB limit. I propose phased limits, e.g. up to 100 pages can be placed on an allow list with a 100MB limit, and up to 50 pages can be placed on an allow list with a 200MB limit. This is a small number of pages and should not materially affect the load on MediaWiki servers.
  • Who would benefit: All English Wiktionary users.
  • More comments: MediaWiki developers have so far been unresponsive to requests to help fix this issue. The lack of responsiveness and unwillingness to take this issue seriously on the part of MediaWiki developers has eroded trust between the English Wiktionary and MediaWiki. Please do not dismiss this request out of hand as has happened with previous similar requests.
  • Phabricator tickets:
  • Proposer: Benwing (talk) 19:42, 22 January 2022 (UTC)
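
To make the phased limits concrete, here is a minimal sketch of how such an allow list could be modelled. It is not MediaWiki or Scribunto configuration: the names `TIERS`, `validate_tiers` and `memory_limit_mb` are hypothetical, and the assignment of pages to tiers is made up for illustration. Only the numbers (50 MB default, 100 pages at 100 MB, 50 pages at 200 MB) come from the proposal.

```python
DEFAULT_LIMIT_MB = 50  # current limit mentioned in the proposal

# Hypothetical tiers: (limit in MB, maximum number of pages, allow-listed titles).
# Which page sits in which tier is invented for this example.
TIERS = [
    (200, 50, {"a", "in"}),
    (100, 100, {"C", "la", "si", "se"}),
]


def validate_tiers(tiers):
    """Raise ValueError if a tier lists more pages than its cap allows."""
    for limit_mb, max_pages, titles in tiers:
        if len(titles) > max_pages:
            raise ValueError(
                f"{limit_mb} MB tier has {len(titles)} pages, cap is {max_pages}"
            )


def memory_limit_mb(title, tiers=TIERS, default=DEFAULT_LIMIT_MB):
    """Return the highest limit of any tier listing the page, else the default."""
    limits = [limit_mb for limit_mb, _, titles in tiers if title in titles]
    return max(limits, default=default)


validate_tiers(TIERS)
print(memory_limit_mb("a"), memory_limit_mb("la"), memory_limit_mb("dog"))
```

Keeping the caps in the same structure as the titles means the "small number of pages" constraint is checked in one place rather than relying on convention.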

Discussion

  • Tickets: Closed: phab:T267708, phab:T165935, phab:T6767; open: phab:T188492. In general, I think you probably need to think about a better way to architect your pages instead of trying to change memory requirements. --Izno (talk) 03:09, 23 January 2022 (UTC)
    Yes, like maybe having each language in its own tab, or some other overhauled redesign, instead of clobbering dictionary entries onto a platform designed for encyclopedia articles. DAVilla (talk) 18:02, 24 January 2022 (UTC)
In general: solve the problem within the boundaries. If you make exceptions, you have to administer them on top of everything else. You make the issue broader, more complex and even more time-consuming when you ask for an exception (for 10 pages out of millions worldwide). mlg Susann Schweden (talk) 19:22, 28 January 2022 (UTC)
  • Does this memory limit include transcluded pages inside of the page? I was thinking you could store all of the definitions on subpages and then transclude them on the main page. Lectrician1 (talk) 02:14, 29 January 2022 (UTC)
  • Is the Community Tech team capable of making a configuration change?

Voting

Display definitions from Wikisource dictionaries

  • Problem: Wiktionaries aim to offer one definition for each meaning, but there are many ways to describe a meaning and many words, including local uses (e.g. an American-adapted definition and an Indian one for the same word) and sometimes very technical terms. A synthetic definition is one way, but more than one is better. Definitions in other dictionaries may be cited as references in the articles, but they are not accessible in Wiktionary even though some of those dictionaries are in Wikisource.
  • Proposed solution: Many dictionaries are already in Wikisource, and we can use them to offer more definitions. A dedicated transclusion, or paragraphs taken from Wikisource, in Wiktionaries could be a solution, done by hand or bot, or by automatically harvesting entries that carry specific tagging in the dictionaries hosted on Wikisource. They could come from several Wikisources and be displayed in several Wiktionaries. It could be a new tab next to "Article" and "Talk", named "Dictionaries", with definitions for the same sequence of letters from dictionaries published in Wikisource. For French, I can imagine at least a dozen definitions from as many dictionaries. For underdescribed languages with at least one source in Wikisource, it could be an interesting way to compare the source with how the entry evolves after its inclusion in Wiktionary.
  • Who would benefit: Readers wanting more than one definition.
  • More comments: Some dictionaries are already properly tagged; for the others, it could be a good opportunity to tag them according to the TEI Lex-0 guidelines, so that they can be reused more easily in open source projects. Also, to head off a common reflex when someone talks about Wiktionary: no, Wikidata Lexemes could not be of any help here. This issue is about content, not data or relations. Definitions are under CC BY-SA 3.0 in Wiktionary and in Wikisource dictionaries. This proposal is the same as this proposal last year (supported by 40 people) and this one posted two years ago by DaraDaraDara (32 supports).
  • Phabricator tickets: T240191
  • Proposer: Noé (talk) 11:43, 29 November 2020 (UTC)

Discussion

Voting

Encourage translations by easier/automatic adding of translation boxes

  • Problem: Many entries in Wiktionaries lack translations and translation boxes (templates). It would be easier for many more users (especially non-expert ones) to add translations if there was a translation box by default in every entry.
  • Proposed solution: Automatically add a translation box, or make it easier for non-expert users to add translations; this could be a link such as "Click here to add translations" that would auto-create a translation box.
  • Who would benefit: All users searching for translations.
  • More comments: See discussion here
  • Phabricator tickets:
  • Proposer: Spiros71 (talk) 11:49, 12 January 2022 (UTC)

Discussion

Voting

Get free pronunciation data

  • Problem: We need more pronunciation sound examples. It is difficult to cover all words by recording sounds manually. We need to import sounds from free sources (such as CommonVoice).
  • Proposed solution: Import sounds from free sources (like CommonVoice)
  • Who would benefit: End users, who would get a pronunciation for every word
  • More comments:
  • Phabricator tickets:
  • Proposer: Xan2 (talk) 10:14, 12 January 2022 (UTC)

Discussion

  • Hello, there is Lingua Libre! Have you tried it? It is made for the Wikimedia environment: https://lingualibre.org. -Lyokoï 11:46, 12 January 2022 (UTC)
  • LinguaLibre is catching up, with a focus on words. The main issue is that the project is still little known, and we have no Communication / Outreach team to go to the target communities and drive wider adoption.
    On the other hand, CommonVoice is about 30 times larger than LinguaLibre, or about 30 million word equivalents. I'm not sure whether they have words or sentences, and the feasibility for their 130 (?) languages hasn't been explored. Yug (talk) 12:13, 12 January 2022 (UTC)
  • @Xan2: would phab:T298950, if implemented, address your concern? -- xaosflux Talk 14:29, 12 January 2022 (UTC)
    Perhaps automatic pronunciation is different from "real" pronunciation (by real people), so it may be confusing for new users of the language. Xan2 (talk) 17:13, 19 January 2022 (UTC)
  • I dare say we have overlapping proposals here: Community Wishlist Survey 2022/Reading/IPA audio renderer :) Xavier Dengra (MESSAGES) 23:56, 21 January 2022 (UTC)

Voting

Import translations from Wikidata

  • Problem: Translations could be further enriched automatically or semi-automatically by allowing the import/conversion of Wikidata equivalences into translation boxes. For example, for the entry hypocapnia I added the translations from Wikidata, and converting them to translation box format was a tedious process.
  • Proposed solution: Automate or semi-automate the import and conversion of Wikidata translations with a script that would convert the Wikidata content into the format used by Wiktionary.
  • Who would benefit: Any users looking for translations.
  • More comments:
  • Phabricator tickets:
  • Proposer: Spiros71 (talk) 11:55, 12 January 2022 (UTC)
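
The conversion step in this proposal could be sketched as follows. The sketch assumes the Wikidata labels have already been fetched into a dictionary keyed by language code; the function name `to_translation_box` and the `LANGUAGE_NAMES` mapping are hypothetical, the sample labels are illustrative input rather than data retrieved from Wikidata, and the output follows the `{{trans-top}}` / `{{t}}` / `{{trans-bottom}}` template layout as used on English Wiktionary (other editions use different templates).

```python
# Hypothetical mapping from language codes to the names shown in the box.
LANGUAGE_NAMES = {"de": "German", "fr": "French", "el": "Greek"}


def to_translation_box(gloss, labels, language_names=LANGUAGE_NAMES):
    """Format {language code: label} as English Wiktionary translation wikitext.

    Codes missing from `language_names` are skipped, so an editor
    would still need to review what was left out.
    """
    known = [
        (language_names[code], code, label)
        for code, label in labels.items()
        if code in language_names
    ]
    lines = ["{{trans-top|" + gloss + "}}"]
    # Translation boxes are conventionally sorted by language name.
    for name, code, label in sorted(known):
        lines.append("* " + name + ": {{t|" + code + "|" + label + "}}")
    lines.append("{{trans-bottom}}")
    return "\n".join(lines)


# Illustrative input only; not fetched from Wikidata.
sample_labels = {"fr": "hypocapnie", "de": "Hypokapnie"}
print(to_translation_box("reduced carbon dioxide in the blood", sample_labels))
```

A script like this only handles the formatting; deciding whether a Wikidata label is actually a correct translation of a given sense remains a manual step.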

Discussion

  • Concepts in Wikidata should not be used directly in Wiktionary. Lexeme Senses are better because of their shared nature. A concept could be expressed by other words not documented in Wikidata. In my experience with French Wiktionary, it's haphazard to contribute this way. Noé (talk) 13:19, 12 January 2022 (UTC)
    Please refer to the example I mentioned above. Do the Wikidata concepts incorporated look haphazard? Does the end result look haphazard? Also, I am not talking about direct use; I am talking about automating their import. The editor would need to fine-tune if needed. Spiros71 (talk) 13:47, 12 January 2022 (UTC)
    Your example is correct, but there are plenty of exceptions, such as animal names (Q321376: is it the vernacular or the scientific name? It depends.) or abstract nouns (Q7242: the label in French is distinct from the name of the page in French Wikipedia). Concepts don't include verbs, adjectives, multiword expressions, affixes, etc., which are a large part of Wiktionaries' content. Suggesting a list of potential translations is interesting, but I think it's better to display only a short selection of languages known to the user, to capture good information and avoid vague intuitions about possible translations in languages the contributor does not know. Noé (talk) 14:03, 12 January 2022 (UTC)
    I do fully understand your concerns and I agree. It cannot work perfectly in all cases, and some rules would need to be made/automated to improve/limit its scope; on the other hand, it can work very well in a huge number of cases (such as the one I illustrated above). This is where my focus is. Spiros71 (talk) 08:02, 26 January 2022 (UTC)
    Fully automatic addition seems bound to end in tears. I like the idea of a list of check-boxes for suggested translations garnered from the Wiktionary and Wikipedia links in Wikidata for languages known to the user, perhaps based on babel boxes or a similar declaration by the user. Maybe one could, for cases like “hypocapnia”, allow the user to add languages they are not familiar with, but place these in categories of automated translations to be checked. Such translations could be added at first with a template that marks them as pending confirmation or even conceals them; a tool to automate checking and confirming them would also be nice to have. Another function that might help with semi-automatic addition is displaying part of the linked Wiktionary/Wikipedia entries in the list of suggestions. PJTraill (talk) 19:01, 29 January 2022 (UTC)

Hello there, thanks for your proposal. I was curious to understand how/if you envision adding translation boxes as related to this wish: Automatic adding of translation boxes. Do you envision that the functionality proposed in that wish would benefit this? Are they unrelated in nature due to this focusing on Wikidata linkage? Thanks in advance! NRodriguez (WMF) (talk) 13:49, 17 January 2022 (UTC)

@Spiros71 Pinging so you can see this, thanks so much! NRodriguez (WMF) (talk) 23:11, 25 January 2022 (UTC)
They are related, but different. This procedure is more advanced as it deals with (semi-)automating importing content (and a number of rules would have to be created for that, for example for language name manipulation and compatibility between the two projects), whereas the other one is (semi)-passive, i.e. either automatically provide (blank) translation boxes in all cases, or make it easier for non-technically inclined users to do so (for example, with a button "Click here to add translation [box]"). As it stands now, it is only amenable to people who have some technical knowledge regarding the addition of translation boxes. I hope this is clear; please do not hesitate to ask for any further clarifications. Spiros71 (talk) 07:54, 26 January 2022 (UTC)

It is not a good idea to automate translations. Take beugen, a German verb, as an example: five meanings, and five different English words to describe the verb's sense. Even for easier translation jobs like a noun, accuracy is still today's problem when humans try to find and enter a correct translation, i.e. not a word 'referring to something like that', or meaning 'slightly more' than the word in question, or 'just one little special case' of the word in question. In my experience, no one will ever control automated entries. If you let them in (if you give an individual the chance to mass-import), they will be there forever. There can be exceptions, but this is my general view after a lot of practice in the German Wiktionary. Pardon my French :) My English has gotten a little rusty. mlg Susann Schweden (talk) 18:50, 28 January 2022 (UTC)

“control automated entries” – perhaps you mean “check”? Sorry if it sounds like nit-picking, but this would clarify what you mean! PJTraill (talk) 18:48, 29 January 2022 (UTC)

Voting