Grants talk:Project/Intelligibility transcriptions

Microphone inputs on Wiktionaries

"We will put microphone inputs in the pronunciation sections in Wiktionaries."

I don't actually even understand what you intend to do with that. Would this be for collecting audio recordings? Neither of the linked tools does anything like that. (Incidentally, there is a local gadget on the English Wiktionary for adding audio recording tools to pronunciation sections.) Would it be for allowing readers to test their own pronunciation? While that might be a useful tool for some people, it's not really lexicographic content in the spirit of Wiktionary.

I suggest running the proposal by at least one Wiktionary community before saying that you "will" do something to the interface. Making such a change would require a full vote by a community that tends to be hesitant to make such changes without a lot of discussion and testing. --Yair rand (talk) 04:50, 13 March 2017 (UTC)

It would be for allowing readers to test their own pronunciation. Do you think dictionaries should not be capable of providing pronunciation assessment? It would also be for the most efficient form of remediation. How could that be made more in the spirit of Wiktionary? I agree with your suggested rewording and have made the corresponding edit. Thank you. James Salsman (talk) 07:49, 13 March 2017 (UTC)

Proposed email if the Foundation is not interested in funding this opportunity

Hi, I’m working on an interesting open-source project, and I thought you might like to help. I'm trying to produce free, interactive language pronunciation assessment and remediation software which may be able to improve students' pronunciation of words six times faster than commercially available products.

Millions of people worldwide want to improve their pronunciation in order to gain access to better jobs and succeed at more opportunities to speak in public, on teleconferences, or to groups. Unfortunately, companies which charge for this service often frustrate students by putting too much emphasis on inconsequential mistakes. So this year we are building on the open source software we have released in our past Google Summer of Code efforts to produce the most efficient, full-featured pronunciation training software, with your help.

We are trying to raise $25,000 from now until May (http://sphinxcapt.org) in order to pay for student transcriptions, which we will use to improve our software. Sharing this link on social media and spreading the word about our efforts will help too. Thank you so much for what you do in the open-source community. James Salsman (talk) 22:05, 13 March 2017 (UTC)

Project Grant proposal submissions due today!

Thanks for drafting your proposal for a Project Grant. Proposals are due today! In order for this submission to be reviewed, it must be formally proposed. When you have completed filling out the infobox and have fully responded to the questions on your draft, please change status=DRAFT to status=PROPOSED to formally submit your grant proposal. This setting is in the Probox template on your grant proposal page. If you have already done this, thanks for your submission, and you should be receiving feedback from the Project Grants committee in the coming weeks. Thanks, I JethroBT (WMF) (talk) 18:16, 14 March 2017 (UTC)

Thank you, Jethro! James Salsman (talk) 23:20, 14 March 2017 (UTC)

Why mechanical turk?

Given that this is the Wikimedia community, it seems possible to set up a way for community members to upload the required transcriptions. ChristianKl (talk) 15:39, 1 April 2017 (UTC)

Great question! I would love to avoid payments if possible, but obtaining authentic attempts at transcriptions is almost impossible without compensation. I know that there are many wikimedians who would gladly listen to several if not dozens of recordings and try to transcribe them, but we need several tens of thousands of such transcriptions over only a few months' time. That takes real money and means connecting with groups of people who are willing to do that sort of work for small payments, while agreeing to follow rules such as never trying to transcribe different recordings of the same phrase unless necessary, and then only if a sufficient amount of time has passed and enough intervening transcriptions have been obtained from the same person. Mechanical Turk provides software infrastructure to help enforce those sorts of rules, and allows cookie-based methods to make them somewhat more robust. Even if we could duplicate that Mechanical Turk infrastructure on Wikimedia Tool Labs, we would still lack their registered user base, and would need to actively recruit transcriptionists, relatively few of whom would end up being actual wikimedians, I suspect. However, we will certainly encourage wikimedians to become transcriptionists, whether they want to be paid for it or not. James Salsman (talk) 19:09, 4 April 2017 (UTC)
I clarified the sort of mturk alternative we can use if requested: Google AdWords to recruit workers and a custom Flask application to administer their transcription work, for example. This would add considerable delay and overhead, so despite its drawbacks, I do prefer mturk. James Salsman (talk) 05:27, 5 April 2017 (UTC)
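
For illustration only, the reuse rule described in this thread (a worker should not transcribe another recording of the same phrase until enough time has passed and enough other transcriptions have intervened) could be sketched in Python roughly as follows. This is not the project's code and does not use any Mechanical Turk API; the names `WorkerHistory`, `may_assign`, and `record`, and the thresholds of one day and 20 intervening transcriptions, are all hypothetical assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class WorkerHistory:
    """Per-worker record of which phrases were transcribed and when (hypothetical)."""
    # phrase -> (timestamp in seconds, position in this worker's sequence of tasks)
    last_seen: Dict[str, Tuple[float, int]] = field(default_factory=dict)
    count: int = 0  # total transcriptions completed by this worker


def may_assign(history: WorkerHistory, phrase: str, now: float,
               min_gap_seconds: float = 86400.0,   # assumed threshold: one day
               min_intervening: int = 20) -> bool:  # assumed threshold
    """Return True if the worker may be given a recording of `phrase`."""
    seen = history.last_seen.get(phrase)
    if seen is None:
        return True  # the worker has never transcribed this phrase
    last_time, last_index = seen
    # Number of other transcriptions completed since the last one of this phrase.
    intervening = history.count - last_index - 1
    return (now - last_time) >= min_gap_seconds and intervening >= min_intervening


def record(history: WorkerHistory, phrase: str, now: float) -> None:
    """Log a completed transcription of `phrase` at time `now`."""
    history.last_seen[phrase] = (now, history.count)
    history.count += 1
```

A custom application such as the Flask alternative mentioned above would call something like `may_assign` before handing a recording to a worker, and `record` once the transcription is submitted.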

Eligibility confirmed, round 1 2017

This Project Grants proposal is under review!

We've confirmed your proposal is eligible for round 1 2017 review. Please feel free to ask questions and make changes to this proposal as discussions continue during the community comments period, through the end of 4 April 2017.

The committee's formal review for round 1 2017 begins on 5 April 2017, and grants will be announced 19 May. See the schedule for more details.

Questions? Contact us.

--Marti (WMF) (talk) 19:53, 27 March 2017 (UTC)

Thanks to CMUSphinx Google Summer of Code mentors and student applicants for endorsements

Here are some of the applications from student endorsers: [1], [2]. I am happy to say that Priyanka Mandikal, my Wikimedia Google Summer of Code student from last year, has agreed to co-mentor the latter, which will use training scenarios for using her Accuracy Review of Wikipedias system to illustrate instruction in pronunciation and general topics simultaneously. James Salsman (talk) 05:35, 5 April 2017 (UTC)

Note to reviewers: cmusphinx wiki moved to GitHub

The cmusphinx wiki page referenced (https://cmusphinx.github.io/wiki/pocketsphinx_pronunciation_evaluation/) moved from SourceForge to GitHub during the review period, as the entire project is slowly doing, so I apologize that it was down for a couple of weeks. James Salsman (talk) 08:32, 2 June 2017 (UTC)

Round 1 2017 decision

This project has not been selected for a Project Grant at this time.

We love that you took the chance to creatively improve the Wikimedia movement. The committee has reviewed this proposal and not recommended it for funding. This was a very competitive round with many good ideas, not all of which could be funded in spite of many merits. We appreciate your participation, and we hope you'll continue to stay engaged in the Wikimedia context.


Next steps: Applicants whose proposals are declined are welcome to consider resubmitting their application in the future. You are welcome to request a consultation with staff to review any concerns with your proposal that contributed to a decline decision, and to help you determine whether resubmission makes sense for your proposal.

Over the last year, the Wikimedia Foundation has been undergoing a community consultation process to launch a new grants strategy. Our proposed programs are posted on Meta here: Grants Strategy Relaunch 2020-2021. If you have suggestions about how we can improve our programs in the future, you can find information about how to give feedback here: Get involved. We are also currently seeking candidates to serve on regional grants committees and we'd appreciate it if you could help us spread the word to strong candidates--you can find out more here. We will launch our new programs in July 2021. If you are interested in submitting future proposals for funding, stay tuned to learn more about our future programs.

Aggregated feedback from the committee for Intelligibility transcriptions

Scoring rubric and scores
(A) Impact potential: 2.6
  • Does it have the potential to increase gender diversity in Wikimedia projects, either in terms of content, contributors, or both?
  • Does it have the potential for online impact?
  • Can it be sustained, scaled, or adapted elsewhere after the grant ends?
(B) Community engagement: 3.6
  • Does it have a specific target community and plan to engage it often?
  • Does it have community support?
(C) Ability to execute: 3.6
  • Can the scope be accomplished in the proposed timeframe?
  • Is the budget realistic/efficient?
  • Do the participants have the necessary skills/experience?
(D) Measures of success: 3.4
  • Are there both quantitative and qualitative measures of success?
  • Are they realistic?
  • Can they be measured?
Additional comments from the Committee:
  • Is not aligned with WM priorities. The project could be used in other languages, but only after a huge redesign of processes, I think.
  • The proposal does not present a clear case for how the project will lead to measurable improvement to the content of the Wikimedia projects; rather, it appears to involve the development of a tool that, while potentially valuable in its own right, would have a merely incidental relationship to Wiktionary.
  • The impact is not clear.
  • The connection to the Wikimedia movement and its strategic priorities seems tenuous at best. The creation of such software may be important in principle, but I am not sure that teaching people how to correctly pronounce words is within the scope of any Wikimedia project, except maybe Wiktionary to a small extent.
  • I don't think it has any good fit with strategic priorities. It might have an indirect impact on quality of Wiktionary pronunciation recordings, but this needs engagement from at least one Wiktionary community which is not the case here.
  • The goals of the project are vague, and no specific targets or measures of success are presented.
  • The goals are unsatisfactory, to put it lightly.
  • I can say the approach is innovative, but I do see a potential risk (we may develop a tool that will not be used) greater than impact (potential improvement of pronunciation files).
  • The project considers using a paid service to make a "free contribution" of an idea to Wikimedia projects (aka subcontracting).
  • The project is extensive in scope, and it is unclear whether all of the activities described can be accomplished in 12 months. No detailed timeline or project plan is presented.
  • Unclear.
  • Probably we have a good fit in terms of participants' skills. I don't think the budget is reasonable: Commons simply does not have 50k recordings of 1000 words, thus this is a budget for a non-existent task.
  • Lack of interest.
  • There is limited evidence of community engagement, and it is unclear how much support the proposal has from the established Wiktionary community.
  • Little community engagement.
  • No big community support.
  • No community engagement at all. No Wiktionary community supports it. It does not seem to support diversity either.
  • More exploratory work, discussion with the Wiktionary community, and specific project planning is necessary before this project can be funded.
  • No firm connection to Wikimedia and its goals.
  • Very interesting project with a big budget and a lack of community support. Perhaps build and try a test version first, and ask the community what it thinks about the project.
  • No, per the above comments. It is a project with very low impact potential on Wikimedia projects (and potentially useless if no Wiktionary community is interested in it) with unrealistic expectations (requesting budget for transcription of 50k recordings of 1000 English words while we do not have them).