Grants:Project/Wikitongues Poly Feature Set 1

status: ineligible
Wikitongues Poly Feature Set 1
summary: In response to the world’s growing rate of language loss, the non-profit organization Wikitongues is expanding the feature set of Poly, its free and open source platform for language activists, teachers and students alike to document, share and learn every language in the world, including sign languages.
target: Poly stands to improve the language content for both Wiktionary and Wikivoyage by creating a broader network of language content aggregation. Through it, the breadth of these projects will be expanded and their language accuracy improved. Furthermore, in gaining access to new language communities working with Wikitongues, Wikipedia stands to benefit from the incubation of new language editions.
type of grant: tools and software
amount: $98,400 USD
grantee: Dbudell, FredAAndrade
contact: freddie@wikitongues.org
volunteer: Mohamedudhuman05
this project needs: volunteer, affiliate, advisor
created on: 17:30, 4 October 2016 (UTC)


Project idea

What is the problem you're trying to solve?

Explain the problem that you are trying to solve with this project. What is the issue you want to address? You can update and add to this later.

Wikitongues is a 501(c)(3) non-profit organization founded to stem the alarming trend of global language loss: within the next eighty years, as many as half of the world’s seven thousand languages are expected to disappear.[1] In coordination with an international volunteer community, we’re building the world’s first online and public archive of every language in the world.

To that end, Wikitongues is developing Poly, a free and open source platform that aims to empower people to share and learn every language in the world. Poly will simplify and make accessible the process of creating and maintaining language documentation, a critical component of the vitality of linguistic communities and the preservation of humanity’s collective cultural heritage. The production of language content, as well as the fight for linguistic diversity, should benefit from our contemporary technologies, be fast and easy to use, and most importantly, be globally accessible.

Our efforts come at an historic moment of mobilization for the public good. Lacking government support and private investment, cultural activists from endangered language communities around the world are taking the initiative to document their languages before it’s too late. With an accessible and unified platform to share this documentation with the world, language revitalization movements would be greatly amplified. In developing Poly, Wikitongues provides a crucial piece of twenty-first century infrastructure, not just for linguistic preservation in an academic sense, but also for profound cultural exchange in the age of globalization.

Consider, for instance, indigenous activist Marie Wilcox, the last fluent speaker of California’s Wukchumni language.[2] Like many Native Americans, Ms Wilcox is working to revive her mother tongue after more than a century of cultural and political marginalization. Since 2014, she has been working with her daughter to produce a comprehensive dictionary and grammar in Wukchumni, lest the language of her people—the embodiment of their culture and history—fade with her own passing.

However, because Wukchumni has been documented using only pen, paper, and voice, the final product is a trove of relatively inaccessible data, alienating to younger generations and impossible to leverage computationally for linguistic research and dissemination. As a unified and accessible platform for language documentation, Poly would greatly amplify the efficacy of Ms Wilcox’s efforts.

Poly will stem the tide of language loss by empowering communities to document, share and preserve their languages for future generations. History shows us that with the proper support, communities can save their languages from extinction, or even revive a lost mother tongue. Hebrew, for example, which today is the first language of half the world’s Jewish population, ceased to be a spoken vernacular by around the fourth century CE, only to be revived by nineteenth-century activists leveraging the ample Hebrew literature available to them. By developing a free, open source, and globally accessible platform for language documentation, we can empower cultural activists around the world to sustain their languages, ensure the longevity of their communities, and expand the base of public knowledge for future generations.

Initially developed and launched with funds raised from a record-breaking crowdfunding campaign in Kickstarter’s translation category, Poly has been met with great enthusiasm. We’re proud to count among our pre-launch community key stakeholders in language education and cultural activism: 1) cultural language activists, such as Pablo Blanco of New York’s Garifuna-speaking community; 2) language revitalization projects, including the nascent Seminole Creole movement in Brackettville, Texas, and the Dal Riada Scottish documentation project spearheaded by Scottish translator and celebrity language coach Àdhamh Ó Broin; 3) government institutions concerned with cultural exchange, such as the Arizona Refugee Resettlement Program; 4) language educators like African language specialist Khady Ndoye of LaPolyglotte; and 5) student and hobbyist communities around the globe.

The platform has been online since July 2016 as we conduct our private Beta launch, fixing bugs both in the UI and overall experience. In its current pre-release version, it supports text-to-text translations. However, to better serve language communities worldwide, robust video and search infrastructures are required, through which users could record words and phrases (both spoken and signed) in video and browse the vast body of content.

Wikitongues is therefore seeking support to expand our operational capacity, in order to develop a tightly defined feature set which would improve the quality and accessibility of data acquired by our growing network of volunteers. These features are video, search, software integrity and load test coverage, document pagination, external embedding of dictionaries, video descriptions for rich introductions to dictionaries, and an infrastructural data migration to Amazon S3 for efficiency gains.

What is your solution?

If you think of your project as an experiment in solving the problem you just described, what is the particular solution you're aiming to test? You will provide details of your plan below, but explain your main idea here.

Concept

Poly will help language learners, educators, and cultural activists to effectively document and share their languages, scaling community engagement opportunities with language revitalization programs. This will be essential infrastructure for successfully preventing language loss, as research shows that revitalization programs work best when communities play an active role.[3][4][5]

Poly was launched to a closed network of volunteers on a private Beta in late June, in preparation for a rolling public launch between October 2016 and February 2017. The platform currently allows users to create text-to-text translations between words and phrases, yielding bilingual dictionaries. Through this grant, the organization will work to advance multiple aspects of the Wikimedia Foundation’s mission through the implementation of an expanded feature set, which includes: 1) the creation, distribution, and search of multimedia content; 2) social features; and 3) a dynamic and flexible open data framework.

Mission Alignment

Poly is deeply aligned with the Wikimedia Foundation’s core values. Much like the Internet itself, Poly is built to be a key component in education, communication, collaboration, business, entertainment and society as a whole. As a platform, it exists to enrich the lives of individual human beings by creating a space for cultural bridge-building and exchange of knowledge. It leverages the right and will of individuals to shape the Internet and our own experiences on it, building contexts for greater language representation online. We recognize that the effectiveness of the Internet as a public resource depends on interoperability, innovation and decentralized participation worldwide. As such, we rely on our growing community and network of volunteers to reach speakers and signers of every language in the world, as we seek to empower these communities with tools to share and document their languages freely and effectively.

Poly is free, open source and built for the web of open data; in building this platform, we’re contributing to the development of the Internet as a public, global, and diverse human resource. The expansion of the Poly platform will magnify the Internet’s potential as a public resource, and is geared towards ensuring that all people can benefit from access to materials in their own language, as well as all other languages around the world.

Project goals

Explain what you are trying to accomplish with this project, or what you expect will change as a result of this grant.

Beyond emphasizing the academic, scientific, and cultural value of greater language access and better linguistic data, it is worth noting that maintaining linguistic diversity is a crucial layer of human development. Language plays a foundational role in preserving the cultural identity of all peoples, and as such, language loss has a devastating effect on both the individual and the collective, causing cultural dislocation, social alienation, and deprivation of group identity.[6][7] The success of Poly, and of Wikitongues as a whole, will help ensure that globalization becomes more equitable for future generations.

Project plan

Activities

Tell us how you'll carry out your project. What will you and other organizers spend your time doing? What will you have done at the end of your project? How will you follow-up with people that are involved with your project?

Overview

The development of Poly so far has been conducted in a distributed manner, with contributors in Paris, Berlin, Los Angeles and New York. The project is hosted and managed on GitHub, and coordinated over Slack and Google Hangouts by Freddie Andrade, who oversees all development. In tandem, Daniel Bogre Udell will work with the extended Wikitongues volunteer community to populate the platform with new language content and grow the application’s user base.

This project will be carried out for forty hours per week over the course of twelve months. Over the course of this period, we will develop and iterate on the features outlined below, as well as conduct usability, engagement and retention tests for different types of users, both institutional and private. We will also report to the general public on feature progress, platform health, and the addition of new and interesting content, on at least a quarterly basis. At the end of the project, all of the outlined features will be implemented, tested, and released to the public on the production instance of Poly.

After the Wikimedia grant period has ended, we will continue to leverage the Wikitongues community’s networks for further collaboration, as well as conduct coordinated outreach to press to raise awareness about our efforts.

Poly Feature Set

To ensure that Poly is truly accessible to all, multimedia creation tools are essential, because half of the world’s languages have no written scripts. Sign languages, for example, account for as much as 5% of the global total, and require video support to be documented. Poly’s multimedia support would include recording and uploading words and phrases in video, as well as streaming of pre-recorded videos in dictionaries.

Video support will also play an important role in specialized features for language educators, such as affording users the ability to upload and embed video introductions to the dictionaries they create. In this way, dictionaries become even more powerful educational resources, because their creators can provide walkthroughs of linguistic concepts such as grammar, syntax, pronunciation, and orthography.

As Poly aggregates more dictionaries, and as dictionaries grow to include hundreds of words or phrases per document, it will become increasingly difficult for users to navigate content without a carefully designed interface to guide them. With that in mind, intelligently displaying streams of content will be essential for a seamless user experience. Pagination, as well as the ability to efficiently search within a dictionary, are necessary features to improve legibility and usability of this content.
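As a rough illustration of the pagination and in-dictionary search described above, the following is a minimal Python sketch. It is not Poly's actual implementation: the `Entry` shape, the function names, and the default page size of 25 are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class Entry:
    """One dictionary row (hypothetical shape): a source phrase and its translation."""
    source: str
    target: str


def paginate(entries, page, per_page=25):
    """Return the entries on a 1-indexed page, plus the total number of pages."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    total_pages = max(1, ceil(len(entries) / per_page))
    start = (page - 1) * per_page
    return entries[start:start + per_page], total_pages


def search_dictionary(entries, query):
    """Case-insensitive substring match against either side of each entry."""
    needle = query.casefold()
    return [
        entry for entry in entries
        if needle in entry.source.casefold() or needle in entry.target.casefold()
    ]
```

A search result can be passed straight to `paginate`, so that long result lists are displayed one page at a time in the same way as a full dictionary.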

Since language documentation is often a collaborative process, either between researcher and subject, teacher and student, or fellow activists, a set of core social features is necessary; in particular, the ability for multiple users to contribute to a single dictionary. Beyond that, other layers of interactivity, such as the ability to follow users in order to receive notifications when they edit existing dictionaries or create new ones, will strengthen the sense of community within the platform. The ability to save favorite dictionaries will empower users to build personalized collections and more seamlessly navigate the content relevant to them.

Because Poly is expected to aggregate vast content, a powerful search interface is crucial to the efficacy of the platform. The ability to search for specific words or phrases in a given language throughout the entire Poly corpus, or within subsets of it, is key to making a manageable archive, and in order to build that search feature, we must intelligently structure the platform’s aggregated data.
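To make the idea of language-scoped corpus search concrete, here is a toy inverted index in Python. It is only a sketch under stated assumptions: the `CorpusIndex` class, its method names, and the whitespace tokenization are hypothetical, and a production system would more likely delegate this work to a dedicated search engine such as the Elasticsearch service listed in the budget. Whitespace tokenization in particular would not suit every language or script.

```python
from collections import defaultdict


class CorpusIndex:
    """Toy inverted index: (language, token) -> ids of phrases containing that token."""

    def __init__(self):
        self._postings = defaultdict(set)
        self._phrases = {}

    def add(self, phrase_id, language, text):
        """Store a phrase and index each of its whitespace-separated tokens."""
        self._phrases[phrase_id] = (language, text)
        for token in text.casefold().split():
            self._postings[(language, token)].add(phrase_id)

    def search(self, language, query, within=None):
        """Return ids of phrases in `language` that contain every query token.

        `within` optionally restricts the results to a subset of phrase ids,
        for example the phrases belonging to a single dictionary.
        """
        tokens = query.casefold().split()
        if not tokens:
            return set()
        hits = set.intersection(
            *(self._postings.get((language, token), set()) for token in tokens)
        )
        return hits if within is None else hits & set(within)
```

Keying the postings by language as well as token keeps a query for a word in one language from matching an identically spelled word in another.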

Nurturing the open data infrastructure of Poly will prime the platform for making big contributions to the long-term development of language technology, specifically machine translation and natural language processing. Structuring intelligent, open source corpora for previously under-documented languages, as well as creating direct translations between traditionally unassociated languages—between Kosovar Sign Language and Brazilian Portuguese, or Kurdish and Okinawan, for instance—is an essential step to opening the benefits of the Internet to all people.

It is important to recognize that world-class technologies have set the expectation of perfect functionality, no bugs and zero downtime. To ensure Poly looks, feels and acts like the dependable, necessary global utility it needs to be, adequate unit, functional and integration test coverage is very important: it preempts breaking integrations and reduces the brittleness of the platform, all the more so because Poly has not yet experienced large volumes of user traffic. Once open to the public, the platform must be able to support continued use by an unknown number of users, expected to be in the mid hundreds, without rendering itself economically unfeasible.

Budget

How will you use the funds you are requesting? List bullet points for each expense. (You can create a table later if needed.) Don’t forget to include a total amount, and update this amount in the Probox at the top of your page too!

Wikitongues is driven by the hard work and enthusiastic participation of our volunteer community. In particular, a core team of open source developers has made the development of Poly possible. By contracting a full-time software engineer and providing a budget for the infrastructural expenses of software development, we will provide essential support to the efforts of this community.

Expense List

  • Software engineer
  • Infrastructure Costs
    • Server costs
    • Elasticsearch
    • Continuous Integration services
    • Incidental costs

Budget Sheet

Initiative | Amount | Rate | Total Contribution
Software Engineer | 1,920 hours | $45/hr. | $86,400 USD
Infrastructure costs | 12 months | $1,000/month | $12,000 USD
Total | | | $98,400 USD
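
The budget figures can be cross-checked with a few lines of Python. All amounts come from the budget sheet above; the variable names are only illustrative.

```python
# Line items from the budget sheet: quantity multiplied by rate, in USD.
line_items = {
    "Software engineer": 1_920 * 45,      # hours x USD per hour
    "Infrastructure costs": 12 * 1_000,   # months x USD per month
}

total = sum(line_items.values())

for name, amount in line_items.items():
    print(f"{name}: ${amount:,} USD")
print(f"Total: ${total:,} USD")
```

The computed total matches the $98,400 USD requested at the top of the proposal.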

Community engagement

How will you let others in your community know about your project? Why are you targeting a specific audience? How will you engage the community you’re aiming to serve at various points during your project? Community input and participation helps make projects successful.

The feasibility of this project lies not only in the enthusiasm and skills of the core team members, but also in how effectively we leverage our volunteer community and public audience to inform the development of Poly. As an organization, Wikitongues already benefits from healthy engagement through our social media channels, totaling 20,000 people spread over YouTube, Facebook, Twitter and Kickstarter, as well as a robust volunteer network distributed across 45 countries. Leading up to Poly’s public release, we will bring key contributors from this community into the fold as we develop the application’s core feature set and initial content. Beyond that natural exposure, we are forming partnerships with key stakeholders in the realm of language activism and education, outlined above.

Over the course of Poly’s early development, we have counted on the advice and feedback of these core users to ensure that we design and develop empowering features for individuals concerned with the documentation, dissemination, and acquisition of language. As we continue to develop the application and transition to formal beta testing, we will conduct frequent polls and surveys among our pre-launch user base to hold ourselves accountable to our goals.

Our current timeline accounts for Poly’s public launch to coincide with UNESCO’s International Mother Language Day, in celebration of linguistic diversity worldwide. With that in mind, we plan to host an unveiling event for Poly in honor of that holiday, which will help raise the platform’s profile among language enthusiasts. After public launch, the Poly community will provide the resources necessary to hold open discussions, mobilizing their communities and generating greater participation in language documentation.

Sustainability

What do you expect will happen to your project after the grant ends? How might the project be continued or grown in new ways afterwards?

Wikitongues sustains its operations using diverse revenue streams, including crowdfunding campaigns on Patreon and Kickstarter, high-capacity donations, and pending grants from other organizations, such as the Mozilla Foundation. These funding sources will continue to sustain the project after the Wikimedia grant period has ended.

After the Wikimedia grant period has ended, we will work closely with our volunteer community and public audience to expand our user-base and depth of language content. In the technical realm, we will focus on expanding Poly’s cross-platform compatibility up to full integration with SMS infrastructure, so that it can be leveraged by activists and language learners in low-bandwidth and Internet-free environments.

Measures of success

How will you know if the project is successful and you've met your goals? Please include specific, measurable targets here.

Poly’s quantitative success is contingent on the successful deployment of the technologies proposed. Each of the features proposed must have unit, functional and integration tests. Adequate time for finding and fixing bugs is allocated for each feature, keeping breaking merges to a minimum. The measurable targets are the individual features described, namely video, search and content filtering, software integrity and load test coverage, document pagination, external embedding of dictionaries, video descriptions, and an infrastructural data migration to Amazon S3.

As an awareness-raising non-profit organization, we track our impact through the growth and reach of our volunteer community and public audience, as well as the quantity and quality of data we collect. Therefore, Poly’s qualitative success is gauged by the diversity, scale, and retention of its user base, as well as the volume of language data we are able to aggregate, especially from smaller and threatened languages.

Get involved

Participants

Please use this section to tell us more about who is working on this project. For each member of the team, please describe any project-related skills, experience, or other background you have that might help contribute to making this idea a success.

Wikitongues is driven by the hard work and enthusiastic participation of our volunteer community. In particular, a core team of open source developers, outlined below, has made the development of Poly possible. Beyond these core contributors, the extended Wikitongues volunteer community will play an active role in beta testing, content acquisition, and the design of an expanded feature set after the Wikimedia grant period has ended.

Freddie Andrade

Freddie Andrade is the co-Founder and Executive Director at Wikitongues, as well as the lead designer and project manager for Poly. With a BFA in Design & Technology from Parsons School of Design, he has ample experience working as a web developer and product designer in New York City, as well as his native São Paulo. He speaks English, Portuguese, Spanish, and French fluently and commands a working proficiency in Japanese. He is also learning Arabic.

Daniel Bogre Udell

Daniel Bogre Udell is the co-Founder and Executive Director at Wikitongues, as well as co-founder and editor of Global Voices Online’s Catalan language edition. With an MA in historical studies from the New School for Social Research and a BFA in Design & Technology from Parsons School of Design, he has also worked as a web developer and content strategist in New York City. He is fluent in English, Spanish, Catalan, and Portuguese and commands a working proficiency in French. He is learning Galician, Italian, Romanian, and Polish.

Chris Voxland

Chris Voxland is a Wikitongues volunteer and core contributor to Poly. He is a backend and devops engineer at Plated, specializing in Ruby on Rails, JavaScript (especially React.js), and SQL. He speaks English and Spanish fluently.

Luis Arias

Luis Arias is a Wikitongues volunteer and advisor to Poly. He is a backend and devops engineer at Balsamiq, as well as the director of Rueil Digital, a civic engagement nonprofit based in Paris, France. He speaks English, Spanish, German, and French fluently.

Ben Arias

Ben Arias is a Wikitongues volunteer and core contributor to Poly. He is a front-end application developer with a focus on JavaScript, especially React.js. He speaks English, Spanish, French, and German fluently.

  • Volunteer: I would like to help guide and assist the participants, and to be involved in various services that will help everyone communicate. I'll show the positive aspects of the project to the community, and will share with and care for people who need assistance. Identify issues and/or opportunities for collecting data. Explore organizational culture and help others follow the event privacy. Mohamedudhuman05 (talk) 16:15, 14 October 2016 (UTC)

Community notification

Please paste links below to where relevant communities have been notified of your proposal, and to any other relevant community discussions.

Endorsements

Do you think this project should be selected for a Project Grant? Please add your name and rationale for endorsing this project below! (Other constructive feedback is welcome on the discussion page).

  • I think it will be good for languages and Wikimedia projects. Davidpar (talk) 13:13, 12 October 2016 (UTC)
  • I have been personally involved in Wikitongues and this project will help us in achieving our goal for "the sum of all human knowledge" by documenting and strengthening communities of "every language in the world".--Satdeep Gill (talk) 16:41, 13 October 2016 (UTC)

References

  1. Davis, Wade (February 2003). "Dreams from endangered cultures". TED. Retrieved October 3, 2016.
  2. Vaughan-Lee, Emmanuel (August 18, 2014). "Who Speaks Wukchumni?". The New York Times. Retrieved October 3, 2016.
  3. Norris, M. J. (2003). From generation to generation: Survival and maintenance of Canada's Aboriginal languages within families, communities and cities. Aboriginal Strategies Conference. Edmonton, AB.
  4. Norris, M. J. (2006). "Aboriginal languages in Canada: Emerging trends and perspectives on second language acquisition". Aboriginal Policy Research: Moving forward, making a difference 3: 197–226.
  5. Grenoble, L.A.; Whaley, L.J. (2006). Saving Languages: an introduction to Language Revitalization. New York: Cambridge University Press.
  6. Haugen, E.; Bloomfield, M. (1974). Language as a human problem. New York: W.W. Norton & Company Inc.
  7. McIvor, O. (2005). Building the Nests: Indigenous Language Revitalization in Canada Through Early Childhood Immersion Programs. (M.A.). University of Victoria.
