Grants:IEG/Fundación Joaquín Díaz
What is the problem you're trying to solve?
Fundación Joaquín Díaz (FJD) is ready to release its extensive collection of unique sound recordings, comprising about 23,000 files, and make them available to the public on the Wikimedia projects. The problem is that the institution lacks the resources either to publish them on the Internet (it cannot afford web hosting and related costs) or to hire someone to carry out this task.
What is your solution?
Working part-time on processing the database of Fundación Joaquín Díaz (FJD), cataloging the recordings and then uploading them to Wikimedia Commons and Wikisource, making the collection fully available on the Wikimedia projects.
This is the second phase of the Fundación Joaquín Díaz cooperation project, and its aim is to fund the processing and uploading of the whole archive. After the release of its photographic archive, in this phase we are interested in releasing its enormous sound archive, which would be the most important release of sound recordings made so far in the GLAM field.
These are the main goals for this project:
- Improve the cataloging of the collection as a first step.
- Process its database into something workable by a bot.
- Publish the approximately 23,000 records in the collection (folk songs, tales, romances and other recordings) on Wikimedia Commons, making them available on the Internet for the first time. Most of the archive was recorded in the 1970s.
- Publish those recordings with transcripts on Spanish Wikisource.
The current cataloging is inefficient and inadequate. Given that Fundación Joaquín Díaz (FJD) does not have enough financial resources, the goal is to improve the cataloging as a whole and to process the files and the database so that the collection can be made available online under a free license on Wikimedia Commons.
The contents of each recording are not known, so it is necessary to identify them individually in order to classify each one as a song, a story, a romance, etc.
The database is contained in CSV files. The cataloging is sometimes confusing and has several omissions that should be corrected before uploading the material to Wikimedia Commons. In most cases these errors and omissions must be verified individually before the files are uploaded.
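As a rough illustration of how the omissions in the CSV files could be located before manual verification, here is a minimal Python sketch. The column names (`title`, `place`, `year`) and the sample rows are invented for the example; the real FJD export may use different headers and fields.

```python
import csv
import io

# Hypothetical column names; the real FJD export may use different headers.
REQUIRED_FIELDS = ("title", "place", "year")


def find_incomplete_rows(csv_text, required=REQUIRED_FIELDS):
    """Return (row_number, missing_fields) for each record with empty required fields."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so the first data record is row 2.
    for row_number, row in enumerate(reader, start=2):
        missing = [f for f in required if not (row.get(f) or "").strip()]
        if missing:
            problems.append((row_number, missing))
    return problems


# Invented sample data, only to show the shape of the check.
sample = """id,title,place,year
1,Example song,Example village,1975
2,,Example town,
3,Example tale,,1978
"""

for row_number, missing in find_incomplete_rows(sample):
    print(f"row {row_number}: missing {', '.join(missing)}")
```

A report like this only lists the records that need attention; each flagged record would still have to be checked by listening to the recording.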
- Personal remuneration for related tasks: €580/month / $735/month, including taxes, for part-time work from December to May.
- Travel expenses: €76 / $96 for four planned trips from Valladolid to Urueña (the number of trips depends on how the files are transferred, whether we need to meet the archive team in person, etc.)
- The total amount requested is €3,556 / $4,506.
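The total can be checked from the two items above. This short Python sketch assumes that "December to May" means six monthly payments; the per-recording figure uses the collection size of about 23,000 stated in the proposal.

```python
MONTHS = 6  # December through May inclusive (assumed to mean six payments)
MONTHLY_EUR, MONTHLY_USD = 580, 735
TRAVEL_EUR, TRAVEL_USD = 76, 96
RECORDINGS = 23_000  # approximate collection size from the proposal

total_eur = MONTHS * MONTHLY_EUR + TRAVEL_EUR
total_usd = MONTHS * MONTHLY_USD + TRAVEL_USD
cost_per_recording_eur = total_eur / RECORDINGS

print(f"Total: EUR {total_eur} / USD {total_usd}")
print(f"Cost per recording: EUR {cost_per_recording_eur:.2f}")
```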
For this task it is necessary to listen to all the recordings (more than 23,000) one by one to know exactly what they contain, because this is not stated in the records provided by FJD.
Assuming that each track needs about 60 seconds of listening on average, and taking into account that there are more than 23,000 tracks, the result is close to 400 hours of work just for listening and recording the information. At 4 working hours per weekday, listening to the tracks would require four months and two weeks. An additional month and two weeks would be needed to process the database and carry out the upload.
Of that additional time, three weeks would be used to program and test a bot to upload the tracks, and the last three weeks would be used to upload the files and set up the database.
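The schedule above can be reproduced with a short calculation. This Python sketch uses the figures from the text and assumes five working days per week, which the proposal implies ("weekdays") but does not state as a number.

```python
TRACKS = 23_000            # approximate number of tracks
SECONDS_PER_TRACK = 60     # average listening time per track
HOURS_PER_DAY = 4          # working hours per weekday
WEEKDAYS_PER_WEEK = 5      # assumption: Monday to Friday
BOT_AND_UPLOAD_WEEKS = 3 + 3  # bot development plus upload and database setup

listening_hours = TRACKS * SECONDS_PER_TRACK / 3600
listening_weeks = listening_hours / HOURS_PER_DAY / WEEKDAYS_PER_WEEK
total_weeks = listening_weeks + BOT_AND_UPLOAD_WEEKS

print(f"Listening: {listening_hours:.0f} hours, about {listening_weeks:.1f} weeks")
print(f"Whole project: about {total_weeks:.1f} weeks")
```

Roughly 19 weeks of listening corresponds to the four months and two weeks mentioned above, and about 25 weeks in total fits within the six months from December to May.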
The Spanish Wikisource has a small community of contributors, so I expect to be able to coordinate closely with them. I also expect that this experience will bring more attention to the Wikimedia projects other than Wikipedia and to the communities contributing to them. Furthermore, this project can be useful to other community members carrying out similar GLAM cooperations in the future.
The current situation of Spanish ethnographic institutions is quite difficult, and one of the few collections available online is Alan Lomax's, mainly due to his American origin.[1] As a result, the most important part of Spain's traditional music recordings remains unknown to the general public.
Therefore, I expect that this project will give Wikimedia a new standing among Spanish archives, making them more interested in releasing content under a free license if that means it will be available on the Internet.
Furthermore, the Fundación Joaquín Díaz is planning to change its own website so that it can link to the contents hosted on the Wikimedia projects. Visitors will then be able to find the contents either way: by coming from the FJD website or by browsing the Wikimedia projects.
Measures of success
- Create a bot to process the database.
- Upload the 23,000 sound records.
- Geolocate the 23,000 records.
- The basic target is to transcribe at least 10% of tracks on Wikisource during the next 12 months, and a more ambitious goal would be the transcription of at least 15% of tracks.
Joaquín Díaz has just told us that they have some transcriptions, although only for a fairly small share of the recordings. Moreover, I have been testing some speech recognition software. These tools are far from perfect, but they can save a lot of work. However, human intervention is unavoidable.
We also have the support of the Wikimedia Spain community and, in order to engage the community further, we are considering organizing some kind of event (a WikiChallenge) in which the users who complete the most transcriptions would be rewarded (for instance, with a Wikimedia shirt and books donated by Joaquín Díaz for this purpose).
In this way we would increase participation in the project and so meet the objectives of increasing Wikisource's outreach and participation.
- Pusazul: I have been a Spanish Wikipedia user since 2007. I have a degree in Statistics and strong knowledge of database administration and data mining. One of my areas of interest is folklore and traditional culture, so I will be delighted to work on this project. I am a new member of the Spanish chapter.
- Rastrojo: Spanish Wikipedia user since 2006. Sysop on both es.wp and Commons. Member of the Board of Wikimedia España. I first contacted the Fundación Joaquín Díaz in July 2013 in order to set up a GLAM partnership, which resulted in the donation and upload of 2,765 images. I am still the contact person with the Fundación Joaquín Díaz, so I would participate only as advisor and coordinator of the workload between them and the Chapter/Pusazul.
Do you think this project should be selected for an Individual Engagement Grant? Please add your name and rationale for endorsing this project in the list below. (Other constructive feedback is welcome on the talk page of this proposal).
- I support this grant. I think that this project is very interesting for the Wikimedia projects. There is very little recorded material on Wikimedia Commons, and we are talking about preserving old and traditional songs, tales, etc. which are part of the culture of a country. And this is part of the sum of human knowledge, as we have in our motto. Every year it could become more difficult to collect that kind of material, because the people who know it are old. It could also be a new kind of collaboration for the Wikimedia movement that could be followed by other organizations. --Millars (talk) 21:57, 30 September 2014 (UTC)
- I support. The proposed budget represents 0.15 € per recording uploaded to Commons. It is difficult to be more cost-effective! --Hispalois (talk) 06:43, 1 October 2014 (UTC)
- I support the grant. As Millars says, it's a great opportunity to get a lot of recorded and free material; as well, it helps the preservation and diffusion of an important element of traditional culture in Spain. --Rodelar (talk) 08:15, 1 October 2014 (UTC)
- Support This is the kind of activity the WMF should spend its money on. It truly contributes to spreading free knowledge. --Discasto (talk) 12:39, 1 October 2014 (UTC)
- Support I think that this is a good opportunity to obtain precious material under a free license. Bye, --Elisardojm (talk) 22:11, 1 October 2014 (UTC)
- Support An opportunity we cannot afford to waste. It's relatively cheap anyway. Totemkin (talk) 13:14, 2 October 2014 (UTC)
- Support per above, and I want to add that having free material uploaded in an organized way that makes it usable from the Wikimedia projects or anywhere else is crucial and should be mandatory in every GLAM project involving massive uploads. Pusazul's work should make it possible to find, for example, a Christmas song which talks about the Three Wise Men, or a story about shepherds, and use them from Wikipedia articles. It takes time and dedication, so this grant is adequate. -jem- (talk) 19:38, 2 October 2014 (UTC)
- Support per Millars. --Alan (talk) 21:20, 2 October 2014 (UTC)
- Strongly support, per Millars, Discasto and -jem-; and because it is heritage that is hard to find and of great cultural value to humanity. This will definitely help the completeness of the Wikimedia articles and thematic units. --Zerabat (discusión) 10:55, 6 October 2014 (UTC)
- Support Well explained. It is a huge amount of multimedia files that couldn't be found anywhere else. --Coentor (talk) 19:44, 6 October 2014 (UTC)
- Support. Per -jem-. Allan J. Aguilar (Ralgis) 14:38, 9 October 2014 (UTC)
- Support A good project to digitize the heritage and make it publicly available. Best wishes --Visdaviva (talk) 12:03, 27 October 2014 (UTC)
- Support. I strongly support this proposal. The project brings together the cultural and knowledge rescue activism of the 1970s with the 21st-century free knowledge movement. It's a project about preservation, access and reuse that opens the opportunity to organize community activities to transcribe, document and contextualize on Commons, Wikisource and Wikipedia. It could be a role model for other cultural organizations similar to FJD. --Dvdgmz (talk) 08:13, 25 November 2014 (UTC)