Grants talk:IEG/Fundación Joaquín Díaz

From Meta, a Wikimedia project coordination wiki

Finalize your proposal by September 30!

Hi Pusazul. Thank you for drafting this proposal!

  • We're hosting one last IEG proposal help session in Google Hangouts this weekend, so please join us if you'd like to get some last-minute help or feedback as you finalize your submission.
  • Once you're ready to submit it for review, please update its status (in your page's Probox markup) from DRAFT to PROPOSED, as the deadline is September 30th.
  • If you have any questions at all, feel free to contact me (IEG committee member) or Siko (IEG program head), or just post a note on this talk page and we'll see it.

Cheers, Ocaasi (talk) 20:07, 25 September 2014 (UTC)


Hi, Pusazul,

Thanks for this proposal. Can you please describe a bit more clearly which tasks the personal remuneration part of your budget would cover? The content being donated by this GLAM project sounds wonderful, but WMF grants generally do not fund people to create content. Uploading these files to Commons or Wikisource yourself would be content creation (rather than, say, organizing a group of volunteers to do this task). Any further details you can share about your plans to help clarify this would be useful. Cheers, Siko (WMF) (talk) 23:14, 2 October 2014 (UTC)

Hi Siko! Sure, I can explain a bit more about this project.
The personal remuneration would cover the curation work that someone (me, in this case) has to carry out before the recordings can be uploaded to Commons. The archive comprises ~23,000 songs, mainly recorded in the 1970s, but the current database does not specify what kind of recording each file contains. That makes it necessary to listen to 45-60 seconds of each record in order to register in the database the kind of recording that we are processing.
Altogether, we are looking at 300-400 hours of listening to the recordings in order to create a valuable database that can later be uploaded to Commons. That is why the Spanish chapter thought that an IEG could be a great solution for this GLAM project: the current financial situation of most Spanish cultural institutions does not allow them to hire someone to perform these tasks so that they can donate a good database. Therefore, we may see this as an innovative approach to solving a key problem for the Wikimedia movement, especially in Spain: the lack of funds may prevent some GLAM-cooperation opportunities.
So, in the end, after ~4 months I should have been able to create a better database, and then the uploading process will be possible and easier (in other words, the money is needed for the processing time, around 80% of the total time, and the content creation is just a natural consequence of the previous work).
Finally, I think it is clear that I am not creating content on my own but requesting this grant to make it possible for the FJD contents to be uploaded under a free license to Wikimedia servers. And, of course, all the tools that I may create in this project will be made available for reuse by others on WMFLabs, if possible.
Best regards. Pusazul (talk) 15:37, 3 October 2014 (UTC)
Thanks for the quick and thorough response, Pusazul! Based on this understanding that your paid time would be spent on the processing/curation of the database rather than the uploading itself, I'm going to go ahead and mark this proposal eligible now. Feel free to edit your budget section to clarify this item further in your proposal too - it will be useful for our reviewers to understand that this part of the budget is for curation and processing of the materials so that they can be donated to Commons. Best wishes, Siko (WMF) (talk) 20:27, 3 October 2014 (UTC)
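
The listening-time estimate discussed in this thread (about 23,000 recordings, with 45-60 seconds of listening per recording) can be checked with a short calculation. The following is only a minimal sketch of that arithmetic; the function name `listening_hours` is a hypothetical helper, and time spent entering results into the database is not modelled.

```python
def listening_hours(n_recordings: int, seconds_per_recording: float) -> float:
    """Total listening time in hours for a fixed sample length per recording."""
    return n_recordings * seconds_per_recording / 3600


# Figures quoted in the thread: ~23,000 recordings, 45-60 s of listening each.
N_RECORDINGS = 23_000
low = listening_hours(N_RECORDINGS, 45)
high = listening_hours(N_RECORDINGS, 60)
print(f"{low:.1f} to {high:.1f} hours")
```

This prints `287.5 to 383.3 hours` of pure listening, which is roughly consistent with the 300-400 hour figure if some extra time per recording is assumed for updating the database.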

Questions

Hi! I like IEGs to develop software tools, but I have some questions about your project:

  1. What is your actual engagement in the Wikimedia movement? I see only 100 edits across all Wikimedia projects.
  2. Have you developed any tools for WMFlabs or the old Toolserver?
  3. Do you know whether Spanish Wikisource allows transcriptions of sound recordings?
  4. You talk about your goal: "Publish those recordings with transcripts on Spanish Wikisource." Will you transcribe these sound recordings to Wikisource? Are you talking about paid editing? Are these edits outside your IEG's scope?
  5. I'm not sure about the real Spanish copyright status of these sound recordings. Are you sure that Fundación Joaquín Díaz holds the copyright to these sound recordings? Are you familiar with Spanish copyright issues?

Good luck!--KRLS (talk) 23:57, 2 October 2014 (UTC)

Hi KRLS!
1. I registered as a user on the Spanish Wikipedia in 2007, and it is true that I have not been very active since then, mainly because I identify myself as a WikiGnome, so I have made most of my edits anonymously.
Even now, I sometimes forget to log in before editing, as you may have seen on this grant request or on the Catalan Wikipedia Taverna. I am trying to change this from now on ^^'
Nevertheless, I have been in touch with some members of Wikimedia España in my city, Valladolid, since at least 2010 and I am a member of the chapter.
2. No, I have not developed any tools for WMFlabs/Toolserver so far.
3. Yes, the Spanish Wikisource allows transcripts of songs (here is an example). That means that this GLAM project would fit perfectly on Wikisource (we would have both the transcript and the song file, in some cases).
4. Transcribing the recordings is not one of my planned tasks in this IEG (that would be paid editing, as you point out). The FJD is providing the transcripts of some recordings, which makes the upload to Wikisource possible. Therefore, I will only upload to Wikisource the transcripts donated by the FJD.
5. Joaquín Díaz and his foundation (FJD) are the copyright holders of the whole archive, bearing in mind that almost all the recordings are folk songs that may be in the public domain. However, in case some songs have been used on CDs published by Joaquín Díaz and are subject to copyright restrictions, the FJD has granted an OTRS permission under CC-BY-SA 3.0 for the whole collection.
Best regards, Pusazul (talk) 15:57, 3 October 2014 (UTC)

Eligibility confirmed, round 2 2014

This Individual Engagement Grant proposal is under review!

We've confirmed your proposal is eligible for round 2 2014 review. Please feel free to ask questions and make changes to this proposal as discussions continue during this community comments period.

The committee's formal review for round 2 2014 begins on 21 October 2014, and grants will be announced in December. See the schedule for more details.

Questions? Contact us.

Siko (WMF) (talk) 20:28, 3 October 2014 (UTC)

Categorisation

Hello,
actually, I really like this proposal, as our projects are badly in need of sound recordings. Nevertheless, I want to ask how you will ensure that those 23,000 sound files are categorised reasonably well. Until now, most of the photos from the previous Fundación Joaquín Díaz donation don't have any descriptive categories, similar to many other GLAM donations. What do you propose? Does the Fundación Joaquín Díaz have any categorisation system that we could use as well? --Jcornelius (talk) 17:10, 10 October 2014 (UTC)

Hi JCornelius! The system that the members of WMES used to categorize the pictures from the FJD was to place them in the categories related to the places shown (in some cases these pictures are the only ones available on Commons). Nevertheless, as you point out, categorization matters more for this GLAM donation than it did for the pictures.
The FJD is providing some information about the songs that we can use in the categorization process, but it is not as detailed as we would like, so I will put special effort into this point (I plan to categorize the songs by author, place, year, recorder, content, whether they are completely recorded or have missing parts, etc.). Best regards Pusazul (talk) 20:36, 16 October 2014 (UTC)
Hi Pusazul, thanks for your answer and thanks for putting some effort into the categorisation! Actually, arranging or managing GLAM donations is not even the most difficult (or most work-intensive) part; it is this "stupid" and "boring" work afterwards ;) So, good luck and patience to you! --Jcornelius (talk) 21:55, 16 October 2014 (UTC)

questions from rubin16

Hello! I wanted to clarify some points of the proposal - help me with it, please :)

  • Where are the recordings stored at the moment? Will they send you a DVD or publish them somewhere on an FTP/cloud drive? rubin16 (talk) 13:25, 12 October 2014 (UTC)
  • What is the approximate total size of the files? How many GBs? rubin16 (talk) 13:25, 12 October 2014 (UTC)
  • Who will perform the individual review of the database? rubin16 (talk) 13:25, 12 October 2014 (UTC)
  • Have you thought about releasing the files to some cloud drive and posting the database somewhere, on GitHub for example, so that all interested people could be involved in verifying the database? We at Wikimedia Russia had a good experience verifying the WikiLovesMonuments databases: despite the database size (tens of thousands of items), the wiki community was interested and invested a lot of time in the verification. If you could post the files somewhere available to everyone before the transfer to Commons, and post the database where it can be easily edited, that could be a good step towards decreasing your workload. rubin16 (talk) 13:25, 12 October 2014 (UTC)
Hi Rubin!
  • The FJD holds the recordings on DVDs at their sound archive. They have transferred 3 CDs to us via WeTransfer.
  • The average size of the first 59 files that the FJD has sent me is 2.75 MB (median = 2.22 MB), so for ~23,000 files we can assume that the total size is about 62 GB. To be more accurate, a 95% confidence interval is (48 GB, 76 GB).
  • I will do that job in coordination with the FJD to preserve the accuracy of the processed database.
  • I think that the sheer size of the collection makes it impractical to upload it to the cloud for processing and then upload it again to Commons.
Thanks for your comments :-) Pusazul (talk) 21:43, 21 October 2014 (UTC)
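
The size estimate in the reply above extrapolates from a sample of 59 files to the whole collection. A minimal sketch of that kind of calculation follows. The individual file sizes were not posted, so the confidence interval cannot be reproduced here; the function `estimate_total_gb` is a hypothetical helper, the interval uses a normal approximation (an assumption about the method), and 1 GB is taken as 1024 MB, which is the convention that reproduces the quoted figure of about 62 GB.

```python
import math
import statistics


def estimate_total_gb(sample_mb, n_files, z=1.96):
    """Extrapolate total collection size from a sample of file sizes (in MB).

    Returns (estimate, lower, upper) in GB, using a normal-approximation
    confidence interval for the mean file size (z=1.96 for roughly 95%).
    """
    mean = statistics.mean(sample_mb)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(sample_mb) / math.sqrt(len(sample_mb))
    to_gb = n_files / 1024  # mean MB per file -> GB for the whole collection
    return mean * to_gb, (mean - z * sem) * to_gb, (mean + z * sem) * to_gb


# Point estimate only, from the figures quoted in the thread:
# ~23,000 files with a sample mean of 2.75 MB each.
point_estimate_gb = 23_000 * 2.75 / 1024
print(round(point_estimate_gb))
```

The point estimate rounds to 62. Passing the 59 measured sizes to `estimate_total_gb(sizes, 23_000)` would give the corresponding interval under these assumptions.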

Lossless

Needs specification of source and target formats. As for audio, it clearly should be FLAC. There is a bugzilla request somewhere to enable FLAC and it shouldn't be hard if only you open a discussion on Commons; however if you can't upload to Commons I want the FLACs to be uploaded to archive.org (use [1]), they already have several such collections. --Nemo 08:51, 17 October 2014 (UTC)

I had planned to upload the files in .ogg format. Pusazul (talk) 19:56, 21 October 2014 (UTC)
That's lossy. Please ensure you upload the lossless files (FLAC) somewhere. --Nemo 08:17, 29 October 2014 (UTC)
From Grants:IEG/Fundación Joaquín_Díaz/Final#Progress towards stated goals I was sad to discover that only ogg files have been produced and uploaded. This is a big mistake and seems to prove that digitisation efforts should be conducted by professionals to guarantee sufficient quality and value for money. Nemo 08:31, 11 June 2016 (UTC)

Sustainability & scalability

Hi there. Thanks for your proposal - it sounds like a really exciting GLAM project!

I'm wondering if you could share more information about what kind of database the audio files are currently stored in, and if you plan to document your work in improving the database in preparation for the uploading process? I ask because I see your project as having a lot of relevance to other cultural heritage institutions embarking on similar projects; as you pointed out, many of these institutions are underfunded which makes it very difficult to pay for the necessary equipment or skilled staff. In my opinion, one valuable outcome of this project would be if, upon completion, it would be then possible for other institutions (with or without technical expertise) to easily adopt your process and use the new bot to upload their own collections.

I'm also curious as to the cataloguing of files—above you mentioned plans to categorize the files by author, place, year, recorder, content, etc.—with regards to the development of a taxonomy to describe the content, will this come from yourself or will it be developed in collaboration with the appropriate WMF communities (Wikimedia Commons, Spanish Wikisource, etc.)? As others have noted, large collections of materials donated by GLAM institutions are only useable if they are organized in a way that allows users to easily locate them.

Thanks in advance.
-Thepwnco (talk) 17:39, 19 October 2014 (UTC)

Hey Thepwnco. The database is stored in CSV files; each CD of the collection has its own CSV file. I think that this program can lead to a great GLAM experience, so I really plan to document all the steps that I follow during the IEG period and include them in the final report, should the grant be awarded. As you point out, this cooperation project may be interesting for other institutions, so I will try to make all the tools/scripts that I use in the process as versatile as I can.
Regarding categorization, I plan to collaborate with my colleagues at Wikimedia España (most of them are long-time Wikimedians) and the members of the community in the projects, so they can point out if I am missing any important categorization aspects (I really want this to be as complete as I/we can manage so others can find the files :-)!). Thanks for your comments. Pusazul (talk) 22:01, 21 October 2014 (UTC)
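
Since the reply above says the catalogue is kept as one CSV file per CD, one possible first processing step is merging those files into a single table. The following is only an illustrative sketch, not the tooling actually used in the project: the function `merge_catalogues`, the `source_cd` column, and the `title`/`place` columns and sample values are all hypothetical, because the real column names are not given in this discussion.

```python
import csv
import io


def merge_catalogues(named_files):
    """Merge per-CD CSV catalogues into one list of rows.

    `named_files` is an iterable of (cd_name, file_like) pairs. Each row keeps
    its original columns and gains a hypothetical "source_cd" column, so every
    recording can be traced back to the CSV file it came from.
    """
    merged = []
    for cd_name, handle in named_files:
        for row in csv.DictReader(handle):
            row["source_cd"] = cd_name
            merged.append(row)
    return merged


# Hypothetical two-CD example with made-up columns and values.
cd1 = io.StringIO("title,place\nSong A,Place X\n")
cd2 = io.StringIO("title,place\nSong B,Place Y\n")
rows = merge_catalogues([("CD1", cd1), ("CD2", cd2)])
print(len(rows), rows[0]["source_cd"], rows[1]["title"])
```

With real files, each handle would come from something like `open(path, newline="", encoding="utf-8")`; the encoding is an assumption and would need to match the FJD's exports.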

Licensing

How do you know that all of this content is available under a truly free license? Many archives hold materials that are not in the public domain. Sound recordings from the 1970s might well contain copyrighted songs, etc. (Also note that if Fundación Joaquín Díaz is not the copyright holder, it cannot release the materials under a free license.) Can you explain your thinking on this point? Thanks, Calliopejen1 (talk) 18:09, 27 October 2014 (UTC)

Hi there. These recordings can be considered to be in the public domain according to the Spanish Intellectual Property Law (Ley de Propiedad Intelectual, LPI, 1/1996). If we can consider these songs anonymous, the LPI in art. 6 and art. 27 states that the copyright term of the songs is 70 years. As I have pointed out above, the FJD has granted an OTRS permission releasing the contents under CC-BY-SA in case they hold any copyright over the distribution of the songs, in compliance with the Spanish LPI. FJD has promised to release all the material which can be placed under these two clauses.
Pusazul (talk) 23:51, 4 November 2014 (UTC)

Aggregated feedback from the committee for Fundación Joaquín Díaz

Scoring criteria (see the rubric for background). Scores range from 1 (weak alignment) to 10 (strong alignment).
(A) Impact potential: 7.8
  • Does it fit with Wikimedia's strategic priorities?
  • Does it have potential for online impact?
  • Can it be sustained, scaled, or adapted elsewhere after the grant ends?
(B) Innovation and learning: 7.3
  • Does it take an innovative approach to solving a key problem?
  • Is the potential impact greater than the risks?
  • Can we measure success?
(C) Ability to execute: 7.4
  • Can the scope be accomplished in 6 months?
  • How realistic/efficient is the budget?
  • Do the participants have the necessary skills/experience?
(D) Community engagement: 7.3
  • Does it have a specific target community and plan to engage it often?
  • Does it have community support?
  • Does it support diversity?
Comments from the committee:
  • Great GLAM project. The goals are worthy.
  • Fits with Wikimedia's strategic priority to improve quality
  • Unique material (unique in that the material is audio, Spanish language, and traditional/folk culture).
  • Seems realistic and feasible. Project lead has the requisite skills and abundant enthusiasm, and access to the material.
  • The FJD has also previously uploaded a collection of freely licensed materials to Commons - further evidence of their credibility.
  • Has the potential to be adapted elsewhere, particularly if appropriate documentation for other cultural institutions to use is created and the categorization scheme of audio files is found to be useful.
  • Risks are low
  • Efforts have been made to notify the appropriate communities, quite a few endorsements. Nice community and chapter support
  • Processing 60 records per hour may be unrealistic.
  • Provided measures of success by the participants may not give much of an indication of impact.
  • Little innovation in terms of methodology, and may not be scalable. As mentioned on the talk page, it'd be truly innovative to structure the database work in a way that allows for the community to help or provide input; as it stands, what I see as the most important and innovative learning for the larger Wikimedia is the resulting documentation of best practices and procedures for similar cultural institutions (who might be underfunded or lack technical capacity) to follow in order to prepare and upload their content to Commons.
  • Unclear to what extent materials will be used by communities without any additional support or coordination.
  • Would appreciate clearer focus on developing documentation of best practices and processes for other cultural institutions. The proposal also does not require much community engagement or participation
  • Could volunteers help listen to the files?
  • Licensing issues could be a concern – would like to hear more about this
  • With volunteers and (some) support from Wikimedia Spain, the project could be accomplished within the budget and time proposed.

Thank you for submitting this proposal. The committee is now deliberating based on these scoring results, and WMF is proceeding with its due-diligence. You are welcome to continue making updates to your proposal pages during this period. Funding decisions will be announced by early December. --ΛΧΣ21 16:53, 13 November 2014 (UTC)

Adding 1 measure of success

Hi Pusazul. We've discussed this already privately, but I just wanted to capture it here as well, for the record: One of your stated goals is to "Publish those recordings with transcripts on Spanish Wikisource" and in your community engagement section you do state that you will engage the Spanish Wikisource community (although it would be nice to see some additional ideas for how to do this as your project progresses as well - let's keep talking about this). But, what we are still missing for the moment is a target in your "measures of success" section aimed at incorporating these recordings on Wikisource. It would be fine to clearly state that you are targeting X number of recordings/transcripts published on Wikisource within 12 months of the grant's start, if you feel it is impossible to commit to such a thing happening during the 6 months of your IEG (it makes sense to me that you first have to focus on the pre-processing etc before there will be anything useful for Wikisource). Regardless, we would like to see that you have at least one clear target (even if it must be longer term) that is aimed at impacting your stated target wikis. If you have further questions about this, please let me know - happy to discuss further. Cheers, Siko (WMF) (talk) 23:12, 25 November 2014 (UTC)

Thanks for the addition - confirming that 10% within 12 months seems reasonable. Cheers, Siko (WMF) (talk) 19:46, 3 December 2014 (UTC)

Committee advisor

Hi Pusazul,

Superzerocool, from the IEG committee, has kindly offered to serve as another project advisor on this proposal. In addition to the wonderful community volunteer advisor you already have in Rastrojo, we like to offer you either a WMF staff member or an IEG committee member to serve as co-advisor, so that they can help you coordinate any IEG-related needs, and offer additional perspectives as your project progresses. Superzerocool is a Spanish speaker and I think you will get along quite well. To confirm that this suits you and you 2 have had a chance to connect here, can one of you please add Superzerocool to the list of advisors in your infobox? (in edit mode, you should now see a space for advisor2= ).

Best regards, Siko (WMF) (talk) 00:24, 4 December 2014 (UTC)

Great! I will really appreciate the help from Superzerocool and also from Dvdgmz.
Regards, Pusazul (talk) 00:09, 6 December 2014 (UTC)

Round 2 2014 decision

Congratulations! Your proposal has been selected for an Individual Engagement Grant.

The committee has recommended this proposal and WMF has approved funding for the full amount of your request, €3,556 / $4,506

Comments regarding this decision:
Thank you for adding a measure of success aimed at longer-term impact to Wikisource! Looking forward to seeing this project involve volunteers as your plans develop over time.

Next steps:

  1. You will be contacted to sign a grant agreement and set up a monthly check-in schedule.
  2. Review the information for grantees.
  3. Use the new buttons on your original proposal to create your project pages.
  4. Start work on your project!
Questions? Contact us.


-Siko (WMF) (talk) 18:23, 5 December 2014 (UTC)

Many congratulations!

Many congratulations and muchas felicitaciones to Wikimedia España and everyone involved! This is to me an inspiration in seeking contributions from outside Wikimedia and it will surely be a great contribution to Wikimedia Commons and Spanish Wikisource. Allan Aguilar (talk) 14:49, 6 December 2014 (UTC)