Grants talk:Programs/Wikimedia Community Fund/Rapid Fund/Digitalizing Punjabi Manuscripts (Phase II) (ID: 22024557)

From Meta, a Wikimedia project coordination wiki

Initial suggestions from the South Asia Regional Funds Committee

Hi @Gaurav Jhammat, thank you for your application. The South Asia Regional Funds Committee wants to note that all ideas towards the preservation of culture, history, and knowledge are essential, and that employing technology to ensure such a transfer happens and is subsequently made available to Wikimedians is indeed a very good idea.

Please go through some of our suggestions and comments and provide as detailed responses as possible:

In the application:

A. It is mentioned that “First one is we are doing some research about manuscripts which are available to us but not in the public domain.”

  1. Does that mean the manuscripts are unpublished?
  2. Are these handwritten manuscripts?
  3. Any idea of ‘the year’ these manuscripts were written?
  4. What is the copyright status of these manuscripts?

B. It is mentioned that “13 Visits on 4 different locations”

  1. Is it possible to mention the 4 different locations and the base location?
  2. Is it possible to bring the manuscripts to the base location and then scan them?
  3. Why are 13 visits necessary? Can a different approach reduce the budget and time implications?

C. In the budget sheet, on the summary page, it is mentioned as “Scanning-99,500-INR 5 per page (19,900*5=99,500)”

  1. As you are aware, a lot of Wikisource digitisation work happens as part of volunteer efforts, and that model should be pursued before fixing a budget line item for scanning charges.
  2. To whom are the scanning charges mentioned in the application being paid?

D. For the paid position it is mentioned “20 hours a month” & “20,000 per month for six months”.

  1. We would need some reference for the determination of the hourly wages, along with a description of the role and the outcome expected from this position.

E. Metrics and Evaluation:

  1. In the metrics part, instead of giving the number of manuscripts (6) as the target, it would be better to give the estimated number of pages that will be scanned.
  2. In phase one, many of the scanned pages were skewed, and some of them took effort to read properly. What measures is the current team taking to overcome this problem?
  3. In phase one there was good speed in scanning, like scanning 1000+ pages within a day. In the current proposal, six months are requested. How do we explain this?
  4. We would also recommend that more information about the possible partnerships emerging from this project should be provided.

On behalf of the South Asia Regional Funds Committee THasan (WMF) (talk) 14:18, 10 December 2022 (UTC)

Hi @THasan (WMF), thank you for such valuable feedback and for raising your concerns about this project. I really appreciate your suggestions and comments, which I think are genuine too. Coming back to the application, I am answering in the order your concerns were written:
In the application:
A. It is mentioned that “First one is we are doing some research about manuscripts which are available to us but not in the public domain.”
  1. Does that mean the manuscripts are unpublished?
  2. Are these handwritten manuscripts?
  3. Any idea of ‘the year’ these manuscripts were written?
  4. What is the copyright status of these manuscripts?
Answers:
1. Yes, these are handwritten historical, religious manuscripts. Some of them do have modern published versions but preserving the handwritten versions is one of the purposes of this project.
2. Yes, as mentioned above. They are all handwritten manuscripts.
3. Please see this:
  • Dasam Granth by Bhai Mani Singh - early 18th century
  • Handwritten Gurbani of Guru Gobind Singh ji’s time - Late 18th century
  • Handwritten Account of Maharaja Ranjit Singh Darbar - Mid 18th century
  • Janamsakhi by Bhai Bala - 1890 A.D.
  • Handwritten Manuscript of Guru Granth Sahib of Maharaja Ranjit Singh's time - Mid 18th century
  • Panth Prakash By Giani Gian Singh - 1889 A.D.
4. Since they are from the 18th or 19th century, the manuscripts are in the public domain but not publicly accessible. That is, they are free of copyright but kept in places that the public cannot ordinarily access.
B. It is mentioned that “13 Visits on 4 different locations”
  1. Is it possible to mention the 4 different locations and the base location?
  2. Is it possible to bring the manuscripts to the base location and then scan them?
  3. Why are 13 visits necessary? Can a different approach reduce the budget and time implications?
Answers:
  1. Yes, we have mentioned the cities (Bathinda, Patiala, Rajpura, and Amritsar districts) but have not given the personal addresses, because the manuscripts are kept at the homes of the caretaking families. These families have been taking care of the manuscripts for generations.
  2. No, it is not possible to move the manuscripts. They have religious value for the host families, and for the safety of the manuscripts the families will only allow them to be digitized on site.
  3. The manuscripts are not all kept at the same location, so we have to make multiple visits. Scanning a manuscript is very different from scanning an ordinary book: manuscript pages need to be cleaned before scanning, and we work on weekends. For better results, we have reduced the estimated daily scan count to approximately 750 pages. If we get good results at a higher rate, the number of visits might be reduced, but for now we are budgeting for the maximum number of visits needed, keeping in mind that unavoidable events may prevent us from working at the same pace as in the previous project.
C. In the budget sheet, on the summary page, it is mentioned as “Scanning-99,500-INR 5 per page (19,900*5=99,500)”
  1. As you are aware, a lot of Wikisource digitization work happens as part of volunteer efforts, and that model should be pursued before fixing a budget line item for scanning charges.
  2. To whom are the scanning charges mentioned in the application being paid?
Answers:
  1. Yes, we are aware that a lot of Wikisource digitization work happens as volunteer work, and the Punjabi community is already doing this kind of work on Punjabi Wikisource. As I already said, these manuscripts are kept at specific places, so we cannot move them all to one common place and scan them. This is very different from scanning regular texts and needs more time and energy. Punjabi community members have said that they are willing to do the scanning work. Four members (Jugraj Singh, Jagseer S Sidhu, Jagvir Kaur and Mulkh Singh) will come for scanning on a particular visit but will need to dedicate their time for the entire duration of that visit. As it is not a leisure activity that volunteers can do from the comfort of their homes at their own pace, we think we should reimburse them for the time they dedicate to this project. These members are already working in a voluntary capacity. For example, Jugraj Singh is an active Punjabi Wikimedian on Wikimedia Commons. He was also part of the field visits in our last manuscript scanning project. Jagseer S Sidhu is leading a Punjabi Audiobooks Project, which is related to text-to-speech conversion of books available on Punjabi Wikisource. Mulkh Singh is an active Wikisource editor and has brought a collection of Punjabi fiction books into the public domain via Wikisource. Jagvir Kaur is also an active Wikisource editor and part of Punjabi Wikisource collaborations. These three have been a part of the Wikisource Advance Training Workshop organized by CIS-A2K. They will continue to work in their volunteer capacity along with other community members to upload these works to Commons and to integrate them further with Wikisource and other projects, including Wikidata and Wikipedia.
  2. I am responsible for every activity under this project. I will supervise and train the community members for scanning. They will join visits as per their availability. Our focus is to engage more members in the team so that we have editors who know the full process of digitizing books, from scanning through transclusion on Wikisource to validation.
D. For the paid position it is mentioned “20 hours a month” & “20,000 per month for six months”.
  1. We would need some reference for the determination of the hourly wages, along with a description of the role and the outcome expected from this position.
Answers:
  1. Apologies, the work would be 20 hours per week, not the 20 hours per month mentioned in the proposal.
For reference, a guest lecturer is paid 500 to 1000 INR per hour.
As project coordinator, my work will include four types of responsibilities:
  • Team Management: Managing and supporting multiple workflows, such as organizing regular online meetups for preparation, designing the schedule of visits, and allocating duties to team members (for scanning work).
  • Logistics: Organizing the visits, including travel and accommodation.
  • Training: Training and supervising community members on the digitization workflow, including editing scanned manuscripts and converting them into PDFs.
  • Documentation: Written documentation (including a possible Diff post) and report writing for the project.
E. Metrics and Evaluation:
  1. In the metrics part, instead of giving the number of manuscripts (6) as the target, it would be better to give the estimated number of pages that will be scanned.
  2. In phase one, many of the scanned pages were skewed, and some of them took effort to read properly. What measures is the current team taking to overcome this problem?
  3. In phase one there was good speed in scanning, like scanning 1000+ pages within a day. In the current proposal, six months are requested. How do we explain this?
  4. We would also recommend that more information about the possible partnerships emerging from this project should be provided.
Answers:
  1. As mentioned above, the manuscripts have approximately 20,000 pages in total. The metric is given as 6 because 6 files will be uploaded to Wikimedia Commons.
  2. Phase one was our pilot project, and we did not know the scanning process or the obstacles involved in scanning manuscripts. We have learned some lessons from it and will try to improve in this phase. In the first phase, we had to do a lot of post-scanning work to edit the scanned photographs. This time we will check that the room light and the scanner angle and position are suitable, and rescan if needed. From the first phase, we still have the scanned images in good condition; the pages got skewed during editing or conversion to PDF. This problem arose from a lack of technical awareness and knowledge of the software we were using to convert images into PDF, which converted the images to black and white and also erased some of the written text. In this phase, we will take care of this and also train editors in the conversion. Our community has done this kind of scanning work in the past too, so we will conduct a training exercise in which experienced editors will explain the scanning process, how to handle the scanner for differently sized or very large manuscripts, and conversion into other formats.
  3. As we already said, that was our pilot project. In addition, we had a limited budget to complete the target, so we could not organize a follow-up exercise or rescan pages where needed. This time we have multiple locations and twenty thousand pages to scan, so we do not want to risk poor-quality scans. Also, I am managing and coordinating this project, and coordination, visits, and post-scanning work such as editing images, converting them to PDF, and report writing will take time. That is why we have requested six months for this project. And yes, 1000+ pages a day is a good speed, but this time we will focus on good-quality scanning (as raised by the committee in the previous question) rather than speed.
  4. We are not collaborating with any specific institution in this project, but after the pilot project we received information about other Sikh families who are taking care of old manuscripts and heritage. Partnerships may be possible in the future during transclusion and Wikisource work, as we would need researchers and language experts to help Wikisource editors with proofreading and validation.
I hope my answers and explanations have satisfied your concerns. Gaurav Jhammat (talk) 06:00, 16 December 2022 (UTC)

Recommendation from the South Asia Regional Funding Committee

@Gaurav Jhammat: Thank you for responding to our observations in detail. While we are confident about most of the planning and process we would still like to make two suggestions for a better implementation of the program. South Asia Regional Funding Committee will finalise its funding recommendations based on your response to these suggestions.

  1. In the metrics section, the target can be stated as 6 books containing a total of about 19,900 pages. This would make it easy to ascertain whether the target is achieved.
  2. Paying Rs. 5 per scanned page after reimbursing travel, food, stay, etc. amounts to paid editing, and we would like the organisers to re-evaluate this allocation.

On behalf of the South Asia Regional Funding Committee THasan (WMF) (talk) 07:36, 9 January 2023 (UTC)

Hi @THasan (WMF), thanks for the valuable recommendations. The proposal has been edited as per your suggestions. We apologise: we have noted that the scanning charges mentioned in the budget sounded like paid editing. To resolve this, we have changed the budget according to your suggestions and have somewhat increased the appreciation budget for scanning volunteers, advisors and project contributors. We will thank them with gifts or souvenirs rather than paying them scanning charges. Thanks again. Gaurav Jhammat (talk) 16:34, 15 January 2023 (UTC)

Funding recommendation from the South Asia Regional Funding Committee

@Gaurav Jhammat: The South Asia Regional Funding Committee highly appreciates that the feedback was received in a positive way. After diligent analysis of the updated proposal, the committee has decided to reduce the funding in certain areas, and an amount of Rs. 3,10,000/- has been approved for the project.

We wish the organisers and volunteers a productive engagement and would like to receive regular updates about the progress of the report. On behalf of the South Asia Regional Funding Committee THasan (WMF) (talk) 08:28, 24 January 2023 (UTC)

Thanks @THasan (WMF) for appreciating the feedback and approving the proposal. We are very grateful to you for this, and thank you for your wishes. We assure you that we will give regular updates about the project. -- Gaurav Jhammat (talk) 13:25, 24 January 2023 (UTC)

Project date extended

This is a confirmation that Gaurav Jhammat has requested a project end date extension via email. The extension request has been approved. The new project end date is 31 May 2024, with a reporting due date of 30 June 2024. DSaroyan (WMF) (talk) 11:31, 19 January 2024 (UTC)