
Indic-TechCom/Requests/Archive/2019

From Meta, a Wikimedia project coordination wiki

"Find and replace" tool for Assamese wikisource

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.

The "OCR" tool has been immensely helpful for Assamese wikisource, but it is not 100% accurate. While some obvious errors like Bengali "র" can be corrected while proofreading, other faults are rendered as correct by the fonts, such as "র" in conjuncts. We therefore need a bot and a gadget that can rectify these, as the "find and replace" tool in the standard editor or WikEd is very time-consuming. A gadget button will be helpful for general editors, and the bot can be run once in a while by a user with technical knowledge. This will be a real boon for As.wikisource users. Also adding @Simbu123: and @Mridul Kumar Sharmah: to the discussion.

An example of such an error can be seen here (source page: https://as.wikisource.org/wiki/%E0%A6%AA%E0%A7%83%E0%A6%B7%E0%A7%8D%E0%A6%A0%E0%A6%BE:%E0%A6%B6%E0%A6%99%E0%A7%8D%E0%A6%95%E0%A7%B0%E0%A6%A6%E0%A7%87%E0%A7%B1.pdf/%E0%A7%A8%E0%A7%AA)

List of intended corrections in sandbox here
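
To make the request concrete, here is a minimal sketch of the kind of bulk character correction being asked for. It is an illustration only, not the gadget that was later built: the function name `fixOcrText` and the `CORRECTIONS` table are hypothetical, and the only pair in the table is the Bengali "র" (U+09B0) to Assamese "ৰ" (U+09F0) substitution mentioned above. The real list of intended corrections is the one in the linked sandbox.

```javascript
// Hypothetical correction table. Only the Bengali RA -> Assamese RA pair
// comes from the discussion; further pairs would come from the sandbox list.
const CORRECTIONS = new Map([
  ['\u09B0', '\u09F0'], // র (Bengali RA) -> ৰ (Assamese RA)
]);

// Replace every occurrence of each "wrong" character with its correction.
// split/join is used so the search string is never treated as a regex.
function fixOcrText(text, corrections = CORRECTIONS) {
  let result = text;
  for (const [wrong, right] of corrections) {
    result = result.split(wrong).join(right);
  }
  return result;
}

// Usage: a conjunct containing Bengali RA (ka + virama + ra).
const sample = '\u0995\u09CD\u09B0';
console.log(fixOcrText(sample) === '\u0995\u09CD\u09F0'); // true
```

Because the replacement works on code points rather than on how the glyphs look, it also catches the cases inside conjuncts that are hard to spot by eye.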

Gitartha.bordoloi (talk) 15:17, 20 January 2019 (UTC)

@Titodutta: here. Gitartha.bordoloi (talk) 09:19, 23 January 2019 (UTC)
Gitartha Ji, working on it. --Jayprakash >>> Talk 11:55, 30 January 2019 (UTC)
@Gitartha.bordoloi: Bengali is my mother tongue, but Assamese looks quite distinct, in that I cannot perceive any difference between many of the pairs. In any case, if the proposal is to mass-substitute all instances of the left-hand characters with the right-hand ones (without much discretion), I can do a semi-bot run, but I need a list of target pages. Winged Blades of Godric (talk) 19:56, 4 February 2019 (UTC)
Gitartha Ji, Can you try Indic-TechCom/Tools/FixUnicode(AS)?--Jayprakash >>> Talk 12:44, 8 February 2019 (UTC)
Hi Jay, I added the gadget to my common.js. I tested some pages and found that the tool corrects only about one line at a time, so it takes multiple clicks to correct one page. It also does not replace some incorrect য় or ঢ়. Could you have a look at the tool again? My test pages were this and subsequent pages. Thanks. Gitartha.bordoloi (talk) 08:49, 9 February 2019 (UTC)
Hi all. I'm accepting defeat for now, as I've realized that য় (09DF), ড় (09DC) and ঢ় (09DD) cannot be corrected in wiki pages. Even if I write them correctly, they turn into "য (09AF)+.", "ড (09A1)+." and "ঢ (09A2)+." respectively after saving the page, even in meta script pages (though they will still look like the correct characters; you can press backspace and see). The same problem exists in Bengali. For other inconsistencies, User:দিব্য দত্ত has prepared a script here, which should do for the time being. Thank you. Gitartha.bordoloi (talk) 18:36, 13 February 2019 (UTC)
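
The behaviour described in the last comment matches Unicode normalization. The sketch below assumes that the wiki normalizes saved text to NFC, which is an inference from the symptoms rather than something confirmed in this thread. U+09DC, U+09DD and U+09DF are composition exclusions in Unicode, so NFC stores each of them as a base letter followed by the nukta U+09BC. The helper `codePoints` is a hypothetical name used only for this illustration.

```javascript
// Precomposed letters named in the discussion above.
const PRECOMPOSED = ['\u09DC', '\u09DD', '\u09DF']; // ড়, ঢ়, য়

// Hypothetical helper: list the code points of a string as 4-digit hex.
function codePoints(text) {
  return Array.from(text, (ch) =>
    ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

// Show what each letter becomes under NFC normalization.
for (const letter of PRECOMPOSED) {
  const stored = letter.normalize('NFC');
  console.log(codePoints(letter).join(' '), '->', codePoints(stored).join(' '));
}
```

If that assumption holds, a find-and-replace gadget cannot restore the single-code-point forms, because the text is split again on every save. The two forms are canonically equivalent and should render identically.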

The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.

Tool for transferring non free fair use media

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.

Many movie articles contain non-free fair use images. Many of the movie articles in English contain the non-free image, but the corresponding Indic language articles do not. We need a tool to transfer these non-free images from one wiki to another and automatically add the image to the infobox of the linked article in the target language. This will save a lot of time and improve the articles. -- Balajijagadesh (talk) 05:18, 5 February 2019 (UTC)

Balajijagadesh Ji, we have lots of work at this time. Please create an RfC at Indic-TechCom/Management/RfC and notify other communities as well. If your proposal gets a good amount of consensus, we will work on it soon. Thanks :)--Jayprakash >>> Talk 14:25, 8 February 2019 (UTC)
Balajijagadesh Ji, Done https://tools.wmflabs.org/wikifile-transfer/ Sorry for the delay. I had initially worked on the userscript phab:T217486.--Jayprakash >>> Talk 08:15, 25 September 2019 (UTC)

The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.

Twinkle menu appears distorted on Telugu Wikipedia

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.
Twinkle menu display error

The Twinkle menu tab shows grey triangles around it, unlike other tabs. I need help fixing this on Telugu Wikipedia.--Arjunaraoc (talk) 06:07, 11 March 2019 (UTC)

Hey Arjunaraoc Ji, per the image, there are two issues.
  • Extra space after the More dropdown.

Sol:- This is caused by మీడియావికీ:Gadget-Twinkle-pagestyles.css, which sets a 3.24em margin on the right side. To remove the extra space, set the margin to 0em, just like I did in w:te:వాడుకరి:Jayprakash12345/common.css.

  • Unnecessary arrows in the background.

Sol:- This comes from w:te:మీడియావికీ:Gadget-Twinkle.js, which sets a background image on the outer element. That is useless at this point. To remove the image from the background, remove the following code from w:te:మీడియావికీ:Gadget-Twinkle.js.

	if ( type === "menu" ) {
		// Fix drop-down arrow image in Vector skin
		outerDiv.style.backgroundImage = 'url("data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABYAAAAQCAMAAAAlM38UAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAAAA9QTFRFsbGxmpqa3d3deXl58/n79CzHcQAAAAV0Uk5T/////wD7tg5TAAAAMklEQVR42mJgwQoYBkqYiZEZAhiZUFRDxWGicEPA4nBRhNlAcYQokpVMDEwD6kuAAAMAyGMFQVv5ldcAAAAASUVORK5CYII=")';
		outerDiv.style.backgroundPosition = "right 60%";
	}

This part has already been removed from twinkle.js.

I think tewiki is using old Twinkle code, so my suggestion is to upgrade the Twinkle code. The fixes above are just temporary solutions.--Jayprakash >>> Talk 17:05, 11 March 2019 (UTC)
Thanks User:Jayprakash12345 for your suggestions. I applied them and the problem is fixed now. Regarding the upgrade, are you referring to the latest version that runs on enwiki, or to a custom version for Indian languages that I remember being discussed on these pages? --Arjunaraoc (talk) 06:54, 12 March 2019 (UTC)
After the fixes, the TW and 'మరిన్ని' tabs have a background color/shading different from other tabs. Any suggestion to fix that?--Arjunaraoc (talk) 06:56, 12 March 2019 (UTC)
Yes, I am referring to the latest enwiki version. As for the TW and 'మరిన్ని' tabs having a background color/shading different from other tabs, that is intended in general.--Jayprakash >>> Talk 20:29, 12 March 2019 (UTC)
@User:Jayprakash12345, Thanks for the clarification. When I did the initial import many years ago, it was a struggle to import all related templates and translate some of them to Telugu, as the original was not intended for non English usage. If you are comfortable to upgrade our TW without major issues, I can authorise you with the necessary rights. Let me know.--Arjunaraoc (talk) 04:42, 13 March 2019 (UTC)

The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.

Indic Wikisource Graph

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.

The Phe tool produces different kinds of graphs for all Wikisources: https://tools.wmflabs.org/phetools/stats.html We need two graphs of cumulative proofread pages, as shown in https://tools.wmflabs.org/phetools/graphs/Wikisource_-_proofread_pages.png (Cumulative pages, subdomain comparison section). We need the comparison only for Indic Wikisource proofread stats. The script is available at https://github.com/phil-el/phetools/blob/master/statistics/graph.py for your ready reference. Jayantanth (talk) 20:00, 16 March 2019 (UTC)

Jayantanth Ji, I have created the route for the API; see "Add route for API". It will be done before the end of May. :)--Jayprakash >>> Talk 12:32, 14 April 2019 (UTC)
Jayantanth Ji, Done https://tools.wmflabs.org/indic-wsstats/graph--Jayprakash >>> Talk 12:26, 3 May 2019 (UTC)

The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.

Mass rename script

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.

I need a mass rename script that supports both plain find & replace moves and regex-based find & replace moves. The script should have the following options (if possible):

  1. a field for copy-pasting pages
  2. an option for move without redirect (for sysop)
  3. an option for "Treat search string as a regular expression"

(Info: Two scripts exist for mass rename: c:User:Legoktm/massrename.js and c:User:Perhelion/massrename.js. c:User:Legoktm/massrename.js works on bnwiki but does not support regular expressions. c:User:Perhelion/massrename.js supports regular expressions but does not work on bnwiki.) --আফতাবুজ্জামান (talk) 01:37, 29 March 2019 (UTC)
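
The requested options can be sketched as a small planning function that turns a pasted list of titles into (from, to) pairs. This is a hedged illustration, not the MassMove tool that was eventually delivered: `planMoves` and `escapeRegExp` are hypothetical names, and the actual move (for example through the MediaWiki Action API `action=move` module, whose `noredirect` parameter corresponds to option 2) is deliberately left out.

```javascript
// Escape regex metacharacters so a plain search string matches literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build the list of moves from pasted titles (one title per line).
// useRegex corresponds to "Treat search string as a regular expression".
function planMoves(pastedTitles, search, replacement, useRegex) {
  const pattern = new RegExp(useRegex ? search : escapeRegExp(search), 'g');
  // In plain mode, "$" in the replacement must not act as a group reference.
  const safeReplacement = useRegex
    ? replacement
    : replacement.replace(/\$/g, '$$$$');
  return pastedTitles
    .split('\n')
    .map((title) => title.trim())
    .filter((title) => title !== '')
    .map((from) => ({ from, to: from.replace(pattern, safeReplacement) }))
    .filter((move) => move.from !== move.to); // skip titles that do not change
}

// Usage: plain mode, then regex mode with a capture group.
console.log(planMoves('Template:Foo 2018\nTemplate:Bar', 'Foo', 'Baz', false));
console.log(planMoves('A (2018)\nB (2017)', '\\((\\d{4})\\)$', '[$1]', true));
```

Separating the planning step from the move itself makes it possible to show the user a preview of every rename before anything is changed on the wiki.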

আফতাবুজ্জামান Is it for File namespace?--Jayprakash >>> Talk 21:15, 6 April 2019 (UTC)
@Jayprakash12345: i usually need it for article & template namespace. --আফতাবুজ্জামান (talk) 02:20, 7 April 2019 (UTC)
আফতাবুজ্জামান Ji, work in progress. I have created a development version of the script; see https://meta.wikimedia.org/wiki/User:Indic-TechCom/Script/massMover.js. It does not yet support regex, error handling, or a success message, although the script is working. I will finish the work once I have time. :)--Jayprakash >>> Talk 12:02, 14 April 2019 (UTC)
Please, let me know when finished. --আফতাবুজ্জামান (talk) 20:48, 18 April 2019 (UTC)
Done Indic-TechCom/Tools/MassMove--Jayprakash >>> Talk 18:51, 1 July 2019 (UTC)

The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.

Infobox settlement and its sub templates not working on Telugu Wiki

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.

We imported infobox settlement from enwiki into Telugu Wiki. The automatic conversion for length and area does not appear; even the units do not appear. As an example, the documentation page w:Template:Infobox settlement/lengthdisp shows metric and imperial values and units in the Template Output column, while te:Template:Infobox settlement/lengthdisp shows just the metric value.--Arjunaraoc (talk) 00:59, 14 April 2019 (UTC)

Resolved it myself, as I could find and fix a bug in the template imported from English wikipedia.--Arjunaraoc (talk) 08:39, 14 April 2019 (UTC)
Arjunaraoc, thanks for letting us know :)--Jayprakash >>> Talk 12:04, 14 April 2019 (UTC)

The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.

Gadgetification of cropimage script

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.

Cropimage script is updated on Telugu Wikisource to fix problem with index files having unicode text. As I understand it is being used in Kannada and other wikipedias, making it a gadget will make it more manageable. --Arjunaraoc (talk) 04:56, 15 April 2019 (UTC)

Arjunaraoc Ji, I remember that I fixed two problems last year: Indic-TechCom/Requests/IWCC2018#Image_Crop. Sorry, we failed to notify the communities. You can compare the versions and install it as a gadget. Please remember to declare dependencies=jquery.ui.dialog in te:మీడియావికీ:Gadgets-definition when installing :)--Jayprakash >>> Talk 08:08, 15 April 2019 (UTC)
@User:Jayprakash12345 Thanks for your response. Actually, my request was to make it globally available like other scripts, so that it can be put under proper version control.--Arjunaraoc (talk) 03:50, 16 April 2019 (UTC)
Not done. We already have too many scripts to maintain, so we should not take on the extra burden. But we will try to fix any potential bug if the community reports one :)--Jayprakash >>> Talk 18:49, 1 July 2019 (UTC)

The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.

Include Avg Page Size

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.

Lovely work has been done on Indic Wikisource Stats. I would like to suggest and request including stats on the average page size in bytes. This would help check which Wikisource has done more work and what type of content is there, viz. poetry or prose. --Sushant savla (talk) 07:13, 3 May 2019 (UTC)

Sushant savla, taking it on the radar.--Jayprakash >>> Talk 12:47, 3 May 2019 (UTC)
Sushant savla Ji, Done (https://tools.wmflabs.org/indic-wsstats/graph#averagepagesize) Thanks for your idea. :)--Jayprakash >>> Talk 18:13, 3 May 2019 (UTC)
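
For readers wondering what "average page size in bytes" means for Indic text, here is a small sketch. It assumes page size is measured as the UTF-8 byte length of the page text, and the function name `averagePageSizeBytes` is hypothetical. It is not how the indic-wsstats tool computes its figure, which this thread does not describe.

```javascript
// Average UTF-8 size in bytes over a list of page texts.
// Returns 0 for an empty list to avoid dividing by zero.
function averagePageSizeBytes(pageTexts) {
  if (pageTexts.length === 0) {
    return 0;
  }
  const encoder = new TextEncoder(); // always encodes as UTF-8
  const totalBytes = pageTexts.reduce(
    (sum, text) => sum + encoder.encode(text).length,
    0
  );
  return totalBytes / pageTexts.length;
}

// Usage: one Latin page of 3 characters and one page with a single
// Bengali-script letter, which takes 3 bytes in UTF-8.
console.log(averagePageSizeBytes(['abc', '\u0995']));
```

Since most Indic-script characters take three bytes in UTF-8, byte counts are best compared between wikis that use similar scripts.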

The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.

Script for updating Infobox parameter

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.

Hello,

I need a script that can automatically update an infobox parameter on a Wikipedia page. The page name, parameter name and its value will be shared in a specific format. Please try to help me with this.

Vikram maingi (talk) 17:37, 31 May 2019 (UTC)

Done, Hello Vikram maingi I have created a bot framework which has all the necessary files like Login, Action and Config file, etc. Please check out IndicBot and template_subs.py. Let me know if you find any problem.--Jayprakash >>> Talk 21:52, 22 June 2019 (UTC)
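
As an illustration of the core step in such a script, the sketch below updates one named parameter in a page's wikitext. It is not the IndicBot or template_subs.py code mentioned above: `setTemplateParam` is a hypothetical name, and the regex approach assumes one parameter per line with a simple value that contains no pipes, braces or nested templates.

```javascript
// Replace the value of "| paramName = ..." in wikitext.
// Returns the text unchanged if the parameter is not present.
function setTemplateParam(wikitext, paramName, newValue) {
  const escapedName = paramName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Group 1 keeps "| name = " as written; the rest is the old value,
  // which ends at a newline, the next pipe, or a closing brace.
  const pattern = new RegExp(
    '(\\|\\s*' + escapedName + '\\s*=[ \\t]*)[^\\n|}]*'
  );
  if (!pattern.test(wikitext)) {
    return wikitext;
  }
  // A replacer function is used so "$" in newValue is inserted literally.
  return wikitext.replace(pattern, (match, prefix) => prefix + newValue);
}

// Usage with a made-up infobox.
const page = '{{Infobox settlement\n| name = Old\n| population = 100\n}}';
console.log(setTemplateParam(page, 'population', '250'));
```

A production bot would more likely use a wikitext parser than a regex, since values that contain links such as [[A|B]] or nested templates would break the simple pattern used here.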

The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.

Regarding a tool from it.wikisource to pa.wikisource

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.

User:Alex brollo has created a gadget that helps in proofreading and validating. The tool quickly saves the page and opens the next page's edit tab in just one click. I tried to import it into Punjabi Wikisource and found it difficult to do so. That's why I need some help with it.

  • Link from Italian Wikisource- eis.js
  • Link from Punjabi Wikisource- eis.js

--*•.¸♡ ℍ𝕒𝕣𝕕𝕒𝕣𝕤𝕙𝕒𝕟 𝔹𝕖𝕟𝕚𝕡𝕒𝕝 ♡¸.•*𝕋𝕒𝕝𝕜 22:35, 26 June 2019 (UTC)

Struck because this was done with the help of User:Alex brollo --*•.¸♡ ℍ𝕒𝕣𝕕𝕒𝕣𝕤𝕙𝕒𝕟 𝔹𝕖𝕟𝕚𝕡𝕒𝕝 ♡¸.•*𝕋𝕒𝕝𝕜 05:27, 1 July 2019 (UTC)

The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.

Chunk upload not working

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.

Hi! The chunk upload tool on Commons is not working. It shows all chunks uploaded 100% one by one, but in the end it says 'Failed:stashfailed:This file did not pass file verification'. One of the files in this series of books was successfully uploaded; six further uploads are pending. Please help in fixing the error. Thanks, --सुबोध कुलकर्णी (talk) 04:26, 5 August 2019 (UTC)

Not done, as no one has taken it up :(--Jayprakash >>> Talk 08:36, 25 September 2019 (UTC)

The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.

Find & Replace

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section.

Jayprakash12345, I have to replace many WikiProject templates on talk pages. Could you please try to make something like find and replace? Regards, ZI Jony (Talk) 08:39, 25 September 2019 (UTC)

Jayprakash12345, a follow-up reminder. Regards, ZI Jony (Talk) 08:27, 20 January 2020 (UTC)
ZI Jony Ji, Done. Sorry, I had created the script in Dec 2019 (see here) but completely forgot to notify you. This was my fault. Since I do not have a dedicated dashboard for user scripts, it is harder to track their development than that of Toolforge tools, which have a dashboard on Phabricator. Now look at Indic-TechCom/Tools/FindAndReplace. Please don't use mass replacing right away; start with 5 pages, then 10, then 20. Please suggest improvements and report bugs on its talk page. Thanks :) --Jayprakash >>> Talk 23:24, 30 March 2020 (UTC)
@Jayprakash12345: Thanks, test done and it's OK. Regards, ZI Jony (Talk) 05:50, 31 March 2020 (UTC)


Select the area/column of image before start Indic OCR

Hi @Jayprakash12345:, would it be possible to select an area of the image to OCR on a page? There are many books with double or triple columns as well as images. So when we click on the Indic OCR button, there should be an option to either OCR the full page or select a zone of the image to OCR. Jayantanth (talk) 18:17, 17 September 2019 (UTC)

Thanks Jayantanth for the above request. I have been thinking about this for some time and visited today to make a request. This will be useful for digitizing old magazines and encyclopaedias, which have multiple columns of text. The user interface should be such that the OCR text is appended to the existing text. (Example page on Telugu Wikipedia). We already have the cropimage tool, which allows selecting part of the image and could be made use of. --Arjunaraoc (talk) 07:13, 19 September 2019 (UTC)
@Arjunaraoc:, thanks for your valuable comments. I think the cropimage tool is not the solution. We have to approach this in a different way, because Google OCR does not recognise column text at present. We have checked Tesseract OCR, which supports columns, but we haven't implemented it in Indic Wikisource. Jayantanth (talk) 07:17, 26 October 2019 (UTC)

The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.