Talk:File metadata cleanup drive/How to fix metadata

From Meta, a Wikimedia project coordination wiki

Thanks for taking this on, Guillaume! There is some (not too user-friendly) information on updating templates at mw:Multimedia/Media_Viewer/Template_compatibility which probably should not be included here but might be useful to know about.

Copying templates from Commons is not for the faint of heart ({{Information}} uses two Lua modules and at least half a dozen helper templates, plus introduces a bunch of English categories); it might be helpful to provide a stripped-down version. --Tgr (WMF) (talk) 18:24, 15 October 2014 (UTC)

So why not a temporary Importer access? --Liuxinyu970226 (talk) 00:48, 18 October 2014 (UTC)
Tgr: Thanks! This is useful information; I'll refer to that page and add it as a link for people who want more information. Thanks also for the warning about the Information template; I've set up a simplified version at File metadata cleanup drive/How to fix metadata/Simple information template with only the basic fields. Guillaume (WMF) (talk) 09:23, 20 October 2014 (UTC)
Mnn I've made an example at zh-min-nanwiki. --Liuxinyu970226 (talk) 01:04, 23 October 2014 (UTC)

Public domain licensetpl_link

--Nemo 18:38, 12 November 2014 (UTC)

We need a canonical link (URL) of some sort that people can refer to, and I didn't know about the CC Public domain mark, so I figured the only consistent link we could have was the file's description page. I'm happy to switch to the CC PD mark if the lawyers agree. Stephen, what do you think? Guillaume (WMF) (talk) 18:44, 12 November 2014 (UTC)
Ok. I tried [1]. --Nemo 18:53, 12 November 2014 (UTC)

Adding the information template takes a lot of time

Adding the information template takes quite a long time. There are still 2,000+ files on fi-wiki without the template. I used AWB, which made it a bit faster (I didn't use regex, I'm not good at it), but it was still quite a slow way to proceed. https://tools.wmflabs.org/add-information/no_information.php is not very helpful, since it only uses English parameters, which work only on English projects or projects using English parameters (not many, I think). --Stryn (talk) 20:01, 26 November 2014 (UTC)

If it helps, create fi:Template:Information with English parameters as a wrapper template for fi:Template:Tiedoston tiedot, then use AWB to change {{information|...}} into {{subst:information|...}} to replace the English template with the Finnish one.
Also note that fi:Template:Tiedoston tiedot doesn't handle missing information correctly. You need to replace id="fileinfotpl_src" with {{#if:{{{Lähde|}}} | id="fileinfotpl_src"}} or something similar (same for description and author). Otherwise, the template incorrectly reports a machine-readable description, source and author even when the template parameter isn't used. --Stefan2 (talk) 22:01, 26 November 2014 (UTC)
Thanks for letting me know about the "if"-thing, I've added those now. --Stryn (talk) 16:43, 27 November 2014 (UTC)
Just a note that zhwiki also has a number of File pages which are using {{subst:information}}, so it'll take a long time to fix... --Liuxinyu970226 (talk) 01:56, 16 December 2014 (UTC)
If the code of the template is more or less the same on all those pages, a bot owner should be able to use a bot to go through those pages and "un-subst" the template. Guillaume (WMF) (talk) 20:48, 16 December 2014 (UTC)

Not multilingual

To echo what's been said above, https://tools.wmflabs.org/add-information/no_information.php is not very helpful, since it only uses English parameters, which work only on English projects or projects using English parameters (not many, I think). Trying to roll this out on non-English projects is a problem because Template:Information doesn't exist there; instead, it's called (in Welsh) Nodyn:Gwybodaeth. Is there a "quick fix" for this that doesn't involve using English, or using "wrapper templates"? Chase me ladies, I'm the Cavalry (talk) 20:48, 8 December 2014 (UTC)

Chase me ladies, I'm the Cavalry: There isn't a "quick fix" as far as I know. I've reported this issue in Magnus' bug tracker. In the meantime, perhaps you can focus on groups of files with similarly-formatted information that can be parsed by a bot? Guillaume (WMF) (talk) 20:47, 16 December 2014 (UTC)
Wilco! Thank you. Chase me ladies, I'm the Cavalry (talk) 16:40, 8 January 2015 (UTC)

Labs tool doesn't work for Meta-Wiki

Hi, <https://tools.wmflabs.org/add-information/no_information.php?language=meta&project=wikimedia&startswith=&user=> doesn't work for this project. As one of the projects which hosts local files it would be very handy to have this tool working here. Thanks for your help. -- M\A 15:51, 20 December 2014 (UTC)

Ping @Guillaume (WMF): any possible solution? Thanks! -- M\A 14:06, 30 December 2014 (UTC)
User:Ricordisamoa added mediawiki.org (although that change hasn't been merged by Magnus yet). Ricordisamoa, would you be able to add Meta-Wiki as well? Guillaume (WMF) (talk) 21:42, 5 January 2015 (UTC)
@Guillaume (WMF) and MarcoAurelio: https://bitbucket.org/magnusmanske/magnustools/pull-request/4 --Ricordisamoa 22:55, 5 January 2015 (UTC)
Thank you Ricordisamoa! Guillaume (WMF) (talk) 23:01, 5 January 2015 (UTC)
@Ricordisamoa, Guillaume (WMF), and Magnus Manske: Thanks! - There's a similar issue with commonshelper. Maybe meta needs to be added to the DB as well? Best regards. -- M\A 10:54, 6 January 2015 (UTC)
@MarcoAurelio and Guillaume (WMF): I think CommonsHelper uses the same function to obtain the database name of a wiki, but I cannot find its source code so we're waiting for Magnus here. --Ricordisamoa 12:01, 6 January 2015 (UTC)
@MarcoAurelio and Guillaume (WMF): Magnus merged my patches, but there's still an error that prevents the tools from working with 'special' wikis: I've filed https://bitbucket.org/magnusmanske/magnustools/pull-request/6 that should fix it. --Ricordisamoa 04:53, 15 January 2015 (UTC)
@Ricordisamoa: Thank you. Yep, meta still isn't working :-( Best, -- M\A 11:04, 15 January 2015 (UTC)
@MarcoAurelio: it works now :-) --Ricordisamoa 14:20, 21 January 2015 (UTC)

Edit requests on Commons about this

Who has time to handle these? --Liuxinyu970226 (talk) 00:40, 3 January 2015 (UTC)

Labs tool doesn't work

Hi, the Labs tool doesn't work for Wikipedias. I changed "commons" to "gl", "es" or "en" and it shows this error: "Warning: mysqli::mysqli(): (HY000/2005): Unknown MySQL server host 'glwikimedia.labsdb' (0) in /data/project/magnustools/public_html/php/common.php on line 105 Call Stack: 0.0280 709256 1. {main}() /data/project/add-information/public_html/no_information.php:0 0.0739 1077864 2. openDB() /data/project/add-information/public_html/no_information.php:54 0.1017 1083712 3. mysqli->mysqli() /data/project/magnustools/public_html/php/common.php:105

Fatal error: Call to a member function real_escape_string() on a non-object in /data/project/magnustools/public_html/php/common.php on line 119 Call Stack: 0.0280 709256 1. {main}() /data/project/add-information/public_html/no_information.php:0 3.1919 1081440 2. get_no_info() /data/project/add-information/public_html/no_information.php:72 3.1919 1081520 3. make_db_safe() /data/project/add-information/public_html/no_information.php:28 3.1919 1081736 4. get_db_safe() /data/project/magnustools/public_html/php/common.php:123" Bye, --Elisardojm (talk) 10:00, 20 January 2015 (UTC)
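
One possible reading of the warning above: the host name 'glwikimedia.labsdb' looks like a language code joined to a project name, so changing only "commons" to "gl" while the project stays "wikimedia" would point at a database that doesn't exist. Here is a minimal Python sketch of that guess. The names labsdb_host and SPECIAL_WIKIS are hypothetical, and the naming rules are an assumption based on the error message and the tool's URL parameters, not on the tool's source code.

```python
# Hypothetical illustration only: these rules are an assumption,
# not code taken from the add-information tool or magnustools.

# Assumed set of "special" wikis that live under project=wikimedia
# but whose database name is "<name>wiki".
SPECIAL_WIKIS = {"commons", "meta"}


def labsdb_host(language: str, project: str) -> str:
    """Guess the '<dbname>.labsdb' host from language/project parameters."""
    language = language.strip().lower().replace("-", "_")
    project = project.strip().lower()
    if project == "wikipedia":
        dbname = f"{language}wiki"          # e.g. gl + wikipedia -> glwiki
    elif project == "wikimedia" and language in SPECIAL_WIKIS:
        dbname = f"{language}wiki"          # e.g. commons -> commonswiki
    else:
        dbname = f"{language}{project}"     # e.g. gl + wikimedia -> glwikimedia
    return f"{dbname}.labsdb"


if __name__ == "__main__":
    print(labsdb_host("commons", "wikimedia"))
    print(labsdb_host("gl", "wikimedia"))   # reproduces the host in the warning
    print(labsdb_host("gl", "wikipedia"))
```

If this guess is right, changing the project parameter to "wikipedia" along with the language would at least produce a plausible host name; whether the tool then works for that wiki is a separate question.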

@User:Guillaume (WMF), it seems to me this report is unavailable, your tool is down, and this bug is still open (this diff proves it), so I'm not sure how exactly the Italian community is supposed to tackle the issue, as kindly reminded on Tech News a few weeks ago. What is preventing the software from just reading the Informazioni file template as it is? --Elitre (WMF) (talk) 20:52, 28 January 2015 (UTC)
Elitre (WMF): Wikimedia Labs had issues in the past few days and MrMetadata was affected. The Labs team fixed it and I restarted the tool afterwards; it should be working now.
Regarding how to fix files: Magnus' tool is useful for fixing individual files (when it works in your language), but for the wikis with the most images to fix, like itwiki, I strongly recommend using the same approach we're using on Commons, i.e. using bots to add the informazioni template. On Commons, we're trying to identify files with similar description pages, and running bots on them to add the template around the existing information. I think this approach would work well on itwiki and would be much more effective than Magnus' tool, especially considering that itwiki is the third biggest wiki in terms of number of files to fix. I'm happy to talk to bot owners if they need help. I was actually going to reach out to bot owners on the biggest wikis about this. Guillaume (WMF) (talk) 01:10, 30 January 2015 (UTC)
@User:Guillaume (WMF), now, if I change "commons" to "gl", "en" etc. in the tool, the error still appears. Doesn't the tool support Wikipedias? Bye, --Elisardojm (talk) 22:15, 30 January 2015 (UTC)
No, that tool isn't working well on non-English Wikipedias. I've edited the page to explain how to fix files with bots instead. Guillaume (WMF) (talk) 23:51, 31 January 2015 (UTC)
Thanks! --Elisardojm (talk) 16:21, 1 February 2015 (UTC)

Information template in Galician Wikipedia

Hi, I have reviewed the help and the Information template on Galician Wikipedia, but I don't know whether it's OK or has to be updated, because I don't fully understand how the markers are used. Could somebody review it, please? Bye, --Elisardojm (talk) 16:26, 1 February 2015 (UTC)

PD-Self on el.wiki

Why is w:el:Αρχείο:Column2.png still marked as without machine-readable license? I did not find any mistake other than [2]. --Nemo 20:38, 3 October 2015 (UTC)

@Nemo bis: non-thumb images do not have a caption. The description ends up in the title attribute; HTML tags are stripped but even if they weren't they wouldn't be interpreted properly. --Tgr (WMF) (talk) 23:27, 4 October 2015 (UTC)

Navbox-based templates

What's the most efficient way to add td id attributes to a Navbox-based information template, such as the fair use template on hy.wiki? [3] didn't work (ignore the fact that the test didn't have an if testing for the actual presence of those pieces of data). --Nemo 12:15, 4 October 2015 (UTC)

The original parsing code in CMD (CommonsMetadata) followed the COM:MRD standard strictly (the id attribute has to be on a td tag and the information has to be in the next table field). Since that way lies madness, an alternative syntax was added eventually, which apparently I forgot to document :( I did it now: commons:Commons:Machine-readable_data#Alternative_format_for_CommonsMetadata --Tgr (WMF) (talk) 23:41, 4 October 2015 (UTC)
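
To make the strict rule described above concrete (marker id on a td, value in the next table field), here is a rough Python sketch that checks rendered HTML for that pattern. It is not how CommonsMetadata itself is implemented: MarkerChecker and marker_has_value are hypothetical names, the check ignores the alternative syntax, and nested tables inside the value cell are not handled.

```python
from html.parser import HTMLParser


class MarkerChecker(HTMLParser):
    """Find an element with a given id and collect the text of the next <td>."""

    def __init__(self, marker_id):
        super().__init__()
        self.marker_id = marker_id
        # search -> label (inside marker cell) -> await (value cell not open yet)
        # -> value (inside value cell) -> done
        self.state = "search"
        self.marker_on_td = False
        self.value_text = []

    def handle_starttag(self, tag, attrs):
        if self.state == "search" and dict(attrs).get("id") == self.marker_id:
            self.marker_on_td = (tag == "td")
            self.state = "label"
        elif self.state == "await" and tag == "td":
            self.state = "value"

    def handle_endtag(self, tag):
        if tag == "td":
            if self.state == "label":
                self.state = "await"
            elif self.state == "value":
                self.state = "done"
        elif tag == "tr" and self.state == "await":
            self.state = "done"  # the row ended without a value cell

    def handle_data(self, data):
        if self.state == "value":
            self.value_text.append(data)


def marker_has_value(html, marker_id):
    """True if marker_id sits on a <td> and the following <td> has visible text."""
    checker = MarkerChecker(marker_id)
    checker.feed(html)
    checker.close()
    return checker.marker_on_td and "".join(checker.value_text).strip() != ""


if __name__ == "__main__":
    row = '<tr><td id="fileinfotpl_src">Source</td><td>Own work</td></tr>'
    print(marker_has_value(row, "fileinfotpl_src"))
```

A check like this only says whether the markup matches the strict pattern on a single row; a file can still be categorised as lacking machine-readable data for other reasons.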

PD or Public Domain files

These files are in class="license2". When I tried changing the class to class="licensetpl" or class="licensetpl_short" for {{PD}}, the files continue to appear in our v:Category:Files with no machine-readable license. Suggestions? --Marshallsumter (talk) 00:01, 18 March 2016 (UTC)

Marshallsumter: The template has been fixed since, but in general you need at least a licensetpl and a licensetpl_short class. --Tgr (WMF) (talk) 00:15, 18 March 2016 (UTC)

We know, but in order to get the template {{PD}} and others to work, v:User:Dave Braunschweig had to insert <div class="licensetpl" style="display:none;"> <span class="licensetpl_short" style="display:none;">Public domain</span> <span class="licensetpl_long" style="display:none;">Public domain</span> <span class="licensetpl_link_req" style="display:none;">false</span> <span class="licensetpl_attr_req" style="display:none;">false</span>, which was a bit too complicated for me. --Marshallsumter (talk) 21:36, 18 March 2016 (UTC)
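
For anyone checking a license template's output by hand, the minimum mentioned above (a licensetpl and a licensetpl_short class) can be tested with a few lines of Python. This is only a sketch: ClassCollector and missing_license_classes are hypothetical names, and it checks nothing beyond whether those two classes appear somewhere in the HTML, so passing it does not guarantee that the file leaves the category.

```python
from html.parser import HTMLParser

# Minimum classes named in this thread; other licensetpl_* classes are optional here.
REQUIRED_CLASSES = {"licensetpl", "licensetpl_short"}


class ClassCollector(HTMLParser):
    """Collect every CSS class that appears on any start tag."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def missing_license_classes(html):
    """Return the required classes that do not occur anywhere in the HTML."""
    collector = ClassCollector()
    collector.feed(html)
    collector.close()
    return REQUIRED_CLASSES - collector.classes


if __name__ == "__main__":
    old_markup = '<div class="license2">Public domain</div>'
    print(sorted(missing_license_classes(old_markup)))
```

Running it on the rendered HTML of a file page (rather than on the wikitext) is the safer choice, since the classes may come from nested helper templates.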

@Guillaume (WMF), @Tgr (WMF), and whoever understands how c:Category:Files with no machine-readable license is added to files: I have at least 2 files on Commons that seem to have valid, widely used license templates and still end up in c:Category:Files with no machine-readable license:

Can someone check what needs to be done with those files to get the license templates in order? See also the discussion on Commons Village Pump. Please ping me when replying. --Jarekt (talk) 13:35, 26 July 2016 (UTC)

@Jarekt: The first is due to {{Naturspektrum}} (and in the end {{License}}) adding a licensetpl class but then not adding any license information. I have no idea about the second; that should have worked even before you moved the license into the info template. --Tgr (WMF) (talk) 01:11, 27 July 2016 (UTC)
Tgr (WMF), Thank you. I fixed {{Naturspektrum}} by this edit. As for the second file, moving licenses did not help, but this edit fixed it. Now all files in c:Category:Files with no machine-readable license are tagged with a problem template. --Jarekt (talk) 02:36, 27 July 2016 (UTC)

@Guillaume (WMF), @Tgr (WMF), and whoever understands how c:Category:Files with no machine-readable source is added to files: I was asked why many files using c:template:Art photo with a perfectly fine source end up in c:Category:Files with no machine-readable source, and I cannot figure it out. I looked at 2 files: File:Adriano, 117-138 ca, collez. albani.JPG, which is not in c:Category:Files with no machine-readable source, and File:'The Raising of the Cross' by Tania Dey.JPG, which is. Both files use c:Template:Art photo with "source={{own}}", and I made minor edits to both to make sure it is not a cache issue. Their HTML source contains identical markup: <tr valign="top"> <td id="fileinfotpl_src" class="fileinfo-paramfield" lang="en">Source</td> <td> <span class="int-own-work" lang="en">Own work</span></td> </tr>, so it seems the category is assigned based on something else. Any idea? --Jarekt (talk) 19:12, 15 November 2018 (UTC)

@Jarekt: It's a variant of T59259 - there are multiple information templates on the page so all but one get ignored. --Tgr (WMF) (talk) 21:22, 15 November 2018 (UTC)
@Tgr (WMF): But both files have the same 2 infoboxes (2 Artwork templates, no Information templates), so how come one is in the category and one is not? Is there a way to write those infoboxes so the check works correctly? And if not, then maybe we should turn it off somehow, because with tens of thousands of false alarms that category is not useful to anybody. --Jarekt (talk) 03:52, 16 November 2018 (UTC)
@Jarekt: CommonsMetadata discards info templates with no author data (to avoid the worst fallout from the problem), which is why the behavior of the two pages differs. I suppose the template could generate its own invisible set of metadata at the top and make it as complete as possible... --Tgr (WMF) (talk) 18:09, 16 November 2018 (UTC)
@Tgr (WMF): Thanks for pointing me to CommonsMetadata, as I had never run into it before. I guess the code is trying to do the impossible task of interpreting HTML tags created by various infobox templates and figuring out things like source, author or license. As the infoboxes become more complicated, the task becomes harder. I guess this is one of the reasons we need structured data (supposedly coming soon). I assume structured data will make CommonsMetadata obsolete, so there is not much point in fixing CommonsMetadata beforehand. --Jarekt (talk) 18:41, 16 November 2018 (UTC)
@Jarekt: yeah, it's the kind of problem the structured data project is supposed to fix. Changing CommonsMetadata is not that hard but this approach still seems safer than the alternatives - mixing data from multiple info templates could result in fake information, some of it legally relevant (sometimes one template is about the object and another is about the photograph). --Tgr (WMF) (talk) 19:45, 16 November 2018 (UTC)

lb.wiki

Hi! lb:Kategorie:Fichieren ouni maschinneliesbar Lizenz has a lot of files and I can't figure out how to fix the templates.

Perhaps someone could help fix lb:Schabloun:Bild-FU and lb:Schabloun:GFDL. Then I can hopefully figure out how to fix the rest.

If it's not too much trouble, lb:Schabloun:Information too :-D --MGA73 (talk) 20:31, 18 January 2020 (UTC)

Hello. The Information template seems to be fixed! Yay! But I could still use help with the license templates. --MGA73 (talk) 16:16, 27 January 2020 (UTC)
Anyone? --MGA73 (talk) 20:59, 8 March 2020 (UTC)