Getting the images used on Wikipedia pages etc
While I appreciate a great deal of time goes into getting the images loaded, I suspect the time needed to actually do something with each of those images is probably 10x or 100x greater. I've been engaged in Commons-categorising 4000+ images from a local archive; I gave up about halfway through. I was the only person doing it and it just got too boring and the tool support just didn't seem to be there (people tell me there are tools but I couldn't work out how to use them and there is very little documentation available). So improving tool *support* is necessary.
Just to consider a concrete example, the evaluation is illustrated with File:Tropenmuseum Royal Tropical Institute Objectnumber 1138-7 Riviergezicht met gebouwen van een kato2.tif. It's rather pretty and would be a nice illustration, but it's not currently used on Wikipedia. So I read through the English description and tried to see what Wikipedia article I could use it in. My immediate problem is that the image description (although several lines long) does not tell me which country it is from and there are no categories and no geolocations. The place name mentioned is Coronie and there is a Wikipedia article (or at least a redirect) by that name, but reading the article doesn't help me to decide if this is indeed an article that might be illustrated with this image. The Coronie article doesn't indicate even what continent it is on, but it appears to be South America. Looking at the image, I am guessing it is not of a place in the Netherlands (too many palm trees and not enough tulips). I know the Dutch colonised India and Indonesia and thereabouts and that's the type of scenery I would expect in those places. My history lessons never mentioned South America much and the Coronie article says nothing about the Netherlands. Coronie is in Suriname and it appears from its article that there was Dutch colonisation there, so maybe it is an image of Coronie in Suriname. But now I've spent several minutes on this and I still can't be sure if this image is of Coronie, Suriname or not. My point here is that someone from the Tropenmuseum would probably know straightaway which article this image might illustrate but, without that context, I cannot. I am guessing that the Tropenmuseum would have some kind of categorisation system but information on its categories hasn't been included in the Commons entries. I think institution category information should be preserved.
Could we introduce a Category called Tropenmuseum with sub-categories organised in the same way as the Tropenmuseum does? I appreciate it would create an effectively independent category hierarchy in Commons for each institution but it would be automatable and it would make it a lot easier to understand the context of an image than having no categories at all.
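To illustrate why mirroring an institution's own scheme would be automatable, here is a minimal sketch in Python. It assumes the institution supplies, for each image, an ordered list of its own classification keywords; the function name, the " - " naming convention and the example keywords are all hypothetical, not the Tropenmuseum's actual scheme or any existing Commons tool.

```python
def mirror_hierarchy(institution, keyword_path):
    """Return (category, parent) pairs mirroring an institution's own scheme.

    `keyword_path` is an ordered list of the institution's classification
    keywords for one image (hypothetical input, broadest keyword first).
    The last pair names the category the file itself would be placed in.
    """
    pairs = [(institution, None)]  # top-level category has no parent here
    current = institution
    for keyword in keyword_path:
        parent = current
        # Assumed naming convention: prefix every level with the institution
        # name so the mirrored tree never collides with existing categories.
        current = f"{parent} - {keyword}"
        pairs.append((current, parent))
    return pairs


# Example with made-up keywords for the image discussed above.
for category, parent in mirror_hierarchy("Tropenmuseum", ["Suriname", "Plantations"]):
    print(category, "<-", parent)
```

Each pair says which category to create and which parent to put it under, so a batch upload could build the whole tree without any manual categorising.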
Because of how time consuming it is and the paucity of information available for each image, I am not sure that the idea of mass uploads of images and leaving it to "the community" to do the hard work of finding a use for them is likely to yield good results. Making the hardest part of the job "someone else's problem" rarely works. I think it has to be part of the program to ensure some percentage of the images are actually used. Kerry Raymond (talk) 03:57, 16 January 2014 (UTC)
- Hi Kerry, thanks a lot for your feedback. Let me answer this from my point of view as a Wikipedian. Yes, you're right, the painting shows a plantation in the Coronie District in Suriname. As someone who – in his former life – has dealt with archival materials on an almost daily basis (before I came to the U.S., I wrote a Ph.D. thesis on 18th-century non-European history, which I never finished), I find the image description outstanding. And, given my background, I could grasp immediately what the image was about. And I guess that other Wikipedians who are more familiar with the history of South America would feel the same. Now, when it comes to the categorization system on Commons, I agree with you wholeheartedly. Currently, there seems to be no way of reflecting the keywords from digital archives in the Commons categories. That's a general problem of Commons: its metadata system (like many other things) desperately needs more love. I will point Fabrice Florin, whose team works on new features for Commons, to this discussion. Maybe he can give us some insights into what changes we can expect in the future. Thanks again, --Frank Schulenburg (talk) 17:12, 16 January 2014 (UTC)
Influence of the GLAMwiki Toolset on content donations / Content donation strategies / Wikidata
(I haven't read the article in detail, so excuse me if I'm asking something that has been covered and that I've overlooked)
The installation of the GLAMwiki Toolset makes it a lot easier to do large content donations. There is a risk that the community will be 'flooded' by new content and that it will be harder to organise the use of the material in articles. This is something we should take into account for future content donations. I suspect that there will be a lot more uploads of content and a decline in the reuse as an effect of the GLAMwiki Toolset.
I'm currently involved in setting up a manual for the GLAMwiki Toolset. One of the paragraphs that is important to me is about content donation strategies. GLAMs in the Netherlands are thinking of strategies other than the current one-time large donation. These vary from starting with small images that will hopefully be replaced with high resolution files in the future to making the selection of content part of their current work processes for exhibitions. In other words: for every new exhibition they look at the content that can be donated. These different strategies might influence the way they're evaluated.
Last but not least: Wikidata is going to have a big impact on Commons according to some of the Wikidata volunteers in the Netherlands. A lot of metadata is currently displayed as text, but it will become links to unique attributes. This makes browsing the content, making queries and selections a lot easier. It does involve some manual labour though: the text has to be linked to the unique identifiers for the attributes. There will be a huge backlog from the day this ability is enabled. Luckily the current categories can be used to extract a lot of the information. "Paintings by X" can be used to determine the datafield "creator: X". There is a possibility of a (big) difference in experience between content that has been processed and content that has not been processed. Is this something to take into account too?
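As a rough sketch of the "Paintings by X" idea above, category names could be matched against a small table of patterns to propose structured fields. This is an assumption-laden illustration in Python: the pattern table, the field name and the function are hypothetical, and the extracted value is still plain text that someone (or some other tool) would have to link to a unique identifier.

```python
import re

# Hypothetical table: category-name pattern -> metadata field it implies.
CATEGORY_PATTERNS = [
    (re.compile(r"^Paintings by (?P<value>.+)$"), "creator"),
]


def fields_from_categories(categories):
    """Split category names into proposed fields and unmatched leftovers."""
    fields = {}
    unmatched = []
    for name in categories:
        for pattern, field in CATEGORY_PATTERNS:
            match = pattern.match(name)
            if match:
                # Keep a list per field: a file can sit in several
                # categories that imply the same kind of attribute.
                fields.setdefault(field, []).append(match.group("value"))
                break
        else:
            # No pattern applied; this category still needs manual work.
            unmatched.append(name)
    return fields, unmatched


fields, unmatched = fields_from_categories(["Paintings by X", "Coronie District"])
print(fields)     # {'creator': ['X']}
print(unmatched)  # ['Coronie District']
```

The unmatched list gives a crude measure of the backlog: everything in it is content that would remain "unprocessed" until someone handles it by hand.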
I appreciate the work that went into this study, but as a new WiR hoping to get a (relatively small) release of images, I can't say I found it that informative or helpful. It isn't really very surprising that only a small proportion of releases of over 10,000 images are used in articles. In any case the use of images in articles is going to be a slow process, taking some years. The relatively slow replacement of older and far poorer images by those of the same work from the Google Art Project shows this - this is a much simpler process than changing the image completely, involving much less editorial judgement. Some allowance should be made for that in assessing usage, which this study doesn't seem to have done.
I would certainly echo what was said above about categorization. This is absolutely key on Commons, and involves enormous amounts of work if it is to be done really well, which I rather doubt the figures given capture. But even on a simpler approach it seems to me that the wrong approach is often taken, and I've never seen the issues raised with the community in advance (though I might well have missed that I admit). The huge and varied Walters release was not well categorized initially, and much of it remains so. In common with other releases the images were typically dumped inappropriately to categories at too high a level ("Art from Fooland"), swamping them, when dedicated sub-categories should have been used ("Art from Fooland in the ...."), and more usage made of period and object type information in the metadata. I've seen the same in other releases. The community has done huge amounts of work here, much of which could have been avoided, and the process remains very incomplete, effectively cutting off access to large proportions of the donated images.
I wonder how true it is to say "Without the hundreds of thousands photos that have been released by cultural institutions over the last couple of years, large numbers of our articles would lack pictorial illustration". I have placed dozens, probably hundreds, of donated images in articles, but normally as an improvement of or addition to existing images. Apart from biographies, there are normally alternatives. But improvement is an important thing, if hard to measure.
No mention is made of usage beyond Wikimedia projects - obviously this is hard or impossible to quantify, but if nothing else the Bundesarchiv example shows this happens, and one hopes normally in an appropriate way. It is very much within the movement's goals & should be at least mentioned. Wiki at Royal Society John (talk) 13:09, 18 January 2014 (UTC)
- I agree that this report didn't answer any of the questions I have/had (as a community member, interested person and chapter [board] member). For instance, how many improvements to metadata have been sent upstream, how many editors edited the file descriptions, what's the resolution of the images/media and whether it's improving or not, are the (non-copyright?) restrictions of any kind on the content constant, how many of the institutions adopted a free license for their own websites, whether free culture entered their normal employees' work, how much manual work the batch upload saved compared e.g. to the work of those users who have uploaded hundreds of thousands of files using PD sources or convincing Flickr users to adopt free licenses, etc. --Nemo 17:07, 26 January 2014 (UTC)
- Thank you for your comments and feedback. As we have stressed throughout all these reports, these are initial data points that have been mapped across a set of programs. In this case it is important to remember that reporting is dependent on (A) program leaders tracking/capturing information and (B) reporting it in a systematic way. For now, we are only able to get at some initial indicators, as capacity for measuring, tracking, and reporting these other potential impacts is not something that exists yet. If you have ideas for specific ways these things might be done that do not require a big investment of time and effort in the tracking, please share those here so that this dialogue might be productive in advancing these ideas into actionable steps. JAnstee (WMF) (talk) 17:37, 26 January 2014 (UTC)
- I have already provided a number of suggestions above. :) It's not clear to me who's the expected audience or the actual goal of reports such as this, maybe they're already perfect for their goals; but, if you want them to be used by people outside WMF (like very active editors, chapter boards and staff, very active chapter members), you should probably start each report only after getting some answers from them on what their questions would be, what they'd be interested to have some data and some summaries about. --Nemo 08:57, 27 January 2014 (UTC)
Why this term "donation"? The cultural partnership community has repeatedly stressed that it's better not to call them donations, for many reasons you probably know by now if you studied the topic. We usually call them partnerships, though in some languages like French finding an appropriate equivalent is reportedly harder. --Nemo 17:07, 26 January 2014 (UTC)
- Good question, Nemo. The reason we are referring here to the donations and not partnerships more generally is that donations are the only program piece we have begun to examine specific to GLAM partnerships, not the full partnerships and the many various program components they may implement (i.e. a GLAM partnership often has advocacy work, content donations, editing workshops, edit-a-thons, and other content promotion program strategies). It would be misleading to imply this report is on the broad GLAM partnerships (more of an overarching initiative) when we have only examined content donations specifically for this report. For this reason I am concerned that your recent renaming does not accurately reflect what we have prepared and shared here. JAnstee (WMF) (talk) 17:43, 26 January 2014 (UTC)
- Can you explain how "advocacy work, [...] editing workshops, edit-a-thons" fall under the "content partnership" definition? That's surely not how the term is usually used. --Nemo 21:11, 26 January 2014 (UTC) P.s. But if you're afraid it might, you could expand it to "content release partnership" in the lead section and imply it in the rest of the page.
- Thank you for the suggestion. To answer your question, both when we worked to map out the activities and outcomes specific to GLAM Content Donations with our pilot group in Budapest, as well as when I assisted with the session on GLAM evaluations at Wikimania in August (I believe you attended that, Nemo, or am I misremembering?) - it was clear that GLAM partnership activities extended well beyond the activities to acquire the donated content and often included these other streams of programming in order to gain maximal benefit from the GLAM partnership.
- As we have been chunking out the "programs" (i.e., sets of activities that are replicated across time or setting and share a theory of change) we have been looking at content donations specifically from the GLAM partnership "model." Very likely, there are other "programs" within GLAM partnerships that may be identified and monitored for evaluation also. I think rewording to "content release partnership" keeps this distinction better, but I do not want to mislead readers that we have examined the partnerships more thoroughly than we have been able to at this point. This series of reports is essentially only a beta version of a set of works that will grow and improve with time and program leaders' participation. JAnstee (WMF) (talk) 17:25, 27 January 2014 (UTC)