
Talk:Community Wishlist/Wishes/(Commons) file description pages should be indexable by (Google) search

From Meta, a Wikimedia project coordination wiki



Hey @TheDJ - thanks for sharing this wish. It's really well considered and thought out, and I think it speaks to a number of wishes relating to the extensibility of Commons content. JWheeler-WMF (talk) 18:16, 31 July 2024 (UTC)

Agree


I think the bad state of indexing is the most urgent, top-priority issue for Commons, even ahead of getting more devs (volunteers) to help out with code issues, the app, and so on. All the time and effort people spend on maintaining, organizing, and populating the site doesn't mean much if people don't know about the site and can't find it or the files on it. Could you make it clearer that videos on WMC also do not show up in Google's video tab, even when searching for their exact title? This is currently unclear and is a major subpoint.

Moreover, it's not just Google but also DuckDuckGo: one can't find the free media on WMC in its video tab either. There is some more input here. In brief, I suggest as a start: 1) WMF investigations into this, 2) asking Google about it, and 3) an open letter, signed by the Wikimedia community, asking them to change it; this could build momentum, maybe get the attention of digital policymakers and the media, and get the community to work on solving this. A possible workaround I can think of would be to automatically create a redirect, without the file-type ending, for every file page. Prototyperspective (talk) 21:18, 31 July 2024 (UTC)

It may be best to make a separate wish about videos and DuckDuckGo. Prototyperspective (talk) 22:52, 1 August 2024 (UTC)
Done by Prototyperspective at Community Wishlist/Wishes/Do something about Google & DuckDuckGo search not indexing media files and categories on Commons.
While the two wishes are not identical, it would be hard to address either of them without addressing the other. I think this wish does include videos, but it also covers all other sorts of files and is open to the possibility of a variety of reasons for non-indexing.
This proposal would help all search engines to index our pages, and I support it; thank you, TheDJ. It would be good to add a mention of other search engines, as in the other wish. There's the Bing backend (used by DuckDuckGo among others; there is less competition than there appears), as well as the Brave, Mojeek, and even Yandex backends. PeerTube's Sepia Search is specific to video and audio (actually hosting a PeerTube instance would be a lot of work, but getting Commons on this list would get our videos and their metadata out there).
Working with Common Crawl (which Prototyperspective brought to my attention) might also align better with Wikimedia's open-knowledge mission. I'd be comfortable with a collaboration with them in a way in which I wouldn't be with a search provider that has dramatic differences of principle on, say, user privacy and control. It also offers the possibility of improving indexing in multiple backends at once.
Google is currently facing antitrust action over its search monopoly, and the US Department of Justice has asked the court to order it to sell Chrome (so Chrome's default search engine might change). Google also has an obvious conflict of interest; indexing our videos better may well mean we get visitors who would otherwise have gone to YouTube (which seems to be increasingly unpopular with creators for the lack of recourse on bogus automated copyvio claims). The search market may shift, and platforms are not eternal. The WMF has also been repeatedly accused of excessively close ties to Google. Collaborating only with Google could fuel accusations of anticompetitive behaviour, which is probably not what either party needs.
For these reasons, I suggest we keep a variety of search engines in mind, at least to the extent of CCing them on any correspondence and inviting them to meetings. We'd be unwise to put all our eggs in one basket. HLHJ (talk) 20:13, 1 December 2024 (UTC)
Thanks for your input. I also support TheDJ's wish, although I doubt that the issue specified here is the actual underlying problem. Google has probably already mitigated page-type misdetection, and if not, it could easily do so. DDG & Co. are much less important since they are used much less, but DDG in particular is still widely used among active contributors and heavy WP users. "but getting Commons on this list": I see no need or benefit for that. One can already download any audio or music file by simply clicking "Save as", so there is no need to download or stream with youtube-dl. Secondly, it's already supported, at least with yt-dlp, and I see no reason to still use youtube-dl instead of that. Instead, what would be great is if the NewPipe app included Wikimedia Commons. I intend to create something about that; it already supports various other non-YT sites in its left sidebar, and it's no longer very obscure (partly because it can keep playing music even when the screen is turned off, which the YT app can't). (Things like that would have much better chances of success if I weren't proposing them on my own only for them to collect dust; if something regarding that already exists, please let me know.)
I don't know how Common Crawl would improve indexing. Maybe there could be a new separate wish about that. I think it's probably relevant to this one.
"Google also has an obvious conflict of interest; indexing our videos better may well mean we get visitors who would otherwise have gone to YouTube": Exactly, good point. That supports my case that we shouldn't assume the issue named in this wish is the actual cause. It seems very likely that it isn't; the wish is partly based on the assumption that some technical issue prevents Google from indexing videos on WMC.
"at least to the extent of CCing them on any correspondence and inviting them to meetings": I don't think such things have been done so far. I also don't think Google would listen unless there is some notable outreach, such as a large open letter and/or some media attention. The other search engines probably wouldn't be involved enough to be CCed on correspondence either, rather than receiving something more concise, such as a concrete request for info or a proposal for some change. I think DDG may be interested in collaborating to some degree; see this and this. Prototyperspective (talk) 16:21, 2 December 2024 (UTC)