
Community Wishlist/Wishes/A way to see why a file is somewhere underneath a specific category (tool to show cat-path)

From Meta, a Wikimedia project coordination wiki
Status: Open


Description

FastCCI Commons gadget showing why a quality image is erroneously categorized into a category; a separate gadget/functionality that does this is proposed here
Deepcategory search / view-mode on Wikimedia Commons showing offtopic results for category "Environmental diagrams"
Example of a broad category that was largely fixed so that it doesn't contain offtopic images (Microscopic images relating to biology)

On Commons, many categories, especially the large higher-order ones, have many off-topic files within their category branch. This is usually caused by a miscategorization: somewhere far down in the subcategory tree, a category was added that isn't really about the subject of the parent category.

With the deepcategory search operator, one can see the files of a category together with the files in all of its subcategories. This often surfaces many off-topic files due to the issue explained above, as can be seen in the first image on the right. Another example is the diagrams and ordinary photographs underneath c:Category:Microscopic images, which is supposed to be for microscopy photos only.

Fixing up categories and removing off-topic files is currently very difficult, though. A file can appear in the search results because it sits in a subcategory 10 levels away from the category that deepcategory was used with. For example, if one searches for deepcategory:"Buildings" in MediaSearch and a microscopic image of a virus appears among the results, there is no way to find out why it is there. One has to open the file page, look at the file's categories, and navigate upward from whichever category seems the likely culprit. This often doesn't work and generally takes far too much time. The path could be, for example, Buildings->Buildings by function->Agricultural buildings->[10 more levels]->Veterinary virology.

What is needed is a tool or functionality that shows the categorization path from the file to the parent category, so that one can spot the faulty categorization and fix it. This is in fact already possible with the FastCCI tool, but that tool usually doesn't load (phab:T367652) and this functionality is not its main use: it can take far too long to load, and it can't be used for files or views found as described above. However, its code could perhaps be reused for this, and I have also suggested what I'm proposing here on the tool's talk page. This feature would also be useful when images that shouldn't be there show up in PetScan, where one can intersect Commons categories; this is issue 182 at PetScan.
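
As a rough illustration of what such a tool would compute, here is a minimal sketch of a cat-path lookup: a breadth-first search that walks upward from a file through its parent categories until it reaches the category being inspected. It is not how FastCCI or any existing gadget works. The `PARENTS` mapping, the file name, the intermediate category names, and the `find_category_path` function are all hypothetical; a real tool would have to load parent categories from Commons, for example through the MediaWiki API (`action=query` with `prop=categories`).

```python
from collections import deque

# Hypothetical snapshot of the category graph: each page maps to its direct
# parent categories. The names below are made up for illustration only.
PARENTS = {
    "File:Virus micrograph.jpg": ["Veterinary virology"],
    "Veterinary virology": ["Veterinary medicine", "Virology"],
    "Veterinary medicine": ["Veterinary clinics"],
    "Veterinary clinics": ["Agricultural buildings"],
    "Agricultural buildings": ["Buildings by function"],
    "Buildings by function": ["Buildings"],
    "Virology": ["Microbiology"],
}


def find_category_path(start, target, parents, max_depth=20):
    """Return the shortest chain from `target` down to `start`, or None.

    Walks upward from `start` (a file or category) through parent
    categories, breadth-first, so the first hit is a shortest path.
    """
    queue = deque([[start]])
    visited = {start}  # guards against category loops
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            # The path was built bottom-up; reverse it to read top-down.
            return list(reversed(path))
        if len(path) - 1 >= max_depth:
            continue  # do not climb further than max_depth levels
        for parent in parents.get(node, []):
            if parent not in visited:
                visited.add(parent)
                queue.append(path + [parent])
    return None


path = find_category_path("File:Virus micrograph.jpg", "Buildings", PARENTS)
print("->".join(path) if path else "no path found")
```

Shown beneath a search result, the returned chain would make the faulty link visible at a glance (here, the hypothetical step from "Veterinary clinics" to "Veterinary medicine"). Searching upward from the file rather than downward from the broad category keeps the search small, since a file usually has far fewer ancestors than a large category has descendants.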

Not having this feature greatly limits the usefulness of categories on Commons: they can hardly be refined to the point where they only show files about the category subject. It also impedes searching them with deepcategory, as well as viewing categories in a modern scrollable wall-of-text view instead of having to navigate the many subcategories across which files are dispersed.

Examples

This would also be useful for displaying the source of a categorization, to see how the file relates to the selected category (which is usually apparent from the chain of category names).

Somebody please enable showing a category path beneath each item in the thumbnail view (and a button per item in the normal table view). It would be best as a native functionality but a gadget would help a lot too and maybe could later be turned into a native feature.

It could improve data quality a lot and fix many categorization problems that otherwise would be very difficult/unlikely to find.

Assigned focus area

Unassigned.

Type of wish

Feature request

Related projects

Wikimedia Commons

Affected users

Commons users

Other details