Wikivoyage talk:Hierarchical structures project

From Meta, a Wikimedia project coordination wiki

What to talk about

Structuring articles and the categories on shared: according to the geographical hierarchy concept seems to be the consensus on Wikivoyage. However, several details need to be clarified and improved. Every Wikivoyage contributor and user is invited to join this discussion and work out concepts for future improvements.

Some articles are worth reading for background knowledge (more can be added):

As a first step, we should probably list what we want to keep, where problems occur and what would be nice to have. This may be done as a rather unstructured brainstorming.

As a second step, it will be necessary to analyse what improvements could be made and how much effort they would take. Improvements should be described as exactly as possible, and results must be recorded on the project page. The whole discussion would be pointless if we skipped this step, as it is essential for the technical implementation.

Step 1: Brainstorming

Please add your ideas here. Make subsections as appropriate.

Local optimisation or general concept

One of Roland's central points in the discussion on de:Lounge is that the locDB control of geographical categories messes up the previously built structure on shared:. I want to generalize this point and ask: What is more important, a locally optimized structure or a general overall structure? My answer is clear: I prefer the general structure. -- Hansm 14:05, 13 January 2010 (UTC)

Considering the idea of a central geographical repository for all articles, and taking into account current and future functions like automaps, vicinity search (Umkreissuche) and more, we should go for a general structure. But we have to define once and for all the purpose and meaning of a primary and a secondary isin. As described on tech, it should be clear that a geographical feature like a desert or a mountain range (like the Alps) which spans more than one country should never be a primary isin, only a secondary one. What to do at the national level (e.g. Russia) should be discussed. As a remark: I would suggest spelling out exactly (so that everybody can follow) what the current situation (LocDB in charge on shared:) means. --Der Reisende 16:51, 13 January 2010 (UTC)
LocDB controlled geographical categories on shared: are described briefly on the shared: Lounge. What else do you think should be explained? -- Hansm 17:16, 13 January 2010 (UTC)
Obviously we do not use the same structures in :de and in :shared. For example, the German town Mannheim:
10 Levels in :de: Index > Eurasien > Europa > Mitteleuropa > Deutschland > Baden-Württemberg > Baden > Badische Rheinebene > Kurpfalz > Mannheim
7 Levels in :shared: Index > Eurasia > Europe > Central Europe > Germany > Baden-Württemberg > Mannheim
Could it be a solution to use the same structures in :de and in :shared? OK, if Roland says that there are problems with the locDB control, he may be right, but I have not noticed any problems so far. If there are serious problems, I think it would be better to switch off the new extension at once. But at the moment I do not see any problem caused by the locDB, so I cannot decide how to vote. -- Berthold 17:47, 13 January 2010 (UTC)
Currently, de:, it: and shared: use the same hierarchical structure for geographical articles or categories, respectively. It's just that not all intermediate levels are present on each of these three wikis. But it's the same hierarchy. Before I turned on my locDB control, the hierarchy on shared: was different from what we have in the locDB and thus on de: and it:. Even the breadcrumb trail on shared: did not always reflect the hierarchical structure defined by the categories. -- Hansm 19:39, 13 January 2010 (UTC)
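The claim that :de and :shared follow one hierarchy, with some intermediate levels simply absent on shared:, can be checked mechanically. Below is a minimal Python sketch using the two Mannheim trails quoted above. The German-to-English name table and the function name are assumptions made for illustration; they are not part of the locDB.

```python
# Hypothetical check: is the shared: breadcrumb trail the de: trail with some
# intermediate levels left out? Both trails are copied from the Mannheim example.

DE_TRAIL = ["Index", "Eurasien", "Europa", "Mitteleuropa", "Deutschland",
            "Baden-Württemberg", "Baden", "Badische Rheinebene", "Kurpfalz",
            "Mannheim"]
SHARED_TRAIL = ["Index", "Eurasia", "Europe", "Central Europe", "Germany",
                "Baden-Württemberg", "Mannheim"]

# Assumed translation table; names not listed are spelled the same on both wikis.
DE_TO_EN = {"Eurasien": "Eurasia", "Europa": "Europe",
            "Mitteleuropa": "Central Europe", "Deutschland": "Germany"}


def is_subsequence(short, full):
    """True if every item of `short` occurs in `full` in the same order (gaps allowed)."""
    remaining = iter(full)
    # `item in remaining` advances the iterator, so order is enforced.
    return all(item in remaining for item in short)


translated = [DE_TO_EN.get(name, name) for name in DE_TRAIL]
same_hierarchy = is_subsequence(SHARED_TRAIL, translated)
missing_on_shared = [name for name in translated if name not in SHARED_TRAIL]

print(same_hierarchy)
print(missing_on_shared)
```

If the shared: trail contained a level in a different order, or a level unknown on de:, `same_hierarchy` would be false, which is the situation described for shared: before the locDB control was turned on.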
The locDB control reduces some of the maintenance and checking work on shared:. But Roland pointed out some disadvantages of this solution, e.g. that alphabetical ordering via a second parameter in a category definition no longer works. That's the reason why I want to keep shared:'s state as it is right now. Is it possible to integrate all these needed features into the locDB to get rid of the disadvantages that Roland has mentioned? -- DerFussi 11:54, 8 February 2010 (UTC)
I wonder if this contribution is well placed in this section, but anyway, just a short answer: I'm about to review central parts of the LocDB's code. This has turned out to be necessary after one year of experience. And it must be completed before making more modules, as it affects the interface between modules and the framework of the LocDB. New LocDB features will not come before autumn or winter. -- Hansm 21:53, 9 February 2010 (UTC)
To be honest, I did not know where to place this contribution. But I came up with this question while reading this section :) -- DerFussi 22:09, 9 February 2010 (UTC)

shared: has geographical and non-geographical categories

The most important question for me is whether it makes sense to treat geographical and non-geographical (image) categories in different ways, i.e. to control the hierarchy of geographical categories by the locDB and keep the traditional category links for non-geographical categories. I see a pro and a con. The pro is having one general geographical hierarchy across all language versions and shared:. The con is that the local category system on shared: becomes less self-consistent than it used to be.

Non-geographical categories include templates, meta articles and non-geographical images like images for tutorials, logos or even coats of arms or maps. Perhaps, the latter two also might be regarded as geographic images, but currently, they are not. Anyway, non-geographical images/categories are a minority.

For their daily work, most contributors will have to deal with geographical categories, e.g. when uploading images from their last trip. And most images will be uploaded in order to put them into the article the contributor is working on. Typically, contributors switch between their articles on the language version and the shared: image repository. Finding the same geographical hierarchy on both makes things easier and less error-prone. This is a strong reason for a general overall geographical hierarchy.

Conclusion: We have to pay the price of a minor local inconsistency on shared:, but getting a general geographical hierarchy is more important.

-- Hansm 13:10, 14 January 2010 (UTC)

I think the consistency of the category system is more important than control by the location database. One reason is that the philosophy of categorising locations differs from that used in articles. --Roland 15:53, 16 January 2010 (UTC)
What does this say about geographical and non-geographical categories? -- Hansm 17:50, 16 January 2010 (UTC)

See Project talk:Expedition Hierarchical structures#New name for old stuff for a proposal related to this topic. -- Hansm 11:45, 17 January 2010 (UTC)

Is it possible to set up a second namespace on shared: that works like categories, or is the behaviour of categories limited to categories themselves? If not, we could introduce a namespace "topic" on shared: as well, to categorize all non-geographical files, so that it is easier to limit locDB features to files with a geographical relation. -- DerFussi 08:17, 10 February 2010 (UTC)
I'm not sure how to understand your first question/proposal. Regarding your alternative with namespace "topic", what would be the difference from what I have proposed in section New name for old stuff? -- Hansm 09:12, 10 February 2010 (UTC)
Ha, ha, cool! It seems to be the same. I had not read that section before. I have read this page section by section rather than as a whole. Sorry. -- DerFussi

Add coordinates to geographical images on shared:

Maybe some problems on shared: could be solved if all pictures were geotagged. How about this? --Der Reisende 17:17, 13 January 2010 (UTC)

Sure, but this is probably not possible. How about categories like this: shared:Category:Templates, shared:Category:Symbols or shared:Category:Tutorials? -- Hansm 19:27, 13 January 2010 (UTC)
Yes, for non-geographical images. As an example for geographical pictures, have a look at locr.com's Holstentor. They also have a simple breadcrumb below the map. --Der Reisende 06:27, 14 January 2010 (UTC)
Please remember our manpower. I think there are more important things to do than making a breadcrumb trail for each photo, and sometimes it will be impossible even for geographical pictures. -- Berthold 11:48, 14 January 2010 (UTC)
That's the crucial point. locr.com might be nice, but almost none of our images will come from there. Thus coordinates would have to be added manually in the vast majority of cases. Even if we treated non-geographical and geographical images in different ways, there is still the question of who would be willing to add geographical coordinates to each image. I'd estimate that many contributors would either stop uploading images altogether or simply not add the coordinates. All in all, I think this idea would not work. -- Hansm 12:42, 14 January 2010 (UTC)

Maybe we can make this an option. --Der Reisende 12:27, 16 January 2010 (UTC)

How to insert new pages in namespace Topic:

OK, weeks ago I was able to add some pages. But every time I try it again it seems to be very difficult. I see de:Thema:Liste der Reiserouten, de:Thema:Reiserouten-Index and de:Thema:Reisethemen-Index. Then I have a look at de:Kategorie:Reiserouten with its categories and subcategories, and now my confusion is complete. Yes, I will try to start a new page, and I think after some trial and error I will succeed. Perhaps we can find a way to simplify the procedure and the structures.

Right, that's a weak point in our structuring system. Let me explain the central ideas as discussed on de: a very long time ago:
  1. Articles in namespace topic should give additional information for readers with special interests.
  2. As there might be a lot of additional topic articles related to one article in the main namespace, it seemed better to link topics from the sidebar rather than from the travel article itself.
  3. In order to help readers find topic articles for their special interests (e.g. hiking, cycling, fishing, architecture), topic articles should be linked with specific, significant keywords, one for each special interest.
  4. These keywords should be the same for the whole wiki.
There were several ideas on how to structure topic articles. Finally, we decided to make use of categories. The keyword to be displayed in the sidebar should be given by the name of the top category a topic article is in. During the first weeks, breadcrumbs via the template IsIn became a de facto standard. It turned out that contributors preferred to link topic articles directly from the travel article text, so that item 2 above became pointless. Furthermore, contrary to what had been planned, contributors found a way to suppress sidebar links so that a link only showed up in the article text (::-syntax).
From the very beginning, contributors have had great difficulty understanding the concept and impact of categories for topic articles, which usually leads to the notorious _XXXX_ sidebar links. Also, it has turned out that for many travel articles there are several related topic articles with the same keyword, e.g. two cycle ways that both run through the place described in the travel article.
That's the story in short. And what to do now?
There is a proposal made by Roland on the telephone: Make sidebar links from travel articles by a special template. The template takes the title of the topic article and a link text as additional optional argument. If the link text is too long for the sidebar, it is automatically chopped. Probably, this would be easier to understand for contributors, but we would lose the keywords.
Are there other proposals?
-- Hansm 18:21, 14 January 2010 (UTC)
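To make Roland's template proposal more concrete, here is a minimal Python sketch of the logic such a template would need: take the topic article title, use an optional link text, and chop the label if it is too long for the sidebar. The function name and the limit of 20 characters are assumptions; the discussion does not specify a limit or an implementation.

```python
# Hypothetical sketch of the proposed sidebar-link template logic.
# MAX_SIDEBAR_CHARS is an assumed limit; the discussion does not name one.
MAX_SIDEBAR_CHARS = 20


def sidebar_link(topic_title, link_text=None, max_chars=MAX_SIDEBAR_CHARS):
    """Return (target, label) for a sidebar link to a topic article.

    The label is the optional link text, or the article title if none is
    given, chopped to at most `max_chars` characters.
    """
    label = link_text if link_text else topic_title
    if len(label) > max_chars:
        # Keep room for the ellipsis that marks the chopped label.
        label = label[:max_chars - 1].rstrip() + "…"
    return topic_title, label


target, label = sidebar_link("Thema:Liste der Reiserouten")
print(label)
```

As noted in the proposal, labels produced this way are free text per link, so the wiki-wide keywords from item 4 would be lost.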
I thought about a more radical way of dealing with topics. All articles which are more or less an outsourced part of a location article in the main namespace should remain there. This means, e.g., no more articles about museums in the topic namespace. First aid, travel pharmacy, packing list etc. will remain there, but eating and drinking in the US refers to the cuisine heading in the location article and is therefore a "Location/eat and drink" article. So all such topics would be attached to their location in the main namespace. No more hassle. One more thing: Radrouten (bike routes) lie in a state, a country or a continent and are therefore part of mobility. --Der Reisende 06:18, 15 January 2010 (UTC)
I think one of the problems is that most users linked ALL topic articles in the sidebar. This behaviour led to chaos. There are 5 links called "Fahrrad" in the sidebar, as well as the well-known "_XXXX_". I think stricter rules can help more than new technical features. Most topic articles have a geographical relation anyway. We need a topic-related hierarchical structure similar to the hierarchy in the main namespace, e.g.: travel routes > travel routes in Asia > travel routes in Southeast Asia > travel routes in Thailand. Every article gets its own corresponding link in the sidebar. All specific articles get the two double colons and are linked from the article text only. In some cases you can link to one level higher (e.g. Frankfurt has a sidebar link to the cycle paths in Hesse, and the specific cycle path articles are linked in the article text only). So we need some more articles just for the hierarchy, but with some effort they can be short but nice as well (just a list, a picture and some links and words). We have to set the right categories and IstIn only once for these articles and will never have to touch them again. So there is no need to bother other contributors with this issue. Nobody will complain about the "_XXXX_", and we don't have to make any changes to our system. This solution is useful for almost every topic: bicycle, climbing, diving and travel routes; even gay travel or wedding travel has its own regions in the world and is ultimately geographically related. Packing lists and unusual travel projects don't need a link in other articles. People find them via our main page > travel topics. -- DerFussi 12:19, 8 February 2010 (UTC)

Primary and secondary IsIn

Referring to the phone call about creating regional polygons in the LocDB for automated sorting of sublocations, I would prefer a 90% solution for area coverage over a 100% one. That is: instead of demanding that a smaller region be completely contained in a larger region, it would be sufficient if 90% of the smaller area is covered by the larger one. --Der Reisende 06:23, 15 January 2010 (UTC)

Then the hierarchy breaks down. It does not work. -- Hansm 08:24, 15 January 2010 (UTC)
Why? I thought by defining the above we could solve the problem of precise borders (not overlapping by even 2 cm ;-)). --Der Reisende 09:02, 15 January 2010 (UTC)
By the way, do you know this site? --Der Reisende 10:40, 15 January 2010 (UTC)
No, but it looks like a collection of interactive programs, thus pointless for usage on the server.
Why your proposal breaks the hierarchy:
Suppose 90% of location B lies in the area of location A, and location C lies somewhere in the other 10% of location B. Then C would be part of B and B would be part of A. Consequently, we would also have to say C is part of A, but this is wrong as C is outside of A.
Furthermore, let D be the neighbouring location of A that really contains C. What would be the breadcrumb trail for C? Is it D > C or is it A > B > C? Or is it D > B > C, as C is part of B?
Another thing: How should these fuzzy locations be shown on maps? For maps, it is essential that neighbouring locations touch each other, i.e. that they neither intersect nor leave a gap in between. If there were a gap, we would have an ugly stripe of no man's land between adjoining locations.
-- Hansm 14:54, 15 January 2010 (UTC)
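The transitivity argument above can be reproduced with a few lines of Python. This is only a sketch under a simplifying assumption: locations are modelled as one-dimensional intervals rather than polygons, and the names `coverage` and `is_part_of` are made up for the illustration.

```python
# Locations are modelled as 1-D intervals (start, end). This is a simplifying
# assumption; the real locDB proposal is about polygons.

def coverage(inner, outer):
    """Fraction of `inner` that lies inside `outer`."""
    overlap = min(inner[1], outer[1]) - max(inner[0], outer[0])
    return max(overlap, 0) / (inner[1] - inner[0])


def is_part_of(inner, outer, threshold=0.9):
    """The proposed fuzzy rule: 90% coverage is enough to count as 'part of'."""
    return coverage(inner, outer) >= threshold


A = (0, 90)
B = (0, 100)   # 90% of B lies inside A
C = (92, 98)   # C lies in the 10% of B that is outside A

c_in_b = is_part_of(C, B)   # C is part of B
b_in_a = is_part_of(B, A)   # B is part of A under the 90% rule
c_in_a = is_part_of(C, A)   # but C is not part of A

print(c_in_b, b_in_a, c_in_a)
```

With `threshold=1.0` the relation is plain containment, which is transitive, so the contradiction cannot occur; with any threshold below 1.0 a counterexample like the one above can be constructed.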

I see your point. It would only work for the smallest conceivable polygon. Sorry, I got carried away by enthusiasm. --Der Reisende 12:33, 16 January 2010 (UTC)

New idea. The primary isIn should be political (place -> state, Bundesland, government -> country -> maybe continent) and the secondary isIn should refer to a region. So we will have two strands merging at continent level. The advantage of political structures is that there are (mostly) seamless, non-overlapping borders. --Der Reisende 11:14, 17 January 2010 (UTC)

What would be the goal of this hierarchy compared to what we currently have? -- Hansm 11:49, 17 January 2010 (UTC)
It would be clearer. Two strands, each consistent in itself. Both visible, so you know where it is politically and geographically. --Der Reisende 06:34, 18 January 2010 (UTC)
Basically, political borders would (almost) perfectly meet the goal that primary IsIns (more precisely, what I have called "regular" on tech:Locations Database/Further Development#Better consistency for secondary IsIns) should be seamless and non-overlapping. However, in practice, in most cases the exact borders are not available to us. Furthermore, in order to get the locations' areas really seamless and non-overlapping, we would need one global, self-consistent set of border definitions. There is no way to get it. So, using political entities may look tempting, but the benefit would be limited.
Also, I think political entities below country level are often less important for travellers. At least with my way of travelling, I do not care too much about them. What I am much more interested in are things like landscape, people or maybe even cultural aspects. These are soft borders, and in many cases they do not coincide well with political borders.
As we can see from our current hierarchy, non-political hierarchisation criteria do not necessarily lead to one non-political strand. What we actually have is a network of secondary paths rather than a clearly branched tree.
-- Hansm 10:03, 20 January 2010 (UTC)

What do the wikis want

The question for me is: "What do the wikis want?" In particular, is the intention on :shared the same as on :de and :it?

In the language wikis it is clear, in my opinion, that the hierarchy has to reflect reality as finely as possible. So every region of any importance should eventually exist, with information for the "consumer".

On :shared, I think the question is whether the wiki should ever be more than just a container for the pictures. Or more specifically: Does anyone need the information from the detailed hierarchy of the region on :shared? For most foreign visitors here in Bavaria, the difference between Bavaria and Franconia is already a very fine distinction, sometimes even for locals who are not natives.

In all of the wikis there have always been some smaller unsolved problems that we had to learn to live with. In the end, the solution must be implemented by the technicians, and they must foresee the consequences of this solution in the near and medium term. I do not think I can see these consequences yet. Have a good weekend. --Bbb 16:04, 15 January 2010 (UTC)

Shared: is designed to be nothing but a media repository. You have asked "Does anybody need the information from the detailed hierarchy of the region on shared:?", but what would be your answer?
Developers are not more clear-sighted than other contributors. Sure, they have to implement the ideas into working software. But planning new concepts is not a task exclusively for developers. Sometimes, it is really hard to guess in advance what users would find intuitive and simple.
Furthermore, some postings in the above linked discussion on de: have bewailed the lack of a clear description where further development goes to.
The intention of this discussion was to plan the direction of future development related to hierarchical structures and fix some results. This should be done in order to not surprise contributors any more with new features that developers find useful, but maybe not the contributors.
-- Hansm 16:44, 15 January 2010 (UTC)
Let me show you one more idea. Somebody reads a nice article about Munich and sees a link to shared:. Following the link, the reader sees more pictures. So, to have perfect cross-linking capability, the locDB should take control. Cross links then become easier from every future wiki. --Der Reisende 12:37, 16 January 2010 (UTC)

New name for old stuff

As Roland has criticized (not here, but unfortunately on the de:-Lounge), modifying the behaviour of well-known MediaWiki core features like categories confuses inexperienced users. Categories must work as on every other MediaWiki installation. That's a serious point.

On the other hand, the natural way to put a bracket around a group of images is categories, and we need those brackets on shared:. In the case of images belonging clearly to a certain place, their categories correspond exactly to what we call a location when looking at it from the LocDB's point of view.

My attempt to control these geographic categories by the locDB has led to vibrant discussions. Amongst other things, it has been criticized that locDB controlled categories no longer work as usual.

A solution might be to implement something similar to categories on shared:, but to give it a different name, say "Location" or "Place". The actual name does not matter. Here is the idea in more detail:

  1. Leave categories as they come with the MediaWiki software.
  2. Make a new namespace called "Location".
  3. Treat pages in namespace "Location" like traditional category pages, maybe combined with the functionality of recursive listing as known from the special page "LdbListSubLocs".
  4. Include geographical images into "locations" rather than into categories. A location link works exactly like a category link, i.e. by saying [[Location:Berlin]] instead of [[Category:Berlin]] on the image description page.
  5. Ignore normal location links from all namespaces except namespace Image:.
  6. Allow inline location links (with a ":" preceding the namespace, i.e. [[:Location:Berlin]]) from all namespaces.
  7. Ignore normal category links (without ":") from namespace Location:.
  8. Control the hierarchy of locations by the locDB.
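The link-handling rules 4 to 7 above can be summarized as a small decision function. The following Python sketch is only an illustration of the proposal, not MediaWiki or locDB code; the function name, the result labels and the regular expression are assumptions. Inline category links are treated like inline location links, which matches how `[[:Category:...]]` already behaves in MediaWiki.

```python
import re

# Matches [[Location:X]], [[Category:X]] and the inline forms with a leading ":".
LINK_RE = re.compile(r"\[\[(:?)(Location|Category):([^\]|]+)")


def classify_link(source_namespace, link):
    """Classify a link per the proposed rules: 'member', 'inline' or 'ignored'.

    Returns None for text that is not a location or category link.
    """
    match = LINK_RE.match(link)
    if match is None:
        return None
    leading_colon, target_namespace, _title = match.groups()
    if leading_colon:
        # Rule 6: inline links are plain hyperlinks and allowed everywhere.
        return "inline"
    if target_namespace == "Location":
        # Rules 4 and 5: only image description pages become members of a location.
        return "member" if source_namespace == "Image" else "ignored"
    # Rule 7: location pages cannot be put into categories.
    return "ignored" if source_namespace == "Location" else "member"


print(classify_link("Image", "[[Location:Berlin]]"))
print(classify_link("Main", "[[:Location:Berlin]]"))
```

Rule 8 is not covered here: the parent of a location page would come from the locDB, not from any link on the page.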

Perhaps this proposal sounds simple and not really substantial. But in fact, it could solve most of the known problems.

  1. The MediaWiki core feature of categories remains untouched.
  2. The currently hidden differentiation between geographical and non-geographical categories would become more evident. The tree of non-geographical categories would remain as it currently is. Geographical categories would no longer exist, as they would be called "locations", or whatever, and would be under the control of the locDB. Both trees are independent of each other and cannot be mixed.
  3. Additional sections like "Related categories" on category pages with a link list to related categories also would work on "location" pages and vice versa.
  4. So called flat hierarchies are realized perfectly since the user always can choose between getting displayed direct sublocations only or the complete list of recursive sublocations. No more need for multiple categorisation.
  5. If really necessary, the sublocations of very important places could be linked from a location page, e.g. location Athens could be linked from the location page of Greece. Not the other way round as it currently is.
  6. Images could be categorized as well as "localized". Even multiple "localisation" of images would be possible if really necessary.
  7. The location tree would always be synchronous with the hierarchy seen in the language versions.
  8. When creating new location pages, no link to the parent location needs to be set. In fact, it cannot even be set.
  9. Since locations and categories are formally separated, it is easy to make automated lists of location pages without a corresponding locDB entry. This is currently not possible since there is no formal difference between geographical and non-geographical categories. Probably, it also would not be too difficult to make a list of location pages linked from images but without a corresponding locDB entry. This is important for maintenance tasks.
  10. Unintentional cyclic relations are avoided with locDB controlled location trees. Curiously, the category mechanism does not prevent cyclic category links.
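Points 4 and 10 of this list can be illustrated with a small Python sketch: one table with a single parent per location is enough to list either direct or recursive sublocations, and to reject a parent assignment that would close a loop. The table contents and function names are hypothetical and only stand in for whatever the locDB actually stores.

```python
# Hypothetical parent table, as a locDB-like store might hold it:
# exactly one parent per location, None for the root.
PARENT = {"Europe": None, "Greece": "Europe", "Attica": "Greece", "Athens": "Attica"}


def direct_sublocations(location, parent=PARENT):
    """Locations whose parent is `location`."""
    return sorted(child for child, p in parent.items() if p == location)


def recursive_sublocations(location, parent=PARENT):
    """All locations below `location`, depth first (assumes a cycle-free table)."""
    result = []
    for child in direct_sublocations(location, parent):
        result.append(child)
        result.extend(recursive_sublocations(child, parent))
    return result


def would_create_cycle(child, new_parent, parent=PARENT):
    """True if making `new_parent` the parent of `child` would close a loop."""
    node = new_parent
    while node is not None:
        if node == child:
            return True
        node = parent.get(node)
    return False


print(direct_sublocations("Greece"))
print(recursive_sublocations("Greece"))
print(would_create_cycle("Greece", "Athens"))
```

A store that runs a check like `would_create_cycle` before every update can never contain a cyclic chain, whereas free-form category links have no such guard.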

I only see two points pointed out by Roland that still would not work:

  1. Defining sort keys. This is a matter of time rather than a basic problem. In the future, we will have a LocDB module "Names" which could hold one (or more) sortkeys.
  2. Using pseudo locations like Egyptian governorates. This is, however, an undermining of the concept of locations.

-- Hansm 11:33, 17 January 2010 (UTC), updated at 20:04, 17 January 2010 (UTC)

Step 2: Analysis

Problems on focus

It seems as if we focus on three problems. Those are:

  1. Hierarchical order on the language wikis.
  2. How to deal with travel topics in a hierarchical order (LocDB or categories), with clear rules on what belongs in this namespace (like travel routes and museums on one side, and first aid and general preparation on the other side).
  3. How to deal with shared categories?

--Der Reisende 12:37, 19 January 2010 (UTC)

Yes, these seem to be the hot spots. -- Hansm 15:20, 19 January 2010 (UTC)

Suggestion for a meeting

The suggestion of a meeting has been made on the de: lounge. Proposed dates are 12th to 14th February, 5th to 7th March or 12th to 14th March. No location for the meeting has been proposed yet. The meeting should help to focus on the current problems and create a proposal for a solution, which should be presented to the community. Everybody is invited. --Der Reisende 12:37, 19 January 2010 (UTC)

All of these would be fine for me. -- Hansm 15:20, 19 January 2010 (UTC)
The meeting will be postponed to mid or end of April. The date will be announced in time. --Der Reisende 13:17, 15 February 2010 (UTC)
Let's discuss it as an agenda item during our next members' gathering. -- DerFussi 09:49, 24 February 2010 (UTC)
Don't you think this would be somewhat late? If we wait for the members' meeting just to set a date for the working meeting, this is wasted time. I think there are more important things to talk about in the live meeting than dates. This can perfectly well be done on the wiki. -- Hansm 21:30, 26 February 2010 (UTC)