Case against subpages

From Meta, a Wikimedia project coordination wiki
This is an essay. It expresses the opinions and ideas of some Wikimedians but may not have wide support. This is not policy on Meta, but it may be a policy or guideline on other Wikimedia projects. Feel free to update this page as needed, or use the discussion page to propose major changes.

This essay applies to subpages in Wikipedia, not to other wikis where they may be more applicable (such as Wikibooks).

Monday, July 2, 1:03 PM -- Another column today--I was inspired. In two previous columns (/Why I am suspicious of subpages and /Accidental linking and hard-wired category schemes) I gave a few reasons to "eschew subpages", and in the former, I referred to the existence of a number of other good reasons to avoid subpages (though not, as I've also said, in every case). So what are those other good reasons? Here is a laundry list; there are probably other reasons.

  • The level at which subpages are created is arbitrary. It seems entirely arbitrary at what level subpages will be created. In most cases, there is no good reason why X should be considered a subpage of Y, when Y could just as easily be made a subpage of Z, and so on.
  • Only one level of hierarchy--arbitrary and unappealing. Subpages allow us to impose some conceptual structure or hierarchy on topics--but only one level of hierarchy. This is conceptually unappealing. Shouldn't it be either all or none? Better yet, why not let links internal to the articles specify the conceptual relationships between articles?
  • The parent page-subpage relationship is variable--again, arbitrary and unappealing. The relationship between parent page and subpage is entirely variable, which is puzzling and, again, conceptually unappealing. For example, we could make "CountriesA" a subpage of "Countries of the world"; then, the list of pages under "CountriesA" would be the set of the countries of the world whose names in English begin with the letter "A." We could also make "Pearl Harbor" a subtopic of "World War II," and the relationship here is that Pearl Harbor is the-location-of-an-important-attack-in World War II. We could make "David Hume" a subtopic of "Philosopher" because Hume is a philosopher. Etc. What does the slash mean?
  • We don't know when to use subpages--it's arbitrary. It is difficult to guess when other people will have created a subpage hierarchy and when not. There are no clear, intuitive standards that we are following in making the decision. This makes it marginally more difficult to guess at page titles, for one thing.
  • Subpage titles are ugly. Subpage titles are typically ugly--they employ nonstandard punctuation (the slash), for one thing.
  • One main motivation, disambiguation, will soon disappear. As soon as Jimbo and crew install the newest UseMod version, we will be able to use parentheses in titles.

Only in some very few cases does the convenience of subpages seem to outweigh the above reasons. Even in those cases I have my doubts, and even in those cases I'm liking subpages less and less. They seem particularly useful only for a field or hobby or fictional world that has a large amount of "in" jargon that is not used anywhere else, and that constitutes a sort of self-contained system. But the fact that subpages are used only for these cases and not others might itself seem to be a problem, insofar as we would like (eventually) consistent solutions to the disambiguation problem.

Meta-subpages like /Talk and /Opinions seem to be all right, then:

  • Non-arbitrary: there is always a "Talk". In general, we standardize on a fixed, small number of meta-subpages.
  • Unambiguous: "Talk" always means the same thing. In general, meta-subpages always have a "meta" kind of relationship with the parent.


Here's a possible software solution, though it would probably require significant work (some of which would bring other benefits). Allow authors to embed non-printing instructions in a page, similar to the #REDIRECT available now. This could also be used for metadata, if we want, instead of subpages. In particular, allow the embedding of a "Context" tag to disambiguate links. For example, in a page labelled "#CONTEXT Mathematics", a simple link to "real" would look first for a page "Real (Mathematics)", and only if it didn't find one, link to "Real" at the top level. This obviates my use of subpages for disambiguation, while making linking even easier. I think it is critically important to make correct ad-hoc linking easy for authors and editors. Establishing the context would be a minor chore for the first author or editor, but thereafter no other editors would need to bother with it--it would just work. --LDC
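
The lookup LDC describes could be sketched roughly as follows. This is only an illustration in Python, not UseModWiki code: the function name resolve_link, the set of existing page titles, and the rule that a link's first letter is capitalised to form the title are all assumptions made for the example.

```python
def resolve_link(target, context, existing_pages):
    """Return the page title that a [[target]] link should point to.

    Hypothetical helper: `existing_pages` is a set of page titles and
    `context` is the value of the page's #CONTEXT tag (or None).
    """
    # Assumption: titles begin with a capital letter, so "real" -> "Real".
    title = target[:1].upper() + target[1:]
    if context:
        # Try the context-specific page first, e.g. "Real (Mathematics)".
        candidate = f"{title} ({context})"
        if candidate in existing_pages:
            return candidate
    # Otherwise fall back to the top-level page.
    return title


pages = {"Real", "Real (Mathematics)", "Number"}
print(resolve_link("real", "Mathematics", pages))
print(resolve_link("number", "Mathematics", pages))
```

With these sample pages, the first call picks "Real (Mathematics)" and the second falls back to "Number", because no "Number (Mathematics)" page exists.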

Seconded enthusiastically. I have some familiarity with the code of UseModWiki and might send Clifford Adams a patch to implement #CONTEXT.

Is it enough to prepend non-printing instructions in a page, rather than "embed" as LDC suggests? Example:

#CONTEXT Mathematics
In Mathematics, a [[real]] number is defined as...

would be allowed and honored, but

In Mathematics,
#CONTEXT Mathematics
a [[real]] number is defined as...

would not be.
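
The prepend-only rule might look something like this in practice. It is a sketch under assumptions, not a patch: split_context_header is a made-up name, and accepting several leading #CONTEXT lines is a guess about how the directive could behave.

```python
def split_context_header(page_text):
    """Split page text into (contexts, body).

    Only #CONTEXT lines at the very top of the page are honored;
    a #CONTEXT line appearing later is left in the body untouched.
    """
    prefix = "#CONTEXT "
    lines = page_text.split("\n")
    contexts = []
    index = 0
    # Consume directives until the first line that is not a #CONTEXT line.
    while index < len(lines) and lines[index].startswith(prefix):
        contexts.append(lines[index][len(prefix):].strip())
        index += 1
    return contexts, "\n".join(lines[index:])


honored = "#CONTEXT Mathematics\nIn Mathematics, a [[real]] number is defined as..."
ignored = "In Mathematics,\n#CONTEXT Mathematics\na [[real]] number is defined as..."
print(split_context_header(honored))
print(split_context_header(ignored))
```

Reading directives only from the top keeps the parser a single pass over the leading lines, which is the main attraction of prepending over embedding.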

How would you like multiple contexts to work? Suppose a page declares two contexts, xxx and yyy, in that order. Then

About [[aaa]]...

would link to "aaa (xxx)" if it exists, otherwise to "aaa (yyy)" if that exists, otherwise to "aaa".
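
That ordered fallback could be sketched as below. The function name resolve_with_contexts and the idea of passing the contexts as an ordered list are assumptions for illustration; titles are used exactly as written, to match the "aaa" example.

```python
def resolve_with_contexts(target, contexts, existing_pages):
    """Try each context in priority order, then fall back to the bare title."""
    for context in contexts:
        candidate = f"{target} ({context})"
        if candidate in existing_pages:
            return candidate
    # No context-specific page exists: link to the top-level page.
    return target


contexts = ["xxx", "yyy"]
print(resolve_with_contexts("aaa", contexts, {"aaa", "aaa (xxx)", "aaa (yyy)"}))
print(resolve_with_contexts("aaa", contexts, {"aaa", "aaa (yyy)"}))
print(resolve_with_contexts("aaa", contexts, {"aaa"}))
```

The order of the list is what decides the winner when both "aaa (xxx)" and "aaa (yyy)" exist, so any real implementation would have to define how that order is declared.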

A problem with #CONTEXT as proposed is that it would invalidate existing pages (and require them to be tracked down and regenerated) as context-specific entries are created. LDC suggests:

#CONTEXT Mathematics
[[Real]] algebraic roots...
would look first for a link "Real (Mathematics)", and only if it didn't find one, link to "Real" at a top level.

Let's say that, at the moment I save the example above, "Real (Mathematics)" does not exist: the link then will point to "Real", and UseModWiki may cache the generated HTML for efficiency. Now someone creates "Real (Mathematics)". At that point my example

  • either will start pointing to "Real (Mathematics)" as soon as it is displayed (an automatic redirection that nobody has explicitly asked for, or even is aware of)
  • or will keep pointing to "Real" until it is edited and saved again and the HTML regenerated (leading to potentially stale content, at least in some minor way)

Are we aware of this limitation? I personally can live with it, but I'd like to set expectations straight.
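
The two behaviours can be reproduced with a toy model. Everything here is hypothetical (the TinyWiki class, its method names, and the simplification that a page holds a single link); it does not reflect how UseModWiki actually caches HTML.

```python
class TinyWiki:
    """Toy wiki that caches each page's resolved link at save time."""

    def __init__(self):
        self.titles = set()      # every page that exists
        self.sources = {}        # title -> (context, link target)
        self.cached_links = {}   # title -> destination computed on save

    def resolve(self, target, context):
        if context is not None:
            candidate = f"{target} ({context})"
            if candidate in self.titles:
                return candidate
        return target

    def save(self, title, context=None, target=None):
        self.titles.add(title)
        if target is not None:
            self.sources[title] = (context, target)
            # The "generated HTML" is frozen here until the next save.
            self.cached_links[title] = self.resolve(target, context)

    def link_resolved_on_display(self, title):
        context, target = self.sources[title]
        return self.resolve(target, context)

    def link_served_from_cache(self, title):
        return self.cached_links[title]


wiki = TinyWiki()
wiki.save("Real")
wiki.save("Algebra", context="Mathematics", target="Real")
wiki.save("Real (Mathematics)")  # created after "Algebra" was saved

print(wiki.link_resolved_on_display("Algebra"))  # silently redirected
print(wiki.link_served_from_cache("Algebra"))    # stale until re-saved
```

In this model, resolving on display yields "Real (Mathematics)" while the cached copy still says "Real"; saving "Algebra" again brings the cache into line.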

Interesting idea, Lee. I'm not sure if I definitely like it, but it sounds like a nice improvement.

Yes, /Talk pages are definitely exceptions. --LMS

Worryingly, Larry, I am coming round to your way of thinking (see Quotation/Talk), but this is primarily because we only have one level of sub-paging. The #CONTEXT fix would certainly make for an improvement all round IMHO; the caching problem should be negligible, and it wouldn't be terribly difficult to have a bot run every so often to resolve these discrepancies. --sjc

I like the idea but do have some questions:

  • Will there be a limited number of #CONTEXTs allowed? If so, won't that limit eventually be seen as arbitrary? If not, won't it eventually cause conflict? What if authors disagree about a context (this is very possible in the arts)? Can pages be assigned more than one context? What if authors disagree about the ranking?
  • What if a very specific page (a "context" page) is created when the more general one hasn't been? Will links ignore the more specific page, or point to it? Which context (if any) will be taken as standard?
  • Would the #CONTEXT command take the place of parentheses as well as subpages, or be used in conjunction with them? I, for one, would much prefer #CONTEXT Mathematics for "Real" but not #CONTEXT Play for Shakespeare's Henry V and #CONTEXT Kenneth Branagh and #CONTEXT Laurence Olivier for the movies (they could both have entries of their own, discussing different technical and artistic points, though I'm not the scholar to write them).
  • Will every page have a #CONTEXT field? I think that nearly any subject could fit in more than one field. But I'm not sure what I'd make of a person having apparently distinct pages in separate areas; I think I'd rather see them all visibly linked and choose which (or all) of them to read.

Again, I do like the idea, but I think the issues above are relevant (and, besides, it's better to have a plan and not need it). --KQ