Redirects in search results - proposed software changes
Lots of people have suggested ways to improve the interaction of searches and redirect pages. Some of these follow:
What to do with standard redirects
When searching, if a redirect comes up, we should provide the title of the page it redirects to, not the title of the redirecting page. For example, if you search for "nine eleven", then en:September 11, 2001 Terrorist Attack should come up in the list of "Article title matches", rather than en:Nine eleven.
Also, if two redirects to the same page come up (e.g. en:nine-eleven and en:nine eleven), then only one should be displayed, not both. Similarly, when searching for "list christians", the redirect at en:list of christians shouldn't be in the list because the target of that redirect (en:list of Christians) does appear. Martin
- I'd like to vote to keep all the redirects if and only if someone/s commits to write, test, and implement a change to the Wikipedia software so that search results are decluttered from multiple redirects to the same article. Heck, I'll offer ten wiki-kisses to anyone who does so. :) Martin 12:20 3 Jul 2003 (UTC)
- If we filtered duplicate redirects, we'd have other problems, e.g. a spelling error showing up in the search results, but a legitimate variant being hidden.
As I've suggested before, the solution is to show the name of the target of the redirect, rather than showing the redirect itself (and then remove duplicates).
- This could lead to quite confusing results, especially when the titles are entirely different and the person searching knows not that the result is relevant. IMHO the redirects should be listed, but in small font and nicely formatted below the main article title. --Eloquence
- You will find it hard to name any example that has as many redirects as this one.
- And ugly it is, as well. --Eloquence
Telling people what they will find hard is never really advisable, Eloquence. I can easily think of an article off the top of my head that has more redirects than this one. Welly wanging. :) I don't know if that's where the article is, but I don't need to, because I know that if it's not, it will at least redirect you there. :) That's what's so great about redirects - you don't have to bother remembering exactly where articles are, off the top of your head. If they clutter up the search results, then it should be the search function that is changed, not the policy on redirects. I think it's ridiculous to change a policy that makes editing the Wikipedia easier just to fit in with an unsatisfactory technical function. We should just change the search function. There is no reason that I can think of to make all redirects appear in the search results. We could do something like Google's suppression of results that are similar to others, with a link to the full set of results for anyone who wants it. Here's how it could work: a person searches for the word "X" (e.g. "AIDS"), and the program does the search, but for each page that has "X" in the title, it suppresses the redirects to that page, because they add no useful information. (This would suppress all of the redirects we're voting on). For each page that doesn't have "X" in the title, the redirects which do have "X" in their titles would still be shown, since they do add useful information. How's that for a solution? -- Oliver P. 02:08 7 Jul 2003 (UTC)
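The suppression rule described above could be sketched roughly as follows. This is only an illustration, not existing Wikipedia code: the `Page` record, `title_matches`, and `filter_results` are hypothetical names, matching is simplified to a case-insensitive substring test, and the hits are assumed to be already loaded in memory, so the sketch says nothing about how to do this efficiently against the database.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Page:
    """Hypothetical search hit: a title plus the redirect target, if any."""
    title: str
    redirect_to: Optional[str] = None  # None means an ordinary article


def title_matches(title: str, query: str) -> bool:
    # Simplified matching: case-insensitive substring test (an assumption).
    return query.lower() in title.lower()


def filter_results(hits: List[Page], query: str) -> List[Page]:
    """Drop redirects whose target title already matches the query.

    Redirects whose target does NOT match the query are kept, because
    they are the only way the target would show up at all.
    """
    kept = []
    for page in hits:
        if page.redirect_to is not None and title_matches(page.redirect_to, query):
            continue  # the target itself matches, so the redirect adds nothing
        kept.append(page)
    return kept
```

With this rule a search for "AIDS" would hide a redirect pointing at the AIDS article, while a search for "welly" would still list Welly wanging, since its target (Wellie wanging) does not contain the search word. A "show all results" link would simply skip the filter.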
- The barrier to this solution is finding an efficient implementation. I'd code this for you, but I'm not sure how to do it without significantly slowing down searches. -- Tim Starling 02:50 7 Jul 2003 (UTC)
- I'm all for having legitimate spelling variants as in wellie wanging, but I simply fail to see the point of a redirect such as 'AIDS Kills Fags Dead' slogan -- who would expect an article to be under that title? As per our policies, it should either be on AIDS Kills Fags Dead or AIDS Kills Fags Dead (slogan) (and for some reason that eludes me, it is actually at Slogan 'AIDS Kills Fags Dead'). As for changing the search results, it is incorrect that having the redirects listed adds no information, it adds the information that these page names are synonymous -- which may not be obvious in all cases, e.g. I search for "BASIC" and get a result for a specific BASIC interpreter, not under the name which I am familiar with -- turns out the name I know is the less common one, and it redirects to that page, but both have BASIC in the title. So I don't think any algorithm should be used to auto-suppress some or all redirects. What I want is a listing like this:
- Slogan 'AIDS Kills Fags Dead'
- Redirects: AIDS Kills Fags Dead slogan, AIDS Kills Fags Dead ..
- The slogan "AIDS Kills Fags Dead" has been used in the United States to express both opposition to homosexuality and hatred towards homosexual persons.
- In other words, a compact listing of the redirects to the page. Perhaps there should be a "more" link after a certain number, though. If only one or several redirects match (search for "welly") and not the subject page, they should be listed like this:
- Redirect to Wellington boot
- The Wellington boot, also known as a welly, a wellie, or a gumboot, is a type of boot based upon a design worn and popularised by Arthur Wellesley, 1st Duke of Wellington.
- Welly Wanging, Welly wanging, Welly throwing, Welly Throwing
- Redirects to Wellie wanging
- Wellie wanging, or wellie throwing, is a freestyle sport that originated in Great Britain, most likely in the county of Yorkshire. Competitors are required to hurl a Wellington boot as far as possible within boundary lines, from a standing or running start.
- (What kind of silly sport is this anyway?) Anyway, this would be fairly complex to code, so it will take a while until we get something like this. In the meantime, we should avoid excessive redirects. --Eloquence 03:33 7 Jul 2003 (UTC)
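The compact listing proposed above could be approximated by grouping hits under their targets. The sketch below is a guess at one way to do it, not an agreed design: `group_hits`, the `(title, target)` tuple format, the default limit of three names, and the "more..." marker are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def group_hits(hits: List[Tuple[str, Optional[str]]], limit: int = 3) -> List[str]:
    """Group search hits by the article they lead to.

    Each hit is (title, redirect_target); redirect_target is None for an
    ordinary article. Returns display lines in first-seen order.
    """
    groups = {}  # article title -> {"direct": bool, "redirects": [titles]}
    for title, target in hits:
        key = title if target is None else target
        entry = groups.setdefault(key, {"direct": False, "redirects": []})
        if target is None:
            entry["direct"] = True  # the article itself matched
        else:
            entry["redirects"].append(title)

    lines = []
    for key, entry in groups.items():
        names = entry["redirects"]
        shown = ", ".join(names[:limit])
        if len(names) > limit:
            shown += ", more..."  # stand-in for the proposed "more" link
        if entry["direct"]:
            lines.append(key)
            if names:
                lines.append("  Redirects: " + shown)
        else:
            # Only redirects matched: list them, then name the target.
            lines.append(shown)
            lines.append("  Redirects to " + key)
    return lines
```

For example, `group_hits([("Power Basic", None), ("Turbo Basic", "Power Basic")])` gives the article title followed by an indented "Redirects: Turbo Basic" line, so the person searching can see that the two names refer to the same thing.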
Your proposed scheme looks nice, but you still need that "more" link to stop it from getting cluttered. Admittedly, you said "perhaps", but I'm sure I can get you to retract that word. Think of those cartoons full of minor characters that shouldn't get articles of their own. Say there are twenty characters. All of those character names will be redirects to the main article on the cartoon. Do you want the full list showing up when you search for the name of the cartoon? I'm betting that you don't. So, assuming that we agree that not all redirects should be shown in the default search results pages, we agree that we need some way of suppressing some of them, even if we disagree about the details. Once we've agreed that, all the "cluttering" arguments for deleting redirects disappear.
I still maintain that if you search for "X" then listing redirects to a page with "X" in the title gives no extra information. Your claim about synonyms is fallacious. Is "Gnasher" a synonym of "Dennis the Menace"? Is "philisophy" a synonym of "philosophy"? The answer to both questions is "no". All the presence of a redirect tells you is that someone once thought there was some connection between the title of the redirect and the subject matter of the article it redirects to. The article itself should tell you all you need to know about synonyms, anyway. So I stand by my claim that the redirects provide no extra useful information. (And besides, in my proposed scheme if someone wants to see them, they only have to click on the "show all results" link, as with Google.) -- Oliver P. 07:07 7 Jul 2003 (UTC)
"Do you want the full list showing up when you search for the name of the cartoon?" That wouldn't happen, because the redirects would not match the search phrase. When they do match, giving the user the information about the redirects can potentially be useful -- in many cases, like this one, it will only be useless extra information. We simply cannot decide by software whether this is the case, because we don't know if the word the user searched for is uniquely identifying -- if "Turbo Basic" has been renamed to "Power Basic", and one is a redirect to the other, a user searching for "Basic" would need this information to know that the search result is relevant. Suppressing "Turbo Basic" because "Power Basic" already matches and "Turbo Basic" is "just a redirect" would be confusing, because the user would not know that they are the same thing.
In the future, I want an advanced redirect syntax that allows us to categorize different types of redirects. Then we could have spelling redirects, policy redirects, alternate name redirects and so on, and we could suppress some of them using a scheme like the one you described. Without such qualifiers, however, we are bound to lose useful information. --Eloquence 19:01 7 Jul 2003 (UTC)
Should some redirects never show up in the search?
OK, my opinion is: Use #DEPRECATED for a redirect that shouldn't show up on a search. (There was some brief discussion about this on Wikipedia-L recently.) One of the great things about Wikipedia is that we almost never create broken links, and there's no need to change that if #DEPRECATED and #REDIRECT distinguish the two kinds of redirects. -- Toby 03:55 Mar 3, 2003 (UTC)
[I think we should] over time, work on technological methods to lessen the negative impact of deleted redirects via a phase-out strategy. Delete them promptly, but allow them to work for a brief transition period.
- limit indexing by search engines via robots.txt (enumerating deleted redirect web pages is probably on the order of hundreds per month)
- add caching directives to not index/cache the pages once deleted, etc.
- as a precaution, show any pages still using them on a special page (deleted redirects in use)
- stop showing them in searches immediately
- and delete after 4 weeks
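The phase-out steps above might look something like this in outline. It is a sketch under assumptions rather than a description of any existing tool: the hit list is assumed to be a plain mapping from redirect title to the date it was marked deleted, the `/wiki/` URL prefix and the space-to-underscore convention are assumptions about the site's URLs, and `split_hit_list` and `robots_txt` are hypothetical names.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple
from urllib.parse import quote

GRACE = timedelta(weeks=4)  # transition period from the list above


def split_hit_list(deleted_on: Dict[str, date], today: date) -> Tuple[List[str], List[str]]:
    """Split deleted redirects into (still in transition, ready to purge)."""
    in_transition, expired = [], []
    for title, when in sorted(deleted_on.items()):
        if today - when >= GRACE:
            expired.append(title)
        else:
            in_transition.append(title)
    return in_transition, expired


def robots_txt(titles: List[str], prefix: str = "/wiki/") -> str:
    """Emit Disallow rules so well-behaved crawlers skip these pages."""
    lines = ["User-agent: *"]
    for title in titles:
        # Assumed URL convention: spaces become underscores, rest is escaped.
        lines.append("Disallow: " + prefix + quote(title.replace(" ", "_")))
    return "\n".join(lines) + "\n"
```

Redirects still in transition would be listed in robots.txt and hidden from internal search straight away; those in the expired list would be removed for good.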
I don't think the phase out strategy is really needed since I don't think anyone really needs or uses a Joe is a jerk link in the first place (at least not often or as a preferred mode of operation); and while people might follow a "Donnelly" link off of Google, were they really looking for a Colorado college comic strip by Daniel C. Boyer? Daniel Quinlan 07:18, Aug 1, 2003 (UTC)
- If they were searching for, say, google:donnelly+daniel+boyer? Could be. Martin 09:37, 1 Aug 2003 (UTC)
- I would like us to consider whether there may be a way to phase out undesired redirects more gracefully (I believe HTML, HTTP, and robots.txt all have methods for removing stuff from well-behaved caches and search engines; Google seems to be the main concern). That may solve a lot of Martin's concerns with deletion of existing redirects. Daniel Quinlan 01:23, Aug 1, 2003 (UTC)
- What's been proposed before is a #DEPRECATED tag - which would:
- not show up in internal search AT ALL
- not be archived by external search engines (noindex, etc)
- Have a special page of internal links to #deprecated pages so people can easily find such internal links and fix them.
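To make the three points above concrete, here is a rough sketch of how a #DEPRECATED tag could be handled. #DEPRECATED is only a proposal in this discussion, so everything here is hypothetical: the directive syntax is assumed to mirror `#REDIRECT [[Target]]`, and `parse_directive`, `searchable`, `robots_meta`, and `deprecated_links_in_use` are illustrative names, not existing functions in the Wikipedia software.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumed syntax: "#REDIRECT [[Target]]" or "#DEPRECATED [[Target]]".
_DIRECTIVE = re.compile(r"^#(REDIRECT|DEPRECATED)\s*\[\[([^\]|#]+)", re.IGNORECASE)


def parse_directive(wikitext: str) -> Optional[Tuple[str, str]]:
    """Return (kind, target) for a redirect page, or None for an article."""
    m = _DIRECTIVE.match(wikitext.lstrip())
    if m is None:
        return None
    return m.group(1).upper(), m.group(2).strip()


def searchable(wikitext: str) -> bool:
    """Deprecated redirects never appear in internal search."""
    d = parse_directive(wikitext)
    return d is None or d[0] != "DEPRECATED"


def robots_meta(wikitext: str) -> str:
    """HTML header line asking external search engines not to index the page."""
    return "" if searchable(wikitext) else '<meta name="robots" content="noindex">'


def deprecated_links_in_use(pages: Dict[str, str],
                            links: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Data for the special page: deprecated title -> pages that link to it.

    pages maps title -> wikitext; links maps source title -> linked titles.
    """
    deprecated = {t for t, text in pages.items() if not searchable(text)}
    report: Dict[str, List[str]] = {}
    for source, targets in links.items():
        for target in targets:
            if target in deprecated:
                report.setdefault(target, []).append(source)
    return {t: sorted(srcs) for t, srcs in report.items()}
```

The link still works for readers (the target is parsed exactly like a normal redirect), which keeps the "almost never create broken links" property while hiding the page from both kinds of search.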
- Daniel's right: this kind of solution I would find more acceptable - I've even been pondering looking at Wikipedia code and trying to do it myself... Martin 08:29, 1 Aug 2003 (UTC)
- I think we'd really want to automate this, so I would not advocate a #DEPRECATED tag (although if #DEPRECATED tags automatically got deleted after 30 days, I'd be okay with that approach). I think the best solution is one that would not require editing or manual action once a redirect has been deleted. Just put it on the hit list and when the 30 day timer expires, it's gone gone gone. :-) Daniel Quinlan 08:57, Aug 1, 2003 (UTC)
- Why, if they're completely invisible except for avoiding broken links, would a deprecated redirect need to be deleted? What possible purpose would that solve? Martin 13:29, 2 Aug 2003 (UTC)
- They wouldn't be completely invisible: they would still lead to an article, and thus these links have meaning -- they semantically link a name with an article. I think some redirects are inappropriate and lessen the value of Wikipedia. Making them hard to find does not make them more appropriate; it only addresses one of the criteria for deletion (the search engine). Similarly, an easter egg in a program could be completely offensive, but very very hard to find. I can see it now: "Hey, if you type in 'Fascist Pig' on Wikipedia, it goes to 'insert name'." Similarly, "All of my works on Wikipedia redirect to my article because I'm that important." Just a few examples... Daniel Quinlan 20:35, Aug 2, 2003 (UTC)
- Woah!! We were talking about camelcase redirects here. If some camelcase like AlabamA was a deprecated redirect to Alabama, why would it need to be automatically deleted after a month?
- I take your point though: there might be cases where we want to deprecate a link, and then delete it later. I don't think they'd be very common. Martin 21:33, 2 Aug 2003 (UTC)
- Yes, I think we want to delete every deprecated redirect. And please put your "Woah!" aside, I am talking about both camelcase and every other type of deleted redirect. Perhaps we want to keep a log of them so we can notice if someone recreates one of them, but I think we should break certain redirects (well, actually, I think we should eventually break any deprecated link since the goal was to remove them, there's no reason to leave them around forever and some downsides). Daniel Quinlan 22:01, Aug 2, 2003 (UTC)
- Personally, I think that #DEPRECATED redirects would be a good idea. However, I don't think that they should lead directly to a page, as normal redirects do. They should go to a page like "this page has been moved, please update other links as necessary". Oh, and change the text depending on whether it's an internal or external link, e.g. for internal links, "please help by correcting the page that linked here", and for incoming links, "please update your bookmarks..."
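One possible way to pick the wording is to look at the HTTP Referer header of the request. The sketch below assumes that approach; `moved_notice` is a hypothetical helper, the `own_host` default is a placeholder for whatever host the wiki is served from, and a missing Referer is treated the same as an external link (for instance a bookmark).

```python
from typing import Optional
from urllib.parse import urlparse


def moved_notice(target: str, referer: Optional[str],
                 own_host: str = "www.wikipedia.org") -> str:
    """Build the notice shown instead of following a deprecated redirect."""
    base = "This page has been moved to " + target + "."
    host = urlparse(referer).hostname if referer else None
    if host == own_host:
        # Internal link: ask the reader to fix the linking page.
        return base + " Please help by correcting the page that linked here."
    # External link, bookmark, or unknown origin.
    return base + " Please update your bookmarks."
```

Note that the Referer header can be absent or stripped, so the internal wording would only appear when the browser actually reports where the reader came from.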