WMDE Technical Wishes/AdvancedSearch/Functional scope

The AdvancedSearch extension aims to make existing search options more visible and accessible for everyone. It focuses on the most common search options to avoid overloading the interface, and it aims to use labels that are precise and easy to understand.

Here’s an overview of the parameters the extension already offers, plans to add, or cannot offer. Please note that any parameter that is not integrated in the extension can still be entered manually in the search field at the top. The list does not cover every existing keyword, only the keywords we have discussed or looked into so far.

As always, feedback is very welcome!

Included use cases:

| Use case | Search field | Implementation | Description and remarks |
|---|---|---|---|
| full text search | These words | default search | Searches for words in the title or text, without any restriction. |
| exact search | Exactly this text | "" | Searches for a specific sequence of characters. Punctuation marks are ignored. |
| excluding words | Not these words | - | Excludes pages containing these words from the search. |
| one of the given words | One of these words | OR | Searches for pages containing at least one of the given terms. |
| search in page title | Page title contains | intitle: | Searches for pages with titles that contain this word. |
| search in subpages | Subpages of this page | subpageof: | Searches for all subpages of a page. The results are unordered, i.e. different hierarchy levels are mixed together in one list. This keyword works in combination with all namespaces (unlike prefix: search). |
| search in categories and subcategories | Pages in these categories | deepcategory: | Searches for pages that are in these categories and their subcategories. |
| search for pages with certain templates | Pages with these templates | hastemplate: | Searches for pages that contain exactly these templates. |
| searching for files of a particular type | File type | filetype: filemime: | Searches only for files of the selected type (e.g. jpeg) or of a predefined bundle (e.g. image). Allows specification of width and height for images/videos. |
| sorting order of results | Sorting order | This feature has no keyword that you can type in the search input field, since the function is specified via a parameter in the URL. | Changes the sorting order of results. By default, the most relevant pages are placed at the top of the results. |
| search for content in a specific language | | inlanguage: | Searches for content in a specific language. This option is only visible on wikis that have the Translate extension installed. |
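
To illustrate how the form fields above map onto keywords, here is a minimal Python sketch that assembles a query string from field values. It is not the extension's actual code: the function names, argument names, the quoting rule for values with spaces, and the example.org base URL are all hypothetical. The URL parameter name `sort` and the value `last_edit_desc` are assumptions as well, since the table only says that sorting is controlled by a URL parameter.

```python
from urllib.parse import urlencode


def quote_if_needed(value: str) -> str:
    """Wrap a value in double quotes when it contains whitespace (assumed rule)."""
    return f'"{value}"' if any(ch.isspace() for ch in value) else value


def build_query(
    these_words=(),
    exact_text="",
    not_these_words=(),
    one_of_these_words=(),
    title_contains=(),
    subpage_of="",
    deep_categories=(),
    templates=(),
    file_type="",
    language="",
) -> str:
    """Combine field values into one search string, one keyword per field."""
    parts = list(these_words)                          # default search
    if exact_text:
        parts.append(f'"{exact_text}"')                # exact search
    parts += [f"-{w}" for w in not_these_words]        # excluded words
    if one_of_these_words:
        parts.append(" OR ".join(one_of_these_words))  # at least one term
    parts += [f"intitle:{quote_if_needed(w)}" for w in title_contains]
    if subpage_of:
        parts.append(f"subpageof:{quote_if_needed(subpage_of)}")
    parts += [f"deepcategory:{quote_if_needed(c)}" for c in deep_categories]
    parts += [f"hastemplate:{quote_if_needed(t)}" for t in templates]
    if file_type:
        parts.append(f"filetype:{file_type}")
    if language:
        parts.append(f"inlanguage:{language}")
    return " ".join(parts)


def build_search_url(base_url: str, query: str, sort: str = "") -> str:
    """Append the query (and an optional, assumed `sort` parameter) to a URL."""
    params = {"search": query}
    if sort:
        params["sort"] = sort  # assumption: parameter name and values may differ
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    query = build_query(
        these_words=["cat"],
        exact_text="black cat",
        not_these_words=["dog"],
        deep_categories=["Domestic cats"],
    )
    print(query)
    print(build_search_url("https://example.org/w/index.php", query, sort="last_edit_desc"))
```

Because sorting has no keyword, it is kept out of `build_query` and only appears in the URL, which mirrors the split described in the table.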

Excluded use cases:

| Use case | Implementation | Description and remarks |
|---|---|---|
| search in categories | incategory: | Searches in categories, but not in subcategories. This leads to very confusing search results. |
| searching for subpages or in page titles via prefix | prefix: | prefix: is not integrated in the extension because of its side effects: the value of this keyword always includes a namespace, e.g. “Wikipedia” in prefix:Wikipedia:Technische Wünsche. Having namespaces both here and in the namespace selection bar at the bottom (“Search In”) would create a very confusing user interface. |
| prefer pages with certain templates | boost-templates: | Ranks images/articles with certain templates higher in the search results. Was not implemented because this isn’t a commonly used keyword. |
| prefer pages that were recently edited | prefer-recent: | Ranks recently edited pages higher in the search results. Was not implemented because this isn’t a commonly used keyword. |
| searching for links to a specific page | linksto: | Searches for pages which link to a specific page. Was not implemented because this isn’t a commonly used keyword. |
| searching in the source code | insource: | Searches in the source code of a page, e.g. to find markup. Searching in source code usually includes characters such as `~@#&*()-+{}[]\|\<>?.\`. In order for the search to work, these characters must be entered as regular expressions. However, searches with regular expressions have serious performance problems. They create a heavy load on the search backend and often don’t deliver any results (i.e. they time out). Because of these performance problems insource: was not integrated. We did not consider it reasonable to integrate insource: without regular expressions, because this would limit the applicability of the source code search severely. |
| searching for approximate matches (fuzzy search) | ~ | Searches for approximate matches. This was not integrated because fuzzy search is a complex concept and there is no short, easily understandable label that can describe the concept in the Advanced Search interface. |
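
Since the excluded keywords can still be typed into the search field manually, the following Python sketch shows what such hand-written queries might look like, including a source code search with special characters. The helper names are hypothetical, and the escaping rule is an assumption: it backslash-escapes the characters listed for insource: (plus `/` and `"`) and wraps the result in `insource:/.../`, the regular expression form of the keyword. Check the wiki's search help before relying on it.

```python
# Characters from the insource: remarks, plus "/" (regex delimiter) and '"'.
# Assumption: a backslash is enough to make each of them match literally.
SPECIAL_CHARS = set('~@#&*()-+{}[]|\\<>?./"')


def escape_for_insource(text: str) -> str:
    """Backslash-escape every special character in the text."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


def insource_regex(text: str) -> str:
    """Build an insource: regular expression query that matches text literally."""
    return f"insource:/{escape_for_insource(text)}/"


# Illustrative hand-typed queries for other excluded keywords (values are made up).
MANUAL_EXAMPLES = {
    "search in categories": "incategory:Cats",
    "prefix search": "prefix:Wikipedia:Technische Wünsche",
    "links to a page": 'linksto:"Main Page"',
    "fuzzy search": "wikipedai~",
}

if __name__ == "__main__":
    print(insource_regex("{{Infobox"))
    for use_case, query in MANUAL_EXAMPLES.items():
        print(f"{use_case}: {query}")
```

Regular expression searches are the expensive case described above, so combining them with other terms that narrow the result set first is usually advisable.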