Community Wishlist Survey 2023/Larger suggestions/Allow for searching for the start and end of a string
This proposal is a larger suggestion that is out of scope for the Community Tech team. Participants are welcome to vote on it, but please note that regardless of popularity, there is no guarantee this proposal will be implemented. Supporting the idea helps communicate its urgency to the broader movement.
Allow for searching for the start and end of a string
- Problem: If one searches Wikipedia with a regex term such as intitle:/[a-z]{5} [a-z]{6}/, they get junk like Topographic prominence, because the pattern can match anywhere inside a title rather than the whole title.
- Proposed solution: Adding a start and an end function to this term would remove the junk.
- Who would benefit: People who want to search pages by their enumeration. This is especially true with Wiktionary. Also, this would help reduce load on the Wikimedia servers, as they would only need to search pages that start or end with the start/end query.
- More comments: You can use the standard regex start and end functions, ^ and $.
- Phabricator tickets: T317599
- Proposer: CitationsFreak (talk) 21:33, 28 January 2023 (UTC)
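The difference the proposal describes can be illustrated with a minimal sketch. It uses Python's `re` module only as a stand-in for anchored and unanchored matching; it is not the engine behind Wikimedia search, and the sample titles other than Topographic prominence are made up for illustration.

```python
import re

# Pattern from the proposal: five lowercase letters, a space, six lowercase letters.
PATTERN = r"[a-z]{5} [a-z]{6}"

# Sample titles; only the first comes from the proposal, the rest are hypothetical.
titles = ["Topographic prominence", "apple banana", "red apple"]

# Unanchored: the pattern may match anywhere inside the title.
unanchored = [t for t in titles if re.search(PATTERN, t)]

# Anchored: the whole title must match, as if the pattern were wrapped in ^ and $.
anchored = [t for t in titles if re.fullmatch(PATTERN, t)]

print(unanchored)
print(anchored)
```

"Topographic prominence" passes the unanchored test because the substring "aphic promin" fits the pattern, which is the kind of junk the proposal wants to exclude.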
Discussion
- I think this is impossible, as the search engine technology does not support it. See https://www.elastic.co/guide/en/elasticsearch/reference/current/regexp-syntax.html#regexp-unsupported-operators But I'm sure DCausse (WMF) has the exact details. Additionally, there are almost always better ways to do a search like this, probably via Quarry. —TheDJ (talk • contribs) 23:31, 29 January 2023 (UTC)
- Indeed, the underlying regular expression engine we use does not support such syntax. Adding support for this (while not completely impossible and without going into the details) would require a non-negligible effort to do properly (adapt/rewrite the regex parser, possibly use reserved Unicode characters as start/end markers in the index, ...). DCausse (WMF) (talk) 10:55, 30 January 2023 (UTC)
- Doesn't this just require converting /^<regex>/ into /<regex>/ AND NOT /.(<regex>)/? Tgr (talk) 01:40, 1 February 2023 (UTC)
  - Indeed! I hadn't thought about this possibility, but it does seem to be a valid workaround: to search for pages that end with test, one can search for intitle:/test/ -intitle:/test./ (see it in action on Wiktionary). @CitationsFreak: would documenting this workaround be good enough? DCausse (WMF) (talk) 14:59, 3 February 2023 (UTC)
    - Actually, after discussing with my colleagues, this workaround does not quite work because it will ignore words that have the suffix you want elsewhere in the word, e.g. intitle:/ed$/ is not equivalent to intitle:/ed/ -intitle:/ed./ as the latter will exclude educated, which the former should match. DCausse (WMF) (talk) 18:19, 8 February 2023 (UTC)
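The exchange above can be checked with a small sketch. It again uses Python's `re` module as a stand-in, not the regex engine used by the search backend. The helper names and the two private-use sentinel characters are hypothetical; the last function only loosely illustrates the "reserved characters as start/end markers in the index" idea and makes no claim about how it would actually be implemented.

```python
import re

def ends_with(title: str, suffix_regex: str) -> bool:
    """Reference behaviour: what /<regex>$/ would mean if anchors were supported."""
    return re.search(rf"(?:{suffix_regex})\Z", title) is not None

def workaround(title: str, suffix_regex: str) -> bool:
    """The suggested rewrite: /<regex>/ AND NOT /<regex>./ (no anchors needed)."""
    has_match = re.search(suffix_regex, title) is not None
    match_followed_by_char = re.search(rf"(?:{suffix_regex}).", title) is not None
    return has_match and not match_followed_by_char

# Hypothetical sentinels from the Unicode private-use area, assumed never to
# appear in real titles.
START, END = "\ue000", "\ue001"

def ends_with_markers(title: str, suffix_regex: str) -> bool:
    """Emulate an end anchor by matching against a title wrapped in sentinels."""
    indexed = f"{START}{title}{END}"  # what a marker-aware index might store
    return re.search(rf"(?:{suffix_regex}){END}", indexed) is not None

for title in ["test", "contest", "testing", "educated"]:
    suffix = "ed" if title == "educated" else "test"
    print(title, ends_with(title, suffix), workaround(title, suffix),
          ends_with_markers(title, suffix))
```

For "educated" with the suffix "ed", the workaround rejects the title because the leading "ed" is followed by another character, even though the title does end in "ed". The sentinel variant agrees with the anchored reference in that case.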
Voting
Support if possible, but it would probably mean introducing proper regexp support (e.g. PCRE) which would have many other major benefits. Certes (talk) 21:47, 10 February 2023 (UTC)
Support * Pppery * it has begun 04:05, 11 February 2023 (UTC)
Support Arado Ar 196 (talk) 08:30, 11 February 2023 (UTC)
Support NguoiDungKhongDinhDanh 09:45, 11 February 2023 (UTC)
Support //Lollipoplollipoplollipop::talk 10:09, 11 February 2023 (UTC)
Support Nw520 (talk) 12:59, 11 February 2023 (UTC)
Support Wotheina (talk) 16:33, 11 February 2023 (UTC)
Support Novak Watchmen (talk) 17:54, 11 February 2023 (UTC)
Support --NGC 54 (talk|contribs) 01:36, 12 February 2023 (UTC)
Support Izno (talk) 08:00, 13 February 2023 (UTC)
Support Libcub (talk) 07:13, 14 February 2023 (UTC)
Support —Locke Cole • t • c 08:57, 15 February 2023 (UTC)
Support ZandDev (talk) 09:59, 15 February 2023 (UTC)
Support Aishik Rehman (talk) 09:06, 16 February 2023 (UTC)
Support ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 17:31, 16 February 2023 (UTC)
Support —(ping on reply)—CX Zoom (A/अ/অ) (let's talk|contribs) 07:55, 18 February 2023 (UTC)
Support Christian (talk) 20:35, 18 February 2023 (UTC)
Support PCRE-support would be great, but it’s not for the average Joe. Kays (talk) 04:01, 19 February 2023 (UTC)
Support β16 - (talk) 14:32, 20 February 2023 (UTC)
Support cyrfaw (talk) 18:22, 21 February 2023 (UTC)
Support Thingofme (talk) 02:08, 23 February 2023 (UTC)
Support GoingBatty (talk) 03:54, 23 February 2023 (UTC)