Research:Design Research Support for Semantic Search

From Meta, a Wikimedia project coordination wiki
Duration:  October-2025 – October-2025
This page documents a completed research project.


The WMF Design Research team completed a limited study comparing user perceptions of Wikipedia's "legacy" keyword search vs an early version of a "Semantic Search" prototype that returns snippets of Wikipedia articles (at the section or paragraph level) in response to natural language queries.

Methods

A randomized survey was shown to 52 Wikipedia-reader participants recruited from the Userlytics platform. Participants were asked to rate their preference for Semantic Search or Wikipedia Search results as shown in the then-current version (as of October 2025) of the Semantic Search data prototype. Source queries were copied or inferred from recent Wikipedia reading sessions observed as part of adjacent Reader-focused research. Overall, responses showed a general preference for Semantic Search results over Wikipedia keyword search results, although at least one tested query saw a preference for Wikipedia’s keyword search.

Timeline

This study took approximately two weeks to set up, run, and report.

Results

Participants generally preferred Semantic Search results across the 10 non-typo queries:

  • Semantic Search is more relevant: 8 queries
  • Wikipedia Search is more relevant: 1 query
  • Neither is relevant: 1 query

For the query “general grievous,” 17 of 27 raters preferred Wikipedia keyword search, and only 1 preferred Semantic Search.

Participants preferred Semantic Search for 2 of the 4 typo-containing queries:

  • Semantic Search is more relevant: 2 queries
  • Neither is relevant: 2 queries

Participant ratings matched “the WMF eyeball test” in 7 of 10 non-typo queries and 4 of 4 typo queries.
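
The per-query counts reported above can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch and is not part of the original study: the variable and function names are illustrative assumptions, and the zero for Wikipedia Search among typo queries is inferred from the two reported categories already summing to 4. The source does not say how the remaining raters for “general grievous” responded, so the sketch leaves them uncategorized.

```python
from fractions import Fraction

# Per-query outcome counts as reported in the Results section.
non_typo = {"semantic": 8, "wikipedia": 1, "neither": 1}
# "wikipedia": 0 is inferred (2 + 2 already accounts for all 4 typo queries).
typo = {"semantic": 2, "wikipedia": 0, "neither": 2}

# Queries where participant ratings matched the WMF eyeball test.
eyeball_matches = {"non_typo": 7, "typo": 4}


def total(counts):
    """Number of queries covered by one outcome table."""
    return sum(counts.values())


def share(part, whole):
    """Return the exact fraction and a percentage rounded to one decimal."""
    return Fraction(part, whole), round(100 * part / whole, 1)


all_queries = total(non_typo) + total(typo)
semantic_wins = non_typo["semantic"] + typo["semantic"]
matches = sum(eyeball_matches.values())

semantic_share = share(semantic_wins, all_queries)
match_share = share(matches, all_queries)

# "general grievous": 17 of 27 raters preferred keyword search, 1 preferred
# Semantic Search. The rest are not broken down in the source.
gg_raters, gg_keyword, gg_semantic = 27, 17, 1
gg_unreported = gg_raters - gg_keyword - gg_semantic

print(f"queries tested: {all_queries}")
print(f"Semantic Search rated more relevant: {semantic_wins} ({semantic_share[1]}%)")
print(f"eyeball-test agreement: {matches} ({match_share[1]}%)")
print(f"'general grievous' raters not broken down: {gg_unreported}")
```

Under these assumptions, the tallies give 14 queries in total, with Semantic Search rated more relevant in 10 of them and eyeball-test agreement in 11 of them. These are counts of queries, not of individual raters, so they say nothing about how strong the preference was within any single query.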

There are limited indications (in 3 queries) that higher English reading proficiency may be associated with a preference for Semantic Search in some cases. This should be treated as a pointer for future investigation rather than a conclusive result.

No effects were found among these participants for tech savviness, relationship with Wikipedia, educational level, or other collected demographic variables.


Limitations and directions for further study

This initial user-facing effort established that, among the 14 tested “real-Wikipedia” queries and in the current context of presentation, Semantic Search is generally preferred by raters who visit Wikipedia in their free time. Further work involving larger numbers of respondents and dedicated study materials would be needed, however, to address and control for factors such as:

  • Presentation environment (i.e., what effect would we see if the results “look like Wikipedia”?)
  • Result length (in numbers of words)
  • Text format (sentences, paragraphs, or fragments?)
  • A broader range of query types
  • Presentation order (seeing the results sequentially and unlabeled, rather than side-by-side and labeled)
  • Participant-internal factors such as demographics and topic familiarity
  • The relationship between the “relevance” and “usefulness” of search results
