From Meta, a Wikimedia project coordination wiki


Tool/Process ideas[edit]

  • Determine whether an input question has an existing answer

Open questions[edit]

How to maintain the relevancy of an answer?[edit]


Which existing system should be used to produce candidate answer?[edit]


Can Wiki cite a specific passage of a website?[edit]

Generally useful metadata

Today, citing a specific passage is done using the quote named parameter. For cite web and Citation, this parameter is described as "relevant text quoted from the source; displays last, enclosed in quotes; must include terminating punctuation." So, when using the quote named parameter in these templates, the quoted text is provided in the citation and is presented in the reference, displayed last. The R template also has a quote named parameter. This parameter is described as "a quote from the source. Appears when hovering over the page number, so the page number must be specified." Per the documentation, there could be some complexity as websites are nonpaginated.
For an example of a Wikipedia article using these features of the R template, see: https://en.wikipedia.org/w/index.php?title=California_housing_shortage&oldid=869767410 .
For an example of a text fragments URL, see: https://en.wikipedia.org/w/index.php?title=California_housing_shortage&oldid=869767410#:~:text=Since%20about%201970,new%20residents.)%5B13%5D .
To describe a passage of a document, one uses selectors. Selectors were explored in the Web Annotation Data Model and, more recently, in text fragments. Using text fragments, these wiki citation and reference templates could be extended to produce hyperlinks to webpages such that quoted content would be highlighted and scrolled to for users.
Brainstorming, a selector named parameter could be added to relevant templates. Some of the citation-related templates, then, would resemble:
{{cite web
 | url =
 | title =
 | last =
 | first =
 | date =
 | website =
 | publisher =
 | access-date =
 | quote =
 | selector =
}}

{{Citation
 | url =
 | title =
 | last =
 | first =
 | date =
 | publisher =
 | access-date =
 | quote =
 | selector =
}}

{{R
 | name =
 | quote =
 | selector =
}}
Resultant output hyperlinks for users would, for their URLs, combine the value of the url parameter with the value of the selector parameter, per the text fragment syntax. The url value specified on a cite web or Citation template could be combined with a selector value specified on an R template to create a resultant hyperlink that highlights and scrolls to the quoted content.
For the example, the url would be https://en.wikipedia.org/w/index.php?title=California_housing_shortage&oldid=869767410 and the selector would be Since%20about%201970,new%20residents.)%5B13%5D which combine into https://en.wikipedia.org/w/index.php?title=California_housing_shortage&oldid=869767410#:~:text=Since%20about%201970,new%20residents.)%5B13%5D.
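As a minimal sketch, the combination rule described above is simple string concatenation per the text fragment syntax (the `url` and `selector` names mirror the hypothetical template parameters brainstormed above; the selector value is assumed to be already percent-encoded):

```python
def text_fragment_url(url: str, selector: str) -> str:
    """Combine a citation's |url= value with its |selector= value
    into a text-fragment hyperlink (the #:~:text=... syntax)."""
    return f"{url}#:~:text={selector}"

url = "https://en.wikipedia.org/w/index.php?title=California_housing_shortage&oldid=869767410"
selector = "Since%20about%201970,new%20residents.)%5B13%5D"
print(text_fragment_url(url, selector))  # matches the combined URL shown above
```

A template implementation would presumably also need to handle citations where the selector parameter is absent, in which case the plain url value would be emitted unchanged.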
References and page numbers are also relevant. For paginated resources, users can cite them multiple times in a document while varying the specific pages cited in each instance. It looks sort of like this[1]:1-10 and this[1]:11-12. Users might want to be able to reference a website, a nonpaginated resource, multiple times in a document while varying the quoted passages of its content. Perhaps it could resemble this[1]:quote #1 and this[1]:quote #2. In this case, resembling the R template, users would be able to view relevant content in hoverboxes while hovering over the different parts of the superscripted content. A hyperlink which opens content in a new browser tab and highlights and scrolls to the quoted content could possibly be placed in such a hoverbox.
Providing these features for users would seemingly require a closer look at, and perhaps expanding, the R template, which, according to its documentation, provides this sort of functionality only for paginated resources.
So, citing specific passages is possible with the quote named parameter, one can get hoverboxes with quotes in them using the R template for paginated resources, and, thanks to text fragments, Wiki could soon cite specific passages of websites so as to provide hyperlinks such that quoted content would be highlighted and scrolled to for users. This would require some work on a number of interrelated templates and some consideration of new presentational syntax. AdamSobieski (talk) 19:20, 6 January 2022 (UTC)Reply
Also, I tried editing the example Where are caterpillar rhodinia from to showcase the R template and quote named parameter. It appears that the meta.wikimedia.org platform differs from the Wikipedia platform in terms of installed templates. AdamSobieski (talk) 23:43, 6 January 2022 (UTC)Reply

Could people volunteering questions be a way to identify blind spots in Wikipedia knowledge?[edit]

If someone asks "what is the Kantian interpretation of morality" but no Wikipedia article adequately answers it, knowing that someone cares enough to ask can be motivation to materialize the knowledge.

Are all questions fair game?[edit]

Multiple answers?[edit]

Are those duplicates, or just slight reformulations? Wiki shines with the revision and open discussion process; should there be only one answer that aims to encapsulate the relevant complexity and subtlety that usually gives rise to multiple answers?

Does Wiki need an AI Q&A system?[edit]

Wikimedia doesn't necessarily have the resources to maintain all that is required for a SOTA Q&A system.

  • Closed-book models are not of high enough quality
  • Open-book systems involve many other infrastructural components

What's the essential aspect of AI that we aspire to?

Integration with Wikidata[edit]

There is a natural extension to Wikidata that stands to help the research community. It can take many forms. Discuss!

How to deal with paraphrases?[edit]

Wikidata... multiple pages... duplicates... Discuss!

How should we model questions and question-answering processes?[edit]

Per the current Wikipedia article on questions, questions can be categorized into polar (yes-no) questions, alternative questions, and open questions.

Questions can also be folksonomically categorized into knowledge domains, like Wikipedia articles.

Question-answering processes can be described as being either "single-hop" or "multi-hop". Multi-hop question-answering requires a model to retrieve and integrate information from different parts of a text(s) to answer a question.

Question-answering processes can be described as involving differing cognitive skills. See, for example, Bloom's taxonomy applied to questions.

With respect to types of inferences which occur during reading comprehension:

Distinctions between Different Types of Inferences

Author(s): Distinctions identified
  • McKoon and Ratcliff (1992): Automatic, Strategic
  • Graesser, et al. (1994); Long, et al. (1996): Online, Offline
  • Graesser, et al. (1994): Text-connecting, Knowledge-based or Extratextual
  • Graesser, et al. (1994); Beishuizen, et al. (1999); Gygax, et al. (2004): Local, Global
  • Barnes, et al. (1996); Calvo (2004): Coherence, Elaborative
  • Pressley and Afflerbach (1995): Unconscious, Conscious
  • Singer, et al. (1997): Bridging
  • Cain and Oakhill (1998): Intersentence or text-connecting, Gap-filling
  • Bowyer-Crane and Snowling (2005): Coherence, Elaborative, Knowledge-based, Evaluative
  • Cromley and Azevedo (2007): Anaphoric, Text-to-text, Background-to-text

Inference taxonomies:

Graesser, Singer, Trabasso (1994):
  1. Referential
  2. Case Structure Role Assignment
  3. Antecedent Causal
  4. Superordinate Goal
  5. Thematic
  6. Character Emotion
  7. Causal Consequence
  8. Instantiation Noun Category
  9. Instrument
  10. Subordinate Goal Action
  11. State
  12. Reader's Emotion
  13. Author's Intent

Pressley and Afflerbach (1995):
  1. Referential
  2. Filling in Deleted Information
  3. Inferring Meaning of Words
  4. Inferring Connotations of Words / Sentences
  5. Relating Text to Prior Knowledge (12 types)
  6. Inferences about the Author
  7. Characters or State of World as Depicted
  8. Confirming/Disconfirming Previous Inferences
  9. Drawing Conclusion

Tables from:

Kispal, Anne. Effective Teaching of Inference Skills for Reading: Literature Review. Research Report DCSF-RR031. National Foundation for Educational Research, Slough, UK, 2008. PDF.


Interesting, Keep me in the loop for now. · · · Peter (Southwood) (talk): 18:52, 18 December 2021 (UTC)Reply

It is an interesting project. I have thought in the past about how to automatically generate questions about a specific topic, and I am interested in automatic question answering. From my point of view it is not its own Wikimedia project. I think that, for a beginning, a WikiProject in Wikidata, while keeping updated about the Abstract Wikipedia project, could be a start.--Hogü-456 (talk) 20:25, 7 January 2022 (UTC)Reply
Makes sense; can you help me understand how ideas come to be a Wikimedia project vs. a WikiProject? SebastienDery (talk) 17:13, 23 January 2022 (UTC)Reply

On the mode of answering...[edit]

Hello there, I am quite interested in this ... Habeeb Shopeju. I think we can give room to answer questions in multiple modes, with priority given to yes-no questions such as "Q: Does the sun rise from the East? A: Yes" and fact-based questions like "Q: In what year did Spain win the World Cup? A: 2010". Everything else can provide paraphrased sections of the most relevant Wikipedia articles.

  • I agree it should be a very simple format to let the contributors give the best answer they can. What do you think of this format? where are caterpillar rhodinia from -SebastienDery
    • I think it's a good format. A good fit for paraphrased answers. What purpose does the paraphrase section under the answer section serve? They look like paraphrased questions. - Habeeb Shopeju
      • The idea is to have an easy way to specify what should be redirected to this article. They are indeed paraphrases. -SebastienDery
        • Great then. Any plans on what the next steps are for the project?. - Habeeb Shopeju
          • Gather more interested individuals like yourself! Spread the idea through your social media if you're active there. I'm thinking one thing that might be good to try is to use the existing Wikipedia pages/tooling to generate a few Q&A pages; this will help us feel the pain of a contributor and identify what might be missing / needs to be built -SebastienDery
          • Add your name to People interested :) -SebastienDery

In the worst case, if no article comes close enough based on whatever similarity function is in place, a task could potentially be opened up for volunteers to add suitable content to existing Wikipedia articles, or to create new articles if the topic is significant enough.

Testing Google on the examples provided[edit]

According to the proposal: "Yet where it helps the curious with no particular goal in mind, it suffers in specificity. I currently need to read/search a whole article on fish even if all I really wanted to know was "how did schools of fish evolve?". Or how many nuclear reactors are being built in the world in 2021? Are figs full of dead wasps? Does Canada have a deregulated electricity market? How many foreign students are there in the EU? Will I be aware during general anesthesia? Where does morality come from?"

I googled these questions:

  • "how did schools of fish evolve?": first result is paywalled and the Wikipedia article is too long.
  • "how many nuclear reactors are being built in the world in 2021?": Google's snippet says "About 55 more reactors are under construction in 15 countries, equivalent to approximately 15% of existing capacity." based on the World Nuclear Association
  • "Are figs full of dead wasps?": Google's snippet tells us: "There are no dead wasps in figs." based on this random website
  • "Does Canada have a deregulated electricity market?": The snippet partially answers the question but the paragraph it's taken from fully answers it: "Alberta has a deregulated electricity market where prices are market-based. Ontario has partially restructured its electricity market. In other provinces and territories, electricity prices are set by electricity regulators." The source is a Canadian gov website.
  • "How many foreign students are there in the EU?": The snippet informs us that "There were 1.3 million students from abroad who were undertaking tertiary level studies across the EU-27 in 2018.", quoting the European Commission
  • "Will I be aware during general anesthesia?": The snippet, quoting the UK NHS website says: "During a general anaesthetic, medicines are used to send you to sleep, so you're unaware of surgery and do not move or feel pain while it's carried out."
  • "Where does morality come from?": Full snippet: "Some philosophers argue that morality is not biologically determined but rather comes from cultural traditions or from religious beliefs, because they are thinking about moral codes, the sets of norms that determine which actions are judged to be good and which are evil." The snippet doesn't fully answer the question (is it possible to do so?) but the source itself ("The Proceedings of the National Academy of Sciences (PNAS), a peer reviewed journal of the National Academy of Sciences (NAS), is an authoritative source of high-impact, original research that broadly spans the biological, physical, and social sciences.") is a great introduction to the topic and provides a broader answer to the question.

Results may differ depending on your location, but overall, I think that Google provides great answers to all these questions, and often based on high-quality sources. It's sad that a commercial company is the gateway to knowledge, but could Wikianswers do better than that? Would contributors have an incentive to ask questions and write answers on Wikianswers if they can find answers to their questions so easily on Google? Given the quality of answers on Google, I'm worried that questions on Wikianswers may only be either stupid/irrelevant ("will I die?", "where does this famous soccer player live?") or too complex (and probably without an answer). In the latter case, Wikianswers could probably overlap with the Reference Desk. What do you think? A455bcd9 (talk) 15:44, 21 September 2022 (UTC)Reply