From Meta, a Wikimedia project coordination wiki

A few questions

I have a bit of difficulty in understanding the Wikifact proposal.

  1. For instance, what is a fact? The example "General Min Aung Hlaing's speech on TV Feb 8th 2021" is not a normal statement. A statement would be "General Min Aung Hlaing spoke on TV 8 February 2021".
  2. If the so-called "fact" is the transcribed talk called "General Min Aung Hlaing's speech on TV Feb 8th 2021", then there is already Wikisource. Note, however, the possible copyright problem with such content. If the video source and the speech are compatible with CC BY-SA, then the video file can be uploaded to Wikimedia Commons and the speech can be transcribed on Wikisource.
  3. I wonder whether what is meant by the Wikifact proposal is some kind of Wikisource with annotation capability, where fragments of a source are called into question and flagged for fact checking?
  4. I wonder what we can do if the source is not compatible with CC BY-SA? Can we quote fragments?
  5. The English Wikipedia and a number of other Wikipedias have Template:Citation needed span. Is there anyone using that template and the listings of it?

Finn Årup Nielsen (fnielsen) (talk) 14:32, 9 February 2021 (UTC)

Thank you.
  1. As interesting, offers some in-depth discussion of what facts are.
  2. I added Wikisource to the list of wiki sites which could benefit from a Wikifact project.
  3. That annotation capability is interesting to consider for Wikinews, Wikipedia, and Wikisource. Also described in the technical discussion topics is an interdependency between articles and facts, and that editors would desire to be alerted or notified when facts upon which their articles depend change, are annotated, or happen to unfold.
  4. That's a good question. I don't know. In the United States, there is Fair Use.
  5. That template seems proximate to what I broach for discussion with {{fact|User content goes here.}}, except that developers could implement needed functionality with a new template span, such as {{fact}}, which could create and synchronize with Wikifact content.
AdamSobieski (talk) 16:12, 9 February 2021 (UTC)
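To make the `{{fact|...}}` idea a little more concrete, here is a minimal sketch of how a tool might pull such spans out of wikitext before synchronizing them with Wikifact. The function name `extract_fact_spans` is hypothetical, and the regular expression is an assumption that only handles simple, non-nested spans; a real implementation would need a proper wikitext parser.

```python
import re

# Assumption: a fact span looks like {{fact|some text}} and contains no nested templates.
FACT_SPAN = re.compile(r"\{\{\s*fact\s*\|(.*?)\}\}", re.DOTALL | re.IGNORECASE)


def extract_fact_spans(wikitext):
    """Return the text of every simple {{fact|...}} span, in document order."""
    return [match.strip() for match in FACT_SPAN.findall(wikitext)]


sample = (
    "The general spoke on TV. "
    "{{fact|The speech aired on 8 February 2021.}} "
    "More text {{fact|A second claim.}}"
)
for span in extract_fact_spans(sample):
    print(span)
```

Each extracted span could then become, or be matched against, an entry in Wikifact; how that synchronization would work is left open here.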
Good questions, thank you.
  1. I see that the point is not that "General Min Aung Hlaing spoke" (that is a fact we know), but what he said in his speech. The aim is to check the facts of his speech, that is, the claims he made.
--Teemu (talk) 20:54, 9 February 2021 (UTC)
Can I suggest an overview in response to your detailed questions?
Fact-check articles are written from a different angle to encyclopaedia entries.
Instead of focusing on the facts of a subject, fact-check articles focus on disagreement about the facts?
Yes, there are aspects of this approach in existing Wikipedia sections, such as "criticism", "controversy" and even "scandal" - much more so in French equivalents than English, as a comparison.
But to conclude?
Fact-check articles generally start/finish with a finding akin to True, False, or Mixed.
Like you, I'm not sure how that fits in exactly with existing resources, or even encyclopaedic formats, but I still like the idea of a Wikifact resource. This might lead to, for example, existing disagreements over facts on wikipedia talk pages being productively collated and curated in an institutional sense, compared with external references, as a kind of background, behind-the-scenes explainer to existing wikipedia pages.
Doesn't really answer your questions, sorry, but just some thoughts as a now (sigh) 40-year journalist, and long-time pedia user and lover. Jasonbrown1965 (talk) 03:13, 11 June 2022 (UTC)

Thanks for the answers. Perhaps the proposal would benefit from a few concrete examples? --Finn Årup Nielsen (fnielsen) (talk) 10:00, 10 February 2021 (UTC)

Seems unrelated to Wikinews

The proposal keeps talking about Wikinews —I'll suppose we're talking about en.wn— but I see no sign in the proposal of knowing how en.wn works. En.wn takes great pains to meet high standards of accuracy and neutrality before publishing articles, making it a source of reliable, neutral information, which other projects are welcome to make use of. En.wn does not use Wikipedia as a source, and does not use Wikidata as a source; afaics the proposed Wikifact would also use Wikipedia-style collaboration and so en.wn wouldn't use it as a source either.

The differences between Wikipedia-style collaboration and Wikinews (well, en.wn) -style collaboration are profound, and also relate to the whole matter of fact-checking; I'm happy to discuss those things, of course (subject to my own time constraints, obviously), as they may be highly relevant to the concept behind this proposal; but my immediate point here is that incautious assumptions are apparently being made in this proposal about how Wikinews works. --Pi zero (talk) 19:37, 10 February 2021 (UTC)

Thank you. I recently reviewed a resource about article stages and visited the Wikinews Newsroom. Hopefully, a Wikifact project could support Wikinews editors during the {{develop}} and {{review}} article stages, before the {{publish}} article stage. I would be interested in learning more towards designing Wikifact-style collaboration so that the project could be used by and of use to Wikinews editors.
AdamSobieski (talk) 22:17, 10 February 2021 (UTC)
Some aspects of en.wn, related to sourcing and collaboration. I don't know of any treatment of en.wn that's quite right for this context, so I'm composing a new one. (Sorry it runs on a bit; if I knew how to capture all the key points more succinctly, I would. For a pretty good general overview of en.wn, though not what's needed here, I'd suggest WN:PILLARS.)
  • A defining impulse of news is that it must not be gotten wrong: thou shalt not publish a mistake. This attitude contrasts radically with the basic Wikipedian impulse to "be bold", i.e., publish first and trust that somebody will fix it later if it's wrong. The Wikipedian model of collaboration is (if I may say) elegant in its simplicity, and well-suited to the "be bold" attitude, but not applicable to news: modulo weeding out editors who don't share the basic aspirations of the project (especially, accuracy and neutrality), let anybody edit it as they see fit, and over time it gravitates toward those aspired ideals. After any finite time, a given article may have faults, but statistically these will tend to settle down as the number of eyeballs on the given article increases without bound; faster for some articles than for others, and gravitating closer to the ideal for some articles than for others, but it mostly works pretty well — when certain premises are met. The topic should be one where temporary flaws in an article are okay, and the process of improvement can be unbounded. Neither of these applies to en.wn, though: part of the definition of news, as mentioned, is that flaws are not okay; and, news has to be new (we have a saying on en.wn: "facts don't cease to be facts, but news ceases to be news"; put another way, a news article is a snapshot in time, showing what a news event looked like at the time, and the en.wn archives are a vast photo album of such snapshots, preserved for posterity). Moreover, from the deadline of news it follows that the much-more-rapid convergence on a very-high-quality product is going to have to be achievable with far fewer eyeballs per article. Evidently something about the Wikipedia-style collaboration model has to change for news.
  • The basic en.wn collaboration model on a news article is asymmetric, i.e., it's not multiple people all doing the same thing to the article; instead it's usually just two people, playing different roles. (It may sound crazy, from a Wikipedian mindset, to expect good results from only two sets of eyeballs, but there's a trick to why it works, which I'll get to in a moment.) The first contributor on the article is the reporter, who is expected to write the article, and thoroughly document its sources; the second contributor on the article is the reviewer, one of a small set of users authorized by the community to perform this task. The reviewer, who must be independent of the writing of this article (i.e., mustn't be a coauthor), gives the article a really thorough vetting against all our policies and guidelines, makes some light copyedits (without getting too "involved"), and either publishes the article or finds it not-ready and writes up comments on what's wrong and, when applicable, what can be done to fix it; if it's not-ready, the reporter can revise and resubmit, etc., but this can't go 'round too many times because there's a deadline. After eleven years or so under this system, I can say it works amazingly well and has some serious challenges in it that we're still working on.
  • And why does this work, with just two sets of eyeballs? (Keeping in mind, the logistics of guaranteeing even two sets of eyeballs on every single article before publication, with the second set an expert in project policies guidelines and practices who has to provide a massive block of effort on somebody else's schedule, is one of those serious challenges mentioned.) The key to it, I've gradually concluded, is that the reporter and the reviewer are working from the same playbook. It can be massively harder for a reviewer to review an article by a newcomer who hasn't learned the ropes yet, and ultimately the project could never sustain that as the dominant form of collaboration in the long term: all such reviews of newcomer submissions must be investments in helping the newcomer get up to speed and become a veteran. With a veteran Wikinewsie as a reporter, the article is submitted already mostly meeting all the project standards, and with statistically only a few flubs; the reviewer is also vetting for those same project standards, and since there are likely to be very few flubs for the reviewer to find, the odds of both collaborators missing something get to be very low indeed. If the reporter and reviewer were working at cross-purposes, in an adversarial relationship, this very-low probability of error would not hold, which is why a harmonious collaboration is crucial to making it all work (though this can't be bought at the price of lowering standards, which would defeat the whole purpose).
  • Thorough source material has to be provided (which is true both when the sources are all news published elsewhere, and when some of the sourcing is original). Really, review would be impossible without this: review is already a massive task, without asking the reviewer to go on an open-ended hunt for verification. It's easiest when everything comes from reasonably few trustworthy news sources (though that has to be watched carefully for potential bias or inaccuracy). Wikipedia, notably, cannot be used as a source; it's inherently not even stable, after all, and moreover information in a Wikipedia article either is not itself based on a trustworthy source and oughtn't be used, or is based on a trustworthy source and the trustworthy source ought to be used directly.
That's some perspective to be getting on with, anyway. --Pi zero (talk) 07:05, 11 February 2021 (UTC)

The importance of RSS Feeds for the Project

It would be interesting to use RSS feeds to feed Wikifact. RSS feeds include the metadata of news items and can consequently be useful for fact checking. Applying Wikidata-driven semantic similarity measures and word embeddings to the titles of news items could help ensure that each fact has a unique entry.

For this purpose, I propose the following useful readings:

I also suggest watching my WikiSpore day presentation about the topic. It is available at starting at 1:26:36.

Yours Sincerely, --Csisc (talk) 14:08, 22 March 2021 (UTC)
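As a rough illustration of the deduplication idea above, the sketch below reads item titles from an RSS document and groups near-duplicate titles so that each group could map to one fact entry. It deliberately swaps in plain token-overlap (Jaccard) similarity for the Wikidata-driven semantic similarity and word embeddings mentioned in the comment, because those would need external data and models. The sample feed, the function names and the 0.6 threshold are all assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real tool would fetch RSS from news sites.
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>Example feed</title>
<item><title>General speaks on state TV after coup</title></item>
<item><title>After coup, general speaks on state TV</title></item>
<item><title>New knowledge graph for news events released</title></item>
</channel></rss>"""


def item_titles(rss_text):
    """Extract the <title> of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [item.findtext("title", default="") for item in root.iter("item")]


def tokens(title):
    """Lower-cased set of word tokens in a title."""
    return set(re.findall(r"\w+", title.lower()))


def jaccard(a, b):
    """Token-overlap similarity; a stand-in for a semantic similarity measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_titles(titles, threshold=0.6):
    """Greedily group titles whose similarity to a group's first title meets the threshold."""
    groups = []
    for title in titles:
        for group in groups:
            if jaccard(tokens(title), tokens(group[0])) >= threshold:
                group.append(title)
                break
        else:
            groups.append([title])
    return groups


groups = group_titles(item_titles(SAMPLE_RSS))
for group in groups:
    print(group)
```

Token overlap only catches reworded titles that share vocabulary; titles that describe the same event with different words would need the semantic measures proposed above.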

CLEF Projects for Fact-Checking

It would be worthwhile to draw on the outcomes of the CLEF shared tasks for fact-checking. The relevant projects are Touché and CheckThat!. For this purpose, I propose to read these interesting papers:

I also propose to check the research papers of the CLEF sessions of ECIR 2021 when available online. These CLEF papers provide an up-to-date overview of the progress of the two projects.

Yours Sincerely, --Csisc (talk) 11:48, 30 March 2021 (UTC)

Open Event Knowledge Graph

I suggest looking at the project for the creation of the Open Event Knowledge Graph, a structured semantic database for news. The paper is available at --Csisc (talk) 11:55, 30 March 2021 (UTC)

Text Fragments

It would be interesting to use text fragments to annotate facts in HTML pages. Further information is available at --Csisc (talk) 12:04, 30 March 2021 (UTC)
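For readers unfamiliar with text fragments: a link can point at a specific passage of a page by appending a `#:~:text=` directive to the URL. The sketch below builds such a link for a quoted claim. The helper name `text_fragment_url` and the example URL are hypothetical, and the sketch covers only the basic `textStart[,textEnd]` form, not the optional prefix and suffix parts.

```python
from urllib.parse import quote


def _encode(text):
    """Percent-encode a text directive term; dashes must be encoded as well."""
    return quote(text, safe="").replace("-", "%2D")


def text_fragment_url(page_url, text_start, text_end=None):
    """Build a URL that points at a passage via a #:~:text= directive.

    Any fragment already present in page_url is dropped (a simplification).
    """
    directive = _encode(text_start)
    if text_end is not None:
        directive += "," + _encode(text_end)
    base = page_url.split("#", 1)[0]
    return f"{base}#:~:text={directive}"


print(text_fragment_url("https://example.org/article", "spoke on TV"))
```

A fact entry could store such a link next to the claim so that reviewers land directly on the quoted passage in supporting browsers.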

$1.5 trillion test case | an interim approach

. . .

concept : wikifact test case : clearstream interim approach : internal

Wikifact needs an early test case.

One case might be the $1.5 trillion US "false assets" case that disappeared from a Wikipedia page about a bank for banks, Clearstream.

Fact-checking that disappearance, including an editor IP that led to Clearstream's own ISP, might help establish Wikifact as an independent institution separate from Wikipedia.

Start here.

. . .

Automatic Fact Checking using Wikipedia

Hello, you might be interested in this project. This tool is a working prototype for automatic fact checking using Wikipedia as ground truth. Currently the system works only in English, but with enough training data it could be expanded to more than 100 languages. Diego (WMF) (talk) 22:12, 18 August 2022 (UTC)