Talk:OpenAccessReader/PrioritisingSignificance

Judging significance

As an astrophysicist, I usually start judging a paper's significance by whether it appears in one or more review papers. E.g., for solar physics, I might read reviews published in http://solarphysics.livingreviews.org (an open-access review journal), and keep a list of who is cited for a given topic.

Citation count is of course a good first guess, but one must be careful of certain biases: papers that are wrong but controversial can be cited a lot; citation count can depend strongly on which journal the paper was published in; and the benchmark for a 'large' citation count depends heavily on the field of research. A count of 20-50 is pretty good for solar physics, whereas a good materials science or nanoscience paper may have hundreds to thousands of citations.

Another way to judge whether papers are important might be to consider whether they are cited in decadal surveys. Here is one for astrophysics: http://aas.org/resources/decadal-surveys (I'm sure other fields have them as well). --Pohuigin (talk) 23:32, 31 March 2014 (UTC)

Thank you for your thoughts. With regard to disciplinary differences, that's something we'll definitely be thinking about when looking at metrics. Your comments about review papers and decadal surveys raise an interesting point: we absolutely want to include those papers which are shown to be significant by being highly cited over a long period, but I also think it would be best not to leave out the cutting edge of newer research. So perhaps we could build in ways to compensate. To invent a scenario, an article published in 2002 that has been cited 200 times may have a similar level of significance to an article published in 2010 that has been cited 40 times. We'll have to think carefully about any such weightings. - Lawsonstu (talk) 20:04, 1 April 2014 (UTC)
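One simple way to illustrate the kind of age weighting described in the comment above is an average citations-per-year figure. The following is only a minimal sketch: the function name `citations_per_year` and the choice of 2014 as the reference year are assumptions made for illustration, not a metric anyone in this discussion has proposed.

```python
def citations_per_year(citations: int, published: int, as_of: int) -> float:
    """Average citations per year since publication (illustrative metric only)."""
    # Treat articles published in the reference year as one year old
    # to avoid dividing by zero.
    years = max(as_of - published, 1)
    return citations / years


# The invented scenario from the discussion, evaluated for 2014 (assumed).
older = citations_per_year(200, 2002, as_of=2014)
newer = citations_per_year(40, 2010, as_of=2014)
print(f"2002 article: {older:.1f} per year, 2010 article: {newer:.1f} per year")
```

With these inputs the 2002 article averages about 16.7 citations per year and the 2010 article 10, so a plain per-year average narrows the gap without making the two equal. Any real weighting would need more careful thought, as noted above.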

Always notable topics

There are some areas in which every single topic is always notable on the English Wikipedia, and some of these intersect well with scholarly publishing. For instance, all biological species fall in this category, so it would be possible to scan the literature - both old and new - for descriptions of new species (e.g. occurrences of strings like "sp. nov.", "nov. sp.", "sp. et gen. nov." etc. in the respective abstracts or titles) and then check whether Wikipedia entries for these taxonomic units already exist (keeping in mind article naming conventions, e.g. no species-level article for monotypic genera).
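The string scan suggested above could be sketched roughly as follows. This is a hedged illustration, not a tested pipeline: the names `NEW_TAXON_RE` and `mentions_new_taxon` are hypothetical, and the pattern covers only the three markers quoted above, so real abstracts would likely need more variants.

```python
import re

# Matches only the markers quoted in the discussion: "sp. et gen. nov.",
# "sp. nov." and "nov. sp." (case-insensitive, flexible whitespace).
NEW_TAXON_RE = re.compile(
    r"\b(?:sp\.\s*et\s+gen\.\s*nov\.|sp\.\s*nov\.|nov\.\s*sp\.)",
    re.IGNORECASE,
)


def mentions_new_taxon(text: str) -> bool:
    """Return True if a title or abstract contains a new-taxon marker."""
    return NEW_TAXON_RE.search(text) is not None


# Placeholder titles, invented for illustration.
titles = [
    "Aus bus sp. nov., a new beetle from Borneo",
    "A review of solar flare physics",
]
candidates = [t for t in titles if mentions_new_taxon(t)]
```

Titles that pass this filter would still need a human or a nomenclature database to confirm the taxon name before any check against Wikipedia.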

If the Wikipedia article already exists, it should just be checked whether the scholarly article describing the species is properly cited (many Wikipedia articles on new species are started based on reports in the popular media, without citing the scholarly source).

If it does not exist, then notifications should go to the talk pages of articles or WikiProjects on higher taxonomic units. To identify which units these are, one could consult places like ZooBank, IPNI and MycoBank, which assist with that in an increasingly automated fashion. If this does not work, then it becomes relevant whether the source article is actually open (and thus minable). If it is, a script could go into the nomenclature section (which is fairly standardized) and identify the relevant higher taxonomic units (which are often also in the title), for which an entry on the English Wikipedia is more likely to exist already, in order to post the news on the respective talk pages.
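The decision steps above (cite if the article exists, otherwise notify the nearest higher taxon that has an article) can be sketched as a small pure function. Everything named here is hypothetical: `NewTaxonReport`, `Action` and `plan_action` are illustrative, the DOI is a placeholder, and `wiki_pages` stands in for whatever lookup of existing article text a real tool would use.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Action(Enum):
    NOTHING = "nothing"
    ADD_CITATION = "add_citation"
    NOTIFY_HIGHER_TAXON = "notify_higher_taxon"


@dataclass
class NewTaxonReport:
    taxon: str              # name of the newly described taxon
    doi: str                # identifier of the describing article
    higher_taxa: List[str]  # parent taxa, nearest first


def plan_action(report: NewTaxonReport,
                wiki_pages: Dict[str, str]) -> Tuple[Action, Optional[str]]:
    """Decide what to do for one report; wiki_pages maps title -> page text."""
    page_text = wiki_pages.get(report.taxon)
    if page_text is not None:
        # Article exists: only check whether the scholarly source is cited.
        if report.doi.lower() in page_text.lower():
            return Action.NOTHING, report.taxon
        return Action.ADD_CITATION, report.taxon
    # No article yet: notify the nearest higher taxon that has one.
    for parent in report.higher_taxa:
        if parent in wiki_pages:
            return Action.NOTIFY_HIGHER_TAXON, f"Talk:{parent}"
    return Action.NOTHING, None


pages = {"Aus": "Aus is a genus of beetles."}
report = NewTaxonReport("Aus bus", "10.1234/example", ["Aus", "Coleoptera"])
print(plan_action(report, pages))
```

The sketch ignores naming conventions such as monotypic genera, and matching the citation by DOI substring is a simplification.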

The majority of new species and higher taxa are still being described in non-open articles, even though this is changing. Because of that, editors working in the area would likely prefer to have such a notification system for any relevant publication, open or not. Openness would then matter in terms of whether a free-to-read copy is available, so that Wikipedia editors can at least read about the new taxa first hand (rather than through news reports, which are available for only a tiny portion of new taxa and are often very inaccurate), or in terms of whether bits of text or media can be reused in the Wikipedia article.

I would be interested in prototyping such a workflow for new species for the English Wikipedia, as it intersects well with other activities by WikiProject Open Access, as well as with my interest in biodiversity. -- Daniel Mietchen (talk) 12:35, 10 September 2014 (UTC)

Hi EdSaperia, I wonder if you've given any further thought to prototyping a Wikipedian workflow along these lines, or some other? Daniel Mietchen brought this up to me in conversation today. It makes good sense to me to focus on low-hanging fruit to test an initial workflow; perhaps you can come up with something through further discussion together. Cheers, Siko (WMF) (talk) 00:05, 23 October 2014 (UTC)

Hi Siko (WMF), while this is related in terms of broad approach, I don't think our technology, research or methods are at all replicated here, so I don't think it would make sense for us to explore this. However, we've now managed to produce some example output which is quite promising! EdSaperia (talk) 23:31, 3 November 2014 (UTC)

Possibly relevant papers

Scientific impact evaluation and the effect of self-citations: mitigating the bias by discounting h-index: http://arxiv.org/pdf/1202.3119.pdf