Research talk:AfC processes and productivity

From Meta, a Wikimedia project coordination wiki

Work log


Wednesday, April 23rd

Updated the analysis to exclude process (AfC status change) edits.

Revisions by others. The geometric mean number of revisions per week (excluding creating user and AfC status change edits) is plotted for AfC drafts and Direct to Main articles that were deleted quickly or not.
Bytes changed by others. The geometric mean bytes changed per week (excluding creating user and AfC status change edits) is plotted for AfC drafts and Direct to Main articles that were deleted quickly or not.
Unique non-creating users. The geometric mean number of registered editors per week (excluding creating user and AfC status change edits) is plotted for AfC drafts and Direct to Main articles that were deleted quickly or not.
Unique non-creating anons. The geometric mean number of anonymous editors per week (excluding creating user and AfC status change edits) is plotted for AfC drafts and Direct to Main articles that were deleted quickly or not.
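
For readers who want to reproduce a metric like "bytes changed by others", here is a minimal sketch in Python. It is not the code used for the plots. The record layout (`page_id`, `user`, `week`, `bytes_changed`, `afc_status_change`) is hypothetical, and the +1 shift inside the geometric mean is an assumption made so that pages with no activity in a week can be included.

```python
import math
from collections import defaultdict


def geometric_mean(values):
    """Geometric mean with a +1 shift (an assumption) so zeros are allowed."""
    values = list(values)
    if not values:
        return 0.0
    mean_log = sum(math.log1p(v) for v in values) / len(values)
    return math.expm1(mean_log)


def bytes_by_others_per_week(pages, revisions, weeks=4):
    """Return {week: geometric mean of bytes changed by others per page}.

    pages:     {page_id: creator_name}
    revisions: iterable of dicts with the hypothetical keys
               page_id, user, week, bytes_changed, afc_status_change
    """
    totals = defaultdict(int)  # (page_id, week) -> bytes changed by others
    for rev in revisions:
        if rev["user"] == pages[rev["page_id"]]:
            continue  # skip edits by the creating user
        if rev["afc_status_change"]:
            continue  # skip process (AfC status change) edits
        totals[(rev["page_id"], rev["week"])] += abs(rev["bytes_changed"])
    # Pages with no qualifying edits in a week count as zero for that week.
    return {
        week: geometric_mean(totals.get((page_id, week), 0) for page_id in pages)
        for week in range(1, weeks + 1)
    }
```

The same loop could count revisions or distinct editors instead of summing bytes. Only the aggregation inside `totals` would change.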

It looks like the story is generally the same. --EpochFail (talk) 14:07, 23 April 2014 (UTC)


Surviving articles per new page creator. The number of surviving (at least 30 days) articles created per newcomer page creator is plotted over time with loess smoothing. The trend is split on June 15th, 2011, when newcomer page creators began to be funneled toward AfC.
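
As a rough illustration of the quantity in this plot, the sketch below computes surviving articles per page creator for each month, with months split at June 15th, 2011. It assumes the input is already restricted to newcomer page creators, uses a hypothetical tuple layout, and leaves out the loess smoothing step entirely.

```python
from collections import defaultdict
from datetime import date, timedelta

AFC_FUNNEL_START = date(2011, 6, 15)  # split date taken from the caption


def surviving_per_creator(articles, min_days=30):
    """Return {(year, month, period): surviving articles / distinct creators}.

    articles: iterable of (creator, created, deleted_or_None) tuples
              (hypothetical layout; dates are datetime.date objects)
    """
    surviving = defaultdict(int)
    creators = defaultdict(set)
    for creator, created, deleted in articles:
        period = "before" if created < AFC_FUNNEL_START else "after"
        key = (created.year, created.month, period)
        creators[key].add(creator)
        # An article survives if it was never deleted or lasted min_days.
        if deleted is None or deleted - created >= timedelta(days=min_days):
            surviving[key] += 1
    return {key: surviving[key] / len(creators[key]) for key in creators}
```

Fitting a separate smoother to the "before" and "after" keys would give the two trend lines.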

I tried limiting stats generation to only those articles that survived long enough (e.g. week 2 == survived at least one week).
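
One way to read the rule "week 2 == survived at least one week" is sketched below: a page contributes to week n only if it was still live after n - 1 full weeks. The function names and the `(created, deleted)` layout are hypothetical, and the original analysis may have drawn the boundary differently.

```python
from datetime import datetime, timedelta


def survived_until_week(created, deleted, week):
    """True if the page was still live at the start of `week` (1-indexed)."""
    if deleted is None:
        return True  # never deleted
    return deleted - created >= timedelta(weeks=week - 1)


def eligible_pages(pages, week):
    """Return the ids of pages that may contribute to the stats for `week`.

    pages: {page_id: (created, deleted_or_None)} with datetime values
    """
    return {
        page_id
        for page_id, (created, deleted) in pages.items()
        if survived_until_week(created, deleted, week)
    }
```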

Revisions by others. The geometric mean number of revisions per week (excluding creating user and AfC status change edits) is plotted for AfC drafts and Direct to Main articles that were deleted quickly or not.
Bytes changed by others. The geometric mean bytes changed per week (excluding creating user and AfC status change edits) is plotted for AfC drafts and Direct to Main articles that were deleted quickly or not.
Unique non-creating users. The geometric mean number of registered editors per week (excluding creating user and AfC status change edits) is plotted for AfC drafts and Direct to Main articles that were deleted quickly or not.
Unique non-creating anons. The geometric mean number of anonymous editors per week (excluding creating user and AfC status change edits) is plotted for AfC drafts and Direct to Main articles that were deleted quickly or not.

--EpochFail (talk) 15:01, 23 April 2014 (UTC)

Discussion

Following my fellow reviewers' comments here, I'd like to argue that the finding that the "AfC process reduces editor productivity" is partly explained by the fact that a huge chunk of submitters only ever submit one article, because they are involved with the subject matter in some way. Once that article has been submitted, whether it is accepted or declined, the editor is unlikely to participate any longer for exactly that reason: the article was why they registered in the first place, and their main reason for editing has been resolved one way or the other. AfC is currently not designed to entice these submitters to continue editing on Wikipedia, and I don't know if we'd actually want these editors to do so, considering the hundreds of blatant COI and otherwise undesirable submissions. This, on top of our severe backlog, seriously affects your study's results. In other words, your measure of AfC submitter productivity is in part correlated with the size of the backlog; with the number of honest, well-intentioned editors and the amount of junk we tediously weed out (copyright violations, vandalism, etc.), which skews those results; and with the fact that we're missing expert reviewers, among other factors. These are just a few considerations we need to ponder when looking at these numbers and at the future of AfC. Cheers, FoCuSandLeArN (talk) 19:27, 21 May 2014 (UTC)

Thanks for the feedback and I appreciate your observations. I'm really interested in the potential for gathering subject expert reviewers/collaborators for new articles. One cool opportunity for this is the "co-edit based" recommender inside of en:User:SuggestBot (ping Nettrom and see [1] for more details). This part of the algorithm gathers other editors who tend to edit the articles that you do in order to recommend new articles to edit. Presumably this strategy could also be used to recommend collaborators. --EpochFail (talk) 17:26, 23 May 2014 (UTC)
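
To make the collaborator idea concrete, here is a toy sketch that ranks other editors by how much their edited articles overlap with a target editor's. This is not SuggestBot's actual algorithm. The Jaccard overlap score, the function name and the input layout are all assumptions chosen for illustration.

```python
def recommend_collaborators(edits, target, top_n=3):
    """Rank other editors by Jaccard overlap of edited articles with `target`.

    edits: {editor_name: set of article titles} (hypothetical layout)
    """
    mine = edits[target]
    scored = []
    for editor, theirs in edits.items():
        if editor == target:
            continue
        union = mine | theirs
        if not union:
            continue  # neither editor has edited anything
        score = len(mine & theirs) / len(union)
        if score > 0:
            scored.append((score, editor))
    # Highest overlap first; ties are broken alphabetically by editor name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [editor for _, editor in scored[:top_n]]
```

For a draft author with only one article, the overlap would have to be computed on something broader than exact titles, such as categories or linked articles, since a brand-new draft has no co-editors yet.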

Article size and article attribution

Aaron, thanks for this research! I have felt for a long time that the AfC process is seriously broken. In fact, I think an article has a bigger chance of failing AfC when it is submitted as a one-reference stub, no matter how high-quality the reference is - could you check this? Also, I think that the reason reviewers place the "onus on the submitter" has to do with attribution of the initial article at the moment it moves to mainspace - 100% of the article at that moment is attributed to the submitter. If reviewers rewrite the article in proper wiki syntax, which is often easier to do than reviewing it, then the reviewers are stealing the attribution, in a way. I would like to see a workflow where a submitter can release his/her submission as CC-0 rather than CC-by, because then some reviewer can just create the stub in mainspace and the submitter can make stepwise contributions to it section by section. In fact, it would be great if the "redlink" linked to a Wikidata item first and allowed creation of that item (or checked that the subject isn't already a stub-able item on WD in some other language) as a first stepwise contribution to an article in mainspace. In other words, the initial review for notable subjects that do not pass AfC review should strip the draft down to stub level, publish that first, and put the rest on the talk page, giving the submitter, but also other Wikipedians, a way to find the rest of the draft and collaborate in the normal wiki way from there. Jane023 (talk) 09:49, 22 May 2014 (UTC)

Hi Jane023. Lots of things! I'll try to tackle them one at a time.
  • Re. researching single reference stubs -- I'd like to support such an analysis. I think that using human judgement to compare the initial quality of AfC submissions and direct to main submissions is really the only way to definitively support our conclusions. Would you be interested in organizing a group of hand coders to evaluate a random sample of new articles? --EpochFail (talk) 20:24, 23 May 2014 (UTC)
No, not interested in helping the AfC process at all. I am more interested in improving the article creation workflow in parallel with item creation for WD and the file upload wizard for Commons, whereby the appropriate WD properties and the media copyright and categorization process work in tandem with the article text curation. I am not saying that each article needs an image or that each image needs an article or item, but I do believe these workflows need to be synchronized and simplified per "ontology category". Jane023 (talk) 11:55, 24 May 2014 (UTC)
I don't see improving AfC and improving the article creation workflow as separate efforts. My goal is to construct generalized knowledge about collaborative processes. You had a question and proposed a means to answer it. --EpochFail (talk) 18:39, 24 May 2014 (UTC)
  • Re. attribution when moving to main namespace, I'm not sure what you mean. Since a page move preserves the edit history, it seems like licensing wouldn't need to be of any concern. --EpochFail (talk) 20:24, 23 May 2014 (UTC)
This AfC page move idea is such a mess I am not sure if it is even worth discussing. Jane023 (talk) 11:55, 24 May 2014 (UTC)
I'm not sure what you're talking about. It was you who proposed a complex licensing strategy for article content. I merely pointed out that this is unnecessary since page moves preserve the history of edits. --EpochFail (talk) 18:39, 24 May 2014 (UTC)
  • Re. redlinks to Wikidata, I really like this idea for two reasons: (1) bringing people to Wikidata when the content they are looking for on enwiki doesn't exist would probably drive contributors to Wikidata and (2) Wikidata could be useful for writing stub articles. When I imagine a newcomer putting an article together from scratch, I'd like to hand them something like a new article kit that includes a reference to the Wikidata item. --EpochFail (talk) 20:24, 23 May 2014 (UTC)
See the redlinks on the en:Koekkoek disambig page. Jane023 (talk) 11:55, 24 May 2014 (UTC)
Are you directing me to the related links to Wikidata and Reasonator? --EpochFail (talk) 18:39, 24 May 2014 (UTC)
  • Re. trimming drafts down to stubs if the subject is notable, this might be hard to do. I imagine that it is difficult for reviewers to know when a draft is about a notable subject unless the draft's author makes a clear case. As far as I can tell, failing to make a clear case about notability is the primary reason that new articles are deleted (see Research:The Speed of Speedy Deletions). --EpochFail (talk) 20:24, 23 May 2014 (UTC)
If you have a ton of unusable text along the lines of "he was the best singer in the history of X", but the guy was notable, then have the reviewer create the Wikidata item first (making the item stub-able), then stub it up, then inform the submitter that there are guidelines to follow and a teahouse for questions, etc., and that the stub is available for improvement along those guidelines. Jane023 (talk) 11:55, 24 May 2014 (UTC)
The problem is, how do you know that the guy was notable? As the study I referenced shows, the vast majority of deleted articles fail to assert notability. Are you hoping that reviewers will personally be able to suss out the notability of the topic of articles? I think that is a lot to ask for. Maybe you imagine some information support tool to help reviewers determine the notability of a topic independent of the content. --EpochFail (talk) 18:39, 24 May 2014 (UTC)

Yes, this is the core of what I am getting at. "Reviewers" is too vague a term, because there are content experts among Wikipedians who can quickly verify the notability of certain subjects and who will never sign up to be a reviewer, but who should be served with these specific lookup requests. Some ontologies are easier than others; the ontology of the Economics portal has fewer active Wikipedians than the ontology of the Arts portal (talking only about the English Wikipedia here, others may differ). Notability in Wikipedia terms can almost never be ascertained correctly for long-tail subjects, and serving the question "notable or not?" to the proper person is the challenge. The onus of proof now lies 100% with the AfC submitter, without any specific hints about how to go about it, while the definitions of WP notability are not set in stone anywhere. Jane023 (talk) 11:59, 27 May 2014 (UTC)