Research:Autoconfirmed article creation trial

From Meta, a Wikimedia project coordination wiki

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.

The goal of this study is to run an experiment on English Wikipedia where we examine the effects of disabling article creation for non-autoconfirmed newly registered editors. The findings have been published on English Wikipedia at Wikipedia:Autoconfirmed article creation trial/Post-trial Research Report.

Research Questions

We are interested in understanding the effects this change in permissions will have on newly registered accounts, the English Wikipedia's quality assurance processes (in particular New pages patrol), and the quality of Wikipedia's articles. These three main themes are also reflected in the following three research questions:

RQ-New accounts
How does requiring autoconfirmed status to create new articles affect newly registered accounts?
RQ-Quality Assurance
How does requiring autoconfirmed status affect Wikipedia's quality assurance processes?
RQ-Content quality
How does requiring autoconfirmed status affect the quality of Wikipedia's articles?

Hypotheses

RQ-New accounts

H1: Number of accounts registered per day will not be affected.

Because the system is designed so that restrictions and limitations are communicated and placed at the time of action, rather than described up front, we should see little to no change in the number of accounts that are registered.

H2: Proportion of newly registered accounts with non-zero edits in the first 30 days is reduced.

Some proportion of new accounts start out by creating a new article, a path that will no longer be available. Instead, they are required to make other types of edits to reach autoconfirmed status if they wish to create articles. This can be regarded as an increased barrier to entry similar to those studied by Drenner et al.,[1] who found that requiring more effort from new users resulted in a lower completion rate. We expect some of the newly registered accounts to be "single purpose accounts", ones that are registered only in order to create an article. Because these accounts are unable to create articles, we expect them to no longer make any edits. In line with Drenner et al.'s work, we expect some of the remaining accounts to not continue their work and leave, which will also contribute to a reduction in the proportion of accounts with non-zero edits.

H3: Proportion of accounts reaching autoconfirmed status within the first 30 days since account creation is unchanged.

We have so far not found any existing research on the proportion of users who reach autoconfirmed status within a specific time period. Studies on Wikipedia contributors that mention autoconfirmed status have looked at, for example, spam attack vectors[2] and page protection.[3] Two studies of roles in Wikipedia[4][5] did not distinguish between accounts with and without autoconfirmed status, instead combining both into a "new user" group.

There is presently no requirement to make a certain number of edits/contributions in order to unlock privileges on the English Wikipedia, and the site is not gamified in the way that, for example, Stack Overflow is.[6] We do not know to what extent newly registered contributors seek to unlock these privileges. If a large proportion of newly registered accounts make edits that are subsequently deleted (e.g. because they created a non-encyclopedic article), and this behaviour leads to their accounts not reaching autoconfirmed status, it means that those who reach said status can be regarded as productive members of the community. In addition, H2 proposes that the proportion of accounts with non-zero edits is reduced, suggesting that these newly registered accounts will never reach autoconfirmed status. In combination, there are reasons to hypothesize that the proportion of accounts reaching autoconfirmed status within 30 days of registering will be unchanged.

H4: The median time to reach autoconfirmed status within the first 30 days is unchanged.

Similarly, we hypothesise that the time it will take for a newly registered account to reach autoconfirmed status does not change.

H5: The proportion of surviving new editors who make an edit in their fifth week is unchanged.

When it comes to measuring editor retention, the WMF research staff has at least two definitions: Returning new editor and Surviving new editor. The two differ in what they measure (edit sessions and edits, respectively) and in the timespan they define. Given our other measures and hypotheses, the measure of "returning new editor" is most likely covered elsewhere, meaning we are mainly concerned with surviving new editors. Due to the time constraints of the trial, we propose measuring retention through surviving new editors: those who make at least one edit in the first week and also make at least one edit in the fifth week (as that week crosses the 30 day threshold after registration). From the previous hypotheses, it follows that the proportion of surviving new editors should not change.

Further segmentation

We would like to analyze surviving new editors further by segmenting this population into those who start out by creating a new article, and those who start out by editing existing articles. When looking at historical data it would also be useful to split the former group depending on whether their article survives or not. Taken together, these measures should tell us more about the extent to which newly registered accounts stick around, and what the effect of new article creation is on contributions and retention.

H6: The diversity of participation done by accounts that reach autoconfirmed status in the first 30 days is unchanged.

There are no restrictions on where to make the edits needed to reach autoconfirmed status. A newly registered account might make five copy edits to articles, a couple of edits to create a user page to describe themselves, and some edits to discussion pages (e.g. the Teahouse) in order to reach that limit, while another account might make ten article edits. We argued earlier that the newly registered accounts that reach autoconfirmed status will not change significantly, meaning that the diversity (e.g. number of different pages edited) of work done by these accounts will also not change.

H7: The average number of edits in the first 30 days since registering is reduced.

We hypothesized that the proportion of accounts with a non-zero number of edits will be reduced because some of these accounts will be abandoned. This will likely also affect the average number of edits made in the first 30 days since registering. Note that in this case we will also count edits to pages that are subsequently deleted (e.g. non-encyclopedic articles).

H8: The number of requests for "confirmed" status is increased.

The English Wikipedia has a page where users can request "confirmed" status: Requests for permissions/Confirmed. A newly registered account might want to go there to request said status in order to not have to meet the requirements for autoconfirmed status. At the same time, someone who is new to Wikipedia might not know that this possibility exists. We hypothesized earlier that some newly registered accounts have a single purpose, and we might expect some proportion of these to seek out other venues than reaching autoconfirmed status, particularly if the venue is perceived as low cost. This can lead to a small but significant increase in the number of requests for "confirmed" status.

RQ-Quality Assurance

H9: Number of patrol actions will decrease.

The number of articles created per day will be reduced since non-autoconfirmed users can no longer create them. This leads to a reduction in the influx of articles into the New Page Patrol queue, and a subsequent reduction in the number of patrol actions.

Related measure

The ratio of patrol actions to created articles is unchanged. In other words, patrollers notice that there is less work to do and adjust their efforts accordingly.

H10: Number of active patrollers will decrease.

With a lower influx of articles there is less work to do, leading some patrollers to stop patrolling. Given the lower workload, it is unlikely that new patrollers will be recruited.

Related measure

The ratio of active patrollers to created articles is unchanged. This follows logically from H10.

H11: The distribution of patrolling activity evens out.

Because of the reduction in the number of articles that enter the patrol queue, there is less need for the most active patrollers to do as much "heavy lifting". This in turn means the patrol work is distributed more evenly across the active patrollers.

H12: The number of Time-Consuming Judgement Calls will decrease.

The analysis of New Pages Patrol by the Wikimedia Foundation describes the job of patrolling certain types of newly created articles as "Time-Consuming Judgement Calls" (TCJC). These articles are characterized as follows:

  • They're probably notable but badly written
  • They're well-written but have questionable notability
  • They're a weird mix of both (because life is complicated)

From the WMF's analysis we know that 15% of the articles that are not quickly patrolled are created by non-autoconfirmed users. Because these users are no longer able to create new articles, we can hypothesize that the number of TCJCs will be reduced. We expect the influx of TCJCs created by autoconfirmed users to remain stable.

H13: The size of the backlog of articles in the New Page Patrol queue will remain stable.

It is not certain whether the reduction in the workload of New Page Patrollers will free up resources to go through and process the backlog of articles that still need patrolling, or if the patrollers will focus on other areas of the encyclopedia. Because we hypothesized that the number of patrol actions and patrollers would decrease in H9 and H10, also relative to the number of created articles, it would follow that the size of the backlog remains stable.

H14: The survival rate of newly created articles by autoconfirmed users will remain stable.

When it comes to article survival, we are only concerned with the survival of articles created by autoconfirmed users, since no other users can create articles during the trial. As we earlier hypothesized that these types of users will continue to be productive members of the community, we also hypothesize that the survival rate of their articles will remain stable. We adapt the definition of survival used by Schneider et al.,[7] meaning an article survives if it is not deleted during the first 30 days of its life.

H15: The rate of article growth will be reduced.

The previously mentioned statistics on article creation and survival[8][9] indicated that around 50–100 surviving articles are created each day by non-autoconfirmed users. As before, we have hypothesized that it is unlikely that newly registered accounts will do the work necessary to reach autoconfirmed status, meaning that these articles will no longer be created, which reduces the rate of article growth on the English Wikipedia.

Related measures

We are also interested in understanding more about how articles come to be. To what extent are they created as drafts in the user namespace and then moved into the article namespace? Secondly, are creations and moves done by recently autoconfirmed accounts? The proportion of articles created as moves might increase due to it becoming an alternative approach to article creation. Since we hypothesized that the proportion of accounts reaching autoconfirmed status in 30 days will remain unchanged, we would also hypothesize that creations and moves by recently autoconfirmed accounts would remain stable.

H16: The rate of new submissions at AfC will increase.

We expect some proportion of newly registered accounts to make a failed attempt at creating an article, and upon doing so will choose to use the Article Wizard instead, meaning their proposed article will be added to the review queue at Articles for Creation (AfC). This should result in an increase in the rate of proposed articles reaching AfC.

H17: The backlog of articles in the AfC queue will increase faster than expected.

The process of handling AfC submissions is similar to NPP in that it requires human intervention. Since we proposed that the NPP backlog will decrease due to the lower rate of influx of articles there, we must similarly propose that the backlog of articles at AfC will increase.

H18: The reasons for deleting articles will remain stable.

When it comes to reasons for why articles get deleted, we again refer to our previous hypothesis that newly created accounts that reach autoconfirmed status will be productive members of the community. This means that article creations will remain stable and the reasons for why articles get deleted will not change significantly.

H19: The reasons for deleting non-article pages will change towards those previously used for deletion of articles created by non-autoconfirmed users.

When it comes to reasons for why non-article pages get deleted, we expect to see a change as newly registered accounts can create pages outside the main (article) namespace (e.g. AfC as mentioned above, or their user page or sandbox). Some of this content will likely be candidates for speedy deletion due to copyright infringement (G12) or "Blatant misuse of Wikipedia as a web host" (U5). In other words, we expect to see a change in the reasons for deletions of non-article pages.

RQ-Content quality

H20: The quality of articles entering the NPP queue will increase.

The quality of articles entering the NPP queue will change due to the removal of articles created by non-autoconfirmed users. Those articles that enter the queue that are originally created by non-autoconfirmed users will have gone through the AfC process, meaning they should be of higher quality than earlier, leading to an increase in the average quality of articles in the NPP queue.

H21: The quality of newly created articles after 30 days will be unchanged.

Most of the English Wikipedia's articles are lower quality (Stub- and Start-class) with low readership,[10] and over time the articles that are created are increasingly on niche topics.[11] We therefore expect the quality of newly created articles after 30 days to remain unchanged.

H22: The quality of articles entering the AfC queue will be unchanged.

Little is currently known about the quality of content entering the AfC queue. Research on AfC has instead focused on whether the process leads to greater success in publishing articles or retaining contributors.[7][12] We have hypothesized that the influx of new drafts into AfC will increase somewhat due to newly registered accounts not being able to create articles in the main namespace, but we do not know to what extent those who choose to do so are those who create higher quality content to begin with. Given this uncertainty about the current state of articles entering AfC as well as the creators of these drafts, we hypothesize that the quality of articles entering AfC will not change.

Methods

Our methods of gathering data make extensive use of the replicated databases available through, for example, Wikimedia Toolforge and Quarry. If you are unfamiliar with the schema of these databases, you can learn more through the database layout manual. Some of the data is from the Data Lake. We build a pipeline that lets us easily gather historic data and keep it updated during the trial.

H1: Number of accounts registered

We utilize the logging table to gather this data because it provides two benefits over the user table: most importantly, it has indexes that allow us to gather data across a specific time span; secondly, we can identify how the account was created. A drawback of this method is that the account registration timestamp in the logging table might differ from the one in the user table, typically by about one second. Because this hypothesis is interested in the number of accounts registered, which is usually on the order of "thousands", an error of a second in the timestamp should affect a small number (ideally zero) of accounts per day, thereby enabling us to ignore it.

Accounts can be created in different ways (see Manual:Log actions). Since account creations started being logged in 2006, five different types have been in use:

newusers
A new account is created.
create
The "normal" user registration, someone without an account registers one.
create2
A logged-in user creates an account for someone else.
autocreate
An account is automatically created, for example because someone with an account on a different language edition visits another Wikipedia.
byemail
An account is created for someone else by a logged-in user, and the created account gets the password through email.

The Python code for capturing this data is in newaccounts.py. We have published a dataset with statistics (newaccounts-20170925.tsv) in our GitHub repository. Our data can also be queried using the tools.labsdb database server on Toolforge; the database name is s53463__actrial_p and the table is named newaccounts.
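As an illustration of the approach described above, here is a minimal sketch of counting account creations per day and per creation type from the logging table. It is not the code in newaccounts.py. It assumes that account creations are stored with log_type = 'newusers', that log_action holds the creation type, and that log_timestamp is a YYYYMMDDHHMMSS string. An in-memory SQLite table with made-up rows stands in for the replicated database, and the names count_registrations and demo_connection are hypothetical.

```python
import sqlite3

# Assumption: account creations are rows with log_type = 'newusers', and
# log_timestamp is a 14-character YYYYMMDDHHMMSS string.
REGISTRATIONS_PER_DAY = """
SELECT SUBSTR(log_timestamp, 1, 8) AS reg_date,
       log_action,
       COUNT(*) AS num_accounts
FROM logging
WHERE log_type = 'newusers'
  AND log_timestamp >= ?
  AND log_timestamp < ?
GROUP BY reg_date, log_action
ORDER BY reg_date, log_action
"""


def count_registrations(conn, start, end):
    """Return {(date, creation_type): count} for start <= timestamp < end."""
    rows = conn.execute(REGISTRATIONS_PER_DAY, (start, end)).fetchall()
    return {(reg_date, action): count for reg_date, action, count in rows}


def demo_connection():
    """Build an in-memory stand-in for the logging table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE logging (log_type TEXT, log_action TEXT, log_timestamp TEXT)"
    )
    conn.executemany(
        "INSERT INTO logging VALUES (?, ?, ?)",
        [
            ("newusers", "create", "20170924101500"),
            ("newusers", "create", "20170924235959"),
            ("newusers", "autocreate", "20170924120000"),
            ("newusers", "byemail", "20170925080000"),
            ("delete", "delete", "20170924130000"),  # not an account creation
            ("newusers", "create", "20170926000000"),  # outside the time span
        ],
    )
    return conn


counts = count_registrations(demo_connection(), "20170924000000", "20170926000000")
for (reg_date, action), count in sorted(counts.items()):
    print(reg_date, action, count)
```

Filtering on the timestamp range is what lets the query use the time-based indexes on the logging table that the text mentions.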

H2: Accounts with non-zero edits

In order to determine the proportion of accounts that have made at least one edit in the first 30 days, we need to know two things: the account's identifier and when they registered. We utilize the logging table as we did for H1, because we're interested in data across a given timespan, and record the user ID of the registered account, the time of registration (as logged in the logging table), and how the account was created (per the types listed above).

Once we have a list of accounts registered, we can look their edits up in the revision and archive tables in order to account for edits to deleted pages. Because we can get data for three hypotheses at the same time, our code gathers data for H2, H6, and H7, as that is most efficient. The Python code for this is in registrations.py and first30.py.
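The counting step can be sketched as follows. This is not the code in registrations.py or first30.py. It assumes the registrations and the edit rows have already been fetched, with live_edits standing in for rows from the revision table and deleted_edits for rows from the archive table. All function and variable names are hypothetical.

```python
from datetime import datetime, timedelta

FIRST_30_DAYS = timedelta(days=30)


def edits_in_first_30_days(registrations, live_edits, deleted_edits):
    """Count each account's edits made within 30 days of registration.

    registrations: {user_id: registration datetime}
    live_edits / deleted_edits: iterables of (user_id, edit datetime), standing
    in for rows from the revision and archive tables respectively.
    """
    counts = {user_id: 0 for user_id in registrations}
    for source in (live_edits, deleted_edits):
        for user_id, edited in source:
            registered = registrations.get(user_id)
            if registered is None:
                continue  # edit by an account outside the cohort
            if registered <= edited < registered + FIRST_30_DAYS:
                counts[user_id] += 1
    return counts


def proportion_with_edits(counts):
    """Share of accounts in the cohort with at least one counted edit."""
    if not counts:
        return 0.0
    return sum(1 for n in counts.values() if n > 0) / len(counts)


day = datetime(2017, 6, 1)
registrations = {1: day, 2: day, 3: day, 4: day}
live = [(1, day + timedelta(days=2)), (3, day + timedelta(days=45))]
deleted = [(2, day + timedelta(hours=1))]  # edit to a page that was later deleted
counts = edits_in_first_30_days(registrations, live, deleted)
print(counts, proportion_with_edits(counts))
```

Keeping the per-account counts, rather than only a yes/no flag, is what allows the same pass to feed the average number of edits needed for H7.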

H5: Surviving new editors

A surviving new editor is defined as one who makes an edit in both the first and fifth week after registration. We use the same approach to finding edits as for H2, but alter it so that it uses the appropriate timespans. In order to learn what the actual reduction in proportion of surviving new editors is, we store edit counts in both weeks for accounts that made an edit in the first week. The Python code is found in survival.py.
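A minimal sketch of this definition is shown below. It is not the code in survival.py. It assumes that weeks are counted as consecutive 7-day windows starting at registration, so the fifth week covers days 28 to 34, and that each account's edit timestamps (including edits to deleted pages) have already been collected. The function names are hypothetical.

```python
from datetime import datetime, timedelta

WEEK = timedelta(days=7)


def week_edit_counts(registered, edit_times):
    """Return (edits in week 1, edits in week 5) counted from registration."""
    first = fifth = 0
    for edited in edit_times:
        age = edited - registered
        if timedelta(0) <= age < WEEK:
            first += 1
        elif 4 * WEEK <= age < 5 * WEEK:
            fifth += 1
    return first, fifth


def survival_proportion(accounts):
    """Proportion of first-week editors who also edit in their fifth week.

    accounts: iterable of (registration datetime, list of edit datetimes).
    """
    active_first_week = survivors = 0
    for registered, edit_times in accounts:
        first, fifth = week_edit_counts(registered, edit_times)
        if first > 0:
            active_first_week += 1
            if fifth > 0:
                survivors += 1
    if active_first_week == 0:
        return 0.0
    return survivors / active_first_week
```

Accounts with no edit in the first week are left out of the denominator, matching the statement that edit counts are stored only for accounts that made an edit in the first week.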

Data is also gathered about whether an account started out by creating an article. This is done by utilizing the dataset gathered for understanding article creations to answer H9's related measure, as that dataset contains all article creations. We look up a user's first edit (taking deleted pages into account as we did for H2) and check if that edit corresponds to an article creation event. If so, we use the logging table to identify whether the article was deleted within 30 days of the creation event. The relevant Python code is found in articlesurvival.py.
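The segmentation described above could look roughly like the following sketch. It is not the code in articlesurvival.py. The data structures are hypothetical: creations maps the revision ID of each article creation event to the page and creation time, and deletions maps a page to the deletion times found in the logging table.

```python
from datetime import datetime, timedelta

SURVIVAL_WINDOW = timedelta(days=30)


def classify_first_edit(first_rev_id, creations, deletions):
    """Classify an account by what its first edit did.

    creations: {rev_id: (page_id, creation datetime)} for article creations
    deletions: {page_id: [deletion datetimes]} taken from the deletion log
    """
    if first_rev_id not in creations:
        return "edited_existing"
    page_id, created = creations[first_rev_id]
    for deleted in deletions.get(page_id, []):
        if created <= deleted < created + SURVIVAL_WINDOW:
            return "created_deleted"
    return "created_survived"
```

The three labels correspond to the segments discussed under H5: accounts that start by editing existing articles, and accounts that start by creating an article that either survives or is deleted within 30 days.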

H9: Number of patrol actions

Since the introduction of the PageTriage extension, patrol actions are logged in the logging table. We use this to count the number of patrol actions using the SQL query below.

This data is gathered using the ReportUpdater tool.

In order to calculate the related measure, we gathered data on article creations from the data lake using a modified version of the query used in T149021. The modifications involve not removing current redirects because a created article might be replaced by a redirect later, improving the detection of pages created as redirects per Tbayer's notes, and removing pages created by moves because those also create redirects.

We also gather data on articles published through a move from User, Draft, or Wikipedia talk namespaces. This is a two-step process that first finds all revisions where the edit comment indicates a page was moved from any of those three namespaces, and secondly identifies whether the namespace it was moved to was main. The Python code for this is in moves.py.
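The two-step check can be illustrated with the sketch below. It is not the code in moves.py. It assumes move comments look like "moved [[A]] to [[B]]" or "<user> moved page [[A]] to [[B]]", and it uses a partial, illustrative list of namespace prefixes to decide whether a title is in the main namespace. The names namespace_of and published_by_move are hypothetical.

```python
import re

# Assumed comment shapes: "moved [[A]] to [[B]]" and
# "<user> moved page [[A]] to [[B]]", possibly followed by a reason.
MOVE_COMMENT = re.compile(
    r"moved (?:page )?\[\[(?P<source>[^\]|]+)\]\] to \[\[(?P<target>[^\]|]+)\]\]"
)

SOURCE_NAMESPACES = {"User", "Draft", "Wikipedia talk"}

# Partial, illustrative list of non-main namespace prefixes.
OTHER_NAMESPACES = SOURCE_NAMESPACES | {
    "Talk", "User talk", "Wikipedia", "File", "Template", "Help",
    "Category", "Portal", "Draft talk",
}


def namespace_of(title):
    """Return the namespace prefix of a title, or '' for the main namespace."""
    prefix, sep, _ = title.partition(":")
    if sep and prefix in OTHER_NAMESPACES:
        return prefix
    return ""


def published_by_move(comment):
    """True if the comment records a move from User, Draft, or Wikipedia talk
    into the main namespace."""
    match = MOVE_COMMENT.search(comment)
    if match is None:
        return False
    source_ns = namespace_of(match.group("source"))
    target_ns = namespace_of(match.group("target"))
    return source_ns in SOURCE_NAMESPACES and target_ns == ""
```

Comparing the prefix against a known namespace list, rather than treating any colon as a namespace separator, keeps article titles that contain a colon in the main namespace.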

H10: Number of active patrollers

We modify the query from Research:New page reviewer impact analysis that counts the number of patrollers so that it counts them by day.

As for H9, this data is gathered using the ReportUpdater tool. We gathered data on the related measure in the same way as described for H9 above.

H11: Distribution of patroller activity

We modify the SQL query used in H10 so that it counts the number of patrol actions performed by each patroller on a given day. The Python code for this is in patrollers.py, the SQL query is shown below.
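To make the per-patroller, per-day aggregation concrete, here is a sketch that works on rows already pulled from the logging table. It is neither the code in patrollers.py nor the study's SQL query. It assumes each row is a (log_timestamp, patroller) pair with a YYYYMMDDHHMMSS timestamp. The "top share" figure is only an illustrative way to look at how evenly work is spread and is not a measure taken from the study.

```python
from collections import Counter, defaultdict


def patrol_activity_by_day(patrol_log_rows):
    """Count patrol actions per patroller per day.

    patrol_log_rows: iterable of (log_timestamp, patroller) pairs, where
    log_timestamp is a YYYYMMDDHHMMSS string.
    Returns {date: Counter({patroller: number of actions})}.
    """
    activity = defaultdict(Counter)
    for log_timestamp, patroller in patrol_log_rows:
        activity[log_timestamp[:8]][patroller] += 1
    return dict(activity)


def daily_summary(activity):
    """Derive per-day totals: (patrol actions, active patrollers, top share)."""
    summary = {}
    for date, per_patroller in activity.items():
        total = sum(per_patroller.values())
        top = max(per_patroller.values())
        summary[date] = (total, len(per_patroller), top / total)
    return summary
```

The same per-day counters yield the number of patrol actions (H9) and the number of active patrollers (H10) as by-products, which is why the three measures can share one pass over the log rows.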

Datasets

The data gathered as part of this project is published in various datasets. Please see our dataset page for locations and descriptions of these datasets.

Results

Preliminary analysis of historic data

H1: Number of accounts registered per day will not be affected.

Our dataset (described in more detail in Methods) distinguishes between several ways of registering an account. The four methods that are currently in use and will be shown in our graphs are:

create
Normal account creation, someone registers an account.
autocreate
The system automatically creates an account, which typically happens when someone with an account on a different Wikimedia project visits the English Wikipedia.
create2
Someone with account creation rights creates an account for someone else. This might typically be the result of an account creation request.
byemail
Similar to "create2", except the password associated with the account is sent through email.
Plot of the number of accounts per day on the English Wikipedia from Jan 1, 2009 to Sep 25, 2017, separated by type of account creation.

We start out by plotting the number of accounts registered per day across all four registration types, and the plot is linked on the left. Here we can see that normal account creation and auto-creation dominate, with number of accounts created per day counted in the thousands. "create2" and "byemail" barely show up on this graph.

Plot of the number of accounts created for others per day on the English Wikipedia from Jan 1, 2009 to Sep 25, 2017. "byemail" are accounts that get their password sent through email, "create2" are accounts created by someone with account creation rights.

Next we plot "create2" and "byemail" by themselves in order to better understand how many of these accounts are typically created every day. Here we can see that over the past three years, it is generally somewhere around 25–50 accounts per day. The distribution and median are quite similar for both types of account creation (see this box plot: https://commons.wikimedia.org/wiki/File:Boxplot_create2_byemail_2014-2017.svg), but we see that "byemail" is more likely to see a large number of accounts created.

We choose to regard the accounts labelled as created using the "create2" and "byemail" methods as having been created in a similar way to regular account creations. This means that further analysis only studies two types of accounts: those that are auto-created and those that are not. There are two reasons for combining accounts this way: first of all, the number of accounts created is orders of magnitude smaller than the others; and second, we expect the intent behind the account to be similar for all three non-autocreated types of creations. The issue with the small number of accounts created is that we might find them to have outlier-type behaviour that might suggest further study, but on the larger scale of things it might not be important (e.g. when measuring proportions, we might find that all "create2" accounts behaved in a certain way for a given day). Perhaps more important is that there should be a clear difference in intent between an auto-created account and other types of accounts. An auto-created account might simply be someone reading Wikipedia, while if someone creates an account manually we would expect them to intend to contribute to the project.

Plot of the number of non-auto-created accounts per day on the English Wikipedia from Jan 1, 2009 to Sep 25, 2017.

Next we estimate the number of accounts created per day depending on whether the accounts were auto-created or not. The plot on the left shows the number of non-auto-created accounts per day, with a LOESS-smoothed line added. We can see some spikes in registrations in 2014 and 2015; these appear to be related to both the SUL finalization project and the fact that the Wikipedia smartphone app required/encouraged users to register an account.

Plot of the number of auto-created accounts per day on the English Wikipedia from Jan 1, 2009 to Sep 25, 2017.

The plot on the left shows the number of auto-created accounts per day, again with a LOESS-smoothed line added. In this plot we also see the spike in registrations in 2014 and 2015. From that point on, account creation appears to be related to the season, e.g. we see a higher number of accounts registered in the spring and fall than during the summer.

Based on the plots we decide to use the following estimates of the number of accounts created per day on the English Wikipedia: 2,500 auto-created accounts, and 5,000 non-auto-created accounts. These estimates will typically be used in order to generate other estimates, while the subsequent analysis of our hypotheses will instead mainly focus on proportions.

H2: Proportion of newly registered accounts with non-zero edits in the first 30 days is reduced.

In our analysis of H1 we argued that the main difference in account creation is between accounts that are auto-created and those that are not. The analysis of H2 will follow along the same lines, with newly registered accounts split into two groups: auto-created and others.

Proportion of newly registered accounts having non-zero edits in their first 30 days, split into accounts that were autocreated and those that were not. Plotted by day from Jan 1, 2009 to July 1, 2017.

The plot on the left shows the proportion of registered accounts making at least one edit in the first 30 days since registration, plotted on a per-day basis and split into accounts that were auto-created and those that were not. We can see that overall, something like one third of non-auto-created accounts make at least one edit in the first 30 days. This proportion does change somewhat over time, but appears to mainly have been affected by the huge registration numbers mentioned in the analysis of H1. Before and after that period (mid-2014 to mid-2015) there are fluctuations, but these appear to be seasonal rather than longer-term trends.

We can also see that the proportion of auto-created accounts that make edits is much lower, and this is expected based on how auto-created accounts are registered. Unlike for the other accounts, the fluctuations for auto-created accounts are much smaller. The proportion also appears to be on a slow downwards trend. Studying why that trend occurs is outside the scope of this project.

H3: Proportion of accounts reaching autoconfirmed status within the first 30 days since account creation is unchanged.

In this analysis we focus on accounts that made edits in the first 30 days since registration. In other words, we measure the proportion of accounts that made at least one edit in the first 30 days since registration that also went on to become autoconfirmed within the same time period. If we do not apply this type of filter, the plot will just look like the one shown for H2 above.
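The measure can be sketched as below. This is an illustration rather than the project's code. It assumes the English Wikipedia thresholds of 10 edits and an account age of 4 days, treats the status as reached at the later of those two moments, and takes each account's edit timestamps as already collected. The function names are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median

# Assumed thresholds: 10 edits and an account age of 4 days.
MIN_EDITS = 10
MIN_AGE = timedelta(days=4)
WINDOW = timedelta(days=30)


def autoconfirmed_at(registered, edit_times):
    """Return when the account meets both thresholds, or None if it never does."""
    ordered = sorted(edit_times)
    if len(ordered) < MIN_EDITS:
        return None
    return max(registered + MIN_AGE, ordered[MIN_EDITS - 1])


def autoconfirmed_within_window(accounts):
    """Summarise accounts with at least one edit in their first 30 days.

    accounts: iterable of (registration datetime, list of edit datetimes).
    Returns (proportion reaching the status within 30 days, median number of
    days taken by those that reach it, or None if no account does).
    """
    editing = 0
    days_to_status = []
    for registered, edit_times in accounts:
        in_window = [t for t in edit_times if registered <= t < registered + WINDOW]
        if not in_window:
            continue  # filtered out: no edits in the first 30 days
        editing += 1
        reached = autoconfirmed_at(registered, in_window)
        if reached is not None:
            days_to_status.append((reached - registered) / timedelta(days=1))
    if editing == 0:
        return 0.0, None
    proportion = len(days_to_status) / editing
    return proportion, (median(days_to_status) if days_to_status else None)
```

Restricting the denominator to accounts that edited at all is the filter the paragraph describes; without it the result would mostly track the share of accounts that make any edit.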

Plot of the proportion of accounts with non-zero edits in the first 30 days since registration that also reach autoconfirmed status within the same timeframe. Split by type of creation (autocreated versus others), and plotted by day from Jan 1, 2009 to July 1, 2017.

The plot on the left shows the proportion of accounts reaching autoconfirmed status, calculated on a per-day basis from January 1, 2009 to July 1, 2017. A key trend to note in the plot is that the proportion has been stable across time, both for auto-created accounts as well as others. We see that for non-auto-created accounts the proportion is around 10%. There are some seasonal variations, but no long-term changes.

The fluctuations in proportion for auto-created accounts are larger, as expected due to the smaller number of accounts that are created (ref H1) and the much lower proportion of accounts that make edits (ref H2). Overall the trend is similar to that for non-auto-created accounts, although perhaps slightly higher. There is a spike in the graph in late 2015, specifically on September 15 and 17. This spike is completely driven by contributors going through The Wikipedia Adventure (TWA). There were 146 autocreated accounts in total across both days, of which 117 (80.1%) went through TWA. That leaves 23 accounts that did not, which is on par with the median of 12.

Final results

Table: Overview of our hypotheses and whether they are supported or not.

No. | Hypothesis | Supported
H1 | Number of accounts registered per day will not be affected. | Supported.
H2 | Proportion of newly registered accounts with non-zero edits in the first 30 days is reduced. | Not supported.
H3 | Proportion of accounts reaching autoconfirmed status within the first 30 days since account creation is unchanged. | Supported.
H4 | The median time to reach autoconfirmed status within the first 30 days is unchanged. | Supported.
H5 | The proportion of surviving new editors who make an edit in their fifth week is unchanged. | Supported for autocreated accounts, not supported for other accounts.
H6 | The diversity of participation done by accounts that reach autoconfirmed status in the first 30 days is unchanged. | Supported.
H7 | The average number of edits in the first 30 days since registering is reduced. | Not supported.
H8 | The number of requests for "confirmed" status is increased. | Not supported.
H9 | Number of patrol actions will decrease. | Supported.
H9r | Related measure: The ratio of patrol actions to created articles is unchanged. | Supported.
H10 | Number of active patrollers will decrease. | Supported.
H10r | Related measure: The ratio of active patrollers to created articles is unchanged. | Not supported.
H11 | The distribution of patrolling activity evens out. | Not supported.
H12 | The number of Time-Consuming Judgement Calls will decrease. | Research incomplete.
H13 | The size of the backlog of articles in the New Page Patrol queue will remain stable. | Not supported.
H14 | The survival rate of newly created articles by autoconfirmed users will remain stable. | Supported.
H15 | The rate of article growth will be reduced. | Not supported.
H16 | The rate of new submissions at AfC will increase. | Supported.
H17 | The backlog of articles in the AfC queue will increase faster than expected. | Supported.
H18 | The reasons for deleting articles will remain stable. | Not supported.
H19 | The reasons for deleting non-article pages will change towards those previously used for deletion of articles created by non-autoconfirmed users. | Not supported.
H20 | The quality of articles entering the NPP queue will increase. | Partially supported.
H21 | The quality of newly created articles after 30 days will be unchanged. | Not supported.
H22 | The quality of articles entering the AfC queue will be unchanged. | Partially supported.

RQ-New accounts

H1: Number of accounts registered per day will not be affected.

H1 is supported. The January 31 work log contains the main analysis and forecasting results showing no indication that ACTRIAL has affected the number of accounts registered.

H2: Proportion of newly registered accounts with non-zero edits in the first 30 days is reduced.

H2 is not supported. The February 6 work log contains the analysis and forecasting results, showing that ACTRIAL has had no effect on the proportion of accounts making edits in the first 30 days after registering.

H3: Proportion of accounts reaching autoconfirmed status within the first 30 days since account creation is unchanged.

H3 is supported. The February 6 work log contains the analysis and forecasting results. We only analyze accounts that make at least one edit during their first 30 days, and analyze autocreated and non-autocreated accounts separately. In both cases, we find that the proportion of accounts reaching autoconfirmed status is within the confidence interval of our forecast, indicating that ACTRIAL has not affected it.
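
The decision rule described above, checking whether an observed proportion falls inside the forecast interval, can be sketched as follows. This is a minimal illustration and not the code used in the work log: the function names are hypothetical and the numbers are made up for the example.

```python
def within_forecast_interval(observed, lower, upper):
    """Return True when the observed value lies inside [lower, upper]."""
    return lower <= observed <= upper


def periods_outside_forecast(observations, intervals):
    """List the periods whose observed value falls outside its forecast interval.

    observations: dict mapping a period label to the observed proportion.
    intervals: dict mapping the same labels to (lower, upper) forecast bounds.
    """
    return [
        period
        for period, value in observations.items()
        if not within_forecast_interval(value, *intervals[period])
    ]


# Hypothetical values for illustration only; they are not results from the work log.
observed = {"period 1": 0.31, "period 2": 0.29}
forecast = {"period 1": (0.28, 0.34), "period 2": (0.27, 0.33)}
print(periods_outside_forecast(observed, forecast))  # prints []
```

An empty list corresponds to the conclusion drawn for H3: no observed value falls outside what the forecast would lead us to expect.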

H4: The median time to reach autoconfirmed status within the first 30 days is unchanged.

H4 is supported. The February 6 work log contains the analysis. We measure this only for accounts that reach the status within 30 days, and analyze autocreated and non-autocreated accounts separately. We compare the first two months of ACTRIAL against similar data from the same time of year in 2014, 2015, and 2016. For both types of accounts, we find that both historically and during ACTRIAL the median time to autoconfirmed status is four days.

H5: The proportion of surviving new editors who make an edit in their fifth week is unchanged.

H5 is supported for autocreated accounts, and not supported for non-autocreated accounts. The February 7 work log contains the analysis. Using a model to forecast survival during ACTRIAL, we find that the survival of autocreated accounts is as expected. The survival rate of non-autocreated accounts is higher than expected, which does not support our hypothesis. This increase in survival can be seen as a continuation of a similar trend in recent years, and further study would be needed to determine to what extent it is caused by ACTRIAL.

We further segment the new editors into those who created articles and/or drafts, and those who created neither. The analysis for this can be found in the December 18 work log, the January 17 work log, and the February 12 work log. We find that there is a statistically significant decrease in the survival rate of Draft creators during ACTRIAL, but that this decrease is on par with an influx of article creators (see the January 17 log for this analysis). Survival of those who create neither articles nor drafts has increased compared to similar time periods in previous years.

H6: The diversity of participation done by accounts that reach autoconfirmed status in the first 30 days is unchanged.

H6 is supported. The February 8 work log contains the analysis. We find increased diversity during ACTRIAL compared to similar periods of 2014, 2015, and 2016. However, forecasting models suggest this increase is not outside what one would expect, leading us to conclude that the hypothesis is supported.

H7: The average number of edits in the first 30 days since registering is reduced.

H7 is not supported. The February 9 work log contains the analysis. We find that activity levels for autocreated accounts are at the same level as in 2014, 2015, and 2016 taken together. For non-autocreated accounts, there is a small but significant increase compared to the previous years. Using a forecasting model, we find that activity in October and November is as we would expect. In conclusion, we find no indication that activity has been reduced as a result of the trial.

H8: The number of requests for "confirmed" status is increased.

H8 is not supported. Unfortunately, there doesn't seem to be an easy way to quantify this data other than manually counting the requests. If we look at the number of requests in January 2018 (chosen randomly), we see 25 requests for confirmed status. For January 2017, there were 24 requests. There does not appear to be a significant increase in the number of requests (at least judging from these two data points).

RQ-Quality Assurance

H9: Number of patrol actions will decrease.

H9 is supported, as is our related measure of the ratio of patrol actions to created articles. The February 10 work log has the analysis. We find that there is a significant reduction in the number of patrol actions. The related measure is also supported, indicating that the reduction in number of patrol actions is commensurate with the reduction in number of articles created.

H10: Number of active patrollers will decrease.

H10 is supported. See the February 10 work log for the analysis. Our analysis looks at the first two months of ACTRIAL as a whole, and finds a significant decrease in the average number of active patrollers. The related measure, which hypothesizes that the ratio of active patrollers to created articles will stay unchanged, is not supported. Instead, we find that the reduction in the number of active patrollers is smaller than the reduction in created articles.
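
To illustrate why a smaller reduction in patrollers than in articles leaves the ratio changed, here is a small sketch. It assumes the comparison is made in relative terms; the helper names are hypothetical and the counts are invented for the example, not taken from the work log.

```python
def relative_change(before, after):
    """Relative change from before to after, e.g. -0.2 for a 20% reduction."""
    return (after - before) / before


def ratio_change(num_before, num_after, den_before, den_after):
    """Relative change in the ratio numerator/denominator between two periods."""
    return relative_change(num_before / den_before, num_after / den_after)


# Hypothetical counts for illustration only.
patrollers_before, patrollers_after = 100, 80    # a 20% reduction
articles_before, articles_after = 1000, 500      # a 50% reduction

print(round(relative_change(patrollers_before, patrollers_after), 2))  # -0.2
print(round(relative_change(articles_before, articles_after), 2))      # -0.5
# The ratio of patrollers to articles goes from 0.10 to 0.16, so it rises.
print(round(ratio_change(patrollers_before, patrollers_after,
                         articles_before, articles_after), 2))         # 0.6
```

When both quantities fall by the same proportion, `ratio_change` returns zero, which is what the related measure hypothesized.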

H11: The distribution of patrolling activity evens out.

H11 is not supported. See the February 10 work log for the analysis. We find that there is a small reduction in the proportion of patrolling actions done by the most active quartile of patrollers during ACTRIAL, but that this reduction is not larger than what we would expect. In other words, the work of new page patrol continued to be mainly done by a small group of the active reviewers during ACTRIAL.
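
The concentration measure referred to above, the proportion of patrol actions done by the most active quartile of patrollers, could be computed along these lines. This is a sketch under assumptions: the function name is hypothetical, the quartile is rounded up to a whole number of patrollers, and the input counts are invented.

```python
import math


def top_quartile_share(actions_per_patroller):
    """Share of all patrol actions performed by the most active 25% of patrollers.

    actions_per_patroller: iterable with one action count per active patroller.
    """
    counts = sorted(actions_per_patroller, reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Round the quartile size up so that small groups still have a top quartile.
    top_n = max(1, math.ceil(len(counts) / 4))
    return sum(counts[:top_n]) / total


# Hypothetical action counts for eight patrollers.
counts = [100, 10, 5, 5, 3, 2, 1, 1]
print(round(top_quartile_share(counts), 3))  # 0.866
```

A value close to 1 means that a small group does most of the work; a distribution that "evens out" would push this share down towards 0.25.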

H12: The number of Time-Consuming Judgement Calls will decrease.

Research on this hypothesis was not completed due to the complexity of defining Time-Consuming Judgement Calls.

H13: The size of the backlog of articles in the New Page Patrol queue will remain stable.

H13 is not supported. The analysis can be found in the February 16 work log. We find that the queue decreased rapidly during the first two weeks of ACTRIAL, and decreased further during October, albeit at a slower rate. During the first two weeks of November the queue increased again, ending up at a level similar to where it was on October 1. In other words, we find general patterns of instability in the queue, which does not support H13.

H14: The survival rate of newly created articles by autoconfirmed users will remain stable.

H14 is supported. See the February 2 work log and the February 3 work log for the analysis. During the first two months of ACTRIAL, 19.4% of articles created by autoconfirmed users were deleted. Our analysis shows that this proportion is higher than in the same two months of 2014, 2015, and 2016, but also that the increase in the proportion started several months prior to ACTRIAL. Using an ARIMA forecasting model, we find that the proportion of deletions during ACTRIAL is as expected.

H15: The rate of article growth will be reduced.

H15 is not supported. The analysis can be found in the February 14 work log. Using data from July 1, 2014 through the first two months of ACTRIAL, we find that there is no significant reduction in article growth after the trial started. Our data suggest a reduction in article growth starting several months prior to ACTRIAL, which could be studied further.

H16: The rate of new submissions at AfC will increase.

H16 is supported. The February 20 work log contains the analysis and forecasting results showing that the number of AfC submissions during ACTRIAL has significantly increased. Using data from September 15 to November 15 in the three years prior to ACTRIAL, we find an expected average of 53.9 submissions per day, while during the first two months of ACTRIAL the daily average was 137.
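
To put the two daily averages reported above in relation to each other, the arithmetic can be written out as below. The variable names are illustrative; the two figures are the ones reported for H16.

```python
expected_daily = 53.9   # expected submissions per day, based on the three prior years
observed_daily = 137    # daily average during the first two months of ACTRIAL

ratio = observed_daily / expected_daily
percent_increase = (observed_daily - expected_daily) / expected_daily * 100

print(f"{ratio:.2f}x the expected rate, an increase of {percent_increase:.0f}%")
# prints: 2.54x the expected rate, an increase of 154%
```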

H17: The backlog of articles in the AfC queue will increase faster than expected.

H17 is supported. The analysis can be found in the February 21 work log. We estimate the size of the queue of AfC submissions and find that this estimate appears to largely follow the trends in the best available dataset. Comparing our estimate during ACTRIAL to similar time periods of 2015 and 2016, we find that there is on average a substantial increase in the size of the queue during ACTRIAL, while in 2015 and 2016 there was on average a decrease.

H18: The reasons for deleting articles will remain stable.

H18 is not supported. The analysis can be found in our February 22 work log. We find a significant reduction in deletions for the first two months of ACTRIAL, and this reduction is mainly driven by a reduction in speedy deletions. The proportions of usage are not stable; for example, there are increases in the usage of PROD and AfD during ACTRIAL.

H19: The reasons for deleting non-article pages will change towards those previously used for deletion of articles created by non-autoconfirmed users.

H19 is not supported. The analysis can be found in our February 22 work log. A key reason why H19 is not supported is that different namespaces have different conventions and policies for deleting pages, making it impossible to directly compare them. We find some increase in usage of speedy deletion reasons for copyright infringement and advertisements in the Draft namespace, but it is unclear whether this is a significant increase. The reasons for deleting pages in the User namespace appear largely unaffected by ACTRIAL.

RQ-Content quality

H20: The quality of articles entering the NPP queue will increase.

H20 is partially supported. The February 22 work log contains the analysis. We find a reduction in two indicators of quality issues: the proportion of article creations permanently deleted, and the proportion of creations not flagged as "OK" by ORES' draft quality model. For the third indicator of article quality, the average weighted sum of quality predicted by ORES' article quality model for articles flagged as "OK" by the draft quality model, we find that ACTRIAL has no effect.
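
A weighted sum of predicted quality can be sketched as follows. This is an illustration under stated assumptions and not the code behind the analysis: it assumes the article quality model returns one probability per English Wikipedia assessment class, and it assumes the weights 0 (Stub) through 5 (FA). The function names and the example probabilities are hypothetical.

```python
# Assumed weights per quality class, from lowest (Stub) to highest (FA).
WEIGHTS = {"Stub": 0, "Start": 1, "C": 2, "B": 3, "GA": 4, "FA": 5}


def weighted_quality(probabilities):
    """Weighted sum of one prediction: sum of class weight times class probability."""
    return sum(WEIGHTS[label] * p for label, p in probabilities.items())


def average_weighted_quality(predictions):
    """Average weighted sum across a collection of article predictions."""
    scores = [weighted_quality(p) for p in predictions]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical prediction for one article: equally likely to be Stub or Start.
example = {"Stub": 0.5, "Start": 0.5, "C": 0.0, "B": 0.0, "GA": 0.0, "FA": 0.0}
print(weighted_quality(example))  # 0.5
```

Collapsing the class probabilities into one number makes it possible to average quality across many articles and to compare time periods.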

H21: The quality of newly created articles after 30 days will be unchanged.

H21 is not supported. The analysis can be found in the March 9 work log. We find that articles that survive for at least 30 days on average see a small but positive change in quality during that time. This effect is consistent across our dataset going back to July 1, 2014. Our analysis finds that the magnitude of this change is not affected by ACTRIAL.

H22: The quality of articles entering the AfC queue will be unchanged.

H22 is partially supported. See the February 24 work log for the analysis. For two out of three quality indicators, we find no significant change during the first two months of ACTRIAL. We find a significant increase in the proportion of permanently deleted AfC submissions from the Draft namespace, a finding that does not support H22. The proportion of AfC submissions labelled "OK" by the draft quality model and the average weighted sum of quality predicted by ORES' article quality model see no significant change during ACTRIAL.

References

  1. Sara Drenner, Shilad Sen, and Loren Terveen. 2008. Crafting the initial user experience to achieve community goals. In Proceedings of RecSys.
  2. Andrew G. West, Jian Chang, Krishna Venkatasubramanian, Oleg Sokolsky, and Insup Lee, "Link Spamming Wikipedia for Profit", 8th Annual Collaboration, Electronic Messaging, Anti-Abuse, and Spam Conference, 152–161. September 2011.
  3. Benjamin Mako Hill and Aaron Shaw. 2015. Page Protection: Another Missing Dimension of Wikipedia Research. In Proceedings of OpenSym.
  4. Arazy, Ofer, Oded Nov, and Felipe Ortega. "The [Wikipedia] World is Not Flat: on the organizational structure of online production communities." (2014).
  5. Ofer Arazy, Felipe Ortega, Oded Nov, Lisa Yeo, and Adam Balila. 2015. Functional Roles and Career Paths in Wikipedia. In Proceedings of CSCW.
  6. Ashton Anderson, Daniel Huttenlocher, Jon Kleinberg, and Jure Leskovec. 2013. Steering user behavior with badges. In Proceedings of WWW.
  7. Jodi Schneider, Bluma S. Gelley, and Aaron Halfaker. 2014. Accept, decline, postpone: How newcomer productivity is reduced in English Wikipedia by pre-publication review. In Proceedings of OpenSym.
  8. User:Scottywong/Article_creation_stats
  9. User:MusikAnimal_(WMF)/NPP_analysis
  10. Warncke-Wang, M., Ranjan, V., Terveen, L., and Hecht, B. "Misalignment Between Supply and Demand of Quality Content in Peer Production Communities", ICWSM 2015. See also: Signpost/Research Newsletter coverage
  11. Shyong (Tony) K. Lam and John Riedl. 2009. Is Wikipedia growing a longer tail?. In Proceedings of GROUP.
  12. Research:Wikipedia article creation

See also