Grants:Project/nschwitter/The Role of Offline Ties of Wikipedians/Timeline

From Meta, a Wikimedia project coordination wiki

Timeline for nschwitter

Task | Days required | Time period (ca.)
Finish pre-processing and cleaning of data | 20 working days | October 2021
Conduct descriptive analysis of data | 10 working days | October 2021
Conduct inferential analysis of productivity and collaboration | 20 working days | November 2021
Write-up of results and creation of documentation for reproducible analysis of productivity and collaboration | 20 working days | December 2021
Conduct inferential analysis of reverting behaviour | 20 working days | January 2022
Write-up of results and creation of documentation for reproducible analysis of reverting behaviour | 20 working days | February 2022
Preparation of presentation of research at WikiWorkshop | 10 working days | March 2022 (date depends on CfP)
Conduct inferential analysis of voting behaviour and write-up of results | 20 working days | April 2022
Write-up of results and creation of documentation for reproducible analysis of voting behaviour | 20 working days | May 2022
Attendance at local meetup / offline community engagement | 5 working days | May 2022
Write-up of research as submittable PhD thesis and write-up of documents/guidelines for reproducible analysis | 30 working days | June 2022 - July 2022
Preparing data to make it openly sharable and researching repositories for data | 10 working days | June 2022 - July 2022
Spread outcomes to the Wikipedia/Wikimedia community | 20 working days | August 2022
Write-up of end report for WMF and preparation of academic research articles | 20 working days | September 2022

Monthly updates

Please prepare a brief project update each month, in a format of your choice, to share progress and learnings with the community along the way. Submit the link below as you complete each update.

October 2021

I have spent this month as planned, continuing and hopefully having finished data cleaning, and having started descriptive analyses. Data preparation was, of course, a much longer process than the 20 allotted days - it has been one of my main tasks for the past 12 months, and organising and re-organising the data is an ongoing task. But if the rule of thumb holds true that in data science, 80% of time is spent on cleaning the data, and only 20% of time is spent on creating insights, I should still be well in time :-)

I will also try to share a plot per monthly report. Today, I want to share the spatial distribution of meetups organised in the German Wikipedia.

Figure: Map of the location of meetups organised on the German Wikipedia. Most of the meetups took place in German-speaking countries, but meetups were held in 24 different countries in total.

I have recorded a total of 4408 meetups that took place and were organised through the German Wikipedia - excluding the very regular and often more formalised meetups at official Wikimedia community spaces like the WikiBaer in Berlin, as well as very large meetups with more than 50 people (due to my research questions, I want to assume that everyone had the chance to meet everyone else). I cover the time from the start of the German Wikipedia up until March 2020, when meetings came to a halt due to the outbreak of the COVID-19 pandemic. Naturally, most of the meetups organised by German Wikipedians took place in German-speaking countries: 89% of all meetups took place in Germany, 6% in Austria, 4% in Switzerland, and 0.02% in Liechtenstein.

Even though this captures around 99% of the meetups, the remaining percent took place in 20 different countries: Australia (5), Belgium (2), Canada (1), China (1), Czech Republic (4), Finland (6), France (3), Hungary (1), Italy (5), Japan (8), Mexico (1), the Netherlands (2), Poland (10), Slovakia (1), Slovenia (1), South Africa (1), Majorca in Spain (1), Sweden (2), the United Kingdom (6) and Ukraine (1). The German Wikipedians meet around the globe!
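As a quick check of the arithmetic, the counts listed for these 20 countries can be tallied and set against the 4408 recorded meetups. This is only a sketch using the figures quoted in this update; the variable names are illustrative.

```python
# Meetup counts outside the German-speaking countries, as listed in the text.
other_countries = {
    "Australia": 5, "Belgium": 2, "Canada": 1, "China": 1,
    "Czech Republic": 4, "Finland": 6, "France": 3, "Hungary": 1,
    "Italy": 5, "Japan": 8, "Mexico": 1, "Netherlands": 2,
    "Poland": 10, "Slovakia": 1, "Slovenia": 1, "South Africa": 1,
    "Spain": 1, "Sweden": 2, "United Kingdom": 6, "Ukraine": 1,
}
TOTAL_MEETUPS = 4408  # total number of recorded meetups

other_total = sum(other_countries.values())
other_share = 100 * other_total / TOTAL_MEETUPS

print(f"{len(other_countries)} countries, {other_total} meetups, "
      f"{other_share:.1f}% of all meetups")
# prints "20 countries, 62 meetups, 1.4% of all meetups"
```

The tally comes to 62 meetups, roughly 1.4% of the total, which fits "the remaining percent" given that the country shares are rounded.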

November 2021

November was spent as planned: I started to conduct the inferential analysis of productivity and collaboration. I have not yet finished the analysis, but I am planning to do so by next week. I have already started to write up the results and to set up a guideline document. So far, the preliminary results seem to suggest that Wikipedians tend to increase their editing activity after joining their first meetup, both compared to before and compared to a control group of similar Wikipedians. The effect is rather small, but consistent across different time frames. Overall though, the range in the change of behaviour is extremely large due to editors with a very high level of activity (in the past or present).

Figure: Change of edit activity of Wikipedians after joining their first time frame, depicted as a bar plot.
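The comparison described in this update (before versus after the first meetup, and attendees versus a control group) can be illustrated with a simple difference-in-differences calculation. This is a minimal sketch under stated assumptions: the update does not name the estimator actually used, the function names are hypothetical, and all numbers below are made up for illustration.

```python
from statistics import mean, median

def change_in_activity(records):
    """Per-editor change: edits after minus edits before."""
    return [r["after"] - r["before"] for r in records]

def difference_in_differences(attendees, controls, summary=mean):
    """Summarised change among attendees minus summarised change among controls."""
    return summary(change_in_activity(attendees)) - summary(change_in_activity(controls))

# Made-up edit counts (NOT project data); the third editor in each group
# is extremely active and dominates the mean.
attendees = [
    {"before": 10, "after": 14},
    {"before": 20, "after": 23},
    {"before": 5000, "after": 5900},
]
controls = [
    {"before": 12, "after": 13},
    {"before": 18, "after": 19},
    {"before": 4800, "after": 4500},
]

did_mean = difference_in_differences(attendees, controls, summary=mean)
did_median = difference_in_differences(attendees, controls, summary=median)
print(did_mean, did_median)
```

With these toy numbers the mean-based estimate is driven almost entirely by the two very active editors, while the median-based estimate stays small. This mirrors the remark that the range in the change of behaviour is extremely large.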

December 2021

After taking a workshop on advanced methods of data analysis at the beginning of the month, I finished the inferential analysis of productivity and collaboration behaviour. This has led to the first results, which will be discussed in the midpoint report. The write-up also stayed on track in December: I finished a draft version of the results (though I have no plot for this month's report). I have started the guideline to allow for reproducible analysis in other language versions, currently focusing on the process of data collection.

January 2022

January was not spent according to the timeline; instead, I worked on several different things. I solved some conceptual issues which affected both elections and norm-relevant behaviour. This month I also worked on two conference abstracts/papers to present my research. I submitted an abstract to a session of network researchers in Germany, which has been accepted, and I will present preliminary results on the topic of norm-relevant behaviour. In line with this, I have decided to work on norm-relevant behaviour before elections. A first figure is displayed below, describing the number of reverts per year. The figure does not (yet) take into account the volume of edits.

Figure: Bar plot showing reverts over time. The number of reverts increased in the early years of Wikipedia (up to 2007), decreased from 2012 on, and has since remained at a stable level. In particular, the number of reverts in which users revert IPs has fallen.

I have also started to prepare the paper to present at the WikiWorkshop 2022. The paper is receiving its finishing touches and I will submit it in the course of the month.

February 2022

This month has been spent well on track. I have made considerable progress on the write-up regarding reverting behaviour and should be able to finish it soon after receiving some more feedback. For the report, I also updated last month's figure to be more insightful: The figure now shows the proportion of edits reverted by contributor type, revealing for example that, on average, 17% of all edits made by IPs are subsequently reverted. Regarding offline meetings, I do find that those taking part in meetings do, on average, revert others more often, but there is no evidence that an ego's network density matters (this was testing a theoretical mechanism proposed by James Samuel Coleman).

Figure: Proportion of edits reverted by contributor type across the years in the German Wikipedia. Across the years, less than 1% of edits by bots are reverted, and around 2.5% of edits made by registered users (with an increase in the early years of Wikipedia and stable numbers since 2008). The proportion of reverted edits made by IPs increased up to 2010 (the peak year, when around 24% of edits made by IPs were reverted) and has decreased slightly since.
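A proportion of this kind can be computed by grouping edits by year and contributor type and dividing the reverted edits by all edits in each group. The sketch below assumes a hypothetical record layout (`year`, `contributor_type`, `reverted`) and uses made-up toy records; it is not the project's actual data or code.

```python
from collections import defaultdict

def reverted_share(edits):
    """Map (year, contributor_type) to the percentage of edits that were reverted."""
    totals = defaultdict(int)
    reverted = defaultdict(int)
    for edit in edits:
        key = (edit["year"], edit["contributor_type"])
        totals[key] += 1
        if edit["reverted"]:
            reverted[key] += 1
    return {key: 100 * reverted[key] / totals[key] for key in totals}

# Hypothetical toy records (NOT project data).
edits = [
    {"year": 2010, "contributor_type": "ip", "reverted": True},
    {"year": 2010, "contributor_type": "ip", "reverted": False},
    {"year": 2010, "contributor_type": "ip", "reverted": False},
    {"year": 2010, "contributor_type": "ip", "reverted": False},
    {"year": 2010, "contributor_type": "registered", "reverted": False},
    {"year": 2010, "contributor_type": "registered", "reverted": False},
    {"year": 2010, "contributor_type": "bot", "reverted": False},
]

shares = reverted_share(edits)
print(shares[(2010, "ip")])  # prints 25.0
```

Normalising by the number of edits in each group is what distinguishes this figure from a raw count of reverts per year, which does not account for the volume of edits.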

Besides the write-up, I have also submitted a paper for the WikiWorkshop 2022, started to prepare another conference presentation for next week, and resumed work on election behaviour.

Considering the epidemiological development, I'm also optimistic about potentially attending meetups in the summer.

March 2022

In March 2022, I attended a conference of German network scientists and presented my work on reverting behaviour. After receiving some feedback, I finished the write-up of the chapter. After this, I started to work on voting behaviour; I set up the data and began the preliminary analysis. For more general information, I have now published the midpoint review.

April 2022

This month, I have worked on the chapter on voting behaviour. After setting up the data, I have conducted the inferential analyses and written up the results. My PhD thesis is currently in a good state.

I have continued to work on writing the learning patterns to allow for reproducible analyses.

I have also looked into meetups taking place in my area. Most are (currently) inactive, but I hope to potentially attend a meetup in Frankfurt in July.

I have presented part of this project at the WikiWorkshop 2022 and have received good feedback and interest (thanks!).

May 2022

I had to be out of office for the majority of May but made good progress in the last week. I have finished editing my PhD thesis and I am currently waiting for feedback. After receiving the feedback and making adaptations, I will translate the results into a more community-friendly and accessible form to share them in the final report.

I will also be able to guest lecture on Wikipedia - the project as a whole as well as my research - in July as part of a module on digitalisation at the University of Bern. This module is part of a certificate of advanced studies aimed at professionals interested in the field of sustainable development.

I have published the learning pattern on how to collect data on offline meetups, describing my approach and the lessons learnt from the meetup data collection. I have now just started on a second learning pattern regarding the collection of election data.

June 2022

I received feedback on my PhD thesis draft this month and have thus spent the majority of my time since then on the rather unexciting task of editing, rewriting, and formatting (and will continue to do so for the next month). I have started gathering more accessible visualisation approaches and I am trying to see what works.

I have also researched potential repositories to make my data accessible after the project/as part of my publications and have decided on the Open Science Framework.

I have published the learning pattern on how to collect election data that I started last month. It shares my lessons learnt and a code example, and contains an extendable list of the data which has been previously collected. I have also written and published another learning pattern on how to analyse effects of offline meetups.

July 2022

I have spent the month as planned: I mainly continued rewriting and editing the thesis. It is currently off my hands again for another round of feedback.

At the beginning of the month, I guest lectured on Wikipedia at the University of Bern, as part of a module on digitalisation. The module is part of a certificate of advanced studies aimed at professionals interested in the field of sustainable development. It went very well; I talked about Wikipedia as a whole and gave some insights into my own research.

Further, I have prepared and cleaned my datasets to make them sharable. I also continued working on developing a publication strategy and started on a short article about my research for Der Kurier.

August 2022

This was another month of re-analysis, rewriting, and editing of the thesis to have it ready to submit in September. I also finished and published the article for Der Kurier. I have looked into other avenues for sharing the results, but most of them have been rather inactive (like the Wikipedianischer Salon and the meetup community in my current city).

I have further started preparing two scientific articles. One of them will focus on the meeting data collected and highlight its (sociological) potential, to make the data more accessible to researchers. The other article will focus on the effect of offline meetings on online election behaviour, as the analyses have revealed interesting patterns and network effects.

September 2022

Is your final report due but you need more time?

Extension request

New end date / final report submission date

December 31, 2022.


I would like to request an extension for the submission of my final report.

My project as a whole has been going as planned and I will be tying up all loose ends this month. After a busy year, I will then be travelling in October and November. While I have started working on the final report, I would like to ask for an extension so that I can finish it when I return to the office in December (and submit it by December 31 at the latest). This would allow me to write a more in-depth report of higher quality. If any information is urgent (for example the financial reporting), I am happy to provide this earlier.

Extension approved: New end date December 31, 2022

I'm approving your extension request for a new Final Report due date of December 31, 2022.

Warm regards, Marti (WMF) (talk) 16:11, 12 September 2022 (UTC)