Wikimedia Clinics/007

This is a digest (a processed, edited summary) of the online conference call Wikimedia Clinic #007, held on August 5th 2020. It sacrifices fidelity to people's exact words in favor of clarity, brevity, and digestibility.

Except for the introduction and the first topic, which are pre-scheduled, the topics are brought up by volunteers participating in the calls.

The call was attended by 2 members of Wikimedia Foundation staff and 15 volunteers.

Topic 1: Introduction

quick principles

  • listen with patience and respect
  • share your experience, but remember others' contexts are very diverse, and may not match yours.
  • be of service to other people on the call

These calls are a Friendly Space.

Purpose of Wikimedia Clinics

  • provide a channel to ask questions and collect feedback on one's own work and context
  • help direct people to appropriate resources across the Foundation and broader Wikimedia movement

If we can't answer your questions during the call, we (WMF) are committed to finding someone who can and connecting you with them (this may happen after the call).

Examples of things the Clinics are not the place for:

  • complaints about interpersonal behavior - there are appropriate channels for this on-wiki, and there is the Trust and Safety team.
  • content or policy disputes on specific wikis. But it is okay to seek advice on how to better present one's positions.

Topic 2: Statistics tools demonstrations

Presentation by Amir Sarabadani:

  • Commons itself, in the "page information" view, can show you the number of page-views a File: page has had in the last 30 days. But it's only for the last 30 days, and only for actual page views of the File: page, not counting the (potentially many more) views of the image when embedded in a wiki page (e.g. Wikipedia).
  • Mediaviews from WMF is better: it lets you select any date range, and it does count embedded views. However, you have to specify file names one by one.
  • I (Amir) built a tool called Mediaviews-in-Category, which reports views for all media files in a category. For example, it can show views of media files in Category:Vulpes vulpes crucigera in the past 90 days.
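
As a rough illustration of what a per-category aggregation involves, the sketch below sums per-file daily view counts over a date range. It is not the code of the Mediaviews-in-Category tool: the function name, the data layout, and the file names and numbers are all hypothetical.

```python
from datetime import date


def category_views(daily_views, start, end):
    """Sum views per file, and overall, for days in [start, end] inclusive.

    daily_views maps a file name to a {date: view_count} dict.
    This layout is an assumption made for the example.
    """
    per_file = {
        name: sum(n for day, n in counts.items() if start <= day <= end)
        for name, counts in daily_views.items()
    }
    return per_file, sum(per_file.values())


# Hypothetical sample data, not real statistics.
sample = {
    "File:Example_fox_1.jpg": {date(2020, 7, 1): 10, date(2020, 7, 2): 5},
    "File:Example_fox_2.jpg": {date(2020, 7, 2): 7, date(2020, 8, 1): 3},
}
per_file, total = category_views(sample, date(2020, 7, 1), date(2020, 7, 31))
print(per_file, total)
```

Keeping the per-file totals alongside the grand total makes it possible to see which files in a category account for most of the views.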

Discussion

  • volunteer: With respect to video views, does it count embedding of the video thumbnail, or actual clicks to play the video?
    • Amir: If someone clicks the play button, then it's counted.
  • Asaf (WMF): The thumbnail is counted as zero?
    • Amir to double check.
  • Anton Protsiuk (Wikimedia Ukraine): Is there a page on Meta or any other text version of this?
    • Amir: not yet!
    • Asaf: Let's try to make sure it's documented on Meta. I volunteer to ensure this happens.
  • Asaf: I have also asked Amir for an "ego mode", which would measure views of the media files a particular user uploaded, just like the Userviews tool, which is a sister tool of the Mediaviews tool. I often use Userviews in outreach, when I show people Wikipedia, to demonstrate the impact of our work.
    • Amir: I can get this done soon, as it isn't that hard.
  • Mehrdad (WMF): We were invited to collaborate with UNESCO on how young people access sexual health information. Using this tool, they found that their video was viewed a lot on Wikimedia. It encouraged them to release additional content under a free license. They found the statistics on how often the information is accessed convincing.

Bonus demonstration of stylometric analysis to uncover sock puppets

[Figures: chart 1 and chart 2]

Amir: I have also been working on a tool that analyzes language-use patterns to automatically uncover potential sock puppets. The two charts are examples: one compares two known sock puppets, and the other compares two users known to be distinct people.
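
The call did not describe the method the tool uses. Purely as a generic illustration of stylometric comparison, the sketch below uses one common textbook technique: build character n-gram frequency profiles for two texts and compare them with cosine similarity. The function names and the sample sentences are hypothetical, and a high score on its own does not show that two accounts belong to the same person.

```python
from collections import Counter
from math import sqrt


def ngram_profile(text, n=3):
    """Count overlapping character n-grams in lowercased, whitespace-normalized text."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram profiles, from 0.0 to 1.0."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical sample texts, invented for this example.
text_a = "I reverted the edit because it was unsourced."
text_b = "Reverted; the claim was unsourced, please cite a source."
score = cosine_similarity(ngram_profile(text_a), ngram_profile(text_b))
print(round(score, 3))
```

In practice such a score would be computed over much larger samples of each user's writing and validated against known cases before drawing any conclusion.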

  • volunteer 1: This would be so useful for picking up undisclosed paid editors!
  • Amir: I have been talking to English Wikipedia's checkusers about this. Running this on enwiki is really hard, as it's huge.
  • volunteer 2: so let's run it on smaller ones!
  • volunteer 1: I imagine it would indeed be hard on English Wikipedia. One would likely want to simply run it on pairs of suspected socks initially.
  • Amir: We cannot make this a public tool. Advances in technology such as this can put people at risk. People living under repressive regimes could be outed despite no wiki wrongdoing, and could get in trouble. Therefore, this tool won't be made public.

Topic 3: What is "Wikimedia research"

volunteer: What makes research research? When I run some SQL query, it's not "research". So when does some data-retrieval become "research"?

Discussion

  • Amir: Traditionally, "research" means the study has been peer-reviewed, accepted, and published. Research is not only theoretical: Some research is on application and some on tools. Or on how much time people spend on wiki, for example.
  • Asaf (WMF): Adding some generalizations to Amir's answer to "When does a SPARQL query become research?"
    • Academic research happens within an academic discipline, in which academic peers judge your work. You need to define a research question, and justify why it is a valid research question, discuss your methodology, conduct the study, create control groups for comparative studies, etc. To take Amir's example of an algorithm that could identify sock puppets, you'd need to prove that the outcome that the algorithm yields indeed correlates very well with correct judgments of whether two users are the same person or not.
    • We can run a database query to find out what users joined last month. For research, we need a research question, for example, to ask what are the factors affecting new user retention, or how retention changes based on certain factors (duly controlled for).
    • People need research and technical expertise to conduct wiki research and pick good subjects for research. There are teams of people, some bringing academic rigor and a particular discipline (e.g. computer science, management studies, psychology, gender studies, etc.), and others bringing the technical skills to perform the research. There is room for non-academics to collaborate with academics and help them improve their research. Some existing studies had poor (misguided) research questions, or poorly interpreted data (for instance, researchers not deeply familiar with how wikis technically work may not realize some "pages" are not pages but redirects, or that some revisions are deleted revisions), and could benefit from involving experienced Wikimedians with a good technical understanding of the wikis.
  • volunteer 1: It is important to engage academics. I remember a report published a few years back, from which I learned that most of the people studying my wiki were not people I knew as contributors to that wiki.
  • volunteer 2:
    • wiki research is usually part of a specific discipline. There is no "wiki discipline" proper.
    • I think it would be really great to work more in an interdisciplinary context. Sometimes I see wiki research with an interesting outcome - but then I think: okay, you have produced data, but the interesting questions just begin there. But the authors of the study are not necessarily interested in the further questions, because they are outside their discipline.
    • I have tried to make a sketch of what a discipline of its own called "wiki studies" might look like
    • We have some 200-300 papers every year about wikis. But there is also relevant research without "wiki" in the title, or without a bibliographical link. For example, a book about "online creation communities" usually has a big chapter about wikis or Wikipedia.
    • My impression is that the humanities are a little bit under-represented. For many researchers, the fun of using Wikipedia as an object is that it is a large piece of data that can be used under a free license. The humanities ask questions like: what to do with this large piece of data, and how to understand the data. Understanding is the basic task of the humanities, as Max Weber said, in contrast to the natural sciences, which look for systematics and regularities.
  • Asaf (WMF):
    • The WMF Research team conducts its own research, but also facilitates other people's research, and encourages such research by outreach, e.g. to young researchers in relevant university departments.
    • The Research:Index page on Meta is the central hub for all things research. There is a newsletter that reviews and reports on wiki research. It is published on Meta and is archived there too. The Research:Index page has a search box for searching specifically within the research newsletter archive.
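
The distinction drawn in this discussion, between retrieving data and asking a research question, can be sketched in a few lines. In the example below, `joined_in` is plain data retrieval (who registered last month), while comparing retention between two groups is the first step toward a research question. The records, the "welcome message" factor, and all names are hypothetical, and a real study would also need a justified question, a sound methodology, and proper controls.

```python
from datetime import date

# Hypothetical account records:
# (user, registration date, got_welcome_message, still_active_after_30_days)
accounts = [
    ("A", date(2020, 7, 3), True, True),
    ("B", date(2020, 7, 9), True, False),
    ("C", date(2020, 7, 15), False, False),
    ("D", date(2020, 7, 21), False, True),
    ("E", date(2020, 6, 30), False, False),
    ("F", date(2020, 7, 28), True, True),
]


def joined_in(records, year, month):
    """Plain data retrieval: which accounts registered in the given month?"""
    return [r for r in records if (r[1].year, r[1].month) == (year, month)]


def retention_rate(records):
    """Share of accounts still active after 30 days (0.0 for an empty group)."""
    return sum(r[3] for r in records) / len(records) if records else 0.0


july = joined_in(accounts, 2020, 7)
treated = [r for r in july if r[2]]      # received the hypothetical factor
control = [r for r in july if not r[2]]  # comparison group
print(len(july), retention_rate(treated), retention_rate(control))
```

With so few records the difference between the two rates means nothing; the point is only the shape of the comparison.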

Topic 4: In person events

  • volunteer: Have there already been real life meetings in spite of the COVID-19 pandemic?
    • Asaf (WMF): WMF has imposed a ban on in-person meetings using WMF funds. This includes affiliates that are funded by WMF. So even affiliates who have money available cannot use WMF resources to organize in-person events. However, the Foundation doesn't have the power to compel affiliates or individuals beyond funding restrictions, so some can organize meetups on their own. We assume some have done that.
    • volunteer 2: In New Zealand, where the pandemic was eradicated, they did have a meet-up.
    • volunteer 3: If there were some funds needed for logistical support to a virtual meeting, would it be okay to re-purpose existing funds for those expenses?
      • Asaf (WMF): WMF is open to discussing re-purposing funds. One would need to discuss this with one's grant program officer. We are generally happy to approve such requests. For example, the planned Central and Eastern Europe (CEE) Meeting 2020 regional conference is being organized as an online event now.
    • volunteer 4: There has been a discussion in German Wikipedia about a planned in-person meeting for the annual photo competition jury.
  • (post-call follow-up) See this update about a coming update from the Foundation.

Topic 5: Wikimedia in Mainland China

Pandemic situation in China for Wikimedians

[Photo: Meetup in Hangzhou, 2020]

A Wikipedian from mainland China reported on the pandemic situation:

  • The pandemic is pretty much over in China right now. We had a meetup in Hangzhou recently.
    • Asaf (WMF): The news about the "social credit" system in China suggests people are penalised for engaging in "anti-social" activities. Have Wikimedians, to your knowledge, ever encountered any such issues due to their involvement with Wikimedia?
      • Chinese volunteer: The reports about the Chinese "social credit" system are completely overblown. It's little more than an equivalent of the American "credit score". Police never bothered me about my Wikipedia activity.
      • Re safety: I have personally conducted 5-6 meetings in China. I have never encountered any issues. We did this before Wikipedia was blocked and continue doing this now. In 2013-2014 in Shanghai, the best and most attended Wikimedia meetups were organized. They helped develop local communities. We faced no issues for organizing those events. There are no such concerns. Nobody stopped us from editing Wikipedia.

Access to Wikimedia wikis in China: overview and current issues

The volunteer then proceeded to share an overview of the Wikimedia situation in China, with an emphasis on access and censorship:

  • Chinese Wikipedia started early, in 2002. The first meetup was held in Beijing. Most of the active ZHWP admins are from mainland China (13 of the top 20 active admins).
  • During the initial blocking of ZHWP in mainland China (since 2004-5), w:Baidu Baike became a competitor to Chinese Wikipedia, attracting a lot of potential Wikipedia readers and contributors. In 2008 the block was eased, before the Olympics, as part of China's promise to the International Olympic Committee.
  • The vast majority of Chinese Wikipedia had been accessible to the public, with some specific articles still blocked (e.g. politically sensitive topics like Tibet and the Tiananmen Square protests).
    • Every anniversary of Tiananmen (June 4th) attracts attention, so in late May 2015 ZHWP was blocked again. In early 2018 several other languages were also blocked (Cantonese, Japanese). In June 2019, English and all other Wikipedias were fully blocked. In December 2019 the major public IPv4 addresses of Wikimedia servers were blocked.
  • Since 2015 the government has been using "DNS Poisoning" to capture and redirect people trying to reach Wikipedia. This was easy to avoid for technical users (by editing their local "hosts" file).
  • In 2018 the firewall was upgraded. By monitoring the SNI field in TLS handshakes, they can see the hostname (zh.wikipedia.org) you are accessing, even though they can't see the specific article title.
  • Now people are forced to use VPNs. But those are very unreliable in China (so people use proxy protocols other than VPNs). Furthermore, VPN users run afoul of "open proxy" blocks on most wikis.
  • Here is a series of issues in coping with the access restrictions on Wikipedia in China.
    • IP Block exempt: When an admin blocks an IP address, they have the option of also blocking logged-in users using that IP. When admins block logged-in users as well, they make it difficult for Chinese users using proxies to contribute. This is a huge barrier to newbies wanting to join Wikipedia. To work around this, we (Chinese Wikipedia admins) are granting the IP block exempt privilege liberally, to practically everyone who asks. But it still depends on people knowing about this and asking for help (often on off-wiki channels). We use a mailing list to help people with these access problems. That's not very good, as mailing lists are not well organized or suited to handling "service tickets". But we don't use OTRS. Oftentimes we find months-old emails that were left unattended.
    • IP Exemptions for other projects: to contribute from China to other projects beyond Chinese Wikipedia (e.g. English Wikipedia) using an open proxy, one has to apply for IP block exemption in each project separately. I have looked up the number of applications for IP block exemptions on ENWP, and there were only 27 applications in 2019, almost all of them from mainland China.
    • A global IP block exemption won't override a local project's IP or IP-range block, so you'd still need to apply for local exemption as well. It is all very discouraging.
    • We (on Chinese Wikipedia) stopped renewing range blocks, and we are trying to unblock IP ranges that were auto-blocked by a bot. We then compared new account creations to what they had been before the unblocks, and they had increased threefold. We had been barring ~65% of new accounts! We have to do all this tedious work manually.
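
To illustrate the SNI point made earlier in this list: in a standard TLS handshake (without Encrypted Client Hello), the hostname is sent in the clear as part of the ClientHello, while the URL path is only sent later, inside the encrypted connection. The small sketch below splits a URL into those two parts. The function name and the example URL are hypothetical, and the code only models what is visible; it does not inspect any network traffic.

```python
from urllib.parse import urlsplit


def observer_view(url):
    """Split an HTTPS URL into the part SNI exposes and the part TLS encrypts."""
    parts = urlsplit(url)
    return {
        # Sent unencrypted in the TLS ClientHello (SNI extension).
        "visible_hostname": parts.hostname,
        # Sent only inside the encrypted channel, as part of the HTTP request.
        "encrypted_path": parts.path,
    }


view = observer_view("https://zh.wikipedia.org/wiki/Example")
print(view)
```

This is why hostname-based blocking affects a whole wiki at once rather than individual articles.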

Discussion

  • a volunteer steward:
    • We stewards come across applications for exemptions on Meta, and we receive requests in the stewards' email queue. We give access to everyone who asks, but not many people ask, as they need to go to Meta to submit requests.
    • Importantly, global blocks don't work on Meta, leaving the option to request exemption from them.
    • But it is correct that Global IP block exemption does not cover local blocks, and some wikis locally block open proxy IPs proactively (while for global blocks we only do that reactively in response to vandalism from certain ranges).
    • This decision, not to include exemption from local blocks as part of GIPBE, was made more than 10 years ago, to let local communities have more control over which IPs they block, and would be technically easy to change (a couple of clicks by stewards in the UI). But to change it, we must have a global RFC on Meta on whether to extend the global flag to imply local exemption as well.
  • Chinese Wikipedia volunteer: We (ZHWP) have our own idea: to set up our own (WMF) mirror site of ZHWP. By changing the domain name and IP, it would sidestep the Great Firewall. Many Chinese universities have created their own mirrors of Wikipedia. They use OAuth-based login to restrict access to their intranet version, through which one can search Google and read Wikipedia.
    • Asaf (WMF): WMF is interested in any reasonable and lawful step that would support access to knowledge in China. If you are interested in pursuing this, let's discuss further after the call, and I will put you in touch with the WMF staff with whom you can discuss the technical and governance issues of any proposed changes.
    • Chinese Wikipedia volunteer: We really want someone to help us. We don't need grants support, so we didn't know whom to contact! Thank you so much for your support.
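
The block and exemption rules described in this topic can be summarized as a small decision function. This is a simplified model built only from what was said on the call, not MediaWiki's actual implementation: the class and function names are hypothetical, every global block is assumed to apply to logged-in users, and `192.0.2.10` is a placeholder address from the documentation range.

```python
from dataclasses import dataclass, field


@dataclass
class Wiki:
    name: str
    # Local IP blocks that also apply to logged-in users.
    local_hard_blocked_ips: set = field(default_factory=set)
    local_exempt_users: set = field(default_factory=set)


@dataclass
class GlobalState:
    globally_blocked_ips: set = field(default_factory=set)
    globally_exempt_users: set = field(default_factory=set)


def can_edit_logged_in(user, ip, wiki, g):
    """Return True if a logged-in user on this IP can edit the given wiki."""
    # Global blocks do not apply on Meta, so users can request exemptions there.
    if wiki.name != "meta" and ip in g.globally_blocked_ips:
        if user not in g.globally_exempt_users:
            return False
    # A global exemption does not override a local block; only a local one does.
    if ip in wiki.local_hard_blocked_ips and user not in wiki.local_exempt_users:
        return False
    return True


proxy = "192.0.2.10"
g = GlobalState({proxy}, {"Alice"})
zhwiki = Wiki("zhwiki", {proxy}, {"Alice"})  # Alice holds a local exemption here
enwiki = Wiki("enwiki", {proxy})             # no local exemption for Alice
meta = Wiki("meta")

print(can_edit_logged_in("Alice", proxy, zhwiki, g))  # global and local exemption
print(can_edit_logged_in("Alice", proxy, enwiki, g))  # global exemption only
print(can_edit_logged_in("Bob", proxy, meta, g))      # no exemptions, on Meta
```

The second case is the one the discussion calls discouraging: a user with a global exemption is still stopped by each wiki's local block until a separate local exemption is granted.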

Topic 6: Technical feedback on Jitsi

  • Some attendees had audio problems (audio not working).
    • Reconnecting did help with the audio.
  • Chrome or Chromium works better than Firefox for Jitsi.
  • The Jitsi app has better audio.