RuWiki History (Doronina and Pinchuk)/English/Methods

Study of the Russian Wikipedia's community history

Methods

Because this was a pilot project with no existing methodological background, we experimented with a variety of techniques, some of which were more productive than others. A list of our approaches:

  • Examined the evolution of the main page (Заглавная страница) for the first year of ru.wiki’s existence
  • Examined early articles and compared them to contemporaneous articles on English Wikipedia
  • Based on registration date, identified and tracked the first 50 contributors to ru.wiki -- their contribution history, their user pages, their discussion on their own and others’ talk pages, etc.
  • Read all ArbCom cases and decisions and all rules and regulations pages, and compared them to those of the English, French, and Ukrainian Wikipedias
  • Identified and tracked “key players” -- active Wikipedians who made a major contribution to the project, either as article editors or “metapedians”
  • Identified key conflicts and their repercussions in the social fabric of the community, including analysis of edit numbers during major changes in the community
  • Examined off-wiki resources like Wikireality and Tradition, as well as analogous English sites (Encyclopedia Dramatica)
  • Created an on-wiki survey with a set of questions aimed at eliciting basic background information and more complex self-reflection from those users who chose to respond
  • Created targeted questions for first contributors and “key players”
  • Generated basic quantitative information -- graphs of editor and article counts, visualizations of individual editors’ contribution histories, lists of editors by year of entry into the project
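Two of the steps above (finding the earliest contributors by registration date and listing editors by year of entry) can be sketched in a few lines of Python. This is only an illustration, not the procedure we actually followed: the record layout (a `name` and an ISO 8601 `registration` timestamp), the function names, and the sample users are all hypothetical, and accounts with no recorded registration date are simply skipped.

```python
from collections import defaultdict
from datetime import datetime


def parse_ts(ts):
    """Parse an ISO 8601 timestamp such as '2002-11-07T12:00:00Z'."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def first_contributors(users, n=50):
    """Return the n users with the earliest recorded registration date."""
    dated = [u for u in users if u.get("registration")]
    return sorted(dated, key=lambda u: parse_ts(u["registration"]))[:n]


def editors_by_entry_year(users):
    """Group user names by the year in which they registered."""
    by_year = defaultdict(list)
    for u in users:
        if u.get("registration"):
            by_year[parse_ts(u["registration"]).year].append(u["name"])
    return dict(sorted(by_year.items()))


# Hypothetical sample data; real records would come from the wiki itself.
SAMPLE_USERS = [
    {"name": "ExampleC", "registration": "2004-03-15T09:30:00Z"},
    {"name": "ExampleA", "registration": "2002-11-07T12:00:00Z"},
    {"name": "ExampleD", "registration": None},  # no date recorded
    {"name": "ExampleB", "registration": "2003-01-20T18:45:00Z"},
]

if __name__ == "__main__":
    for user in first_contributors(SAMPLE_USERS, n=2):
        print(user["name"], user["registration"])
    print(editors_by_entry_year(SAMPLE_USERS))
```

Skipping undated accounts is a simplification; for a real project, the earliest edit of each account may be a better proxy for the date of entry.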

Comments on our approach

Wikipedia, including the main article space as well as the discussion and project spaces, consists of millions of pages edited by thousands of editors and governed by hundreds of official and unofficial rules and regulations. Studying the history of one or many Wikipedia communities presents a number of challenges, but perhaps the biggest is where to start.

Scholars have used a variety of methods to tackle the problem of how to study a Wikipedia. Most academic studies have attempted to quantify the vast amount of information, making it easier to process and categorize. But inevitably such research is full of technical terms that are hard for a layman to understand, and it does not convey what it is like to be a Wikipedian or how the community develops.

Because we had a limited amount of time for research and did not want to get bogged down in technical terms or statistics, we chose a different approach. Our own preliminary study, while making use of available data and statistics, has been geared more towards individual narratives from the community. Early on, we realized that these narratives are largely contained in the discussion pages and edit histories of users, and one of our major tasks was identifying “key players” in the formation and continued evolution of the Russian Wikipedia community, and then reading through their contributions. Because prominent Wikipedians often make tens of thousands of edits, this is a fairly time-consuming task, but it produces the most complete and engaging portrait of the community members: their interests, their struggles, their frustrations and successes.

The main page

A logical starting point for this kind of project is examining the history of the Main page of the project, because this page gives a date and feel to the start of the project and may influence the direction it takes. Studying early editors -- what and how they edit, how long they stay in Wikipedia, and what some of them achieve -- was also very informative. We recommend finding the first 25 editors and the pages they created (see caveat about studying pages, discussed below) and compiling this information.

In-wiki resources

One of the main in-wiki resources is the history of mainspace articles, which provides insight not only into who edited pages, but also into which pages were controversial and which topics caused edit wars. From this starting point, we moved on to the contribution history of individual editors, analyzing both “key players” and less prominent users, and their contributions to both “major” and “minor” pages (e.g. articles, talk pages, forums, and meta-discussion).
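One rough way to flag potentially controversial pages from their revision histories is to count "identity reverts", that is, revisions whose content hash matches an earlier revision of the same page. The sketch below assumes each revision is a dict with a `sha1` field, listed oldest first; the page titles, hashes, and function names are hypothetical. It is a heuristic for narrowing down what to read, not the method we used, which was to read the histories by hand.

```python
def count_identity_reverts(revisions):
    """Count revisions that restore an earlier version of the page.

    `revisions` is ordered oldest first. A revision counts as a revert when
    its hash was seen before and differs from the immediately preceding
    revision (so edits that change nothing are not counted).
    """
    seen = set()
    previous = None
    reverts = 0
    for rev in revisions:
        digest = rev["sha1"]
        if digest in seen and digest != previous:
            reverts += 1
        seen.add(digest)
        previous = digest
    return reverts


def rank_pages_by_reverts(histories):
    """Return (title, revert_count) pairs, most reverted page first."""
    counts = {title: count_identity_reverts(revs)
              for title, revs in histories.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical histories: short strings stand in for real content hashes.
SAMPLE_HISTORIES = {
    "Quiet page": [{"sha1": h} for h in ["a", "b", "c", "d"]],
    "Disputed page": [{"sha1": h} for h in ["a", "b", "a", "b", "a", "c"]],
}

if __name__ == "__main__":
    for title, n_reverts in rank_pages_by_reverts(SAMPLE_HISTORIES):
        print(title, n_reverts)
```

A high count only suggests where to look: partial reverts and rewording disputes leave no matching hashes, so talk pages and edit summaries still have to be read.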

In addition to looking at articles and users, we also utilized some of the preexisting in-wiki analytics pages. Most projects accumulate some sort of statistics pages, and some projects have pages specifically dedicated to the history of a project, for example, French Wikipedia (fr:Wikipédia:Historique de Wikipédia en français), though these often take the form of a schematic timeline of milestones and not a narrative. Pages with links to national press coverage (see interwiki in en:Wikipedia:Press coverage) are useful for determining the events that were of interest to the outside world; however, press coverage, at least in the case of the English Wikipedia, tends to be biased toward scandals (e.g., inaccuracies in articles, pornography, editors misrepresenting themselves as experts, etc.).

  • To test ideas against hard data, you can use the general statistics page.
  • To find detailed statistics on individual users and pages, you can use, for example, http://[abbreviation of your project here].wikichecker.com/
  • Some information about users can be found through the tools at the bottom of the Contributions page.
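Contribution histories can also be retrieved programmatically through the MediaWiki web API, using the `usercontribs` list of the `query` action. The sketch below only builds the request URL and sends nothing over the network. The function name, the user name, and the choice of returned fields are illustrative assumptions, and it assumes the project is a Wikipedia hosted at `[language code].wikipedia.org`.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def contribs_url(project, user, limit=50):
    """Build a MediaWiki API URL listing a user's most recent contributions.

    `project` is the language code of the Wikipedia, for example "ru".
    """
    params = {
        "action": "query",
        "list": "usercontribs",
        "ucuser": user,
        "uclimit": limit,
        "ucprop": "title|timestamp|comment",
        "format": "json",
    }
    return f"https://{project}.wikipedia.org/w/api.php?" + urlencode(params)


if __name__ == "__main__":
    url = contribs_url("ru", "Example user", limit=10)  # hypothetical user
    print(url)
    # Decode the query string again to check what would be requested.
    print(parse_qs(urlparse(url).query))
```

Fetching the URL with any HTTP client returns JSON; longer histories are split across several responses and have to be requested in batches.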

Pages

During our study, we experimented with focusing on pages rather than users. Specifically, we attempted to compare the development of the Russian and English Wikipedias by comparing articles at certain points in ru.wiki’s development [1]. The results of this experiment were mixed: web archives for en.wiki and especially ru.wiki are not very accurate or thorough, and we quickly discovered that comparing the two projects through pages was like comparing apples to oranges -- first, because ru.wiki has yet to catch up with en.wiki in size, and second, because the kinds of articles written in the two encyclopedias differ. Taking random samples of pages from one Wikipedia and attempting to find analogous articles in another is problematic because Wikipedias are culturally specific, and the presence or absence of an article in any given Wikipedia may not be a testament to quality, but rather to this cultural specificity.

Users

In studying the users of the Russian Wikipedia, we attempted to gather as much information as we could on both their in- and off-wiki activity. Many users list personal information on their Wikipedia user pages in the form of short bios or user boxes, and with these we were able to determine a great deal about many of the “key players,” including: age, profession, place of residence, as well as personal and professional interests.

While some users are comfortable with Internet exposure and use their real names as their Wikipedia handles, others are intensely private and become hostile when their real-world identities are discussed in connection with their identities on Wikipedia. Whenever we could, we attempted to avoid linking to any discussion or page that revealed the real names of users who wished to remain anonymous. Because our archival research was oriented mostly toward “key players,” we attempted to balance out our user analysis by posting an on-wiki survey to the ru.wiki Forum (Village Pump) that could be filled out by any user. About half of those who responded indicated that they would allow us to publish their results on Meta, and from these we chose a mix of metapedians and authors, experienced editors and frustrated newbies, admins/arbiters and banned users. We sent targeted follow-up questions to some of these users, as well as to some of our “key players.”

Unfortunately, we received hardly any replies to the emails we sent to the first RuWiki editors or to the "pure authors"; the latter seemed unwilling to engage in any contact that did not concern the articles.

Our study of users had two prominent limitations:

  1. We focused only on registered users, thus not addressing the contribution of anonymous editors (or registered users who also edit anonymously).
  2. We focused mainly on “major” Wikipedians -- those with extensive, interesting edit histories and a record of participating in community discussion; according to some studies, such users are more likely to stay long-term on the site [2]. While some of the users we looked at did not fall into this category, for the most part our attention was focused on these “metapedians.” This may seem unfair to “the authors” who work mainly on articles and avoid engaging in debate or conflict, but our story is only one of the possible ways to describe the community, and there are certainly other “human interest” approaches, such as describing what makes the most prolific authors tick.

Time was one of the factors that contributed to these limitations. Because we were only allotted eight weeks for research, we could barely scratch the surface of a major Wikipedia like ru.wiki. A longer study is necessary to delve deeper into the more nuanced workings of the community, but a long-term approach also has the disadvantage of becoming outdated even before it is published, because community dynamics tend to change so quickly.

Rules

The rules of Wikipedia can be daunting even to the most experienced Wikipedian. It takes a great deal of time to figure out the way that rules, guidelines, and unofficial community practices have developed over the years. One of the methods we followed was reading through the history of all the ArbCom cases: in the Russian Wikipedia, these have often precipitated major policy changes. The other was to read through community votes and discussions about the rules.

However, not every wiki community has an ArbCom, and where one exists, as in the Russian Wikipedia, it may not be responsible for making policy, only for clarifying or interpreting it. Reading the history of rules pages and following the edits of prominent wiki “legislators” gives a better idea of how certain rules took shape. In addition to the official rules, there are a number of unofficial practices, some of which are documented in on-wiki essays or voiced during ArbCom elections, and it is helpful to examine all of these resources to get an idea of the actual practices, not just the official rules, that govern behavior on Wikipedia.

Off-wiki resources

Well-developed Wikipedia communities usually spawn a number of associated web resources, such as collective and individual blogs, forums, and wikis. The Russian Wikipedia and its members are written about in several such independent projects, some of which have mixed goals and some of which are completely dedicated to the community. Unfortunately, the people who gravitate toward these projects tend to work on them because they were blocked in Wikipedia or left the project for personal reasons, and often they vent their frustration by creating insulting and/or harassing attack pages. As a result, while these off-wiki resources may be useful for finding links to archived material and primary sources, they are not reliable as a source for unbiased interpretation.

If the project, like the English Wikipedia, has public Wikipedia-related mailing lists, these can also be a valuable source of information on current and ongoing debates in the community [3].


Conflicts

One of the major issues in writing the history of a Wikipedia is deciding how much weight to give to conflicts. Given that our preliminary study focused mostly on metapedians, who tend to be lively participants in any discussion or debate, stories of conflict were a major part of our historical narrative. This is not necessarily the only kind of narrative, of course -- the majority of users do not participate in major conflicts or policy debates -- but it is, unfortunately, the easiest story to tell. A longer, more detailed study would most likely produce a more accurate representation of ordinary users’ activity in the community.

References

  1. van Dijk, Ziko. (2009): “Wikipedia and lesser-resourced languages.” In: Language Problems and Language Planning 33 (no. 3, fall), pp. 234-255. (highlights common methodological problems in assessing and comparing smaller wikis to larger ones)
  2. Panciera, Katherine et al. “Wikipedians Are Born, Not Made: A Study of Power Editors on Wikipedia.” Proceedings of GROUP 2009 (May 10-13). (formulates a theory of “power users” on English Wikipedia, who appear to be naturally suited to the wiki-environment)
  3. Reagle, Joseph. Good Faith Collaboration: The Culture of Wikipedia. (specific insights into the social fabric and dynamics of the English Wikipedia)

Literature

Lih, Andrew. The Wikipedia Revolution: How a Bunch of Nobodies Created the World's Greatest Encyclopedia (a comprehensive yet accessible account of the early Wikipedia history)

Shachaf, Pnina and Noriko Hara. (2010) “Beyond vandalism: Wikipedia trolls.” Journal of Information Science 36, pp. 357-370. (insight into psychology of trolls on Hebrew Wikipedia)

Pfeil, Ulrike et al. “Cultural Differences in Collaborative Authoring of Wikipedia.” Journal of Computer-Mediated Communication 12 (2006) 88–113. (proposes methodology for comparing different language Wikipedias; discusses the Hofstede Index as a metric for assessing cultural differences)

Kittur, Aniket et al. “Power of the Few vs. Wisdom of the Crowd: Wikipedia and the Rise of the Bourgeoisie.” (formulates a theory of “elite” versus “common” users in the English Wikipedia, suggests gradual shift in workload from former to latter over time)
