Research talk:Contribution Taxonomy Project


Taxonomy feedback

Multilingual work should get more of a nod than small/dying languages and interwiki:translation. There's a tremendous amount of this going on.

Software and MediaWiki development -- and the development of the growing number of non-MediaWiki tools we use and rely on -- is central to the projects. It deserves its own top-level section.

Bot and scripted work accounts for a tremendous proportion of all project edits and countervandalism today; it deserves a larger nod, and perhaps a breakdown into the classes of such work (just as editorial work has - cleanup, license checking, patrolling, interwikis, &c - check the tasks the most active bots do).

Finally, take a look at the list of initiatives for related brainstorming about roles and opportunities for contribution. SJ · talk | translate 01:23, 12 August 2010 (UTC)

Thanks so much for the suggestions SJ. Much appreciated. I could think of software localization as one important thing we missed that's going on among those working multilingually in Wikimedia, but any specific names of activities would be even more helpful. Feel free to edit the table below the mindmap and add them!
You're right that software development should be more prominent, but I'd also like to stress that we don't think of the mindmap or table as hierarchical at all. Part of the goal of the project is to look at all contributions and give them equal weight in determining activity levels (versus only those that lend themselves to barnstars or high edit counts). The only reason some have greater branches off of them is that we felt the need to break some activities down more to get at the nuances.
I will definitely take a look at the initiatives and how we can break down bots and scripted volunteer work too. Thanks again, Steven Walling (talk) 06:08, 12 August 2010 (UTC)

Hey SJ, thanks a lot for the comments. Like Steven said, the list is not in any way final; it's just a way to see how contributions are added to the projects or the community in general. In no way does it seek to judge 'this is more important than this'. We just want to find a way to seek out and recognize anybody who's involved and contributes in some way other than just 'editcount'. DamianFinol 16:45, 12 August 2010 (UTC)

Duplication of names

There are two descriptions for the same name. Maybe you want to add Usage Research - developed world?

  • Usage Researcher - Active researcher on usage
  • Usage Researcher - Active researcher on developed world usage

Jodi.a.schneider 23:40, 12 August 2010 (UTC)

That is the way it is because there is a usage research header (for general work) and the two subheadings of usage for the developed world and global south. If you look at the mindmap, you'll get a better idea of how it's not a duplicate. Steven Walling (talk) 17:45, 13 August 2010 (UTC)
Exactly--there should be a subheading for the developed world. Right now, "Usage researcher" is given twice without subheadings, and a third time *with* a subheading:

  • Usage Researcher - Active researcher on usage
  • Usage Researcher - Active researcher on developed world usage
  • Usage researcher – ‘global south’ - Active researcher on global south usage

So I think you want to rename the middle one here to "Usage researcher - 'developed world'" (or something similar). Does that make sense?

Now that I found "Research" in the mindmap again, "development" seems like a strange superheading; much research has little to do with programming, and only some research is usability-related. Jodi.a.schneider 15:37, 14 August 2010 (UTC)

Chapter categorization

I'm puzzled about this one: European and 'rest of world' are distinguished. Unless you're specifically targeting Europeans, this seems an odd choice.

  • European Chapter - Member of an European chapter
  • Rest of the world Chapter - Member of a Chapter outside of Europe

Jodi.a.schneider 23:42, 12 August 2010 (UTC)

Why do you think it's odd? It seems quite obvious to me (even if perhaps not relevant here). I think the point is that most chapters are in Europe, and in Europe it looks like it is much easier to create a chapter. That's why we may need different chapter models for other countries, and why such a distinction may make sense. --Nemo 07:23, 13 August 2010 (UTC)
Nemo is correct. That's the main reason why we wanted to be able to distinguish between users active in European chapters and the rest of the globe. There are different challenges and opportunities. Steven Walling (talk) 17:25, 13 August 2010 (UTC)

Research papers: Differentiating edits, Roles in Wikipedia

Based on your email, it sounds like you're interested in recognizing work that doesn't necessarily get counted as edits are incremented. A WikiSym 2010 paper innovates the way diffs are displayed to help gauge changes in the article: What did they do? Deriving high-level edit histories in Wikis. This gets at the problem that not all edits are equal.

edit: free copy and slides

In terms of roles, two papers come to mind:

edit: PDF link (else try 'one click download' from the abstract above)

Jodi.a.schneider 23:55, 12 August 2010 (UTC)

Thanks Jodi, those are both really useful! Do you know of any way we could get our hands on the ones not readily available? Steven Walling (talk) 17:26, 13 August 2010 (UTC)
Added more links above -- to PDF (author's preprint) and slides of the WikiSym paper. And direct link for SSRN PDF -- they expect you to click on "one click download" to get a paper. Jodi.a.schneider 15:44, 14 August 2010 (UTC)


We've done a bit of academic work on understanding valued work dimensions in en:wp. We approached the problem by mining, categorizing, and making sense of barnstars posted to user pages. Unlike edit counts, barnstars nicely capture contributions that are not easily quantifiable. Here is a link to the full paper: Articulations of WikiWork: Uncovering Valued Work in Wikipedia through Barnstars. One concrete result of this work is on page 4 of this paper, which lists a table overviewing the distribution of work codes (from a codebook that we've developed) for over 2 thousand barnstars. We would be happy to share some of this data. We've also brainstormed, and are actively working on, a few applications of this work. For example, one could use machine learning to develop a barnstar recommendation algorithm by learning from prior volunteer activities that prompted a particular type of barnstar in the past. ivan 18:35, 13 August 2010 (UTC)

I haven't read the paper yet, but the problem here is that there are some users who give a lot of barnstars to the same people (and repetitive ones), so this may alter the results. A "barnstar recommendation algorithm" sounds really bad if implemented on wiki, but maybe I'm wrong. :-) --Nemo 19:43, 13 August 2010 (UTC)
Took a look through the paper. (Really interesting work, Ivan!) Nemo, my impression is that the intent is to understand *recognized work* -- that is, what kind of contributions are valued. That could be used to help suggest tasks that are both suited to someone, and that might be valued based on interdependencies in the community. (See pages 8-9). So maybe a bit more personalized than lists of things to copyedit, wikify, etc. Jodi.a.schneider 16:02, 14 August 2010 (UTC)


Maybe it's obvious, but MeatBallWiki is the starting point for such a taxonomy. See MeatBall:CategoryRole (members). --Nemo 19:43, 13 August 2010 (UTC)

Random thoughts

Some things are missing. I'll just write a random list of ideas.

  • I wouldn't put "welcoming" under "Gnoming" (cf. w:en:Wikipedia:WikiGnome and, btw, w:en:Category:Wikipedia fauna, and maybe Category:User philosophies and even Category:User associations). Welcoming is a part of "coaching", which is great, great, great work (example: wm2010:Submissions/Mentoring programs: Structure of the German MP and international comparison, User:Bücherwürmlein/Mentoring-programs-evaluation).
  • OTRS: there are several roles, which are "officially split" only in English:
    • licenses and permissions,
    • queries and complaints,
    • legal,
    • requests for help by newbies,
    • ...
  • Discussions and so on:
    • discussion "facilitators",
    • discussion reformatters (more technical, but overlaps the previous one),
    • "folk memories" of the project (CommunityLore, previous discussions, decisions and practices),
    • persons who "pull the strings" of the discussions bringing them to some conclusion (sometimes using polls),
    • mediators,
    • users specialized in resolving content disputes,
    • policy and guideline experts and enforcers ("you've forgotten this policy, you can't do that"; "here you find how to resolve your problem"),
    • ...
  • Anti-vandalism (or almost vandalism) area experts:
    • open proxies,
    • sockpuppets,
    • conflict of interest (only on, so I don't know this well),
    • spam and external links,
    • copyright violations,
    • ...
  • Bot runners:
    • automatic import of data/data synch between different language versions,
    • various simple WikiGnomes' tasks,
    • especially spell checking (very important),
    • interwiki,
    • internal maintenance (discussions and requests/processes archives, signing bots, welcome bots, automatic creation of some pages, ...),
    • ...
  • Each process has its addicts/experts/persons officially in charge who run them:
    • featured/good articles/pictures,
    • votes/requests for deletion/undeletion,
    • user renaming,
    • ...
  • Gnoming:
    • manual of style experts (there are lots of sub-areas of MOS and specific experts),
    • please note that categories are often managed by small subsets of users (cf. Umberto Eco on "folksonomies"),
    • redirect creators,
    • disambiguation,
    • de-orphanizers and link-adders (also interproject links),
    • geocoders,
    • ...
  • Images:
    • sources and licenses checkers,
    • taggers,
    • ...
  • various process "clerks" (e.g. users surrounding ArbCom or various committees, archiving of various requests e.g. on Meta);
  • arbitrators;
  • persons who write help pages, tutorials and documentation;
  • template creators;
  • tools (e.g. tools:) creators;
  • data analysers (e.g. w:en:Wikipedia:Database reports, tswiki:Query service);
  • content reorganizers (e.g. split and merge; selection of quotations by author or theme on Wikiquote to create new articles from existing content);
  • Chapters:
    • press relations and inquiries,
    • conferences and (big) events organizers,
    • presenters,
    • booth volunteers/experts,
    • miscellaneous volunteering,
    • paperwork and organization,
    • fundraising,
    • international ambassadors,
    • communications (bulletins, websites and so on),
    • institutions relations and contacts,
    • school projects (some members, e.g. in Wikimedia Italia, do this much more than others, some do only that; WM-DE is currently "officializing" such positions, cf. Wikipedia-Schulprojekt),
    • ...

And there are more (I didn't list some which can be found on MeatBall). Sometimes a person does only one specific thing, but often a person is involved in several, yet specific, activities.
General comment: the graph is quite messy. You can categorize contributions by category, persons who give them, or places (e.g. some interwikis are added by hand, some are added by bots; some communication is on the wikis, some on meta, some on other official websites, some on blogs or social networks); I guess you need to choose one way or to create two or three different graphs or to invent some other visualization method. --Nemo 20:45, 13 August 2010 (UTC)

  • Not just template creators, but maintainers.
  • Ad-hoc statistics and reports
  • Under "gnoming" systematizers of page names (including all namespaces).
Rich Farmbrough 19:34 17 August 2010 (GMT).

internal communications

Is there a place for en:WP:Signpost and de:WP:Kurier, etc? As far as I can tell, the PR and Communication Committee categories don't capture these communications efforts. Jodi.a.schneider 16:19, 14 August 2010 (UTC)

I was just going to add these myself :) a section called "internal communications" could be useful, maybe? -- phoebe 17:32, 14 August 2010 (UTC)
I think that's definitely something to include in the next iteration, especially since HaeB from the Signpost already suggested it to me. :) Steven Walling (talk) 21:00, 14 August 2010 (UTC)


The research categories are a little confusing to me; they read a bit more like WMF job descriptions than what researchers do in the world (people might break themselves down by quantitative/qualitative, or by subject: studying editors/people/process, studying content quality, studying project growth/statistics). I'm not sure if that's quite right either though; maybe other people have ideas. -- phoebe 17:32, 14 August 2010 (UTC)

I agree that the breakdown feels a little awkward right now. Perhaps we don't even need to break down the types at all, since the number of people who consider themselves Wikimedians who do research (what we're focusing on, actually) seems to be much smaller than the general academic community that studies Wikimedia, whose members usually don't consider themselves part of the community. Steven Walling (talk) 21:02, 14 August 2010 (UTC)
Right, I think that is correct. It's a little hard to draw the line for me between "community member" and "non community member", since definitely non-com researchers are contributing something... much like the people who write about Wikipedia (Nicholson Baker, Joe Janes) but wouldn't necessarily consider themselves editors or community members (they are definitely doing outreach of a type, but I'm not sure how to class it). -- phoebe 14:53, 16 August 2010 (UTC)

Help and external help

I tried to break down some of the things I've done, and while most of them fit one way or another (editing, admin work, OTRS, some setup for GLAM projects, etc.) I'm not quite sure how to categorize something a little more unusual that I've tried: external help.

Sometime in 2008 someone pointed out on a Village Pump that Yahoo! Answers had a Wikipedia section, and that the answers in that section were very poor. I joined there and have given several hundred answers since (I'm identified as the #1 answerer in the section, even). My participation has tapered off because of some organized trolling which may or may not be obvious, but I think that on the whole it helps even in small amounts.

I wonder how this would fit in. It's not really "PR", and it isn't really covered by any of the subcategories of "Outreach". Is, perhaps, another branch required for provision of help? The "Helpdesk/Reference" branch doesn't seem adequate, though it is probably the closest idea present in the graphic. Cheers, Nihiltres(t.u) 19:12, 17 August 2010 (UTC)

Coaching etc.? First point in #Random thoughts. --Nemo 19:39, 18 August 2010 (UTC)

Then what?

So once the taxonomy is completed, what happens then? How will it be put to use? Just curious, OlEnglish 09:38, 19 August 2010 (UTC)

Wikimedia Volunteer Roles

You might also want to take a look at this page: --Frank Schulenburg 21:41, 20 August 2010 (UTC)


"Quality control / checking and adding references" needs sub-categories for trying to address conflict of interest and groupthink situations.

"Mediation" needs a sub-category for Wikipedia:Third_opinion -- everything with a to-do list queue could technically be a role, in dispute resolution and the much larger otherwise, so the actual taxonomy is quite large.

There are also many administrator contributions not reflected on the chart. But it is a good first pass. 20:13, 23 August 2010 (UTC)

Error in the image Initial Taxonomy

In the section Gnoming, where it says Vandalfighing it should read Vandalfighting [i.e. there's a T missing]. I'm really good for nothing, but I do spot typos. Qwrk 00:03, 29 August 2010 (UTC)

Contributions of high quality content.

Oddly, there's nothing here about users who contribute high-quality content. Sure, it recognises people who "Create new articles of relatively large length or unusually high quality", but what's so special about being the first person to edit a title? What about people who take poor articles that already exist, and turn them into really good articles? What about people who work to maintain the quality of high-traffic articles that tend to degrade over time? I think there's way too much emphasis here on creating new articles, rather than contributing new content. Hesperian 05:43, 11 January 2011 (UTC)


Quality control and Peer assessment are really closely related (because they often involve the same people). They should almost be under the same heading. But if we're going to separate them, then quality assessment should really be under peer assessment.

  • Peer assessment
    • Featured articles
    • Featured lists
    • Good articles
    • Peer review
    • Article talk suggestions
    • Assessing article quality and importance
  • Quality control
    • Checking and adding references
    • Copy editing
    • Applying formatting and style best practices (there's a shorter way to say this... you know, like using typical headings we use in certain subject areas, listifying prose and prosifying lists as appropriate...)
    • Improving articles to good/featured status
    • Removing/transwikifying inappropriate content (what Wikipedia is not. Not necessarily vandalism, since much is added in good faith.)

Hope that helps. Shooterwalker 09:43, 22 January 2011 (UTC)

List participation

Of course, reading all the mailing lists is part of the activity. Maybe writing emails between users is also an activity. And there are some IRC activities. I don't know if that would be interesting, but my activity is changing over time. Some years ago, I did some featured article activity; now I don't do that anymore. --Goldzahn 04:34, 23 January 2011 (UTC)