Scholarly researchers studying Wikimedia projects, analysts at the Wikimedia Foundation, and members of Wikimedia communities all share an interest in answering questions about Wikimedia projects using public datasets. Such efforts are often plagued by missed opportunities to share code, datasets, and analysis techniques. This idea proposal lays out a two-part strategy for enabling better collaboration around wiki research in the future:
- provide technical infrastructure for accessing, sharing and discussing data
- organize regular in-person and virtual data hacker meetups
Labs2 (L2) is so named due to its similarity to Wikimedia Labs, a set of technical and coordination resources for MediaWiki developers as well as bot/tool developers. Whereas Wikimedia Labs aims to support developers with infrastructure, Labs2 aims to support data analysts with infrastructure (potentially on top of Wikimedia Labs itself) and regular hacker meetups.
Add your username here to be notified of updates related to L2
- To facilitate the creation of more regular interactions and gatherings across the different groups involved in conducting research on Wikipedia, wikis, and other kinds of open collaborative communities.
- To increase the rate of data-based question answering done by the Foundation, Community and Academics.
- To build an even more international and interdisciplinary research community engaged in the study of wikis.
- To transcend the boundaries and silos that tend to limit research communities of most kinds.
- To bring academics into frequent contact and conversation with the Foundation and other members of the Wikimedia and related communities and vice versa.
- To facilitate research resource-sharing and opportunities for collaboration.
- To aggregate demand for tools, datasets, and other resources necessary to pursue research.
- To develop novel approaches to the study of wikis and to disseminate both the findings and practices of that research.
Meetups and research hackathons
There will be an initial set of regional meetups, probably in early November 2013. Dates and locations TBD.
How the meetups work
The meetups will function as lightly-structured research hackathons, where participants will be encouraged (but not required) to brainstorm and pursue tractable, short-term collaborative research tasks in self-organized teams. The hackathons should be designed to be inclusive and welcoming safe spaces. Meetup organizers (you can be one!) will volunteer to help locate space, wifi connections, bathrooms, and (ideally) some food and refreshments for attendees. Organizers will also be responsible for coordinating their activities with other meetups and reporting back on the results of their local event.
Goals for the initial meetup.
What we want to accomplish/produce for the initial meetup.
Before the meetup (Planning)
A local organizer will need to volunteer to lead the event and locate space with accessible wifi, restrooms, adequate seating, power strips, paper, pens, office supplies, and (ideally) some food for attendees. If you're interested in being a local organizer, please sign up!
The Labs² project coordinators will also offer some centralized coordination in the weeks before the meetups. This will consist of a fairly lightweight process of brainstorming (e.g. via a Google form) and evaluation (e.g. via AllOurIdeas or something similar) of research ideas. The intention behind this process is to facilitate early idea generation and selection along the lines of an un-conference, so that subsequent group formation and activity in the local hackathons runs more smoothly. If it seems like too much of a burden, hackathon participants are invited to abandon it and just do what they wanted to do anyway.
If possible, hackathons should be planned as day-long events on or around the same date worldwide. This will facilitate some cross-pollination of ideas as well as a greater sense of community between the regional groups. It will also make resource sharing simpler, as Wikimedia Foundation staff can help on-board multiple people to centralized Labs servers and the like.
During the meetup (Execution)
At the beginning of the hackathon, local organizers should plan to discuss the format, goals, and ground-rules (e.g. safe, inclusive space) with the participants in their meetup. Labs² coordinators can also help with this (maybe via video conference). Local organizers should also be responsible for structuring the physical space and time so as to achieve the goals of the event (brainstorming, open collaboration, resource sharing, and reporting back to the group).
A sample schedule is something like the following:
- 0900-0930 -- Meetup begins. Introductions, ice-breakers, discussion of the event format, schedule and goals.
- 0930-1030 -- Ideation and Project brainstorming. Participants review, share, and iteratively refine the project ideas generated during the asynchronous brainstorming process, with the goal of selecting a project to work on for the rest of the day. One way to begin this process is through a sort of "speed dating" format, where individuals form pairs, explain their favorite research idea in ~1 minute to their partner, and then rotate into a new pair. Afterwards, some group conversation could ensue in which interesting overlaps are discussed. Finally, individuals could be invited to propose a project to the group. After all proposals have been recorded, everyone could be invited to join any project that interests them.
- 1030-1200 -- Morning Work Session. Once project groups have formed, they can begin pursuing their projects for the remainder of the morning. This is an ideal time to refine project scope and outline concrete goals and milestones for the day.
- 1200-1215 -- Preliminary project reporting. After the projects have had a chance to work together for a bit, each project reports back to the group about what they've been doing. This is also a good time to encourage people to switch projects or re-visit their goals for the day.
- 1215-1400 -- Lunch break. Project groups may wish to eat together and continue their work over lunch.
- 1400-1630 -- Afternoon Work Session. Project groups formally re-convene after the lunch break to continue working.
- 1630-1700 -- Final Project Reporting. Project groups report on their activities for the day. Emphasis should be on briefly summarizing the project, its progress so far, and any plans for future work.
- 1700-1730 -- Event closing. Local organizer may wish to use this time to coordinate project reporting, discuss ideas for subsequent meetups or events, and invite process-oriented suggestions from participants.
- 1730-1900 -- Optional Socializing. Participants are invited to join each other at a nearby cafe or restaurant for drinks, food, etc.
Note that the largest proportion of the day is left completely open for groups to develop and work on projects together. Also note that none of the activities are mandatory: in the event that individuals (or everyone!) determine that they do not want to work on group projects together, the organizer should be prepared to embrace that possibility too!
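For organizers who want to plan the "speed dating" rotation from the sample schedule in advance, the following is a minimal sketch, not a prescribed procedure. It uses the standard round-robin ("circle") method so that every participant meets every other participant exactly once. The function name `speed_dating_rounds` and the example participant labels are hypothetical.

```python
def speed_dating_rounds(participants):
    """Return a list of rounds; each round is a list of (person, person) pairs.

    Uses the round-robin "circle" method: one seat stays fixed while the
    others rotate, so every pair meets exactly once.
    """
    people = list(participants)
    if len(people) % 2:
        people.append(None)  # placeholder: whoever is paired with None sits out
    n = len(people)
    rounds = []
    for _ in range(n - 1):
        # Pair the first seat with the last, the second with the second-last, ...
        pairs = [(people[i], people[n - 1 - i]) for i in range(n // 2)]
        rounds.append([p for p in pairs if None not in p])
        # Keep the first seat fixed and rotate everyone else by one position.
        people = [people[0]] + [people[-1]] + people[1:-1]
    return rounds


# Hypothetical attendee labels, for illustration only.
rounds = speed_dating_rounds(["A", "B", "C", "D", "E"])
for number, pairs in enumerate(rounds, start=1):
    print(number, pairs)
```

With ~1 minute per explanation, each round takes roughly two minutes plus time to switch seats, so organizers of larger meetups may prefer to run only the first few rounds.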
To the extent possible, collaborations across space and time between members of the different local groups are strongly encouraged and participants as well as organizers should feel free to experiment with various mechanisms for supporting collaboration and co-presence. Time zones might make it hard to coordinate activities between meetups in the U.K. and Seattle, but don't let such obstacles stop you from trying!
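To make the time zone obstacle concrete, here is a rough sketch that computes how much of the sample 0900-1730 schedule two sites would share. The date is hypothetical (actual dates are TBD), and the fixed offsets assume both sites are on standard time (GMT for the U.K., UTC-8 for Seattle); the helper names `local_window` and `overlap` are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets: valid only while both sites are on standard time.
LONDON = timezone(timedelta(hours=0), "GMT")
SEATTLE = timezone(timedelta(hours=-8), "PST")


def local_window(day, tz, start=(9, 0), end=(17, 30)):
    """Return (start, end) of the sample schedule's working day in zone tz."""
    year, month, date = day
    return (
        datetime(year, month, date, *start, tzinfo=tz),
        datetime(year, month, date, *end, tzinfo=tz),
    )


def overlap(a, b):
    """Return the shared (start, end) of two windows, or None if disjoint."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start < end else None


day = (2013, 11, 9)  # hypothetical date, for illustration only
shared = overlap(local_window(day, LONDON), local_window(day, SEATTLE))
if shared is None:
    print("No shared working time")
else:
    print("Shared time:", shared[1] - shared[0])
```

Under these assumptions the only shared slot is the half hour in which the U.K. meetup is closing and the Seattle meetup is starting, which suggests that a short joint video call at that point, or asynchronous handoffs, may be more realistic than live co-working.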
After the meetup (Follow-up and reporting)
A few days after the meetup, local organizers are responsible for ensuring that a brief summary of the meetup activities is added to the project repository (TBD -- probably a wiki page). Organizers may also encourage participants to share any outputs with the wider Labs² community, such as written resources, code, photos, charts, graphs, statistics, slides, research abstracts, etc. Wherever possible and appropriate, participants should be encouraged to make these resources available under suitable Open Access licenses.
Labs² coordinators will also solicit feedback and suggestions from local organizers and participants about the event format and proposals for subsequent events or projects at this time. Please be bold!
One key component of this project is the creation of regular, loosely coordinated research meetups and hackathons. The meetups will advance the community-building goals of the project described above. The hackathons, on the other hand, will provide a bounded window of time during which interested members of the research community can gather in virtual as well as physical locations to co-work on projects. The idea would be to hold hackathons at a pre-determined time, and then to invite groups around the world to coordinate their own locations, invitations,
Potential locations for hackathons include:
- The Wikimedia Foundation
- The University of Minnesota
- Northwestern University
- Oxford University
- The University of Washington (Seattle)
- Harvard University (Berkman Center)
- Northeastern University (Center for Texts, Maps, and Networks)
We are in the process of recruiting researchers from other countries and regions of the world to join this endeavor.
Public data analysis infrastructure
This project should not just improve the availability and discoverability of public data or publicly accessible data analysis tools, but also create infrastructure and spaces for open experimentation to support our progress towards the movement's strategic goals.
Virtual coordination space
Public communication channels and project documentation organized within a central hub, and links to other resources (repos, social media, ???).
Academic or industry research professionals who are willing to provide data, mentorship, or other resources, or who want to participate directly in research with people outside their normal circles.
Other teams at the Wikimedia Foundation have successfully implemented a student internship program. Research is an obvious candidate for internships: working with partner universities will allow us to grow the capacity and know-how to answer questions about our projects and bring visibility to our open data infrastructure.
In addition to formal internships, ad-hoc visits would connect academic researchers doing promising research with the Foundation and the community. Researchers would visit for one to four weeks, with all basic expenses covered (travel, room, and board). The best candidates for these visits are researchers who are part of active or recently completed research projects, rather than those starting new projects. Visiting researchers would be able to connect with the people who know how to query the databases in toollabs and implement tools in MediaWiki, connect with members of the community, and understand how the projects operate.
Measure of success
- Technical reports submitted to conferences and released as scientific publications. Gold path open access and licensing will be used whenever feasible. In other cases, canonical versions of publications could also be made available openly through green path self-archiving on a researcher's or institution's website and linked from the project space.
- Non-technical, practical reports to the Wikimedia communities, posted on wiki and advertised on community mailing lists. The goal is to present the community with the research results in actionable ways and orient the research towards the needs of the community.
- An annual presentation at Wikimania to disseminate the results of interest to the community.
- Tools for Wikimedians based on state-of-the-art research
ask questions, propose ideas, jot down notes... be creative!
- Wikipapers -- wiki summarizing scholarly Wikipedia research 
- The Research Index on meta
- E3: Editor engagement experiments
- User Metrics system for cohort analysis
- Wikimedia Summer of Research, a WMF sponsored collection of analyses exploring newcomer retention in Wikipedia
- Erik Zachte's Wiki Stats
- User:Mr.Z-man's study of newcomer retention: Mr.Z-man/newusers