Research talk:Wikimedia Summer of Research 2011/Summary of Findings

From Meta, a Wikimedia project coordination wiki
I’ve read and have some questions.
1. What was the projected total cost for this research?
2. What was the final cost?
3. What was the projected ROI, and what metrics were chosen to quantify ROI?
4. What is the ROI to date?
5. What actions have been taken based on the research results?

Pine asked the above questions on Foundation-l, but I thought I'd put answers here for posterity. I was one of the three co-leads for the project, and though I can't answer everything, I'll take a crack at it...

  1. I don't know. I can tell you that it was pretty much the full time of three full-time employees, plus 9 people on short term contracts for the three months (give or take some time, since they all had different schedules).
  2. See above. I can ask accounting to try and get a full report if you like.
  3. The projected return on investment is that we had almost zero quantitative data telling us why there is a decline in editorship, and that's something basic we have to know. Getting more data on what was causing the effects we saw from the Editor Trends Study was necessary if we were going to take any action beyond just the obvious, like improving MediaWiki with a visual editor, etc. Secondarily, Wikimedia had and continues to have almost no reliable analytics infrastructure internally, other than ComScore data, and we wanted to get talented CS students to help us build out our analytical prowess on the editor-oriented side. It is absolutely critical.
  4. The return on investment to date has been that we have used the data to justify and kick off the continued full time work of Maryana and myself on editor retention issues. More about that below. We also have used the summer to develop a huge amount of new analytics infrastructure, especially for doing analysis of the dumps of Wikipedia. (See one key example on our technical lead Diederik's blog.) In general, the summer produced more than a dozen CC-licensed pieces of new research on Wikipedia's editing community.
  5. The summer research has been referenced in several feature announcements since, if you check out the blog, and has thus been key background material justifying new MediaWiki features in some areas. Second, as I said above, the community template A/B testing taskforce in English, Portuguese, and German Wikipedia (so far) that Maryana and I have led is based exactly on an experiment we did this summer. Last, the analytics infrastructure we built continues to be used at the Foundation to this day and is of course GPL.

Hope these answer your questions, Steven Walling (WMF) • talk 02:57, 28 December 2011 (UTC)

Thanks for the info.
First, I'm glad to hear that there have been follow-up activities and that there is ROI in terms of directing efforts in feature development and editor retention.
Second, the reason that I ask about budgeted and actual costs is that if someone is comparing the ROI of research projects and many other activities of the Foundation and chapters, then cost can be a very important consideration. If you told me the dollar figures for SOR in isolation from other projects then the numbers wouldn't be very useful to me as standalone numbers, but they should be very useful to whoever is deciding what projects WMF will fund, and I hope that this data is made available to them. In other words, I hope that Summer of Research isn't being given an open checkbook or a set allowance each year. SOR and each of its component projects should be made to compete with other research projects, retention efforts, outreach efforts, software development, and so on.
Third, while I am not deeply involved in the research community or chapters at this time, I think that it would be nice for the community to be able to see a reasonably user-friendly list of A) all of the proposed projects for each year or quarter, including their budgeted costs and ROI, B) the set of criteria that's used when evaluating the proposed projects, and C) a list of all of the projects that were chosen for funding after the proposals were evaluated. I think this level of transparency would be beneficial. Maybe WMF already does this, but if you need to check with the accounting department to get the budget then this suggests to me that this kind of budget data isn't currently published. It should be something that WMF already does for its own budgeting and management purposes, and I think that it should be made public so that the community can get a clear sense of what WMF is doing, what the projects cost, and what the ROI is expected to be for each. Hopefully this kind of budgeting is already done internally by WMF, in which case I would think that publishing it to Meta would be reasonably straightforward. I don't expect that this can be done overnight but it would be nice to have this done for the 2012 budget at some point. To be clear, I'm not suggesting a budget that shows a list of FTE costs, I'm suggesting a budget that shows costs by project. Pine talk 10:12, 28 December 2011 (UTC)
Let me first say +1 to everything Pine said. There is little to no information about budgeted and actual costs of almost all projects. They are more visible now with the recent research projects, but there have been no disclosures about costs. I have looked across several wikis and it is hard to locate a price breakdown for any project or research undertaken by WMF in the last year. In the interest of openness, I think it would be helpful if some information about costs, even if only indicative, were given out.
It is nearly impossible for an outsider (and I mean me and others on this wiki, non-staff members this time) to do an ROI calculation without factoring in the costs of a project; we would be left to rely entirely on conjecture.
It is also hard to quantify things like "huge" or 3 full time employees and 9 contractors for 3 months. We have no scale or idea to go on. HR and paid work-hours can only be quantified once we have a scale to work off of, so it might be easier to just give a lump-sum figure.
There is also no disagreement that WMF conducted the research and that it was then mentioned on the Foundation blog. I don't think it needs to be touted that the material generated is CC-licensed or under GPL.
Regards. Theo10011 13:12, 28 December 2011 (UTC)
I'm not really interested in discussing general project budgeting or forecasting at the WMF here, since no one who ran the summer of research is in the position to change how that works. The Board, Sue, and our C-level leadership handle that, so go ask them. Pine asked a perfectly reasonable set of questions about the Summer of Research, so I am happy to do my best to answer. After I get back from the holidays, I would also be happy to present the exact budget of the program too, if I am able. Steven Walling (WMF) • talk 02:59, 29 December 2011 (UTC)
Steven, the budget would be nice if you can get it, preferably with a breakdown by project. Speaking only for myself, I'm not interested in seeing costs by FTE so much as I'm interested in seeing costs by project, hopefully with an explanation of projected ROI that was made at the time that the budget request was first approved. If WMF doesn't do this kind of project budgeting, that information is useful although I think that it's something that should be done, and I could raise that point elsewhere as you suggested. I think that doing this for Summer of Research 2012 would be a reasonable place to start, assuming that it's not already being done. My main interest is ROI and in order to measure and compare ROIs it's necessary to know what the initial investment was. Pine talk 20:51, 29 December 2011 (UTC)
Theo, I'm not sure that I understand all of your comment exactly, but I agree that regarding the questions of how projects are budgeted and ROI is determined for projects in general, it would be helpful to have more information than seems to be currently available online. I understand Steven's point that it might be better to raise the broader questions in a place that's more suited to it than this Summer of Research talk page. Would you like to raise this point in foundation-l, research-l, or some other place where the more general question could be asked? I can raise the question myself if you'd prefer. However, I think that we should wait until Steven comes back with the information from Summer of Research 2011 because this would give us a useful example of how WMF is currently doing its budgeting and ROI calculations for research. Pine talk 20:51, 29 December 2011 (UTC)
Hi Pine, actually I raised this exact issue last week on Internal-l. The response was not much different. I can assume other staff members and board members also saw the subject being brought up; to Steven's credit, so far only he has responded. While I'm happy to raise the request again, I am not optimistic that we are likely to get that information. Regards. Theo10011 00:39, 30 December 2011 (UTC)
That's strange. I'd think that they would want this information for their own use, and I can't imagine what the harm would be in releasing a list of costs by project and projected ROI of projects. Pine talk 08:49, 1 January 2012 (UTC)


Total expenditures WSOR 2011
Description                      Amount (USD)
Salaries and Wages                 109,281.13
Staff Screening Report Fees            724.50
Staff Enrichment                       429.44
Outside Contract Service             2,020.00
Books, Subscriptions, Reference        128.97
Travel                              11,302.59
Total                              123,886.63

Hey Pine, here's the info you asked for, and sorry for the wait. I can go through any items for which the definition seems vague or the spending large. Note that these are final expenditures, not a budget. Steven Walling (WMF) • talk 21:34, 7 February 2012 (UTC)

Hi Steven,

My first take on this is that for an organization with $30 million in the bank (as of the time of the staff presentation that was recorded and linked from foundation-l), I was expecting SOR 2011 to be a much bigger total expense than shown here. This is a pleasant surprise. I was hoping to see a breakdown by individual sub-project but I don't know if records were kept to that level of granularity when the total cost is "only" $123,886.63, which is relatively small in terms of major research projects and in terms of WMF's whole budget.

My second thought is that this discussion links nicely with what I understood Sue to be saying about setting up a panel of volunteers to decide which projects should get funded. I think that having such a group could be a good thing if the board isn't currently going into that level of detail on studying project budgets. If the board isn't doing this already then that would be a concern of mine, and if Sue thinks that a group other than the board should fill that role then I'd prefer to have another group do it than to have it done only by staff in a way that's not very transparent. I think that having that kind of scrutiny of budgets, ROI comparisons, and prioritization is a very good thing. It would be interesting to know if the current RCOM will have a budget role if Sue's idea of setting up a separate panel goes forward. Do you think that RCOM would want to comment on this idea? If so, could you bring it up at their next meeting and/or on Research-l?


Pine talk 09:33, 9 February 2012 (UTC)
It perhaps seems confusing because of the page structure, but re: the "per-project" breakdown... every single research namespace page that was under WSOR was funded through this, and financially it was treated as all one project with eight contractors doing research. Some may have cost more based on how much time etc. they took, but generally we didn't break it down topically. Steven Walling (WMF) • talk 00:26, 10 February 2012 (UTC)

Thanks for this, Steven. I am really glad this information was released. It would have been helpful when the discussion was ongoing or when the question was first asked, but I understand it was not in your hands. I was expecting the expense for it to be more around 200k USD. Maybe if we factor in the cost of the WMF liaison in charge of the project it might come closer to that. But I am really glad this was released. I only hope this approach continues. Having these spending details is reassuring to community members and useful to anyone interested in working out ROI. Thanks for bringing this up, Pine, and Steven for providing the breakdown. Regards. Theo10011 08:43, 10 February 2012 (UTC)

Hi Steven, thanks from me as well for releasing this information. For the amount of research garnered, $123k is really a steal. Well done! Craig Franklin 10:52, 13 February 2012 (UTC)