Talk:Collaboration/Flow satisfaction survey/Report

From Meta, a Wikimedia project coordination wiki

The survey was biased

The whole survey was based on the wrong assumption that WYSIWYG editing on talk pages is possible only with Flow, which is absurd. I've made the surveyors aware of this, but they chose to ignore it, and thus a strong pro-Flow bias is to be expected due to this artificial connection between normally structured talk pages and the old-fashioned Wikieditor. Grüße vom Sänger ♫(Reden) 17:27, 6 February 2017 (UTC)

Pinging Alsee for comments following their comments about this survey on Phabricator. — Scott talk 18:40, 6 February 2017 (UTC)
The survey is about the current experience, with the visual editor not enabled on talk pages. If the visual editor has been enabled on some pages, that's marginal. The visual editor is not designed to take care of structured discussions. Trizek (WMF) (talk) 13:23, 7 February 2017 (UTC)
Yeah, that Bulls*** again. Not one valid reason given in this thread, besides We don't want it and the inherent It would make Flow less shiny. BTW: The new 2017 Wikieditor, which is a VE with some layout changes, is working on talk pages. Grüße vom Sänger ♫(Reden) 16:31, 7 February 2017 (UTC)

POV

Trizek, I know that everyone at the WMF has good intentions. We all want to build the best possible future for the movement. I find it understandable that staff working on any given project are optimistic about the benefits they hope their project will bring, and that they are naturally inclined to interpret and frame data about their project in the most optimistic light. However this report has got to be the most extreme case of POV-pushing that I have ever seen come out of the WMF.

I find it disappointing, but unsurprising, that the report failed to address the extreme canvassing concerns I raised with you on Phabricator. What I do find surprising... what I find outrageous beyond all belief... is that this report has the gall to turn that around and claim the survey results were biased in the opposite direction. The gall to accuse community members of "ballot-stuffing", "post[ing] messages aimed at influencing the survey against Flow". And to do so after the WMF canvassed over two thousand ballot-stuffing talk-page messages. The gall to assert that it was somehow the negative responses that did not have full "legitimacy".

For now I will set aside the pervasive other problems with this report. Please cite the alleged "ballot-stuffing" "messages aimed at influencing the survey against Flow". Alsee (talk) 21:14, 6 February 2017 (UTC)

I'll gather some data to give you a comprehensive answer. Trizek (WMF) (talk) 13:25, 7 February 2017 (UTC)
Trizek (WMF), the only "data" being requested is links to the messages. Alsee (talk) 13:57, 7 February 2017 (UTC)
@Alsee:, after further consideration, we believe that it is better to remove that paragraph altogether. What is important is to reflect that the survey had a portion of participants whose priority was to declare their strong opposition to Flow, without being interested in getting into the details requested by the survey. This is based on comments where the only feedback was along the lines of “kill Flow”, all of them very similar, even in fields where positive aspects of Flow were asked about. These replies were submitted within a narrow period when the survey was discussed on English Wikipedia.
Those answers have been computed just like all the rest. They are legitimate, they do reflect the opinion that some contributors have about Flow. Providing more details here, like links to public pages or quotes from messages, would only contribute to heat the discussion for no good reason. This is also why we are removing that paragraph. Its main point is already reflected in the previous one with a simple statement.
Trizek (WMF) (talk) 13:15, 8 February 2017 (UTC)
Will you do the same with overeager Flow groupies, who did something similar the other way around? Grüße vom Sänger ♫(Reden) 15:57, 8 February 2017 (UTC)
Sänger ♫, the Flow groupie responses and the Flow hater responses are being counted fully. Alsee (talk) 16:24, 8 February 2017 (UTC)
I see my fault, I should have watched the other side more carefully. The incriminated paragraph was indeed removed. There's still no mention of the completely biased set of questions, that deliberately bent the answers towards positive Flow feedback, but that's dealt with here on this side, the structured discussion page about this report. Grüße vom Sänger ♫(Reden) 16:44, 8 February 2017 (UTC)

Trizek (WMF) sorry for the delay, and thanks for removing that paragraph. I tried to raise concerns to you during the phabricator discussion,[1] but they were never addressed. You said we should discuss it after the results were published.

  • Was my explanation there clear? Do you understand why I described the survey as "canvassed", and why the community views the results as biased and invalid?
  • Do you agree that participation was heavily stacked in favor of the most enthusiastic Flow fans? And that the survey percentages were heavily biased in favor of Flow?
  • Do you agree that the report needs to be heavily revised, to reasonably address that heavy methodological bias?

Alsee (talk) 06:17, 18 February 2017 (UTC)

Alsee, let me insist once more on why we don’t think this is a canvassed or biased survey. We have documented the Limitations. This was a Flow satisfaction survey, targeting users with first hand experience about Flow, promoted in projects that had Flow enabled. It was still open to anyone who wanted to participate. As organizers of this survey, we don’t agree with your conclusions. If there is anything in the conclusions of the survey that you find not accurate or biased, please specify. Trizek (WMF) (talk) 13:48, 20 February 2017 (UTC)
You know pretty well how biased the questions were. I tried to translate them without bias, and made you aware of this, but you chose to ignore that and insisted on your extremely biased questions. And that's just the questions themselves, let alone the canvassing between proselytes. Grüße vom Sänger ♫(Reden) 16:30, 20 February 2017 (UTC)
Trizek (WMF) regarding the Limitations section "the survey was promoted by some users seeking like-minded participation", I ask you again to identify the community posts being referred to. Was there something going on that I didn't know about? I know I advertised it to EnWiki Village Pump[2] and Centralized discussion.[3] Those are neutral locations, and there was no intent and no indication of bias in favor of individuals with any particular viewpoint. Flow-fans and Flow-haters in the community had exactly equal opportunities to see it and participate. We call that proper and unbiased advertisement.
Regarding the WMF's promotion of the survey, I am unsure whether I failed to adequately define canvassing, or if you failed to understand it, or if you and I have some disagreement on the facts of the case. You keep repeating it was open to anyone who wanted to participate. That is completely irrelevant to the canvassing issue. Canvassing means targeted advertisements which bias participation in the survey. As far as I am aware, we are in agreement on the facts of the case. I assume we are in agreement that the WMF posted thousands of targeted talk-page survey invitations to people who had actively opted-in to Flow on their user_talk. (In addition to some survey postings in neutral locations.) I assume we are in agreement that people who had actively opted-in to Flow are overwhelmingly skewed in favor of Flow-enthusiasts. I assume we are in agreement that those targeted talk page invitations brought in survey-participants who otherwise would not have shown up. (If that was not the intent and expectation, then sending those messages would have been a total waste of time.) Given those facts, I am baffled how you can deny that the survey participation was heavily stacked with an abnormal percentage of atypical Flow enthusiasts. I am baffled how you can deny that the final percentages in the survey were heavily inflated by canvassed Flow-enthusiast participation.
  • Do you disagree on the definition of canvassing?
  • Or do you disagree on the facts in this case?
  • Or do you disagree that canvassing has the effect of biasing participation, with the effect of biasing the results? Alsee (talk) 12:07, 21 February 2017 (UTC)
Re-pinging Trizek (WMF). Alsee (talk) 06:34, 4 March 2017 (UTC)
@Alsee: from the options you present, the most accurate is "you and I have some disagreement on the facts of the case". The report explains the limitations, Trizek (WMF) has explained what is our view and why, and we clearly disagree with your analysis. All the opinions reflected in this Talk page have been read by the same team who has read the survey results and will decide the next steps in Flow development. The action is moving onto the Wikimedia Foundation Annual Plan proposal.
I think a deeper problem that goes back to the earliest stages of this survey project is a confusion (or a desire) to see this survey as a vote to either continue or cancel the development of Flow, when it is not. This has been a Flow Satisfaction Survey all along, aiming to capture satisfaction levels, good points and pain points of Flow as seen by their users here and now. The goal of this survey was and is to inform the future plans of the Wikimedia Foundation, as one of several factors taken in product development.
The report includes a list of problems that Flow has, and that list is very similar to the list of problems you detailed to me somewhere else before the results of the survey were published. The report also includes a recommendation to offer Flow to the communities asking for it, which is compatible with your desire not to see Flow in communities not asking for it. However, the survey responses also reflect that Flow has positive things, and that there are individuals and projects looking forward to seeing the problems fixed and to continuing Flow's adoption following the demand for it. You and others might have a different opinion about how satisfied you are with Flow, but it is not the only opinion across Wikimedia communities, as the survey (and evidence across Wikimedia) shows. Qgil-WMF (talk) 08:49, 9 March 2017 (UTC)
Qgil-WMF: No, the limitation section makes absolutely no mention of the issue. The survey grossly denied equal participation. Anyone who had actively opted-in to Flow was delivered a privileged invitation actively canvassing them into the survey. People familiar with Flow who had not actively opted-in to it on their talk page were almost completely excluded. They had to be lucky enough to stumble across a rare neutral posting. We deal with canvassing issues like this regularly. In my community work I recently issued a formal closing[4] against a 65% majority, in part because it was canvassed. We consider canvassed results to be blatantly invalid or fraudulent.
And as I explained before the report was even released, it wouldn't be adequate even if it were included buried in the limitations section. It should be clearly noted in close proximity to the inflated numbers.
Qgil: "I think a deeper problem that goes back to the earliest stages of this survey project is a confusion (or a desire) to see this survey as a vote to either continue or cancel the development of Flow, when it is not." - Ah, the confusion is on your end. If you check the report it explicitly states: "Based on the result of this survey, the Technical Collaboration team recommends to continue the development of Flow." I assume you're reasonable enough to admit the team would choke to issue that recommendation if the results had been less than 1% willing to use Flow.
Any project that genuinely had 38% support would need to be clearly labeled as "High Risk" of ultimate failure. (AKA Liquid Threads version+1.) However given that the reality is far below even that 38% figure, it beggars belief how any objective analysis can justify a recommendation that valuable resources should be diverted away from other much needed work. Alsee (talk) 10:15, 11 March 2017 (UTC)

"How to amuse people and influence results"

Short video version: Mythbusters did it earlier and better. Longer version, about just a few tiny points:

  1. for almost every question there is a different number of answers, then everything is nicely converted into percentages, without giving the number of answers in each case
  2. giving percentages based on 20+ answers... so each answer is ca. 5%? NO! One answer is "worth" 4.1% on EnWiki, but 1.16% on FrWiki and 2.43% on Wikidata. Yet they are all positioned together, as if they were comparable...
  3. Then the phrasing Overall satisfaction based on project where Flow is mainly used: let me translate it, using EnWiki as example:
    1. "Mainly used" means "out of 41 000 000 (forty-one million) pages it was used on 2 (two) pages, which have been deleted since"
    2. "Overall" means "out of 135 000 (one hundred thirty-five thousand) active users we polled 24 (twenty-four), i.e. 0.018%"
    3. "satisfaction" means 15 out of 24 are not strongly satisfied.

I'm not even starting on other issues of this... report. But I'll send it to a colleague, who teaches methodology and statistics, so he can show pupils how to fake results (and how to analyze pseudo-statistics). He'll find it useful. Good example. Thanks in advance on his behalf. --Felis domestica (talk) 23:39, 6 February 2017 (UTC)
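
The per-answer weights quoted in the list above follow directly from the number of answers per wiki. As a rough sketch only: the function names are illustrative, and the FrWiki and Wikidata answer counts are back-computed here from the quoted percentages rather than taken from the report. The arithmetic looks like this:

```python
def weight_per_answer(n_answers: int) -> float:
    """Percentage points that a single answer contributes when n_answers were given."""
    return 100.0 / n_answers


def implied_answers(pct_per_answer: float) -> int:
    """Back-compute the number of answers from a quoted per-answer percentage."""
    return round(100.0 / pct_per_answer)


# Per-answer percentages as quoted in the comment above.
quoted = {"EnWiki": 4.1, "FrWiki": 1.16, "Wikidata": 2.43}

# Assumption: each percentage is simply 100 / (number of answers) for that wiki.
implied = {wiki: implied_answers(pct) for wiki, pct in quoted.items()}

# Share of EnWiki active users polled, using the figures from the comment:
# 24 respondents out of 135 000 active users.
sample_share_pct = 100.0 * 24 / 135_000

for wiki, n in implied.items():
    print(f"{wiki}: about {n} answers, {weight_per_answer(n):.2f}% per answer")
print(f"EnWiki sample share: {sample_share_pct:.3f}% of active users")
```

Because one answer moves the EnWiki figure several times further than it moves the FrWiki figure, percentages from the three wikis are hard to compare unless the answer counts are shown next to them.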

If you expect something which is not in the report, please suggest it. Raw data is also available with numbers.
Concerning en.wp's results, they have been posted in the "Overall satisfaction based on project where Flow is mainly used" section because it is a big wiki and also for a comparison concerning opinion. Other wikis in that section have more than 60 answers each and much more active Flow boards. Do you suggest to change the relevancy of en.wp results? Trizek (WMF) (talk) 13:41, 7 February 2017 (UTC)
  • @Trizek (WMF): "If you expect something which is not in the report, please suggest it" - I am ashamed I need to suggest such things, because they should be default, but okay. I expect:
    1. Honesty
    2. Proper methodology
    3. Proper reporting
  • As for points 1 and 2, nothing spectacular, the level covered by 1st semester of any social science studies would be sufficient. And no, I am not suggesting "changing relevancy of EnWiki results" - one cannot change something irrelevant. Into what? Into even more irrelevant? Because one cannot change irrelevant into relevant --Felis domestica (talk) 14:59, 7 February 2017 (UTC)
    I've made a quick change to focus on more relevant wikis, and kept English Wikipedia as a possible comparison. Trizek (WMF) (talk) 13:25, 8 February 2017 (UTC)

Stakeholders

I do not want to write about analysis of the results, which is a different story, discussed above.

I am going to write about the choice of the users the poll was addressed to. The sentence "The survey was distributed to all public wikis using Flow in order to reach users who use Flow and can compare it knowledgeably to unstructured wikitext talk pages." tells us that there was a fundamental mistake in the identification of the stakeholders. Stakeholders are not only those who have installed Flow on their talk pages. A large group of users has been forgotten: those who were forced to use Flow by the decision of the former users. Their satisfaction is equally important.

So, for future use, I suggest the following poll procedure:

  1. Identify a wiki with Flow as default, which is relatively frequently visited by casual visitors from non-Flow wikis. mediawiki seems to be such a case.
  2. Poll visitors in addition to residents, asking them to answer your questions, say, 10 days after their first recorded visit, to let them finish the interaction and get the impression how Flow works for them.

Gżdacz (talk) 08:15, 7 February 2017 (UTC)

Thank you for your feedback concerning the methodology and the interesting idea you suggest. It is always difficult to target visitors and we have used the best solution we had, as documented: message wikis where Flow is used, message people who use Flow, message people who used to use Flow. We were not only targeting people who use Flow and the survey was open to anyone. Trizek (WMF) (talk) 13:51, 7 February 2017 (UTC)

Poll without poll

One more thought on this: it is often possible to infer a lot of information from the behaviour of users, without asking even a single question.

Take me as an example:

  1. I am a relatively frequent visitor on mediawiki, where I have to use Flow.
  2. I am testing, and have tested, other WMF software products, including Visual Editor and New Editor of Wikitext.
  3. My home plwiki has Flow enabled.

Identify users who satisfy 1, 2 and 3, and check how many of them have switched their talk pages to Flow. I guess you have a lot of data to harvest and analyse.

Gżdacz (talk) 13:17, 7 February 2017 (UTC)

P.S. I have not switched my talk page to Flow.

It is always difficult to target visitors and we have used the best solution we had, as documented: target people on wikis where Flow is used, people who use Flow, people who used to use Flow.
Your idea may be feasible, but it can only be done on wikis where Flow is available as a Beta feature and it may exclude people who are interested in Flow but not other improvements or wikis. We will have a look at it. Trizek (WMF) (talk) 13:55, 7 February 2017 (UTC)
For me, a visitor is a person who makes an edit on a page with Flow and has significantly more edits elsewhere. It is not difficult to identify them using logs. Of course, you are restricted to users from wikis where Flow is available as beta, but you are very much restricted to such people anyway, right? Yes, it excludes people who are interested in Flow but not other improvements. All cheap and easy methods are typically restricted, one way or another. Gżdacz (talk) 14:29, 7 February 2017 (UTC)
We will keep your idea in mind if we need more data. Thanks! Trizek (WMF) (talk) 13:20, 8 February 2017 (UTC)
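
A minimal sketch of the visitor definition discussed in this thread, assuming a flat edit log is available. The `Edit` record, the `find_visitors` helper and the `ratio` threshold standing in for "significantly more edits elsewhere" are all hypothetical; no actual MediaWiki log schema or API is implied.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set


class Edit(NamedTuple):
    """One hypothetical log entry: who edited, on which wiki, and whether the page used Flow."""
    user: str
    wiki: str
    on_flow_page: bool


def find_visitors(edits: Iterable[Edit], flow_wiki: str, ratio: float = 5.0) -> Set[str]:
    """Return users who edited a Flow page on flow_wiki but are far more active elsewhere.

    Assumption: "significantly more" means at least `ratio` times as many
    edits on other wikis as on flow_wiki.
    """
    local = Counter()      # edits made on the Flow wiki
    elsewhere = Counter()  # edits made on any other wiki
    used_flow = set()      # users with at least one edit on a Flow page
    for edit in edits:
        if edit.wiki == flow_wiki:
            local[edit.user] += 1
            if edit.on_flow_page:
                used_flow.add(edit.user)
        else:
            elsewhere[edit.user] += 1
    return {user for user in used_flow if elsewhere[user] >= ratio * local[user]}


# Made-up sample data: A is a visitor, B is a resident, C never touched Flow.
sample = (
    [Edit("A", "mediawiki", True)]
    + [Edit("A", "plwiki", False)] * 10
    + [Edit("B", "mediawiki", True)] * 3
    + [Edit("B", "plwiki", False)]
    + [Edit("C", "plwiki", False)] * 20
)
visitors = find_visitors(sample, "mediawiki")
print(sorted(visitors))
```

The threshold is the main design choice: a lower `ratio` catches more casual visitors but also starts to include regular contributors of the Flow wiki.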

Summary - methods - calculation - results

Felis domestica, you have moved some sections to make them fit the "summary - methods - calculation - results" system. Why would it be more relevant than the current one, which focuses on analysis first? There are plenty of methods (and cultures) and we welcome any suggestion concerning the plan if we discuss it. :) Thanks, Trizek (WMF) (talk) 14:09, 7 February 2017 (UTC)

Because it is a basic, most default way of writing a research report:
  1. State the question
  2. Review current state of knowledge
  3. Describe methodology
  4. Give results (making clear distinction between results and discussion)
  5. Discuss results
  6. Conclude and show implications
Here the order is Conclusion - Discussion of results - Implications - (results, hidden away in separate document). --Felis domestica (talk) 15:28, 7 February 2017 (UTC)
We have made another choice: give the conclusion first, for people who are not interested in having all details. The more curious you are, the more you read and dig in. Trizek (WMF) (talk) 13:24, 8 February 2017 (UTC)
No reason given why the report has to be presented in this particular "basic, most default way". Are we just supposed to ignore the usefulness to the reader, or ignore how common it is to write reports with another structure? MartinPoulter (talk) 14:53, 14 February 2017 (UTC)

Why wasn't there a question for whether the community wants threaded commenting in Flow?

It seems strange to me to have a survey about a tool that doesn't allow for branched commenting, and which is intended to replace a tool that does allow threaded commenting, without asking the audience whether this design choice is appreciated by the community. Why wasn't the community surveyed on this issue? Does the Flow team think they already know the answer well enough? Was there a fear that a vote that the community wants threaded commenting would actually influence Flow development and make the Flow developers change their roadmap? ChristianKl (talk) 16:35, 15 February 2017 (UTC)

ChristianKl, that question concerning the comments structure has been raised in free-form answers and also in various other feedback for a long time. There is a Phabricator task about it, and the Collaboration team has discussions about that problem from time to time. We have noted that people are asking to have a more "compact" mode on Flow. If Flow is a project for the next fiscal year, we will research the best way to display discussions (threaded, branched...). Trizek (WMF) (talk) 18:33, 16 February 2017 (UTC)

Results

So among a group that is made up of a higher proportion who opted into using FLOW per "distributed to users and communities that had first-hand experience"

The majority of people prefer the wikitext editor. Not very encouraging IMO. Doc James (talk · contribs · email) 17:03, 20 February 2017 (UTC)

Indeed. The only feature which gets large support is the ability to follow individual sections, so it would be worthwhile to invest in a lightweight solution for phabricator:T2738. The other things people care about relate to organising, summarising and refactoring discussions, which confirms earlier findings: this is an area in which wiki pages excel in general although there are some specific tasks which can be made easier (the survey asked about specific solutions to part of the problem). --Nemo 17:00, 7 June 2017 (UTC)

Excluded data

Why was Flow's broken wikitext support excluded from the 2016_Flow_satisfaction_survey,_rawdata.pdf? I know multiple people submitted this issue in the text field. Flow literally can't save wikitext. It has a wikitext simulator that breaks or mangles wikitext. This would explicitly include any responses that mention Parsoid in a negative context, but editors unfamiliar with the technical details could have raised the issue in the form of generalized complaints of faulty wikitext behavior. How many people complained about wikitext issues?

And what else was filtered out of the text responses? Alsee (talk) 00:35, 24 February 2017 (UTC)

Ping Trizek. I see you're still active on this page.[5] Perhaps you could answer my question above? No one has responded in essentially a year. It was severely troubling when the WMF arbitrarily refused to release the actual survey data, and even more troubling when the WMF instead provided faulty or fraudulent "raw data" which appears to have systematically excluded responses regarding a severe flaw of Flow. Alsee (talk) 18:46, 2 February 2018 (UTC)