
User talk:Yair rand/Global Council distribution formula

From Meta, a Wikimedia project coordination wiki

Formula for non-Wikipedia projects


Given that, by design, many people who read Wikidata content and access WikiCommons media don't produce any Wikidata or WikiCommons pageviews, while most people who access Wikisources actually do produce pageviews, I don't think it's a good idea to put more emphasis on readership for non-Wikipedia projects. ChristianKl 17:08, 22 May 2020 (UTC)

@ChristianKl: I don't know how to measure Wikidata or Commons.
I agree that measuring pageviews for Wikidata doesn't make sense, but as I mentioned in the notes, the editor numbers don't work either, and I don't have any other numbers to work with. Since the output of one member for Wikidata more-or-less matches its actual community size (based on things like watchers of the main page, which is usually a decent proxy), I left it in for the demonstration. In practice, we'd certainly want to use something else.
For Commons: Commons actually does get a considerable amount of pageviews, interestingly enough. Commons got nearly twenty-nine million unique visitors last month. Still, most people accessing Commons content aren't generating Commons views, and most editors who are "active" on Commons don't consider themselves Commons editors (or, "part of the Commons community"), so there's a lot of distortion in both directions. Like Wikidata, we'd want to use some other numbers, but I can't think of what. --Yair rand (talk) 15:37, 24 May 2020 (UTC)Reply
It seems to me that editor numbers are better than page views, yet you weighted page views much more strongly than editorship. I would rather use SQRT_3(PageViews*ActiveEditors_5*ActiveEditors_100) than SQRT_3(PageViews*PageViews*ActiveEditors_5). I don't object to using pageviews, only to ranking them higher than editorship. ChristianKl 21:30, 24 May 2020 (UTC)
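As a rough illustration of the two weightings compared above, here is a minimal sketch. Only the two SQRT_3 formulas come from the discussion; the project names and all numbers are made up, and the function names are hypothetical.

```python
# Hypothetical figures: (pageviews, editors with 5+ edits, editors with 100+ edits).
projects = {
    "many_readers": (1_000_000, 100, 10),
    "many_editors": (10_000, 1_000, 100),
}

def cube_root_product(a, b, c):
    """Cube root of a product of three factors (the SQRT_3(...) notation)."""
    return (a * b * c) ** (1 / 3)

def readership_heavy(pageviews, editors_5, editors_100):
    # SQRT_3(PageViews * PageViews * ActiveEditors_5): pageviews counted twice.
    return cube_root_product(pageviews, pageviews, editors_5)

def editorship_heavy(pageviews, editors_5, editors_100):
    # SQRT_3(PageViews * ActiveEditors_5 * ActiveEditors_100): editors counted twice.
    return cube_root_product(pageviews, editors_5, editors_100)

for name, stats in projects.items():
    print(name, round(readership_heavy(*stats)), round(editorship_heavy(*stats)))
```

With these made-up inputs the two projects score the same under the editorship-heavy variant, while the readership-heavy variant favours the project with more pageviews. Real project data could behave differently.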

Formula for Wikipedia projects


I don't see a good reason to weight the number of speakers more strongly than the number of page views. Why do you think it should be weighted higher? ChristianKl 17:08, 22 May 2020 (UTC)

@ChristianKl: The idea is to fit the aspirational "represent communities we wish to serve" bit. While I think that giving some weight to native speakers is a good idea, I admit that I put it higher than I would have otherwise because the Strategy people and WMF seem to be pushing hard for that kind of representation, and the worst-case scenario would be if they just decide to appoint the membership instead of permitting at least a sizable community majority. --Yair rand (talk) 15:37, 24 May 2020 (UTC)Reply
@Yair rand: Wikiproject Medicine is a community project that does serve some people in Africa in the developed world but Swahili Wikipedia likely wouldn't speak up for the interests of Wikiproject Medicine.
It makes more sense to me to speak of representation when someone actually is able to present editorship or has the trust of readers who visit content. From my perspective it makes more sense to have some of the 25% of Global Council seats that go to WMF/affiliates. It seems wrong to me to give the Chinese Wikipedia more seats than the German one, given that the German one is serving more users and also soliciting more donations. ChristianKl 21:42, 24 May 2020 (UTC)

Raw data


It would be great to have the raw data for your calculations in a table so that it's easy for other people to run their own numbers. ChristianKl17:08, 22 May 2020 (UTC)Reply

@ChristianKl: Posted here. --Yair rand (talk) 15:37, 24 May 2020 (UTC)Reply
I'm not sure native speaker counts are what matters. Plenty of people read Wikipedias in their second language. ChristianKl 22:32, 24 May 2020 (UTC)
I can think of two counter-arguments:
  • We generally make a priority of allowing people to read content in their own, native, language. Making content available in a second language is much better than nothing, but it's not considered completely ideal.
  • Supposing we have a situation where group A speaks language A as a first language and B as a second, and group B speaks B natively but also speaks A as a second language, should each edition count as having the speaker population of both groups combined? It feels vaguely like double-counting.
Still, I don't think it would be reasonable to discount second-language speakers entirely. Perhaps with half the weight of the count of native speakers, or something like that. --Yair rand (talk) 00:25, 28 May 2020 (UTC)Reply
I think there's a problem with the council having multiple roles. On the one hand, new WikiProjects need more monetary support per active editor/page view than established projects, and I don't see a problem with a structure that leads to such support.
On the other hand, the established Wikimedia community has certain values, and giving people outside of the established Wikimedia community a lot of power to reshape our norms doesn't feel proper. A majority of the world believes, for example, that blasphemy is a huge problem. This leads to UN resolutions such as the anti-blasphemy resolution. If we had a council that seeks to provide equity to non-Western views, and allowed the council to set norms that forbid blasphemy in a way that forces the major Wikipedias to change their content, that would feel very problematic to me. German doesn't have many speakers but has a Wikipedia community that effectively governs itself. It seems alright for the German Wikipedia community to raise more money than it spends, with the excess going to support the development of a new Wikimedia community in a currently underserved community, but it feels improper if a bunch of underserved communities can enforce anti-blasphemy rules that force the German Wikipedia to change content.
Apart from it being directly a bad outcome if DeWiki or EnWiki suddenly have to treat blasphemy the way the majority of the world believes blasphemy should be treated, it will also cause a lot of internal conflict, as neither the DeWiki nor the EnWiki community will be very welcoming of another body enforcing norms on it. If you actually want peace between stakeholders, it makes much more sense to staff a global council in a way that is representative of the views of the current community. ChristianKl 11:33, 29 May 2020 (UTC)

Affiliate distribution: Preliminary thoughts


Some preliminary thoughts on how to distribute representation among the affiliate groups. We have 174 affiliate organizations, of drastically differing sizes and scopes. Some data points that could be used:

  • Membership counts. Higher membership, more representatives, (maybe capping WMDE if otherwise everything gets messed up). This sounds intuitively reasonable. However, if this is a large part of what determines representation, I'd give it less than a few years before many affiliates start putting ever-increasing efforts into membership drives. This is a real thing that happens at other umbrella organizations that use this method, and it's not pretty.
  • Number of supported editors. Countries - how many editors in the country. Languages, sister projects - how many editors. Not all affiliates have such associated groups, however. Some are about particular goals or efforts.
  • The general population of the associated country/region/language. Similar to above, not all affiliates have that.
  • We could attempt to measure how much the groups actually accomplish. This is difficult, and trying that kind of thing often tends to end up easy to game. It's also important not to incentivise spending as much money as possible.
  • How much money they bring in. (I don't like this idea.)
  • Number of affiliates, raw count. We could just say each affiliate has one vote, and divide things however, like they do with the board seats. That is ... problematic, for various reasons.
  • Some sort of measure of importance as determined by the editors. Looking for general editor satisfaction that something was accomplished by the affiliate. Maybe something to do with one of the other recommendations regarding measurement of such things. Maybe during the regular survey, ask which country/language, then if there's an associated affiliate ask if the editor approves of the affiliate's work.
  • How many people from the affiliate vote. There could just be general affiliate-member elections, selecting from a broad pool of candidates. Problem: We'd probably have to exclude affiliate voters from the regular project votes, or it would feel really unfair. (Might feel unfair anyway, tbh.) This may cause an increasing gap between non-affiliated and affiliated.
  • Some convoluted combination of the above, set up so that it's difficult to deliberately change the outcome by taking specific actions.

Assuming we end up having regional/thematic hubs, it may be worthwhile to specifically have representation both from the hubs and the affiliates themselves, not relying on hub appointees to represent the affiliates. Some geographic affiliates have overlapping areas. Some affiliates have relevant expertise/experience which would make it helpful to have a representative of theirs, independent of the general importance of representing the group.

The above points are not yet sufficient for building a reasonable method out of, I think. --Yair rand (talk) 05:06, 3 August 2020 (UTC)Reply

Your formula implies 30-40 seats for affiliate members. That'd give them at least five times more representatives than the next biggest source (en-wp) and fifteen to twenty times more representatives than important constituencies like fr-wp and de-wp. That's going to be a very big, cohesive interest bloc on an otherwise extremely fragmented Council. I'd expect them to dominate proceedings as a matter of course. --RaiderAspect (talk) 10:19, 6 August 2020 (UTC)
@RaiderAspect: As I suggested on the page itself, I would expect the non-elected seats to be divided between affiliate-selected and those appointed by the Council itself. I would imagine something like 15-25 seats for affiliate-selected. --Yair rand (talk) 17:34, 6 August 2020 (UTC)Reply
I'd missed this at the time, but *both* of those are also wildly OTT. The affiliate representation, at best, should be 5%, given that it enables double-representation. The appointed seats should also not be larger than 5% and I'm at a loss on why there should be any appointed voting seats. The representatives can just ask for expert opinions if they are needed. Nosebagbear (talk) 23:07, 7 June 2021 (UTC)Reply

Fairly extreme degressive representation


As someone who already feels that the WMF would rather sidestep direct consultation with individuals and shift to communicating primarily with affiliates, and now this Council, I think this system is going to make things harder for those individuals. I have some additional concerns.

One is the sheer scale of degressive proportionality (particularly when layering methods) - a full square root for Wikipedia absolutely hammers representation for certain groups. There's making sure small languages/projects aren't left with no representation, and then there's the case that my "vote" as an en-wiki editor is so much smaller.

On top of this we have both readers and speakers - I feel that having both of these shifts the weighting away from current editors. Retention is more critical, so having one or the other is okay, but not both.

The presence of so many affiliate spots + a number of other appointed spots, means that free-standing members of large wikipedias have their comparative voice reduced 5 times - why not use the appointed seats to be responsible for balancing representation, or feel the earlier stages did it sufficiently and avoid more than a handful of appointed seats for critical needs?

And in relation to some comments above, if we are deliberately weighting our mechanism out of fear that the Board might dictate the method, that's also concerning - we shouldn't let fear of poor judgement on their part make us pick ourselves a poor mechanism. Something designed to work with the WMF shouldn't give the WMF any say in its structure, or we are handicapping its countervailing weight for the future. Fiat justitia ruat caelum - if they want to refuse a purely community-based approach, then let them say so, as that would be right. Nosebagbear (talk) 14:48, 8 November 2020 (UTC)

Re affiliate and appointed seats: The Recommendation states "It will be composed of both elected and selected members", and doesn't get any more specific on the balance. It would be technically satisfied by, say, 90 elected and 5 selected, but I don't think the relevant folks would accept that. Regarding using the appointed seats to balance language representation: It's doable, I suppose, but I suspect that we'd be much better off with the kind of person that the Hindi Wikipedia would elect as their representative rather than, well, the kind of person the relevant other group(s) might select for their representation, if you get my drift.
Re underrepresentation for larger projects: Yes, the larger Wikipedias are underrepresented. The English Wikipedia in particular is drastically underrepresented in the formula relative to its proportion of editors. (In the most extreme case, there are a couple of languages where an editor's vote would count effectively more than 20 times that of an English Wikipedian's vote.) Part of what led to this in my proposal is the need for representation for groups like sister projects (which are unfortunately often ignored in decision-making), emerging project languages (per the Strategy Recommendations), and smaller projects (About 30% of Wikipedians work in smaller language editions that each have less than 2% of all Wikipedians, and I feel that it's very important that we avoid having a lot of contributors that have zero representatives speaking a language they understand). (The readership stat, incidentally, benefits ENWP. ENWP has a greater portion of the readership than it has of the editorship.) But also, there is the simple difficulty that I don't think the English Wikipedia would be able to elect, say, 20 representatives in an informed manner. There would simply be too many candidates for any voter to evaluate, in my opinion. (A way around that would be to decrease the overall size of the GC, but that would again cause problems in representation for other languages and projects.) How many representatives do you think ENWP would be able to elect as a group? --Yair rand (talk) 22:00, 23 November 2020 (UTC)Reply
@Yair rand: I'm rather concerned that inability to vet that many candidates is factoring in, but if it is, your concern seems unfairly focused on the "too many to examine" component as opposed to whether small communities will be able to pick their candidate, even where a number of them need to be grouped together in disparate languages, hindering any vetting of their candidates for their seat, not to mention several wikis with major governance issues. To answer your specific question, 20 representatives (so, say, 35 candidates) would be a lot for a single election, but picking 10, on a biennial basis, would be well within our abilities.
To me, having "a couple of languages where an editor's vote would count effectively more than 20 times that of an English Wikipedian's vote" in order to try and promote fairness is, genuinely, morally outrageous. When trying to weigh up unfairnesses (now a word), their impact should be assessed not merely against each other, but against the status quo - that is, no extra structure.
Having my electoral weight shrunk to 5% is so egregiously unfair that it would be significantly less damaging to just not have any new structure. I don't think it could be reasonably accepted to have anyone's vote count more than twice that of another's.
If that wasn't going to be done, the only way I could see to avoid it would be to use a method that for any decision to pass the GC, it would need a majority both of representatives, and of what I might call their "non-degressive proportion" or their "represented populace" (in effect, if there wasn't any weighting occurring beyond that of editor/reader numbers, would there be a majority).
This would prevent both tyranny of the majority (clearly a concern of the small-language wikis) and tyranny of the minority (something enabled by any regressive scheme, including your proposed one) Nosebagbear (talk) 15:02, 2 December 2020 (UTC)Reply
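The double-majority rule described above could be sketched roughly as follows. This is only an illustration: the constituencies, seat counts and editor shares are invented, the function name is hypothetical, and it assumes each constituency's representatives vote en bloc, which is a simplification.

```python
def passes_double_majority(votes, seats, editor_share):
    """Return True only if the 'yes' side holds a majority of seats AND
    a majority of the represented editor population.

    votes: constituency -> True/False (en bloc voting is assumed here).
    seats: constituency -> number of representatives.
    editor_share: constituency -> unweighted share of editors (any unit).
    """
    seats_for = sum(seats[c] for c, yes in votes.items() if yes)
    share_for = sum(editor_share[c] for c, yes in votes.items() if yes)
    return (2 * seats_for > sum(seats.values())
            and 2 * share_for > sum(editor_share.values()))

# Invented example: 8 seats in total, editor shares adding up to 100.
seats = {"large": 3, "mid": 2, "small_a": 1, "small_b": 1, "small_c": 1}
editor_share = {"large": 60, "mid": 20, "small_a": 8, "small_b": 7, "small_c": 5}

def votes(*in_favour):
    return {c: c in in_favour for c in seats}

# Large wiki alone: majority of editors, but only 3 of 8 seats.
print(passes_double_majority(votes("large"), seats, editor_share))
# Everyone except the large wiki: 5 of 8 seats, but only 40% of editors.
print(passes_double_majority(votes("mid", "small_a", "small_b", "small_c"), seats, editor_share))
# Large plus mid: 5 of 8 seats and 80% of editors.
print(passes_double_majority(votes("large", "mid"), seats, editor_share))
```

In this toy setup neither the large wiki alone nor the smaller constituencies alone can pass a decision. Only a coalition with both majorities can.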
@Nosebagbear: Okay, let's try some experimentation with some formulas. New idea #1: Switch from the D'Hondt method to the Sainte-Laguë method (effectively making it easier to get that first seat), lower the degressiveness somewhat (interlanguage from ^0.5 to ^0.75, interproject from ^0.75 to ^0.85), even the balance between the editors-alone variable and the mixed (by removing the *3), and use whatever second-language-speaker estimates are easily available, and merge the West Asia and North/Central Asia groups so they don't go without spots... Output is 49 for Wikipedia (10 for en, 3 each for es, de, fr, 2 each for es, ru, ja, minor CEE languages, 1 each for ar, pt, it, hi, fa, pl, ko, tr, vi, nl, id, bn, uk, he, cs, minor languages from Northern Europe, South Asia, ESEAP, Africa, North/Central/West Asia, and Everything Else), 3 WT (1 en, 2 others), 2 Commons, 1 each for WB, WS, WQ, WD, Mediawiki, and all others.
Hm, moderate loss from a "representative of the world" perspective (Asian languages dropped to 16), but nobody lost their spot entirely (except kinda C/N Asia having to share w/ W Asia), and a considerable boost to editor representativeness with the large Wikipedias, but not making it anywhere near avoiding having votes from one project count more than twice that of another's.
I'm uncertain about whether this is an improvement overall. What do you think? --Yair rand (talk) 08:10, 7 January 2021 (UTC)Reply
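To show how the two changes discussed above interact (switching from D'Hondt to Sainte-Laguë, and raising the degressive exponent from 0.5 to 0.75), here is a small sketch. It does not reproduce the actual formula or its inputs: the three constituencies, their weights and the seat count are invented, and the function names are hypothetical.

```python
def allocate(weights, seats, method="sainte_lague"):
    """Highest-averages seat allocation.

    D'Hondt divides a weight by 1, 2, 3, ... as seats are won;
    Sainte-Laguë divides by 1, 3, 5, ..., which favours a first seat.
    """
    won = {name: 0 for name in weights}
    for _ in range(seats):
        def quotient(name):
            s = won[name]
            divisor = s + 1 if method == "dhondt" else 2 * s + 1
            return weights[name] / divisor
        winner = max(weights, key=quotient)
        won[winner] += 1
    return won

def degressive(counts, exponent):
    """Raise each count to a power below 1 to compress large constituencies."""
    return {name: n ** exponent for name, n in counts.items()}

# Invented editor counts for three constituencies, 10 seats to hand out.
editors = {"large": 1000, "medium": 300, "small": 90}

print(allocate(editors, 10, "dhondt"))
print(allocate(editors, 10, "sainte_lague"))
print(allocate(degressive(editors, 0.5), 10, "sainte_lague"))
print(allocate(degressive(editors, 0.75), 10, "sainte_lague"))
```

With these invented numbers, the smallest constituency gets no seat under D'Hondt but one under Sainte-Laguë, and moving the exponent from 0.5 to 0.75 shifts a seat back towards the largest constituency.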
Hi @Yair rand:, it certainly seems better at balancing actual number representation vs ensuring general representation. I'm not sure why "not making it anywhere near avoiding having votes from one project count more than twice that of another" is an issue - where has that requirement come from? It seems wholly arbitrary. Though I do get surprised that WD has such a (comparatively) small editor base Nosebagbear (talk) 09:23, 7 January 2021 (UTC)Reply
@Nosebagbear: I was referencing your point above regarding avoiding having anyone's vote count more than twice that of another's. --Yair rand (talk) 10:01, 7 January 2021 (UTC)Reply
Sorry, @Yair rand: - I interpreted your most recent use as meaning "no project with more than twice the representation (as in, total reps) than another" - which seemed wildly off, and I should have sense-checked that wasn't what you meant. Blame the lack of morning tea. Hmm, I've not re-done the maths using this set - what's the biggest imbalance here (in terms of smallest wiki to get a whole rep vs, presumably, en-wiki)?
@Nosebagbear: (Going by the mixed active-editor/very-active-editor metrics described on the page.) The smallest single wikis to get representatives here are, by a fairly large margin, the Hindi and Bengali Wikipedias (each with roughly 0.4% of Wikipedia's editors, each getting 1 rep). Including groups of wikis, the African languages group is even smaller, at 0.28%. At 1 seat, and barely a thousand active editors and only 36 very active editors, the African languages group is about 13X overrepresented relative to the English Wikipedia. --Yair rand (talk) 07:12, 11 January 2021 (UTC)Reply
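The overrepresentation figure above is a ratio of seats per unit of editor share. A minimal sketch of that arithmetic follows. The 0.28% share, the single seat, the 10 seats for en and the "about 13X" figure are taken from the thread, but the English Wikipedia's share on the mixed metric is not stated there, so the value computed below is only what those numbers would imply, not a sourced statistic. The function names are hypothetical.

```python
def overrepresentation(seats_a, share_a, seats_b, share_b):
    """How many times more seats per unit of editor share A has than B."""
    return (seats_a / share_a) / (seats_b / share_b)

def implied_share(ratio, seats_a, share_a, seats_b):
    """Editor share B would need for A to be `ratio` times overrepresented."""
    return ratio * seats_b * share_a / seats_a

# African languages group: 1 seat at 0.28% of editors; en: 10 seats.
en_share = implied_share(13, seats_a=1, share_a=0.28, seats_b=10)
print(en_share)  # implied en share in percent, an inference rather than a quoted figure

# Same comparison for a 0.4% wiki with one seat (Hindi or Bengali in the thread).
print(overrepresentation(1, 0.4, 10, en_share))
```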
Thanks for providing the numbers, I'll have a think about where I stand on this variant. Nosebagbear (talk) 09:06, 11 January 2021 (UTC)Reply
@Yair rand: having had a more detailed thought, I still am inclined to think that this representation would be severely problematic for IGC (certainly to me it seems dubious how it could claim a mandate when it is so failing at adequate representation), but is most likely to be unacceptable and cause major blowback if actually inserted as a level of governance and a permanent structure.
Both an IGC and any permanent GC should be able to demonstrate support for its existence from 2/3 of both local communities and total editor count. An election of representatives must therefore also have a separate question (not an option in the election) as to whether the local community supports its creation/existence. The permanent body must also have a means of forcing votes by its reps. Nosebagbear (talk) 21:15, 5 March 2021 (UTC)Reply
@ChristianKl: You made a similar point earlier on issues with the formula's treatment of large Wikipedias. Do you have an opinion on whether the variant formula above might be sufficient to deal with the issue?
Also, anyone have any opinions on whether something like this might work as a distribution formula for the Interim Global Council? --Yair rand (talk) 15:27, 4 March 2021 (UTC)Reply
I still see absolutely no reason to make readership twice as important as editorship for Wikidata/Commons (and likely soon WikiFunctions). The distribution for the individual Wikipedias looks more reasonable to me. ChristianKl18:15, 5 March 2021 (UTC)Reply
I'd add modifiers for zh (hampered by censorship, hugely important in present and future), WD + Mediawiki (drastically underrepresented here, given the leverage of that work), and WF + Commons (may be slightly underrepresented, given same). –SJ talk  03:05, 13 June 2021 (UTC)Reply

Additional variables


People have lots of ideas for what variables matter. I suppose that eventually we need a list. Here are my ideas, some of which repeat what others have said.

  1. More trusted, because of past demonstration of alignment with wiki values. Contrast with less trusted, like a non-wiki software developer, finance person, or demographic representative with high skills but unknown values.
  2. Bringing money into the wiki movement
  3. Important but not on a schedule. People with highly specialized skills, like for minority indigenous languages or technical expertise, may be in yet another council pool of many such people and rotate into a reserved seat for important but not urgent non-continuous representation.
  4. Highly active Wikimedia participants with little or no account activity. Event organizers, fundraisers, and organizational administrators are in this group.
  5. Highly active aligned non-Wikimedia contributors. OpenStreetMap, Internet Archive, Mozilla, Code for All, other large international nonprofit organizations, and other community non-corporate media organizations may be in this group. Wikipedia depends on their content, and there could be a pool of such people who rotate into occasional representation.
  6. University affiliation - I single out universities as the primary non-Wikimedia partner for which to give representation.
  7. Donor seat - outright put seats for sale for an aspirational price to sponsor the global council, at least 1, and leave it vacant if not sold.
  8. unaffiliated Wikimedians - highly active individuals who are in the majority of contributors who have no history of Wikimedia community organizational affiliation, but who obviously know the culture and values
  9. subject matter experts for short term challenges - we could identify the most popular content categories in Wikimedia every few years and make appointments of experts in those fields to fill that gap
  10. subject matter experts for long term challenges - some challenges may take decades to address, and subject matter experts for such problems should not be overly competitive with people who get allocations for challenges which we can resolve in years
  11. specialty expertise - there are some subjects which non-professionals will not discuss or be able to understand. Finance, administration, legal counsel, lobbying, code development, and harassment/threat response come to mind. Either the council needs continuous representation or a pre-negotiated budget and resource allocation to grant reasonable access to all of these.

Blue Rasberry (talk) 14:32, 18 November 2020 (UTC)Reply

Let's set the precedent for requesting crazy invasive demographics


To be a candidate for the global council everyone has to respond to a demographic survey. All things equal, the Wikimedia community requests survey completeness from respondents, but every respondent is free to decline any question. All questions get answered in Wikidata or similar and become queryable. Questions include everything in the Wikidata data model of a biography

  • d:E:E10 - humans, but we with Wikidata can develop this more

I would also like institutional affiliations, more and more specific info about gender and sexuality, income, location affiliations, and I am not sure what else. To be discussed, but we should plan for something rather than let it happen organically.

Blue Rasberry (talk) 14:32, 18 November 2020 (UTC)Reply

We currently allow people to indicate gender in their preferences, and most projects invite users to display Babel info on their userpages. For more general demographic info, I think it would be problematic to ask it of people. During the first strategy iteration, it was proposed to allow users to (optionally) fill in "parameterized self-declared characteristics", and it was not received very well. Privacy and anonymity are important concerns, and even a vague expectation that people should provide personal information could be damaging. --Yair rand (talk) 03:56, 2 December 2020 (UTC)Reply
If you give favors to people for personal characteristics they disclose you are effectively discriminating against people who live in situations where their privacy is very important to them because they might face negative consequences when the information about them is known. ChristianKl01:55, 16 January 2021 (UTC)Reply
@Yair rand and ChristianKl: The Wikimedia Movement is making certain commitments to diversity. You both are familiar with the media narratives that Wikimedia is mostly privileged, white, male, Western. As soon as the Global Council is established, I predict that the ongoing discussions about diversity in the Wikimedia Movement will lead to immediate requests for demographic data of the Global Council's members. If we get such questions, then how do you think we should answer them? Here are some possible questions:
What percentage of the council are..
  1. non-male?
  2. non-Western?
  3. from lower/middle income countries?
  4. young/students?
  5. non-English speakers?
To what extent do you think that we can plan to establish the council without being able to answer such questions? Also, if the council somehow favored the appointment of people of certain demographics, should anyone be able to detect that and if so how? Blue Rasberry (talk) 15:13, 17 January 2021 (UTC)Reply
@Bluerasberry: Asking for the stats will change the stats in exactly the direction we'd want to avoid; some people from the relevant groups will not run if revealing such data is requested. We could likely have a decent estimate of those from non-Western and lower-middle income countries simply from the primary languages/projects of the users, but for many other stats like age, I think we'll just have to do without knowing exact stats. Many people do provide certain personal info without prompting, which perhaps could be compiled into imprecise datasets, but I'd advise against implying that members should reveal any amount of their personal data. --Yair rand (talk) 06:23, 22 January 2021 (UTC)Reply

Your source says your distribution is invalid


Penrose method says:

A precondition for the appropriateness of the method is en bloc voting

Your own source says that using Penrose here is invalid, without even mentioning the criticism section. For the record, it looks like you propose ~33% of the editing population get ~6% representation, and ~50% of the editing population get ~17% representation.

I also agree with the various objections posted above by others.

Between this, the Foundation's Strategy to gut our Notability/Verifiability/ReliableSourcing and other policies, and some other things from the Foundation, it almost seems like the Foundation is trying to pile up things to provoke the community into full rebellion. Alsee (talk) 08:14, 23 November 2020 (UTC)

@Alsee: The purpose of using the Penrose method here is of course rather different than the typical reason. It is serving the purpose of ensuring smaller-project representation, rather than for dealing with bloc voting, but I think it's still clearer to mention/link the method than to write out an explanation of taking the square root of each output.
I am not, and have never been, a WMF employee, by the way. Nor part of any of the Strategy groups. This is just a single proposal by an individual volunteer. I encourage others to draft up their own proposals, and hopefully we'll be able to arrive at a consensus position for GC distribution. --Yair rand (talk) 22:00, 23 November 2020 (UTC)Reply
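For readers unfamiliar with the square-root step mentioned above, here is a minimal sketch of what taking the square root does to relative shares. The two group sizes are invented and the function name is hypothetical. It shows only the square-root compression, not the full proposal.

```python
import math

def square_root_shares(sizes):
    """Share of representation when weights are the square roots of group sizes."""
    roots = {name: math.sqrt(n) for name, n in sizes.items()}
    total = sum(roots.values())
    return {name: r / total for name, r in roots.items()}

# Invented sizes: the large group has 90% of the population.
sizes = {"large": 900, "small": 100}
print(square_root_shares(sizes))
```

Here the group with 90% of the population ends up with 75% of the weight, which is the kind of compression towards smaller groups that the proposal relies on and that the objection above is about.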

How would you characterize the parties likely to form from your proposed seat distribution?


I am compelled by the mathematical purity of your argument. Can you please try to visualize how it might play out in practice and describe the kinds of parties that might form from it? If they are non-optimal, what kind of patches might work best? 05:08, 7 June 2021 (UTC)Reply