Talk:Wikimedia Enterprise/Archives/2020

From Meta, a Wikimedia project coordination wiki

"creating undue reporting burdens"

Does this imply that there will be even less transparency around this than the financial reporting of the Wikimedia Foundation? Ainali talkcontributions 06:46, 9 July 2020 (UTC)

Hey @Ainali:. So my understanding is that the LLC's activities will be reflected in the Foundation's Form 990 (our annual US tax filing for those unfamiliar) as well as our audited financial report. If we were to change this structure in the future, then we might need to file additional US tax filings. Seddon (WMF) (talk) 12:29, 10 July 2020 (UTC)
That was my fear, because I suspect that will reduce it to one single line item and not detail the various departments, projects and "products" within the LLC. Ainali talkcontributions 12:49, 10 July 2020 (UTC)

Risk assessment

I'm surprised there is no mention of the risk assessment with such a project. Can the team please publish whether one has been done, if it exists and can be pointed to, or if there are plans for one? -- Fuzheado (talk) 16:01, 9 July 2020 (UTC)

In terms of a long-term risk assessment of the project, including factors that may impact its ability to fund the Wikimedia movement, it's a little early. Right now it's so early in the project that we are simply exploring whether it is possible to do this. I think at least 50% of our traffic comes from search engines, though it's potentially more due to the weird way referrals on mobile and apps work. Since companies are already integrating us into their products and personal assistants, and this kind of integration continues to move further and further in that direction, this is in itself a risk mitigation activity and a response to the potential threat that big tech companies devalue our placement in their search results. Naturally, if this turns out to be something of merit in the long term, we will need to do a deeper dive into what happens if that provision switches to some other 3rd party. Seddon (WMF) (talk) 05:30, 13 July 2020 (UTC)


I understand that naming is hard, but this name is terrible. "OK API" makes it sound like we're providing a not-OK API to non-paying users, which is a terrible precedent to set. And "ocean of knowledge" is an Orwellian "ministry of truth" name given the project is about reducing (gratis) access to information. Nemo 16:23, 14 July 2020 (UTC)

@Nemo bis: Yeah, totally fair comments. So internally we pronounce it "Oh-kapi", in line with Okapi, which does at least sound much better. Being totally honest, we needed to give it some sort of name to be able to refer to it in public, and there were a number of other bad options. We liked the Okapi especially as it's an odd-looking animal, and from the country of one of our team members. Totally acknowledge ocean of knowledge was difficult to say with a straight face. Which makes me sad because it was partly my idea. Sorta wish we had gone with Oodles of Knowledge. We had actually been talking about this as a team and we are definitely dropping "oceans of knowledge". Right now it's OK as in open knowledge, but we might just use Okapi as a sort of internal code name until something better is come up with by someone who is exponentially better at naming things than we are. I think I'll move the page to Okapi (Data services) to reduce the awkward pronunciation you mention. Seddon (WMF) (talk) 03:19, 18 July 2020 (UTC)
Real-life okapi look nicer than gnu(s), so why not? I was taken aback when I read "open knowledge API", considering that Open Knowledge Foundation is already a thing. (Perhaps if the rebranding effort goes through, it should be "WikiPedia Paid API (WiPPAPI)"?) "Oceans of knowledge" seems to overstate the case, but I could totally legit get on board with "oodles". For what it’s worth, I read the acronym as "Okapi" primarily and "Okay API" secondarily, but then I am familiar with the animal and not everyone in the world would be. Perhaps writing it in title-case would help, or having a well-crafted mascot graphic. Anyway, thank you for sharing the background with us, Seddon (W?F). One day there might be a Wikipedia article that starts with "Okapi (open, ocean of, or oodles of knowledge API[1]) is a data interface standard that …" where [1] references this post! Best wishes, Pelagic (talk) 14:14, 28 July 2020 (UTC)

Development practices

To ensure quality, are standard development practices going to be followed? For instance the mw:Wikimedia Product Guidance. Nemo 16:25, 14 July 2020 (UTC)

From what I've seen of how the team is working, the way we are approaching the project, and also the way we are acting upon the feedback we are getting, I would say that we are increasingly aligned with the WM product guidelines. We haven't actively spoken about that set of pages you've linked to, and I will do so with the team this week. Seddon (WMF) (talk) 03:25, 18 July 2020 (UTC)

Corporate structure

I actually like the idea of setting up a separate entity to do this. However, it only makes sense if the expenses are really properly segregated: it's going to be very hard to account for the cost of the time of the very many WMF employees and contractors I've already seen working on this to some extent.

Did the WMF discuss corporate structure with other friendly non-profits which have similar issues? The most obvious is Mozilla Foundation with its Mozilla corporation and others (I believe they had an entity for Thunderbird at some point); but there are also other non-profits with rather complex branches (if you consider 501(c)(6) too, Linux Foundation?) and for-profits with non-profit ties (Red Hat?). Let's not reinvent the wheel. Nemo 16:37, 14 July 2020 (UTC)

@Nemo bis: I'm not an accountant so I'll speak with the team. I do know that right now a lot of the details are still being worked on and that we have sought external counsel on what our approach to this should be, so there is no reinventing the wheel; we are definitely aiming to follow industry practices on this. I also believe that the very early work on this examined our fellow free knowledge/open source orgs. My understanding is that there are a number of ways to approach this and they also change with the scale and maturity of the organisation. Right now, my understanding is that an LLC is appropriate as a lightweight setup for starting this, but it may be appropriate to evolve in future years. Seddon (WMF) (talk) 02:57, 18 July 2020 (UTC)
@Nemo bis: just to add that we are planning on speaking with other partners in this area: OSM/Humanitarian OSM, Mozilla etc. Kate Chapman is ex-OSM, was involved in the spin-up of Humanitarian OSM, and is on WMF staff, which is extremely helpful to us. I know Mozilla is a slightly different beast and far more mature, so it might be an insight into the future, but they are on our discovery list of people to speak with. Seddon (WMF) (talk) 15:46, 22 July 2020 (UTC)

Interoperability and standards

mw:OKAPI#Hosting_locations claims that «We're prototyping using AWS because it's faster». I have no idea what this means: AWS is by no means "faster" than anything else; it's just like any other provider. I assume it means "we use AWS because we like it/we know it". That's understandable.

What really matters to me is whether the team is going to be instructed to only use fully free software tooling and open standards, so that the infrastructure can be moved to any other hosting provider at any moment without changing the code. The easiest way is to avoid using basically any of the AWS products (there are about 200), given most of them have some proprietary spin; but if you're super careful you might find that some can be used with open/standard interfaces. Is the chosen team proficient with moving things from AWS to other hosting providers, or are they going to need help finding out what the requirements are?

Simple version: make sure all the work on the AWS side is performed with orchestration tooling which can just be configured to point to another hosting provider and work all the same. Be it simple Ansible scripts which do things on remote machines, or very complex usage of something like the OpenStack API, you get it. Nemo 16:48, 14 July 2020 (UTC)

@Nemo bis: so yes, you are right about my use of the word "faster": AWS was what the team was most familiar with, and given that we are conservatively planning for not being able to utilise WMF hardware for our biggest users, utilising AWS was the fastest route to prototyping. I'll add some clarification there. I know that the aim is to try and make anything we use platform agnostic, but let me speak with the team and see if I can get you an answer that isn't just waffle on my part. Seddon (WMF) (talk) 02:41, 18 July 2020 (UTC)
@Nemo bis: just a follow-up on this after speaking with the product team. We have purposefully containerized the entire application using Docker to allow flexibility of infrastructure as well as remove any dependencies on AWS. This was identified as a requirement pretty early on, so things have been designed in a way to avoid obvious pitfalls like becoming dependent on RDS or S3. We've created abstracted services to work with anything that may come up. This would provide us the future ability to plug in any storage that is required, though right now it's definitely not quite plug and play because of how early we are in the prototype stage. In terms of orchestration, again we have not yet needed to cross that path, but we will ensure that we follow the same kind of approach. Seddon (WMF) (talk) 16:34, 23 July 2020 (UTC)
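
Editorial illustration (not part of the original thread): the "abstracted services" idea mentioned above could look roughly like the following minimal Python sketch. It is not the Okapi team's code; every name in it (BlobStore, InMemoryStore, LocalDiskStore, save_snapshot, load_snapshot) is hypothetical. The point is only that application code talks to a small interface, so a provider-specific backend could later be swapped in without changing callers.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class BlobStore(ABC):
    """Hypothetical storage interface; application code depends only on this."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None:
        """Store data under key, overwriting any previous value."""

    @abstractmethod
    def get(self, key: str) -> bytes:
        """Return the data previously stored under key."""


class InMemoryStore(BlobStore):
    """Backend for tests and prototypes; keeps everything in a dict."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]


class LocalDiskStore(BlobStore):
    """Backend that writes files under a root directory on any host."""

    def __init__(self, root):
        self._root = Path(root)

    def put(self, key, data):
        path = self._root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key):
        return (self._root / key).read_bytes()


def save_snapshot(store: BlobStore, title: str, html: bytes) -> None:
    # Application logic: knows nothing about which backend is in use.
    store.put(f"snapshots/{title}.html", html)


def load_snapshot(store: BlobStore, title: str) -> bytes:
    return store.get(f"snapshots/{title}.html")
```

Under this assumption, moving between hosting providers would mean writing one more BlobStore subclass and changing the line that constructs the store; save_snapshot and load_snapshot would stay as they are.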


I suggest going through the text again with Writing clearly in mind. Once the text is sufficiently understandable, it will be useful to mark it for translation. To start with: there are definitely too many negations in "We don't expect to charge for access to our APIs for most, if not all users". No idea what this was meant to convey (maybe "if any users"?). Better to turn it into a positive sentence. Nemo 16:52, 14 July 2020 (UTC)

@Nemo bis: yeah agreed. At the moment there are a lot of unknowns but I'll give it a rewrite later this month. Thanks for pointing this out! Seddon (WMF) (talk) 02:21, 18 July 2020 (UTC)

Community Input section

"Given the nature of the project, primary decision making on this project will rest with the " it says right now. I'm dying of suspense over here, what's the rest? :-) -- ArielGlenn (talk) 08:27, 15 July 2020 (UTC)

@ArielGlenn: yes! I've expanded the section a little. At least so there are complete sentences :) Seddon (WMF) (talk) 02:24, 18 July 2020 (UTC)

High-volume? Most users?

What is high-volume and what is "most users"? My collective anti-vandalism tool can send (via the user's browser, JavaScript; without bot rights) 1-20 requests to WMF's API per second. Will I or my users have to pay for this? :D Should I continue developing this tool, or does it not make sense? - Iluvatar (talk) 18:43, 19 July 2020 (UTC)

@Iluvatar: - I don't have a specific number to hand right now, but I am 100% certain you will not have to pay for your tool. In fact, all existing tools will continue to use the main API offerings, which will always remain free to use and are also undergoing improvements and investment. So definitely keep investing your time in your tool. Okapi is in addition to the Wikimedia API suite, not a replacement for it. It's also expected that community developers will likely be able to use and benefit from Okapi free of charge. Seddon (WMF) (talk) 19:05, 19 July 2020 (UTC)

Business development plan?

We are way past the dates in the table, but no business development plan has been presented for community review yet and no update on the delay has been given. Is this project on hold? Ainali talkcontributions 22:07, 23 November 2020 (UTC)

@Ainali: Hey, less on hold and more that our timetable has massively slid. Partly due to the project readjusting a little in scope and also partly due to capacity. We are doing another round of focus groups and we are planning on publishing a position essay in January. I'll update the timeline with the new expected dates. Seddon (talk) 22:16, 23 November 2020 (UTC)