- The following discussion is closed. Please do not modify it.
- Most likely, new comments will not be taken into account by the three new Working Group members in their work of developing the final Recommendations. You are free, however, to continue discussing in the spirit of "discussing about Wikipedia is a work in progress". :)
This recommendation goes against all that I believe is good with free knowledge. Charging for access to knowledge is not a slippery slope, it is jumping off a cliff. Ainali (talk) 10:06, 10 August 2019 (UTC)
- I have to agree with this. The recommendation as currently written also seems to not understand how big players are using our APIs. (They mostly don't.) --LydiaPintscher (talk) 13:56, 10 August 2019 (UTC)
- Lydia, to your point about big players not using APIs: I believe you're largely underestimating the amount of effort the WMF staff put into investigating this. They have checked initial interest, made comparisons and projections, and done extensive due diligence. The fact is that big corporations do not use the API mainly because it is not good enough for them. A service with a contractual response time and a maximum allowed downtime would be very attractive to them - they prefer to pay the charge and not have to engage their own scarce programming forces on a regular basis to scrape it. It is also 100x easier to get them to pay an invoice than to get them to donate. Given that we could significantly improve and diversify our financing (something even in the range of 5-15% of our budget long-term is not entirely unthinkable), and that we can make the service so much better and more reliable for everybody else (who will not have to pay), it sounds like a really plausible idea. Pundit (talk) 08:24, 16 August 2019 (UTC)
- @Pundit: - My thoughts were along the same lines but I stand corrected. Can WMF publish the relevant reports (with relevant redactions, where necessary) for better understanding? Winged Blades of Godric (talk) 07:42, 17 August 2019 (UTC)
- @Winged Blades of Godric: I don't know, but I could understand how they could be reluctant to publish such reports - I think it is fair to assume that the WMF staff does their job right, and risking the whole concept by being overly explicit may just not be sensible. Pundit (talk) 11:44, 17 August 2019 (UTC)
- I believe that the willingness to pay an invoice is irrelevant. Our mission is to provide free knowledge. If we start going "yeah we will, but if you are of an arbitrarily large size, using one arbitrary way among our many tools to retrieve it, it is not free anymore", the mission is compromised. Now I do see the difference between libre and gratis, but the line we draw is still arbitrary. I don't see a risk that this would ever lead to charges for regular readers, but that doesn't really matter; I still believe our ideals are corrupted. There is an argument that our mission only states that humans, and not corporations, should be able to share the free knowledge. But if we are going down that line, then all corporate use of APIs should be charged, not only the big players. And frankly, I believe this line is a disservice to both ourselves and humanity. Ainali (talk) 08:25, 27 August 2019 (UTC)
- Agree with both of you. This is just completely contrary to our goals of free knowledge for all. Hell no. Winged Blades of Godric (talk) 07:27, 11 August 2019 (UTC)
- From what I understand the big players are major users of our APIs, and would use them more if we could make them better. Doc James (talk · contribs · email) 06:51, 15 August 2019 (UTC)
I understand the sentiment behind this - charging for high volume non-authenticated use and using the money to make WMF Engineering and MediaWiki suck less. But this is untenable for other reasons - I host my tools on Google infrastructure, and I suppose some other volunteers would use AWS. Oh, and:
> Inspiration for this can be the GNIP / Twitter service, which has tiered support - the basic API is free for everyone, but access to the Twitter “firehose” (tweets generated in real-time) costs up to 100,000$ / month.
So your inspiration is the equivalent of charging for a recent changes feed? Utter bollocks. MER-C (talk) 14:55, 11 August 2019 (UTC)
- Afterthought: And the immediacy of such a feed is wrong, because it contains unreverted spam, attack pages, vandalism, etc. MER-C (talk) 07:55, 12 August 2019 (UTC)
- Second afterthought: a paid support model (where the client pays for technical support, uptime guarantees, bug fixes and feature requests) has the benefits of bringing in revenue to employ engineers to work on the API and forcing the WMF to suck less without having to figure out which API requests need payment. MER-C (talk) 15:35, 12 August 2019 (UTC)
As Ainali comments, this is totally against the free knowledge ethic. More revenues for what? For hiring even more staff who don't even know about the community and the principles behind it? This idea sounds to me like suicide for Wikipedia and its whole concept. My absolute refusal. Next proposal. Xavi Dengra (MESSAGES) 21:02, 13 August 2019 (UTC)
- Personal comment First I would like to acknowledge all the efforts that have gone into this process. The board of course has not considered the recommendations in depth and thus does not have a position yet. I personally view this as a positive and very reasonable source of revenue. This is going to improve the overall quality of our APIs for everyone, while only the very highest-volume users will be expected to pay. If NGOs, education institutions, and research institutions have very high volume usage, we will still have the ability to give them free access. Doc James (talk · contribs · email) 16:25, 14 August 2019 (UTC)
- Personal comment Speaking just for myself - I think that if by offering a paid API to several big corporations we can improve the quality for everyone, and be able to better serve those who can't pay or just do not have traffic comparable to Google or Amazon scale, this is a great win for everybody. We definitely need to be flexible and open to requests for help and support from organizations that have high traffic but are non-profit (even by default policy). Pundit (talk) 07:21, 15 August 2019 (UTC)
- I look forward to seeing a short list of 'big players' who are using our APIs in a bulk manner and their methods. Winged Blades of Godric (talk) 07:40, 15 August 2019 (UTC)
I just want to expand on what LydiaPintscher already said. Big players have no incentive to make massive use of our APIs even if they were reliable. For them, it's way better to download our content in bulk with regular updates and build their own APIs on top of it, in their own datacenters. This allows them to customize their data storage according to their needs, reduce latencies, have global replication... For example, building a clone of the Wikidata Query Service with a one-day replication lag only requires downloading the daily edit dump once per day. I am afraid that we are not going to charge these giants but smaller companies that do not have the money and/or the incentives to build their own replicas.
Another point to study is that the message "you are using the content created by our contributors, you should support us by donating" might be much weaker if we already charge the companies that might give. A newspaper headline like "Wikipedia earns 10 M$ from charging Company1, Company2 and Company3" might also hit our fundraising from individual donors. Tpt (talk) 09:13, 3 September 2019 (UTC)
Comparison with like-minded orgs
Comment This brings a somewhat similar example to my mind. MetaBrainz − also a non-profit foundation, also operating an open-data/free-knowledge project − does have a tiered system for API access and a pay-for Live Data Feed. I did not dig deep into this, but it is unclear to me whether usage restrictions are actually enforced − the language makes it sound more like a voluntary gentlemen’s agreement.
(I don’t have any particular insight into MetaBrainz finances or on usage patterns of their data, so I don’t know how comparable it would be). Jean-Fred (talk) 20:42, 11 August 2019 (UTC)
I also dug a bit into OpenStreetMap practices: the official API discourages heavy users, and the same goes for the official tiles. In both cases, heavy hitters are encouraged to either set up their own or contact a commercial service provider. (I’m sure plenty of people more knowledgeable on OSM could expand/correct me :)). Jean-Fred (talk) 16:30, 13 August 2019 (UTC)
- Right. This is not unseen in similar projects. I think it is a good idea. Large volume access might have a significant cost for the WMF and, when it comes to large commercial organizations, it would be fair that they pay, i.e. contribute to cover operational costs. Large volume access could also be granted for free for partners where it makes sense, such as academic research. But these are usually well served by current tooling (XML dumps, recent changes stream API, etc). That being said, it's important not to hurt independent researchers or community members who use these tools to improve our projects. Did the Working Group consider any particular API access to be covered by paid agreements? --MarioGom (talk) 12:30, 15 August 2019 (UTC)
According to their 2018 financial report, MetaBrainz made 450,000$ in 2018 from their support program − 400K from the “Unicorn” tier for very big players. Jean-Fred (talk) 10:34, 3 September 2019 (UTC)
- This is truly a very small amount of money in exchange for compromising our most basic principles. What we cannot afford to offer free should be done by others. DGG (talk) 16:55, 16 September 2019 (UTC)
- Well, for MetaBrainz, that appears to be 88% of their income. :-)
- Re: should be done by others: sure, this seems closer to the OSM model I mentioned above, where heavy hitters are told to talk to commercial providers. I’m not too sure what you mean by “others” though? Is that simply “outside of WMF”? There are some Wikibase consultancy companies that (at least in my book) are very much part of the Wikimedia movement − is it not fine? If the WMF were to hypothetically spin off some sub-company that supports and charges for additional API-like features − would that then be okay because it’s “others” (while for me it seems kind of the same as if the WMF would do that itself)?
- To be honest, I don’t really advocate for the recommendation − I think I’m lacking information and data to be able to judge either way (which is why in my small way I tried to gather some data in this section). At the same time, I’d be keen to know why this would necessarily be “compromising our most basic principles”. In the MetaBrainz model, access is still free and it does not appear that anybody gets their free access cut off if they exceed some threshold; rather, if you are hitting a lot, then you are encouraged to become a tier-supporter. Again, I don’t know if that’s necessarily a good model/idea for Wikimedia. There is a line somewhere that we should not cross; I’m not sure where I would draw it, and I’m trying to understand where others would. Jean-Fred (talk) 09:57, 17 September 2019 (UTC)
There is not enough detail on this recommendation for me to support or oppose it. Libcub (talk) 06:23, 11 August 2019 (UTC)
Current high-volume usages
How much is the API currently used, and by whom? I am thinking of things like:
- the percentage of API calls that are generated by us vs by third parties;
- the percentage of current API calls that would no longer be free;
- who is doing those API calls - Laurentius (talk) 15:22, 12 August 2019 (UTC)
- A couple of corporations are making millions of API calls per day.
- We are talking about less than 10 organizations. No individuals / NGOs are using these large numbers from what I understand.
- These are the voice assistants, like Siri, Amazon's Alexa, etc... Doc James (talk · contribs · email) 11:12, 15 August 2019 (UTC)
- If it's so few organisations, why not just approach them and ask for large annual donations, rather than creating a special paid service for them? Thanks. Mike Peel (talk) 12:54, 15 August 2019 (UTC)
- Donations are several orders of magnitude more difficult to acquire. Corporations are used to paying for high quality APIs. Donations, on the other hand, bring us into an entirely different pipeline, of basically charitable work, which is quite crowded. Also, it seems fair to charge for especially heavy use, particularly when it allows us to make the service better not just for them, but for everybody. Pundit (talk) 14:07, 15 August 2019 (UTC)
- It will allow these organizations to pay us for services rather than donate money to us. This will help us from a political perspective. We get criticized for accepting donations from large tech organizations. Doc James (talk · contribs · email) 14:27, 15 August 2019 (UTC)
- I'm skeptical that getting paid for providing a service to tech organizations will get less criticism, but I don't have much expertise in that area. Our relationships with companies like Amazon and Google are already quite problematic. (I'm assuming that the language of "turning .. into a revenue source" implies that they won't just be paying enough to run the API?) --Yair rand (talk) 17:14, 15 August 2019 (UTC)
- They will be paying enough to run the API, and also the costs will not grow as significantly with each new customer. If we have a couple of paying corporations, it will already be a big win and a source of revenue to fund our core work, including better free API for non profits. Pundit (talk) 19:39, 15 August 2019 (UTC)
- Could you do something in between operating a paid service and asking for donations? Something along the lines of "your <IP address/range or retail device> is using a huge amount of API queries, please can you pay us $X if you want to continue doing this, otherwise we'll have to apply a rate limit to the requests?" That way, there's more grey area, and you don't have to operate a separate paid service. Thanks. Mike Peel (talk) 20:17, 17 August 2019 (UTC)
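As a rough illustration of the "rate limit unless you pay" idea in the comment above, here is a minimal token-bucket sketch. Everything in it is hypothetical: the class names `TokenBucket` and `ApiGate`, the `free_rate` and `free_burst` parameters, and the notion of a `client_id` (an IP range or device key) are assumptions made for the example, not a description of how any Wikimedia service actually works.

```python
class TokenBucket:
    """Allows `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity      # start with a full bucket
        self.updated = now          # time of the last refill

    def allow(self, now):
        # Refill in proportion to the time elapsed, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class ApiGate:
    """Hypothetical gate: paying clients are never throttled, others share one free limit."""

    def __init__(self, free_rate, free_burst):
        self.free_rate = free_rate
        self.free_burst = free_burst
        self.paying = set()         # client ids with an agreement (assumption)
        self.buckets = {}           # one bucket per non-paying client

    def mark_paying(self, client_id):
        self.paying.add(client_id)

    def allow(self, client_id, now):
        if client_id in self.paying:
            return True
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.free_rate, self.free_burst, now)
            self.buckets[client_id] = bucket
        return bucket.allow(now)


if __name__ == "__main__":
    gate = ApiGate(free_rate=1.0, free_burst=2)
    gate.mark_paying("assistant-vendor")
    print([gate.allow("small-tool", 0.0) for _ in range(3)])
    print(all(gate.allow("assistant-vendor", 0.0) for _ in range(1000)))
```

In this sketch the free limit is the same for everyone and only the exemption list changes, which matches the "grey area" described above: no separate paid service, just a threshold above which a heavy user is asked to pay or be throttled.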
Is there some data or analysis on the feasibility of API monetization? In particular, I wonder what could be the cost and the revenue, and whether the profit would be meaningful. E.g., if the expected profit would be in the order of 100k$, it would probably not be worth it.
I assume that we will always provide free access to API to most people, and therefore we will not be able to exploit them as a commercial company would do. Moreover, we will always provide dumps of our data, and that limits how much we can charge: basically, we can't charge for accessing the data, but for the convenience of using a ready-made access point. - Laurentius (talk) 15:26, 12 August 2019 (UTC)
- A paid support model could work, and has my support as long as the additional engineers will result in collateral improvements for the community. But no charging for access. MER-C (talk) 15:30, 12 August 2019 (UTC)
- I believe MER-C has the right of it here. Charge people for support, not just for access. That way, those who are willing to use self or community support still can have the same access at the same level without paying a nickel, but bigger players who will want dedicated hardware and to be able to call someone on the phone can pay a premium for such a service level. Seraphimblade (talk) 12:11, 13 August 2019 (UTC)
- User:MER-C yes, the plan is definitely to put some of the money back into improving the API. This will result in better service for everyone (more reliable uptime, more options) with these services covered by funds raised from those who massively use the service (which is basically the commercial voice assistants). Doc James (talk · contribs · email) 06:49, 15 August 2019 (UTC)
- That's excellent news. MER-C (talk) 09:36, 15 August 2019 (UTC)
It seems to me that the Working Group and assigned WMF staff should do further research and perhaps a feasibility study to determine estimated costs and benefits. For example, they could talk to Microsoft, Google and Amazon, and report back that we could easily get $400,000 in high volume API access fees per month, but it would cost $100,000 to establish the charging scheme. The WMF Board could then decide whether the costs and benefits are worthwhile. From a policy perspective, when the public accesses our websites, we build our brand and have the opportunity to gain banner ad donations. When the public accesses our information through the API, via Siri or Cortana, we do not build our brand and get no banner ad donations. So, ethically, I am not opposed to charging for high volume use of the API. Hlevy2 (talk) 11:22, 15 August 2019 (UTC)
- Agreements for high volume users could also include requirements for attribution. This could help us get the attribution we deserve. Doc James (talk · contribs · email) 12:42, 15 August 2019 (UTC)
Impact on bots and small organisations
One thing I'm worried about here is that bots can make large numbers of API calls to do their editing work; would there be some way of avoiding affecting these? Also, what would happen with smaller organisations that wanted to make a lot of use of the API? Would this make it more difficult for them to compete with the larger organisations like Google etc.? Thanks. Mike Peel (talk) 12:56, 15 August 2019 (UTC)
- We definitely want to be flexible with non-profits, even if they generate a lot of traffic. The money from big players will allow us to offer the better API for free also to many non-profits, which will greatly benefit from it. Pundit (talk) 14:09, 15 August 2019 (UTC)
1. We might think we can charge big commercial users, but do they have alternatives? For example, could they just decide to stop linking to Wikipedia for answers and link to one of our mirrors instead, like Alchetron? They might get that for free, or at least for a lot less than what we’d be looking to charge. Just because you create a market doesn’t mean you can be sure of controlling it.
- The Working Group should do research to determine whether Wikidata or the API offers something of value that mirrors cannot offer. Hlevy2 (talk) 02:56, 16 August 2019 (UTC)
2. Also, if really big commercial users are paying for a service, there will be contracts, and there will be massive penalties for non performance on the part of Wikipedia. We might find these penalties for non-performance to be punitive.
- The Working Group should do research to determine industry practices and contractual performance expectations. Hlevy2 (talk) 02:56, 16 August 2019 (UTC)
3. If some users are paying, will their needs be prioritised by the development teams over other, non-paying needs?
4. What impact would these kinds of commercial arrangements have on existing fundraising? At the moment we have a simple proposition: if you value free knowledge, give us a bit of money. If we complicate it with contracts from Google etc., isn’t it the case that some potential donors will be less likely to give? I’m guessing here, so anyone with professional experience of fundraising may be able to tell me this generally doesn’t happen. Mccapra (talk) 20:07, 15 August 2019 (UTC)
- Mccapra raises important issues. Again, we have a base of small donors which can easily donate $1.1 billion by 2030. Major policy changes could risk that revenue stream. When people access the website or the mobile app, and we provide a satisfactory user experience, we build our brand and gain an opportunity for a banner ad. When a Siri or Cortana user accesses our database, we do not build our brand or gain an opportunity for a banner ad. Our small donor fundraising base will decline as users stop visiting our website directly. So a revenue stream to compensate for that loss is appropriate. However, some people have argued that to survive, we must take the $1.1 billion and invest it in the Knowledge Engine so that our brand will be a part of the future world of knowledge provision. The movement has to ask itself, what does it take to offer value to users in the year 2030, and how can we make a significant contribution to that new world? Hlevy2 (talk) 02:56, 16 August 2019 (UTC)
- The impact on our fundraising of the switch from screen to voice is potentially very large and something I hadn’t thought of. Thanks. Mccapra (talk) 05:09, 16 August 2019 (UTC)
With respect to funds, if we have more money (as raised through processes like this) we will be able to solve a larger number of total needs. I.e., solving some of the needs of paying folks will 1) likely create things the non-paying users will also want and 2) bring in money to direct to other important purposes for our movement. Doc James (talk · contribs · email) 06:28, 16 August 2019 (UTC)
From Catalan Salon
We are unsure about the heavy usage of the Wikimedia API. Data or some research would be appreciated as a foundation for the recommendation (...)