Create analytical database replicas for WMF wikis (Community Wishlist/W509)
For years, people have used the Wiki Replica databases to create reports, develop tools, and conduct studies on data from Wikimedia wikis. A common and growing challenge is that the Wiki Replica databases, which closely mirror the production databases on which Wikimedia wikis run, are not suitable for reporting and analytical queries.
In data science jargon, there are two types of database design: online transaction processing (OLTP), which is optimized for transactional processing of the data (such as editing a page, creating an account, or updating a block), and online analytical processing (OLAP), which is optimized for analytical use of the data (such as counting all edits by a user, visualizing trends in account creation, or stratifying blocks by type).
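The difference between the two workloads can be illustrated with a minimal sketch. It assumes a hypothetical, heavily simplified `edits` table in an in-memory SQLite database; the table, its columns, and the function names are invented for illustration and do not reflect the actual MediaWiki or Wiki Replica schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical, simplified edits table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edits ("
        "edit_id INTEGER PRIMARY KEY, "
        "user_name TEXT, "
        "page_title TEXT, "
        "edit_month TEXT)"
    )
    return conn


def record_edit(conn, user_name, page_title, edit_month):
    """OLTP-style operation: write one small row inside a short transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO edits (user_name, page_title, edit_month) VALUES (?, ?, ?)",
            (user_name, page_title, edit_month),
        )


def edits_per_user(conn):
    """OLAP-style operation: scan the whole table and aggregate it."""
    rows = conn.execute(
        "SELECT user_name, COUNT(*) FROM edits GROUP BY user_name ORDER BY user_name"
    ).fetchall()
    return dict(rows)


conn = build_demo_db()
record_edit(conn, "Alice", "Main Page", "2024-01")
record_edit(conn, "Bob", "Main Page", "2024-01")
record_edit(conn, "Alice", "Sandbox", "2024-02")
print(edits_per_user(conn))  # {'Alice': 2, 'Bob': 1}
```

The insert touches a single row, while the aggregate has to read every row. On a table with billions of revisions, a schema tuned for the first kind of operation tends to make the second kind slow, which is the gap this wish describes.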
Requests such as phab:T414199 are being declined (perhaps rightfully), as they are more aligned with OLAP use cases. To quote a user on that ticket, while the creation of analytical databases has been a vision for WMF, "it is very difficult to convince regular [WMF] managers of the value of this kind of stuff, because no one inside of WMF is explicitly asking for it." The community greatly benefits from these analytical queries. Thousands of tools on Toolforge rely on analytical queries against the Wiki Replica databases and could become more effective and featureful if WMF created OLAP databases. English Wikipedia has hundreds of database reports (as do many other wikis, see example), all of which rely on these databases. Some have become defunct because of changes to the MediaWiki production database that were necessary for the production transactional system but rendered specific analytical queries infeasible or time-consuming.
Unassigned
Bots, tool makers, researchers
This wish currently has 8 supporters. Voting for this wish is open until it is completed.