From Meta, a Wikimedia project coordination wiki
The following description dates from March 2005. Since then, these changes have happened:
- More database servers, now with 16GB of RAM
- More memcached
- Server names and roles change constantly; those mentioned are the ones that happened to be in use at the time.
The caching architecture is the same.
Wikimedia uses several levels of caching to improve site performance:
- Squid cache servers handle about 78% of requests, almost all of which are made by viewers who are not logged in to the site. During load surges from media mentions, the Squids handle almost all of the traffic. Squid was first used on 2 February 2004 (see Cache bugs for early issues).
- Memcached is used to save web pages which have been parsed, so that step doesn't need to be carried out repeatedly. This adds about 7% to the overall cache hit rate for pages. As of 19 September 2004, 34 instances of 180MB each are in use: 12 on Yongle, 6 each on Bart and Bayle, 2 each on Isidore and Moreri, and one each on dalembert, Tingxi, Alrazi, Friedrich, Harris and Avicenna, for a total of 6120MB. Memcached also caches login session IDs and user interface text in the various languages.
- APC on the Apache web servers is used for PHP caching to improve the performance of the web servers. PHP is normally compiled into bytecode each time a script is run; caching the bytecode saves much of the CPU overhead of continually recompiling the same code.
- The database servers have large caches:
- Ariel has 8GB of RAM in total, of which 5.8GB is used for InnoDB caching, giving a hit rate of over 99%. The remaining RAM is used for in-memory sorting, temporary tables used in SQL query processing, and buffering of non-InnoDB table types.
- Suda, the former master and now fallback master and general database slave, has 4GB available for caching, as does Bacon, a query slave.
- Ariel has a RAID controller with a 64MB battery-backed cache to help disk performance. This is particularly significant for database transaction log entries, which are written to disk very frequently. The cache allows the RAID controller to report a write as complete without having to wait for the disks to actually write the data.
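The way the first two cache levels interact can be illustrated with a minimal Python sketch. It is not the Squid or MediaWiki code: the class name `TieredPageCache`, its fields and the `render` callback are hypothetical, plain dictionaries stand in for Squid and memcached, and both tiers store the same HTML for simplicity. It only shows the lookup order described above: a full-page cache that serves viewers who are not logged in, then a parsed-page cache, then the expensive parse step.

```python
from typing import Callable, Dict


class TieredPageCache:
    """Hypothetical two-tier lookup: full-page cache, then parsed-page cache."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self.page_cache: Dict[str, str] = {}    # stand-in for Squid
        self.parser_cache: Dict[str, str] = {}  # stand-in for memcached
        self.render = render                    # the expensive parse step
        self.stats = {"page_hit": 0, "parser_hit": 0, "miss": 0}

    def get(self, title: str, logged_in: bool) -> str:
        # Tier 1: only viewers who are not logged in are served from here.
        if not logged_in and title in self.page_cache:
            self.stats["page_hit"] += 1
            return self.page_cache[title]

        # Tier 2: reuse the parsed page if one is already stored.
        if title in self.parser_cache:
            self.stats["parser_hit"] += 1
            html = self.parser_cache[title]
        else:
            # Miss on both tiers: parse once and remember the result.
            self.stats["miss"] += 1
            html = self.render(title)
            self.parser_cache[title] = html

        if not logged_in:
            self.page_cache[title] = html
        return html


cache = TieredPageCache(render=lambda title: f"<p>{title}</p>")
cache.get("Main_Page", logged_in=True)   # parsed and stored in tier 2
cache.get("Main_Page", logged_in=False)  # tier 2 hit, then stored in tier 1
cache.get("Main_Page", logged_in=False)  # tier 1 hit
```

In this sketch a logged-in viewer never reads from the first tier, which mirrors the observation that almost all Squid hits come from viewers who are not logged in.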
Load balancing 
In June 2004 the load was balanced with:
- Round robin DNS distributed page requests evenly among three Squid cache servers.
- Squid cache servers used response time measurements to distribute page requests between seven web servers. In addition, the Squid servers cached pages and delivered about 75% of all pages without ever asking a web server for help.
- The PHP scripts running on the web servers distributed load to one of several database servers depending on the type of request, with updates going to a master database server and some database queries going to one or more slave database servers.
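The three stages above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the DNS, Squid or PHP logic that was actually deployed: the function names and all server names are hypothetical, the web server choice is reduced to "lowest measured response time", and every read is sent to a slave even though the description says only some queries went there.

```python
import itertools
from typing import Dict, Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Cycle through the servers in order, as round robin DNS would."""
    return itertools.cycle(servers)


def pick_fastest(response_ms: Dict[str, float]) -> str:
    """Choose the web server with the lowest measured response time."""
    return min(response_ms, key=lambda name: response_ms[name])


def pick_database(is_update: bool, master: str,
                  slaves: List[str], counter: int) -> str:
    """Send updates to the master and spread reads across the slaves."""
    if is_update or not slaves:
        return master
    return slaves[counter % len(slaves)]


# Hypothetical server names, used only for illustration.
squids = round_robin(["squid1", "squid2", "squid3"])
first_four = [next(squids) for _ in range(4)]

web_server = pick_fastest({"web1": 120.0, "web2": 45.0, "web3": 80.0})

write_target = pick_database(True, "db-master", ["db-slave1", "db-slave2"], 0)
read_target = pick_database(False, "db-master", ["db-slave1", "db-slave2"], 3)
```

Keeping the three decisions in separate functions reflects that each stage balanced load independently: DNS knew nothing about web server response times, and the Squids knew nothing about which database a request would reach.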