- Our DNS servers run gdnsd. We use geographical DNS to distribute requests between our four data centers (3x US, 1x Europe) depending on the location of the client.
- We use Linux Virtual Server (LVS) on commodity servers to load balance incoming requests. LVS is also used as an internal load balancer to distribute MediaWiki requests. For back-end monitoring and failover, we use our own system, PyBal.
- For regular MediaWiki web requests (articles/API) we use Varnish caching proxy servers in front of Apache HTTP Server.
- All our servers run either Debian or Ubuntu Server.
- For distributed object storage we use Swift.
- Our structured data is stored in MariaDB. We group wikis into clusters, and each cluster is served by several MariaDB servers, replicated in a single-master configuration.
- We use Memcached for caching of database query and computation results.
- For full-text search we use Elasticsearch (Extension:CirrusSearch).
- https://noc.wikimedia.org/conf/ – Wikimedia configuration files.
- See also: wikitech:Clusters
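To illustrate the database layout described above (wikis grouped into clusters, each cluster served by one master and several replicas), here is a minimal sketch. It is not our actual configuration: the cluster names, host names, wiki names and the `pick_server` helper are all hypothetical, and real routing also has to account for replication lag and server weights.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    """One database cluster: a single master plus read replicas."""
    master: str
    replicas: List[str] = field(default_factory=list)


# Hypothetical layout; none of these names are real hosts or wikis.
CLUSTERS: Dict[str, Cluster] = {
    "cluster-a": Cluster(master="db-a1", replicas=["db-a2", "db-a3"]),
    "cluster-b": Cluster(master="db-b1", replicas=["db-b2"]),
}
WIKI_TO_CLUSTER: Dict[str, str] = {
    "examplewiki": "cluster-a",
    "otherwiki": "cluster-b",
}


def pick_server(wiki: str, for_write: bool) -> str:
    """Return the host that should serve a query for the given wiki.

    Writes always go to the cluster's single master. Reads go to a
    randomly chosen replica, falling back to the master if the
    cluster has no replicas.
    """
    cluster = CLUSTERS[WIKI_TO_CLUSTER[wiki]]
    if for_write or not cluster.replicas:
        return cluster.master
    return random.choice(cluster.replicas)
```

In this sketch every write for a wiki lands on one host, which is what "single-master" means here, while reads are spread across that cluster's replicas.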
As of October 2014, we have the following colocation facilities (each site name combines an abbreviation of the facility’s company with the code of a nearby airport):
- Equinix in Ashburn, Virginia.
- EvoSwitch in Amsterdam, the Netherlands (cache).
- Kennisnet in Amsterdam, the Netherlands (networking for the above).
- United Layer in San Francisco (cache; USA West and East Asia).
- CyrusOne in Carrollton, Texas.
The backend web and database servers are in Ashburn, with Carrollton to handle emergency fallback in the future. Carrollton was chosen for this as a result of the 2013 Datacenter RfC. At EvoSwitch, we have a Varnish cache cluster and several miscellaneous servers. The Kennisnet location is now used only for network access and routing.
Ashburn (eqiad) became the primary data center in January 2013, taking over from Tampa (pmtpa and sdtpa), which had been the main data center since 2004. Around April 2014, sdtpa (Equinix, formerly Switch and Data, in Tampa, Florida, which provided networking for pmtpa) was shut down, followed in October 2014 by pmtpa (Hostway, formerly PowerMedium, in Tampa, Florida).
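The site codes mentioned here (eqiad, pmtpa, sdtpa) all appear to follow the same shape: a two-letter company abbreviation followed by a three-letter location code. The sketch below splits a code on that assumption. The two-plus-three split is inferred from these examples rather than from a documented rule, and `split_site_code` is a hypothetical helper.

```python
from typing import Tuple


def split_site_code(code: str) -> Tuple[str, str]:
    """Split a five-letter site code into (company prefix, location code).

    Assumes the first two letters abbreviate the company and the last
    three identify the nearby airport or city. This is an inference
    from codes such as "eqiad", not an official specification.
    """
    if len(code) != 5 or not code.isalpha():
        raise ValueError(f"expected a five-letter site code, got {code!r}")
    code = code.lower()
    return code[:2], code[2:]


# Codes taken from the text above.
for site in ("eqiad", "pmtpa", "sdtpa"):
    company, location = split_site_code(site)
    print(f"{site}: company={company} location={location}")
```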
In the past we've had other caching locations like Seoul (yaseo, Yahoo!) and Paris (lopar, Lost Oasis); the WMF 2010–2015 strategic plan reach target includes «additional caching centers in key locations to manage increased traffic from Latin America, Asia and the Middle East, as well as to ensure reasonable and consistent load times no matter where a reader is located».
A list of servers and their functions used to be available at the server roles page; no such list is currently maintained publicly (perhaps the private racktables tool has one). It is, however, possible to see a compact table of all servers grouped by type on Icinga (click a group to list its servers, then click a server name to see that machine’s details). The Puppet configuration also provides a good reference for the software each server runs.
Status and problems
You can check one of the following sites if you want to know if the Wikimedia servers are overloaded, or if you just want to see how they are doing.
- Ganglia (may require a password as of October 2017)
- Icinga (temporarily restricted)
- Networking latency
- http://status.wikimedia.org/ (up/down indicators are not to be trusted; however: "New external availability metrics based on new Catchpoint data")
If you are seeing errors in real time, visit #wikimedia-tech on irc.freenode.net. Check the topic to see if someone is already looking into the problem you are having. If not, please report your problem to the channel. It helps if you report specific symptoms, including the exact text of any error messages, what you were doing right before the error, and which server(s) are generating the error, if you can tell.
More hardware info
- Technical FAQ – How about the hardware?
- Your donations at work: new servers for Wikipedia, by Brion Vibber, 02-12-2009
- wikitech:Clusters – technical and usually more up-to-date information on the Wikimedia clusters
- Server admin log – Documents server changes (especially software changes)
Offsite traffic pages
Out of date information
Useful information about other sites
- Evolution of LiveJournal systems:
- Google cluster architecture (PDF)
- MySQL User’s Conference 2004 blog highlights