Wikimedia hardware status
This page is no longer updated; please see https://wikitech.wikimedia.org/wiki/Main_Page
The Wikipedia server setup is described at m:Wikimedia servers.
This page is intended to contain a log of significant changes to the hardware and software setup, to produce a working history of the project. When it gets full, please move the page to one with a suitable archive name, including the year, and start a new one.
A huge number of changes since the last entry: the Florida site has been relocated and the Amsterdam site is on-line. There are now 83 servers spread around the world: 67 in Florida, 5 in Paris (2 down) and 11 in Amsterdam.
There have also been many other substantial software and configuration changes; see the server admin log for these.
- The image server is currently overloaded; system administrators are working to cut down on load prior to installing more efficient software and more powerful hardware to handle this task. See Image server overload 2005-03 for details...
A new Cisco gigabit network core switch has been installed, which should increase system performance and manageability.
Three Squid cache servers (chloe, ennael, bleuenn) located near Paris, France, are put in general active usage. Users are routed through GeoDNS to this cluster or the Florida cluster depending on their country.
(Temporarily: ennael has memory problems and is off, and reduced bandwidth to some other European countries means the service is available only to French users.)
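The country-based routing described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual GeoDNS configuration: the function name `pick_cluster`, the cluster labels and the country table are assumptions made for this example. The table lists only France because the note above says the Paris cluster currently serves French users only.

```python
# Hypothetical sketch of country-based cluster selection.
# The real GeoDNS mapping is not recorded in this log; the names
# below (PARIS_COUNTRIES, DEFAULT_CLUSTER, pick_cluster) are
# invented for illustration.

# Countries routed to the Paris Squid cluster (assumption: France only).
PARIS_COUNTRIES = {"FR"}

# Every other country falls back to the Florida cluster.
DEFAULT_CLUSTER = "florida"


def pick_cluster(country_code: str) -> str:
    """Return the cluster label for a two-letter country code."""
    code = country_code.strip().upper()
    return "paris" if code in PARIS_COUNTRIES else DEFAULT_CLUSTER


if __name__ == "__main__":
    for cc in ("FR", "DE", "US"):
        print(cc, "->", pick_cluster(cc))
```

Keeping Florida as the default means a country missing from the table is still served, just without the nearer cache.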
5 new 3GHz P4/3GB RAM Apache web servers are now in service (Benet, Biruni, Rose, Smellie, Anthony).
After an initial period of service to prove reliability, some will be used for Memcached as well as Apache, or for Squid.
Two new database slaves (Holbach, Webster) are now in service. These roughly double our slave database capacity.
5 new webservers installed (kluge, khaldun, hypatia, humboldt, averroes)
Wikipedia's servers are currently undergoing extensive maintenance to try to resolve earlier problems. Until these problems are fixed, the site will be very slow.
2GB of RAM from bart moved to browne leaving browne with 4GB. Bart has 4GB of new Kingston (?) RAM now. Bart and bayle switched from Apache to Squid service.
New NFS storage server (albert) and database search server (bacon) installed.
Eight new Apache servers installed.
New hardware has been ordered; see http://mail.wikimedia.org/pipermail/wikitech-l/2004-August/011886.html
Memcached now has more RAM to increase the potential hit rate of the parser cache. 15 instances each of 512MB are in use: 2 on bart, 2 on bayle, 4 on yongle, 4 on rabanus and 3 on will for a total of 7.5GB.
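As a quick check of the figures above, the instance counts and the 7.5GB total can be recomputed. This is a minimal sketch: the host names and counts come from the entry, while the variable names are made up for the example.

```python
# Memcached instances per host, as listed in the entry above.
instances_per_host = {
    "bart": 2,
    "bayle": 2,
    "yongle": 4,
    "rabanus": 4,
    "will": 3,
}

# Each instance is given 512MB.
INSTANCE_SIZE_MB = 512

total_instances = sum(instances_per_host.values())
total_mb = total_instances * INSTANCE_SIZE_MB
total_gb = total_mb / 1024  # 1GB taken as 1024MB

print(total_instances, total_mb, total_gb)  # 15 7680 7.5
```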
Today the site was switched from using Suda as the main database server to using Ariel. It should be faster now. Suda used six 10,000 RPM SCSI disks and 2GB of RAM for the database. Ariel is using six 15,000 RPM disks and 7GB of RAM.
25 June 2004
Full text search caused performance problems, with queries to the database backing up to the point that normal page reads were threatened (database connection errors when the DB runs out of connections). Building the first index for ja made ja editing impossible for 90 minutes and slowed all other wikis greatly today. The programming for the update which scans the whole table to update the index has been improved to reduce the chance of this problem.
23 June 2004
Zwinger had trouble keeping up with the database load without hurting other site performance. Will is now being used as a backup database slave instead.
June 22 2004
In addition to its other duties, Zwinger is now operating as a database slave, intended to be used for backup operation only. Read only is set, to prevent accidental writes.
June 21 2004
Ariel is now operating as a database slave, set to read only, and is gradually having read operations switched to it. Initially, watch list queries were switched. Full text search was being turned on as the required indexes were rebuilt.
June 12 2004
- The addition of a third Squid cache, maurus, seems to have improved system performance. 
June 11 2004
- The server was down for maintenance from 18:00 to 18:30 UTC. Reason: "To reboot Zwinger with an updated kernel which will fix the disk driver. This was planned to improve performance; as the main file (not database) server the sluggish disk is a bottleneck." as per . Zwinger wasn't using DMA for its ATA disk. It is now.
June 10 2004
One of the new machines, Moreri, has been changed from Apache web server to Squid cache server.
The 2U server has been delivered and three of the new servers will go online today. See this post on Wikitech-l for full details.
4 new servers are installed: 1U P4 machines with 4GB of RAM each, and 80GB drives. One of these will be changed into a mirrored RAID with 2x200GB. Memtest86 will be run overnight on May 26th. The 2U is due in on Friday.
Server will be down for maintenance on 2004-05-12 from about 02:00 to 03:00 UTC.
The replicated database on curly fell out of synch last week. To replace the index file and get the link back up will require briefly taking down MySQL on suda. During this downtime, some of suda's data files will be moved onto its new hard disk. Further details
- A short amount of downtime is experienced on May 4 as hardware upgrades take place
- A RAID card and a second 250GB drive are added to zwinger.
- The database is migrated to a new RAID 5 array
- Browne passed its memory tests and is back online with 2GB of RAM.
- memtest86 is being run on browne overnight on May 3
- Jimbo will insert the RAID card into zwinger and mirror that
- The RAID 5 array on suda is rebuilt
- suda will be taken down "for a moment" to set up the new RAID 5 on the 3x146GB drives
- Upgrades will be taking place on 30 April from 16:30 UTC onwards. This includes installing a single 250GB drive into zwinger in preparation for someone to migrate us to that bigger disk, and installing 3x146GB SCSI drives in suda in a RAID 5 array. The machine may be down momentarily in order to set up the RAID 5 array in the BIOS. During this time, curly will be used as a database server, which will cause some slowness.
- Browne is (mostly) down. Coronelli and curly are acting as Squid servers.
- Front-end Squid server coronelli has recently been having problems, leading to severe slowdowns, but it is now back in service.
- The database server geoffrin has had a number of problems, and a temporary replacement, suda, is being used instead.
- Why Wikipedia ran slow in late 2003 and January 2004.