Wikimedia Quarto/2/tech


Technical Development

Most of the report below was written by James Day; the part on the Paris machines is largely by David Monniaux.
Information about our servers may be found at any time at Wikimedia servers. Developer activity falls into two main areas: server maintenance, and development of the MediaWiki software, which is also used for many non-Wikimedia applications. Most developers (though not all, by their choice) are listed here. One may show appreciation for their dedication with thank-you notes or financial support. Thank you!
Until now, all developers have worked for free, but that may change in the future in order to support our amazing growth.

Installation of Squid caches in France

The cluster near Paris. Our servers are the three in the middle (from top to bottom: bleuenn, chloe, ennael).

On December 18, 2004, three donated servers were installed at a colocation facility in Aubervilliers, a suburb of Paris, France. They are named bleuenn, chloe, and ennael at the donor's request. For the technically minded, the machines are HP sa1100 1U servers with 640 MiB of RAM, 20 GB ATA hard disks, and 600 MHz Celeron processors.

The machines are to be equipped with Squid caching software. They will be a testbed for the technique of adding Web caches nearer to users in order to reduce latency. Typically, users in France on a DSL Internet connection can reach these machines with about 30 ms latency, while reaching the main cluster of Wikimedia servers in Florida takes about 140 ms. The idea is that users in parts of Europe will use the Squid caches in France, cutting access delays by roughly a tenth of a second, both for multimedia content for all users and for page content for anonymous users. Logged-in users will not benefit as much, since pages are generated specifically for them and thus are not cached across users. If a page is not in a Squid cache, or is for a logged-in user, the Apache web servers must spend from a fifth of a second to three or more seconds, plus database time, generating the page. Database time is about a twentieth of a second for simple queries, but can be many seconds for categories, or even 100 seconds for a very large watchlist.
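The latency arithmetic above can be sketched as a small model. The round-trip figures come from the article; the Apache and database timings are the article's rough ranges collapsed into single illustrative defaults, and the function itself is our simplification, not how MediaWiki measures anything.

```python
# Back-of-the-envelope model of the caching benefit described above.
# RTT figures are from the article; other defaults are illustrative.

PARIS_RTT = 0.030    # s: French DSL user -> Paris Squid cache
FLORIDA_RTT = 0.140  # s: same user -> main cluster in Florida

# A cache hit served from Paris avoids the transatlantic round trip:
rtt_saving = FLORIDA_RTT - PARIS_RTT
print(f"per-request saving: {rtt_saving * 1000:.0f} ms")  # ~110 ms, i.e. ~1/10 s

def page_time(cache_hit: bool, logged_in: bool,
              apache_time: float = 0.2, db_time: float = 0.05) -> float:
    """Rough user-visible delay for one page request.

    Logged-in users' pages are generated per user, so they never hit
    the shared cache and always pay Apache plus database time.
    """
    if cache_hit and not logged_in:
        return PARIS_RTT
    return FLORIDA_RTT + apache_time + db_time
```

Under these assumptions an anonymous cache hit is served in the network round trip alone, which is why the caches help anonymous readers far more than logged-in editors.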

The Telecity data center

The Squid caches were activated in early January 2005, and an experimental period ensued. As of January 31, the machines cache English, French, and multimedia content for Belgium, France, Luxembourg, Switzerland, and the United Kingdom. The system is still somewhat experimental, and it is expected that caching performance can be increased with some tuning. The installation of similar caching clusters in other countries is being considered.

Installation of more servers in Florida

In mid-October, two more dual-Opteron database slave servers, each with six drives in RAID 0 and 4 GB of RAM, plus five 3 GHz/1 GB RAM Apache servers, were ordered. Compatibility problems that the vendor had to resolve before shipping the database servers delayed delivery, leaving the site short of database power; until early December, the search function had to be turned off at times.

In November 2004, five Web servers (four of them high-RAM machines used for Memcached or Squid caching) experienced failures, resulting in very slow wikis at times.

Five 3 GHz/3 GB RAM servers were ordered in early December. Four of the December machines will provide Squid and Memcached service as improved replacements for the failing machines until those are repaired. One machine, with SATA drives in RAID 0, will be used as a testbed to see how much load such less costly database servers can handle, as well as providing another option for a backup-only database slave that also runs Apache. These machines are equipped with a remote power and server-health monitoring board, a new option at $60 extra, taken for this order to compare the board's effectiveness with that of a remote power strip and more limited monitoring tools. Remote power control and health reporting reduce the need for colocation facility labor, which can involve costs and delays.

A further order of one master database server, plus five more Apaches, is planned for the end of the last quarter of 2004 or the first days of the first quarter of 2005. The new master will permit a split of the database servers into two sets, each a master with a pair of slaves, with each set holding about half of the project activity. This order will use the remainder of the US$50,000 from the last fundraising drive. The split will halve the amount of disk writing each set must do, leaving more capacity for the disk reads needed to serve user requests. It is intended to happen in about three months, after the new master has proved its reliability during several months of service as a database slave.
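The reasoning behind the split can be illustrated with a toy model. The numbers here are entirely hypothetical, as is the assumption that each server has a fixed I/O budget shared between replicated writes and read queries and that project activity divides evenly across the sets; only the halving of writes per set comes from the plan above.

```python
# Toy model of the planned database split. Every slave in a set must
# replay all writes for that set's projects; splitting the projects
# across two master/slave sets halves the writes each server applies,
# freeing disk I/O for reads. All numbers are hypothetical.

IO_BUDGET = 1000.0    # hypothetical I/O operations per second per server
TOTAL_WRITES = 600.0  # hypothetical replicated writes/s across all projects

def read_capacity(num_sets: int) -> float:
    """I/O left for serving reads on each server after it replays the
    writes for its set's share of the projects (assumed an even split)."""
    return IO_BUDGET - TOTAL_WRITES / num_sets

# One set: every server replays all 600 writes/s, leaving 400 ops/s for reads.
# Two sets: writes halve to 300/s per set, leaving 700 ops/s for reads.
print(read_capacity(1), read_capacity(2))
```

Under these made-up figures, the two-set split nearly doubles the read capacity of each database server, which matches the stated goal of the order.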

Increased traffic and connectivity

Traffic grew during the third quarter of 2004 from 450 requests per second at the start of this period to 800 per second at the end. In the early fourth quarter of 2004, that rose further, with daily peak traffic hours exceeding 1,000 requests per second ([1]). Average bandwidth use grew from 32 megabits per second (mbps) at the start of the fourth quarter of 2004 to 43 mbps at the end. Typical daily highs were 70 mbps, sometimes briefly hitting the 100 mbps limit of a single outgoing ethernet connection. To deal with this traffic, Dual 100 megabit connections were temporarily used, a gigabit fiber connection was arranged at the Florida colocation facility, and the required parts were ordered.
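The bandwidth figures above make the urgency of the gigabit upgrade easy to check. The 32 to 43 Mbps averages, 70 Mbps typical peaks, and 100 Mbps link limit are from the report; the assumption that growth continues at the same rate per quarter is ours, for illustration only.

```python
# Quick projection from the quoted bandwidth figures: if traffic keeps
# growing at the Q4 2004 rate, how soon do typical daily peaks exceed
# the 100 Mbps limit of a single Ethernet port? (Constant quarterly
# growth is an assumption made for this sketch, not the article's claim.)

avg_start, avg_end = 32.0, 43.0  # Mbps, average use at Q4 start/end
peak = 70.0                      # Mbps, typical daily high
link_limit = 100.0               # Mbps, single outgoing Ethernet connection

quarterly_growth = avg_end / avg_start  # ~1.34x per quarter
quarters = 0
while peak * quarterly_growth ** quarters < link_limit:
    quarters += 1
print(f"growth: {(quarterly_growth - 1) * 100:.0f}% per quarter; "
      f"typical peaks would pass {link_limit:.0f} Mbps in ~{quarters} quarter(s)")
```

At roughly 34% growth per quarter, even typical (not peak) daily highs would outgrow a single 100 Mbps link within about half a year, which is consistent with the decision to arrange gigabit fiber.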