Talk:Wikimedia servers

From Meta, a Wikimedia project coordination wiki

Archived to Talk:Wikimedia servers/archive1


[edit] Apache and Apaches

The section about 'Apache Servers' is hard to read and misleading. Why do you use the term 'Apache' both for the web serving software and for the machines? Why not name the machines after their function?

[edit] Name suggestions

If you would like to suggest a name, please enter it and link to the relevant Wikipedia article. This list is no longer being used to name servers.

[edit] Encyclopedists or similar (with articles)

  • Aristotle - Aristotle, wrote so many varied works that together they comprise a kind of encyclopedia of all Classical Greek knowledge (good name for a really important database server).
  • Mills - John Mills
  • Franklin - Ben Franklin (author of Poor Richard's Almanac). (or/and w:en:Rosalind Franklin ?)
  • panckoucke - see w:en:Encyclopédie Méthodique or w:de:Charles Joseph Panckoucke (editor of the « Encyclopédie Méthodique »)
  • ashurbanipal - w:en:Ashurbanipal IV (maybe not an encyclopedist per se, but the first known to try and bring together all knowledge. Sooner or later there must be a server function that such an ancient name would be appropriate for.)
  • ephraim or chambers - w:de:Ephraim Chambers/en:Ephraim Chambers
  • johann - for w:de:Johann Georg Krünitz, w:de:Johann Samuel Ersch (en), w:de:Johann Gottfried Gruber (en), w:de:Johann Jacob Leu
  • Judah - Judah haNasi was the editor of the Mishna, a compendium of Jewish religious law, on which the Talmud and the Shulkhan Arukh were based.
  • Mortimer - Mortimer Adler - Author/director of planning for Britannica
  • Asimov - Isaac Asimov
  • Johnson - Samuel Johnson - wrote A Dictionary of the English Language
  • Insert name - "Beautiful it may be, but Dale Hoiberg, the editor of Encyclopedia Britannica, thinks there is such a thing as too much information. Wikipedia reminds him of a Jorge Borges story in which the cartographers of an empire make a map so large it's as big as the empire itself and rather useless." Find the name of the fictional cartographer. (It's a guild of cartographers.)
  • McHenry - Robert McHenry, former Editor-in-Chief of the Encyclopædia Britannica and current Wikipedia critic (good name for a machine that performs some obscure menial task).
  • amane - 西 周 (NISHI, Amane), wrote 百學連環 (Hyakugaku-Renkan), the oldest encyclopedia in Japan.
  • Maimonides - Maimonides was a Jewish rabbi, physician, and philosopher who is best known for codifying the Jewish religious law in 14 volumes known as the Mishneh Torah.
  • Seldon - w:Hari Seldon, fictional encyclopedist from w:Isaac Asimov's famous "Foundation" series. It's his idea to produce and disseminate w:Encyclopedia Galactica, to contain and preserve all human knowledge, in a long dark age after the Galactic Empire's collapse.
  • Jorge - w:Jorge Borges, the Argentine writer, author of "The Library of Babel".
  • Duden - w:Konrad Duden. German philologist. Duden's dictionary was the officially recognized standard of German spelling through much of the 20th century.
  • Roget - w:Peter Roget, creator of the first thesaurus of the English language.
  • Grimm - w:Brothers Grimm, German professors who were best known for publishing collections of authentic folk tales and fairy tales, and for their work in linguistics
  • Rousseau - w:Jean-Jacques Rousseau, great Swiss encyclopedist (he wrote the music pages in Diderot's Encyclopédie), philosopher and writer

[edit] Others (with articles)

[edit] People without articles

  • Al-Munjid (almunjid?) is a famous old Arabic encyclopedia, mentioned by Khalid
  • chagas - Manuel Pinheiro Chagas
  • mellado - Francisco de Paula Mellado
  • wassili - Wassili Nikititsch Tatischtschew
  • pavel - Pavel Bujnák
  • jacob & carl - Jacob Johann Ankarström & Carl Christophersson Gjörwell
  • banerjea - K. M. Banerjea
  • francis - Francis Lieber
  • Black - Adam & Charles Black - Founders of Britannica
  • kiyonori - 小中村 清矩 (KONAKAMURA, Kiyonori), edited 古事類苑 (Koji-Ruien), a Japanese encyclopedia consisting of 1,000 volumes, created by the government during the Meiji Period.
  • fr:utilisateur:Treanna - a great and much-missed French Wikipedian and historian

[edit] Ancient Library originated names

May I suggest that, if names are needed, ancient libraries be honored? I think this keeps with the theme. Toastysoul 10:14, 3 October 2006 (UTC)


[edit] Early computers & SciFi names.

Enigma, Leo, Univac, Multivac are a few that come to mind. Napier deserves a mention as well.

HAL9000

[edit] Where are browne and coronelli?

browne and coronelli are listed as Squids on the page, but they are not listed in [1].

Browne is not sending ganglia data for an unknown reason. Coronelli is down. —Kate | Talk 14:08, 2004 Dec 11 (UTC)

[edit] Why don't the usage statistics work?

The statistics for Wikipedia on this page haven't worked since October. Why?

Because it places too much load on the servers. They will be back at some point when there is a better way to do it. --Kate | Talk 14:08, 2004 Dec 11 (UTC)
How about copying the raw files to another computer and running webalizer against them to generate the stats? Any news on the progress of getting fresh stats would be appreciated.
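
The offline approach suggested above can be sketched roughly as follows. This is only an illustration of tallying hits from copied log files on a separate machine; it is not Webalizer, and it assumes the logs are in Common Log Format, which this page does not confirm. The function name `count_hits` and the sample lines are made up for the example.

```python
from collections import Counter

def count_hits(log_lines):
    """Tally requests per URL from Common Log Format lines (assumed format)."""
    hits = Counter()
    for line in log_lines:
        # The request is the first double-quoted field: "GET /path HTTP/1.0"
        parts = line.split('"')
        if len(parts) < 3:
            continue  # skip malformed lines
        request = parts[1].split()
        if len(request) >= 2:
            hits[request[1]] += 1
    return hits

# Hypothetical sample lines; the addresses come from documentation ranges.
sample = [
    '192.0.2.1 - - [11/Dec/2004:14:08:00 +0000] "GET /wiki/Main_Page HTTP/1.0" 200 1234',
    '192.0.2.2 - - [11/Dec/2004:14:08:01 +0000] "GET /wiki/Squid HTTP/1.0" 200 567',
    '192.0.2.1 - - [11/Dec/2004:14:08:02 +0000] "GET /wiki/Main_Page HTTP/1.0" 304 0',
]
top = count_hits(sample).most_common(1)
```

Because the counting runs against copies of the files, it puts no load on the production servers beyond the transfer itself.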

[edit] Content page edits

User:Kate what are you doing?--213.66.119.190 22:13, 5 Jan 2005 (UTC)

There were a few changes up to [2] that User:Kate reverted before making some kind of change. What's going on, User:Kate?--213.66.119.190 11:22, 7 Jan 2005 (UTC)

Since nobody else would, I'll revert to the mentioned edit and let Kate respond.--213.66.119.190 11:34, 7 Jan 2005 (UTC)

I think she updated the name links. They were just reedited.--213.66.119.190 22:16, 14 Jan 2005 (UTC)

[edit] Servers in *.fr

Is there an estimate of when the squids in Europe will start helping Wikipedia? Would it be feasible to balance the load between the EU and US squids through round-robin DNS? Would be great to see some more servers helping to cope with the load :-) -- Arno 84.128.35.174 12:06, 6 Jan 2005 (UTC)

They are currently active and serving en: and fr: for users in most of Europe excluding the UK and Ireland. See statistics. Kate.
They seem to be serving the UK as well now. -- Anon, 17 Feb 2005.
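
For readers unfamiliar with the round-robin DNS idea raised above, here is a minimal simulation, not a description of how the Wikimedia squids are actually balanced. It assumes a name server that rotates its address list on every query and clients that always use the first address returned; the function name and the addresses (taken from documentation ranges) are hypothetical.

```python
from collections import Counter, deque

def round_robin_answers(addresses, queries):
    """Simulate a name server that rotates its A-record list per query.

    Returns the first address handed out for each query, which is the one
    a simple client would connect to.
    """
    ring = deque(addresses)
    firsts = []
    for _ in range(queries):
        firsts.append(ring[0])
        ring.rotate(-1)  # move the address just served to the back
    return firsts

# Hypothetical pool: two "US" squids and one "EU" squid.
pool = ["192.0.2.10", "192.0.2.11", "198.51.100.20"]
picks = round_robin_answers(pool, 9)
load = Counter(picks)
```

In this idealized model every address receives the same number of queries. Real resolvers cache answers, so the actual split is usually less even, and plain round-robin does not send users to the geographically nearest cluster.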

[edit] Discrepancy: Is "ariel" the master database server or just the standby?

In the table it says "ariel" is the standby and "suda" is the master database server. In the third bullet point below it states that "ariel" is the master and "suda" is the standby. Which is correct?

For a while, Ariel was the database master. Recently, Suda was elevated to master status, because of its more powerful IO array.

However Ariel is now the master again, temporarily (explanation). Suda is still the preferred master ATM because although ariel is faster, it has less disk space. Kate

Can I ask if this is still valid after all the fun around the 22nd of Feb? When I was lurking on IRC it seemed that you may have switched again. Thanks. --Haggis 11:11, 23 Feb 2005 (UTC)

ariel is still the master. after some compression and rebuilding of ariel's innodb files, the disk space issue is mostly solved for the moment. after the new hardware order arrives and is burned in, we'll probably change to using that for the new master. kate.

[edit] So how do you rate Squid?

I know you folks are working hard and that you have almost unparalleled traffic growth to contend with, but from the user's vantage point I haven't seen a whole lot of site speed improvement in the English Wikipedia since the installation of Squid servers. Some, but not much. Is it a case of solid hardware/software/networking improvements being offset by crazy traffic increase, or have the improvements (particularly Squid) proved disappointing? JDG 05:57, 18 Feb 2005 (UTC)

Before we used Squid we had only two servers, one of which was sorely underpowered, and believe me it was a lot worse. :)
We set up squid well over a year ago; roughly 75% of HTTP requests are cache hits, leaving only 25% for the Apache+PHP cluster to deal with. In the last twelve months our traffic in average hits/sec has increased eightfold; the cache hit ratio has stayed about the same, so in raw numbers cache misses requiring service have also increased eightfold. (That's a doubling every four months.) In the same time we've gone from about 8 servers to about 40, which as you can see doesn't quite match the eightfold increase though it's hard to compare a number like that.
See the Squid stats since the end of February last year: http://wikimedia.org/stats/live/ and our bandwidth usage in the Florida cluster: http://65.59.189.201/www.bomis-total/www.bomis-total.html
Squid's been wonderful: without it we'd have been dead in the water long since (or had to do a lot of redundant work on alternate caching systems to reduce the cache hit service time to be equivalent to what Squid would have given us). Having a front end cache keeps cache hit returns fast, helps free up the apaches' queues for real work, and vastly reduces the impact of spiders and 'flash-mob' hits. --brion 06:21, 18 Feb 2005 (UTC)
Ok, I guess you would have been dead in the water. It's not easy to grasp but I suppose when you reach a certain traffic level you're doing well just to stay even. Google and Yahoo are the undisputed champs for fast response under super heavy usage, but apparently they manage it mostly through brute force: dozens and dozens of regional server farms... I was going to suggest more resources go into load balancing, which I think is more key than caching. But I'm not in much of a position to pontificate. My admin experience was with busy webservers, but way back in '00. I had great results breaking up the load between 6 boxes using simple DNS round robin, but my average concurrent users were at most 1/4 your average... Oh well keep up the good work and I hope the latest fundraiser allows you to actually gain on the problem rather than just keeping pace. JDG 06:41, 20 Feb 2005 (UTC)
Our current load balancing is pretty awful, although it has been getting better recently... there are so many things needing work that it tends to be quite low down on the list of priorities, though. Maybe some day... -Kate. (incidentally, i'm not sure it's more important than caching - if you removed squid, all apaches would be at 100% CPU no matter how you distributed the load :)
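
The figures brion quotes in this thread can be checked with a little arithmetic. The sketch below uses only the numbers from the reply (75% cache hits, eightfold traffic growth in twelve months, roughly 8 to roughly 40 servers). The per-server figure is a rough ratio only, since the server counts include machines other than Apaches; variable names are made up for the example.

```python
import math

hit_ratio = 0.75           # share of HTTP requests served from the Squid cache
traffic_growth = 8         # growth in average hits/sec over the period
months = 12
servers_before, servers_after = 8, 40   # approximate counts from the reply

def doubling_period(growth_factor, period_months):
    """Months needed for traffic to double, assuming steady exponential growth."""
    return period_months / math.log2(growth_factor)

miss_share = 1 - hit_ratio                 # requests left for Apache+PHP
miss_growth = traffic_growth               # hit ratio unchanged, so misses scale with traffic
server_growth = servers_after / servers_before
misses_per_server_growth = miss_growth / server_growth

print(f"doubling period: {doubling_period(traffic_growth, months):.1f} months")
print(f"cache misses: {miss_share:.0%} of requests, grown {miss_growth}x")
print(f"servers grown {server_growth:.0f}x, misses per server {misses_per_server_growth:.1f}x")
```

An eightfold increase is three doublings, so twelve months gives one doubling every four months, as stated. On these rough numbers each server handles about 1.6 times as many cache misses as a year earlier.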

[edit] Can we please move that list?

The list of servers has become ridiculously lengthy and should be moved to a page of its own. When I did that, however, I was quickly reverted. As the explanation given for that revert seems not quite satisfactory, I'd like to suggest re-reverting (unless I am missing something and there is a sound reason for Alterego's action). Kosebamse 18:56, 10 August 2005 (UTC)

[edit] Questions about statistics

I have some questions about the statistics at http://www2.knams.wikimedia.org/stats:

  • In what timeframe and how often is the data collected?
  • Is it possible to get the data for pages that are not part of the 100 most visited pages?
  • Why are the numbers in the count column of the referer and the user agent statistics very low compared to the URL statistics?

Thank you for the answers. -- 84.167.85.10 20:42, 12 August 2005 (UTC)

[edit] Ambiguous albert :)

The standalone list says SuSE Linux 9.1, whereas the albert entry @ meta-wiki says Fedora Core. What is true, then? ;) 80.129.108.48 12:04, 20 August 2005 (UTC) -andy

this one. i fixed the other one. kate

[edit] Is there any particular reason

that two of the database servers have public IPs? Plugwash 23:32, 22 August 2005 (UTC)

Likewise, it seems even more odd that one of the squids has a private IP. Aren't the squids the systems that actually take the requests from the internet? Plugwash 23:35, 22 August 2005 (UTC)

[edit] Bad link

http://wikimedia.org/stats/live/ is 404 for me right now. -- Beland 01:30, 23 August 2005 (UTC)

[edit] Moving overview to the top

I guess that the overall system architecture is more interesting to most people than the detailed (long) listing of every single server. What do you think about swapping their positions in the article? --84.56.228.90 12:55, 23 August 2005 (UTC)

[edit] connection refused

"If you are getting a "connection refused" error, that is a squid problem. Determine which IP address you are trying to connect to...". Please could you add either a simple explanation of how to do this or a link to one (preferably an external link, as internal links will give "connection refused" messages). I had this problem last night, and after a bit of searching on Google found a reverse DNS lookup, which gave me a list of ~20 IP addresses, all of which result in "this wiki doesn't exist" pages. I don't have IRC set up on this machine, and cannot access IRC at work when the problem happens there. Thryduulf 07:33, 13 September 2005 (UTC)
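
One possible way to find out which IP addresses a hostname resolves to is a forward DNS lookup, sketched below with Python's standard `socket` module. This is an illustration rather than official troubleshooting advice; the function name `resolve_addresses` and its `resolver` parameter (there so the lookup can be swapped out for testing) are made up for this example.

```python
import socket

def resolve_addresses(hostname, port=80, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses a hostname resolves to."""
    infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the address is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_addresses("en.wikipedia.org")` on the affected machine lists the addresses your resolver currently returns. Which of them the browser actually used is not shown by the lookup, so when several come back each may need to be tried in turn.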

[edit] Actual?

Who can insert the Talk:Hardware ordered August 30, 2005? Thanks. 80.185.50.73

Why no advertising? I think a modest amount of targeted advertising would be ok, as long as it only took up a very small amount of space. The money collected should be more than enough to pay the costs of running Wikimedia.org.

Paid accounts could be offered that are free of advertising.

[edit] Now actual?

What about the hardware ordered September 14, 2005? 80.185.38.93 19:11, 16 November 2005 (UTC) Status: installed. The problem is the new image server ordered in November 05.

[edit] Amane CPUs

What CPUs does Amane have? According to Intel there's no 3.4 GHz Xeon MP and no 3.4 GHz dual-core Xeon, only normal ones with HT. Is that system a multiprocessor system without multiprocessor CPUs? --212.204.66.66 10:24, 24 November 2005 (UTC)

[edit] Statistics

I would like some statistics such as CPU, disk, network, and RAM usage. This could be monitored with MRTG and SNMP, for example. What about adding a monitoring suite to each server? -- 24.18.207.109 17:44, 25 December 2005 (UTC)

Just have a look at ganglia, it will probably tell you about everything you want to know, if you click yourself through to single server level. There has been some talk on the tech mailing list about switching to another monitoring system (Nagios, I believe), but until then Ganglia is what you get :-). Happy Christmas, as I am European and I can call it anyway I want ;-), --Mdangers 22:27, 25 December 2005 (UTC)
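
To illustrate the kind of per-host sample such a monitoring suite collects, here is a minimal sketch using only Python's standard library. It does not use MRTG, SNMP, Ganglia or Nagios and does not represent how any of them work internally; the function name and the dictionary keys are assumptions made for this example.

```python
import os
import shutil
import time

def sample_host_metrics(path="."):
    """Take one snapshot of load average and disk usage on the local host."""
    usage = shutil.disk_usage(path)
    try:
        load_1min = os.getloadavg()[0]   # Unix-like systems only
    except (AttributeError, OSError):
        load_1min = None                 # not available on this platform
    return {
        "timestamp": time.time(),
        "load_1min": load_1min,
        "disk_used_fraction": usage.used / usage.total,
    }

metrics = sample_host_metrics()
```

A real agent would take such samples at a fixed interval and send them to a central collector that draws the graphs.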

[edit] Updating?

Is it that this page hasn't been able to be updated, or do we really have over 20 servers that were ordered in Nov and haven't been installed yet? I'm really not asking this from a complaint standpoint, which we have all too much of around here, but from a what needs to be done standpoint. Is it a lack of manpower, and if so is it because we can't find people we can trust or does there just need to be more done to attract people that can help? In the same vein, there's nothing new in Wikimedia servers/hardware orders. Is it already known what needs to be bought next or is that just delayed until there's time to install the current hardware? Anyway thanks for your time and efforts. I don't have the technical admin skills to help, but maybe I can contribute in other ways that are helpful. - Taxman 20:05, 25 January 2006 (UTC)

I concur - Ravedave

[edit] How many run the GNU/Linux operating system?

The article states "all but two of them run the GNU/Linux operating system".
But the table lists isidore as running "FreeBSD5.3-REL" and zedler as running "Solaris 10".
Something's wrong. What? --David Edgar 17:53, 3 February 2006 (UTC)

This page is wrong and should be phased out in favor of our offsite cluster management documentation at https://wikitech.leuksman.com/. isidore is running Linux these days. --brion 19:12, 3 February 2006 (UTC)

[edit] I have a Network Forensic System to Donate

Solera Networks

Where do I send it and to whom? Please name the server "Sequoyah" after the inventor of the Cherokee Alphabet.

Jeff V. Merk

[edit] Kennisnet clusters

Hello, there are 13 Kennisnet clusters, not 12. --80.136.226.130 06:35, 24 April 2007 (UTC)

actually there's 1 Kennisnet cluster, with (by my count) 28 servers. this page is hopelessly out of date. Kate

I'm missing Yarrow in the list. --62.154.250.10 07:12, 11 October 2007 (UTC)

[edit] Why not use blade servers?

Why not use blade servers for Wikimedia projects? They occupy less rack space and are easier to maintain.--Ellery 06:18, 16 May 2007 (UTC)

[edit] Server diagram

Wikimedia servers

Some time ago de:Benutzer:Kolossos challenged us at de:WP:Grafikwerkstatt to draw a new server diagram "if someone is bored". I'm not admitting anything, but here is a draft. I am a bit wary of doing something like this without being involved in the matter, especially since there is so little documentation. So, please comment and tell me what all I got completely wrong. --Hk kng 21:44, 11 November 2008 (UTC)

Nice work! A small note about search: there is no separate Lucene and conventional search. It's all mw:Extension:lucene-search, but some hosts run version 2.0 and others 2.1. --Rainman 09:15, 12 November 2008 (UTC)
So what are the machines marked "LS2 indexer" (srv56) and "LS2 test" (srv77/79/80)? --Hk kng 14:02, 12 November 2008 (UTC)
All hosts that begin with search (e.g. searchidx1,search1,search2...) are lucene-search 2.1, and all others are lucene-search 2.0, with exception of srv77/79/80 which are currently unused (they used to host ls2.wikimedia.org which is a test environment). --Rainman 14:33, 12 November 2008 (UTC)
So is srv56 also part of this test environment, or are there two indexers working? If so, should I depict them as sharing the load, or are they addressing different machines? --Hk kng 18:45, 12 November 2008 (UTC)
They index different wikis, so they are sharing the load ATM. But eventually srv56 will become obsolete. --Rainman 18:17, 14 November 2008 (UTC)

I think it looks good enough to replace the old image from 2006. It's a wiki and it's an editable SVG, so a small bug wouldn't be a real big problem. --Kolossos 13:51, 14 November 2008 (UTC)