Talk:Upgrade discussion April 2004

Sorry if this isn't applicable or is already being done, but does Wikipedia/MediaWiki cache history pages the same way as regular content pages? From looking at the URL, it seems history pages are 100% dynamic. I was thinking about Wikipedia performance and realized that a history page changes with exactly the same frequency as its content page, so why not cache history pages in the same way? That would especially help if a history page ever gets externally linked traffic, and surely history pages get significant traffic anyway, with everyone checking for changes or looking for a history of trolls.

History is 100% dynamic. The problem with caching (and with database replication) is that someone who is looking for vandalism, or checking a recent edit from the recent changes page, needs to see the current history. We have put history on a replicated child server and occasionally get complaints/glitches when that server lags in replication by a few seconds and doesn't show the latest status. This replication lag, even with a very fast slave server, is one of the factors that makes me think partitioning the data by language/timezone is the best way to go for database scalability: it increases the ratio of database cache RAM to data and should scale very well, with less chance of replication issues (a rough sketch of the idea follows the list below). Still, there are other views, and there is active work being done on the software to help with all scaling options, including replication without troublesome out-of-date data display. We do cache these pages:
  • User watchlists (the pages individuals say they have an interest in and want to see a custom listing of all changes to those pages). This is controlled by a switch and we turn on a variable cache time if load requires it.
  • The list of all pages and some other very high load dynamic pages. Again, switch-controlled, with another switch to allow any administrator/sysop to regenerate the page.
  • Login details.
  • Probably more - it's not a part of the system I've looked at very closely. Jamesday 21:59, 7 Jul 2004 (UTC)
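To make the partitioning argument above a little more concrete, here is a minimal sketch, not MediaWiki code: the hostnames, the connect() parameter and the SQL table/column names are invented for illustration only.

 SHARDS = {
     "en": "db-en.example.internal",
     "de": "db-de.example.internal",
     "ja": "db-ja.example.internal",
 }
 DEFAULT_SHARD = "db-small.example.internal"  # smaller wikis share one server

 def shard_for(language: str) -> str:
     """Return the database host that owns all data for a given language."""
     return SHARDS.get(language, DEFAULT_SHARD)

 def fetch_history(language: str, title: str, connect):
     """Read a page history from the shard that owns this wiki.

     connect(host) is a caller-supplied factory for a database connection.
     Because the whole wiki lives on one server, each server's working set
     is smaller relative to its cache RAM, and an ordinary read always sees
     that server's latest committed data -- no replica can lag behind it.
     """
     conn = connect(shard_for(language))
     return conn.query(
         "SELECT * FROM old WHERE old_title = %s ORDER BY old_timestamp DESC",
         (title,),
     )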

Is Wikipedia's caching 100% edge caching? Why not cache data on the web servers too? If the cache invalidation code were faster, would making history pages cacheable not be a problem? How much traffic do history pages get? Perhaps a lot, given pagination. At any given time, what percentage of the open database connections are there to satisfy history page queries?

Data is cached both in the Squid servers and in the web servers. For the web server caches we use memcached (which I don't recommend) to store the content HTML, i.e. without the navigation divs. Page views for logged-in users are cached by serialising the user preferences relevant to rendering and concatenating them to the cache key. Invalidation is done by updating a cur_touched field in the database. On page view, the cached page is loaded, and if it has expired, it is deleted. The cur_touched field is also used for client-side caching.
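As a minimal sketch of the keying and invalidation scheme described above: a plain dict stands in for memcached, and render_page()/get_cur_touched() are caller-supplied placeholders rather than real MediaWiki functions.

 import time

 cache = {}  # cache key -> (content HTML without navigation divs, cached-at time)

 def cache_key(title: str, prefs: dict) -> str:
     """Concatenate the rendering-relevant user preferences onto the key, so
     users whose preferences produce different HTML get separate entries."""
     return "parser:" + title + ":" + repr(sorted(prefs.items()))

 def view_page(title: str, prefs: dict, get_cur_touched, render_page) -> str:
     key = cache_key(title, prefs)
     entry = cache.get(key)
     if entry is not None:
         html, cached_at = entry
         # cur_touched is bumped in the database whenever the page changes;
         # a copy cached before that moment has expired and is deleted on view.
         if cached_at >= get_cur_touched(title):
             return html
         del cache[key]
     html = render_page(title, prefs)
     cache[key] = (html, time.time())
     return html

The same cur_touched value can also drive client-side caching, e.g. as the basis for a Last-Modified / If-Modified-Since check, which matches the note above.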
Caching of history views and diffs would make a difference; however, profiling suggests that the lowest-hanging fruit at the moment is in fact much simpler: a cache mapping titles to IDs. -- Tim Starling 04:09, 8 Jul 2004 (UTC)
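A minimal illustration of such a title-to-ID cache, memoising the lookup so repeated requests for the same page skip a database round trip; lookup_page_id() is an invented placeholder for the real database query.

 title_to_id = {}  # page title -> numeric page id

 def get_page_id(title: str, lookup_page_id) -> int:
     if title not in title_to_id:
         title_to_id[title] = lookup_page_id(title)
     return title_to_id[title]

In practice a cache like this also needs invalidation on page moves and deletions, since a title can change which ID it points to.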

Filling empty CPU sockets on SMP motherboards

FYI, the last time I tried this, with Intel CPUs, I found that the CPUs could not simply be the same model/clock speed; they also had to have the same 'stepping'. As near as I can tell, this means the same production run. So it can be hard to obtain CPUs that match the ones already installed when filling an empty CPU socket. There's always the option of buying two new CPUs and selling or repurposing the old ones. For all I know the SMP 'stepping' situation may now be moot due to changes in technology; YMMV.