
User:Brooke Vibber/Berlin 2009 roundup

From Meta, a Wikimedia project coordination wiki

Summary of discussions I was in on at the MediaWiki developer meetup in Berlin...

More detailed announcements on some of these are starting to appear on the tech blog now...

OpenStreetMap integration


Many folks are interested in getting more flexible mapping integrated into Wikipedia. Integrating with the OpenStreetMap project seems like an excellent way to do it, working with an open project with a similar ethos without having to reimplement mapping from scratch.

For an initial implementation, we're looking at:

  • Setting up a local mirror of OpenStreetMap data and map tile generation
    • Main contact: JeLuF
    • WMDE is funding 2 servers for this, plus a third for toolserver-related map experimentation.
      • These first two will be hosted in Wikimedia's racks in the Netherlands. Depending on load requirements and failover, in the future we may establish additional OSM mirror nodes in our US hosting, but we don't have a firm commitment on it yet.
  • Updating MediaWiki extension for embedding inline maps
    • Main contact: Aevar
    • SlippyMap extension needs some fixes, and static map generation for non-JS clients.
  • Mailing list for central communication on mapping projects:
    • maps-l has been set up for this
      • New mailing list is good so folks more on the OSM side can stay involved without being inundated with wikitech-l :)
      • Note also OpenStreetMap page on meta

We're hoping to have something visible to present in time for the State of the Map conference in July. Aevar may be available to present here; WMDE and/or WMF will see about arranging details and travel funding as necessary.

  • Also make sure we have a pres at Wikimania! (Patricio excited about making sure we've got one)

OSM issues to consider:

Future map- and geodata-related goals to consider:

  • Localization issues
  • Data overlays, highlighting of places
  • Native storage and querying of geographic coordinates (from inline maps, geocoding templates, etc)


Main contact: Luca

  • Current model planned for initial test deployment on GeneWiki page set
    • Trust server to run on a restricted-access box in Wikimedia cluster
      • Possibly with more direct DB access to pull past revision data, though plain API access will also be faster in-cluster
        • Luca to coordinate w/ Brion on usage of fetchText.php and similar
      • Edit hook to ping server via HTTP or such; make failure cases nice and clean
      • Luca will continue developing the server code on this basis
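
The "make failure cases nice and clean" part of the edit hook could look something like the sketch below. It is only an illustration: the function name, the `rev_id` field and the URL are made up, and the real hook would live in MediaWiki's PHP code rather than Python. The point is just that a slow or dead trust server should never break a page save.

```python
import http.client
import urllib.error
import urllib.request


def notify_trust_server(url, rev_id, timeout=2.0):
    """Tell a (hypothetical) trust server that a revision was saved.

    Returns True on a 2xx response and False on any failure, so the
    caller can carry on with the edit either way.
    """
    try:
        request = urllib.request.Request(
            url,
            # POST body; the field name is an assumption for this sketch.
            data=f"rev_id={rev_id}".encode("ascii"),
            headers={"Content-Type": "application/x-www-form-urlencoded"},
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Bad URL, refused connection, timeout, HTTP error status, broken
        # response: report failure instead of raising into the save path.
        return False
```

A short timeout matters as much as the exception handling here, since the ping would run inside the save request.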

Work ongoing, no deployment dates yet.

PediaPress updates

Chatted a bit with Heiko -- we've got some server and UI end updates ready to go, improving rendering performance and easing creation of large book lists. Yay!

Image storage and other wacky tricks

Chatted a little with Inez @ Wikia about image stuff -- alt backend storage ideas (but neither of us has a firm implementation yet :) and a neat trick they're experimenting with: delaying the loading of inline thumbnails below the fold until the user scrolls down, which saves image bandwidth if they never do.

Images, media, and uploads

Talked a little w/ Michael Dale and FireFogg Jan on some upload issues...

  • Upload status feedback for copy-from-URL can be implemented by pushing progress updates to memcached or such. (The actual file is stored locally on the web server node for now.)
  • Possible server-side transcoding of videos to aid in push-button uploads...
    • Consider how to keep the workflow clean. Currently most stuff in MW assumes that as soon as it's uploaded it's ready to use; we might want to show a placeholder thumb with queue progress, so it can be placed in page etc while it's transcoding.
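
A rough sketch of the progress-feedback idea, under some assumptions: `DictCache` is an in-memory stand-in for a memcached client, and the key name and JSON fields are invented for illustration. The copy loop writes its progress to the shared cache after each chunk, and a separate status request just reads the key back.

```python
import io
import json


class DictCache:
    """In-memory stand-in for a memcached client (get/set only)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def copy_with_progress(source, dest, cache, key, total, chunk_size=64 * 1024):
    """Copy source to dest in chunks, publishing progress under `key`."""
    copied = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        dest.write(chunk)
        copied += len(chunk)
        cache.set(key, json.dumps({"copied": copied, "total": total, "done": False}))
    cache.set(key, json.dumps({"copied": copied, "total": total, "done": True}))
    return copied


def read_progress(cache, key):
    """What a status request would return; None if nothing is known."""
    raw = cache.get(key)
    return json.loads(raw) if raw is not None else None


if __name__ == "__main__":
    cache = DictCache()
    copy_with_progress(io.BytesIO(b"x" * 200000), io.BytesIO(), cache, "upload:demo", 200000)
    print(read_progress(cache, "upload:demo"))
```

Because the file itself stays on one web server node while status requests may land on any node, only the small progress record needs to go through the shared cache.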

Interlanguage link centralization

Talked with Nikola Smolenski:

  • Nikola's been advocating the current Interlanguage extension implementation, which uses a central wiki containing just interlanguage links, edited as text. This feels awkward to me, and I've recommended taking it a bit further: making a user-friendlier UI for editing the centralized links instead of using a separate wiki with manual text editing.
    • We'll see about doing some UI mockups and see what looks plausible to implement.
    • Chatted a little about backend implementation and how to pull in an alternate DB connection in MW (CentralAuth examples)

Quicker localization updates

Siebrand & Gerard:

  • Some localization activity drops off because delays in getting localizations updated from translatewiki to the live sites take away the instant gratification of local edits.
  • Agreed that there could be benefit to pulling updated localizations more quickly without always waiting for the next full svn up:
    • Caveat: new versions of things can require code updates, so we should only take changes to localizations when the English master has not changed from the current deployment.
    • Caveat: possible security concerns with immediate updates; HTML messages etc. Delaying until review & commit from translatewiki to SVN seems like best compromise (per Brion, Siebrand)
    • We currently have no infrastructure for actually doing it. :D
      • Will need to brainstorm on how to add in a message update overlay or something which can make this realistic. Sounds doable. No timetable established yet.
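
The first caveat above boils down to a per-message filter. A minimal sketch, assuming messages are available as plain key-to-text dictionaries (the function name and data shapes are made up; nothing like this exists yet):

```python
def safe_translation_updates(deployed_english, current_english, new_translations):
    """Keep only translations whose English source is unchanged since deployment.

    deployed_english: English messages in the code running on the live site
    current_english:  English messages at the revision the translations target
    new_translations: updated messages for one language
    """
    safe = {}
    for key, text in new_translations.items():
        # Skip messages that are new or whose English text has changed:
        # the deployed code may not expect the new wording or parameters.
        if key in deployed_english and deployed_english[key] == current_english.get(key):
            safe[key] = text
    return safe
```

For example, if the English text of `edit` is the same in both versions but `search` has been reworded, only the `edit` translation would be let through. The security caveat is separate and would still be handled by review before commit.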

Extension prereqs metadata

w/ GerardM

For automatic testing and for friendlier extension setup UI it's rather nice to have some metadata on prerequisites:

  • min/max PHP ver
  • min/max MediaWiki ver
  • does it require another extension to also be enabled
  • etc

Agreed it would be useful to talk later on hammering down a standard extension metadata file format for this info.

  • (There's been some past discussion of this, needs to be brought up to date or replaced with something final.)
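
To make the discussion concrete, here is one possible shape for such metadata and a checker for it. No format has been agreed, so every field name below is an assumption, as is the example extension.

```python
def parse_version(text, width=3):
    """Turn '1.15' into (1, 15, 0) so versions compare numerically."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def version_in_range(version, bounds):
    """bounds is a dict with optional 'min' and 'max' version strings."""
    v = parse_version(version)
    low, high = bounds.get("min"), bounds.get("max")
    if low is not None and v < parse_version(low):
        return False
    if high is not None and v > parse_version(high):
        return False
    return True


def check_prereqs(meta, php_version, mw_version, enabled_extensions):
    """Return a list of human-readable problems; empty means OK to enable."""
    problems = []
    if not version_in_range(php_version, meta.get("php", {})):
        problems.append("PHP %s is outside the supported range" % php_version)
    if not version_in_range(mw_version, meta.get("mediawiki", {})):
        problems.append("MediaWiki %s is outside the supported range" % mw_version)
    for name in meta.get("requires", []):
        if name not in enabled_extensions:
            problems.append("requires extension %s" % name)
    return problems


# Hypothetical metadata record for a made-up extension.
EXAMPLE_META = {
    "name": "ExampleExtension",
    "php": {"min": "5.1"},
    "mediawiki": {"min": "1.14", "max": "1.16"},
    "requires": ["OtherExtension"],
}
```

Returning a list of problems rather than a yes/no answer suits both uses mentioned above: a test runner can log them, and a setup UI can show them to the admin.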

New search UI

Nikola reminds me that we need to deploy the updated search UI on all wikis! It's been on test wiki for a while.


LA2 recommends we try to get some WM presence at FSCONS in Gothenburg in November, maybe someone from the Usability Project.

  • Need to get in touch w/ them to suggest, see if there's interest.

There will also be a Wikipedia Academy (Swedish-language event) in Stockholm the days following FSCONS.

  • Make sure Frank & friends are in the loop :D

Hacking At Random

Henna suggests organizing some WM/MW presence at HAR, hacking/camping event in Netherlands. Sounds fun! :D

  • August 13-16, 2009

URL shortening

Stumbled on article about URL shortening issues; might be interesting to consider setting up our own URL-shortening service or providing nicer shorter permalinks in general.

Hacking Days

Spoke with Carlos and Patricio about hacking days plans... Still need to finalize based on availability but we're targeting:

  • Day or half-day ahead for setup/intro/planning/brainstorming (setting up work tracks unconference style)
  • Open hacking room throughout the conference
    • power, network, nice places to sit and hack or brainstorm or check your email
  • Wrap-up session open to public

XMPP RecentChanges feed

Duesentrieb, Leafnode, Catrope

Necessary components:

  • Having MediaWiki send out the XMPP payload over UDP
    • Catrope may be interested in poking this; using API code to wrap up the formatted RC data. Needs some cleanup on API formatter/output code
  • XMPP client library, accept the bits over UDP and send out over regular XMPP to test XMPP server at toolserver
    • This part needs to be deployed in internal cluster
    • Leafnode interested in making this
    • Brion will need to make sure it's reviewed and deployed
  • XMPP server -- test with off-the-shelf server on TS
    • Leafnode, Duesentrieb etc to test with this for now.
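
The first two components can be sketched roughly as below. This is a toy under stated assumptions: JSON stands in for whatever payload format the API formatter ends up producing, the event fields are invented, and the `forward` callback stands in for the XMPP client library that would publish to the server.

```python
import json
import socket


def send_rc_event(event, host, port):
    """Wiki side: fire-and-forget one recent-changes event over UDP."""
    payload = json.dumps(event).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


def open_listener(host="127.0.0.1", port=0, timeout=2.0):
    """Relay side: bind a UDP socket (port 0 lets the OS pick a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(timeout)
    return sock


def relay_one(sock, forward, bufsize=65535):
    """Relay side: receive one datagram and hand it to `forward`.

    In a real relay, `forward` would publish the event over XMPP.
    """
    data, _addr = sock.recvfrom(bufsize)
    event = json.loads(data.decode("utf-8"))
    forward(event)
    return event
```

UDP keeps the wiki side cheap: if the relay is down, the datagram is simply lost and the page save is unaffected. The cost is that the feed is best-effort.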

WMDE sponsored software projects

Will resummarize this later, Dues will announce separately

Stats notes

w/ Arash

  • Some chatting on logging-related stats
    • Has been talking w/ Erik Zachte; some stuff is currently doable, some may need additional work
    • Some peeking at cool things in Google Analytics such as conversion goals; can we set up some similar sorts of tracking stats at our traffic level?
  • Dumps needed for some of EZ's stats
    • Can we set up a one-off enwiki history dump on the existing system, maybe with a dedicated box, in the meantime until the new system is ready to go?
      • Will want to coordinate w/ Tomasz on this

Image zooming

  • http://toolserver.org/~kolossos/zoom-image/zoom-ol.html
    • Use of map-style tiling for pan-and-zoom on large images in general
      • Great for panoramas!
      • Potentially handy for other things
  • (cf gigapan.org)
    • which has a pretty Flash utility, but it can be done in pure JS too as above (Google Maps/OSM/etc style)

Neat! Consider adapting this for large scans, panos etc.
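
For a feel of what map-style tiling means for a big scan or panorama, here is a small sketch of the tile-pyramid arithmetic. It assumes 256-pixel tiles and a halving of resolution per zoom level, as slippy maps usually do; the demo linked above may use different parameters.

```python
import math


def max_zoom(width, height, tile_size=256):
    """Smallest zoom level at which full resolution fits in 2**zoom tiles per side."""
    zoom = 0
    while tile_size * (2 ** zoom) < max(width, height):
        zoom += 1
    return zoom


def tiles_at(width, height, zoom, top_zoom, tile_size=256):
    """(columns, rows) of tiles at `zoom`, where `top_zoom` is full resolution."""
    scale = 2 ** (top_zoom - zoom)       # downscale factor at this level
    scaled_w = math.ceil(width / scale)
    scaled_h = math.ceil(height / scale)
    return math.ceil(scaled_w / tile_size), math.ceil(scaled_h / tile_size)


if __name__ == "__main__":
    w, h = 10000, 4000                   # e.g. a wide panorama
    top = max_zoom(w, h)
    for z in range(top + 1):
        print(z, tiles_at(w, h, z, top))
```

The viewer only ever fetches the handful of tiles covering the visible area at the current zoom, which is why this scales to very large images.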

Wikiword navigator

Duesentrieb's spiffy thingy...


uses interlanguage links and categorization as basis

How much needs to be done to use this sort of system to assist in translating search keywords for commons etc?

  • Commons is interpreted like a language :D
  • Is being worked on... awesome!
  • Should be able to present state of work at Wikimania

Google Summer of Code

Went through a first pass over student applications with Siebrand, Tim, Michael Dale, and Roan Kattouw/catrope.

  • 2 we're very happy with and ready to approve
  • 2-3 potentially interesting that need mentors from SMW or Usability Project zones
  • Several others which might have interesting ideas but need to be fleshed out
  • A few buzzword-bingo or not really relevant submissions
  • A couple totally bogus subs

Approval deadline is April 15, we'll do another pass after feedback.


Chatted with Siebrand, Tim et al on bugzilla issues

  • We definitely want to improve handling to make sure we're keeping up with input issues :D
  • Some thoughts:
    • Set defined goal 'timeout' to get a response in
    • Be more proactive about setting a LATER or WONTFIX on things we don't think are feasible or desirable
    • Be more active about patch review :D
    • Clean up default assignments, extension components which aren't receiving attention etc
      • "Bugmeister" position may be valuable to ensure things are being handled, assigned, or discarded. This may be a role for a new WMF developer position, we'll plan more and announce as there's news :D

Search server RAM

Chatted w/ Mark -- let's make sure this discussed upgrade actually gets done soon!

Internal governance tech visibility

Chatted a bit w/ Domas about ensuring that tech maintains a strong visible presence in WMF internal governance.

WYSIWYG notes

Chatted a little w/ Domas about templating and WYSIWYG issues; we agree that w/ modern wikitext the main issue usually isn't the inline formatting, it's the media and templates and tables! Formatting per se tends to be abstracted away into the templates other than simple linking etc.

Want to keep an eye on the projects poking towards that direction to make sure they're optimizing for that use case.

Cached object expiration rebuild pressure

w/ Inez @ Wikia

Noted that some of the fancy things in their new skin are expensive to build, but when they expire from memcached many web servers try to rebuild them at once. Chatted a bit about how this is implemented with the message cache and if there's a way to generalize this for other objects that are expensive to build to avoid DB pressure on expiration.
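
One general way to take the pressure off is to let a single worker rebuild while everyone else keeps serving the stale copy. The sketch below shows that pattern inside one process, with a thread lock standing in for a lock key in memcached. It is not a description of how the message cache does it, and the class name is made up.

```python
import threading
import time


class StampedeGuard:
    """Cache one expensive value; on expiry only one caller rebuilds it."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._value = None
        self._has_value = False
        self._expires = 0.0
        self._lock = threading.Lock()

    def get(self, rebuild, now=None):
        now = time.monotonic() if now is None else now
        if self._has_value and now < self._expires:
            return self._value
        # Expired or missing. With a stale copy in hand we do not wait for
        # the lock; with nothing at all we have to wait for the first build.
        if self._lock.acquire(blocking=not self._has_value):
            try:
                # Re-check: another caller may have rebuilt while we waited.
                if not self._has_value or now >= self._expires:
                    self._value = rebuild()
                    self._expires = now + self.ttl
                    self._has_value = True
            finally:
                self._lock.release()
        return self._value
```

The trade-off is that some requests see slightly out-of-date content during a rebuild, which is usually fine for skin fragments and similar objects.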

On-wiki image editing

w/ Nikola

Brief chat; ref to the SVG localization talk and demo implementation from Alexandria.

Also consider other sorts of image modification such as cropping, rotation, color adjustment.

Idea to allow for reference to alternate versions of an image, such as localized or cropped, similarly to the way we can refer to a specific page in a DjVu document.

I noted also there may be some value in coordinating w/ the video sequencing framework Michael Dale's working on -- meta-media of this sort might benefit from a shared model for both still and video stuff.

Enotif lagging issue

(w/ Tom Gries)

  • Still some possible issues
    • The update flag needs to be updated at a clean consistent time
      • -> Make sure these are properly synchronized
      • If the watchlist bolding is lagged, it gets very confusing!
      • Ensure that it's all handled nice...

(Options for priority queue? But make sure the multiple issue and the lag issue aren't interacting badly anymore too!)

Skins notes

w/ DannyB

Hoping to replace the two existing skin architectures with one cleaner one :D

  • Note coordination w/ usability proj, some potential gsoc proj
  • Cf bug 17817

Message cache performance

w/ Siebrand

Extension messages have been becoming much more expensive!

  • There was some talk on this, not sure exactly what the status is :D
    • Some thought about serializing the ext messages together
    • Need status update from Michael or someone else at the session!

(Added by Siebrand - please correct where I am mistaken or use incorrect jargon) Got more information from TimStarling. There has been talk between him and Domas. The idea is to have a high performance message cache in a BDB. In it, all messages would be stored, and they would be retrieved when needed, as opposed to the current approach where all messages are loaded. Allegedly the compiled database can be loaded into kernel cache, making it the fastest way to access messages. I asked Tim about usability, and he said that it could also be usable outside the WMF. People who can implement this may be few, and he indicated that Domas would probably be the best man for the job. Analysis is needed on whether all current Wikimedia extensions are using wgExtensionMessagesFiles[] and wfLoadExtensionMessages(), because this feature would not support anything else (1 day). Estimated time required for implementation in the source code would be 2-3 days more after that. Guess the former does not need to be done by Domas.

Changes/additions would be in/from MessageCache.php, and Tim thought that some refactoring would also be required in the message functions in GlobalFunctions.php. When using the BDB message cache feature, the cache would be precompiled (just like current serialisation), and wfLoadExtensionMessages() would end up doing nothing in such a use case. As far as I understand, the new feature will be backward compatible.
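
As a rough illustration of the "compile once, fetch on demand" idea, here is a sketch using Python's standard `dbm` module as a stand-in for a BDB file. The real thing would be in MessageCache.php, and the function and class names here are invented.

```python
import dbm
import os
import tempfile


def compile_messages(path, messages):
    """Precompile all messages into one on-disk key/value file."""
    with dbm.open(path, "n") as db:
        for key, text in messages.items():
            db[key.encode("utf-8")] = text.encode("utf-8")


class MessageStore:
    """Read-only view that fetches single messages instead of loading all."""

    def __init__(self, path):
        self._db = dbm.open(path, "r")

    def get(self, key, default=None):
        try:
            return self._db[key.encode("utf-8")].decode("utf-8")
        except KeyError:
            return default

    def close(self):
        self._db.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "messages")
    compile_messages(path, {"mainpage": "Main Page", "edit": "Edit"})
    store = MessageStore(path)
    print(store.get("edit"))
    store.close()
```

With a store like this, a request that needs three messages touches three keys, and the operating system's file cache keeps the hot parts of the file in memory across requests.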