Wikimedia Conference 2010/Developers' Workshop/Notes/OrganizationWorkingGroup

From Meta, a Wikimedia project coordination wiki

Wikimedia Developer's Meeting Berlin - 14, 15 April 2010

Google Summer of Code Prioritization:

http://socghop.appspot.com/gsoc/org/list_proposals/google/gsoc2010/wikimedia

We have 5 slots available. We could get more if we trade, but we MUST make sure we have enough mentor availability if we do.

PDF of the current projects: http://dl.dropbox.com/u/4991761/gsoc_proposals.pdf

Which projects would we accept if we had extra slots?

  • Some interest in the audio recording proposal, but there are Flash issues (HTML audio input is not yet standardized, I think) and it may not be a full summer project.
  • Possibly some interest in interwiki transclusion (Tim notes performance/localization issues). Roan (who is mentoring) will talk to the student and figure out more specific goals and intended steps. If those conversations are fruitful, then we may vote that proposal up in the ratings.


Danese feels that as we've never before successfully led more than 3 GSoC projects to completion, 5 is a good number for this year.

What are our real "Top 5"?

  1. Extension management platform for MediaWiki. Student: Jeroen De Dauw; mentor: Brion Vibber. Extension Management platform is very strong – definitely going to happen. Brion (present) knows this student and is prepared to mentor.
  2. General RDF export/import in Semantic MediaWiki. Student: Samuel Lampa; mentor: Denny Vrandecic. Denny is an academic and Samuel is a student. High confidence that this could be completed in a summer.
  3. Improve metadata support for uploaded media in MediaWiki by displaying and extracting embedded IPTC, XMP, and other metadata formats. Student: Brian Wolff; mentor: Chad Horohoe.
    • Multimedia Metadata overhaul project is also strong – follow-on from meeting in Paris. Chad (present) knows the student and is prepared to mentor.
  4. Wikisource Legal Tool. Student: Stephen LaPorte; mentor: Ariel Glenn. Looks interesting – this is a legal domain expert using GSoC to learn to write a program to inject legal content into Wikipedia. The project looks around the correct size for GSoC. Some concerns that the student might not yet be particularly competent in PHP, but there is an available proof of concept and the project would be a good way to learn PHP. Ariel knows the student and is keen to mentor. (Note added 2010-04-15 by Ariel: Ariel is happy to mentor but does not know the student.) [NOTE: that must be another student that Ariel had contact with then. My apologies for wrong information - DMC]
  5. We have two alternatives for our last slot and have one week to reach a conclusion:
    1. JavaScript overhaul of Semantic MediaWiki. Student: Sanyam Goyal; mentor: Yaron Koren. We like that this supports our general migration to jQuery; however, we have concerns that Semantic MediaWiki should really have their own GSoC area. Comments clearly show that Yaron is interested in mentoring. Jeroen thinks Yaron proved to be a very good mentor during GSoC 2009.
    2. Reasonably efficient interwiki template transclusion. Student: Peter Potrowl; mentor: Roan Kattouw. We have questions about the scope of this proposal. Roan is potentially interested in mentoring and will contact the student for more info.

Proposals we are deprecating:

  • Porting texvc to Python: we'd prefer to have it in PHP, and a two-step port seems unlikely to succeed. There are mixed feelings about rewriting an already working piece of software, currently in OCaml, in Python; a direct rewrite in PHP would be preferred. The student has said on the mailing lists that he won't consider rewriting it in PHP, so this project is a non-starter. (Note from robla on 2010-04-15: I don't believe he ever said this...it's just no one convinced him that making it PHP-based was a great idea. He specifically said he'd be willing to take on a PHP-based project on-list.) [NOTE: quote from student's comments on proposal page "...I don't think a PHP port was a serious option since it has no parser generator (and anyone who wants to pursue it anyway should look at LaTeX2MathML to see what happens when you try to write a parser by hand in PHP)..." - this influenced our ranking - DMC]
  • The group of proposals concerning maps seems to be taking a shotgun approach, with the same student proposing several projects, which is concerning. In general this area will probably be better served by a targeted contract to complete the work (which will most certainly take more than a summer).
  • Most popular related articles: deprecated because the student is likely unaware of the complexity of the feature and the paucity of data sources for deriving article rank.
  • Peer-based voting for bias indicator: deprecated because the student is likely unaware of the complexity of solving this problem. Doesn't seem solvable in a summer.


About Semantic MediaWiki: It would be preferable for them to have their own mentoring organization, since they are not closely related to the Wikimedia Foundation (and they have several proposals, including two that are very highly ranked). We will probably deprecate one of the two proposals currently in the top ranking.

Project Management for Wikimedia GSoC Engagements: Rob Lanphier (RobLa) has agreed to handle monitoring / reporting as Assistant Admin. Danese is the official Project Admin in the GSoC system.

2010-04-15 Second session

Topics

Bug triage

  • Andrew: general issue is that no one is responsible for triage. No one acknowledges a bug, and it may have been fixed without that having been noted. No WMF staff member is assigned to fixing bugs.
  • Danese: has good news. The budget has been submitted and still has to be approved, but she wants to hire a Bugmeister early. Priyanka was initially hired for it, but priorities shifted, so we need someone else for the job. Wants to hire someone committed to the practice for 2 years, and wants to design a process that will allow this, with proper overlap with the successor. Wants to have a proper environment/atmosphere to let newcomers/developers in.
  • Andrew: Reporters want some sort of feedback.
  • Danese: that's what the Bugmeister is for. Priyanka will be "bug / process infrastructure".
  • All: She needs to fix the weekly report that broke with the Bugzilla upgrade.
  • Danese: Changing to another tracking system, possibly integrating with project management features, is being considered. No decisions have been made yet regarding it. Also planning a developer event on the east coast, in the Winter Quarter, like a hack-a-thon. "Mass bug smash" may well be the first one.
  • All: talk about how to make bug smash work, and that at least some preparation needs to be done before having such an event, to select low-hanging fruit.
  • Danese: Calcey may be used for non-usability stuff in the future. The sequestering of the Usability Team is going to disappear post-Stanton, so Tech has a deeper pool of resources (and because it was a somewhat artificial distinction). Also planning to distribute work across more people, so that people don't get overworked and can take vacations without stuff collapsing.
  • CoE: Bugs are marked RESOLVED FIXED even though the fix isn't live, so it is difficult to tell what's live and what's not.
  • Tim: Alternative is to not mark non-deployed things as FIXED, maybe tweak Bugzilla to add a new bug status for this.
  • Roan: Code review has release notes, but it doesn't work the way he, at least, expects it to.
  • Ryan: Track software and ops things separately, link related items?
  • Danese: Priyanka will hopefully look into this once she has more time.
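
The status idea raised by CoE and Tim above can be sketched as follows. This is only an illustration of the distinction, not a Bugzilla configuration: the status names FIXED_IN_CODE and DEPLOYED, the Bug class, and both helper functions are hypothetical and were not proposed in the meeting.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    NEW = "NEW"
    # Hypothetical status: the fix is committed but not yet live.
    FIXED_IN_CODE = "FIXED_IN_CODE"
    # Hypothetical status: the fix is running on the live sites.
    DEPLOYED = "DEPLOYED"


@dataclass
class Bug:
    bug_id: int
    status: Status


def is_live(bug: Bug) -> bool:
    """Return True only when the fix has actually been deployed."""
    return bug.status is Status.DEPLOYED


def pending_deployment(bugs: list[Bug]) -> list[int]:
    """Return the ids of bugs that are fixed in code but not yet live."""
    return [b.bug_id for b in bugs if b.status is Status.FIXED_IN_CODE]


# Example data with made-up bug ids.
bugs = [
    Bug(1, Status.NEW),
    Bug(2, Status.FIXED_IN_CODE),
    Bug(3, Status.DEPLOYED),
]
waiting = pending_deployment(bugs)
```

With separate statuses, a reporter can tell at a glance whether a fix is merely committed or already live, which is the confusion described above.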

Volunteer developer outreach

  • Danese: Thinks perhaps she was selected for CTO partially because she has built a lot of developer communities. Need to do a better job in dev. community outreach. Need to give newbie devs more love. 2-3 months in, for example, a GSoC project isn't enough to get to an acceptable skill level. First priority will be that the current staff becomes more resilient (per prior comments).
  • Chad: points out the community has grown without organisation. Loads of sheep, no dogs. Results in people doing little bits of things they are interested in. However, at times we need to focus attention.
  • Danese: Road mapping for the software *will* happen. Quarterly releases should probably happen again. One hire will be assigned to code review, and take some load off of Tim. Other communities have good practices for code review; no reason to not learn from that, and educate ourselves. Plan to hire a tech writer, too.
  • All: discussion about old/wrong/expired documentation, unmaintained code at mediawiki.org and in the extensions/ folder of the repo.
  • Danese: points out that she wants to strengthen ties with O'Reilly to try and get them to allow us to make use of the information that they have (i.e. access to their books). We want to professionalize the tech dept.
  • Andrew: How feasible would it be for 3rd-party extensions to be hosted outside our SVN repo, as lots of them are unreviewed, insecure, unmaintained, etc.?
  • Ryan: SVN is better than people pasting code at mediawiki.org
  • Tim: PEAR has a web frontend on top of its SVN extension repo that manages releases and other nice things.
  • Danese: We should be looking at PEAR, CPAN, CRAN.
  • Siebrand: How much is Jeroen's GSoC project going to cover? Need to ensure that what he makes can be used, so that WMF doesn't have to rewrite it.
  • Chad: Jeroen is basically a GSoC success story: originally a GSoC student we didn't know, he stuck around and has become useful/productive.
  • Danese: We will eventually also hire a formal Release Manager so Tim can go back to coding.
  • Tim: On backporting: We currently just backport to releases that are less than a year old, but we don't really know how popular each release is. Advocates a ping back from the installer, or something similar. Ryan wants this to be optional; Siebrand also wants it for installed extensions.
  • Tomasz: We actually do have stats for downloads.

http://download.wikipedia.org/cgi-bin/awstats.pl

  • Tim: We mostly backport security fixes, sometimes general fixes. We also backport i18n updates for core.
  • Chad: Release manager should take care of backporting and identifying fixes to backport. Maybe we could look at a release manager per branch, like PHP does. That would enable us to better maintain the branches we still officially support.
  • CoE: Should we identify one release per year to have a longer end-of-life time, say, two years?
  • Roan: Doesn't really make sense, people should just upgrade.
  • Tim: We mostly backport for people who have patches, incompatible extensions or other difficulties with upgrading, such as Wikia.

SVN/Version control

  • Danese: The pros and cons were discussed at length during dinner last night. Informed those interested that Tim needs to be convinced. My sense is that we can experiment, but will not consider a migration this year, because other high-priority destabilising things will be done this year (e.g. new data centre).
  • Tim: SVN is not a long-term solution. Agrees with Danese: not now.
  • Danese: SVN is an Apache top-level project now. Hope to broaden the community and documentation that way. Appears to have rejuvenated the project.

Code review/patches

  • Andrew: Currently only committers can queue code for review, by committing it. A patch submitter is basically screwed and needs to start asking around for someone to review their patches.
  • Tim: commit access is given easily; it does not have to be earned. Criteria are: the request is demonstrably in good faith, you have skills appropriate for what you're doing, and you are a programmer of some kind. This works.
  • Danese: wants to know more about this.

The end...