IRC office hours/Office hours 2018-11-01

From Meta, a Wikimedia project coordination wiki

17:00:02 <Keegan> #startmeeting Structured Data on Commons
17:00:02 <wm-labs-meetbot> Meeting started Thu Nov 1 17:00:02 2018 UTC and is due to finish in 60 minutes. The chair is Keegan. Information about MeetBot at
17:00:02 <wm-labs-meetbot> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:02 <wm-labs-meetbot> The meeting name has been set to 'structured_data_on_commons'
17:00:29 <Keegan> Welcome everyone to our IRC office hours for Structured Data on Commons!
17:00:44 <Keegan> I'm Keegan, hosting again
17:01:23 <Keegan> We have a lot that the development team can talk about, and we're open to whatever topics the community might want to talk about
17:01:24 <Keegan> I
17:01:44 <Keegan> I'll list some things we have, and feel free to start writing your questions/comments/concerns
17:02:22 <Spinster> I'm also here, and so are abittaker and risler
17:02:30 <Keegan> This past month we ran a discussion over the search prototype
17:02:31 <abittaker> heyo heyo
17:03:26 <Keegan> We've also continued to have discussions around property creation on Wikidata
17:03:46 <Keegan> Which is still a high priority thing to work on
17:04:23 <Keegan>
17:04:48 <Keegan> Are there questions from anyone here yet?
17:04:58 <Keegan> Otherwise I'll make someone ramble for a bit :)
17:05:19 <Steinsplitter> We have to port wikitext into the new wikibase, do you plan to create some kind of parser which does that automatically?
17:05:51 <Steinsplitter> There are millions of files and just "a few" active users in that area :-D
17:06:02 <Keegan> Good question, there's an answer to that
17:06:51 <Keegan> While the answer is being written, y'all can feel free to ask other questions
17:08:15 <risler> Hello! The WMF won't be building a tool specifically for that conversion, but a couple of prominent community members are already in the process of building tools to do it (including Magnus, who will probably be sharing his plans with everyone at some point in the relatively near future)
17:08:31 <Steinsplitter> Magnus, perfect.
17:09:09 <Steinsplitter> We have to automate as much as possible, the amount of files can't be edited by hand. :)
17:09:48 <Keegan> When we first started talking about this project in 2014, the 25 million files then seemed like a lot :)
17:09:54 <Steinsplitter> At first I was a bit skeptical, but looking at the beta tool it seems good to me. Especially the search function.
17:10:34 <Steinsplitter> yes, now we have 50 million \O/
17:11:36 <Keegan> Definitely glad you like what you see so far
17:11:54 <Keegan> Since there's a pause, I'll spam the properties table again
17:12:30 <Keegan> Is anyone here active in Wikidata property discussions?
17:13:51 <nikki> as a wikidata admin, I am, sometimes
17:13:58 * nikki continues lurking
17:14:28 <Keegan> Fair enough, thanks for the lurk
17:14:51 <Spinster> I propose properties too, in my volunteer capacity. But in this area I'd prefer to keep work and volunteering separate. I can totally give feedback on property proposals by other people though
17:15:32 <Keegan> I ask in part because we're going to start a second licensing conversation later today (North American time), and Wikidata properties for licensing need to be sorted out in the near future
17:16:53 <Keegan> The development team is looking to see what the workflow for people on Commons requesting properties on Wikidata will look like, to make sure everything is supported in the software
17:19:16 <Keegan> (Ramsey is writing out something, I think)
17:19:54 <Steinsplitter> some people have their own license templates (not really a fan of that), this is also something which should be considered. :/
17:20:08 <Keegan> While he does, hello to all who have joined since the start. Feel free to write out your questions or comments, no need to wait for a topic to come up
17:21:06 <Keegan> Steinsplitter: Those are great, aren't they?
17:21:27 <Keegan> That's definitely worth discussing in the licensing conversation.
17:21:37 <Steinsplitter> yepp
17:22:12 <nikki> will the licensing conversation be in here?
17:22:22 <Steinsplitter> (btw, i saw "Commons file's OTRS ticket " <-- we have to change abusefilters then to restrict editing that prob by the global otrs usergroup etc but that is something we can talk about once it is built)
17:23:02 <Keegan> The development team is making it so that the platform is flexible enough to deal with special licenses, buuutttttttt the community does decide modeling, so it will be important in the licensing conversation to figure out how to use those special cases
17:23:20 <Keegan> nikki: Licensing conversation is onwiki
17:23:32 <nikki> ok
17:23:40 <Keegan> Definitely holding all decision-making on-wiki
17:23:47 <Steinsplitter> :)
17:24:26 <risler> For AbuseFilter and SpamBlacklist, we're working on ensuring those work with the new MediaInfo extension and its features. You can track that in the phab ticket here:
17:25:02 <Steinsplitter> perfect
17:26:44 <Keegan> So a question for y'all then, related to the licensing templates. Keep in mind this is not the WMF's decision, there's no opinion here. What do you all think about keeping templates in the future?
17:27:23 <Keegan> If we move licensing over to structure that works in a way suitable for the community, do you see yourself wanting the two environments to live side by side?
17:27:50 <Keegan> They can, the only harm is diverging data in the future if things change in one place but not the other
17:28:33 <Steinsplitter> we can't have a system that parses the Template: (the license shortcut) into the wikibase? We have a lot (omg, a lot!) of licensing templates.
17:29:08 <Steinsplitter> has such a parser currently (an old, bad and hacky one)
17:29:28 <risler> such a system could certainly exist, but what about if users manually change the structured data? Should that be reflected back in the template?
17:31:06 <Steinsplitter> good question. Or we do not allow "by hand" edits (or at least restrict them to trusted user groups; a lot of templates are fully or semi-protected) for that structured data field?
17:32:50 <Steinsplitter> e.g. if i move a page on commons, it is updated automatically on wikidata (using my account). So people don't have to edit two times.
17:32:59 <Keegan> (risler is writing)
17:33:28 <mpeel> I’d be quite interested in seeing the file templates modified so that they read the structured data and present additional information around it (or one single template along the lines of the wikidata infobox)… related to that, is it going to be possible to access the structured data easily, e.g. using or the lua functions?
17:34:14 <risler> From an implementation standpoint it might be a bit tricky to restrict editing on a per statement basis (it's pretty much all or nothing right now), but if it's something the Community feels is necessary we can look at it.
17:35:42 <Keegan> (more writing!)
17:36:33 <risler> as for how much Lua we'll be able to do with statements, that's still under exploration from a technical consequence perspective. What we really want to know at this stage is which functionality/capability the community really wants and will use.
17:38:23 <Keegan> mpeel: What kind of additional information would you like to present?
17:38:27 <mpeel> I’d suggest looking at - it should be possible to use that without having to manually specify the QID, instead using the depicts property in structured commons.
17:38:55 <mpeel> and you know I already have waiting in the wings to do something similar for non-artworks. ;-)
17:41:12 * Keegan nods
17:41:46 <mpeel> it should be feasible to do something similar with licensing as well, to take the license info from the structured data and use that to display extra information about that license (and do auto-categorisation etc. if we still want that then).
17:42:20 <Steinsplitter> Maybe we need a "Legal disclaimer" as well?
17:42:22 <Keegan> If the structured information is already well displayed on a page, what additional value are the templates to readers?
17:42:42 <Keegan> What can we still do with templates that structured data isn't achieving?
17:42:57 <Keegan> (With the same information, that is)
17:43:28 <Keegan> (this discussion is a prelude to the licensing discussion for sure)
17:43:32 <mpeel> Keegan: I’m assuming it’s similar to Wikidata, where you have the basic info about the file stored in the structured data, and then tap into the linked items to show more info about it, rather than repeating that information all the time.
17:44:34 <mpeel> in my ‘Depicts’ template linked above, just saying ‘depicts: lovell telescope’ means that the template can then pull out a whole load of info about the thing that’s depicted, and show that to the user. Unless I’ve misunderstood things, the structured data won’t be doing that on its own?
17:44:38 <James_F> The structured data will be displayed on the file's page, of course, unlike with Wikidata where the statements are not shown alongside the article.
17:46:19 <Keegan> 14 minute warning
17:46:20 <risler> actually mpeel, Structured Data will at least *partially* pull some info about things depicted
17:46:47 <mpeel> James_F: I’m assuming that the structured data displayed would be more like than ?
17:47:18 <risler> we have what we're calling the "Artwork Scenario", where if a file depicts a painting Q item with its own depicts statements for instance, we will show those on Commons and also add them to the search index
17:47:24 <James_F> What risler said.
17:47:42 <mpeel> (where “category’s main topic” is the same as ‘depicts’)
17:47:52 <risler> that process *could* potentially extend to other properties as well. we would love Community feedback on that
17:48:46 <mpeel> hmm, ok, I look forward to seeing examples of that then. I’d worry about the level of reconfigurability that it has if it’s embedded in mediawiki rather than done as a template, though.
17:48:46 * Keegan makes a note of that
17:49:08 <mpeel> (e.g., look at the number of changes that have been made to the wikidata infobox over the course of this year)
17:49:35 <Keegan> mpeel: so templates leave more editorial control, yes?
17:50:48 <James_F> mpeel: By "reconfigurability", what kinds of variation are you thinking of? What statements would appear at all? What order they would appear in?
17:53:05 <mpeel> James_F: particularly changes in the logic that fetches info from wikidata items, either directly or as a chain. E.g., ‘location’ has a chain-like approach on wikidata, and you have to follow through different items to get the full chain, and figure out where to stop. Taxon is similar.
17:53:22 <mpeel> Keegan: yes. ;-)
17:54:10 <Keegan> There's a little over five minutes left, any other questions or comments out there?
17:54:40 <Keegan> Any lurkers that have a contribution to make here as we wrap up?
17:57:47 <risler> thanks mpeel, that's the kind of feedback we're looking for more of. That approach has its downsides as well, which we have to consider. Hopefully in near-future communications we'll see what the general community consensus on this is.
17:58:24 <Keegan> All right, I'm going to go ahead and wrap this up, because by the time a new topic comes up and is answered we're past time :)
17:58:37 <Keegan> Thanks everyone for participating/reading along
17:59:18 <Keegan> If you have not already, sign up for the structured commons focus group to receive occasional, brief messages about consultations, updates, and IRC office hours
17:59:54 <Keegan> Otherwise, keep a lookout for the licensing discussion coming today that I've mentioned several times already :)
17:59:55 <mpeel> Maybe think of templates as the ‘quick-and-dirty’ approach - it’s much easier to change them around to do different things, than something that’s embedded in mediawiki and changes all have to go through staff members, code review etc. Once things are working pretty much as-desired, though, then the logic could be moved into mediawiki if desired (and if you want to do that to the wikidata infobox at some point, I wouldn’t object too much
17:59:56 <mpeel> ;-) )
18:00:11 <Keegan> Right right
18:00:20 <Keegan> Thanks all!
18:00:24 <Keegan> #endmeeting