Wikimania 2023 WANZ

To go back to the WANZ meta page

A place for Wikimedia Aotearoa New Zealand members to add their notes and takeaways from Wikimania 2023.

Pre-Conference Day (Tues 15th Aug)

Notes from Mind the Gap

This was a full-day session that took place at Google Asia Headquarters.

  • Panel: Empowering Women in Wikimedia. Call to action: 1) Build courage, not confidence. 2) Collaborate with like-minded people to support your local community. 3) Take small steps to make big ripples. 4) Lift yourself and others. This Google course might be useful https://iamremarkable.withgoogle.com/
  • Yael Weissburg gave some stats: <20% of Wikipedia biographies are about women, <25% of active Wiki editors are women (up from 9% in 2011), c. 33% of readers of Wikipedia are women
  • Check out WikiStories https://www.mediawiki.org/wiki/Wikistories_for_Wikipedia, a new Wikimedia Project that may be a good way to get new editors engaged as it is quite short form writing
  • Make sure to regularly check out the Wikimedia Highlights on the Diff blog: https://diff.wikimedia.org/
  • This article, "How academic institutions can help to close Wikipedia’s gender gap" published in 2022 in the journal Nature is worth reading: https://www.wikidata.org/wiki/Q112123581. There is a paywall on it but you can access it via the Wikipedia Library.

Read this paper on Wikipedia Library

  • Google and Wiki projects point to each other frequently and have similar vision statements

Asaf's presentation on Wikidata Basics and SPARQL queries

  • See the resources on SPARQL queries on Wikidata at the User:Ijon page
  • Pressing Ctrl+Space with the cursor in the wdt: or wd: area of a SPARQL query gives you a search function!!! This was my conference highlight. Einebillion (talk)
  • Asaf's talk on SPARQL queries https://youtube.com/watch?v=eEfTTODS_8E Einebillion (talk)
  • An easy way to remember that wdt: = property and wd: = item is "wdt: = properT".
  • An important reminder: Wikidata can only tell you what people have put into it. SPARQL query results are not the full answer to the question (e.g. there may appear not to be many hospitals in certain parts of Singapore but it could just be that people haven't added all Singaporean hospitals to Wikidata).
  • Learning that 'SPARQL' is pronounced "sparkle" was one of my conference highlights. :) Chocmilk03 (talk) 03:56, 29 August 2023 (UTC)
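To make the wd: / wdt: distinction concrete, here is a minimal sketch of the hospitals-in-Singapore example as a query you could paste into the Wikidata Query Service. It was not shown in the talk: the helper name `build_instance_query` is made up for illustration, and the identifiers used (Q16917 hospital, Q334 Singapore, P31 instance of, P279 subclass of, P17 country) are assumed to be the right ones for this question. The script only builds the query text; it does not contact Wikidata.

```python
import re

# A Wikidata item ID is "Q" followed by a positive integer.
QID_PATTERN = re.compile(r"^Q[1-9]\d*$")


def build_instance_query(class_qid: str, country_qid: str, limit: int = 100) -> str:
    """Build a SPARQL query listing items of a given class in a given country.

    wd:  prefixes items (the Q numbers).
    wdt: prefixes properties (the P numbers), as in "wdt: = properT".
    """
    for qid in (class_qid, country_qid):
        if not QID_PATTERN.match(qid):
            raise ValueError(f"not a valid item ID: {qid!r}")
    return f"""
SELECT ?item ?itemLabel WHERE {{
  ?item wdt:P31/wdt:P279* wd:{class_qid} .  # instance of the class or any subclass
  ?item wdt:P17 wd:{country_qid} .          # country
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {limit}
""".strip()


# Assumed IDs: Q16917 = hospital, Q334 = Singapore.
query = build_instance_query("Q16917", "Q334")
print(query)
```

Whatever this returns is only what people have entered into Wikidata, so a short list is not evidence that there are few hospitals.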

ESEAP Strategy Summit 2023. Day One

Frictionless data

  • Frictionless data talk from Open Knowledge Foundation: Seemed to think they had a simpler way for people to contribute data to Wikidata than OpenRefine. This attendee was not convinced that was a great idea (their way does not include a reconciliation function and didn't at first glance look especially simple). Will link YouTube session when available. DrThneed (talk)

Hackathon and GLAM metrics

  • Mention at the Hackathon of the breakdown of analytics tools and its impact on GLAMs' willingness to invest in actively sharing on wiki (unsure if this is impacting Open Access overall). There was ongoing discussion of this topic between a number of GLAM people from Wikimedia Aotearoa New Zealand, Andrew Lih (Smithsonian Wikimedian at Large) and others, both in person and in the Telegram group GLAM-Wiki Global. The main issue discussed was the depressing news that the Met is no longer supporting the wiki work it had previously invested in. A contributing factor may have been the breakdown of the GLAM analytics tools that proved impact. Further discussion after the conference in the Telegram group included a response from Fiona Romeo and Giovanna Fontenelle from the Culture and Heritage team at the Foundation. They are not a product team but do collaborate with the product team. Initiatives for 2023/24 are outlined in the WMF annual plan https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2023-2024/Goals/Equity#Culture_&_Heritage This includes:
  • First of a few chats with people that came down to WAI262

Pre-Wikimania Workshop: Using Wikibase and Wikidata for Community Archives

  • Cool presentation about My Community, a Queenstown-centered community archive that collects, exhibits, holds community tours etc. Extremely ground-up, supporting residents with volunteers and heritage professionals. Built its collection by door-knocking, getting to know residents and recording stories/making acquisitions as relationships developed. Now working on building a public-facing collection management system on Wikibase, drawing on Wikidata for the data model and many items and properties. Aims to launch in June 2024.

Day One (Weds 16th Aug)

Diego's talk on Web2Cit

List of sites that need to be entered into Web2Cit

It was getting a bit long so I moved it - see the list in the DrThneed workspace (please contribute any you can find!). Also, if anyone wants to have a go at using Web2Cit together I would love to collaborate; my first efforts have gone awry! DrThneed (talk)

Papers Past

Competition between User:Einebillion and User:DrThneed. User:Einebillion has contacted the webmaster of Papers Past to request them to review the webpage metadata so that Zotero can ingest the metadata appropriately and therefore feed it accurately through to Citoid - the automatic citation tool. This will mean we don't need to use Web2cit to fix the issue. User:DrThneed has talked with Diego and has asked him to work on the tickets associated with the bug related to using Web2Cit on Papers Past. Let's see what happens first. (It's a win either way!)

Wikisource and Transkribus

  • OCR in Wikisource uses Google OCR and Tesseract; they're now piloting Transkribus for handwritten material
  • Transkribus is an ML tool to develop models that recognise handwriting in whatever language and character set, and it can be trained on a small set of pages. The models can then be used within the platform or ported over to Wikisource (I think)
  • There are 100 public models available, and 11 of these are integrated. The next languages to be integrated will be Balinese and Javanese. Wikisource Loves Manuscripts is interested to hear from anyone wanting support in this area.
  • You provide around 5000 words (images and accurate transcript), takes a couple of days to build the model, there is a video on how to do this.
  • Provides around 0.5-2% error rate on printed text, 2-4% on handwritten (single hand, same as source), 4-6% on handwritten (multiple hands, same as source), etc.
  • Feels like something that might be less useful for us in terms of te reo [although you can train a model for any language and any script] but may be great for English-language archival material

Visible Wikiwomen

  • It's important to not only upload images of women into Wikimedia Commons, but to make sure the images have structured data so they can be found. Over 8,000 images have been added to [VisibleWikiWomen].
  • The project has a particular focus on images of black, brown, queer, trans, non-binary and indigenous women.
  • Be aware of the potential risks for people in being visible and try to seek consent for uploading any images (e.g. being visible on Wikimedia Commons as an LGBT+ person could be a safety risk for people, maybe not so much in NZ as in other countries but still possible). The VWW project has resources available to help with getting images speedily deleted if there are trust and safety issues. Chocmilk03 (talk) 05:18, 29 August 2023 (UTC)

Using large language models to overcome bias: An experiment

  • Talk description: members of the Women in Religion User Group discuss how large language models (LLMs) could counteract gender bias. Could the use of LLMs accelerate the improvement of gender representation on Wikipedia by assisting with common editorial tasks, such as transcribing oral histories, summarizing academic papers and news articles, and drafting initial versions of Wikipedia articles about notable women?
  • The Women in Religion group has worked to design a process where reliable sources are ingested by an LLM which they claim has "nearly eliminated misinformation and hallucinations". I am sceptical, but interested to read their findings; need to track down the slides...
  • Other uses/applications for AI could include identifying knowledge gaps and what is missing from existing articles. Can also be used to assist with translating (but don't rely on AI only!). Chocmilk03 (talk) 05:18, 29 August 2023 (UTC)

Mentoring new editors on Wikipedia

  • The majority of new accounts never edit, and the majority of first-time editors never edit again. There are Growth mentorship tools available in multiple languages. Everyone who signs up to Wikipedia (*except* in English or Spanish, because there are simply not enough mentors in those languages) gets an online mentor. At the moment there are 800 mentors in 50 wikis.
  • Barriers to becoming an editor: Technical ("What is an infobox?"), Conceptual ("What is notability?"), Cultural ("Why are people so mean?")
  • Anyone who signs up to be a mentor will get a Mentor Dashboard to help and encourage newcomers, and stay in touch with them. The idea is that positive reinforcement will mean new editors are more likely to keep editing; however, the stats regarding retention, productivity and revert rate did not convincingly show that mentoring actually had this effect. Mentors also help reduce the workload of patrollers and admins. See [Mentoring] and also [Experiment Analysis 2020]
  • Time commitment to mentor: 1-3 questions per week from 1-3 mentees (can be configured).
  • You can enroll to be a mentor here: [[1]]

Kiwix4schools Project

Presentation by Ruby D-Brown and Eugene Masiku (see YouTube recording - starts at 5hr 59min)

  • Really brought home the digital divide and the cost of accessing information on the internet.
  • Emphasised the cost of data/internet access compared to monthly income: in the Central African Republic, 1 GB = 24% of average monthly income! Devices are also expensive.
  • Discussed the implementation of this pilot project to attempt to overcome some of these issues.

Opening Ceremony - Wikimania - and Wikimedians of the Year

Day Two (Thurs 17th Aug)

WMF legal team

Provided a great best-practice example of text that affiliates can use on their websites to make clear the separation between the affiliate and WMF. See etherpad by Einebillion (talk)

Wikilearn

See Etherpad

https://learn.wiki/courses This is a course platform that is in its early days. We are including anything and everything that supports Wikimedians in their work. Community engagement is needed to develop high quality content and to determine project scope and topics of interest. Einebillion (talk)

Secrets of editing workshops

see Etherpad - had some tips for organising workshops e.g. how to get around the account creation limit for IPs.

Training the Trainers Programme

See Etherpad notes by Einebillion (talk) and DrThneed. Really great presentation from Dr Sara Thomas about the process for developing a train-the-trainer programme. Really useful and helpful course material and design information in this etherpad https://etherpad.wikimedia.org/p/BLLNMH which includes a link to her own etherpad https://etherpad.wikimedia.org/p/Buildingcapacity

OpenStreetMap

Etherpad notes by Einebillion (talk) and ? Interesting session about how OSM works, particularly in connection with Wikidata. Notable that certain values can't be imported/exported - e.g. Wikidata can't import OSM's coordinates as they're licensed under the Open Database License, and OSM won't import coordinates from Wikidata due to data provenance issues (e.g. it can't use data coming from Google Maps). Was interested to learn that the EU/UK's database rights conflict with the USA's 'facts are not copyrightable' doctrine. Note from Einebillion (talk): facts aren't copyrightable in the EU/UK either. It's just that the compilation / assembling of facts may attract a database right. It's a "copyright is an onion" thing. There's a layer of copyright provided for the compilation of facts in the EU/UK. New Zealand has this too.

Apparently paid mapping is very common and totally a-ok, compared to paid Wiki editing.

There's an OSM/Wikidata matcher!

WikiFunctions

Early days, but WikiFunctions aims to be the central location for a whole lot of technical stuff (platform level not general user), rather than re-implementing across dozens of different projects. Specifically mentioned that they're not looking at handling templates but I don't have a ton of examples of what this does cover.

Teaching with Wikipedia - Wikimedia Argentina

In Spanish, English etherpad notes by User:Stitchbird2 here: https://etherpad.wikimedia.org/p/9PXLE9

Teaching with Wikidata - Wikimedia Chile

In Spanish, English etherpad notes by User:Stitchbird2 here: https://etherpad.wikimedia.org/p/CDKL3E Zenodo link: https://zenodo.org/record/7971984

Creative uses of Wikimedia Commons for a better understanding of heritage

  • Discussion of what is "built heritage" - should include tangible and intangible; emotional, sensory and visual; textures, sounds, memories, smells, stories, rituals, emotions...
  • Wikimedia Commons projects "Brent is you" and "Brent through your heart" documenting built heritage in Brent, England [note I can't seem to find links to these now]
  • I had no idea you could put sound into Wikimedia Commons! It has to be converted from the phone/camera format to Ogg. On Wikidata, there is a property called "audio" which contains this file.

Day Three (Fri 18th Aug)

Improving the Event Organizer Tool

Notes added to the events page by Einebillion (talk) as no etherpad had been set up. Event management / registration tool advancement https://wikimania.wikimedia.org/wiki/2023:Meetups/Event_Organizer_Tools_Meetup Einebillion (talk)

Hackathon Results

Mobile application for museums using Wiki projects

The first stage developed a tool where users enter a SPARQL query to extract museum data, Wikimedia Commons files and Wikipedia articles as linked information. The next step is creating a mobile application to upload this information for museums. Einebillion (talk)

QID on Navigation Popups

Request to show the Wikidata QID as a link in the navigation popup. When you hover over a link to a Wikipedia page it doesn't show the QID. What took so long? The original maintainer of Navigation Popups has not been active in a long time and the code has aged. 58,000 users have Navigation Popups enabled on Wikipedia but no one is maintaining it.

How to turn it on - Wikipedia, Preferences, Gadgets, Navigation Popups checkbox. Einebillion (talk)
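Until the popup shows it, the QID for a page can be looked up through the MediaWiki Action API using the `pageprops` property `wikibase_item`. A minimal sketch, not from the session: it assumes the English Wikipedia endpoint, the helper names `qid_lookup_url` and `extract_qid` are made up for illustration, and the sample response is hand-written in the default JSON shape (with a placeholder page ID) rather than fetched live.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumption: English Wikipedia; swap the host for another language edition.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def qid_lookup_url(title: str) -> str:
    """Build the Action API URL that asks for a page's Wikidata item."""
    params = {
        "action": "query",
        "prop": "pageprops",
        "ppprop": "wikibase_item",
        "titles": title,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def extract_qid(response: dict) -> Optional[str]:
    """Pull the QID out of a parsed API response, or None if there isn't one."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        qid = page.get("pageprops", {}).get("wikibase_item")
        if qid:
            return qid
    return None


# Hand-written sample response; the page ID "1" is a placeholder.
sample_response = {
    "query": {
        "pages": {
            "1": {
                "title": "Douglas Adams",
                "pageprops": {"wikibase_item": "Q42"},
            }
        }
    }
}

url = qid_lookup_url("Douglas Adams")
qid = extract_qid(sample_response)
print(url)
print(qid)
```

Fetching `url` in a browser returns the live JSON, which can be passed to `extract_qid` after parsing.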

10 Research Findings and How You Can Use Them in Your Work

by Einebillion (talk). Presented by Leila Zia. Very interesting results. Some will be useful for content gap prioritisation work. https://arxiv.org/pdf/2008.12314.pdf Einebillion (talk)

What makes some wiki communities more subject to governance capture?

10 mins on why/how Croatian Wikipedia was captured by right-wing editors when similar-sized Wikipedias in the same part of the world were more resistant to bias. Really interesting and relevant to us as a user group as we have some small-language Wikipedias in our part of the world. While we may not contribute to them, it might be worth our while making sure we know how they are governed and organised, so that we don't find something like this happening on our doorstep. DrThneed (talk)

Open Climate Campaign

Open Access research campaign. Lots of angles including working with funders to try and make OA more of a norm in contracts too. Think Stitchbird2 looked more closely at this, particularly the tool for making your own papers open? This video does a good job of explaining how it works: https://blog.oa.works/how-to-unlock-your-paper-and-help-the-climate-crisis/ Basically you end up uploading the *accepted version of your manuscript* (e.g. Word doc, not the final pdf) which is how you are able to legally self-archive it regardless of where it was published.

Speaker noted that a lot of historical research is still locked up and is needed, plus GLAMs hold a ton of the context that helps make sense of the findings (e.g. using collections to track physical changes in species over time).

Open sharing of research outputs is NOT the default: c. 40% of papers are open access in the USA & Canada. However, this improved for Covid19 research papers, 77% of which were published as open access, and preprints were also more widely adopted. About 47% of climate change papers are currently open access.

100,000 institutions

Etherpad by Einebillion (talk). Wiki Italia has been piloting an effort to get basically every GLAM onto Wiki by systematically making contact and providing a framework for providing at least 10 CC0 images - collections, interpretation panels, building photos, whatever. They have a ton of structural stuff in place including a contact/getting started form they'll be translating and some neat mapped analytics.

The best place to start is with Wikidata: Improve Wikidata with data on museums, create database of museum contacts, contact all museums (email), support museums (training, centralised support, form, large uploads, survey), and documenting case studies. It can take 2 years for a museum to go from knowing nothing about Open Access to achieving it!

See [[2]] for maps, visualisations and Wikidata.

They've improved Wikidata items for/related to thousands of orgs, 91 started the process, 22 have shared images, and they've worked up 5 case studies. Avocadobabygirl will be staying in touch with them to consider applying this in Aotearoa. According to UNESCO there are 236 museums in Aotearoa!

Wikidata tools

It would be worth watching this talk again. There is a translation tool called [Wikidata Terminator] to check out.

When on a Wikipedia page, click on "page information" to see which Wikidata items contribute to the article. When on a Wikidata item, click on "page information" to find out which pages use this Wikidata item.

There's a new team at Wikimedia Deutschland that is working on Wikidata tools for improving other wiki projects. Make sure to use the CiteQ template for citing Wikidata references in Wikipedia.

Wikimedia UK - Wikimedian in Residence in Climate/Environment

  • Wikimedia Visiting Fellow @ Global Systems Institute. Externally funded 1 year position Oct 2022 - Oct 2023 to provide internal advocacy, events and training, student engagement, expert outreach, external advocacy, partnerships, content releases, imagery. So far: 8 events, 70 editors, 248 articles, 7 million page views.
  • Challenges: no metric for *quality* of edits; expert availability/interest; retaining interest after edit-a-thons.
  • Ideally, experts would get trained to edit and get their pubs/experience into wiki via edit-a-thons etc. Barriers to getting experts involved: Workload, lack of understanding/legitimacy of wiki.
  • Reach out if you are an institution looking to do something similar.

James Taylor, Auckland Museum - Wikipedia and the Aotearoa NZ History Curriculum

  • Etherpad link
  • AK has 400,000 open images across 20 partners (wiki, GBIF, BHL etc.)
  • 2021 Wikimedia workplan (Commons, Wikidata, Wikipedia) and GLAM wiki project page
  • Teachers & students in NZ secondary schools use Wikipedia but local historical resources are lacking.
  • Creating Wikipedia articles for Auckland region suburbs that teachers can use in the curriculum.

Tamsin DrThneed - New Zealand's dissertations

  • Etherpad link
  • 2022 - 2023: 66,000 items, 13 types of theses from 1907 - 2022, full schema, round trip QIDs to institutions, match authors with advisors and main subjects.
  • Wikidata visualisations: SPARQL queries are both interesting and good for error checking!
  • Scholia profiles (taxa, people, places)
  • [Wikidata Thesis Toolkit]

Wikispecies and Wikidata

  • User:pigsonthewing
  • Open discussion about the future of Wikispecies, which holds unstructured data and could be improved. Is it the best platform, given that many people (including most in the room) edit in Wikidata, not Wikispecies? Also, which models of taxonomy are used/valid?
  • Steps toward solutions: Using CiteQ, images, authority control (Wikidata) for references, images and people; visualisations; and Scholia links to authors.

Empowering Wikipedia and Wikidata with AI: Scholarly publications

Wikimedia Commons - Gender and cultural diversity in images

  • For every 1 image of a woman on Commons, there are 3 images of men
  • There are content gaps regarding food and clothing, particularly for the "Global South"
  • Improved descriptions for images would help increase discoverability
  • There's a Commons tool called ISA that can help make adding structured data to Commons images easy: https://commons.wikimedia.org/wiki/Commons:ISA_Tool

Day Four (Sat 19th Aug)

Duplicating everywhere all at once - the issue of multiple entries for geographical points

Etherpad notes. Alex Lum is working on duplicates of geographic data in Wikidata. He's currently focussed on New Zealand, cleaning everything up using the Land Information New Zealand open dataset.

Improving the community wishlist

Etherpad notes by Einebillion (talk). Work is being done to improve the community technical wishlist process; see https://meta.wikimedia.org/wiki/Community_Wishlist_Survey Einebillion (talk)

The Future of Wiki in Education: What we've learned from the EDUWiki Conference in May and next steps

Much discussion about the theoretical hub that could be set up for education and what it might do (everything apparently). But more questions than answers, e.g. how it would work with affiliates, where the boundaries are between them and GLAM (they don't seem to have considered Wikimedia & Libraries boundary). I would like to see the three groups work together! etherpad DrThneed (talk)

Wikeys, a pedagogical game to discover Wikipedia

This session introduced a board game developed to teach about Wikipedia concepts to 12+ year olds. Free downloadable printable boardgame, for small teams, collaborative, 15-30 min game. The concept is working together to place your pieces, balancing symbols on the board to produce a score. The final score tells you the quality of your article (we reached GA in the session game), introduces concepts of using sources, balancing points of view, correcting errors, not making too many changes too quickly (or edit conflict and the game ends!). It was fun and interactive and although I think it needs a rename in English, I liked it as a way of teaching about Wikipedia. etherpad DrThneed (talk)

The challenges of volunteer committees in the movement

Etherpad notes by Einebillion (talk). This discussion / round table was about the challenges of volunteering on WMF and movement committees outside of or in addition to local regional / geographical affiliates. The challenges to contributing include timezones, availability of translators, and a lack of focus on equity, e.g. quotas to ensure the equitable participation of women were pushed back on by the community as undemocratic, even though it's been proven that quotas can improve democratic representation. See etherpad notes for further discussion points. Einebillion (talk)

10 things Basques have done in the last 5 years: a review of our Education initiatives

Interesting session on a wide-ranging education programme, including university students and primary school students and a kids' Wikipedia. etherpad DrThneed (talk)

Data Partnerships and the Future of Linked Open Data

Alan Ang and Kris Litson talked about Wikimedia Deutschland and the outreach this well-resourced affiliate is doing to onboard global institutions to the linked open data web via Wikidata and Wikibase projects. Einebillion (talk) met with Alan prior to this talk and has been cc'd into the preliminary conversation with the National Library of New Zealand staff. Einebillion (talk)

OpenClimate: Tracking greenhouse gas emissions

  • OEF - Open Earth Foundation, US non-profit "our planet first" https://www.openearth.org/
  • The first "Global Stocktake" voluntary self-evaluation of tracking of emissions will take place in Dec 2023 at COP28.
  • It's a huge challenge to track gas emissions at all levels from countries down to local emission sites. We all need to have this information on who is tracking well so we can offer rewards and accountability. Data is distributed across the websites of local entities, regulators and academic journals, in different formats and using different methods.
  • OEF has developed DIGS: Digitally-enabled Independent Global Stocktake. See https://www.openearth.org/projects/openclimate for background and https://openclimate.network/ to explore the data.
  • Tracking 150,000 actors, 360,000 annual emissions, 4,000 targets worldwide

Capacity development for underrepresented communities

  • Etherpad link
  • Worked with 3 communities: Aromanian, Romani, Macedonian sign language for community building, building resources in underrepresented languages and new methods of engagement.
  • Challenges: these languages not taught in schools (written form not standardised), written materials are scarce.
  • June 2022 - June 2023 to research - implement - evaluate project
  • Research phase is very important: survey community members, as well as detailed interviews with community experts (NGOs, community organisers). Implementation: workshops, editing contests, seminars, recording sessions.
  • 10 lessons learned from the evaluation phase. Highly relevant to Aotearoa for engaging with our communities (Māori, Pasifika, NZ sign language, etc.)

Structured data in Wikimedia Commons

Day Five - Extra library session (Sun 20th Aug)

https://etherpad.wikimedia.org/p/Wikimania2023Libraries