Research:Wikidata Gender Diversity

00:11, 27 September 2022 (UTC)
Marta Fioravanti
Duration:  2022-09 – 2024-03
Wikidata, gender, modeling, data, community

This page documents a research project in progress.
Information may be incomplete and change as the project progresses.
Please contact the project lead before formally citing or reusing results from this page.

Wikidata Gender Diversity (WiGeDi) studies gender diversity in Wikidata, focusing in particular on marginalized gender identities. It examines how the current Wikidata ontology model represents gender, and the extent to which this representation is fair and inclusive. It analyses the data stored in the knowledge base to gather insights and identify possible gaps. Finally, it looks at how the community has handled the move towards the inclusion of a wider spectrum of gender identities. A web application has been created to share the results publicly in a user-friendly way.


The Wikidata Gender Diversity (WiGeDi) project aims to investigate the issue of gender diversity in the Wikidata knowledge base, focusing in particular on the marginalized identities of trans, non-binary, and gender non-conforming people. All previous studies about this subject in Wikimedia projects have focused on the gender gap, defined as the gap in the representation of women versus that of men. Some of these studies (e.g. the ones by Konieczny and Klein) have acknowledged the existence of trans and non-binary people, but no research has looked specifically at how marginalized gender identities are represented, or how accurate and complete the current representation is.

Our initial study about this subject (see #Non-binary gender representation in Wikidata) shows that gender modeling in Wikidata has a very complex history. Important lessons can be learned from it about how the community has approached the representation of marginalized gender identities, and about which steps remain to be taken to make Wikidata a more inclusive project.

The WiGeDi project aims to center marginalized gender identities by performing a broad analysis of gender diversity in Wikidata, from three different — and complementary — perspectives:

  • the modeling question, looking at how the Wikidata ontology has evolved to support a more inclusive representation of gender, e.g., by updating the properties that directly or indirectly express gender; we aim to analyze the Wikidata ontology to identify representational issues and potential areas of improvement;
  • the data question, computing statistics about non-binary gender representation in the knowledge base, and analyzing its effectiveness and accuracy from a quantitative point of view;
  • the community question, examining how the Wikidata community has handled the evolution towards a more inclusive gender representation, with particular attention to user discussions about the topic.

Our project aims to answer all these questions by publishing a web application containing a real-time dashboard about gender diversity in Wikidata, an annotated timeline of gender modeling since the launch of Wikidata in 2012, and a corpus of gender-related user discussions.

Next events

  • May 2024 – The project is over but the research and dissemination continue!
  • June 2024 – Presentation at Trans-inclusive Seminar Series, UCL Institute of Education, London
  • July 2024 – Presentation at Social Media & Society 2024, Sheffield, London — NEW
  • August 2024 – Presentation at Digital Humanities Conference 2024, Arlington/Washington, USA — NEW
  • September 2024 – Presentation at Data Power Conference 2024, Graz, Austria — NEW


The WiGeDi project conducted a comprehensive examination of gender diversity within Wikidata, focusing on three distinct research perspectives: modeling, data, and community. By adopting this multi-faceted approach, the project aimed to gain a deeper understanding of the various aspects related to gender representation and inclusivity within the Wikidata ecosystem.

Through an in-depth analysis of the modeling aspect, WiGeDi explored how gender-related information is structured and represented in Wikidata. This examination helped uncover any potential biases or limitations in the current modeling practices, enabling the project to propose enhancements and refinements to improve gender representation and accuracy.

The project also delved into the data itself, examining the gender-related content and characteristics within Wikidata. This analysis involved the assessment of gender distribution, coverage, and representation across different domains and cultural contexts. By doing so, WiGeDi aimed to identify any disparities or underrepresentation, shedding light on potential gaps and biases within the data.

Additionally, the project recognizes the importance of the community aspect in shaping gender diversity within Wikidata. The project thus investigated the engagement and participation of contributors, exploring motivations and challenges faced by the users involved in the creation and maintenance of gender-related data and ontology.

By addressing the dimensions of modeling, data, and community, the WiGeDi project aims to contribute insights and recommendations for improving gender diversity within Wikidata. These insights benefit Wikidata itself and may also inform other Wikimedia and third-party projects. The following sections present each of the three complementary perspectives in more detail: the model, the data, and the community.


The Wikidata ontology model is analyzed with the tools of the Semantic Web. The goal is to analyze the model and its evolution over time in a critical way, and visualize the model to enable further analysis. The history of the model can be explored through the #Wikidata Gender Timeline.

Model perspective research questions

  • How is gender currently represented within the Wikidata ontology model? This question seeks to gain a comprehensive understanding of the existing structure and components that define the portrayal of gender within the platform. By examining the classes and properties related to gender, we aim to identify the specific elements that contribute to its representation.
  • Does the Wikidata ontology model provide a fair and inclusive representation of gender? This question delves into the examination of the existing model’s fairness and inclusivity. We will assess whether the current ontology adequately captures the diversity of gender identities, ensuring that it does not reinforce biases or perpetuate exclusionary frameworks.
  • How has the model evolved to accommodate gender identities that extend beyond the traditional binary view of gender? This question focuses on understanding the evolution of the model over time, particularly regarding its adaptation to recognize and encompass gender identities that do not conform to a strictly binary understanding. By tracing the changes in the model, we aim to evaluate the efforts made to incorporate and represent diverse gender identities.


The biographical data is analyzed following a critical data studies approach. The data is displayed through the Wikidata Gender Dashboard, which will offer statistics about gender identities in Wikidata, focusing in particular on marginalized gender identities.

Data perspective research questions

  • Which marginalized gender identities, such as those pertaining to transgender and non-binary individuals, are documented within Wikidata, and how are they represented? This question aims to identify and understand the presence of gender identities beyond the traditional binary framework. By examining the specific ways in which these identities are described and categorized, we seek to evaluate the comprehensiveness of the gender data within Wikidata.
  • How are these gender identities distributed across different dimensions, such as time, space, and other relevant metrics, and how do they compare to the broader biographical dataset? This question delves into the exploration of how gender identities are dispersed within the dataset. We will examine variations in representation over time, geographical regions, and other relevant factors to gain insights into the visibility and recognition of marginalized gender identities within Wikidata.
  • Does the current gender data present in Wikidata accurately reflect broader societal reality? This question aims to assess the representativeness of the gender data within Wikidata in relation to the overall demographics and diversity of society. By comparing the data within Wikidata to real-world demographics, we aim to understand how well the platform captures the multifaceted nature of gender representation.


The community discussions are analyzed using computational linguistics techniques such as topic modeling, together with critical discourse analysis. The anonymized discussions will be included in the Wikidata Gender Talk corpus. The corpus will be made available on request for other researchers to study.

Community perspective research questions

  • How did the current gender model within Wikidata come to be adopted? This question aims to unravel the factors and dynamics that led to the acceptance and implementation of the existing gender ontology. By examining historical records, community discussions, and relevant sources, we seek to understand the evolutionary path that shaped the current model.
  • What issues related to gender modeling have been discussed by the Wikidata community over the years, and how have these discussions evolved? This inquiry aims to explore the key topics and concerns raised by the community regarding gender modeling. We will trace the evolution of these discussions to identify the emerging themes, shifts in perspectives, and potential resolutions proposed over time.
  • How has the multilingual nature of Wikidata influenced user discussions related to gender? This question considers the impact of the diverse linguistic framework within the Wikidata community on discussions surrounding gender. By analyzing discussions across different languages, we aim to identify any variations, similarities, or unique perspectives that emerge from the multilingual context.

Project timeline

Date | Event
19 September 2022 | Official project start date
October–December 2022 | Planning, setup of contracts, data protection & ethics registration
January 2023 | Official start of contracts
January–March 2023 | Early design and prototyping phase, data collection
27 February 2023 | Paper proposal accepted for Internet Histories special issue
2 March 2023 | Abstract accepted at Data Justice Conference
9 March 2023 | Abstract accepted at IGALA12 Conference
23–25 March 2023 | Mid-project meeting in London
30 March 2023 | Abstract accepted at Nordic STS Conference
4 April 2023 | Abstract accepted at Challenging the Binary Conference
10 April 2023 | Abstract accepted at Queering Wikipedia 2023
17 April 2023 | Abstract accepted at Wiki Workshop 2023
13 June 2023 | Challenging the Binary presentation
19 June 2023 | Panel accepted at Wikimania 2023
20 June 2023 | Abstract and panel accepted at LD4 Conference
20 June 2023 | Paper proposal accepted for Communication, Culture & Critique special issue
21 June 2023 | Wikimedia Research Showcase presentation
4 July 2023 | IGALA presentation
12 July 2023 | LD4 presentation
18 August 2023 | Wikimania presentation
1 September 2023 | Project extended until 31 December 2023
29 October 2023 | WikidataCon presentation
6 November 2023 | Ethics in Linked Data book chapter officially published (Postprint)
2 December 2023 | Data Modelling Days presentation
1 January 2024 | Project extended until 31 March 2024
8 January 2024 | Paper proposal accepted for Bulletin of Applied Transgender Studies special issue
24 January 2024 | Invited presentation at Queering Big Data, Algorithms and AI, University of Manchester
5 March 2024 | Abstract accepted at Digital Humanities Conference 2024
6 March 2024 | Panel accepted at Social Media & Society 2024
13 March 2024 | Abstract accepted at Data Power Conference 2024
15–16 March 2024 | Queer Data Days project event

Research Ethics

The main ethical issues stemming from our project are as follows:

  • Handling of sensitive data. The data is already published on Wikidata following a Biographies of Living People (BLP) policy and a Notability policy, which should decrease the risk of us accidentally republishing sensitive information. However, to minimize any risks to the users, we are running queries on the SPARQL endpoint and aggregating the results without storing any of the individuals’ data on the server. Instead, we are storing only aggregated data that is displayed for visualizations and we are linking directly to Wikidata (but not to any individual biographical item) for additional information.
  • Reliance on user discussions. The discussions are already published on Wikidata, but indexing and publishing subsets of discussions about sensitive topics may make it easier to track the edits of specific users, making them vulnerable. To prevent this, and after consultation with our institution, we have decided not to openly publish datasets of user discussions. Instead, we are providing the tools to gather discussions from Wikidata APIs and annotate them via GitHub (temporarily unavailable pending code review), and publishing the methodology to gather relevant user discussions. However, we are willing to provide additional data to interested researchers on request. We are also avoiding direct publication of user discussions in research papers, relying instead on paraphrases whenever feasible.
  • Publication of usernames. Wikidata users are usually pseudonymous, but to minimize possible risks to them (e.g., harassment based on our findings), we are not publishing any usernames in any project outputs. This practice is consistent with the applicable Wikimedia policies.
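The aggregate-only handling described in the first point above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the function name `aggregate_by_gender`, the variable name `genderLabel`, and the sample rows (which use placeholder `example.org` identifiers rather than real items) are all hypothetical. The input mimics the standard SPARQL JSON results layout.

```python
from collections import Counter


def aggregate_by_gender(bindings):
    """Count result rows per gender label and drop everything else.

    `bindings` is a list of rows in the SPARQL JSON results layout:
    {variable_name: {"type": ..., "value": ...}}.
    """
    counts = Counter()
    for row in bindings:
        label = row.get("genderLabel", {}).get("value")
        if label is not None:
            counts[label] += 1
    # Only the totals are returned; item identifiers are never copied.
    return dict(counts)


# Hypothetical rows: the identifiers are placeholders, not Wikidata items.
sample = [
    {"item": {"type": "uri", "value": "http://example.org/person/1"},
     "genderLabel": {"type": "literal", "value": "non-binary"}},
    {"item": {"type": "uri", "value": "http://example.org/person/2"},
     "genderLabel": {"type": "literal", "value": "trans woman"}},
    {"item": {"type": "uri", "value": "http://example.org/person/3"},
     "genderLabel": {"type": "literal", "value": "non-binary"}},
]

totals = aggregate_by_gender(sample)
print(totals)
```

Because the function returns only label totals, anything stored or displayed downstream cannot point back to an individual biographical item.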

Project results

Wikidata Gender Timeline

The Wikidata Gender Timeline displays the history of the Wikidata gender model, featuring significant events, changes in the model, and important discussions among the users. Development of the timeline is at an advanced stage. A prototype has been shown at the conferences listed below, starting in May 2023. The timeline is a dynamic, user-friendly interface designed to present the evolution of Wikidata’s ontological model in relation to gender representation.

The goal is to provide an intuitive browsing experience through the key events that have influenced the conceptualization of gender within the platform. In particular, the timeline includes a number of different events that have helped shape the discourse on gender on Wikidata. These events include the creation of the gender-related items (from those focused on the binary system to the more inclusive ones for non-binary and trans identities), the significant moments involved in creating, refining, and modifying the P21 property (sex or gender), and the user discussions that have taken place to improve, optimize, and adapt the gender model since Wikidata’s launch in 2012. Each of these events plays a key role in revealing the complex interplay between the collective decisions of the Wikidata community, evolving perspectives on gender, and broader societal changes that have influenced the knowledge base.

At present, the timeline covers most events from 2012 to 2016, i.e., the years during which the most fundamental decisions about gender representation were taken and most discussions around gender happened. Additional data about events from recent years has been collected and will be published in due course, along with a planned publication about the timeline itself.

Within the timeline, each row corresponds to a specific day. Days without events are also shown as empty rows, so that periods of intense activity and periods of substantial inactivity are both visible at a glance. Each column represents a distinct category of contribution that has shaped gender representation.
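The row-per-day layout can be illustrated with a short sketch. It is not the timeline's actual implementation: the function `build_timeline_rows`, the three category names, and the two sample events (with invented dates and titles) are assumptions made for the example.

```python
from datetime import date, timedelta


def build_timeline_rows(events, categories):
    """Return one row per calendar day between the first and last event.

    `events` is a list of (day, category, title) tuples. Days with no
    events are kept as empty rows so that quiet periods remain visible.
    Each row is (day, cells), with one list of titles per category.
    """
    if not events:
        return []
    by_day = {}
    for day, category, title in events:
        by_day.setdefault(day, {}).setdefault(category, []).append(title)
    current, last = min(by_day), max(by_day)
    rows = []
    while current <= last:
        cells = by_day.get(current, {})
        rows.append((current, [cells.get(c, []) for c in categories]))
        current += timedelta(days=1)
    return rows


# Hypothetical columns and events, for illustration only.
CATEGORIES = ["item", "property", "discussion"]
events = [
    (date(2013, 1, 1), "item", "hypothetical item created"),
    (date(2013, 1, 4), "discussion", "hypothetical talk-page thread"),
]

rows = build_timeline_rows(events, CATEGORIES)
for day, cells in rows:
    print(day.isoformat(), cells)
```

Generating the empty rows explicitly, rather than listing only days with events, is what makes gaps in activity show up in the rendered timeline.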

Wikidata Gender Dashboard

The Wikidata Gender Dashboard displays statistics about gender identities in Wikidata. The dashboard offers several visualizations allowing the user to explore the data from different perspectives. A plan of features has been drawn up and development of the dashboard has begun. The goal of this output is to understand the statistics, and the possible semantic and qualitative relationships, in the Wikidata data about gender-diverse people. We designed a complex SPARQL query to retrieve information about individuals (specifically, instances of human, wd:Q5) and various attributes related to their gender, birth, and other biographical details. The query returned a total of 4,020 people, who can be defined as the gender-diverse population recorded in the Wikidata knowledge base. The data collected supports several viewpoints and directions of analysis. The quantitative analysis sheds light on the representation of non-binary individuals in the Wikidata knowledge base, offering several key insights:

  • Underrepresentation: the number of individuals explicitly described as trans, non-binary, gender diverse, or intersex in Wikidata is significantly smaller than the corresponding prevalence of these populations in the real world. This finding suggests that there may be significant underrepresentation of these individuals in the dataset.
  • Geographical distribution: the geographic distribution of gender diverse individuals within the Wikidata dataset shows a strong bias toward the global North, particularly within the Anglosphere. This is noteworthy considering that Wikidata is a multilingual project, which is supposed to represent transnational knowledge and experiences. Several factors could explain this phenomenon, including the possibility that gender diverse people are more open about their gender identity in these regions, leading to a greater likelihood that such information will appear in Wikidata. In addition, it is conceivable that Wikidata contributors from these regions are more proactive in representing the identities of gender-diverse individuals.
  • Occupational distribution: analysis of the occupational distribution of gender-diverse individuals suggests that a significant percentage of them are engaged in the creative arts or gender rights activism. Interestingly, some occupations that are more common among cisgender men and women are significantly underrepresented among gender-diverse people. This disparity could be attributed to various factors, such as social gatekeeping that limits gender-diverse representation in fields such as politics and sports. Alternatively, gender-diverse people in these professions might be less open about their gender identity, leading to their underrepresentation in Wikidata, and possibly in the real world as well. It is also possible that Wikidata contributors who focus on specific occupational categories are more committed to representing the identities of gender-diverse individuals.
  • Temporal distribution: our analysis reveals a skewed temporal distribution, with a substantial majority of gender-diverse individuals born in the 20th century. This skewness is likely the result of historical factors, including the likelihood that fewer individuals identified as gender-diverse in the past. In addition, the attribution of gender-diverse identities to historical figures is a complex and somewhat controversial undertaking.
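The project's actual SPARQL query is not reproduced on this page. The sketch below only shows the general shape such a query might take, built as a string in Python. The function name `build_query` and the choice of optional attributes are assumptions. The two gender values are the entity IDs quoted on this page (Q1052281 and Q1097630); the project's real list of values is longer and is not reproduced here. The property IDs are standard Wikidata ones (P31 instance of, P21 sex or gender, P569 date of birth, P27 country of citizenship, P106 occupation), and the `wd:` and `wdt:` prefixes are predefined on the Wikidata Query Service.

```python
# Illustrative subset only: these two IDs are the ones quoted on this page.
GENDER_VALUES = ["Q1052281", "Q1097630"]


def build_query(qids, limit=None):
    """Build a SPARQL query for humans whose P21 value is in `qids`."""
    values = " ".join(f"wd:{qid}" for qid in qids)
    query = f"""
SELECT ?item ?gender ?birth ?country ?occupation WHERE {{
  VALUES ?gender {{ {values} }}
  ?item wdt:P31 wd:Q5 ;
        wdt:P21 ?gender .
  OPTIONAL {{ ?item wdt:P569 ?birth . }}
  OPTIONAL {{ ?item wdt:P27 ?country . }}
  OPTIONAL {{ ?item wdt:P106 ?occupation . }}
}}""".strip()
    if limit is not None:
        query += f"\nLIMIT {int(limit)}"
    return query


query = build_query(GENDER_VALUES, limit=10)
print(query)
```

The biographical attributes are wrapped in `OPTIONAL` so that a person with no recorded birth date, citizenship, or occupation is still returned. An item with several occupations or citizenships yields several rows, so the results would need to be de-duplicated by `?item` before counting people.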

Wikidata Gender Talk

The community analysis aspect of the WiGeDi project focuses on understanding how the Wikidata community has contributed to the development and evolution of the current gender ontology. By examining the processes and discussions that have shaped the model over time, we aim to gain insights into the community’s perspectives, challenges, and decision-making processes. The project has identified a set of gender-related user discussions that we call "Wikidata Gender Talk", allowing us to analyze discussions that have occurred among Wikidata users, focusing specifically on issues related to gender. To do this, we used Wikidata APIs to identify discussions by searching for 79 highly relevant keywords, and a specialised semi-automated tool to classify their relevance for our purposes.

These keywords cover a broad spectrum, embracing gender identities such as «nonbinary» and «female» as well as terms related to the concept of sex, such as «male», «female», «AMAB» (assigned male at birth), «AFAB» (assigned female at birth), along with relevant entity IDs such as Q1052281 (transgender female) and Q1097630 (intersex). The keywords also included noteworthy general terms that could relate to the gender-diverse community, such as «LGBT», «LGBTQ», «LGBTQIA+», and similar. The results produced a substantial set of 232,688 discussions. A comprehensive review was undertaken, covering a total of 2,511 Wikidata discussion pages.
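One plausible way to run such a keyword search is through the MediaWiki Action API's `list=search` module, which Wikidata exposes at `https://www.wikidata.org/w/api.php`. The sketch below is an assumption about how this could be done, not the project's tooling: the helper names, the shortened keyword list (a handful of the terms quoted above, not the full set of 79), and the choice of namespaces 1 and 5 are all illustrative. It only builds the request URL and matches keywords locally; it makes no network call.

```python
import re
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"

# Illustrative subset of the keywords quoted above, not the full list of 79.
KEYWORDS = ["nonbinary", "AMAB", "AFAB", "Q1052281", "Q1097630", "LGBT"]


def search_url(keyword, namespaces=(1, 5), limit=50):
    """Build a full-text search request for one keyword.

    The namespace numbers are an assumption about which talk
    namespaces to search and should be checked before use.
    """
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f'"{keyword}"',
        "srnamespace": "|".join(str(n) for n in namespaces),
        "srlimit": limit,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


def matched_keywords(text, keywords=KEYWORDS):
    """Return the keywords that occur in `text` as whole words."""
    found = []
    for keyword in keywords:
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(keyword)
    return found


url = search_url("AFAB")
hits = matched_keywords("Should Q1052281 be a subclass? AFAB is also mentioned.")
print(url)
print(hits)
```

Whole-word matching keeps a short keyword from matching inside a longer token, but it cannot tell whether a discussion that mentions a keyword is actually about gender.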

However, it became apparent that only a portion of this output truly concerned gender in ways relevant to the project and its linguistic analyses. For this reason, we introduced a semi-automated step to annotate each discussion (future publication via GitHub pending peer review of the associated publication). The discussions were very useful for reconstructing the timeline of gender in Wikidata (see Section #Wikidata Gender Timeline). Moreover, a topic modeling study was conducted to identify trending topics over the years. The results show a clear trend towards more extensive discussion of trans identities over the years, while the discussion of non-binary identities remains limited.
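A trend over the years can be roughly approximated without a topic model by counting, for each year, the share of discussions that mention a given group of terms. The sketch below does only that; it is a much simpler stand-in for topic modeling, not the method used in the study. The function name `yearly_share`, the term groups, and the three toy discussions are hypothetical and do not reflect project data.

```python
import re
from collections import defaultdict

# Hypothetical term groups, chosen for illustration.
TERM_GROUPS = {
    "trans": {"trans", "transgender"},
    "non-binary": {"non-binary", "nonbinary"},
}


def yearly_share(discussions, term_groups=TERM_GROUPS):
    """Return {year: {group: share of that year's discussions}}.

    `discussions` is an iterable of (year, text) pairs. A discussion
    counts for a group if any of the group's terms appears as a token.
    """
    totals = defaultdict(int)
    hits = defaultdict(lambda: defaultdict(int))
    for year, text in discussions:
        totals[year] += 1
        tokens = set(re.findall(r"[\w-]+", text.lower()))
        for group, terms in term_groups.items():
            if tokens & terms:
                hits[year][group] += 1
    return {
        year: {group: hits[year][group] / totals[year] for group in term_groups}
        for year in sorted(totals)
    }


# Toy input, invented for the example.
toy = [
    (2013, "Is trans woman a subclass?"),
    (2013, "Label translation for P21"),
    (2014, "Adding a non-binary value for P21"),
]

shares = yearly_share(toy)
print(shares)
```

Matching on tokens rather than substrings keeps "translation" from counting as a mention of "trans".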

Suggestions for improving Wikidata practices and policies

Throughout the project, we have identified several changes that could be made to improve the Wikidata model, data practices, and policies.

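The main issue stems from the ambiguity of property P21. The property is called sex or gender (in English), but it actually represents at least four data points, each discussed below: sex assigned at birth, gender identity, gender expression, and gender modality.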

Splitting the property into four different ones is unfeasible for several reasons. First of all, there are some people for whom recording the sex assigned at birth is harmful, especially when a person has undergone a gender transition and does not want this information to be listed. Changing the property to represent only gender identity would be less risky, especially if the sourcing policies are improved (see below).

Gender expression (feminine, masculine, etc.) is likely too subjective to be represented and reliably sourced. It could theoretically be possible to represent it, but since this has not been done until now and to the best of our knowledge it has not been requested, it is probably unnecessary. In any case, mixing up gender expression with gender identity, such as using P21 with values such as feminine, should be avoided.

On the other hand, gender modality could be easily separated from gender identity to represent trans people. This has previously been proposed by the Personal Pronouns project (Crystal Yragui & Arielle Rodriguez), and also by us, in multiple venues. In our view, this is a very sensible solution which would enhance the current representation, make it less vague, and at the same time reduce the othering that currently affects trans identities on Wikidata.

Another problem with P21 pertains to language: labels and descriptions of the property in different languages are not consistent and may create confusion among users. The case of Italian is particularly glaring, because the description of the property explicitly states that "for people, it must be: male (Q6581097), female (Q6581072), or intersex (Q1097630)", while in fact many more values are allowed by the existing constraints (e.g., non-binary). All labels and descriptions for such sensitive properties should be made consistent and possibly protected to prevent further mistakes.

Class taxonomy

The class taxonomy has seen much improvement compared to the beginnings of Wikidata, but there are still some glaring issues.

On Wikidata, transgender people are explicitly othered, meaning that they are represented in a different way compared to cisgender people. A cisgender woman is “female”, but a transgender person is “trans woman”. This suggests that “female” is somehow the default value and “trans woman” is the exception when it comes to women. This is not true, as trans women are women and trans men are men. The issue is also evident in the class taxonomy, which keeps male/female in an entirely separate branch of the tree. In our view, this othering of trans people is highly problematic.

The separation in two branches appears to be due to a legitimate need to separate gender identity from sexual identity, but if this is the case, the ambiguity of P21 should be resolved so that these two different kinds of identities are not mixed up when used in statements. Moreover, the current separation of the two branches stemming from “gender” and “gendered” is very difficult to understand.

For more information, see the relevant page about the model on our website, which reports the full taxonomy of both gender identities and sexual identities (to be published on 1 May 2024). Moreover, our publication "Non-binary gender representation in Wikidata" reports the taxonomy as of 2022.

Gendered properties

Another important issue is represented by gendered properties, meaning those properties that implicitly contain gender data. Such is the case of the properties "brother" and "sister", which were replaced by the property "sibling" after multiple requests and years-long deliberation.

In contrast, the properties "mother" and "father" have survived multiple attempts at replacing them with a gender-neutral property. Eventually a property called "parent" was created, but it is used only rarely, effectively creating a separate system that is applied only to people who are not a cisgender man or a cisgender woman (again, the default). As of April 2024, this property is used on just 100 Wikidata items.

Consensus, NPOV, and reliable sources

Consensus is good, until it isn’t. When it comes to marginalized communities, consensus can be harmful because it imposes a very high threshold to get much needed policy changes approved. The case of the properties "brother" and "sister" discussed above is emblematic, because it took more than three years to get from the original proposal to the final implementation. Consensus often protects the status quo.

Neutral point of view (NPOV) can also be problematic when it is wrongly intended as a sort of middle ground, e.g., between trans people who are simply asking to exist and trans-exclusionary people who discriminate against them. We are not arguing for a change to the NPOV policy that underpins Wikimedia projects, including Wikidata, but care should be taken to ensure that neutrality does not become inaction.

Moreover, when it comes to sensitive data, it should not be left up to a cisgender majority to decide how transgender people should be represented, modeled, or described. It is important that the data subject (i.e., the person who is described in Wikidata) is given the opportunity to challenge a view that is purportedly neutral or based on existing sources which for example may be reliable but completely outdated.


The other important issue is self-determination. We believe that Wikidata and other Wikimedia projects should adopt a strong self-determination policy that goes beyond current BLP guidelines. This should cover all sensitive properties about people (sex or gender, sexual orientation, ethnicity, etc.), overcoming the current practices, which are vaguely defined and differ from property to property. For example, sexual orientation data is treated significantly more restrictively than gender data, although both are equally sensitive. A review of all sensitive properties should be conducted to make sure that the applicable policies are applied equally.

Page protection

In the course of the project, we found many instances of vandalism, which affects transgender people significantly more than average. An important tool to prevent vandalism is page protection. At the moment, page protection is limited to full pages. In our view, a technical improvement to allow the protection of individual statements would be very useful to safeguard sensitive biographical data while minimising disruption for Wikidata users.

Use of bots

Our research has shown that the use of bots to add gender data is harmful, as it makes it difficult to assess the accuracy of the original data, and manual review after the fact is a daunting task. For this reason, we believe that the use of bots should be banned for all sensitive properties, and technical improvements should be made (possibly combined with page protection, see above) to make sure that users with the bot flag are not allowed to edit any such statements.

For unregistered bots without a bot flag and web-based automated tools that perform mass additions of data, per-statement or per-property protection could be a sensible solution. A possible alternative would be to keep track of mass-editing users by applying counter-bots that automatically revert any such edits to sensitive properties. However, this would require adequate policies that explicitly allow such reactive measures.

Major publications

Non-binary gender representation in Wikidata

Note: This publication was submitted before the start of the project. Part of the peer review and final publication were conducted during the project.

Metilli D. & Paolini C. (November 2023) "Non-binary gender representation in Wikidata". In: Provo A., Burlingame, K. & Watson, B.M. Ethics in Linked Data, Litwin Books [1]. Postprint available at: [2]

What does it mean to be queer in Wikidata? Practices of gender representation within a transnational online community

Abstract. The continuing digitisation and datafication that our society is undergoing are having a significant impact on our daily lives, giving rise to new possibilities but also entailing significant risks for people who are discriminated against or marginalised. Queer communities are particularly affected by these processes, therefore it is crucially relevant to research transnational digital projects that involve them. In the Wikidata Gender Diversity project (WiGeDi), we are looking at practices of gender representation in the Wikidata knowledge base, a collaborative online project managed by a worldwide community. Working from the idea that gender is a complex social construct, we investigate how the Wikidata community has approached the complex issue of modelling and populating gender data, progressing from a very narrow interpretation of gender as a binary to a representation that is more inclusive of a multiplicity of gender identities.

To be published in Communication, Culture & Critique, Special Issue on Transnational Queer Cultures and Digital Media.

What does it mean to be queer in Wikidata? Practices of gender representation within a transnational online community

Abstract. The Wikidata Gender Diversity (WiGeDi; Metilli & Paolini, in press) project aims to study gender diversity in the Wikidata knowledge base. Wikidata is a collaborative, multilingual project edited by an international community of users, and one of its goals is to represent biographical knowledge about people, including gender (Vrandečić & Krötzsch, 2014). Since Wikidata's launch in 2012, the community has been discussing the best approach to conceptualise gender within a shared ontology that would be faithfully reflective of reality, even more so considering the difficulty of delineating such a complex construct (Butler, 1999). Although there is still no complete consensus on the treatment of gender identities in Wikidata, it is interesting to reconstruct the history of the decisions made by the community in this regard, which are often influenced by historical events happening in the real world. Since the Wikidata ontology of gender has evolved significantly over time, the WiGeDi project aims to build a timeline of events that led Wikidata to evolve from a very narrow interpretation of gender as a binary, to a representation that is more inclusive towards marginalised identities such as trans and non-binary people. Being able to see this side of history under the lens of critical data studies (Iliadis & Russo, 2016) allows us to understand how the community overcame linguistic, cultural and generational barriers to reach a shared understanding of the complexity of gender.

To be published in Internet Histories, Special Issue on Gender and Internet/Web History.

Full list of publications and presentations


  • Metilli D., Melis B., Paolini C., Fioravanti M. "Can I be Queer in Wikidata? Representing Queer Identities in a Collaborative Knowledge Base". Presentation at Queering Big Data, Algorithms & AI, Manchester, UK.
  • Rodriguez A., Melis B., Paolini C., Yragui C., Metilli D., Weathington K., Fioravanti M. & Samuel J. (December 2023) "Modelling Gender on Wikidata". Panel at Data Modelling Days, online.
  • Metilli D. & Paolini C. (November 2023) "Non-binary gender representation in Wikidata" [Previously Published Work Track]. Presentation at Wikidata Workshop, co-located with International Semantic Web Conference, Athens, Greece [3]
  • Metilli D., Melis B., Fioravanti M. & Paolini C. (October 2023) "Are you modeling my gender? Results from the Wikidata Gender Diversity project". Presentation at WikidataCon 2023, online [4]
  • Metilli D., Paolini C., Melis B. & Fioravanti M. (August 2023) "How does Wikidata model our gender? Findings from the Wikidata Gender Diversity project". In: "Modelling, mapping and bridging knowledge gaps in gender and diversity". Panel at Wikimania 2023, online [5]
  • Metilli D. & Paolini C. (July 2023) "Ethics in Linked Data Book Panel". Panel at the LD4 Conference, online [6]
  • Metilli D., Melis B., Fioravanti M. & Paolini C. (July 2023) "How do you model my gender? Studying gender representation in the Wikidata knowledge base". Presentation at the LD4 Conference, online [7]
  • Paolini C., Metilli D., Melis B. & Fioravanti M. (July 2023) "Who decides my gender? A corpus-based analysis of Wikidata’s community discussions around trans and non-binary identities". Presentation at the Biennial Conference of the International Gender and Language Association, Brisbane, Australia [8]
  • Metilli D., Melis B., Fioravanti M. & Paolini C. (June 2023) "Early results from the Wikidata Gender Diversity project" (provisional title). Presentation at the Wikimedia Research Showcase, online.
  • Metilli D., Melis B., Fioravanti M. & Paolini C. (June 2023) "Can you model my gender? How the Wikidata community developed a shared ontology of gender". Presentation at the Data Justice Conference, Cardiff, United Kingdom [9]
  • Paolini C., Metilli D., Melis B. & Fioravanti M. (June 2023) "Are you discussing my gender? A corpus-based analysis of Wikidata’s community discussions around non-binary identities". Presentation at the Challenging the Binary Conference, London, United Kingdom [10]
  • Metilli D., Melis B., Paolini C. & Fioravanti M. (June 2023) "Who cares about my gender? Analysing practices of data care and repair in Wikidata". Presentation at the Nordic Science and Technology Studies Conference, Oslo, Norway [11]
  • Metilli D., Melis B., Fioravanti M. & Paolini C. (May 2023) "Queering Wikidata: Early insights from the Wikidata Gender Diversity Project". Presentation at the Queering Wikipedia Conference, online.
  • Metilli D., Melis B., Paolini C. & Fioravanti M. (May 2023) "How does Wikidata shape gender identities? Initial findings and developments from the WiGeDi project". Presentation at Wiki Workshop 2023, online [12]
  • Metilli D. & Paolini C. (October 2021) "Non-binary gender identities in Wikidata". Presentation at WikidataCon 2021, online.


  • "Do you care about my gender? The long and winding road to data justice in Wikidata’s representation of gender". Accepted to Digital Humanities Conference 2024, Washington/Arlington, USA.

Under review

  • "Can I be queer in Wikidata? Practices of queer representation in a collaborative knowledge base". Submitted to AoIR 2024, University of Sheffield, UK.
  • Melis B., Paolini C., Fioravanti M. & Metilli D. (expected 2024). "What does it mean to be queer in Wikidata? Practices of gender representation within a transnational online community". Paper proposal accepted for publication in Communication, Culture & Critique, Special Issue on Transnational Queer Cultures and Digital Media [13]
  • Melis B., Fioravanti M., Paolini C. & Metilli D. (expected 2024). "How have you modelled my gender? Reconstructing the history of gender in Wikidata". Paper proposal accepted for publication in Internet Histories, Special Issue on Gender and Internet/Web History [14] [dead link]

See also

External links