Visual Analytics for Sustainability and Climate Change/Tool/Co-design activities

From Meta, a Wikimedia project coordination wiki

Participatory design

Second iteration

The second iteration started in January 2026. Based on the feedback collected through this page, the online survey, and a set of expert interviews, we selected the following features:

  • 1.1: Filter by quality assessment
  • 1.2: Read the terms occurring frequently within an article
  • 1.5: Compare different linguistic versions of all the articles
  • 1.6: Compare the articles at date A and date B
  • 1.7: See the articles' list and timelines

1. Articles' overview

This page gives users an overview of the articles. It includes filters by quality assessment and protection type.

Article page of the visualization tool developed under the framework of the Visual Analytics for Sustainability and Climate Change project

Comments and suggestions

...

2. Language version comparison

This page lets users compare an article across multiple language versions. It includes a button to translate missing articles.

Page for linguistic version comparison of the visualization tool developed under the framework of the Visual Analytics for Sustainability and Climate Change project

Comments and suggestions

...

3. Time travel

This page lets users see the evolution of an article over a selected period of time.

Page for comparison of articles over time of the visualization tool developed under the framework of the Visual Analytics for Sustainability and Climate Change project

Comments and suggestions

...

First iteration

The first iteration ran from September to December 2025.

General comments

  • Accessibility of the displays is poor. In particular, the colors are too similar.
  • Users need to be able to take screenshots of the displays (for example, to share them easily in reports). This means some attention should be paid to how credits are displayed.
  • The only variable displayed on the Y axis is page views. This looks at the data from a skewed perspective, and not one that would necessarily work in our favor in the future given page view trends. The Y axis should be able to display something other than page views. Most editors actually do not care at all about the page views of an article. Mary Mark Ockerbloom pointed out the importance of effective visualization and the shortcomings of current metrics like page views. She emphasized the need for alternative measurement methods.
    • During the French meeting, it was very strongly advised to offer the current X axis parameters on both the X and the Y axis. For example, the Y axis could display « number of images » while the X axis displays « size of the article ».
  • Mary Mark Ockerbloom acknowledged the significance of references and the potential insights they can provide when properly designed. What about adding « number of references » to the elements that could be displayed? Is that technically feasible?
    • Participants at the French meeting seconded that suggestion. They would propose « number of references » as well as the ratio « number of references / size of the article ».
  • Participants at the French meeting suggested adding more « display parameters ». It is unclear whether this information is technically available. For example:
    • They suggested « incoming links » and « outgoing links » as display parameters.
    • Another suggestion is simply « number of editors » (not limited to the ratio « number of editors / size of article »).
    • Another suggestion is « number of editors watching the article » (is that information available?).
  • A lot of attention was given to the potential for tools to annotate key events that could influence editing activity, that is, key moments affecting the predicted quality estimates. A participant was seeking clarification on this matter.
  • A participant suggested that these tools should not only visualize data but also enable users to take action, such as starting translations directly. The title of the article should therefore be clickable.
  • What is the meaning of « token » in the impact tool? What is it about?
  • During the French meeting, the topic of « hot articles » was raised several times, echoing the feedback given during the English meetup. The latter was interested in ways to evaluate whether an article has long been abandoned (for example, an article with fewer than xx bytes added over a certain period of time, excluding immediate reversions that indicate vandalism, would show that no editing activity is happening on the article). This would make it possible to identify articles potentially in need of revision. The first group rather saw the opportunity to visualise hot topics, with lots of edit sessions on an article over a short, recent time frame.
  • The space dedicated to « protection » at the top of the design is overkill (too much space for something which is not so important). In comparison, everyone keeps asking « so what is displayed in the middle, the bubble and the circles? ». In short, the legend for what is inside the graphic is not sufficiently visible.
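
The request to offer the same parameters on both axes, plus ratios such as « number of references / size of the article », could be sketched as a small metric registry. This is only an illustration: `ArticleStats`, `METRICS`, `plot_points` and the sample figures are hypothetical names and data, not part of the tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ArticleStats:
    # Hypothetical per-article record; field names are assumptions.
    title: str
    size_bytes: int
    images: int
    references: int
    editors: int
    page_views: int


# One registry shared by both axes, so any metric can go on X or on Y.
METRICS: Dict[str, Callable[[ArticleStats], float]] = {
    "size of the article": lambda a: a.size_bytes,
    "number of images": lambda a: a.images,
    "number of references": lambda a: a.references,
    "number of editors": lambda a: a.editors,
    "page views": lambda a: a.page_views,
    # Derived ratio: references per 1000 bytes of article text.
    "references per kB": lambda a: (
        a.references / (a.size_bytes / 1000) if a.size_bytes else 0.0
    ),
}


def plot_points(
    articles: List[ArticleStats], x_metric: str, y_metric: str
) -> List[Tuple[str, float, float]]:
    """Return (title, x, y) for every article, using the chosen metrics."""
    fx, fy = METRICS[x_metric], METRICS[y_metric]
    return [(a.title, fx(a), fy(a)) for a in articles]


# Made-up sample data, for illustration only.
articles = [
    ArticleStats("Article A", 40_000, 12, 80, 150, 9_000),
    ArticleStats("Article B", 8_000, 2, 4, 10, 300),
]
points = plot_points(articles, "size of the article", "number of images")
```

With a shared registry, adding a new display parameter (for example incoming links) would only require one new entry, provided the underlying data is available.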

Sketch n. 1.1: Filter by quality assessment

A filter allows users to get all the articles with a specific quality assessment.

Updates

  • 06.02.2026: We reviewed the top and bottom bars. The top bar is now meant as a filter panel, while the bottom bar is meant as a panel for setting the vertical and horizontal axes.
  • 10.11.2025: We added the possibility to change the vertical arrangement (not only by number of page views but also by number of editors, or any other relevant parameter). The filter on the vertical axis is now close to the setting of the vertical arrangement.
Filter by quality assessment

This sketch is based on the following proposed features:

  • Display "quality assessment", by Florence Devouard, Iolanda Pensa, Matt Vetter

Comments and suggestions

  • Iolanda Pensa: I think the alerts / issues about the articles need to be made more visible. Also, the quality articles or high-ranking ones should be highlighted. I think we can add more voices to the visualisation: some screams...
  • Sage Ross: Easiest to implement (data is already available) and obviously useful, necessary for later features.
  • For "sort by title", it is important to add information on the X axis to make it clear, like A, B, C...
  • Florence: I have not been able to figure out how to actually see the difference in quality among articles. Am I missing something? I see the circles for the size of the text, of the introduction, of the talk page. I also see the little icon indicating that an article is locked... but where do I see the quality? Wikipedians use content assessment scales. There are two different ways to see the difference: either the color scheme or the two-letter scheme. For example, FM, FL and FA are purple-blue articles, whilst quality C is bright yellow. For the system to be easily adopted by Wikipedians, you need to keep the same letter and color system. This means the menu should offer checkboxes to select whether you want to display, for example, FA, A and GA articles only. Or you could check boxes to see all articles except stubs. Depending on what you checked, the articles that appear should mention the letter (such as FA) and use the color (such as purple blue). I do not think this is what is happening at the moment. I do not see letters. I do not see a logical color scheme, and I cannot see what the selection process allows me to select.
  • Florence: from a purely user-oriented perspective, I would suggest that the menu people would use the most to display what they want displayed (in this case quality) should be on the right, not on the left.
  • Florence: I am not sure I understand the lock well. First, because we have different levels of locks, we need several types of visuals to display locks. See the protection policy. We may not want to have all protection levels, but at least the usual full protection, semi-protection, extended confirmed protection, etc. Just one type of lock is not really helpful. Also, purely from a user-oriented perspective, I believe a dropdown menu saying « Protection: ALL » would rather mean « all protected articles », not « all articles ». I may not really understand the point of the protection dropdown menu though.
    • Giovanni has updated the sketch. It is way clearer. Gio also noted that, from the Wikipedia API, we know that the articles under investigation in the project can only have the move and/or edit restriction. Anthere (talk) 09:18, 18 October 2025 (UTC)
  • About the parameters, the terms used may need clarification. For example, « title » could be « title alphabetical order ».
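
The checkbox behaviour Florence describes (show only FA, A and GA, or show everything except stubs) could look like the sketch below. The class list, the `filter_by_quality` helper and the sample records are assumptions made for illustration; the actual set of classes depends on the assessment scale used by each wiki.

```python
from typing import Dict, Iterable, List, Optional

# Assumed subset of a content assessment scale, from highest to lowest.
QUALITY_CLASSES = ["FA", "A", "GA", "B", "C", "Start", "Stub"]


def filter_by_quality(
    articles: List[Dict[str, str]],
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> List[Dict[str, str]]:
    """Keep articles whose class is checked.

    include: classes ticked by the user (None means "all classes").
    exclude: classes explicitly unticked (for "all except stubs").
    """
    allowed = set(include) if include is not None else set(QUALITY_CLASSES)
    allowed -= set(exclude or [])
    return [a for a in articles if a["quality"] in allowed]


# Made-up records, for illustration only.
sample = [
    {"title": "Article A", "quality": "FA"},
    {"title": "Article B", "quality": "C"},
    {"title": "Article C", "quality": "Stub"},
]

top_rated = filter_by_quality(sample, include=["FA", "A", "GA"])
no_stubs = filter_by_quality(sample, exclude=["Stub"])
```

Keeping the class letter on each record means the display can label every bubble with it (such as FA) and map it to the color the community already uses.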

Sketch n. 1.2: Read the terms occurring frequently within an article

By clicking on the article bubble, and then on the "Show details" button in the content data section of the sidebar, the user gets the list of the terms occurring most frequently within the selected article. A dropdown menu allows the user to see the recurring terms consisting of one, two or three words.
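
A minimal sketch of how one-, two- and three-word recurring terms might be counted, assuming plain article text as input. The `frequent_terms` helper and the tiny stop-word set are hypothetical; a real list would be language-specific and much longer. Dropping every term that contains a stop word is a design choice of this sketch, not a confirmed behaviour of the tool.

```python
import re
from collections import Counter
from typing import List, Set, Tuple

# Placeholder stop-word list (assumption), English only.
STOP_WORDS: Set[str] = {"the", "of", "and", "a", "in", "to", "is"}


def frequent_terms(
    text: str, n: int = 1, top: int = 10, stop_words: Set[str] = STOP_WORDS
) -> List[Tuple[str, int]]:
    """Return the `top` most frequent terms made of `n` adjacent words."""
    # Lowercase and keep runs of letters only (Unicode-aware).
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Sliding window of n adjacent words.
    grams = zip(*(words[i:] for i in range(n)))
    counts = Counter(
        " ".join(g) for g in grams if not any(w in stop_words for w in g)
    )
    return counts.most_common(top)


text = (
    "Climate change is a change in the climate. "
    "Climate change affects sea level."
)
single = frequent_terms(text, n=1, top=2)
pairs = frequent_terms(text, n=2, top=1)
```

Note that this simple window ignores sentence boundaries, so a term can span the end of one sentence and the start of the next.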

Updates

  • 10.11.2025: Added the peacock term tab, and the visualization of incoming and outgoing links.
1-2 Visualizing-climate - Terms occurring frequently

This sketch is based on the following proposed features:

  • Display Significant co-occurring terms, by Gitanjali Yadav

Comments and suggestions

  • Florence: I am not fully sure what the feature is displaying. From a visual perspective, it looks like we can only select one article. Then its data gets displayed in the middle. But then... what does it display on the right? I do not understand at all what is being displayed on the right. Is it one word we type in, such as « gender »? But if this is one word in one article... what does it mean to see whether this one word is displayed more frequently? In one article, it would only tell me that « gender » is displayed 22 times. What is the frequency about? Or is it the tool choosing the « 1 word » to analyse in the text? How does it choose the « 1 word »? Does it analyse ALL words in an article? That could be a very long list... And what is « two words » then? Attached? Hyphenated? One next to another? In a long article, this is a lot of combinations... What about a word cloud system? And could the words be downloaded for further analysis?
    • Gio updated the sketch to clarify the output. The questions above are resolved. The list of words does not include stop words. Anthere (talk)
  • Florence: on the middle screen (2), I see « how to improve the article ». OK, what is supposed to be found here? Where does it come from?

Sketch n. 1.3: Read terms occurring frequently within the whole set of articles

By clicking on the "Content analysis" button, the user can access the "Recurring terms" section. This section consists of a list of the terms occurring most frequently over the whole corpus of articles.

Updates

  • 10.11.2025: We added the tab containing the peacock terms.
  • 20.10.2025: We removed the "Curated clusterization" section, namely a page showing the articles according to a clusterization made by the individual user, and we added this feature to sketch 1.8.
1-3 Visualizing-climate - Content analysis

This sketch is based on the following proposed features:

  • Display Significant co-occurring terms, by Gitanjali Yadav
  • See total number of views, by Iolanda Pensa

Comments and suggestions

  • Florence: 2A is pretty clear to me. Looks good. But I do not understand what 2B is representing. It looks like 2A could work exactly the same way for « all instances », « one cluster » or « a selection of clusters » (those would ideally be available as checkboxes rather than a single choice). BUT that raises the question of how « clusters » will be determined in the tool. Is it pre-upload? Post-upload? How are clusters defined? I think that as long as clusters are defined, we can display 2A with or without clusters. But the key issue right now is how we select clusters based on a set of articles. Maybe a sketch is missing here.
  • I really like this idea -- but it's going to be important to have a disallow list for words that are common in a language, or to remove the 1000 most common words across languages. Sadads (talk) 16:34, 24 October 2025 (UTC)
    • This could be useful for identifying peacock language or AI-generated content that uses unusual language. The other signal that could be useful in word frequency tracking: seeing what percentage of the occurrences of a word or term is linked or not (could be used to identify overlinking, underlinking or missing "common knowledge" terms that don't yet exist on the wiki). Sadads (talk) 16:40, 24 October 2025 (UTC)
  • The French team noted that this feature could increase the risk of edit wars (teams wanting to replace certain terms with others en masse across a set of articles).

Sketch n. 1.4: Compare different linguistic versions of the same article

By clicking on an article bubble, and then on the "Compare language versions" button, the user gets a comparison of the main article metrics across the different linguistic versions of the article. The user is also provided with links to the article in other languages.

Updates

  • 12.11.2025: We added a button to create a new language version.
1-4 Visualizing-climate - language version comparison

This sketch is based on the following proposed features:

  • Display articles from several languages, by Florence Devouard
  • Sage Ross: Potentially feasible to implement by pulling info about other languages for a particular article on demand from the frontend

Comments and suggestions

  • Florian Meier: I really like the idea and the sketch. One problem I see is developing a metric that balances/combines the different aspects (bytes, number of images, sources, etc.), because bytes alone don't do it (different languages have different byte counts, e.g. Chinese vs. Arabic vs. Latin script).
  • Florence: I rather like this sketch. We would have to define what the metrics are to draw the comparison (I would suggest... article size, number of images, protection level, article assessment, etc.). My main regret is the lack of a quick-view comparison between several articles, which would be very cool.
  • ...
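
Florian Meier's point about bytes can be illustrated with a short example, assuming the text is encoded in UTF-8: the same word takes a different number of bytes per character depending on the script. The `utf8_profile` helper is a hypothetical name used only for this illustration.

```python
from typing import Tuple


def utf8_profile(text: str) -> Tuple[int, int]:
    """Return (number of characters, number of UTF-8 bytes)."""
    return len(text), len(text.encode("utf-8"))


# The word "climate" in three scripts.
samples = {
    "English": "climate",
    "Arabic": "مناخ",
    "Chinese": "气候",
}

for language, word in samples.items():
    chars, size = utf8_profile(word)
    print(f"{language}: {chars} characters, {size} bytes")
```

Latin letters take one byte each here, Arabic letters two and Chinese characters three, so a raw byte count is not directly comparable across language versions. A character count, or a metric normalised per language, may be a fairer basis for comparison.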

Sketch n. 1.5: Compare different linguistic versions of all the articles

By clicking on the "Language comparison" button, the user gets a visualization showing the different linguistic versions of the same articles.

Updates

  • 12.11.2025: We swapped the articles list and the language versions. Now the articles are displayed as rows and language versions as columns.
  • 17.10.2025: The user can move among the articles by clicking on the pagination buttons.
1-5-1 Visualizing-climate - linguistic versions comparison

This sketch is based on the following proposed features:

  • Display articles from several languages, by Florence Devouard

Comments and suggestions

  • Florence: I rather like this sketch. I noted though that there is no information about what is displayed on the visual. Is it still the size of the article by default? If that is the case, this is insufficient. We need to be able to toggle between different types of information to display (to be defined). From a user perspective, we probably want to display languages horizontally and articles vertically. We would have to put a limit on the maximum number of languages displayed though.
  • Florence: probably easier than the previous one indeed (she refers to the version with the pagination)
  • So one of the biggest gaps right now for the movement is the spontaneous generation of language tables that make it easy for folks to sort articles, and then help people select from that list of articles into the Translation tool (i.e. per https://www.mediawiki.org/wiki/Translation_suggestions:_Topic-based_%26_Community-defined_lists), Sadads (talk) 16:18, 24 October 2025 (UTC)
    The CEE template can be found here: https://meta.wikimedia.org/wiki/Module:WikimediaCEETable -- and an example of it can be seen in use on these pages: https://meta.wikimedia.org/wiki/Wikimedia_CEE_Spring_2025/Structure/Aromanian, Sadads (talk) 16:24, 24 October 2025 (UTC)
  • It would be useful to be able to compare the section headers between different language articles (there is a model by WMF research for this). Sadads (talk) 16:41, 24 October 2025 (UTC)
  • Mentioned during a workshop: to make the output immediately actionable, provide the links to access it (make the article titles clickable).
  • Navigation: we should have a breadcrumb bar at the top to be able to see immediately where we stand in the tool and to go back to the general view.
  • Current display issues
    • This will become unreadable when we have more than 4 languages. The languages should be horizontal (because even when all languages are available, I suppose editors will have the choice to select a number of languages they can work on) and the articles vertical. In addition, the community's practice at this point is to have languages horizontal and articles vertical, which would seem more logical. Navigation between sets of articles could be at the bottom of the table (display 10 articles at a time, for example, and provide a link to see the next 10).
    • It is unclear why the name of the article is displayed at the bottom and only in Spanish. The title of the article should be very visibly available.
    • The title of the article displayed in the main table should be clickable and open a new window with the Wikipedia article in the chosen language to facilitate immediate action.
    • The display is not very logical and raises questions. The click on the 2nd visual is done on the Italian line, but the content displayed on the right is about the English version. It should be the Italian version to avoid confusion. Just to make sure we are aligned... this means the content displayed in that box must relate to the chosen linguistic version (so if we click the Italian button, we need to see the info about the Italian article).
    • It is unclear at this point what is displayed in the middle. We recognize the quality assessment. But what does the size of the bubble represent? I think it could represent several different things (such as the size of the content, the page views, the number of editors, etc.), but at this point we do not know what is being displayed at all.
    • The « sort by » control being at the top right is kinda illogical. If it is about what is being displayed on the bubble, it is not « sort by » but « display ». And then the display is not « title »; it could be « size of article » or « number of editors ».
  • A French team member shared this tool (it stopped working some months ago): https://dicare.toolforge.org/lexemes/challenge.php. It was meant to fix lexemes, not Wikipedia articles.



Sketch n. 1.6: Compare the articles at date A and date B

By clicking on the "Time travel" button, the user can set the start and end dates and see a visualization of the changes that occurred during that timespan.

1-6 Visualizing-climate - article time travel

This sketch is based on the following proposed features:

  • See the protection history of the article, by Florence Devouard
  • Comparing the beginning to the end (date A to date B), by Florence Devouard

Comments and suggestions

  • Iolanda Pensa: I think that, to see the differences, the exercise we did with Wikipedia Primary School was more efficient: https://commons.wikimedia.org/wiki/File:Wikipedia_Primary_School_20150821_articles_network.jpg (it was based on the number of internal links). Maybe we can look for data which make the visualisation simpler (e.g. number of linguistic editions, number of items on Wikidata...)
  • Iolanda Pensa: I think it needs a visualisation more similar to a tree which grows
  • Florian Meier: could have a slider that lets you see when articles make a "jump", i.e. when significant improvements have been made... maybe not for the complete collection but just for selected articles
  • Florence: I do not think it answers the need. First, the one thing it gives the most visibility to is the Y axis, which represents page views. I do not doubt that this might interest some people, but I doubt it would be the primary element of interest to most users. There is also a chance that this figure does not change much, so it is unlikely to be useful. What I would like to see is the Y axis change depending on criteria I choose, for example the size of the article. In this case, I would like to see not only the beginning and the final size, but also the evolution over time. So as Florian indicated, we may need something more like a slider. A good display may indeed be more achievable with a selection of articles. Perhaps some criteria might display all articles whilst other criteria would display a selection. It is hard to be very specific here as we do not know which criteria might be displayed. If I go back to one of the original requests here... seeing the protection history of articles... this proposal would not serve that.
  • It would be useful to be able to filter out articles that haven't had major edits from the set -- so that you only see articles edited by more than x number of bytes, or by certain users -- this would make it easier to view the impact of recent, Sadads (talk) 17:03, 24 October 2025 (UTC)
    • Clarification: he means, for example, « no major edits for 6 months ». And "major" could be a figure between 500 and 2000 bytes (excluding cases where the big edit is reverted within 24 hours, which usually signals « vandalism »).
  • Second meeting participants did not find this very convincing. They liked the idea of exploring history, but not the proposed design.
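
The « no major edits for 6 months » filter from the clarification above could be sketched as follows. The `Edit` record, the helper names and the default thresholds (500 bytes, 24 hours, 182 days as an approximation of six months) are assumptions chosen for illustration; real revision data would need to be mapped onto this shape.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Edit:
    # Hypothetical revision record; field names are assumptions.
    timestamp: datetime
    bytes_changed: int                      # signed size difference
    reverted_at: Optional[datetime] = None  # None if never reverted


def is_major(
    edit: Edit,
    min_bytes: int = 500,
    revert_window: timedelta = timedelta(hours=24),
) -> bool:
    """A large edit counts as major unless it was reverted quickly."""
    if abs(edit.bytes_changed) < min_bytes:
        return False
    if (
        edit.reverted_at is not None
        and edit.reverted_at - edit.timestamp <= revert_window
    ):
        return False  # likely vandalism that was cleaned up
    return True


def is_stale(
    edits: List[Edit],
    now: datetime,
    quiet_period: timedelta = timedelta(days=182),
) -> bool:
    """True if no major edit happened within the quiet period."""
    cutoff = now - quiet_period
    return not any(e.timestamp >= cutoff and is_major(e) for e in edits)


now = datetime(2025, 10, 24)
active = [Edit(datetime(2025, 9, 1), 1500)]
vandalised = [
    Edit(datetime(2025, 9, 1, 10), 3000, reverted_at=datetime(2025, 9, 1, 12)),
    Edit(datetime(2025, 9, 2), 40),
]
old = [Edit(datetime(2024, 1, 1), 2500)]
```

Here `is_stale(active, now)` is false, while the vandalised and old histories are both reported as stale. Lowering `quiet_period` and counting edits instead of testing for their absence would give a rough « hot articles » signal with the same data.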

Sketch n. 1.7: See the articles' list and timelines

By clicking on the sidebar button, the user has access to all the articles as a list and as timelines.

1-7 Visualizing-climate - list and timeline

This sketch is based on the following proposed features:

  • See the protection history of the article, by Florence Devouard
  • See timelines with multiple variables (number of edits, page views, main events related to the topic), by Chiara Somajni

Comments and suggestions

  • Nicole Schwitter: I really like the timelines and it's showing important info. Are they all ending at the present or at the last time there was an edit? Can the date of article creation be easily displayed? Does the same height of the bar mean the maximum size of the article, or is it comparable across articles?
  • Chiara Somajni: timelines could be linked to events (e.g., interest spikes following an event, or decreases after the popular adoption of AI tools), or used to explore potential relationships, as in the case of the war in Ukraine and interest in climate change
  • Florence: this visual is more convincing than the previous one. I think it could be cool. The main elements to report are: 1) we need information about the scale of the timeline display (when does it start, when does it finish); 2) the timeline should be able to display different elements, such as « size of article » or « size of discussion pages », so we would need a system to choose what to display; 3) it would possibly need a slightly different visual depending on which information is displayed. For example, if we were displaying the protection level... a different color of the curve to show the moment it was semi-protected or fully protected
  • Second meeting team member thought the elements displayed in the timeline could be most of the ones currently proposed for the X axis (number of edits, size of article, size of intro etc. or references)

Sketch n. 1.8: Read terms occurring frequently within a cluster of articles

By clicking on the "Curated clusterization" button, the user can see, one at a time, the articles belonging to their own custom lists of articles.

Updates

  • 20.10.2025: The whole sketch was added. It is based on a previous version of sketch 1.3 - Read terms occurring frequently within the whole set of articles
Visualize clusters of articles

This sketch is based on the following proposed features:

  • See clusters of articles, by Evelin Heidel


Comments and suggestions

  • Thanks for dissociating this sketch from the above. It obviously still raises the question of how « clusters » are technically defined by the user. As a reminder, the list of articles may come from different sources (such as the activity dashboard or a page pile). The clusters are subcategories into which the articles of the full list are further grouped. In this case, the list is made of articles about climate and climate change, and a subcategory (cluster) would be meteorology. We do not know yet how clusters would be defined in practice. But assuming they are, the visual looks good to me. A warning: this perhaps suggests that the articles were created by the user. Nope... the list and the clusters were created by the user, but not necessarily the articles. Anthere (talk) 09:55, 21 October 2025 (UTC)
  • It would be really useful to have ad hoc strategies for clustering that don't use the existing data model on the wikis (i.e. an LLM or something clustering by the words/content that is present). Sadads (talk) 17:04, 24 October 2025 (UTC)
  • ...
Comparative analysis

Is it possible to have a before / after visualisation similar to the work we implemented before? iopensa (talk) 09:36, 14 November 2025 (UTC)



Please, rank the three most relevant features

Please, add your name, surname, and target group [among volunteer, institution, researcher, journalist, teacher, other (please, specify)]


Cristian Scapozza, researcher

  1. Sketch n. 1.1: Filter by quality assessment
  2. Sketch n. 1.4: Compare different linguistic versions of the same article
  3. Sketch n. 1.6: Compare the articles at date A and date B


Iolanda Pensa, researcher / Wikipedian

  1. Sketch n. 1.1: Filter by quality assessment
  2. Sketch n. 1.4: Compare different linguistic versions of the same article
  3. Sketch n. 1.6: Compare the articles at date A and date B


Florian Meier, researcher

  1. Sketch 1.4
  2. Sketch 1.6
  3. Sketch 1.2

Comments:

  • General: I thought it would be really cool to see the impact of specific sources ... i.e. where was IPCC cited and how often


Sage Ross, Wikimedian

  1. Sketch 1.1
  2. Sketch 1.4
  3. Sketch 1.6


Evelin, Wikimedian

  1. Sketch n. 1.1: Filter by quality assessment
  2. Sketch n. 1.3: Read terms occurring frequently within the whole set of articles and curated clusterization
  3. Sketch n. 1.4: Compare different linguistic versions of the same article


Nicole Schwitter, researcher

  1. Sketch n. 1.4
  2. Sketch n. 1.6
  3. Sketch n. 1.2


Chiara Somajni, journalist

  • interested in understanding trends, less so individual items
  • the following features would be useful to me, but ranking would be variable depending on the occasion and on some relevant details that are currently missing (e.g. metrics to compare Wikipedias)


Recap (last update 2.10.2025)

Sketch | 1st | 2nd | 3rd | Count
1.1: Filter by quality assessment | 4 | - | - | 4
1.2: Read the terms occurring frequently within an article | - | - | 1 | 1
1.3: Read terms occurring frequently within the whole set of articles and curated clusterization | - | 1 | 1 | 2
1.4: Compare different linguistic versions of the same article | 2 | 3 | 1 | 6
1.5: Compare different linguistic versions of all the articles | - | - | - | -
1.6: Compare the articles at date A and date B | - | 2 | 3 | 5
1.7: See the articles' list and timelines | - | - | - | -
1.8: Read terms occurring frequently within the whole set of articles (new) | - | - | - | -