Grants:Programs/Wikimedia Alliances Fund/Rapid Fund/Enhancing wiki-ecosystems to support Wiki-editors and Researchers. (ID: 21982228)
This is an automatically generated Meta-Wiki page. The page was copied from Fluxx, the grantmaking web service of Wikimedia Foundation where the user has submitted their application. Please do not make any changes to this page because all changes will be removed after the next update. Use the discussion page for your feedback. The page was created by CR-FluxxBot.
- Please provide your main Wikimedia Username.
- Please provide the Usernames of people related to this proposal.
Lane Rasberry (username: bluerasberry) Daniel Mietchen (username: Daniel Mietchen)
- Are you a member of any Wikimedia affiliate or group, including informal groups like Wiki Fan Clubs, emerging language communities, not recognized Wikimedia groups etc.? Please list them all.
Currently in an informal collaboration with researchers developing https://scholia.toolforge.org/ and with librarians from the GLAMs at the University of São Paulo.
- M. Please state the title of your proposal. This will also be the Meta-Wiki page title.
Enhancing wiki-ecosystems to support Wiki-editors and Researchers.
- Q. Indicate if it is a local, international, or regional proposal and if it involves several countries? (optional)
- Q2. If you have answered regional or international, please write the country names and any other information that is useful for understanding your proposal.
Wiki-ecosystems that cluster information about specific quality aspects of scientific articles are planned to be further improved at this stage. These platforms are offered free of charge worldwide and can be used for checking references used in Wikipedia articles (e.g. by wiki-editors creating or updating Wikipedia articles) and in scientific papers (e.g. by researchers writing papers). In other words, this project has no geographic boundaries.
- R. If you would like, please share any websites or social media accounts that your group or organization has.
- 1. What is the change that you are trying to bring about and why is this important.
Planned change and its relevance
Wikipedia articles built upon science would benefit greatly from scientific publications that are better checked. Similarly, science would benefit greatly from using information about the quality aspects of publications during the peer-review stage of scientific articles.
We are proposing to improve the capacity of the following wiki-ecosystems:
We are also proposing to make careful use of Wikidata to add relevant information from these platforms to publication items (e.g. https://www.wikidata.org/wiki/Q58493233), so that the Scholia project can make good use of this information.
These wiki-ecosystems act as individual but complementary databases for checking quality aspects of scientific publications. A brief description of each one's purpose follows:
WikiRetracted, a database displaying articles that have been retracted (i.e. retractions); WikiCitingRetracted, a database displaying articles that have cited retractions and so may be compromised to an unclear degree; WikiComments, a database displaying full peer-reviewed comments on other articles (full comments can be very important but are often lost in search engines); WikiErratum (or WikiCorrigendum), a database displaying minor corrections to theses and papers. This database can also receive corrections to citations within articles.
What are the main challenges or problems you are trying to solve?
This information can benefit both wiki-editors and researchers in a very transparent manner via wiki-systems.
How has the importance of this change been assessed?
WMF has often cared about improving the quality of Wikipedia's articles, and articles based on science may represent 10-20% of Wikipedia's content (https://en.wikipedia.org/wiki/Science_information_on_Wikipedia).
- 2. Describe your main approaches or strategies to achieve these changes and why you think they will be effective.
The technology is currently available (MediaWiki code) and has been successfully tested in a number of environments (e.g. Wikipedia). However, wiki-ecosystems demand specific infrastructure, such as servers whose CPUs and RAM meet the requirements of the code and databases. Currently, the four aforementioned wiki-ecosystems all run on the same single, limited server, which makes query results very slow. The proposed approach is very simple: it requires purchasing modest but effective Virtual Private Servers to speed up each wiki-ecosystem. That means improving the internet infrastructure to a level between shared and dedicated hosting, which seems reasonable. This infrastructure change is effective with respect to components intrinsic to Agile software methods (https://en.wikipedia.org/wiki/Agile_software_development) and Scrum (https://en.wikipedia.org/wiki/Scrum_(software_development)) that depend on fast query results. It is worth mentioning that we are observing and learning from other projects, such as Wikidata, that are improving their speed (https://addshore.com/2020/10/faster-munging-for-the-wikidata-query-service-using-hadoop/). These four wiki-ecosystems, however, are at a very early stage and thus require minimal but meaningful support.
- 3. What are the activities you will be developing and delivering as part of these approaches or strategies?
The current code from wikicomments.org, wikiretracted.org, wikicitingretracted.org and wikierratum.org will be transferred to Virtual Private Servers so that the databases can be expanded and new code can be implemented.
- 4. Are your activities part of a Wikimedia movement campaign or event? If so, please select all the relevant campaigns from the list below. If "other", please state which.
- 5. Do you have the team that is needed to implement this proposal?
Fernando Andutta (andutta.com) is a postdoctoral researcher in the Department of Computational Linguistics at the University of São Paulo (FFLCH) and has been carrying out this work, as demonstrated by the four websites that are up and running (wikiretracted.org, wikicitingretracted.org, wikicomments.org and wikierratum.org). What is needed now are better server resources. Lane Rasberry (https://datascience.virginia.edu/people/lane-rasberry) is a Researcher in Data Science and a Wikimedian in Residence at the University of Virginia. Lane is following the project and providing advice on the expected deliverables and reports to be handed to WMF. Daniel Mietchen (https://meta.wikimedia.org/wiki/User:Daniel_Mietchen) is a biophysicist, open science advocate and Wikimedian with multiple affiliations. Like Lane, Daniel is following the project and providing insights on the expected deliverables and reports to be handed to WMF.
- 6. Please state if your proposal aims to work to bridge any of the identified CONTENT knowledge gaps (Knowledge Inequity)? Select up to THREE that most apply to your work.
Geography, Language, Other global topics for impact (topics considered to be of global importance)
- 6.1 In a few sentences, explain how your work is specifically addressing this content gap (or Knowledge inequity) to ensure a greater representation of knowledge.
The few existing platforms concerned with quality aspects of science and of information in general are often scattered and difficult to find on the Web. It is worth noting that a good fraction of Wikipedia's content relies on scientific information (https://en.wikipedia.org/wiki/Science_information_on_Wikipedia). This work can advance wiki-sci-bridges for quality-checking information, which benefits both the wiki-community and the scientific community.
- 7. Please state if your proposal includes any of these areas or THEMATIC focus. Select up to THREE that most apply to your work and explain the rationale for identifying these themes.
Education, Open Technology, Diversity
- Open Access Information
- 8. Will your work focus on involving participants from any underrepresented communities?
Geographic, Digital Access
- 9. Who are the target participants and from which community? How will you engage participants before and during the activities? How will you follow up with participants after the activities?
Wiki-editors and researchers from all geographic locations are offered full digital access to all wiki-ecosystems supported in this project. Furthermore, these wiki-ecosystems have been presented at conferences and a number of other events: https://br.wikimedia.org/wiki/WikiCon_Brasil_2022/Programa/WLSR https://wikiconference.org/wiki/Submissions:2021/WikiLetters_Systematic_Review https://eventos.congresse.me/29cofab
- 10. In what ways are you actively seeking to contribute towards creating a safer, supportive, more equitable environment for participants?
The contents are to be transferred to new servers, and some important information is to be additionally backed up in Wikidata. Participants making use of these sources of information are therefore allowed and encouraged to contribute to the sustained growth of these platforms.
- 11. Please tell us about how you have let your Wikimedia communities know about the planned activities and this proposal. Use this space to describe the processes you carried out to make the community more involved in planning this proposal. Please link the on-wiki community discussion(s) around the proposals.
This proposal relates to a number of complementary ideas presented to the scientific community and the wiki-community. https://br.wikimedia.org/wiki/WikiCon_Brasil_2022/Programa/WLSR https://wikiconference.org/wiki/Submissions:2021/WikiLetters_Systematic_Review https://eventos.congresse.me/29cofab
- 12. Are you aware of other Rapid Fund proposals in your local group, community, or region that are being submitted and that align with your proposed project?
- 12.1 Did you explore the possibility of doing a joint proposal with other leaders in your group?
- 12.2 How will this joint proposal allow you to have better results?
- 13. Will you be working with other external, non-Wikimedia partners to implement this proposal? Required.
- 13.1 Please describe these partnerships and what motivates the potential partner to be part of the proposal and how they add value to your work.
- 14. In what ways do you think your proposal most contributes to the Movement Strategy 2030 recommendations. Select a maximum of THREE options that most apply.
Improve User Experience, Manage Internal Knowledge, Innovate in Free Knowledge
Learning, Sharing, and Evaluation
- 15. What do you hope to learn from your work in this fund proposal?
This work will be used to provide databases that can support wiki-editors and researchers.
- 16. Based on these learning questions, what is the information or data you need to collect to answer these questions? Please register this information (as metric description) in the following spaces provided.
|Main Open Metrics||Description||Target|
- 17. Core quantitative metrics.
|Number of participants||The total number of target participants cannot be counted at this stage, but this project is open to all wiki-editors willing to check the quality of references they aim to insert into Wikipedia articles based on science. In the medium and long run, this project will clearly benefit from workshops that introduce wiki-editors and researchers to all this information.|
|Number of editors||Lane Rasberry and Daniel Mietchen are long-term editors contributing to wiki-ecosystems under the umbrella of WMF. Fernando Andutta will start editing and backing up information from the aforementioned databases into Wikidata. Consequently, the Scholia project (https://www.wikidata.org/wiki/Wikidata:Scholia) can also re-transmit this information via SPARQL queries (https://en.wikipedia.org/wiki/SPARQL).||3|
|Number of organizers||Fernando Andutta will be developing the proposed task with advice from Lane Rasberry and Daniel Mietchen, who are both very experienced in the context of wiki-projects.||3|
|Wikipedia||This work aims to support wiki-editors creating and updating Wikipedia articles based on science.||N/A|
|Wikidata||The contents of the databases WikiRetracted, WikiCitingRetracted and WikiComments, which amount to tens of thousands of elements, are to be backed up in Wikidata as a safety measure and to allow easy access by services such as Scholia (https://scholia.toolforge.org/). Scholia is a bridge between researchers and Wikidata, and we have been collaborating on exposing researchers to the wiki state of mind.||N/A|
- 17.1 If for some reason your proposal will not measure these core metrics please provide an explanation.
Some core metrics are difficult to estimate at present, because the project mostly relates to technology development rather than to a workshop for teaching a target audience. However, the technology being developed here is a valuable source of information that is fully open to wiki-editors and to anyone willing to help improve the content of Wikipedia articles based on science. Furthermore, the contents of these databases add up to tens of thousands of elements, and growing, and are meant to be transferred to the items of their respective publications in Wikidata. Scientific publications in Wikidata are accessed via Q-values (e.g. https://www.wikidata.org/wiki/Q58493233). Consequently, such information in Wikidata can be accessed by Scholia, which is the bridge between researchers and Wikidata.
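As a rough illustration of how publication information stored in Wikidata could be reached through SPARQL, the following minimal sketch builds a query for works that cite a given publication item. It assumes the public Wikidata Query Service endpoint and the Wikidata property P2860 ("cites work"); the function names are hypothetical and are not part of Scholia or of the wiki-ecosystems in this proposal. No network request is made.

```python
from urllib.parse import urlencode

# Assumption: the public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def citing_works_query(qid: str, limit: int = 10) -> str:
    """Return a SPARQL query listing works that cite the item `qid`.

    Hypothetical helper for illustration; P2860 is Wikidata's "cites work".
    """
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata Q-value: {qid!r}")
    return (
        "SELECT ?work ?workLabel WHERE {\n"
        f"  ?work wdt:P2860 wd:{qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def query_url(qid: str, limit: int = 10) -> str:
    """Build a GET URL that asks the endpoint for JSON results."""
    params = {"query": citing_works_query(qid, limit), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


# Example with the publication item mentioned in this proposal.
example_url = query_url("Q58493233", limit=5)
```

The resulting URL can be opened in a browser or fetched with any HTTP client; a similar pattern could serve for listing articles that cite a retracted publication.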
- 18. What tools would you use to measure each metric selected? Please refer to the guide for a list of tools. You can also write that you are not sure and need support.
Query throughput, which relates to the efficiency of a database, can be roughly assessed with a number of online services that check website speed only, and also through additional parameters (https://en.wikipedia.org/wiki/Query_throughput).
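One possible way to obtain such a rough measurement locally is sketched below. It times repeated calls to a query function and reports the median latency and the number of queries per second. The function name `measure_throughput` and the stand-in workload are assumptions for illustration; in practice `run_query` would wrap a real request to one of the wiki-ecosystems.

```python
import statistics
import time
from typing import Callable


def measure_throughput(run_query: Callable[[], object], repetitions: int = 20) -> dict:
    """Time `run_query` sequentially and summarize latency and throughput.

    Hypothetical helper: `run_query` is any zero-argument callable that
    performs one query (e.g. one HTTP request or one database lookup).
    """
    if repetitions <= 0:
        raise ValueError("repetitions must be positive")
    latencies = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        latencies.append(time.perf_counter() - start)
    total = sum(latencies)
    return {
        "queries": repetitions,
        "median_latency_s": statistics.median(latencies),
        # Guard against a zero total on very coarse clocks.
        "queries_per_second": repetitions / total if total > 0 else float("inf"),
    }


# Stand-in workload; replace with a function that queries the real service.
summary = measure_throughput(lambda: sum(range(10_000)), repetitions=5)
```

Running the same measurement before and after the move to the new servers would give a simple before/after comparison, although sequential timing from a single client only approximates throughput under real load.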
- 19. & 19.1 What is the amount you are requesting from Wikimedia Foundation? Please provide this amount in your local currency.
- 19.2 What is this amount in US Currency (to the best of your knowledge)?
- 20. Please upload your budget for this proposal or indicate the link to it.
4 individual servers on the Enhanced VPS plan (USD 1,080.00 each; USD 4,320.00 in total). Plan details: ENHANCED, Save 50%, $29.99/mo*, 36-month term. Top features: 2 cores, 60 GB SSD storage, 4 GB RAM, 2 TB bandwidth, 2 IP addresses.
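The budget arithmetic can be checked with the short sketch below. It assumes that the per-server figure corresponds to the advertised monthly price over the full 36-month term; under that assumption the quoted USD 1,080.00 appears to be the exact amount (USD 1,079.64) rounded up to the nearest dollar. The variable names are illustrative only.

```python
from decimal import Decimal

# Figures taken from the budget line above.
MONTHLY_PRICE = Decimal("29.99")      # USD per server per month (promotional)
TERM_MONTHS = 36
SERVERS = 4
QUOTED_PER_SERVER = Decimal("1080.00")

# Assumption: per-server cost = monthly price x term length.
exact_per_server = MONTHLY_PRICE * TERM_MONTHS   # 1079.64
exact_total = exact_per_server * SERVERS         # 4318.56
quoted_total = QUOTED_PER_SERVER * SERVERS       # 4320.00

# Difference introduced by rounding each server up to USD 1,080.00.
rounding_margin = quoted_total - exact_total     # 1.44
```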
- We/I have read the Application Privacy Statement, WMF Friendly Space Policy and Universal Code of Conduct.
Endorsements and Feedback
Please add endorsements and feedback to the grant discussion page only. Endorsements added here will be removed automatically.
Community members are invited to share meaningful feedback on the proposal and include reasons why they endorse the proposal. Consider the following:
- Stating why the proposal is important for the communities involved and why they think the strategies chosen will achieve the results that are expected.
- Highlighting any aspects they think are particularly well developed: for instance, the strategies and activities proposed, the levels of community engagement, outreach to underrepresented groups, addressing knowledge gaps, partnerships, the overall budget and learning and evaluation section of the proposal, etc.
- Highlighting if the proposal focuses on any interesting research, learning or innovation, etc. Also if it builds on learning from past proposals developed by the individual or organization, or other Wikimedia communities.
- Analyzing if the proposal is going to contribute in any way to important developments around specific Wikimedia projects or Movement Strategy.
- Analyzing if the proposal is coherent in terms of the objectives, strategies, budget, and expected results (metrics).