What is the problem you're trying to solve?
Verifiability is one of the cornerstones of Wikipedia, and a key to its continued success. The resources that Wikipedia citations link to establish the reliability of an article, and help knit Wikipedia into the fabric of the Web. When content providers aligned with the Wikimedia community's interest in disseminating knowledge can see how their content is being referenced by Wikipedians, and Wikipedians can see which websites provide the foundations for our content, we create a more transparent environment for understanding how Wikipedia affects the Web.
Currently, this relationship between Wikipedia and other websites is hard to recognize: Special:LinkSearch provides only a basic list of links, with no ability to analyze those links in relation to their context, such as the pages they are on or who contributed them. Linkypedia v1 created one method for collecting that data. However, the current tool provides only a basic level of metrics, and because of where it is hosted and how it collects information, it cannot support a large user base or collect information across multiple language Wikipedias. As a result, the tool doesn't provide sufficient information for effective community usage. Moreover, none of the current options for collecting link data allows a robust visualization of the connections between Wikipedia and the Web. Volunteers, researchers and other members of the community therefore spend a large amount of time collecting this information and presenting it to community members and potential outside partners.
This problem creates a number of reporting issues when the community looks to engage outside parties. For example:
- the GLAM-Wiki community relies heavily on the ability to show interested partners what resources are being used on Wikipedia and which editors are adding references to Wikipedia pages. More targeted collection of link data would help create better relationships with these partners and create more focused outreach to already engaged members of the community.
- the Wikipedia Library offers another use case: publishers interested in donating accounts to the Wikipedia community are strongly motivated by Wikipedia's centrality for accessing scholarly knowledge on the Web. Providing better metrics and reporting allows the Wikipedia Library to better engage partners and highlight the work of individual volunteers within the community, so that they are better supported in their needs for accessing partner resources.
What is your solution?
Linkypedia v1 was a prototype tool to see if GLAM organizations were interested in seeing how their content was being referenced on Wikipedia. It was limited to English Wikipedia, and its underlying architecture made it difficult to scale to meet the positive response. Linkypedia v2 will build on what was learned in the prototype to establish a similar service in the Wikimedia Labs environment. Using the Labs infrastructure will mean the application can be expanded to all language Wikipedias, and can use the replicated databases rather than being limited to API queries. I plan to work with members of the Wikipedia Library project to redesign the user interface, which will be built on top of API calls that will make Linkypedia data usable in other WMF projects.
Linkypedia will be a new tool that will let editors and other interested parties see how content from particular regions of the Web is used on Wikipedia. The initial focus will be on developing metrics and visualizations of data that strengthen the GLAM-Wiki and Wikipedia Library projects, so that the management and communication of those projects can be more efficient and effective. It will also let interested website owners track how their content is used on Wikipedia, and potentially engage with Wikipedia editors. The hope is that this will encourage knowledge-holding communities, including cultural heritage institutions and publishers, to be more involved in the Wikipedia communities that leverage those organizations' holdings for public knowledge dissemination.
The deliverable for this project is a new website at http://tools.wmflabs.org/linkypedia/ that lets users log in and begin viewing high-level statistics about the links between Wikipedia and the Web. They will also be able to select particular websites, or collections of websites, to monitor for new updates.
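To give a rough sense of what "high-level statistics" could mean in practice, here is a minimal sketch in Python that groups external links by host and counts links and distinct pages per host. It is an illustration only, not the planned implementation: the function name `summarize_by_host`, the sample data, and the choice to fold a leading `www.` into the bare host are all assumptions made for this example.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical sample of (wiki page title, external URL) pairs.
SAMPLE_LINKS = [
    ("Page A", "http://www.example.org/item/1"),
    ("Page A", "http://example.org/item/2"),
    ("Page B", "https://example.org/item/1"),
    ("Page B", "https://archive.example.net/doc"),
]


def summarize_by_host(links):
    """Return {host: (link_count, distinct_page_count)} for the given pairs."""
    summary = defaultdict(lambda: {"links": 0, "pages": set()})
    for page, url in links:
        host = urlsplit(url).hostname
        if host is None:
            # Skip strings that do not parse as a URL with a host.
            continue
        if host.startswith("www."):
            # Assumption: treat www.example.org and example.org as one site.
            host = host[4:]
        summary[host]["links"] += 1
        summary[host]["pages"].add(page)
    return {
        host: (entry["links"], len(entry["pages"]))
        for host, entry in summary.items()
    }


if __name__ == "__main__":
    for host, (link_count, page_count) in sorted(summarize_by_host(SAMPLE_LINKS).items()):
        print(f"{host}: {link_count} links on {page_count} pages")
```

In the real tool the input pairs would presumably come from the replicated databases rather than a hard-coded list, and the grouping rules would need to be agreed with the GLAM-Wiki and Wikipedia Library communities.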
Provide an indication of the timeline.
- web application: 60 hrs
- authentication: 4 hrs
- database work: 20 hrs
- labs setup: 8 hrs
- automation with open grid: 8 hrs
- graphic design: 8 hrs
108 hrs, at $75/hr = $8,100.00
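As a quick check of the arithmetic, the estimate above can be reproduced with a short script. The hours and the hourly rate are taken from the list; the variable names are illustrative only.

```python
# Hour estimates copied from the timeline above.
ESTIMATED_HOURS = {
    "web application": 60,
    "authentication": 4,
    "database work": 20,
    "labs setup": 8,
    "automation with open grid": 8,
    "graphic design": 8,
}
HOURLY_RATE_USD = 75

total_hours = sum(ESTIMATED_HOURS.values())
total_cost = total_hours * HOURLY_RATE_USD

print(f"{total_hours} hrs, at ${HOURLY_RATE_USD}/hr = ${total_cost:,.2f}")
```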
I plan on working with the GLAM-Wiki and Wikipedia Library projects along the way as versions of the software become available. I plan on an iterative software development model in which working versions of the software are made available to interested stakeholders from those communities as soon as they are ready, rather than waiting until they are finished.
The big advantage to running the application in the Labs environment is that it can piggyback on existing WMF infrastructure. Once the application is up and running, I would accept pull requests for functionality, fix bugs and add features as needed as a volunteer. If substantial work were needed, I would apply for another grant, or help others do so.
Measures of success
Success for Linkypedia would be making it a useful tool for the GLAM-Wiki and Wikipedia Library projects. If most members of those communities were able to regularly use it to gather metrics and monitor web properties for the development of partnerships, I would consider it a success. Another angle to the project is other website owners. With Linkypedia v1, large organizations such as the National Institutes of Health, Yale University, and the New York Times expressed interest in seeing how their content was being used on Wikipedia. If just a handful of them started relying on Linkypedia reports to help guide their engagement with the Wikipedia community, I think that would be even more of a success.
Please paste links below to where relevant communities have been notified of your proposal, and to any other relevant community discussions.
Do you think this project should be selected for an Individual Engagement Grant? Please add your name and rationale for endorsing this project below! (Other constructive feedback is welcome on the discussion page).