Research:WUA

From Meta, a Wikimedia project coordination wiki
This project page documents a research project currently in progress.
Information may be incomplete and can change rapidly as science advances.
Research project
WUA (Wikipedia Use Analysis)
Main contact
Co-investigators
WMF contact Dario Taraborelli
Start 2007-03
End 2020-12
Status in progress
Fields computer science
human–computer interaction
sociology
statistics
Open data This project has published open-licensed data
Open access This project has open access publications
WMF support
Wikimedia research projects

Key Personnel

  • Antonio J. Reinoso (ajreinoso_at_libresoft_dot_es)
  • Jesús M. González-Barahona (jgb_at_libresoft_dot_es)
  • Felipe Ortega (jortega_at_libresoft_dot_es)
  • Israel Herraiz (israel_dot_herraiz_at_upm_dot_es)
  • Rocío Muñoz-Mansilla (rmunoz_at_dia_dot_uned_dot_es)

Project Summary

The main goal of this project is to analyze how users make use of Wikipedia. In particular, we aim to determine both the temporal and the behavioral patterns that result from the different types of interaction between the Encyclopedia and its users.

Methods

Our methodology is based on analyzing the requests that users submit to Wikipedia. These requests reach us as the log lines that the Squid servers register once each request has been served. The analysis consists of a parsing step, which extracts the relevant fields from each log line, followed by a filtering step, which selects and stores the requests considered of interest. The result is a populated database ready for statistical examination. The WikiSquilter tool has been developed to perform all these tasks. It is released under a free license and is available at http://sourceforge.net/projects/squilter/
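The parse, filter and store steps described above can be illustrated with a minimal Python sketch. It is not the WikiSquilter implementation. The three-field log layout, the sample lines, the function names and the table schema are all assumptions made for illustration; real Squid log lines carry more fields.

```python
import sqlite3
from urllib.parse import urlsplit

# ASSUMPTION: a simplified, hypothetical log layout of
# "<timestamp> <HTTP method> <URL>". Real Squid log lines differ.
SAMPLE_LINES = [
    "2009-03-01T12:00:01 GET http://en.wikipedia.org/wiki/Main_Page",
    "2009-03-01T12:00:02 GET http://es.wikipedia.org/wiki/Madrid",
    "2009-03-01T12:00:03 GET http://upload.wikimedia.org/some/image.png",
    "malformed line",
]


def parse_line(line):
    """Parsing step: split a log line into fields, or return None if malformed."""
    parts = line.split()
    if len(parts) != 3:
        return None
    timestamp, method, url = parts
    return {"timestamp": timestamp, "method": method, "url": url}


def filter_request(fields):
    """Filtering step: keep only requests to a Wikipedia language edition.

    Returns a (timestamp, method, language, path) row, or None if the
    request is not of interest.
    """
    parsed = urlsplit(fields["url"])
    labels = (parsed.hostname or "").split(".")
    if len(labels) != 3 or labels[1:] != ["wikipedia", "org"]:
        return None
    return (fields["timestamp"], fields["method"], labels[0], parsed.path)


def load(lines, conn):
    """Parse, filter and store the lines; return the number of rows kept."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS requests "
        "(ts TEXT, method TEXT, lang TEXT, path TEXT)"
    )
    kept = 0
    for line in lines:
        fields = parse_line(line)
        if fields is None:
            continue
        row = filter_request(fields)
        if row is None:
            continue
        conn.execute("INSERT INTO requests VALUES (?, ?, ?, ?)", row)
        kept += 1
    conn.commit()
    return kept


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print("rows stored:", load(SAMPLE_LINES, conn))
    # Example of a statistical query over the populated database.
    query = "SELECT lang, COUNT(*) FROM requests GROUP BY lang ORDER BY lang"
    for lang, count in conn.execute(query):
        print(lang, count)
```

Keeping parsing and filtering as separate functions mirrors the two-stage process in the text: the filter criteria can change without touching the code that understands the log layout.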

Dissemination

  • Scientific publications
  • Free access to files containing the received log lines. Because of the huge amount of data, we cannot keep all of the log information received, only the most recent. We currently keep the records for the last two years.

Wikimedia Policies, Ethics, and Human Subjects Protection

Benefits for the Wikimedia community

  • Detailed characterization of the traffic directed to Wikipedia.
  • Possibility of traffic forecasting based on the temporal patterns found.
  • Identification of users' different behaviors when browsing the Encyclopedia.
  • Identification of the most requested resources and services in each language edition.
  • Possibility of geolocating users' requests.

Time Line

Funding

References

Antonio J. Reinoso's doctoral thesis is based entirely on this study. Several publications also based on this study can be found at http://gsyc.es/~ajreinoso/papers

External links

Contacts