Research:How Cultural Identification Affects Wikipedia Creation And Experience

From Meta, a Wikimedia project coordination wiki
This project page documents a research project currently in progress.
Information may be incomplete and can change rapidly as science advances.
Research project
How Cultural Identification Affects Wikipedia Creation And Experience
main contact
co-investigators
Mari-Carmen Marcos
WMF contact Milos Rancic
start 2012-1
end 2013-12
status in progress
fields multidisciplinary
human–computer interaction
ethnography
usability
open data This project has published open-licensed data
open access This project has open access publications

Key Personnel

This research is carried out by a team of researchers based at Universitat Pompeu Fabra, Barcelona, in collaboration with members of the Universitat Politècnica de Catalunya. Key personnel on the project include:

  • Marc Miquel i Ribé, Universitat Pompeu Fabra (UPF)
  • Mari-Carmen Marcos, PhD, Universitat Pompeu Fabra (UPF)
  • Horacio Rodríguez, PhD, Universitat Politècnica de Catalunya (UPC)

Project Summary

The objective of this project is to understand the impact of cultural identification on Wikipedia users, both editors and readers, in order to improve user experience and participation.

Although the free encyclopedia project shares a fundamental set of ideological pillars (Neutral Point of View, ...) across language editions, each edition is unique because of its cultural context.

Not only is information treated differently; some articles exist in only one language edition. In particular, each language edition covers the content related to its own homeland's history, society, traditions and idiosyncrasies (local content) better than any other edition does.

Therefore, each Wikipedia edition is culturally configured by its users' edits and cultural background, which we assume to be fundamental to the whole reading and writing experience. The repercussions may extend to the degree of neutrality, the editing activity and the quality of content.

To understand the effects of users' identification with a cultural background, we propose three goals:

  1. to identify the scope of 'local content' in several Wikipedia language editions and its main characteristics compared to the rest of the content.
  2. to understand the motivation to write, and therefore whether a greater effort in content creation reflects a greater motivation for a particular topic such as 'local content'.
  3. to experiment with new user interface add-ons that give visibility to multicultural differences, in order to understand the 'user experience' of both readers and writers from different communities and to see how such add-ons can engage them in the community.

Background Information

Among the most studied aspects of Wikipedia and other free culture projects, such as open source software, is what motivates editors to contribute. Some studies propose a dichotomy between internal and external motivation, explained by self-gratification or by a material reward (Yang and Lai 2010). Others, depending on the discipline (psychology, sociology, etc.), present different categories into which users' answers are classified in an empirical study. These include reasons such as 'ideology', 'for the fun of doing it' and 'for being part of a community' (Nov 2007).

However, none of them proposes the identification between user and content as fundamental to engagement and to further commitment. Unlike in free software, content is a central issue in Wikipedia. Topical coverage has been studied with several methodologies, from information and library science to natural language processing, and these studies concluded that some topics have a much larger number of articles than others. Kittur et al. (2009) showed that around 85% of the content is related to the social sciences (Culture and Arts 30%, People 15%, Geography and places 14%, Society 12%, History and events 11%...).

Other studies have analyzed users' editing behavior in order to detect cultural differences (Pfeil 2006). Depending on certain parameters, it was possible to compare edit patterns and relate them to well-established social findings on dominance and collaboration. From the same multilingual perspective, Hecht (2010) studied the geographical location of articles and concluded that those located in countries where the language of a Wikipedia edition is spoken were given much more importance.

The current study builds on these results and aims to open new questions and reach new conclusions. Because not only geographical articles are related to a language, our concept of Local Content includes all the articles that refer to the land, social activities and cultural heritage. As presented at Wikimania 2011, the concept was tested in 20 language editions, with local content averaging a non-negligible 25%. It is also remarkable that one in every three anonymous editors chooses this content to edit.

In the analysis, we obtained the main characteristics of local content, which were shared among language editions. This content was non-replicated (no interwiki links), very well categorised and endogamic (self-referring), despite covering very different topics. In the broad data extraction, Wikipedia data was analyzed in terms of edit patterns, content length, textual particularities and graph structures. The degree of interest in this 'local content' is what we called Autoreferentiality (Miquel 2011), and it is measured by several impact factors across Wikipedia structures and information.
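As a rough illustration of the characteristics listed above, the following Python sketch computes three simple indicators over a toy set of articles: the share of local content, the share of local articles without interwiki links, and the share of their outgoing links that stay within local content. The function name, the dictionary layout and the sample titles are all hypothetical and chosen only for this example; this is not the Autoreferentiality measure of Miquel (2011), whose exact factors are not given here.

```python
def local_content_profile(articles, local_titles):
    """Summarise simple indicators for a set of 'local content' articles.

    articles: {title: {"interwiki": [language codes], "links": [titles]}}
    local_titles: set of titles already labelled as local content
    (both structures are assumptions made for this sketch).
    """
    local = [title for title in articles if title in local_titles]
    if not local:
        return {"local_share": 0.0, "no_interwiki_share": 0.0,
                "internal_link_share": 0.0}
    # Local articles that exist in no other language edition.
    no_interwiki = sum(1 for title in local if not articles[title]["interwiki"])
    # Outgoing links from local articles, and how many point to local content.
    out_links = [dst for title in local for dst in articles[title]["links"]]
    internal = sum(1 for dst in out_links if dst in local_titles)
    return {
        "local_share": len(local) / len(articles),
        "no_interwiki_share": no_interwiki / len(local),
        "internal_link_share": internal / len(out_links) if out_links else 0.0,
    }


# Toy, invented data (not project results).
sample = {
    "Castell": {"interwiki": [], "links": ["Sardana", "Catalonia"]},
    "Sardana": {"interwiki": [], "links": ["Castell"]},
    "Catalonia": {"interwiki": ["en", "es"], "links": ["Europe"]},
    "Europe": {"interwiki": ["en", "es", "fr"], "links": []},
}
profile = local_content_profile(sample, {"Castell", "Sardana", "Catalonia"})
```

In a real analysis the labelling of local articles is itself the hard part; here it is simply taken as given.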

After the discussion and conclusions it was possible to quantify the scope, but not to answer the underlying motivational questions. For example: are editors motivated to write in Wikipedia, and write about this 'local content' because it is what they know, or do editors get involved in Wikipedia in order to give this content more visibility? Daniel Pink (2010) suggests that motivation is an inner drive activated through autonomy, mastery and purpose, the latter understood as being part of something bigger than oneself. Which is the bigger thing, Wikipedia or their cultural background?

In the following phases of research, we propose Ethnography and User Experience (UX) methods to complement the Data Analysis. First, we will study two different Wikipedia communities in order to see how motivational concepts match reality. We want to understand this proposed cultural motivation as expressed by their editors, and we propose the Catalan community as one example, plus another one still to be determined. Using the current data results, we can also reach the most interesting individuals in terms of the content they edit.

Last, in order to see how editors and readers of a Wikipedia language edition relate to local content, we will propose alternatives to the Wikipedia interface and Information Architecture (IA) and create MediaWiki add-ons as tactical experiments. How will users react to difference? Can it improve the motivation to write and bring a new wave of editors to Wikipedia? Initiatives centred on local content, such as Wiki Loves Monuments, a contest based on contributing to local monuments, have been a success in many communities. Local content matters.

Our hypothesis is that highlighting difference and cultural complementarity will improve neutrality by adding multiple points of view and will also help to create a more consistent cultural dialogue. First, it will help to deepen the engagement of contributors; afterwards, it will attract a new wave of editors who will understand multiculturality as a challenge both for understanding and for adopting a point of view. This will make Wikipedia more appealing to different social groups and profiles.


Methods

Data Processing (Data Analysis and Natural Language Processing)

In the first phase, Wikipedia itself was the main research object, and it therefore had to be approached with computational means and methodologies. In order to obtain general conclusions we chose 20 language editions, from the most edited to very small ones. We proposed an analytical model using different Wikipedia informational structures: textual, relational and quantitative. This involved all kinds of elements, such as edits, links, text and categories, among others. We used techniques such as TF-IDF, PageRank and Semantic Relatedness.
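To make one of the named techniques concrete, here is a minimal Python sketch of TF-IDF term weighting over tokenised article texts. It assumes one common variant (term frequency normalised by document length, idf = log(N / df)); the project's actual weighting scheme, tokenisation and data are not specified here, and the function name and sample texts are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs: list of token lists. Weight = (count / document length)
    * log(number of documents / document frequency). Other TF-IDF
    variants exist; this one is an assumption of the sketch.
    """
    n_docs = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy, invented article texts (not project data).
articles = [
    "castells human towers tradition catalonia".split(),
    "football club history catalonia".split(),
    "football world cup history".split(),
]
scores = tf_idf(articles)
```

Terms that occur in a single article (such as "castells" here) receive a higher weight than terms shared across articles (such as "catalonia"), which is why this kind of weighting is useful for spotting vocabulary specific to a subset of content.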

Ethnography and Qualitative Research

In the second phase, we propose using qualitative means to approach different Wikipedia language communities. We have already conducted two informal surveys of the Catalan community (through the Amical Viquipèdia association), which gave a good indication of a cultural motivation to collaborate on local content. However, qualitative techniques are necessary to obtain well-founded conclusions. This will involve recruitment, interview and analysis periods.

User Testing: Eye Tracking and Think Aloud Methods

In the third and last phase, we propose 'Eye Tracking' and 'Think Aloud' as mature User Experience methodologies to understand how users currently relate to 'local content'. It will be necessary to take into account all degrees of interest in 'local content'. This involves recruitment, testing and analysis periods. Later, it will also be possible to continue the research by proposing and using metrics to see whether the interface changes users' behavior.


Dissemination

The research will be presented at relevant conferences and seminars and submitted to relevant journals.


Wikimedia Policies, Ethics, and Human Subjects Protection

Our foremost priority is to conduct our research in an ethical, respectful, and non-disruptive manner. We will ensure that we conform to strict standards of informed consent and transparency in data collection methods. All participants will be informed about our affiliation, purpose and research goals. We will make all efforts to address any risks associated with participation in this study.

Benefits for the Wikimedia community - Fit to Strategy

This work will help to:

  1. Identify the motivation of editors, both registered and anonymous.
  2. Give a better understanding of Wikipedia content, its strengths and its gaps.
  3. Reinforce a new multicultural neutral point of view (MNPOV).
  4. Help with the overall goal of spreading all human knowledge to all languages.
  5. Propose useful, engaging 'User Experience' guidelines for newer MediaWiki versions.

Time Line

January-March 2012

  • Develop the API for Wikipedia analysis
  • Process new data from Wikipedia languages
  • Draft report

April - June 2012

  • Start Catalan and other community recruiting process
  • Interviews and ethnographic field work
  • Data analysis

June-September 2012

  • Draft report
  • Information Architecture Prototyping
  • Data Analysis

January-March 2013

  • User Experience recruiting process
  • MediaWiki Prototyping
  • User testing

April - December 2013

  • User testing
  • Data Analysis
  • Drafting, writing and publishing


Funding

This project is at the end of phase 1. It needs funding to continue to phases 2 and 3.


References

Butler, Brian, Joyce, Elisabeth and Pike, Jacqueline. 2008. Don’t look now, but we’ve created a bureaucracy: the nature and roles of policies and rules in Wikipedia. CHI ’08: Proceedings of the twenty-sixth annual SIGCHI conference on Human factors in computing systems. Pages 1101-1110. ACM, New York, NY, USA.

Gabrilovich, E. and Markovitch, S. 2007. Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis. Twentieth International Joint Conference on Artificial Intelligence (IJCAI ’07), 1606-16

Halavais, Alexander and Lackaff, Derek. 2008. An analysis of topical coverage of Wikipedia. Journal of Computer-Mediated Communication. 13(2):429-440.

Hecht, Brent and Gergle, Darren. 2009. Measuring self-focus bias in community-maintained knowledge repositories. In C&T ’09: Proceedings of the fourth international conference on Communities and technologies. Pages 11-20. ACM, New York, NY, USA.

Hecht, Brent and Gergle, Darren. 2010. The Tower of Babel meets web 2.0: user-generated content and its applications in a multilingual context. Pages 291-300. ACM.

Kittur, Aniket, Chi, Ed H. and Suh, Bongwon. 2009. What’s in Wikipedia?: mapping topics and conflict using socially annotated category structure. CHI ’09: Proceedings of the 27th international conference on Human factors in computing systems. Pages 1509-1512. ACM. Boston, MA, USA.

Miquel, Marc and Rodríguez, Horacio. 2011. Cultural Configuration of Wikipedia: Measuring Autoreferentiality in Different Languages. Recent Advances in Natural Language Processing, 12-14 September, Hissar, Bulgaria.

Nastase, Vivi and Strube, Michael. 2008. Decoding Wikipedia categories for knowledge acquisition. AAAI ’08: Proceedings of the 23rd national conference on Artificial intelligence. Pages 1219-1224. AAAI Press. Chicago, Illinois.

Nov, Oded. 2007. What motivates Wikipedians? Communications of the ACM. Pages 60-64. New York, NY, USA.

Ortega, Felipe and Gonzalez Barahona, Jesus M. 2007. Quantitative analysis of the Wikipedia community of users. WikiSym ’07: Proceedings of the 2007 International symposium on Wikis. Pages 75-86. ACM. Montreal, Québec, Canada.

Pfeil, Ulrike, Zaphiris, Panayiotis and Ang, Chee S. 2006. Cultural Differences in Collaborative Authoring of Wikipedia. Journal of Computer-Mediated Communication. 12(1).

Yang, Heng-Li and Lai, Cheng-Yu. 2010. Motivations of Wikipedia content contributors. Computers in Human Behavior. 26(6).


Contacts

Marc Miquel – MSc in Telecommunication and degree in Humanities - marcmiquel @ gmail.com
