Research:Main/sub-article relationship

From Meta, a Wikimedia project coordination wiki
Bowen Yu, Brent Hecht
Duration:  2015-09 – 2016-05
This page documents a completed research project.

The project focuses on understanding and identifying the main article/sub-article relationship in order to better support Wikipedia's article structure (for example, main article: United States; sub-article: History of the United States). According to the documentation of the {Main} template, "(w)hen a Wikipedia article is large, it is often written in summary style. This template is used after the heading of the summary, to link to the subtopic article that has been summarized." However, the {Main} template is not applied consistently in practice, and many of its uses misclassify the relationship. The relationship is important for artificial intelligence systems that use concept-level Wikipedia content and for reducing language barriers across multilingual Wikipedia.

The project has two parts. 1) From the editors' (Wikipedians') perspective, we conduct a survey analysis of editors' understanding of the main/sub-article relationship and of their decision process when creating this relationship. The purpose is to explore why the quality of main/sub-article relationships is currently low. 2) From the artificial intelligence system's perspective, we use machine learning algorithms to automatically classify candidate main/sub-article relationships as true or false.


Part 1

For Part 1, we will conduct a survey analysis.

Survey overview: The purpose of the survey is to understand editors' understanding of the main/sub-article relationship and the criteria they use when deciding to create sub-articles. We want to recruit 30 editors to fill out an online survey, which can be completed in about 5 minutes.

Participants: We want to recruit a mix of editors who have created main/sub-article relationships (experts) and editors who have not. We will identify the experts by mining the edit histories of articles that contain correct uses of the {Main} template.
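As a rough illustration of the first step in that mining process, the sketch below pulls the targets of {{Main}} templates out of a piece of article wikitext. It is not the project's actual pipeline: the function name `extract_main_targets`, the regular expression, and the sample text are assumptions made for this example, and it does not judge whether a use of the template is correct or look at edit histories.

```python
import re

# Hypothetical pattern (an assumption for this sketch): matches {{Main|A}} or
# {{main|A|B}}. Case is ignored for simplicity; nested templates are not handled.
MAIN_TEMPLATE = re.compile(r"\{\{\s*main\s*\|([^{}]*)\}\}", re.IGNORECASE)


def extract_main_targets(wikitext):
    """Return the article titles named as targets of {{Main}} templates."""
    targets = []
    for match in MAIN_TEMPLATE.finditer(wikitext):
        for param in match.group(1).split("|"):
            param = param.strip()
            # Skip empty parameters and named ones (anything containing "=").
            # Note: this also skips the rare title that itself contains "=".
            if not param or "=" in param:
                continue
            targets.append(param)
    return targets


# Made-up sample wikitext, used only to exercise the function.
sample = """== History ==
{{Main|History of the United States}}
A summary of the history goes here.
== Economy ==
{{main|Economy of the United States|Taxation in the United States}}
"""

print(extract_main_targets(sample))
```

Each extracted title, paired with the article it was found in, gives one candidate main/sub-article pair; the editors who added the template could then be looked up in that article's revision history.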

Recruiting method: We will leave a message on each editor's user talk page. The message will include a greeting, a brief introduction to the study, a note of appreciation, and a link to the survey.

We will make sure that we contact editors and conduct the survey in a manner that is respectful of Wikipedians and the community. Please feel free to let us know if you have any concerns!

Specific survey questions:

1. How long have you been a Wikipedia editor (give the approximate number of months)?

2. Approximately how many edits have you made? (Select a range: 1-25, 25-50, 50-100, 100+)

3. Throughout your Wikipedia editing experience, have you noticed that main-article and sub-article relationships exist?

For example, in a main/sub-article relationship, United States can be the main article and History of the United States a sub-article.

4. In your understanding, what is a main article? What is a sub-article? What is a main/sub-article relationship? (open-ended)

5. Under what circumstances would you create or delete a sub-article? (open-ended)

6. Have you had conflicts with other editors about sub-article creation or deletion? If yes, please describe how you resolved the disagreement.

7. Are you satisfied with the current quality of the main/sub-article relationship on Wikipedia?

8. Based on your experience, how could the quality of main/sub-article relationships be improved?

9. Do you think the main/sub article relationship is important to readers and editors?

Link to the survey: You are welcome to fill it out!

Part 2

TODO: description for Part 2


Policy, Ethics and Human Subjects Research

The survey analysis is currently part of a course project that has been approved by the IRB of the University of Minnesota. We will seek separate IRB approval if the work proceeds to publication.