Grants:IEG/Automated generation of questions and answers from WikiData

From Meta, a Wikimedia project coordination wiki
status: not selected
summary: Implementation of a system that will automatically generate hundreds of thousands of questions, each with the correct answer and three wrong answers.
target: English Wikidata
strategic priority: increase participation
amount: 20000 USD
contact: giorgos_chri(_AT_)hotmail.com
created on: 20:55, 11 April 2016 (UTC)

Project idea

What is the problem you're trying to solve?

All manually created quizzes suffer from the limited number of questions they can offer the user. Moreover, writing the questions, and even more so writing suitable wrong answers, requires a great deal of time and manpower.

What is your solution?

We propose the automated generation of questions from Wikidata, leading to hundreds of thousands of questions (including permutations of wrong answers, to avoid learning by repetition). The first version of the application will generate simple questions, such as "When was Alexander the Great born?", with the potential answers 356 BC (the correct one) and three wrong answers chosen at random from the integers within fifty of 356 in either direction.
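The wrong-answer rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `numeric_distractors` and `build_question`, the option format, and the exclusion of non-positive values are hypothetical choices, not part of the proposal.

```python
import random


def numeric_distractors(correct, count=3, spread=50, rng=None):
    """Return `count` distinct wrong integers within +/- `spread` of `correct`."""
    rng = rng if rng is not None else random.Random()
    # Every integer in the window except the correct value itself.
    # Non-positive values are dropped because there is no year 0 (an
    # assumption that only makes sense for year-like answers).
    candidates = [
        n for n in range(correct - spread, correct + spread + 1)
        if n != correct and n > 0
    ]
    return rng.sample(candidates, count)


def build_question(text, correct, suffix="", rng=None):
    """Bundle the question text with four shuffled options."""
    rng = rng if rng is not None else random.Random()
    values = numeric_distractors(correct, rng=rng) + [correct]
    rng.shuffle(values)
    options = [f"{v} {suffix}".strip() for v in values]
    answer = f"{correct} {suffix}".strip()
    return {"question": text, "options": options, "answer": answer}


question = build_question("When was Alexander the Great born?", 356, suffix="BC")
print(question["question"], question["options"])
```

Because the distractors are re-sampled on every call, repeated calls for the same fact give different option sets, which is the permutation effect the proposal relies on to avoid learning by repetition.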

A more advanced version will pose more elaborate questions, such as "When and where was Alexander the Great born?". Other examples would be "What is the population of San Francisco?" or "In which state is San Francisco located?", with wrong answers chosen from states other than California. The application logic will ensure that questions are formed correctly in English and that wrong answers are generated in a way that makes each question both interesting and a learning tool, e.g. by traversing the Wikidata graph up to some distance and picking suitable wrong answers.

Because the questions are so numerous, a practically unlimited number of different quizzes can be generated on the same topic once a simple mechanism for creating random combinations of questions is in place. Quizzes will be generated by randomly choosing 10 questions out of the existing rich pool of questions. In addition, thematic random quizzes will be generated upon request, e.g. quizzes related to San Francisco, provided that enough questions can be generated for San Francisco from Wikidata.
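As a rough sketch of the two mechanisms described above, the following Python example picks wrong answers by a breadth-first traversal of a small graph and assembles a quiz by random sampling. The graph is hand-written toy data standing in for Wikidata, and the names `nodes_within`, `graph_distractors` and `make_quiz` are hypothetical; a real implementation would query Wikidata and would need its own notion of a "suitable" wrong answer.

```python
import random
from collections import deque

# Toy undirected graph standing in for a slice of Wikidata (illustrative only).
GRAPH = {
    "San Francisco": ["California"],
    "California": ["San Francisco", "United States"],
    "United States": ["California", "Oregon", "Nevada", "Arizona"],
    "Oregon": ["United States"],
    "Nevada": ["United States"],
    "Arizona": ["United States"],
}
STATES = {"California", "Oregon", "Nevada", "Arizona"}


def nodes_within(graph, start, max_distance):
    """Breadth-first search: map each reachable node to its hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_distance:
            continue  # do not expand beyond the distance limit
        for neighbour in graph.get(node, []):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


def graph_distractors(graph, subject, correct, same_type,
                      count=3, max_distance=3, rng=None):
    """Pick wrong answers of the right type from nodes near the subject."""
    rng = rng if rng is not None else random.Random()
    reachable = nodes_within(graph, subject, max_distance)
    candidates = sorted(n for n in reachable if n in same_type and n != correct)
    return rng.sample(candidates, count)


def make_quiz(pool, size=10, rng=None):
    """Choose `size` distinct questions at random from the pool."""
    rng = rng if rng is not None else random.Random()
    return rng.sample(pool, size)


print(graph_distractors(GRAPH, "San Francisco", "California", STATES))
```

Restricting the candidates to nearby nodes of the same type keeps the wrong answers plausible. A thematic quiz could reuse `make_quiz` on a pool filtered to questions about one subject.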

Project goals

The proposed application (Automated WikiQuiz) is expected to help attract and retain the interest of Wikipedia users. With a practically unlimited number of questions available, users will be able to test their knowledge of the topics that interest them, which should motivate them to keep visiting Wikipedia pages, especially those related to the questions they answered incorrectly. In this way Automated WikiQuiz will increase the participation of quiz takers and Wikipedia visitors, and will improve the quality of quizzes (greater interest and repeatability).

Project plan

Activities

Phase 1: Create a database structured so that questions and their answers can be generated simply and accurately.
Phase 2: Create the program that will generate the questions and answers from the database created in Phase 1.
Phase 3: Test the results of the project.
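To make Phases 1 and 2 more concrete, here is a minimal sketch that stores facts in a database and turns them into questions through templates. Everything in it is an assumption made for illustration: the single `fact` table, the property names, the `TEMPLATES` mapping, and the use of an in-memory SQLite database are not specified in the proposal.

```python
import sqlite3

# Hypothetical question templates, keyed by a simplified property name.
TEMPLATES = {
    "date of birth": "When was {subject} born?",
    "population": "What is the population of {subject}?",
    "located in state": "In which state is {subject} located?",
}


def build_database(facts):
    """Phase 1 (sketch): load (subject, property, value) facts into SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fact (subject TEXT, property TEXT, value TEXT)")
    conn.executemany("INSERT INTO fact VALUES (?, ?, ?)", facts)
    return conn


def generate_questions(conn):
    """Phase 2 (sketch): turn each stored fact into (question, answer)."""
    rows = conn.execute(
        "SELECT subject, property, value FROM fact ORDER BY rowid"
    )
    questions = []
    for subject, prop, value in rows:
        template = TEMPLATES.get(prop)
        if template is None:
            continue  # no template for this property, so skip the fact
        questions.append((template.format(subject=subject), value))
    return questions


facts = [
    ("Alexander the Great", "date of birth", "356 BC"),
    ("San Francisco", "located in state", "California"),
    ("San Francisco", "unsupported property", "n/a"),
]
for question, answer in generate_questions(build_database(facts)):
    print(question, "->", answer)
```

Keeping the wording in templates, separate from the stored facts, means a syntactic fix to one template corrects every question generated from it, which is also convenient for the testing in Phase 3.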

Budget

  • Senior Software Analyst: 10000 USD
  • Software Developer: 10000 USD

Community engagement

We will survey our target community at the start and end of the project, and hold feedback sessions at the end of each month.
Volunteers from the Greek Society for Free/Open Software (GFOSS) will be invited to join the testing and assessment activities mentioned above.
We will ask university students and other Wikipedia users and learners with different profiles to test the project and give feedback on how interesting and fun the quizzes are.

Sustainability

Our project will focus on questions in English, but because of its adaptability it will be very easy to generate questions in other languages. This is possible because Wikidata already links the same items across languages. So, to produce the exact same question in a different language, all that is needed is for the same Wikidata content to be available in that language.
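The idea can be illustrated with a small sketch in which items are identified by language-independent IDs and only the labels and the question template change per language. The label table is hand-written sample data, and the per-language `TEMPLATES` mapping and the function `localized_question` are hypothetical. The IDs follow Wikidata's Q-number style and should be checked against Wikidata before use.

```python
# Sample labels keyed by item ID, then by language code (illustrative data).
LABELS = {
    "Q62": {"en": "San Francisco", "fr": "San Francisco"},
    "Q99": {"en": "California", "fr": "Californie"},
}

# Hypothetical per-language templates for one question type.
TEMPLATES = {
    "en": "In which state is {subject} located?",
    "fr": "Dans quel État se trouve {subject} ?",
}


def localized_question(subject_id, answer_id, lang):
    """Render the same fact as a (question, answer) pair in `lang`."""
    question = TEMPLATES[lang].format(subject=LABELS[subject_id][lang])
    return question, LABELS[answer_id][lang]


print(localized_question("Q62", "Q99", "en"))
print(localized_question("Q62", "Q99", "fr"))
```

In this sketch the fact itself is stored once. Supporting another language means supplying the labels and one translated template per question type.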

Measures of success

  • Creation of thousands of questions with no syntactic errors.
  • Creation of different sets of wrong answers for the same question, varied according to some criterion (e.g. difficulty).
  • Creation of more elaborate questions.

Get involved

Participants

  • Software Engineering scientific coordinator: Dr Ioannis Stamelos (username: Ioannis.Stamelos). He is a Professor of Software Engineering at the Department of Informatics, Faculty of Sciences, AUTh. He has managed or participated in 30 research and development projects related to information systems. He actively researches and supports open source software and open technologies in general, and he is a member of the Board of Directors of the Greek Free/Open Source Software Organization (GFOSS).
  • Software Developer: George Christakis (username: GrWayfarer). He is currently an undergraduate student at the Department of Informatics, Faculty of Sciences, AUTh. He also works as a Java developer and in QA.

Community notification

Please paste links below to where relevant communities have been notified of your proposal, and to any other relevant community discussions.

Endorsements

Do you think this project should be selected for an Individual Engagement Grant? Please add your name and rationale for endorsing this project below! (Other constructive feedback is welcome on the discussion page).