Grants:IEG/Editor Behaviour Analysis/Midpoint
Welcome to this project's midpoint report! This report shares progress and learnings from the Individual Engagement Grantee's first 3 months.
In a few short sentences or bullet points, give the main highlights of what happened with your project so far.
In the first phase I looked more closely at the behavioral dynamics of editor cohorts, at both normal and high edit levels. I also explored different ways to graph the data, from simple line graphs to complex correlation matrices.
Methods and activities
How have you set up your project, and what work has been completed so far?
Describe how you've set up your experiment or pilot, sharing your key focuses so far and including links to any background research or past learning that has guided your decisions. List and describe the activities you've undertaken as part of your project to this point.
The current focus is on editor and article cohorts, in order to better understand and quantify measures such as editor longevity. The prior work can be found at Research:Editor Behaviour Analysis & Graphs.
- The scripts that generate the graphs used to run on the public database at toollab. The data has been copied over to a Hadoop instance and the scripts have been ported to run on it.
- A lot of different graphs have been built that look at the activity in a cohort in detail.
What are the results of your project or any experiments you’ve worked on so far?
Please discuss anything you have created or changed (organized, built, grown, etc) as a result of your project to date.
- Article Creators - Who creates articles, and when? Do newcomers and experienced editors contribute equally?
- Bytes Added - Who adds the content in a month?
- The longevity graphs are now weighted to make them more accurate.
- Graphs that look at the relationship between the number of edit sessions of an editor and the articles that have been edited by the editor.
- Similar graphs have been generated for editors with very high levels of editing.
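As an illustration of the edit-session graphs listed above, here is a minimal Python sketch of how sessions and edited articles could be counted per editor. It is not the project's actual script. The report does not define an edit session, so the one-hour inactivity cutoff, the function name `sessions_and_articles`, and the `(editor, article, timestamp)` input layout are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def sessions_and_articles(edits, gap=timedelta(hours=1)):
    """Return {editor: (session_count, distinct_article_count)}.

    edits: iterable of (editor, article, timestamp) tuples (assumed layout).
    gap:   inactivity cutoff that starts a new session (assumed: one hour).
    """
    by_editor = defaultdict(list)
    for editor, article, ts in edits:
        by_editor[editor].append((ts, article))

    summary = {}
    for editor, rows in by_editor.items():
        rows.sort(key=lambda row: row[0])  # order each editor's edits by time
        sessions = 1  # the first edit always opens a session
        for (previous, _), (current, _) in zip(rows, rows[1:]):
            if current - previous > gap:
                sessions += 1  # a long pause starts a new session
        distinct_articles = len({article for _, article in rows})
        summary[editor] = (sessions, distinct_articles)
    return summary
```

Plotting the two numbers in each tuple against each other, one point per editor, gives a scatter of sessions versus articles. Changing `gap` shows how sensitive the picture is to the session definition.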
Please take some time to update the table in your project finances page. Check that you’ve listed all approved and actual expenditures as instructed. If there are differences between the planned and actual use of funds, please use the column provided there to explain them.
Then, answer the following question here: Have you spent your funds according to plan so far? Please briefly describe any major changes to budget or expenditures that you anticipate for the second half of your project.
The funds are being spent as per the plan and no changes are anticipated.
The best thing about trying something new is that you learn from it. We want to follow in your footsteps and learn along with you, and we want to know that you are taking enough risks to learn something really interesting! Please use the below sections to describe what is working and what you plan to change for the second half of your project.
What are the challenges
What challenges or obstacles have you encountered? What will you do differently going forward? Please list these as short bullet points.
- Handling large amounts of data, such as the entire edit history of the English Wikipedia, can be quite a challenge. Running scripts on it directly in the database can take a very long time.
- The time-consuming scripts, especially the ones related to the English Wikipedia, have been moved to a Hadoop cluster.
- Analyzing and interpreting all the generated graphs is also quite a challenge. For example, there are longevity graphs at different levels of edit activity (5+ edits/month, 100+ edits/month, etc.), and doing the same for the other big wikis results in about 20 different graphs. The same is now being done with other metrics, and all of it is repeated for the article cohorts. This results in a huge number of graphs that all have to be analyzed manually.
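To make the activity thresholds above concrete, here is a minimal Python sketch of how the data behind one longevity graph could be computed. It is not the project's Hadoop script. The function name `cohort_activity`, the `(editor, timestamp)` input layout, and the definition of a cohort as the calendar month of an editor's first edit are assumptions made for this example, and the weighting mentioned in the results is not modelled.

```python
from collections import Counter, defaultdict
from datetime import datetime


def month_index(ts):
    """Map a timestamp to a running month number so months can be subtracted."""
    return ts.year * 12 + ts.month - 1


def cohort_activity(edits, min_edits=5):
    """Return {(year, month) of first edit: {months_since_first_edit: active_editors}}.

    edits:     iterable of (editor, timestamp) tuples (assumed layout).
    min_edits: edits an editor needs in a month to count as active,
               for example 5 or 100 as in the thresholds above.
    """
    monthly_edits = Counter()  # (editor, month) -> number of edits
    first_month = {}           # editor -> month of first edit
    for editor, ts in edits:
        month = month_index(ts)
        monthly_edits[(editor, month)] += 1
        if editor not in first_month or month < first_month[editor]:
            first_month[editor] = month

    result = defaultdict(Counter)
    for (editor, month), count in monthly_edits.items():
        if count >= min_edits:
            cohort = first_month[editor]
            key = (cohort // 12, cohort % 12 + 1)  # back to (year, month)
            result[key][month - cohort] += 1
    return {cohort: dict(counts) for cohort, counts in result.items()}
```

Calling the function once per threshold (for example `min_edits=5` and `min_edits=100`) and once per wiki yields one data series per longevity graph, which is why the number of graphs multiplies so quickly.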
What is working well
What have you found works best so far? To help spread successful strategies so that they can be of use to others in the movement, rather than writing lots of text here, we'd like you to share your findings in the form of a link to a learning pattern.
- I got a lot of ideas and help from the research folks at the Foundation, especially Aaron and Jonathan.
Next steps and opportunities
What are the next steps and opportunities you’ll be focusing on for the second half of your project? Please list these as short bullet points. If you're considering applying for a 6-month renewal of this IEG at the end of your project, please also mention this here.
- The second half of the project will have more focus on the edit activity of articles.
- How often are the articles getting edited?
- Who edits the articles, etc.?
- There will also be a greater focus on sharing the results and the data that has been generated so far.
- Blogging about the results, creating a website showing the results interactively, etc.
- There will also be a focus on building scripts to keep the graphs up to date.
We’d love to hear any thoughts you have on how the experience of being an IEGrantee has been so far. What is one thing that surprised you, or that you particularly enjoyed from the past 3 months?
Working on a research project is always a roller coaster ride, but it's been fun so far. One research question always leads to another, so I'm never short of questions to answer.