IBM History flow project

In February 2005, +sj+ visited the IBM group that did the history flow study, and spent a while talking with Martin Wattenberg about the group's interest in wikis and some of its ongoing projects.

They have packaged up a history flow tool that they are planning to release sometime in March, for anyone to use. This should help people extract statistics from MediaWiki databases, and generate their own images. Sadly, the code will not [initially] be open-source.

One upshot of the meeting was that the group is very excited about having us use history flow to generate regular metadata about articles. They would love to help us overcome any related development bottlenecks.

They also encourage the idea of people building useful tools on top of this: for instance, something that uses their sentence-by-sentence diff engine to estimate how old each sentence of an article is, or what percentage of the article was contributed by each user in its history. I think it would be fun to work out a way to generate weekly stats with such a tool, even if it had to be done on an off-site machine from the latest dump.
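To make the sentence-age idea concrete, here is a rough sketch in Python. It does not use IBM's diff engine (which is not available to us yet); it substitutes the standard library's `difflib` and a naive regex sentence splitter. The function names and the `(user, timestamp, text)` revision format are assumptions made up for illustration, not anything from the history flow tool or the MediaWiki schema.

```python
import re
from collections import Counter
from difflib import SequenceMatcher


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def attribute_sentences(revisions):
    """Attribute each sentence of the latest revision to the revision that introduced it.

    revisions: list of (user, timestamp, text) tuples, oldest first (hypothetical format).
    Returns a list of (sentence, user, timestamp) for the latest revision.
    """
    prev = []  # attributed sentences of the previous revision
    for user, timestamp, text in revisions:
        sentences = split_sentences(text)
        # By default, every sentence is credited to the current editor.
        current = [(s, user, timestamp) for s in sentences]
        matcher = SequenceMatcher(a=[p[0] for p in prev], b=sentences, autojunk=False)
        # Sentences that survive unchanged keep their original attribution.
        for block in matcher.get_matching_blocks():
            for k in range(block.size):
                current[block.b + k] = prev[block.a + k]
        prev = current
    return prev


def share_by_user(attributed):
    # Percentage of the article (by character count) credited to each user.
    counts = Counter()
    for sentence, user, _ in attributed:
        counts[user] += len(sentence)
    total = sum(counts.values())
    return {u: 100.0 * n / total for u, n in counts.items()} if total else {}


if __name__ == "__main__":
    history = [
        ("alice", "2005-01-01", "Cats are animals. They purr."),
        ("bob", "2005-02-01", "Cats are animals. Dogs bark. They purr."),
    ]
    for sentence, user, timestamp in attribute_sentences(history):
        print(timestamp, user, sentence)
    print(share_by_user(attribute_sentences(history)))
```

A sentence that is edited even slightly counts as new here and is credited to whoever changed it, so a real tool would want fuzzier matching. For weekly stats, something like this could be run over the revision texts extracted from the latest dump.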

Jack Lutz:

> I emailed Martin Wattenberg about using an image in their paper and did
> not receive a response...  The requested image was:
> http://researchweb.watson.ibm.com/history/images/capitalism_group.gif

sj: I asked MW about this again, thinking any copyright release would involve forms filled out in triplicate and would generally be hard to get. The response:

[L]et me check and get back to you, it probably is not a big issue, believe it or not, but I need to make a call first.

Could you generate history for a different Wikipedia page to avoid using the image in the copyrighted paper? Tom Brown 21:05, 6 Jun 2005 (UTC)

External links

IBM Research: History Flow