Trust and crowd-sourced applications
One of the topics to which Thomas Vander Wal and I turned our attention during yesterday’s podcast was that of ‘trust’. Specifically, we were talking about the complex multi-dimensional relationships at play in enabling one to fully leverage the power of a folksonomy.
To take a simple example, a student just embarking upon a new course of study might choose to treat tags (and other assertions) placed on a resource by their tutor differently from tags placed by their peers. That is not to say that the tutor is ‘right’, and the students ‘wrong’; simply that a sensible evaluation of the whole might very well lead the student to give some weight to the thoughts of someone more experienced in the field.
This very quickly becomes complex, of course, as an ‘expert’ in one area is not necessarily also expert in another field. We need easy ways to assess ‘expertise’ and ‘authority’; ways that may have no relation to such traditional concepts as tenure, seniority and rank.
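To make the tutor-and-peers example a little more concrete, here is a minimal sketch of how tags on a single resource might be weighted by who applied them, and in which field. Everything in it is an assumption made for illustration: the function name `weighted_tag_scores`, the per-field `authority` table, the weights themselves, and the neutral default of 1.0 for anyone not listed.

```python
from collections import defaultdict


def weighted_tag_scores(assertions, authority, field):
    """Sum per-tag weights for one resource.

    assertions: list of (tagger, tag) pairs placed on the resource
    authority:  dict mapping (tagger, field) -> weight (hypothetical values)
    field:      the subject area in which the reader is evaluating the tags
    """
    scores = defaultdict(float)
    for tagger, tag in assertions:
        # Assumption: a tagger with no recorded authority in this field
        # counts with a neutral weight of 1.0.
        scores[tag] += authority.get((tagger, field), 1.0)
    return dict(scores)


assertions = [
    ("tutor", "ontology"),
    ("peer_a", "taxonomy"),
    ("peer_b", "taxonomy"),
    ("peer_c", "ontology"),
]

# The tutor carries extra weight in one field only.
authority = {("tutor", "semantic-web"): 3.0}

print(weighted_tag_scores(assertions, authority, "semantic-web"))
print(weighted_tag_scores(assertions, authority, "cookery"))
```

Keying the weights on both tagger and field is the point of the sketch: the same tutor counts for more in one subject and for no more than anyone else in another.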
I was therefore interested when Paul Gearon, with whom I recorded a further podcast this evening, independently sent me a link to some work from the University of California, Santa Cruz: their Wikipedia trust coloring demo.
“In this demo, the text background of Wikipedia articles is colored according to a value of trust, computed from the reputation of the authors who contributed the text, as well as those who edited the text.”
“In order to compute text trust, we first compute the reputation of all Wikipedia authors at all points in time. The goal is to be able to answer all questions of the kind ‘at 7:04 am UTC on Jan 23, 2006, what was the reputation of the user with ID 3546?’. See below for the computation of author reputation.
Once the reputation values for all authors for all times are available, we compute the trust of each word of each revision. We compute the trust value of each word of a revision according to the reputation of the original author of the word, as well as to the reputation of any authors that have edited the page, especially if the edit is in the proximity of the word. We are still fine-tuning the algorithms, which will be described in a forthcoming publication.”
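The demo's authors say their algorithms are still being tuned and have yet to be published, so what follows is emphatically not their method. It is only a toy sketch of the general idea in the passage above: a word starts with the reputation of its original author, and a later edit by someone else can shift that trust, more strongly when the edit is close to the word. The function names, the linear fall-off with distance, the `reach` and `gain` parameters, and the choice never to lower trust are all my assumptions.

```python
def initial_trust(author_reputation):
    """A new word starts with its original author's reputation."""
    return author_reputation


def update_trust(trust, editor_reputation, distance, reach=10, gain=0.5):
    """Nudge a surviving word's trust toward a later editor's reputation.

    distance: number of words between this word and the edit
    reach:    distance at which an edit stops having any effect (assumption)
    gain:     how far trust moves toward the editor's reputation (assumption)
    """
    # Assumption: influence falls off linearly with distance from the edit.
    proximity = max(0.0, 1.0 - distance / reach)
    # Assumption: an editor with lower reputation never lowers a word's trust.
    if editor_reputation <= trust:
        return trust
    return trust + gain * proximity * (editor_reputation - trust)


# A word written by a low-reputation author, then left in place by a
# higher-reputation editor working at various distances from it.
word_trust = initial_trust(0.2)
for distance in (0, 5, 20):
    print(distance, update_trust(word_trust, editor_reputation=0.8, distance=distance))
```

In this sketch an edit right next to the word raises its trust most, and an edit 20 words away leaves it untouched.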
Quoting from the abstract of a paper (which I didn’t manage to see) they presented at the WWW2007 conference in Banff:
“We present a content-driven reputation system for Wikipedia authors. In our system, authors gain reputation when the edits they perform to Wikipedia articles are preserved by subsequent authors, and they lose reputation when their edits are rolled back or undone in short order. Thus, author reputation is computed solely on the basis of content evolution; user-to-user comments or ratings are not used. The author reputation we compute could be used to flag new contributions from low-reputation authors, or it could be used to allow only authors with high reputation to contribute to controversial or critical pages. A reputation system for the Wikipedia could also provide an incentive for high-quality contributions.
We have implemented the proposed system, and we have used it to analyze the entire Italian and French Wikipedias, consisting of a total of 691,551 pages and 5,587,523 revisions. Our results show that our notion of reputation has good predictive value: changes performed by low-reputation authors have a significantly larger than average probability of having poor quality, as judged by human observers, and of being later undone, as measured by our algorithms.”
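The abstract gives the rule of thumb but not the arithmetic, so here is a deliberately simple sketch of a content-driven reputation score: an author gains when an edit survives and loses when it is quickly undone, with no user-to-user ratings involved. The names `update_reputation` and `needs_review`, the `reward` and `penalty` sizes, the floor at zero and the review threshold are assumptions of mine, not values from the paper.

```python
def update_reputation(reputation, author, preserved, reward=1.0, penalty=2.0):
    """Adjust an author's reputation after the fate of one edit is known.

    reputation: dict mapping author -> current score (mutated in place)
    preserved:  True if later authors kept the edit, False if it was
                rolled back or undone in short order
    """
    current = reputation.get(author, 0.0)
    delta = reward if preserved else -penalty
    # Assumption: reputation never drops below zero.
    reputation[author] = max(0.0, current + delta)
    return reputation[author]


def needs_review(reputation, author, threshold=1.0):
    """Flag contributions from authors whose reputation is below a threshold."""
    return reputation.get(author, 0.0) < threshold


reputation = {}
for preserved in (True, True, False):
    update_reputation(reputation, "alice", preserved)
for preserved in (True, True, True):
    update_reputation(reputation, "bob", preserved)

print(reputation)
print(needs_review(reputation, "alice"), needs_review(reputation, "bob"))
```

The `needs_review` check corresponds to the first use the abstract suggests, flagging new contributions from low-reputation authors; restricting who may edit critical pages would be a similar threshold test.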
It will be fascinating to see how this work pans out.



