On Biased Samples and Quality Measures

Check out the top stories I noticed today on my “NPR Most Emailed” module on My Yahoo (third and fourth stories from the top).

Email stories about email? Turns out that people who email NPR stories have an interest in email. Who would have guessed?! And what does it say about “most emailed” as a true measure of quality or interest?

A broader question: what’s the news story equivalent of “interestingness”? Is “most views/comments” better than “most emailed”? It’s definitely not “most Digged” [shivers]. This is one kind of wisdom that I don’t think we have quite figured out how to extract from the crowds in a few-to-many publishing system.

I have a hunch that in many-to-many systems like Flickr, Delicious and Slideshare, where (theoretically) the same people who create the content are also the viewers, there is a link from the authors’ motivations for participating to their actions on the site. Therefore, we get better measures of quality from activity. Why exactly is that the case? Maybe because there are many different avenues of feedback (comments, favs, views, links). Rashmi of Slideshare talked about this when she visited us at Y!RB a long time ago. But despite Surowiecki, I don’t think we yet have a good general understanding of how to derive quality from mass participation, only several successful attempts. Maybe personalization should be more rooted in these quality measures? Only Ayman knows.

I promised to write more about the Brooklyn Museum – that forthcoming post will certainly be related to this question…
