The NYT turns to Crowd Science

Some years ago, what is now called Zooniverse had the great idea to involve the crowd in scientific endeavours. It started with the classification of galaxies before moving on to other topics such as the classification of cyclones, the identification of Antarctic penguins and the coding of weather from shipping logs. This was pure science and has been immensely helpful in providing clean, big data sets to scientists. As Michael Nielsen noted in his great book, it has also generated new scientific discoveries by non-scientists.

Today, the New York Times has taken a leaf from Zooniverse’s book and launched its own crowd data classification project, Madison. This one asks people to identify ads, tag them and even transcribe them, all from old New York Times newspapers. It is apparently the first of several projects that will launch on a platform it has developed called Hive.

But there is a wrinkle here: the New York Times is a commercial company. It may be that this is purely a historical exercise. What they say is “Your contributions will aid researchers and projects both inside and outside of The New York Times for years to come.” But at the same time, this work could all be used for a commercial product. Indeed, maybe it will be used to power the New York Times’ archival search, which is a paid product. It is hard to tell. I’m not necessarily saying that there is anything wrong with that, but if you are asking for “contributions” there is surely an obligation to spell all that out.

What is also disappointing is that the New York Times has an opportunity to do more. One possibility is that if you do enough “contributing” you get a few free months of subscription to the New York Times. You might worry about quality when there is an extrinsic reward, but because this is a crowd project, each data point will be classified several times by different people (at least it should be). So you can reward contributions in terms of verified quality.
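
To make that concrete, here is a rough sketch of what rewarding “verified quality” could look like. I have no idea how Hive actually works, so everything here is my own assumption: the function names, the idea of scoring a contributor by agreement with the strict majority tag, and the thresholds (at least 3 votes per item, at least 10 judged contributions, an 80% agreement rate) are all made up for illustration.

```python
from collections import Counter, defaultdict

def verified_quality(labels, min_votes=3):
    """Score contributors by agreement with the crowd.

    labels: list of (contributor, item, tag) tuples (a hypothetical format).
    Returns {contributor: (agreed, judged)}, counting only items that
    received at least min_votes tags and have a strict majority tag.
    """
    by_item = defaultdict(list)
    for contributor, item, tag in labels:
        by_item[item].append((contributor, tag))

    scores = defaultdict(lambda: [0, 0])
    for votes in by_item.values():
        if len(votes) < min_votes:
            continue  # not enough independent classifications yet
        top_tag, top_count = Counter(tag for _, tag in votes).most_common(1)[0]
        if top_count * 2 <= len(votes):
            continue  # no strict majority, so nothing to verify against
        for contributor, tag in votes:
            scores[contributor][1] += 1
            if tag == top_tag:
                scores[contributor][0] += 1
    return {c: tuple(s) for c, s in scores.items()}

def earns_reward(agreed, judged, min_judged=10, min_rate=0.8):
    """Hypothetical rule: enough verified work at a high enough agreement rate."""
    return judged >= min_judged and agreed / judged >= min_rate
```

The point of the design is that volume alone earns nothing: a contribution only counts once other people have independently classified the same item, and it only counts in your favour if you agreed with them.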

I played around with it for a bit and classified about 10 things. That was fun. But when it came to transcribing, I got a ‘big one’ and decided I had had enough. Still, if everyone does 10 …
