Big data is big news for journalists
Big data is big business.
Management professionals have known this for some time; a report published by the management-consulting firm McKinsey & Company in May notes that “the amount of data in our world has been exploding, and analyzing large data sets–so-called big data–will become a key basis of competition.” The need to engage with large sets of data is growing all the time for leaders in all sectors, the report claims, as the rise of new business practices, multimedia and social media means that more data than ever is being recorded.
This has already proved to be a major opportunity for journalists. Take the news application Dollars for Docs. This feature from ProPublica allows users in the US to search for their local doctor to find out whether he or she has accepted payment from drug companies. The journalists who built the app assembled payment disclosures from 12 companies and pulled them into a single, easy-to-use database.
The results were significant. Scott Klein, editor of news applications at ProPublica, told WAN-IFRA earlier this year: “the search your doctor feature on Dollars For Docs is the single most popular feature on ProPublica ever: bigger than any story, bigger than any other news app.”
Writing for the Nieman Journalism Lab, Amy Webb, CEO of Webbmedia, names ‘big data’ as the first of her predictions for journalism trends that will take off in 2012.
“We’re recording our daily activity with BodyMedia arm bands and syncing our biometrics with our Android phones. Hacker-journalists are converting huge datasets for use by everyday newsroom reporters. Hyper-creative data visualization teams, such as JESS3, are creating stunning charts and graphs appealing to the non-geeky set,” she writes. And this is on top of all the data collected by both the public and private sectors that is easily accessible to journalists.
But how can journalists approach large sets of information? The McKinsey report predicts: “there will be a shortage of talent necessary for organizations to take advantage of big data”. This is a challenge that will face journalists too. However, there are tools and initiatives to help them out. Last month we wrote about the Data Journalism Handbook, put together at the Mozilla Festival in London as part of a project co-ordinated by the European Journalism Centre and the Open Knowledge Foundation. The handbook was written with the aim of “teaching the world how to work with data”.
More recently, Google has released a tool called Refine for cleaning up ‘dirty’ data, i.e. data sets that contain inaccuracies or that have been put together inconsistently. In an article in Poynter about the tool, Matt Wynn writes that “dirty data is a constant thorn in the sides of data journalists” but that Refine “does not disappoint” as a tool to clean it up and make it more manageable.
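To give a flavour of what cleaning inconsistent data involves, here is a minimal Python sketch that groups different spellings of the same name under one normalized key. It is our own illustration of the general idea, not Refine's code, and the company names in it are hypothetical.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a messy string to a normalized key for comparison."""
    # Strip accents, trim surrounding whitespace and lowercase the text
    text = unicodedata.normalize("NFKD", value)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.strip().lower()
    # Drop punctuation, then sort the unique words so word order is ignored
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values that share a key; keep only groups with several spellings."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return {key: names for key, names in groups.items() if len(set(names)) > 1}


# Hypothetical entries, as they might appear in an inconsistently typed column
records = [
    "Acme Pharma, Inc.",
    "acme pharma inc",
    "Inc. Acme Pharma ",
    "Globex Labs",
    "Globex  Labs.",
    "Initech",
]

clusters = cluster(records)
for key, names in clusters.items():
    print(key, "<-", names)
```

Each group it prints is a candidate for merging into one consistent spelling. A reporter would still want to check the groups by hand, since two different organisations can end up with the same key.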
So while big data might still present a challenge for journalists, we’ve already got some of the equipment to overcome it.