Big data is one of the signature issues of our time—and also one of the more poorly understood.
Executive Summary
Digital breadcrumbs that people leave behind from a range of activities can be mashed up and used to make powerful inferences about future behaviors, financial position and insurance risk, writes Deloitte Consulting's Chief Data Scientist James Guszcza, who also suggests that insurers will need to be at once innovative and socially responsible to win the tug-of-war over the use of behavioral data.
Discussions of the topic often are clouded by what I call the “two dogmas of big data.” The first is that “bigger is better.” This is the idea that aspects of data volume, variety and velocity are what make big data valuable. But measures of raw data size are at best an imperfect proxy for the amount of relevant, usable information the data can offer.
The second dogma might be paraphrased as “bigger is different.” This is the idea that beyond a certain scale—and with the help of powerful machine learning algorithms—big data can “speak for itself” without the need for traditional scientific methodology. (See, for example, “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete” by Chris Anderson, Wired magazine, June 23, 2008.) The story of the Google Flu Trends algorithm, deservedly a poster child for big data innovation, illustrates the problem with this idea. The algorithm is based on the clever idea that localized upticks in Internet searches involving certain keywords (such as “fever” or “cough”) likely are a leading indicator of flu outbreaks. While the algorithm worked well at first, it began overestimating flu outbreaks.