Data Data Blah Blah – It Ain’t That Easy
March 27, 2012
Like most investors, I’m hearing more and more entrepreneurs tell me that they’re very excited to have a portion of their not-yet-proven business model be a “data play.” It usually goes something like this:
“We’re going to have thousands (or millions) of customers on our platform, interacting and doing X again and again, and that’s going to lead us to capturing a database of customers and their activity in this marketplace that we’ll be able to sell access to and make boatloads of money.”
Sorry, not nearly good enough.
This is indeed the era of Big Data. We are simultaneously witnessing the massive proliferation of smart devices creating a tidal wave of new data collection, and the maturation of cloud computing and virtualization, combined with the steady advances of Moore’s Law, making the tackling of computational challenges that would have seemed unfathomable just a decade ago suddenly feasible and cost-effective.
We’re generating and collecting radically more data than ever before. One of my favorite statistics: 90% of all of the data that exists in the world has been created in the last two years. If that data had been created at an even pace, roughly 10% of all the world’s data would have been created since New Year’s Day this year. Try to get your head around that one! Fortunately, we have computing platforms and paradigms that give us a chance at actually doing something with all this stuff.
One of the great things about the Big Data revolution, and the dawn of powerful startups tackling massive data sets in healthcare, finance, search, retail, etc., is that it has made almost every entrepreneur and technology executive aware of the opportunities presented by data. (If you want to stay up on big data and the startup world, read Roger Ehrenberg, who started a firm built entirely around a data thesis.) In our portfolio, Ticketfly, for one, understands that it is building a powerful and insight-rich database of consumer music tastes and purchase behavior. It is working with partners to marry its data to other datasets to build even richer profiles of consumers, all of which will help Ticketfly help its venue customers sell more tickets, and help consumers get better information about the music they love and not miss opportunities to see and hear it.
The opportunities presented by all of this data are, indeed, important. But entrepreneurs need to be careful not to get seduced by a blanket promise of big data without carefully contemplating the development of a true data strategy. Simply generating massive amounts of data is nowhere near enough to build a data-driven business model. Some of the biggest and best companies in the world are struggling with data strategies – developing algorithms that can wrestle insights out of raw data and being thoughtful about the costs of processing, the challenges of privacy, and designing data insights and interfaces that customers will actually use.
This last challenge is a critical one – companies that are going to generate meaningful revenue from data strategies need to ensure that they are developing data and insights that their would-be customers are going to be able to use. A powerful illustration of this challenge comes from the grocery industry. For almost 20 years grocers have been collecting loyalty program data that captures incredibly rich detail about consumer purchase behavior – they can answer “Who bought what and when?” for the majority of their customers. This data could in theory be used to help tailor offers to consumers, to shape store and chain-level merchandising strategy, and to provide valuable insights to the Krafts, Kellogg’s, and P&Gs of the world – companies that sell billions of dollars of groceries per year but have limited access to data on who actually purchases their products.
Yet while this POS data has been collected for well over a decade, and the data has been aggregated and stored, grocers have only just begun to scratch the surface of the value of that data. They didn’t have the systems, the computing power, the database structures or the organizational mindset to enable mining the billions of pieces of data they were collecting. They were sitting on gold, but they had neither the know-how nor the resources to dig for it. And just having the data was nowhere near enough.
Last week I had a conversation with some folks from the Amazon Web Services group. This business, which now powers most of the web companies in our portfolio, and most web startups around the world, has had an amazing impact on the internet landscape by simplifying and decreasing the costs of launching a web service.
When you think about it, the incredible market position of AWS – which powers not just the lion’s share of the startup world but also massive web properties like Zynga and Netflix and even the online presence of some banks – gives Amazon unparalleled insight into aggregate-level consumer behavior on the internet and, increasingly, on mobile devices.
As the folks who own and run the servers that sit behind all of these businesses, they have massively valuable, macro-level data on where traffic is coming from, what devices are using it and when, etc. They are sitting on a data set that is likely far more valuable than the web usage data collected by Comscore or Compete.
But like the grocers, Amazon has done nothing to monetize this treasure trove. I’m sure they use the data internally to guide their own planning, but they haven’t yet even made the data available to their AWS sales team to help inform their discussions with prospective customers. Think about how valuable aggregate data on things like iOS vs. Android mobile traffic would be.
While they’re sitting on pure gold, it’s just not that simple. For one thing, they’re not structurally oriented towards utilizing that data. They’d have to build an organization to mine, manage, and package the data. But they also have considerable privacy concerns – folks like Netflix are happy to be able to run their business on AWS, but do they want their usage data, even aggregated with others, to be shared? It’s not a straightforward question.
The point is this – when you’ve got both Amazon and the entire grocery business failing to optimize their use of the data they generate in their businesses, it points pretty strongly to the challenges of building and implementing an effective data strategy.
As an entrepreneur, when thinking about your business, just be careful what you’re pitching and promising. If you’re committing to a data strategy as an important near- or medium-term revenue source, then make sure you know what you’re talking about and have thought carefully through the array of issues that could get in your way. Data can’t just be the convenient, hand-waving answer to how you’re going to monetize some cool but otherwise revenue-free consumer application.
On the other hand, if you don’t think of yourself as a “data first” business, be sure to give some thought as to where your business model might be creating or unearthing unique data. Even if you’re not sure how you’ll use it, starting to think today about how you collect it and structure it could open up important opportunities down the road. Try Roger’s post on paradigms for creating competitive advantage through data. It’s a great framework for starting your thinking.