Monday, November 14, 2011

The Big Business of Big Data

GE Appliances & Lighting, via Business Wire. A GE data center.

Is Big Data a Bubble?

In case you're in a hurry: Of course it is. And that is good.

Longer version: Last week there were several events that convinced me that one of the great tech bubbles inflating right now is around what people have agreed to call Big Data. Basically the term reflects the fact that it's now so easy to digitize and put on the Internet all kinds of information (things as diverse as the measurements of passive sensors, most or all of the world's books, 200 million tweets a day and most of the world's significant financial transactions) that the data is growing enormously.

What Big Data is really about, however, is the benefits we will gain by cleverly sifting through it to find and exploit new patterns and relationships. You see it now in things like Facebook ads, which are put in front of you because the posts you have read and contributed to (which Facebook's algorithms get to examine as the price of this free service) indicate you might be ready to buy the advertised good.

Other companies look at air and soil data to write insurance about crop production. Further out, people want to seek patterns in raw medical data for possible causes and cures for disease, bypassing much of the old hypothesis-experiment model; an article in Wired tells how the Google co-founder Sergey Brin used this approach in Parkinson's research.

Last week's gathering of the tech tribes, the Web 2.0 conference, focused heavily on the benefits of the ubiquity of Big Data: ad placement at Google, Coca-Cola vending machines that develop a personal relationship with the buyer, or what Facebook algorithms are doing to the cultivation of our souls. Microsoft held a one-hour session for developers on all the big, reliable databases it would offer them to make new products.

Sometimes there were overreaching conclusions.

In a memorable 10 minutes, Alex Rampell, the chief executive of TrialPay, made a case that credit card companies should not charge their 2 percent fees on a transaction, since the value of the transaction isn't in the fees, it's in the data that is generated. When you know what someone has purchased, you can make a case for what ad to put in front of them next. Citing Amazon.com's relentless upselling approach ("people who bought X also bought Y"), Mr. Rampell said, "There's an Amazon.com for everything. It's called Visa, it's called American Express."

Mr. Rampell may be right, but there was no proof in his admittedly brief talk that this is actually true. Is it really easier and better to move a 2 percent business, with relatively fixed costs of technology and insurance, over to a much more variable ad-based business? If all advertising heads toward this model, and we don't purchase particularly more stuff, doesn't the value of the technology start to diminish, and simply turn from a competitive edge into a must-have?

This is not to pick on TrialPay, but to point up a common problem in the Big Data proposition: Often people won't know exactly what hidden pattern they are looking for, or what the value they extract may be, and therefore it will be impossible to know how much to invest in the technology. Odds are that the initial benefits, as they did with Google's AdWords algorithm, will lead to a frenzy of investments and marketing pitches, until we find the logical limits of the technology. It will be the place just before everybody loses their shirts.

This is a common characteristic of technology that its champions do not like to talk about, but it is why we have so many bubbles in this industry. Technologists build or discover something great, like railroads or radio or the Internet. The change is so important, often world-changing, that it is hard to value, so people overshoot toward the infinite. When it turns out to be merely huge, there is a crash, in railroad bonds, or RCA stock, or Pets.com. Perhaps Big Data is next, on its way to changing the world.

Another Web 2.0 speaker was Josh James, who founded Omniture, a Web click-tracking and ad placement service that is now part of Adobe's Big Data play. Mr. James, a somewhat pragmatic Mormon who lives in Utah, far from Silicon Valley, has started a company called Domo. Rather than search for new patterns in the big piles of data, Domo will focus on delivering to a top executive simple existing data, like how large a bank's deposits are on a given day, or how many employees a company has, that are still hard to locate. "Everyone is saying that the team with the best data analysts will win," he said.

"We have all the data we need. The focus ought to be on good design, and telling the vendors the simple things you really need to see."

Big Data is clearly big business, adding a new level of certainty to business decisions, and promoting new discoveries about nature and society. That is why over the past two years I.B.M., E.M.C. and Hewlett-Packard have collectively invested billions of dollars in the field. This past week, Oracle bought Endeca, a company whose software manages and searches through large volumes of things like e-mail, for a rumored $750 million. H.P. paid $10.3 billion for Autonomy, which does a much bigger version of the same thing. The first H.P. products with Autonomy technology, along with pattern-finding algorithms from an outfit called Vertica, which H.P. bought earlier this year, will most likely be out next month.

There are countless data-mining start-ups in the field, built around technologies like MapReduce and NoSQL for managing the stuff, and the open-source R statistical programming language for making predictions about what is likely to happen next, based on what has happened before. Established companies in the business, like SAS Institute or SAP, will probably purchase or make alliances with a lot of these smaller companies.

Expect to see a lot more before it all gets sorted out.
