web analytics

3 Things Every Hadoop Startup Should Know: Mike Olson, Cloudera CEO

Mike Olson, the CEO of Cloudera, reports on the three things that every Hadoop startup should know. They are:

One: There’s great money to be made building apps that run on Hadoop, so please do that! We need that innovation to drive business adoption of the platform.

Two: Hiring is brutal. Your best bet is to hire great people and teach them Hadoop. Deep Hadoop internals skills are awfully thin on the ground.

Three: Cloudera loves you, and our Cloudera Connect program is how we show it, so check it out!

In other words: a well of opportunity, a talent crunch, and Cloudera loves you…

 

Facebook To Speed Up Biz Analytics Tool Insights To Report In Real-Time

I believe this sub-domain of Big Data is where much of the future’s wrangling and intense competition will take place. TechCrunch has an article on how Facebook (which, by the way, is one of the big Hadoop users) is using analytics tools in more or less real time to “glean” information about the activity on its site.

Facebook to Speed up Biz Analytics Tool Insights to Report in Real-Time is the article.

Facebook’s analytics tool Insights will soon begin showing Page performance data in real-time or near real-time rather than on average 48 hour delay, the company Facebook plans to announce at Wednesday’s Facebook Marketing Conference in New York City according to our sources.

and

Making real-time Insights data available through the API “will give Page owners an opportunity to see how their Page actually lives and breathes,” says Facebook analytics tool provider EdgeRank Checker‘s founder Chad Wittman

And that’s the brave new world…

How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did

Or how Big Data is the wild wild west, where the saloon owners knew more about the people who walked into their saloons than those people knew about themselves.

This article, How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did, explores a real incident that shows how Target, a marketing- and information-driven corporation, targets its customers (and, dare I say, keeps even its own front-line store managers in the dark).

So Target started sending coupons for baby items to customers according to their pregnancy scores. Duhigg shares an anecdote — so good that it sounds made up — that conveys how eerily accurate the targeting is. An angry man went into a Target outside of Minneapolis, demanding to talk to a manager:

“My daughter got this in the mail!” he said. “She’s still in high school, and you’re sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?”

The manager didn’t have any idea what the man was talking about. He looked at the mailer. Sure enough, it was addressed to the man’s daughter and contained advertisements for maternity clothing, nursery furniture and pictures of smiling infants. The manager apologized and then called a few days later to apologize again.

and

On the phone, though, the father was somewhat abashed. “I had a talk with my daughter,” he said. “It turns out there’s been some activities in my house I haven’t been completely aware of. She’s due in August. I owe you an apology.”

What Target discovered fairly quickly is that it creeped people out that the company knew about their pregnancies in advance.

“If we send someone a catalog and say, ‘Congratulations on your first child!’ and they’ve never told us they’re pregnant, that’s going to make some people uncomfortable,” Pole told me. “We are very conservative about compliance with all privacy laws. But even if you’re following the law, you can do things where people get queasy.”

So we’ve gone from POS (Point Of Sale) records to gleaning information about customers from the patterns they exhibit. Information has been so free in the past that it sounds like it is about to become a whole lot more expensive.

Though what I loved best about this article is the way that the Forbes writer included a picture of Andrew Pole of Target Corp (linked from LinkedIn) in the article itself. A bit of reverse information gleaning…

The Age of Big Data

The Age of Big Data is an article by Steve Lohr in the NY Times Sunday Review. Some of the key takeaways:

They help businesses make sense of an explosion of data — Web traffic and social network comments, as well as software and sensors that monitor shipments, suppliers and customers — to guide decisions, trim costs and lift sales.

And of course, something like this means jobs as well…

A report last year by the McKinsey Global Institute, the research arm of the consulting firm, projected that the United States needs 140,000 to 190,000 more workers with “deep analytical” expertise and 1.5 million more data-literate managers, whether retrained or hired.

As for the impact:

The story is similar in fields as varied as science and sports, advertising and public health — a drift toward data-driven discovery and decision-making. “It’s a revolution,” says Gary King, director of Harvard’s Institute for Quantitative Social Science. “We’re really just getting under way. But the march of quantification, made possible by enormous new sources of data, will sweep through academia, business and government. There is no area that is going to be untouched.”

and

Research by Professor Brynjolfsson and two other colleagues, published last year, suggests that data-guided management is spreading across corporate America and starting to pay off. They studied 179 large companies and found that those adopting “data-driven decision making” achieved productivity gains that were 5 percent to 6 percent higher than other factors could explain.

Finally, if you thought that our lives were going to get easier, be aware:

Big Data also supplies more raw material for statistical shenanigans and biased fact-finding excursions. It offers a high-tech twist on an old trick: I know the facts, now let’s find ’em. That is, says Rebecca Goldin, a mathematician at George Mason University, “one of the most pernicious uses of data.”

It’s like the wild wild west all over again.

Big Recognition for IBM Big Data

IBM’s Smarter Computing blog has a post titled Big Recognition for IBM Big Data. From the blog post:

IBM was among the select companies that Forrester invited to participate in The Forrester Wave™: Enterprise Hadoop Solutions, Q1 2012, (February 2, 2012). Technologies evaluated were IBM InfoSphere BigInsights (IBM’s Hadoop-based offering), and IBM Netezza Analytics. In this evaluation, IBM was placed in the Leaders category of the Wave and achieved the highest possible score in both the Strategy and Market Presence segments. In the third segment, Current Offering, IBM received the second highest score.

The Forrester report on the current players in the Big Data space can be downloaded from IBM’s site here.

How Hadoop is revolutionizing…

Business Intelligence and Data Analytics.

This presentation, How Apache Hadoop is Revolutionizing Business Intelligence and Data Analytics by Dr. Amr Awadallah (CTO at Cloudera), delves into how a business can fit Hadoop into its Business Intelligence (BI) and Data Analytics efforts.

The initial thesis was:

Pre-Hadoop (and Hadoop-like infrastructure) – BI applications access the data available in a data store, such as a database or a data warehouse, and produce actionable items from it. As time moves on, the data in the store gets archived and essentially disappears, or gets aggregated/reduced for offline storage.

The new anti-thesis is:

Post-Hadoop – The approach here is to have live data available at all times, in raw and/or processed form. The Hadoop approach is to take the application to the data: distributed data, with distributed applications acting on and exploring it.

The anti-thesis takes this form largely because, although data storage has become commoditized (and rather large), data pipes have widened and computation has become rather fast, neither computation nor pipes have expanded (and perhaps need not expand) as much as storage has. Moving the application to where the data sits is cheaper than moving the data. At the same time, it has become a human imperative to put out as much junk as possible, er... be more creative, and big data apps and their providers (Facebook, Google etc.) have followed suit.
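
To make the "take the application to the data" idea concrete, here is a minimal Python sketch. It is a toy and not Hadoop itself: the three in-memory lists stand in for storage nodes, and every name in it (PARTITIONS, count_pages, merge, data_to_app, app_to_data) is made up for illustration. It compares how many items have to travel over the "pipe" in each approach.

```python
from collections import Counter
from functools import reduce

# Hypothetical raw click logs, already split across three storage nodes.
PARTITIONS = [
    ["home", "cart", "home", "checkout"],
    ["home", "search", "search"],
    ["cart", "home", "checkout", "checkout", "search"],
]


def count_pages(records):
    """The 'application': counts page hits in whatever records it is given."""
    return Counter(records)


def merge(left, right):
    """Combine two per-node summaries into one."""
    return left + right


def data_to_app(partitions):
    """Thesis: pull every raw record to one place, then compute."""
    pulled = [record for part in partitions for record in part]
    return count_pages(pulled), len(pulled)


def app_to_data(partitions):
    """Anti-thesis: run count_pages on each node, ship only the summaries."""
    summaries = [count_pages(part) for part in partitions]
    shipped = sum(len(summary) for summary in summaries)
    return reduce(merge, summaries, Counter()), shipped


central, moved_central = data_to_app(PARTITIONS)
local, moved_local = app_to_data(PARTITIONS)

print("same answer:", central == local)
print("items moved, data to app:", moved_central)
print("items moved, app to data:", moved_local)
```

Both routes give the same counts, but the second one only moves per-node summaries. With a dozen records the saving is small; the summaries stay about the same size as the raw logs grow, which is where the approach would be expected to pay off.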

The synthesis – Yet to be.

But here’s a guess: Right-Compute and Right-Data. The premise of Big Compute and Big Data is that in the pile of horse manure, there must be a pony in there somewhere: a white stallion, to be sure. As many past Masters (Who is a Master? Think Sun Tzu, Newton) will tell you, the objective of dealing with Big Data (and is there any bigger Data out there than the human, natural and metaphysical world?) is to elicit the laws that underlie it.

Now in the past, we as the curious ones have depended on intuiting and hypothesizing about the Big Data out there. Today, it seems that we’re done with the hypothesizing and are jumping straight into letting the Data speak for itself.

Right-Compute and Right-Data is about getting back to the hypothesize-and-test scheme that has proven remarkably successful in our developmental journey.

Stay tuned…

The Coming Tech-led Boom

The Coming Tech-led Boom is a recent article published in the Wall Street Journal.

In January 2012, we sit again on the cusp of three grand technological transformations with the potential to rival that of the past century. All find their epicenters in America: big data, smart manufacturing and the wireless revolution.

Now, that’s what I call timing, because I’ve been staking out the ground on two of those technological transformations: Smarter Manufacturing (on my @ Supply Chain Management blog) and Big Data here on this blog. My views on Smarter Manufacturing are here.

As for Big Data, this is what the authors have to say:

Information technology has entered a big-data era. Processing power and data storage are virtually free. A hand-held device, the iPhone, has computing power that shames the 1970s-era IBM mainframe. The Internet is evolving into the "cloud"—a network of thousands of data centers any one of which makes a 1990 supercomputer look antediluvian. From social media to medical revolutions anchored in metadata analyses, wherein astronomical feats of data crunching enable heretofore unimaginable services and businesses, we are on the cusp of unimaginable new markets.

While much of this is true, does it sound like a prediction? To me, it sounds like the inevitable inference. Except that I see a different actualization of the potential of Big Data. While we’re at the point of Big Data storage, retrieval, analysis, manipulation etc., that is not the point of Big Data <anything>.

I see Big Data as a Resource like water or oil for example – a vast landscape to be discovered, molded and valued. That is the new economy – powered by a new resource altogether…