Editor's Note: Luke Lonergan will be a speaker at the 2012 Fusion CEO-CIO Symposium on March 7, 2012. The session is "How Big Data and Analytics are Changing Organizations".
Luke Lonergan, co-founder and chief technology officer at Greenplum, wasn't exactly looking to start a big data analytics firm in 2002. At the time he was focused on helping companies with online transaction processing (OLTP).
"We had built technology focused on transactions, but then we began hearing from people who said they had reporting issues that were killing them. They had to find a way to leverage new data sources," said Lonergan.
Starting in 2006, Greenplum began to focus on analytics because that's what companies were concerned with.
"They wanted to get insights out of data that looked to be opaque," said Lonergan. "Our analytics techniques became very powerful and that set our direction. It placed Greenplum at the front of change in the way business is done."
"When we talk to customers, that is where we spend our time, working with people who are facing profound change in the way they engage with customers."
A second major challenge for Greenplum customers is changing their organizations to accommodate new patterns and new roles.
"The data scientist shortage is temporary," said Lonergan. "It's no longer new and now everyone thinks they need them. But there are many more people involved in this kind of work than data scientists. Organizations that leverage real-time data streams in the way they engage with customers will have to change the way they operate." He thinks a whole crew of people will have to work around a layer of analytics that responds to real-time data, making decisions on ads and offers as they watch the results flowing from online actions and transactions. That will require a change in an organization's infrastructure.
Business clients want analytics that will tell them what is happening with their customers, their competitors and the advertising landscape. They have to work across different channels of lead generation and analyze data that often comes in from new and different sources such as Twitter, Facebook and new ad mechanisms.
They want to get into those data sources for better leverage around customers. They are familiar with the ideas of what needs to be done, but not necessarily with how to do it. They may have done some experiments that indicate how powerful it would be for their revenue stream if they could get these data sources into their normal business flows.
He described people in IT as concerned about how to leverage their existing investments in technology to meet business demands.
"When we talk to enlightened IT organizations, they realize they have to make an investment in big data and storage, and storage is probably the number one expense because it grows as the line of business wants new sources of data. They want to know what storage requirements will be in the future."
Greenplum has answers for them; in 2010 it was acquired by EMC, the storage giant.
Usually the business users have a good idea of what data they are interested in mining when they meet with a Greenplum data scientist. Often they have a potential fountain of information and need to find a way to analyze it. Sometimes they have valuable log data but don't use it, and Greenplum can show them how.
And sometimes Greenplum data scientists can show them how not to approach big data. A European customer brought in Hadoop and hired a consultant to write the Java code to connect to it. After 18 months they called Greenplum.
"They had a mountain of code. They didn't understand it because it was too voluminous, and we couldn't understand it either," said Lonergan. "We see that with some customers; they get very excited about Hadoop and over-commit to a strategy without really understanding how to do it, and how to stay out of the Hadoop trap. People are starting to learn about big data, but putting it into a product is where they can get trapped."
Lonergan likes to compare Hadoop and some of the tools used with it to a box of knives left in a kindergarten classroom. The tools are powerful, but when used improperly they can become a bloody mess. Many of them were developed quickly to solve issues Internet companies faced, and the edges haven't been rounded off.
Companies which want to do what Internet companies have done will take the tools from Hadoop and find themselves drowning in source code because Hadoop requires writing a large number of tools from scratch, he added.
For companies that are used to infrastructure that doesn't require writing a lot of source code, that can lead to some very ugly results.