Saturday, April 4, 2015

Opportunities for tomorrow

I had a tough time building this post. On the one hand, I wanted to elaborate on the role each of the players plays, and on the other I wanted to build a framework for success - thank you, MBA, for one great skill. After extensive research over the past couple of weeks, let me start with a bold but controversial statement.

Bold controversial statement

In a perfect data-enabled world, perfect information will be available to all. This implies that all products and services will be reduced to the status of a commodity - a world that sounds very much like Marxism, communism or anti-opportunity.

I happened to watch the first of the "Divergent" movies, and the system of tribes there seemed to be the utopia towards which "data" would lead us: clusters of people with a natural affinity to prefer averages. Outliers, individualism, creativity and humanity would then be viewed as either divergent or plain-out subversive.

I sometimes love the play on words and do so for no particular reason - but here the point is that as we define precision, accuracy and experiences based on data, more opportunities open up, as opposed to throwing us into a downward spiral that needs a revolution to break. That said, let me further define the three terms that caught attention in my last bottom line.

Transparency

Open, honest and respectful is all I can aspire to in this world of deception, positioning and differentiation. Let us all agree that the world can be made a better place by being open and honest without being disrespectful, first to ourselves and then to our stakeholders.


In the context of a big data implementation, business leaders have to be honest about the culture and values they espouse - not merely verbalize them. Management has plenty of opportunities to pursue after democratizing information in their organizations. Think of big data as a mechanism for getting out of the mundane tasks of informing the organization of what happened, where we are headed and what is going wrong. You will have more time to deeply examine the industry, define better experiences for your customers and target prospects, none of which was possible earlier simply for lack of time. There will be more chances for innovation than ever before.

Curiosity

IT and Data Scientists have a vital role in the economy of the future - and it is not the elimination of traditional jobs such as that of a taxonomist. Your primary role is to give more roles the guidance and tools to up their game: to think more strategically and leverage information to multiply their impact on the organization. Giving guidance comes with a covenant of taking guidance too, from these roles as well as from business leaders - a curiosity about the mechanics behind the data.

I expect the world to go through a storm of datafication - call it IoT, wearables, automation, big data, cognitive computing or any number of upcoming terms. We have just taken baby steps into the world of datafication - literally, with Fitbit and now the Apple Watch. There are processes, habits, patterns, practices, beliefs and faith behind all the data that will be generated, and they require sustained human thought to decipher. This will open up opportunities like never before, along with the tools to build hypotheses, test them and act on them really fast.

Concerning wearables - I myself have gone from quantifying my activities, to running competitions in my mind, to competing rigorously on social media, to abandoning my device in a drawer because the experience got boring. I went from awareness through acquisition, adoption and evangelism to abandonment in a matter of months - a product lifecycle with an added death after the decline, one that many tech companies today are struggling to overcome.


You need specialists to win races; your role is to give them better tools so you can reach organizational objectives faster, better and with less effort. In your industry you might need taxonomists, process managers, marketers, sales planners, financial analysts, product managers, service designers, podiatry analysts and professions that don't exist today to actually use the data you capture, compile and surface to them, and to build products and services that add value to your business.

Scientific inquiry

Data Scientists are scientists who happen to work with data - in my opinion. Informing, educating and mentoring your business partners to adopt scientific inquiry will enhance your organization's ability to seize bigger and better opportunities. Take the AirBnB "Discovery" example in my earlier post, "Breaking the silence": data scientists could have collaborated closely with the product managers to reinvent the discovery feature and blown the world away with an experience that set a new bar for travel.

Such opportunities are not gone and done; hundreds appear every day from all walks of life, and building an infrastructure to act on them is a huge gift to mankind. You can start in your vocation or avocation - set up tests, experiment, learn from them and then build the data capture, collection and analytic models so that we all marvel at our creation!

Monday, February 23, 2015

Breaking the silence

Hadoop World San Jose 2015 helped me break my silence - a hiatus born of self-doubt. After a great and hectic experience learning about the successes achieved in the face of challenges, roadblocks, doubts and confusion, I feel invigorated and primed with optimism. The energy of Silicon Valley, the free flow of ideas, the honest articulation of problems and the candid questioning of assumptions have definitely had an inspiring effect on most folks at Hadoop World.

From the high fives of business executives and the wise nods of data scientists to the awe-filled exhilaration of IT implementers and managers, the variety of problems solved continues to grow. I am sure each attendee leaves with big data on the brain and will process the information over a period of months.

The Story

The story of the blind men exploring the elephant struck a resonant chord and succinctly summarizes my key takeaways from the conference.

I had heard the story as a child and had in fact told it to my son, with some variations of course, to illustrate the concept of “viewpoints”. A few wise blind men lived in a village together and experienced life together. Having heard about a majestic animal called the elephant, they discussed and debated the form of this beast. Not coming to a consensus on what an elephant is, they decided to explore one at a nearby temple. Each went to the temple separately with a sighted shepherd, prayed to the local deity, touched the elephant and returned home. One evening they gathered to discuss their findings and were bewildered. An elephant is “a huge fan”; “a sturdy tree trunk”; “a stout rope”; “a wall”; “a sharp spear”; and “a strong snake” - these were their real physical experiences of the majestic beast.

The Moral

Hadoop seems to induce a similar bafflement in business executives, IT managers, IT implementers and data scientists. The success stories, use cases and recommendations still fall short of explaining the utility of a beast that at its best promises to help you rule, and can trample you just as easily.

Under these circumstances, I believe there are opportunities and real benefits, though they are not clearly or easily available to any of the audiences. Big data is HARD and complicated, and it requires all the stakeholders to work collaboratively, mentoring each other to find the best problem to solve.

Making sense of it all requires everyone to agree that the fundamentals of science, management, mathematics or your functional area are not being disrupted by technology. Technology is merely enabling more possibilities through more options, and it is up to you all as a team to make the most of them for your organization or cause. I believe you don’t need data to answer all the questions - simply asking is sufficient in some cases.

The Examples

“Why are taxis not available during rains in Singapore?”  
Millions of Singapore dollars spent on a consulting company, technology, IT infrastructure and so on produced an answer that could easily have been deduced by asking a “Taxi driver” and a “Taxi operator”.

AirBnB figured out that their customers needed a “Discovery” feature by using an elaborate A/B testing experiment, eloquently summarized in a 40-minute session detailing the data, the model, the testing and the insights. The Q&A went on to further probe and clarify assumptions, accuracy, consistency and a host of mathematical terms. A good old-fashioned product manager asking the question of either a customer or a visitor who did not convert would have unearthed the same insight at a fraction of the effort.

Or a classic example from Microsoft on "Connected cows".

The true value of big data comes from scientific curiosity, logical data collection and analysis, building models, defining metrics for a solution and measuring success in the future. There is more than one way to solve a problem, and businesses look for the most cost-effective, actionable method - a concept that data scientists seem to neglect in the pursuit of rationalizing the value of data science. Data science can solve real problems; there is no question about it. But a truly collaborative approach that helps everyone learn can lead organizations to scale even greater heights.

The Players

Much was said about the roles involved in, influenced by and impacted by big data. I believe in the roles but differ on their value, hence the following guidance to my colleagues in the field.
  • The Business Managers should help define the question and continue to mentor IT and the data scientist in exploring the best, most cost-effective and speediest course to the answers.
  • The Technologist should own the requirements, be thorough and include both the data scientist and the business stakeholders in understanding execution and operations.
  • The Data Scientist should lead in answering the business question with a scientific approach spanning data, functional and technical methods to define the best solution.

Together you have to push the boundaries of the “Art of the possible” for your business, and there is no silver bullet for it. Yes, you will need to expand data collection, data storage, analytics and operationalization, and in time you will have a data-driven enterprise, but there is no single way for you, your organization or your industry to map the route. You will define your own path and can be successful if you build on the core foundations of scale, flexibility and simplicity.

Bottomline 


Transparency, curiosity and scientific inquiry have helped mankind overcome great challenges and will continue to do so in any big data venture. You might need help, and sometimes an independent viewpoint is exactly that.

Thursday, April 17, 2014

How does big data work

Simplistically speaking, big data starts with data that is already collected, is being generated, or a combination of both. This data is collected and brought together in ways not possible earlier, then analyzed to extract some learnings.


This is an iterative process that is repeated several times to get finer and deeper insights that can actually make a difference in a business.
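To make the loop concrete, here is a minimal Python sketch of collect, converge, analyze and repeat. Everything in it is an assumption made for illustration: the function names, the event records and the "stricter threshold on each pass" idea are hypothetical stand-ins, not a description of any vendor's product.

```python
from collections import Counter


def collect(sources):
    """Converge records from every source into a single list."""
    return [record for source in sources for record in source]


def analyze(records, min_count):
    """Count events and keep only those seen at least min_count times."""
    counts = Counter(record["event"] for record in records)
    return {event: n for event, n in counts.items() if n >= min_count}


def iterate(sources, thresholds):
    """Repeat the analysis, once per threshold, on the converged data."""
    records = collect(sources)
    return [analyze(records, threshold) for threshold in thresholds]


# Hypothetical data: two sources that would normally live in separate systems.
web_logs = [{"event": "search"}, {"event": "search"}, {"event": "checkout"}]
mobile_logs = [{"event": "search"}, {"event": "login"}]

thresholds = [1, 2, 3]
passes = iterate([web_logs, mobile_logs], thresholds)
for threshold, result in zip(thresholds, passes):
    print(threshold, result)
```

Each pass narrows the result, which is the "finer and deeper" part; in a real project the refinement would more likely come from new questions or new data than from a single threshold.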

Big data vendors approach this process from various angles. IBM, Oracle and Microsoft have a good handle on the data, analysis and processing model, so they push solutions that apply there. Hadoop vendors are good at the collect-and-converge step and are building newer tools for analysis, so they like to come at this pie from that side. Others start with data specific to certain applications, like machine data, web, mobile, geo, financial, geological, remote sensing, weather... you name it and they know how to handle it. Build a few collection, convergence and analysis modules and you have a workable big data solution that at least provides directional guidance, if not actionable insight.

IDC, ESG, Forrester and Booz Allen Hamilton put their own spin on this with big data workflows, converged infrastructure, the third platform, the data lake or a bunch of other catchy terms. This is basic science, and my friends the data scientists would say we have been doing this in our own ways for a while: first on note pads, then on spreadsheets, then on databases, then on data warehouses and more recently on massively parallel processing data warehouses or data appliances.

The only differences are the scale and the expanded scope of the tools used to interpret the variety of data.

In the next set of blogs I will explore four strategies and tactics that help you work your way through the big data minefield, where we straighten the iterative process into

  • Acquisition
  • Storage
  • Analysis
  • Application

Strategy 1. Start with systems that acquire the data and move towards application. A path suggested by Oracle, Microsoft, SAP, Teradata etc.

Strategy 2. Stitch together a system that connects the four components by partnering with different component vendors. A path taken by various consortia like EMC, AWS and others.

Strategy 3. Start with application, through data exploration and contextualization applications like Splunk for machine data and Palantir for fraud detection.

Strategy 4. Build the acquisition, storage and analysis engine with Hadoop and connect it to the applications through custom tools. A path recommended by Cloudera, Pivotal, Hortonworks and MapR.


Wednesday, March 26, 2014

What can you really do with big data?

If any analyst, guru or visionary were asked the question, we would get an answer similar to

 "transforms a business from a data-aware organization to data-driven organization"

This is management speak for good science: an organization does things, which generates data, which gets analyzed so you can do things better, differently, more or less, based on what the effects are. This paradigm does not change; call it data-driven, analytics-driven or a whole lot of other names. You just have more tools to work with, so you can see better through a bigger rearview mirror.

Does that make big data less useful? No - just make sure you know where you are and what you see.

Monday, March 17, 2014

The Big data pie

Following the money patterns is interesting. After much thought and staring at spreadsheets, the breakdown is

  1. Professional services (35%): no wonder, since Apache Hadoop is free; it's the vendors' support that is the biggest slice of the pie.
  2. Compute (17%): it is, after all, commodity hardware.
  3. Storage (14%) was a surprise. It's the data that is growing, but with falling storage costs this comes in a distant third.
  4. Apps and Analytics (13%): the Splunks and Tableaus of the world are very small compared to the large and growing pie.

Did I miss the professional services boat? Probably, but the real work is yet to be done.
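A quick arithmetic check on the slices listed above, as a small Python sketch. The percentages are the ones in the list; the variable names are mine, and the only thing the code adds is the observation that the four slices do not cover the whole pie.

```python
# Shares of the big data pie as listed in this post, in percent.
shares = {
    "Professional services": 35,
    "Compute": 17,
    "Storage": 14,
    "Apps and analytics": 13,
}

listed = sum(shares.values())          # total covered by the four slices
unlisted = 100 - listed                # remainder not broken down in this post
largest = max(shares, key=shares.get)  # biggest single slice

print(f"Listed slices cover {listed}% of the pie; {unlisted}% is not broken down here.")
print(f"Largest slice: {largest} ({shares[largest]}%)")
```

The four categories add up to 79%, so roughly a fifth of the spending falls outside the buckets named here.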

I have heard the toy elephant story many times, but seriously, the numbers say that the real elephant is IBM. They took the elephant-sized piece of the pie; the rest, well, seem like crumbs.

Sunday, March 2, 2014

Big data is big money

We are creating the data; we are storing, slicing, dicing and presenting it; but where is the money?

Big data is big business: $18 billion last year, expected to grow to $28 billion this year and to keep on growing up to $50 billion in 2017. And who takes the largest slice of the pie?
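For a sense of what those figures imply, here is a small Python sketch that turns them into annual growth rates. It assumes "last year" means 2013 and "this year" means 2014, based on the date of this post; the `cagr` helper and the dictionary are illustrative names of mine, and the dollar amounts are simply the ones quoted above.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end / start) ** (1 / years) - 1


# Market size in billions of dollars; years are an assumption from the post date.
market = {2013: 18.0, 2014: 28.0, 2017: 50.0}

growth_2013_2014 = cagr(market[2013], market[2014], 1)
growth_2014_2017 = cagr(market[2014], market[2017], 3)
growth_overall = cagr(market[2013], market[2017], 4)

print(f"2013 to 2014: {growth_2013_2014:.1%}")
print(f"2014 to 2017: {growth_2014_2017:.1%} per year")
print(f"2013 to 2017: {growth_overall:.1%} per year")
```

Read this way, the projection has growth slowing after the first year rather than accelerating.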

Professional services take the biggest bite of the big apple. All the Hadoop vendors rank slightly below #5 on the list, Teradata, which in turn brings in less than half of what #1 IBM does.

Making the noise is one business and making the money is another.


Friday, February 28, 2014

Data needs to be processed

When you have a lot of stuff, you need to know what you are looking for and then be able to find it. That's where processing comes into the picture, and it's a huge business.

In the big data world, "tools" make a huge difference in getting some use out of the large banks of information we are collecting, creating, duplicating and throwing in one huge heap. Tools fall into a few broad buckets:

  1. Organizing tools that help you put some structure to the madness - these force you to make a plan and stick to it.
  2. Search tools that help you locate something specific in a huge pile - these work great if you know what's in the pile and how to look for it.
  3. Presentation tools that organize stuff, sort it and make it look like it's all neatly stacked.

We would love services like that for the physical world, but the apps and systems that do just these basic things focus on specifics and don't cross boundaries. For example, there are business card managers, calendar managers, mail managers, news feed managers, photo managers/organizers and video managers/organizers. Try to mix any of these up and you have a fruit salad.

This is the big data opportunity and tons of companies are jumping on the bandwagon!