Musings of a Chief Analytics Officer: Data Science? What’s the big deal? We just need to adopt the infinite monkey theorem!
The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type a given text, such as the complete works of William Shakespeare. Stretching this metaphor to insights generation, one can argue: just hire an army of data scientists, give them access to the vast universe of datasets, and they will eventually churn out the golden nuggets. It sounds like a valid argument, but the reality is quite different!
Beyond the hype of “data science” and “data-driven organization” buzzwords, it is rare to find an organization that has all its stakeholders on board to drive a data-driven decision-making agenda. Many people have a natural tendency to view projects and needs in isolation. In the mad rush to solve the problems at hand, they create islands of data-and-insight hubs. The result? They may still claim that they are well ahead in the analytics journey, but in reality their datasets are disconnected and their precious resources (data scientists) are dispersed within the organization. Given that data science has wide-reaching implications for all teams within an organization (when done well), it’s worth stepping back and thinking about it as a journey.
How organizations approach data science:
A broad look at data science and analytics projects across organizations reveals three distinct adoption paths:
- Willing but uncertain: The organization is convinced and willing to invest in data science and analytics capabilities, but is overwhelmed by RoI questions and is generally unsure how and where to start.
- Willing but unable: The organization has a few data science and analytics evangelists, but is struggling to put together the right organizational construct and tools. It does not really have a methodology to apply.
- Willing but unorganized: The organization is already on the data science and analytics journey, but the majority of its use cases and approaches are departmental initiatives, resulting in fragmented capabilities.
The key question is: how does one achieve the transformational power of data? There are five principles that are core to becoming a leading data-driven, agile and innovative organization.
1. Empower a C-Level Analytics Champion. Data needs to be on the CEO’s agenda. If it’s not, any data-led initiatives are highly likely to fail. The enterprise must see data and analytics as a catalyst for thinking differently.
Think about why your organization needs data science. Focus on the high-value problems, and picture how data science can help discover insights to drive actionable business outcomes. Otherwise, your data science team just becomes a really expensive source of noise.
A C-level officer (e.g., Chief Data Officer or Chief Analytics Officer) who comes from both a functional and an analytics background must have the mandate to lead analytics initiatives. A forward-thinking analytics strategy then needs to be created at the business unit level. Why? First, priorities will differ by business unit; the treatment of data in one business unit may have little utility in another. Second, management priorities have to reinforce functional-level goals with targets and metrics.
2. Organization and people must align around data. Make “good enough” decisions. The objective is not to achieve perfection; rather, it should be to ask, “How do we make the best decision possible given the available data?” Decision agility is needed to take advantage of new insights and opportunities. The challenge is that many organizations get hung up on a one-size-fits-all approach to the insight-generation process, thereby subconsciously establishing an analysis-paralysis culture. Everyone has an opinion about data. Typically, any discussion around data ends up with 20+ people in the room with varying opinions, subtly conflicting objectives, and no obvious path forward. One should keep in mind that not all decisions are equal and not all data is useful.
Data science is not one role; a variety of skills are required to make sense of data. It is easy for business managers to ask unanswerable questions, and easy for data scientists to do clever analysis that does not drive any decisions. When it comes to data and insights, the key principle is to establish a clear direction and a clear process for everyone in the organization.
There is an old saying that “the devil is in the details”, and it is very apt for data science. The most valuable part of data science is not running algorithms on data to generate random insights; rather, it is solving critical business challenges. For that, one has to de-silo the data, de-aggregate metrics and get to the bottom of correlation and causation. Your team of data scientists needs to be able to match its data-led insights to business goals. Given the complexity inherent in data science, it’s critical to be able to communicate well, particularly when you’re disrupting (and improving) long-held practices and priorities.
Any data set that is large enough will necessarily show patterns simply as a consequence of its size. Random variations will start to look like interesting and exciting patterns, leading to wrong decisions. A data scientist’s sole focus should therefore be to leverage data to add business value and help drive strategy. More than math capabilities, it is important for organizations to develop overall analytical capabilities and storytelling skills that provide a clear picture to key stakeholders: findings need to be communicated and presented to executives without wrapping them in technical or statistical jargon.
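To make the point about patterns in large data sets concrete, here is a minimal sketch in Python. It is an illustration only, not a method from any particular toolkit: the function names (`pearson`, `strongest_spurious_correlation`), the column counts and the 30-row sample size are all assumptions chosen for the example. It fills a table with pure random noise and reports the strongest correlation found between any two columns. Because every column is noise, any "pattern" it finds is spurious by construction.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_correlation(n_columns, n_rows, seed=0):
    """Generate n_columns of independent random noise and return the
    largest absolute correlation found between any pair of columns.

    Hypothetical helper for illustration: no column has any real
    relationship with any other, so whatever it reports is chance.
    """
    rng = random.Random(seed)
    columns = [
        [rng.gauss(0, 1) for _ in range(n_rows)]
        for _ in range(n_columns)
    ]
    return max(abs(pearson(a, b)) for a, b in combinations(columns, 2))


if __name__ == "__main__":
    # Same number of observations each time; only the number of
    # unrelated metrics being compared grows.
    for n_columns in (5, 50, 200):
        strongest = strongest_spurious_correlation(n_columns, n_rows=30)
        print(f"{n_columns:>3} noise columns -> strongest correlation {strongest:.2f}")
```

With a fixed seed, the wider tables contain the narrower ones, so the strongest correlation can only stay the same or grow as more unrelated columns are added. That is the trap: the more metrics you compare without a business question in mind, the more convincing the accidental patterns look.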
3. Develop Analytical Capabilities not just Data Science. Everything is an algorithm in this digital world. Many people think they are “good at data”. However, in practice, data science is hard, messy, easy to get wrong and easy to misinterpret.
In an era of rapid technological evolution, we now have powerful data platforms and sophisticated machine learning algorithms that allow us to draw inferences from vast amounts of data at will. Inferences transform data into knowledge, which results in greater process transparency and improvements. It is important to remember that actionable knowledge is not inherent in data per se; rather, it must be inferred based upon established rules, algorithms and validations (oftentimes, tribal knowledge about the processes and context behind the data is the key, not the algorithms themselves). Data scientists play an indispensable role in drawing these inferences, but the key is to make these findings actionable. The insights must be channeled and embedded in business process tools in a simple and intuitive way, so that managers and frontline employees use them daily.
The insights generation process is tedious and iterative, so choosing the right data analytics platform is extremely important. There are several aspects you need to keep in mind. First, no data latency: you need to see your business performance in real time, in motion, as events unfold. This is critical to actionable intelligence. Second, it is not just about your enterprise data. Failure to capture temporal and geospatial data will leave even the savviest company flat-footed. Third, your core platform must allow you to visualize your data assets in multiple dimensions so that you can see what is happening across functions in a connected way. Fourth, it must be flexible enough to accommodate the different analytics needs in your organization. Finally, the platform must have robust capabilities to scale and democratize data science in your organization.
4. Enrich your data set with situational awareness. Insights generated without situational awareness are not actionable. In most organizations, data must be pulled from disparate and distributed sources, and then processed to yield actionable intelligence. Analytics helps a business line identify potential points of improvement. Organizations need to make course corrections not only in real time as events unfold, but also within the constraints posed by the increasingly distributed nature of digital business.
Your current organization-specific data (transactions, customers, services, products) needs to evolve further to accommodate multi-dimensional data that includes temporal and geospatial elements. Examples include data acquired from sources such as the Internet, speech and video data, real-time imaging from satellites, and ground-based sensors. These additional dimensions enable you to get closer to your customers and deliver hyper-personalized services.
It takes a village to do data science right. You need data scientists, business analysts, data engineers, developers, and DevOps engineers to create reusable data science assets that can continuously drive more value for your organization. A significant challenge for the data science team is that its members will often have a current primary project while, at the same time, being required to update, maintain, or review work from a previous project. They may also be simultaneously scoping out the initial phases of some new opportunity on the horizon. On top of that, there are the one-off analytics questions that come up in a meeting or “as a favor” through an email request.
Addressing all of these can be difficult in an environment in which data tools, data sources, data streams, and data transformation strategies are constantly evolving.
5. Self-Service Data Science. Make It Intuitively Obvious. Data should be made intuitive to the frontline employees who will actually be using it. This means contextualizing data science to the business problem or issue and making “data-driven” simply the way that decision makers at every level typically do their work. Data scientists need to embed contextualized intelligence into the way we work in enterprises today.
This can be done in two ways: build apps that are intuitively relevant to the employee’s function, with a narrowed-down view of optimization opportunities and/or recommendations; or create problem-domain apps that focus on a particular problem spanning a broader set of functions and process-centric areas.
With self-service data science, the employees who are immersed in a particular business function can leverage data to inform their actions without having to wait for resource-constrained data science teams to provide some analysis. This lowers the barrier to adoption, thus expanding the scope of data analytics impacting business results.
Right now, data scientists are unique and inhabit a world of their own. To unleash the power of data, businesses need to empower frontline workers to easily create their own analysis. This infuses intelligence throughout the organization and frees up the data scientists to innovate and work on the biggest breakthroughs for the enterprise. As data and data science become more approachable, every worker will be a data scientist.
The benefits of self-service data science are twofold. First, you get empowered business teams who can combine their contextual intelligence with data science to get exciting business results. Second, data science becomes embedded in the way business employees work. You know you’ve reached your goal when you hear an employee say of data science, “It’s just how I do my job.”
Summary:
What do data science capabilities mean for business users? Businesses are continually seeking competitive advantage, and there are a multitude of ways to use data and intelligence to underpin strategic, operational, and execution practices. Business users today, especially with millennials (comfortable with the likes of Siri, Google Assistant, and Alexa) entering the workforce, expect an intelligent and personalized experience that can help them create value for their organization.
In short, data science drives innovation by arming everyone in an organization—from frontline employees to the board—with intelligence that connects the dots in data, bringing the power of new analytics to existing business applications, and unleashing new intelligent applications.
Data science enriches the value of data, going beyond what the data says to what it means for your organization—in other words, it turns raw data into intelligence that empowers everyone in your organization to discover new innovations, increase sales, and become more cost-efficient.
Data science is not just about geekiness and the realm of algorithms. Had that been the case, every organization would already have employed an army of monkeys with robust keyboards to churn out insights. Rather, successful data science is a journey: it takes lots of patience to experiment and an ecosystem of tools and technologies, and above all, it is about deriving value.