Carpe data before it’s too late

by Eric Hazan

As companies digitize their businesses, the data they generate piles up faster than they figure out what to do with it. The idea of using big data to inform decisions that will increase success is sound, but the reality is that doing it is difficult. Many companies are still agonizing over five key questions, but answering them should not impede immediate progress.

The explosion in the volume of data generated and stored worldwide is one of the main consequences of the digital transition. This effect, which McKinsey described back in May 2011, is accelerating at a breathtaking rate: together, we have all produced as much data in the last 10 months as we did in the entire history of humanity.

Meanwhile, the cost of data storage continues to fall, and prudent organizations everywhere are amassing gigantic volumes of information. Even if they are only using a fraction of that data, they still gather as much as they can, thinking, “Who knows? It could come in handy one day.”

We are starting to see a few of the first convincing examples of advanced data analytics that point toward a future economy driven to a large extent by data: custom-written TV series, preventive maintenance in industry, analytical hiring and retention of talented individuals, and advertising campaigns optimized in real time.

But with the exception of these few pioneers, the majority of companies are simply observers, gathering data and thinking about what to do with it. What they should be doing is seizing the opportunities their data gives them, right now. The five questions that get in the way of doing that are big and daunting. However, they will be resolved not by taking a conceptual approach, but by embracing an experimental approach. Companies that plunge right in, try new things, scale their successes up, and learn from their failures could very well establish an unbeatable lead over their more hesitant competitors.

1. How should the regulatory framework be interpreted?

When it comes to using data, it’s a jungle of regulation out there, especially for multinationals. The legal frameworks they are required to comply with vary considerably between the United States, the European Union, and the other regions where they operate. On top of that, the laws change constantly in response to developments in technology and data usage. Often there are multiple stakeholders beyond the legal structures: independent agencies such as the French data protection agency CNIL, employee representatives (as in Scandinavia), and non-profits. Some of this complexity should ease when the European Union’s General Data Protection Regulation (GDPR) takes effect in May 2018. The regulation creates one comprehensive set of rules for all EU member states. But that still leaves the rest of the world’s regulations to deal with.

Another piece of the regulatory framework is the virtual “operating license” that customers grant (or deny) a company to use their data. This license is subject to two conditions that can be tricky to uphold. The first is that the company agrees to take every possible precaution to keep customers’ data secure. The second is that the way the company uses customers’ data will benefit both parties. These benefits can include more accurate targeting of advertising that is less intrusive for web users, or a higher marketing fee for a publishing company. A company that runs afoul of these conditions risks severe damage to its reputation.

2. How can data be uncorked?

It was Pentaho founder and Chief Technology Officer James Dixon who first talked about the data lake concept. What is it? Let’s imagine that, until now, the majority of companies have stored and structured their data as though they were filling bottles with water, and then grouped these bottles together in various packs that are stored on shelves in different countries. To use data stored in this way, you first have to find the most promising bottles and mix their contents by hand before you can even use them.

With advances in the cognitive sciences, it is now much more effective to pour all of this data out of its many bottles into a data lake where semi-autonomous algorithms fish for useful correlations between data sets. These correlations are the insights we hear so much about. The main advantage of these data lakes is that their growth is unlimited, as are the many ways of integrating different data types.

These lakes can contain first-party data generated and owned by the company, second-party data exchanged with partners, and third-party data brought in from specialist companies that gather it online.

In many silo-structured companies, the change of IT architecture that is essential for creating a data lake is accompanied by misgivings about the transition or resistance to change, especially when it involves hosting such a large amount of data in the cloud.

3. How can organizational silos be dissolved?

Pooling all the data held by a single company, let alone a corporate group, raises more than just technical questions. Governance factors are at least as challenging. As senior executives gradually gain a better grasp of the potential value offered by data, the political challenges typical of any company resource (financial capital, talent pool, strategic information, etc.) begin to emerge. This is inherent to human organizations. It can often take years to set up strategic committees and processes that manage such challenges effectively and collaboratively, with a minimum of the bias that could otherwise lead to bad decision making.

In the same way, companies must encourage business division leaders to share their data by developing mechanisms that make it possible to redistribute the internal value created by data, whether it’s monetary or symbolic capital. Appointing a chief data officer, as about a quarter of leading companies have done, helps facilitate this transition, but on its own this is not enough.

4. Which skills are needed, and where can they be found?

As soon as companies engage actively with big data, they come up against the thorny issue of talent. The reality is that they need a number of people with highly specialized skills.

The first are data architects. These are the people responsible for the abstract system that will build the data lake. Their task is to ensure that all the data flows generated by each organizational entity converge in real time via a standard semantic system that makes them mutually intelligible.

Companies then need data scientists. These mathematicians are specialists in statistics who have robust programming experience. They design the algorithms and automated learning systems that will convert data into the insights that give the company a competitive edge.

Companies will also need translators to direct the efforts of the data scientists and clear the path forward for their work. These are employees from individual business lines who have a deep understanding of the challenges, but also have strong statistics skills and even the basics of programming. In the past, the role of data translators has been undervalued, but smart companies will quickly grasp the crucial importance of these individuals to success.

Lastly, companies can appoint a chief data officer. On paper, his or her mission consists of managing a team made up of all the people listed above; preparing, disseminating, and measuring group data quality and security standards; building partnerships to enrich the data available; and leading innovation through data usage by selecting the most promising usage scenarios and learning from them. But alongside this formal role, he or she must also be an evangelist with the ability to dispel internal reluctance and actively advocate change.

Together, these roles form a skills chain. But that chain isn’t complete without a specific HR plan for identifying, hiring, developing, rotating, and retaining these exceptionally valuable data specialists.

5. How can the decision-making process be reinvented?

The fifth element enables the previous four, and it’s the most challenging. It requires a change of culture within the company, and possibly on the much wider scale of society.

It means making the transition away from a value system that promotes and rewards business flair—the shrewd and instinctive acumen of the decision maker—toward a culture in which both large and small decisions are deeply informed by data. This profound change involves a kind of power shift at every level. The data system may show that a specific banking adviser should be selected to push a particular product, for example. A salesperson may be asked to focus his or her efforts on a particular customer. The system could even indicate that a chief executive should withdraw from a particular project, even though she or he “feels good” about pursuing it.

Companies will have to accept that data analysis can produce conclusions that directly contradict humans’ initial convictions or intuitions. For example, one service company discovered that the average productivity of its IT engineers improved when they split their time among a dozen simultaneous projects. Managers had previously believed that having engineers focus on two or three projects at a time was more effective.

It will take time to adjust to this new business culture. Instead of the compelling stories of charismatic, brilliant, and trail-blazing chief executives, we will have a new narrative of leaders who make decisions based on data. A good example of this is the film Moneyball, in which Oakland A’s general manager Billy Beane builds a winning baseball team on a meager budget by choosing players based on statistics, not gut instinct.

Top management must be the evangelists of this level of change. Leaders should visit organizations where data is already boosting performance dramatically, in areas such as health, safety, and sports. As with any process of transformational change, it’s important that leaders learn from the initial demonstrable successes to amplify the dynamic within the company.


This takes me back to my initial argument: don’t wait for each of these eminently complex questions to be answered fully. By the time that happens it may be too late.

All the companies that I have seen achieve initial successes with using advanced data analytics to drive decision making have one thing in common: they do not over-intellectualize the issues to be resolved. Rather, they embark immediately on experimentation with a small team, an incomplete database, and maybe even just a single computer that has no network connection. They identify a small number of usage scenarios in which data analysis gives them a significant and measurable competitive advantage. Examples include an online retailer that boosted sales using targeted buying suggestions, an insurer that cut its attrition rate by implementing a preventive customer retention plan, and a major advertiser that successfully optimized its media expenditure.

By building on these initial successes, they can then recreate the dynamic internally, involve the highest levels of management, and secure the flow of resources required. By simply diving in, they have already covered a lot of the ground toward resolving the five difficulties described above. If their competitors continue to contemplate change and try to make comprehensive plans for it, this head start will prove decisive.

Eric Hazan is a senior partner in our Paris office.