Unleashing the power of life sciences analytics with data products

Life sciences organizations have vast amounts of data, but much of it is hard to reuse. Data products could allow them to extract data’s full potential.

December 7, 2023 by Siddhartha Rai and Prasoon Sharma

As early adopters of analytics, life sciences organizations are firm believers in data. Life sciences domains from research to commercial have access to diverse and numerous data sets that, when used effectively, are revolutionizing the field. Well-wielded data is changing how organizations conduct research, develop life-saving therapies, deliver personalized medicine, and optimize their operations.

But the variety of platforms in current data architectures has become burdensome. Practitioners find themselves doing repetitive work to extract analytics insights. What’s more, this work is labor-intensive, expensive, and time-consuming. Life sciences domains from clinical trials to sales lose time and money in these efforts, which leads to a drop in overall productivity and missed opportunities.

Leading life sciences organizations are starting to address these challenges by embracing data products, specialized and reusable data assets created from raw data. A suite of data products, AI, and machine learning (ML) models are helping organizations capture the potential of their diverse data assets across multiple domains. These data products help standardize data, enable reuse across business units, streamline data integration, and drive actionable insights for research, clinical decision making, and operational efficiencies. In addition, data products help make data management processes more efficient by reducing redundancy, which ultimately saves organizations time and resources and allows them to focus on opportunities that yield value. The advantages of data products can even extend beyond organizational boundaries, encompassing opportunities for monetization (in compliance with relevant regulations) and collaborations with third parties and other partners.

We’ll discuss how successful teams build and support impactful data products; the implications of, and considerations for, creating data products; and examples of how life sciences organizations could implement these strategies.

Building successful data products

Across domains, data products have the potential to transform businesses. Data products provide an opportunity for organizations to use their existing technical and data platforms to surface data product capabilities to their teams. Some current approaches, such as federated data mesh architectures, comprehensively support the development of data products across large organizations.

In real-world evidence (RWE), for example, a cardiology team might decide to publish a data product that provides cardiology and other health-related details using anonymized patient data. Similar data products could be built in other domains to give the respective teams access to data from patients with different health conditions. These data products also help researchers and analysts examine data for each health condition from various perspectives.
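As a rough illustration of what publishing such a data product could involve, the Python sketch below turns raw patient rows into a reusable, de-identified record set. All names here (`CardiologyRecord`, `build_cardiology_product`, the field names, and the salt) are hypothetical and not taken from any specific platform. Note that salted hashing is pseudonymization rather than full anonymization, so a real implementation would need to follow the applicable privacy regulations.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class CardiologyRecord:
    """One row of the hypothetical cardiology data product."""
    patient_key: str            # pseudonymous key, not the original ID
    age_band: str               # coarse age band instead of exact age
    diagnosis_code: str
    ejection_fraction_pct: float


def pseudonymize(patient_id: str, salt: str) -> str:
    """Replace a patient ID with a truncated salted SHA-256 digest."""
    digest = hashlib.sha256((salt + patient_id).encode("utf-8")).hexdigest()
    return digest[:16]


def age_band(age: int) -> str:
    """Map an exact age to a ten-year band, e.g. 67 becomes '60-69'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def build_cardiology_product(raw_rows, salt):
    """Keep only the fields the product exposes; names and exact ages are dropped."""
    return [
        CardiologyRecord(
            patient_key=pseudonymize(row["patient_id"], salt),
            age_band=age_band(row["age"]),
            diagnosis_code=row["diagnosis_code"],
            ejection_fraction_pct=float(row["ejection_fraction_pct"]),
        )
        for row in raw_rows
    ]
```

Because the output has a fixed, documented shape, teams in other domains could consume it without knowing how the raw source systems are organized, which is the kind of reuse the data product approach aims for.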

As we discuss next, successful teams treat data product development like software development, blending technological capabilities (for example, modularization and test-driven development) with an organizational approach (for example, agile software development and product backlog management). These principles not only guide the development process but also lay a solid foundation for creating and using data products.

Technical capabilities

Unlike prior methods that simply provided data to end users, the current approach with data products allows for agile and iterative evolution. Life sciences organizations that adopt modern software development practices, product management approaches, AI methodologies, and DevOps and MLOps automation can benefit from improved reuse, greater efficiency, and the ability to scale.

We have identified some indispensable technical capabilities for creating high-quality data products in life sciences, including the following:

scalable storage infrastructure for large data volumes and a scalable compute infrastructure for large-scale compute jobs
trigger mechanisms for and management of large, complex workflows
data marketplace and cataloging capabilities to support robust governance
continuous integration and continuous delivery (CI/CD)

For instance, a biopharma research organization might use its technical capabilities to build a consistent series of data products to support multiple use cases across several stages of research such as in-vivo, in-vitro, and in-silico design. Since data products are treated as software products, adopting software engineering best practices is essential. These best practices help reduce redundancy, improve reliability, and scale the analytics approaches to many use cases.
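One software engineering practice that carries over directly is automated testing. The sketch below, written in Python, shows one possible shape for a data quality check that could run in a CI/CD pipeline before a data product is released. The function name `check_quality`, the field names, and the value ranges are illustrative assumptions, not part of any particular tool.

```python
def check_quality(rows, required_fields, ranges):
    """Return a list of human-readable violations; an empty list means the rows pass.

    rows            -- list of dicts, one per record
    required_fields -- field names that must be present and not None
    ranges          -- mapping of field name to (low, high) inclusive bounds
    """
    violations = []
    for index, row in enumerate(rows):
        # Completeness: every required field must have a value.
        for field in required_fields:
            if row.get(field) is None:
                violations.append(f"row {index}: missing {field}")
        # Validity: numeric fields must fall inside their allowed range.
        for field, (low, high) in ranges.items():
            value = row.get(field)
            if value is not None and not (low <= value <= high):
                violations.append(
                    f"row {index}: {field}={value} outside [{low}, {high}]"
                )
    return violations
```

A pipeline step could call this function on each new batch and block publication whenever the returned list is not empty, so that consumers of the data product only ever see records that meet the agreed rules.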

Team, organizational, and governance approaches

Organizations can start sustainably implementing data products by introducing a clear governance structure for budgeting and resource allocation, compliance, data management, and cross-team collaboration. Successful data product managers think of their downstream colleagues and consumers as users and approach product development by collaborating across business units. This involves creating road maps and backlogs, prioritizing features, and allocating resources and budgets with input from various teams. In this process, it’s also important to consider funding, adoption, alignment, and product promotion. Data product managers can sustain data products by aligning approaches with organizational data management practices.

Key questions to consider

Organizations adopting a data product approach are more likely to achieve long-term success when they weigh the potential value gains up front. These include enhanced data reuse, streamlined access, improved data quality, consistent methodologies for sustainability, and increased agility in answering analytics questions.

To realize these benefits, life sciences domains could consider some of the following key questions to help shape their trajectory:

Which business domains stand to gain the most by enabling data reuse using data products?
How should the data product taxonomy and data model be organized to facilitate reuse within and across business domains?
How can organizations use data products sustainably while guaranteeing high data quality and accuracy?
Which data products should an organization build to have maximum impact and leverage?

Life sciences organizations are using data products to achieve scale, accelerate innovation, and meet their analytics goals with cross-functional teams and technological advancements. Evaluating current data platforms and assets as they build and utilize data products aids in shaping a vision for future endeavors and underscores the importance of collaborative efforts across teams. Integrating interconnected data products, from molecule design to distribution, empowers organizations to advance scientific research, enhance patient outcomes, and foster overall growth.

Siddhartha Rai is an associate partner in McKinsey’s Berlin office, and Prasoon Sharma is a partner in the New York office.