Big Data In Life Sciences: Trends, Challenges, And The Payoff
By Rich Sokolosky, partner, life sciences practice leader, NewVantage Partners
Information is playing a critical new role in the business of life sciences, from discovery to commercial operations. Big Data is a major agent of change in the trends, challenges, and payoffs for this emerging focus, and now is a good time for informatics and analytics professionals to step back and see where we are and where we can go.
A good starting place is to look at these four commercial life sciences information trends:
1. Patient-Level Insights Are Driving Competitive Advantage.
Organizations are looking to patient data for real-world drivers of brand use — to explain the drivers behind the trends and better inform commercial activities from brand planning to sales targeting.
2. We Are Moving Beyond The Data Warehouse To The “Data Lake.”
Companies are building data hubs to provide comprehensive access to the information necessary to create new insights that data warehouses cannot deliver.
3. The Data Scientist Is Becoming A Key Role In Commercial Organizations.
New insights drive decisions when market conditions are changing and new products are being launched. Data scientists supply the link between business knowledge and analytics expertise that is needed to produce these insights.
4. New Technologies Are Rapidly Changing Cost & Capability Dynamics.
NoSQL databases, Hadoop, and cloud-based platforms are significantly driving down the cost and time necessary to create value from information initiatives.
The common denominator to these trends? Big Data — and the analytics that life sciences businesses need to make profitable sense from the unparalleled amount of information now available to them. But this opportunity is presenting a new set of challenges, too.
Life sciences businesses want to know what Big Data really means and where new approaches can deliver value to commercial operations. And then there is the big question: How do I start?
Here are some common issues we see at this stage:
“The data warehouse team says my request to add patient-level data sources and analytics will cost $3M and take 18 months to complete. I am launching the product in 12 months, and don’t have that kind of budget or time.”
“My analysts spend 80 percent of their time finding and acquiring data, and those are all one-off efforts. How do I get them ready access to all of the data they need? Where do I start? How do I develop a strategy that all the change agents can get behind, including IT?”
“None of the data integration firms I have talked to understand commercial pharma, and I cannot take the time to educate them on the analytics I am looking to create.”
Meanwhile, data scientists and analysts play increasingly important roles as commercial analytics drivers, providing the new analytics that drive business decisions in a dynamic environment. They need access to all of the data to support business decisions and create competitive advantages. In fact, 80 percent of data scientists and analysts want access to comprehensive sets (subject areas) of integrated data rather than direct source data.
That’s a big demand. A single “uber” data warehouse cannot deliver the agility and information necessary for commercial reporting and analytics across all of its dimensions and uses. This is where cloud environments and open source (Big Data) technologies have dramatically changed the cost and capability parameters for these adventures in information.
The payoff begins by recognizing the new analytics requirements and the opportunities they represent. Today’s data scientists and analysts can start with hypotheses and iterate through them; faster iterations lead to more insights. They can search and explore all of the data across its life cycle stages to find the answers and perform work in a “sandbox” outside the data warehouses. Data sets are not perfect, but they are directionally correct, and they deliver results quickly. Analysts can work with current information for immediate business decisions.
The result? Big Data claims are not exaggerated. Working from such a Big Data analytics platform, life sciences providers have been able to deliver results at one-tenth the cost and duration of a traditional implementation.
Data scientists and analysts spend the majority of their time exploring the data and creating new analytics rather than acquiring and understanding it. Analytics run in minutes rather than hours, leading to more iterations and reducing “information depreciation.” And this environment can accommodate new data sources of any size.
In the commercial operations of pharmaceutical companies, understanding real-world patient behavior is critical to brand messaging and sales activities. Here’s an example:
A top-20 pharma company needs to know if its patients are taking medication as directed (adherence) and how long patients are continuously staying on its brand vs. the competition (persistence). The company delivers this information to its sales force so that representatives can work with physicians if there is an issue in their territory, and it uses the same information to adjust marketing content if necessary. The company also needs a way to measure results from both activities as soon as possible to see if further actions are necessary.
The company receives weekly claims data for its brand and the competition, but it takes two to three weeks to create the adherence and persistence information, and by then the insights are not useful. The company turned to a hosted Big Data platform to see if it could do better. With cloud and open source technologies, it was able to run the analysis in 20 minutes. The data lag went from four weeks to one week (the lag inherent in the data itself), and the platform took one month to implement, even though the data warehouse team had anticipated that the effort would take eight to nine months. While the company relied on Big Data expertise from a consulting company for the initial proof-of-concept, the end result was compelling enough for it to start building a Big Data platform and training internal resources to handle the processing.
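For readers who want to see what an adherence and persistence calculation can look like, here is a minimal Python sketch. It is not the method used by the company in the example, which is not described here. It assumes adherence is measured as proportion of days covered (PDC), a commonly used claims-based measure, and that persistence ends at the first refill gap longer than a chosen threshold. The function names, the 30-day gap threshold, and the sample fills are all hypothetical and purely illustrative.

```python
from datetime import date, timedelta


def covered_days(fills, start, end):
    """Return the set of dates in [start, end] on which the patient had supply.

    `fills` is a list of (fill_date, days_supply) tuples, a hypothetical
    simplification of a pharmacy claims record.
    """
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            if start <= day <= end:
                covered.add(day)
    return covered


def proportion_of_days_covered(fills, start, end):
    """Adherence as PDC: covered days divided by days in the window."""
    total_days = (end - start).days + 1
    return len(covered_days(fills, start, end)) / total_days


def persistence_days(fills, max_gap_days=30):
    """Days from the first fill until supply runs out before the first
    gap longer than max_gap_days (an assumed, adjustable threshold)."""
    ordered = sorted(fills)
    first_fill, first_supply = ordered[0]
    supply_end = first_fill + timedelta(days=first_supply)
    for fill_date, days_supply in ordered[1:]:
        if (fill_date - supply_end).days > max_gap_days:
            break  # the patient is treated as having discontinued
        supply_end = max(supply_end, fill_date + timedelta(days=days_supply))
    return (supply_end - first_fill).days


# Illustrative data only: three 30-day fills for one patient.
fills = [(date(2024, 1, 1), 30), (date(2024, 2, 5), 30), (date(2024, 4, 20), 30)]
pdc = proportion_of_days_covered(fills, date(2024, 1, 1), date(2024, 3, 31))
print(f"PDC for Q1: {pdc:.2f}")
print(f"Persistence: {persistence_days(fills)} days")
```

This sketch ignores details a production analysis would need to handle, such as carrying over supply from early refills, switching between brands, and patient eligibility windows. Run per patient across weekly claims, this kind of calculation is simple to parallelize, which is one reason it suits the platforms discussed above.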
Where Do We Go From Here?
Big Data can be described in many ways, but the real value lies in providing a set of technologies and capabilities that data scientists and analysts can use to better inform business decisions at all levels of the commercial organization. The traditional data warehouse still has a role, but organizations need to rethink the data ecosystem to incorporate the new discovery analytics that are so important to staying competitive in today’s market. By incrementally testing these new technologies and resulting insights, life sciences organizations can determine the best way to integrate Big Data with their existing data warehouse investments.