Solving Pharma’s Data Silo Problem: Why a Linkable Data Infrastructure is Good for Biopharma and Good for Patients
Each year, biopharma companies spend billions of dollars generating, buying and analyzing data. But a basic problem continues to plague everyone from the smallest biotech startup to the largest Fortune 100 companies: not being able to connect the data they’ve amassed at the patient level while protecting patient privacy. For decades, there hasn’t been a solution.
COVID-19 has demanded unprecedented speed and innovation in drug development, and the time is right to re-think what’s possible.
Connecting pharma’s siloed data is not only possible, it’s already being done. Within the last two years, companies across the industry have recognized this is a solvable technology and coordination problem. Establishing a linkable data infrastructure — in which all of a pharma company’s data can be linked at the patient level — is becoming the keystone of a next-generation, patient-centric data strategy.
Pharma’s Data Silo Problem
A single, siloed dataset can be useful for answering only a limited set of questions. For example:
- Clinical trials are used to gather data on drug efficacy and safety.
- Before launch, to understand the current market and target patient population, pharma companies might purchase third-party pharmacy and medical claims, lab data and EHR data.
- Post-launch, pharma companies may use product registries or third-party, real-world data like claims, EHR, and mortality data to conduct health economics and outcomes research (HEOR) in support of reimbursement.
- To improve patient access and adherence, they might also use their proprietary data, including data from patient support programs and specialty pharmacies.
Each question may require its own dataset (or multiple datasets) to answer, and connecting that data at the patient level isn’t straightforward. To protect patient privacy, pharma largely works with de-identified, tokenized data. Tokenization replaces identifying information with random-looking strings of characters, or tokens, that cannot be reversed to reveal the underlying information. Tokens are also consistent: the same identifying information generates the same token every time, so the process can be used to link de-identified patient records across datasets while protecting patient privacy.
Historically, the utility of this approach to privacy-preserving data linkage has been limited because each dataset that a pharma company uses has been de-identified using a different tokenization scheme.
Without using a common token, or “key”, every dataset becomes its own silo. And without being able to link it to other patient-level data, much of the dataset’s value remains locked away.
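To make the silo effect concrete, here is a minimal Python sketch. It assumes tokens are produced by a keyed hash (HMAC-SHA256) over normalized identifiers, which is one common way to get consistent, non-reversible tokens; it is not a description of any vendor's actual scheme. The field choices, the `normalize` and `tokenize` helpers, and both keys are hypothetical.

```python
import hashlib
import hmac


def normalize(first: str, last: str, dob: str) -> bytes:
    """Canonicalize identifiers so formatting differences do not change the token."""
    return "|".join(part.strip().lower() for part in (first, last, dob)).encode("utf-8")


def tokenize(first: str, last: str, dob: str, secret_key: bytes) -> str:
    """Return a deterministic keyed hash (HMAC-SHA256) of the normalized identifiers."""
    return hmac.new(secret_key, normalize(first, last, dob), hashlib.sha256).hexdigest()


# Hypothetical keys standing in for two different tokenization schemes.
KEY_A = b"scheme-a-secret"
KEY_B = b"scheme-b-secret"

# The same (made-up) patient as seen in three datasets.
pharmacy_token = tokenize("Jane", "Doe", "1980-01-31", KEY_A)
claims_token_same_scheme = tokenize(" jane ", "DOE", "1980-01-31", KEY_A)
claims_token_other_scheme = tokenize("Jane", "Doe", "1980-01-31", KEY_B)

# Same scheme: the tokens match, so the records can be linked.
same_scheme_links = pharmacy_token == claims_token_same_scheme
# Different scheme: the tokens differ, so the datasets stay siloed.
other_scheme_links = pharmacy_token == claims_token_other_scheme
```

In this sketch, records tokenized under the same scheme share a token, while the same patient tokenized under a different scheme looks like an unrelated person.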
Let’s say a biopharma company wants to understand why patients are dropping off their rheumatoid arthritis product. Their product is dispensed through 6 different specialty pharmacies. They’ve also purchased medical claims and EHR data to supplement their analysis.
Historically, each of these datasets would have remained in separate silos, with analysis limited to one dataset at a time. Without linking data at the patient level across all 6 specialty pharmacies and the third-party data sources, it’s difficult to see treatment and adherence patterns, or understand the underlying clinical factors for non-adherence.
The Solution: a Linkable Data Infrastructure (LDI)
If the biopharma company applies the same tokenization scheme to each dataset, all of the records for the same patient can be linked with the same “key”, without compromising patient privacy or HIPAA compliance. The resulting longitudinal dataset provides deeper insights on whether a patient has truly discontinued therapy or just switched pharmacies, why they might be non-adherent, and how outcomes were affected.
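As a rough illustration of the difference a shared token makes, the Python sketch below compares a per-pharmacy view with a linked view of the same dispense records. The records, token values, study end date, and the 60-day gap used to flag discontinuation are all made-up assumptions, not an industry definition.

```python
from datetime import date

# Hypothetical de-identified dispense records: (token, pharmacy, fill_date).
dispenses = [
    ("tok_a", "pharmacy_1", date(2020, 1, 5)),
    ("tok_a", "pharmacy_1", date(2020, 2, 4)),
    ("tok_a", "pharmacy_2", date(2020, 3, 6)),
    ("tok_a", "pharmacy_2", date(2020, 4, 5)),
    ("tok_b", "pharmacy_1", date(2020, 1, 10)),
    ("tok_b", "pharmacy_1", date(2020, 2, 9)),
]

STUDY_END = date(2020, 4, 30)
GAP_DAYS = 60  # assumed threshold for flagging a patient as discontinued


def last_fill_by(records, key_fn):
    """Return the most recent fill date for each grouping key."""
    latest = {}
    for token, pharmacy, fill_date in records:
        key = key_fn(token, pharmacy)
        if key not in latest or fill_date > latest[key]:
            latest[key] = fill_date
    return latest


def discontinued(latest):
    """Keys whose last fill is more than GAP_DAYS before the end of the study window."""
    return {key for key, last in latest.items() if (STUDY_END - last).days > GAP_DAYS}


# Siloed view: each pharmacy's data is analyzed on its own.
siloed = discontinued(last_fill_by(dispenses, lambda token, pharmacy: (token, pharmacy)))

# Linked view: records are grouped by the shared token across pharmacies.
linked = discontinued(last_fill_by(dispenses, lambda token, pharmacy: token))
```

Analyzed pharmacy by pharmacy, patient `tok_a` appears to have dropped off at `pharmacy_1`; grouped by the shared token, the same patient is shown to have switched to `pharmacy_2`, and only `tok_b` is flagged as discontinued.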
That’s just one application of a Linkable Data Infrastructure (LDI).
Commercial teams can take advantage of an LDI to enhance:
- Brand analytics & marketing strategy by linking specialty pharmacy data to third-party real-world data;
- HEOR & market access by linking trial data with third-party real-world data; and
- Commercial targeting & measurement by linking real-world data to improve HCP targeting and measure the effectiveness of promotional spend.
How to Establish a Linkable Data Infrastructure
Establishing an LDI doesn’t require building anything new, but rather taking advantage of existing resources. The first step is working with technology that is already standard across the industry.
Datavant, for example, is the most widely used privacy protection and connectivity partner in healthcare, and has built the largest ecosystem of linkable real-world data in the U.S. More than 350 data sources (including all of the top claims, lab, and EHR data providers, plus mortality data, specialty pharmacies, academic medical centers, and social determinants of health data sources) already use Datavant to tokenize their data.
That gives pharma the missing “key” to link data across all their sources.
In the open data ecosystem Datavant has built, each party is empowered to freely work with other data providers and link the data most relevant to their use case. This means that pharma companies aren’t limited to buying off-the-shelf cuts of siloed data, and can enhance what they’re already purchasing from aggregators.
The Next Frontier: Linking to Real-World Data for Clinical Development
One of the biggest remaining data silos is clinical trial data.
Until recently, the technical challenges of tokenizing and linking trial data without unblinding the study made it impossible to connect trial data to other patient-level data. Datavant has partnered with companies across the industry (including Janssen, Parexel, Medable, Medidata, and TriNetX) to solve this problem, and pharma companies are now starting to use an LDI to dismantle the silos between real-world data and clinical trials.
By linking real-world data to clinical trial data, pharma companies can conduct smarter subcohort analysis, use real-world data to passively and more efficiently collect data for post-marketing studies, and start gathering evidence to support reimbursement much sooner.
With an LDI, you could link data from a patient’s Phase III trial results with the same patient’s real-world claims, lab data, and more. You could also link your data from their interactions with your patient support programs and specialty pharmacies for a comprehensive, longitudinal view.
Drug development and commercialization have traditionally relied on large amounts of data. Continued innovation depends not on generating or buying even more data, but on connecting the dots.
More and more companies across the industry are connecting their data silos and establishing a linkable data infrastructure for a comprehensive, longitudinal view of patients. That means bringing more life-saving therapies to market, getting them to the right patients faster, and improving patient outcomes for generations.
About Datavant:
Datavant’s mission is to connect the world’s health data to improve patient outcomes. Datavant works to reduce the friction of data sharing across the healthcare industry by building technology that protects the privacy of patients, while supporting the linkage of de-identified patient records across datasets. Datavant is headquartered in San Francisco. Learn more about Datavant at www.datavant.com.