February 2020 Edition Vol.11, Issue 2

Big Data Holds Promise in Healthcare Delivery

By Lynne Lederman, PhD

The Presidential Symposium at the 2019 American Society of Hematology (ASH) Annual Meeting addressed the use of real-world data (RWD) and deep learning artificial intelligence (AI) systems in hematology research and the delivery of personalized healthcare.

Registries Improve Donor Selection and Recipient Survival

Mary M. Horowitz, MD, Center for International Blood and Marrow Transplant Research (CIBMTR), Medical College of Wisconsin, Milwaukee, WI, discussed how a database of RWD provided real world evidence (RWE) to improve the outcomes of blood and marrow transplantation (BMT).

RWD are data relating to patient health status and/or the delivery of healthcare that are routinely collected from a variety of sources, such as patient and disease registries. RWE is the evidence derived from analyzing RWD about the potential use and benefits or risks of a medical product. RWE can be generated by different study designs, including randomized trials and observational studies.

Dr. Horowitz showed how analysis of registry data improved the outcomes of unrelated donor transplants, which account for about two-thirds of the transplants performed. In the 1990s, one-year survival after unrelated donor transplantation was less than 40%, substantially lower than that for human leukocyte antigen (HLA)-identical sibling transplants.

One-year overall survival after unrelated donor transplantation has since improved to over 70%, with the odds of survival increasing by 8% per year on average between 1990 and 2015 (95% CI, 7% to 9%), adjusted for disease status, age, and performance status. Dr. Horowitz attributes more than half of this improvement to better donor selection.
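To see how a small yearly increase in odds adds up over 25 years, the following minimal Python sketch compounds a constant annual increase in the odds of survival and converts the result back to a probability. It uses the two confidence-interval bounds quoted above (7% and 9%) and an illustrative starting survival of 40%. The function name is hypothetical, and because the reported figure is adjusted for disease status, age, and performance status, the output is only a rough illustration, not a reproduction of the registry analysis.

```python
def compound_odds_to_probability(start_probability, annual_odds_increase, years):
    """Apply a constant yearly multiplicative increase in odds; return a probability."""
    # Convert the starting probability to odds: p / (1 - p).
    start_odds = start_probability / (1.0 - start_probability)
    # Odds grow multiplicatively, e.g. by a factor of 1.07 per year for a 7% increase.
    end_odds = start_odds * (1.0 + annual_odds_increase) ** years
    # Convert the final odds back to a probability: odds / (1 + odds).
    return end_odds / (1.0 + end_odds)


# Illustrative inputs (assumptions): 40% starting one-year survival and
# 25 years (1990 to 2015), evaluated at both confidence-interval bounds.
for rate in (0.07, 0.09):
    survival = compound_odds_to_probability(0.40, rate, 25)
    print(f"{rate:.0%} per year -> {survival:.1%} one-year survival")
```

Under these assumptions, both bounds yield a final survival above 70%, which is in line with the improvement described.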

Conventional wisdom in the 1990s was that HLA-A, B, DR, and DQ were the most important loci to match. Mismatches detectable at the serologic or intermediate resolution level were more important than mismatches only detectable at the allelic or high resolution level. Donor age was not important, or not as important as donor-recipient ABO and sex-match and CMV status. “It took us a long time to have enough data to challenge this conventional wisdom,” she said.

Analysis of data from nearly 2,000 unrelated donor-recipient pairs by 2004 suggested that HLA-C, for which many weren’t typing, was indeed important. This was confirmed in a later, larger study. Registry data likewise showed mismatches that were only detectable at high resolution were important, whereas DQ matching was not.

Some studies have identified mismatches that are permissible and others that have adverse effects. Multiple other studies have confirmed that donor age is the single most important characteristic after HLA that is associated with outcome.

All of these findings have changed practice, resulting in the increasing use of matching for HLA-C, increased DP typing, and a gradual decrease in the median donor age from 38 years in the early 1990s to 28 years today.

The International Bone Marrow Transplant Registry (IBMTR) was established at the Medical College of Wisconsin in 1972, four years after the first successful transplants. In 1986, when it became clear that large unrelated donor panels were needed to extend transplants to people who didn’t have matches within their families, the National Marrow Donor Program (NMDP), now known as Be the Match, was established in the US. The NMDP set up an outcomes registry to track the results of the transplants and a research repository to store donor and recipient samples. In 2004 the IBMTR and NMDP patient outcomes registries were consolidated into the CIBMTR, which conducts research and clinical trials. Many other national and international BMT registries continue to be created.

Dr. Horowitz pointed out that data sharing and analyses from collaborations of bone marrow transplant registries provide proof of principle for hematology, which is a field of many rare diseases. Cancers in general are also becoming collections of rare diseases as the use of molecular profiling is increasing.

Finding Gold in the Mess

Russ B. Altman, MD, PhD, Stanford University, Stanford, CA, a self-identified general internist, discussed the use of pharmacogenomics in the clinical setting. He introduced the concept of deep learning for interpreting rare genetic variants. He also presented the use of AI for mining social media data for patient drug experiences.

Altman’s lab has been working on a pharmacogenetics knowledge base (www.pharmgkb.org) for the past 20 years. PharmGKB aims to catalog all known human genetic variations, mostly germline, that affect drug response. The PharmGKB database of pathways of drug action and metabolism is curated manually.

Dr. Altman’s clinic tests for about 200 loci responsible for drug response. If patients are taking multiple drugs, their primary physicians may need assistance in figuring out potential side effects or lack of response.

Opioids, including oxycodone, tramadol, and codeine, for example, are metabolized by the enzyme cytochrome P450 2D6 (CYP2D6). Codeine is inactive until CYP2D6 converts it into morphine by demethylation at one oxygen. Polymorphisms in CYP2D6 affect patients’ ability to metabolize codeine, and therefore patients have very different experiences with the drug.

Dr. Altman saw a patient who had experienced no pain relief during prior surgical procedures. As a poor CYP2D6 metabolizer, the patient would not be expected to experience relief from oxycodone, tramadol, or codeine. Altman recommended Dilaudid (hydromorphone) for pain relief, as it does not require CYP2D6 metabolism for its effects. The patient convinced her surgeon to use Dilaudid, which resulted in good post-surgical pain relief.
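The reasoning in this case can be sketched as a simple lookup. The Python snippet below is a minimal illustration, not clinical guidance and not the logic used in Dr. Altman’s clinic. It uses only the three drugs named above as relying on CYP2D6, and the function name and the "poor" phenotype label are hypothetical choices made for the example.

```python
# Drugs the article names as depending on CYP2D6 metabolism.
CYP2D6_DEPENDENT_OPIOIDS = {"oxycodone", "tramadol", "codeine"}


def drugs_to_review(medications, cyp2d6_phenotype):
    """Return the medications that may warrant review for a CYP2D6 poor metabolizer.

    The phenotype label "poor" is an assumption of this sketch; real
    pharmacogenetic reports use more detailed categories.
    """
    if cyp2d6_phenotype != "poor":
        return []
    # Compare case-insensitively so "Codeine" and "codeine" are treated alike.
    return [m for m in medications if m.lower() in CYP2D6_DEPENDENT_OPIOIDS]


# A poor metabolizer prescribed codeine is flagged; Dilaudid is not.
print(drugs_to_review(["Codeine", "Dilaudid"], "poor"))  # ['Codeine']
```

A real decision-support tool would cover many more genes and drugs and would defer to published dosing guidelines rather than a hard-coded set.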

“When a patient tells you that codeine doesn’t work, you might want to believe them and not think they are a drug seeker,” he said.

However, most payers don’t reimburse for pharmacogenetic testing; those that do reimburse do so in limited circumstances.

Nearly a quarter of the US population has a compromised ability to metabolize opioids. It’s not just opioids that are affected; up to 20% of all drugs commonly used are metabolized by CYP2D6. Of subjects in the UK Biobank, 11% are taking at least one drug metabolized by CYP2D6. More than half of the observed variants in the gene for CYP2D6 have no known function. In addition, available sequence data and metabolic activity information are limited.

Altman’s lab created a deep learning system to predict the function of CYP2D6 variants. Its predictions agreed with known function where data were available, and the lab plans to apply the system to other genes to predict the effects of rare variations.

They are also using AI to mine social media for patient drug experiences. The potential benefits of social media are that it’s immediate, it offers increased granularity, and patients talk about things their physicians don’t see or know.

“It’s a mess, but it might be a mess with some gold,” Altman said. One problem is that there is a semantic gap between terminology and approaches of the medical establishment and what people say on social media.

One of Altman’s graduate students, Adam Lavertu, is creating networks for drugs of interest from information on Reddit, where there are billions of mentions of every drug anyone could take. The networks include hundreds of drug names, whether they are misspellings, phonetic spellings, slang, or pill descriptions.
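As a rough illustration of what bridging this gap involves, the Python sketch below maps informal drug mentions onto canonical names using a small slang table and fuzzy string matching from the standard library’s difflib module. This is not Lavertu’s method, which the talk did not describe in detail; the slang entry, the similarity cutoff, and the function name are all assumptions made for the example.

```python
import difflib

# Canonical names taken from the opioids mentioned earlier in the article.
CANONICAL_DRUGS = ["oxycodone", "tramadol", "codeine"]

# Hypothetical slang table; a real one would be learned from data.
SLANG_TO_CANONICAL = {"oxy": "oxycodone"}


def normalize_drug_mention(token, cutoff=0.8):
    """Map a raw token to a canonical drug name, or None if nothing matches."""
    word = token.lower().strip()
    if word in CANONICAL_DRUGS:
        return word
    if word in SLANG_TO_CANONICAL:
        return SLANG_TO_CANONICAL[word]
    # Fuzzy matching catches simple misspellings such as "codiene".
    matches = difflib.get_close_matches(word, CANONICAL_DRUGS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for mention in ["codiene", "Oxy", "tramadal", "aspirin"]:
    print(mention, "->", normalize_drug_mention(mention))
```

String similarity alone cannot handle phonetic spellings or pill descriptions, which is one reason approaches that learn from how terms are used in context are of interest.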

He has been able to determine associations of adverse drug reactions and their severity, even though severity is not specifically mentioned. The group is planning to use this method to look at drug-drug interactions and identify those that are severe.

“Patients are telling us about their drug experiences on social media, we just have to listen,” Altman concluded.

FDA Looking Ahead

Amy Abernethy, MD, PhD, Principal Deputy Commissioner, Chief Information Officer (Acting), U.S. Food and Drug Administration, Silver Spring, MD, discussed the role of data, analytics, and policy in accelerating therapeutic development and optimizing healthcare. She suggested a learning healthcare system for cancer care, in which each individual’s care is informed by the experience of those previously treated and in turn informs the treatment and care of those who come after.

This learning healthcare system would be based on data routinely collected in real-world settings. The system would learn by analyzing captured information, generate evidence, and implement new insights into subsequent clinical care. The objectives are to generate and apply the best evidence relevant to each patient and across populations. Additionally, it would support continuous healthcare improvement, spark innovation, enhance patient safety, and maximize healthcare value, while propelling scientific discovery as a natural outgrowth of patient care.

Policies that are designed to drive this vision forward include the 21st Century Cures Act, which is establishing a national agenda to modernize clinical evidence development, including streamlined clinical trial designs powered by informatics and the use of RWE derived from electronic health records (EHRs) and other sources, as well as increasing the NIH budget.

Abernethy said that the FDA has a framework for its RWE program, with a guideline to be issued in 2021, as well as evaluation of RWD/RWE for use in regulatory decisions. RWE may be used for the post-marketing experience and label expansion.

Meanwhile, the FDA technology modernization action plan (TMAP), which began this past September, will build technical underpinnings and other capabilities to scale. It will address the need to modernize technical infrastructure and enhance capabilities to develop technology products that support the FDA’s regulatory mission. TMAP would also communicate and collaborate with stakeholders to drive technological progress that is interoperable across the system.

Final Thoughts

The two issues that dominated the discussion portion of the session were privacy and data quality. The consensus was that it is currently impossible to build a complete database that can be continually updated yet maintain patient privacy. The public clearly needs to be included in the conversation.

Given that it is not possible to verify how “true” social media-derived data are, denominators are unknown, and signals may be artificially inflated as comments are spread, Altman thinks these data are best used for hypothesis generation.

“It’s easy to see that social media data have quality issues. Our electronic health records also have quality issues. We have the ability to improve our EHR without making clinicians click a thousand times more a day,” Horowitz concluded to loud applause.

In response to a question about how to address both intentional and unintentional bias in AI, Altman said there are very smart people actively working on algorithmic approaches to identifying and removing bias in AI. Bias can be measured and subtracted out in other systems, so Altman is hopeful that this can be done in healthcare, although he said we are right now in the period of the worst, most biased AI systems that we will ever see. People are excitedly applying all the data to build classifiers that couldn’t be more biased. “Don’t let them treat you with an AI system right away,” he cautioned.

