Thursday, October 21, 2021

Machine Learning Can Make Lab Testing More Precise

An analysis of over 2 billion lab test results suggests a deep learning model can help create personalized reference ranges, which in turn would enable clinicians to monitor health and disease better.

Paul Cerrato, MA, senior research analyst and communications specialist, Mayo Clinic Platform and John Halamka, M.D., president, Mayo Clinic Platform, wrote this article.

Almost every patient has blood drawn to measure a variety of metabolic markers. Typically, test results come back as a numeric or text value accompanied by a reference range that represents normal values. If total serum cholesterol level is below 200 mg/dl or serum thyroid hormone level is 4.5 to 12.0 mcg/dl, clinicians and patients assume all is well. But suppose Helen’s safe zone varies significantly from Mary’s safe zone. If that were the case, it would suggest a one-size-fits-all reference range misrepresents an individual’s health status. That position is supported by studies that found the distribution of more than half of all lab test results, which rely on standard reference ranges, differs when personal characteristics are considered.1

With these concerns in mind, Israeli investigators from the Weizmann Institute and Tel Aviv Sourasky Medical Center extracted data on 2.1 billion lab measurements from EHRs, taken from 2.8 million adults for 92 different lab tests. Their goal was to create “data-driven reference ranges that consider age, sex, ethnicity, disease status, and other relevant characteristics.”1 To accomplish that goal, they used machine learning and computational modeling to segment patients into different “bins” based on health status, medication intake, and chronic disease.2 That in turn left the team with about half a billion lab results from the initial 2.8 million people, which they used to model a set of reference lab values that more precisely reflected the ranges of healthy persons. Those ranges could then be used to predict patients’ “future lab abnormalities and subsequent disease.”
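To make the idea of binned reference ranges concrete, here is a minimal sketch. It is not the investigators' model, which relied on machine learning over many more characteristics; it only shows the simplest version of the idea, taking the central 95% of results within each (sex, age band) group of healthy people. The `age_band` cut points, the function names, and the sample values are all assumptions invented for illustration.

```python
from collections import defaultdict
from statistics import quantiles


def age_band(age):
    """Map an age in years to a coarse, hypothetical age band."""
    if age < 40:
        return "18-39"
    if age < 65:
        return "40-64"
    return "65+"


def reference_ranges(records):
    """Return {(sex, age band): (low, high)} using the central 95% of each group.

    `records` is an iterable of (sex, age, value) tuples taken from people
    already judged healthy for the test in question.
    """
    groups = defaultdict(list)
    for sex, age, value in records:
        groups[(sex, age_band(age))].append(value)

    ranges = {}
    for key, values in groups.items():
        if len(values) < 2:
            continue  # quantiles() needs at least two data points
        # n=40 gives cut points every 2.5%; the first and last are the
        # 2.5th and 97.5th percentiles.
        cuts = quantiles(values, n=40, method="inclusive")
        ranges[key] = (cuts[0], cuts[-1])
    return ranges


# Synthetic hemoglobin-like values (g/dl), invented for illustration only.
women = [("F", 30, 12.0 + 0.1 * i) for i in range(41)]
men = [("M", 30, 13.5 + 0.1 * i) for i in range(41)]
ranges = reference_ranges(women + men)
for group, (low, high) in sorted(ranges.items()):
    print(group, round(low, 2), round(high, 2))
```

Even this crude stratification yields a different "normal" interval for each group, which is the point the article makes about Helen and Mary.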

Taking their investigation one step further, Cohen et al. used their new algorithms to evaluate the risk of specific disorders among healthy individuals. When they looked at anemia cutoffs like hemoglobin and mean corpuscular volume, a measurement of red blood cell size, their newly created risk calculators were able to separate anemic patients into groups at high risk for microcytic and macrocytic anemia from those with a risk no higher than the average nonanemic population. Similar benefits were observed when the researchers applied their models to prediabetes: “…using a personalized risk model, we can improve the classification of patients who are prediabetic and identify patients at risk 2 years earlier compared to classification based merely on current glucose levels.”

William Morice, M.D., Ph.D., chair of the Department of Laboratory Medicine and Pathology (DLMP) at Mayo Clinic and president of Mayo Clinic Laboratories, immediately saw the value of this type of data analysis: “In the ‘era of big data and analytics,’ it is almost unconscionable that we still use ‘normal reference ranges’ that lack contextual data, and possibly statistical power, to guide clinicians in the clinical interpretation of quantitative lab results. I was taught this by Dr. Piero Rinaldo, a medical geneticist in our department and a pioneer in this field, who focuses on its application to screening for inborn errors of metabolism. He has developed an elegant tool that is now used globally for this application, Collaborative Laboratory Integrated Reports (CLIR).”

During a recent conversation with Piero Rinaldo, M.D., Ph.D., he explained that Mayo Clinic has been using a more personalized approach to lab testing since 2015 and stated that “CLIR is a shovel-ready software for the creation of collaborative precision reference ranges.” The web-based application has been used to create several personalized data sets that can improve clinicians’ interpretation of lab test results. It has been deployed by Dr. Rinaldo and his associates to improve the screening of newborns for congenital hypothyroidism.3 The software performs multivariate pattern recognition on lab values collected from 7 programs, including more than 1.9 million lab test results. CLIR is able to integrate covariate-adjusted results of different tests into a set of customized interpretive tools that physicians can use to better distinguish between false positive and true positive test results.


References

1. Tang A, Oskotsky T, Sirota M. Personalizing routine lab tests with machine learning. Nature Medicine. 2021;27:1510-1517.

2. Cohen N, Schwartzman O, Jaschek R, et al. Personalized lab test models to quantify disease potentials in healthy individuals. Nature Medicine. 2021;27:1582-1591.

3. Rowe AD, Stoway SD, Ahlman H, et al. A Novel Approach to Improve Newborn Screening for Congenital Hypothyroidism by Integrating Covariate-Adjusted Results of Different Tests into CLIR Customized Interpretive Tools. Int J Neonatal Screen. 2021;7:23. https://doi.org/10.3390/ijns7020023

Wednesday, October 13, 2021

Gastroenterology Embraces Artificial Intelligence

AI and machine learning have the potential to redefine the management of several GI disorders.


John Halamka, M.D., president, Mayo Clinic Platform, and Paul Cerrato, senior research analyst and communications specialist, Mayo Clinic Platform, wrote this article.

Colonoscopy is one of the true success stories in modern medicine. Studies have demonstrated that colonoscopy screening detects the cancer at a much earlier stage, reducing the risk of invasive tumors and metastatic disease, and reducing mortality. However, while colorectal cancer is highly preventable, it is the third leading cause of cancer-related deaths in the U.S. About 148,000 individuals develop the malignancy and over 53,000 die from it each year. We asked ourselves a question: can AI improve the detection of this and related gastrointestinal disorders?

As we explained in The Digital Reconstruction of Healthcare, one of the challenges in making an accurate diagnosis of GI disease is differentiating between disorders that look similar at the cellular level. For example, because environmental enteropathy and celiac disease overlap histopathologically, deep learning algorithms have been designed to analyze biopsy slides to detect the subtle differences between the two conditions. Syed et al.1 used a combination of convolutional and deconvolutional neural networks in a prospective analysis of over 3,000 biopsy images from 102 children. They were able to tell the differences between environmental enteropathy, celiac disease, and normal controls with an accuracy rating of 93.4%, and a false negative rate of 2.4%. Most of these mistakes occurred when comparing celiac patients to healthy controls.

The investigators also identified several biomarkers that may help separate the two GI disorders: interleukin 9, interleukin 6, interleukin 1b, and interferon-induced protein 10 were all helpful in making an accurate prediction regarding the correct diagnosis. The potential benefits to this deep learning approach become obvious when one considers the arduous process that patients have to endure to reach a definitive diagnosis of either disorder: typically, they must undergo 4 to 6 biopsies and may need several endoscopic procedures to sample various sections of the intestinal tract because the disorder may affect only specific areas along the lining and leave other areas intact.

Several randomized controlled trials have been conducted to support the use of ML in gastroenterology. Chinese investigators, working in conjunction with Beth Israel Deaconess Medical Center and Harvard Medical School, tested a convolutional neural network to determine if it was capable of improving the detection of precancerous colorectal polyps in real time.2 The need for a better system of detecting these growths is evident, given the fact that more than 1 in 4 adenomas are missed during colonoscopies. To address the problem, Wang et al. randomized more than 500 patients to routine colonoscopy and more than 500 to computer-assisted colonoscopies. In the final analysis, the adenoma detection rate (ADR) was higher in the ML-assisted group (29.1% vs. 20.3%, P < 0.001). The higher ADR occurred because the algorithm was capable of detecting a greater number of smaller adenomas (185 vs. 102). There were no significant differences in the detection of large polyps.
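The size of the ADR gap is easier to judge when it is expressed in both absolute and relative terms. The short calculation below uses only the figures quoted above; the helper name `compare_rates` is ours, and because exact group sizes are not given here, no significance test is attempted.

```python
def compare_rates(assisted_pct, routine_pct):
    """Return (absolute difference in percentage points, relative increase in %)."""
    absolute = assisted_pct - routine_pct
    relative = absolute / routine_pct * 100
    return absolute, relative


# Adenoma detection rates reported by Wang et al.
adr_absolute, adr_relative = compare_rates(29.1, 20.3)

# Counts of smaller adenomas found in each arm.
small_adenoma_ratio = 185 / 102

print(f"ADR difference: {adr_absolute:.1f} percentage points")
print(f"Relative increase in ADR: {adr_relative:.1f}%")
print(f"Small adenomas, assisted vs. routine: {small_adenoma_ratio:.2f}x")
```

An 8.8-point absolute gain corresponds to a relative increase of a little over 40%, which is why the distinction between the two measures matters when such trials are summarized.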

Nayantara Coelho-Prabhu, M.D., a gastroenterologist at Mayo Clinic, points out, however, that the clinical relevance of detection of diminutive polyps remains to be determined. “Yet, there is definite clinical importance in the subsequent development of computer assisted diagnosis (CADx) or polyp characterization algorithms. These will help clinicians determine clinically relevant polyps, and possibly advance the resect and discard practice. It also will help clinicians adequately assess margins of polyps, so that complete removal can be achieved, thus decreasing future recurrences.”

Randomized clinical trials demonstrated that a convolutional neural network in combination with deep reinforcement learning (collectively called the WISENSE system) can reduce the number of blind spots during endoscopy intended to evaluate the esophagus, stomach, and duodenum in real time. “A total of 324 patients were recruited and randomized; 153 and 150 patients were analysed in the WISENSE and control group, respectively. Blind spot rate was lower in WISENSE group compared with the control (5.86% vs 22.46%, p<0.001) . . .”3

Mayo Clinic’s Endoscopy Center, utilizing Mayo Clinic Platform’s resources, has also been exploring the value of machine learning in GI care with the assistance of Endonet, a comprehensive library of endoscopic videos and images, linked to clinical data including symptoms, diagnoses, pathology, and radiology. These data will include unedited full-length videos as well as video summaries of the procedure including landmarks, specific abnormalities, and anatomical identifiers. Dr. Coelho-Prabhu explains that the idea is to have different user interfaces: 

“From the patient’s perspective, it will serve as an electronic video record of all their procedures, and future procedures can be tailored to survey prior abnormal areas as needed.

From a research perspective, this will be a diverse and rich library including large volumes of specialized populations such as Barrett’s esophagus, inflammatory bowel disease, familial polyposis syndromes. The additional strength is that Mayo Clinic provides highly specialized care, especially to these select populations. We can develop AI algorithms to advance medical care using this library. From a hospital system perspective, this would serve as a reference library, guiding endoscopists, including for advanced therapeutic procedures in the future. It also could be used to measure and monitor quality indicators in endoscopy. From an educational standpoint, this library can be developed into a teaching set for both trainee and advanced practitioners looking for CME opportunities. From industry perspective, this database could be used to train/validate commercial AI algorithms.”

AI and machine learning may not be the panacea some technology enthusiasts imagine them to be, but there’s little doubt they are becoming important partners on the road to more personalized patient care.


References

1. Syed S, Al-Bone M, Khan MN, et al. Assessment of machine learning detection of environmental enteropathy and celiac disease in children. JAMA Network Open. 2019;2:e195822.

2. Wang P, Berzin TM, Brown JR, et al. Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study. Gut. 2019;68:1813–1819.

3. Wu L, Zhang J, Zhou W, et al. Randomised controlled trial of WISENSE, a real-time quality improving system for monitoring blind spots during esophagogastroduodenoscopy. Gut. 2019;68:2161–2169.

Wednesday, October 6, 2021

Societal Resilience Requires a Public Health Focus

We must make a serious commitment to increase financial resources and provide better analytics for real world evidence/real time data in support of public health.


John Halamka, M.D., president, Mayo Clinic Platform, and Paul Cerrato, senior research analyst and communications specialist, Mayo Clinic Platform, wrote this article.

Public health has been underfunded for decades. That neglect has had a profound impact since the COVID-19 pandemic has taken hold, and awakened policy makers and thought leaders to the need for more investment.

Consider the statistics: The U.S. spends about $3.6 trillion each year on health but less than 3% of that amount on public health and prevention. A 2020 Forbes report likewise pointed out that “From the late 1960s to the 2010s, the federal share of total health expenditure for public health dropped from 45 percent to 15 percent.” This relative indifference to public health is partly responsible for the nation’s mixed response to the SARS-CoV-2 pandemic. A recent McKinsey & Company analysis concluded: “Government leaders remain focused on navigating the current crisis, but making smart investments now can both enhance the ongoing COVID-19 response and strengthen public-health systems to reduce the chance of future pandemics. Investments in public health and other public goods are sorely undervalued; investments in preventive measures, whose success is invisible, even more so.”

Among the other “public goods” that require more investment is population health management and analytics. Although experts continue to debate the differences between public health and population health, most of those differences are unimportant. For our purposes, population health refers to the status of a specific group of individuals, whether they reside in a specific city, state, or country. Public health usually casts a wider net, concerned about the status of the entire population. Managing the health of these subgroups requires an analytical approach that can take into account a long list of variables, including social determinants of health (SDoH), the content of their medical records, and much more. SDoH data from Change Healthcare, for instance, has demonstrated that the economic stability index (ESI) is a strong predictor of health care utilization. ESI is a cluster model that uses market behavior and financial attitudes to group individuals into one of 30 categories, with category 1 representing persons most likely to be economically stable and category 30 least likely to be stable. The figure, which links race, ESI and health care utilization in Kentucky, suggests that Blacks/African Americans are far less likely to be economically stable (category 1). The same analysis found that Blacks/African Americans were about 1.7 times as likely to use the ED as Whites (30.5% vs 18.1%). A growing number of health care organizations are starting to see the value of such population health metrics and are incorporating these statistics into their decision making.

Among the valuable sources of data that can inform population health are patient surveys, clinical registries, and EHRs. Several traditional analytics tools are available to extract actionable insights from these data sources, including logistic regression. Over the decades, several major studies have also generated risk scoring systems to improve public health. The Framingham heart health risk score has been used for many years to assess the likelihood of developing cardiovascular disease over a 10-year period. Because the scoring system can help predict the onset of heart disease, it can also serve as a useful tool in creating population-based preventive programs to reduce that risk. The tool requires patients to provide their age, gender, smoking status, total cholesterol, HDL cholesterol, systolic blood pressure, and whether they are taking antihypertensive medication. The American Diabetes Association has developed its own risk scoring method to assess the likelihood of type 2 diabetes in the population. The tool takes into account age, gender, history of gestational diabetes, physical activity level, family history of diabetes, hypertension, height and weight. Another analytics methodology that has value in population health is the LACE Index. The acronym stands for length of stay, acuity of admission, Charlson comorbidity index (CCI), and number of emergency department visits in the preceding 6 months. More recently, several AI-based analytic tools have come into use to improve population health. A review of ML-related analytic methods found that neural network-based algorithms are the most commonly used (41%) in this context, compared to 25.5% for support vector machines, and 21% for random forest modeling.
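The four LACE components described above combine into a single additive score. The sketch below is for illustration only: the point values are our recollection of the commonly cited published scoring (total range 0 to 19) and are an assumption that should be checked against the original publication before any real use. The function and parameter names are ours.

```python
def lace_score(length_of_stay_days, acute_admission, charlson_index, ed_visits_6mo):
    """Compute a LACE score from its four components.

    Point values are assumed from the commonly cited published index;
    verify them against the original source before relying on them.
    """
    # L: length of stay in days
    if length_of_stay_days < 1:
        l_points = 0
    elif length_of_stay_days <= 3:
        l_points = int(length_of_stay_days)
    elif length_of_stay_days <= 6:
        l_points = 4
    elif length_of_stay_days <= 13:
        l_points = 5
    else:
        l_points = 7

    # A: acuity of admission (emergent vs. elective)
    a_points = 3 if acute_admission else 0

    # C: Charlson comorbidity index, with scores of 4 or more mapped to 5 points
    c_points = charlson_index if charlson_index <= 3 else 5

    # E: emergency department visits in the preceding 6 months, capped at 4
    e_points = min(ed_visits_6mo, 4)

    return l_points + a_points + c_points + e_points


# Hypothetical patient: 5-day emergent stay, Charlson index 2, one prior ED visit.
example_score = lace_score(5, True, 2, 1)
print("LACE score:", example_score)
```

Because every input is routinely available at discharge, a score like this can be computed across an entire discharged population rather than one patient at a time.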

There is no way of knowing how the world would have coped with COVID-19 had policy makers fully invested in public and population health programs and analytics. But there’s little doubt that we’ll all fare much better during the next health crisis if we put more time, energy, and resources into these initiatives.

Thursday, September 30, 2021

Reimagining the FDA’s Role in Digital Medicine

In addition to evaluating the safety of software as a medical device (SaMD), the agency needs to devote more resources to evaluating its efficacy and quality.

John Halamka, M.D., president, Mayo Clinic Platform, and Paul Cerrato, senior research analyst and communications specialist, Mayo Clinic Platform, wrote this article.

The FDA’s approach to software as a medical device (SaMD) has been evolving. Consider a few examples.

In 2018, IDx-DR, a software system used to improve screening for retinopathy, a common complication of diabetes that affects the eye, became the first AI-based medical device to receive US Food and Drug Administration clearance to “detect greater than a mild level of … diabetic retinopathy in adults who have diabetes.” To arrive at that decision, the agency not only reviewed data to establish its safety, it also took into account prospective studies, an essential form of evidence that clinicians look for when trying to decide if a device or product is worth using. The software was the first medical device approved by the FDA that does not require the services of a specialist to interpret the results, making it a useful tool for health care providers who may not normally be involved in eye care. The FDA clearance emphasized the fact that IDx-DR is a screening tool not a diagnostic tool, stating that patients with positive results should be referred to an eye care professional. The algorithm built into the IDx-DR system is intended to be used with the Topcon NW400 retinal camera and a cloud server that contains the software.

Similarly, the FDA looked at a randomized prospective trial before approval of a machine learning-based algorithm that can help endoscopists improve their ability to detect smaller, easily missed colonic polyps. Its recent clearance of GI Genius by Medtronic was based on a clinical trial published in Gastroenterology, in which investigators in Italy evaluated data from 685 patients, comparing a group that underwent the procedure with the help of the computer-aided detection (CADe) system to a group who acted as controls. Repici et al. found that the adenoma detection rate was significantly higher in the CADe group, as was the detection rate for polyps 5 mm or smaller, which led to the conclusion: “Including CADe in colonoscopy examinations increases detection of adenomas without affecting safety.”

Their findings raise several questions: is it reasonable to assume that a study of 600+ Italians would apply to a U.S. population, which has different demographic characteristics? More importantly, were the 685 patients representative of the general public, including adequate numbers of persons of color and those in lower socioeconomic groups? While the Gastroenterology study did report enough female patients, there is no mention of these other marginalized groups.  

An independent 2021 analysis of FDA approvals has likewise raised several concerns about the effectiveness and equity of several recently approved AI algorithms. Eric Wu from Stanford University and his colleagues examined the FDA’s clearance of 130 devices and found the vast majority were approved based on retrospective studies (126 of 130). And when they separated all 130 devices into low- and high-risk subgroups using FDA guidelines, they found none of the 54 high-risk devices had been evaluated by prospective trials. Other shortcomings documented in Wu’s analysis included the following:

  • Of the 130 approved products, 93 did not report multi-site evaluation.
  • Fifty-nine of the approved AI devices included no mention of the sample size of the test population. 
  • Only 17 of the approved devices discussed a demographic subgroup. 
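Expressed as shares of the 130 devices, the counts above are easier to compare. The snippet below does nothing more than that arithmetic on the figures quoted from Wu's analysis; the dictionary labels are our own shorthand.

```python
TOTAL_DEVICES = 130

# Counts quoted in the text from the Stanford analysis.
findings = {
    "evaluated with retrospective studies only": 126,
    "no multi-site evaluation reported": 93,
    "no test-population sample size reported": 59,
    "demographic subgroup discussed": 17,
}

# Convert each count to a percentage of all 130 devices.
shares = {label: count / TOTAL_DEVICES * 100 for label, count in findings.items()}

for label, pct in shares.items():
    print(f"{label}: {pct:.1f}%")
```

Seen this way, nearly all of the devices rested on retrospective evidence, and only about one in eight discussed a demographic subgroup.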

We would certainly like to see the FDA take a more thorough approach to AI-based algorithm clearance, but in lieu of that, several leading academic medical centers, including Mayo Clinic, are contemplating a more holistic and comprehensive approach to algorithmic evaluation. It would include establishing a standard labeling schema to document the characteristics, behavior, efficacy, and equity of AI systems, to reveal the properties of systems necessary for stakeholders to assess them and build the trust necessary for safe adoption. The schema will also support assessment of the portability of systems to disparate datasets. The labeling schema will serve as an organizational framework that specifies the elements of the label. Label content will be specified in sections that will likely include:

  • model details such as name, developer, date of release, and version,
  • the intended use of the system,
  • performance measures,
  • accuracy metrics, and
  • training data and evaluation data characteristics
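One way to picture such a label is as a structured record with one field per section. The sketch below is purely hypothetical: the schema had not been finalized when this was written, so the class name, field names, and example values are assumptions of ours and not an actual Mayo Clinic or FDA format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ModelLabel:
    """Hypothetical label record mirroring the sections listed above."""
    name: str
    developer: str
    release_date: str
    version: str
    intended_use: str
    performance_measures: dict = field(default_factory=dict)
    accuracy_metrics: dict = field(default_factory=dict)
    training_data: dict = field(default_factory=dict)
    evaluation_data: dict = field(default_factory=dict)

    def missing_sections(self):
        """List the sections that were left empty, so gaps are visible to reviewers."""
        return [section for section, value in asdict(self).items() if not value]


# Placeholder values, invented for illustration only.
label = ModelLabel(
    name="example-polyp-detector",
    developer="Example Developer",
    release_date="2021-09-30",
    version="0.1",
    intended_use="Illustrative placeholder, not a real device",
    performance_measures={"frames_per_second": 25},
    accuracy_metrics={"sensitivity": 0.9},
    training_data={"sites": 1, "patients": 500},
)

print(json.dumps(asdict(label), indent=2))
print("Missing sections:", label.missing_sections())
```

A machine-readable label of this kind would let a reviewer see immediately which sections a developer left blank, such as the evaluation data in this example.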

While it makes no sense to sacrifice the good in pursuit of the perfect, the current regulatory framework for evaluating SaMD is far from perfect. Combining a more robust FDA approval process with the expertise of the world’s leading medical centers will offer our patients the best of both worlds.

Thursday, September 9, 2021

Secure Computing Enclaves Move Digital Medicine Forward

By providing a safe, secure environment, novel approaches enable health care innovators to share data without opening the door to snoopers and thieves.

John Halamka, M.D., president, Mayo Clinic Platform, and Paul Cerrato, senior research analyst and communications specialist, Mayo Clinic Platform, wrote this article.

We know that bringing together AI algorithms and data in ways that preserve privacy and intellectual property is one of the keys to delivering the next generation of clinical decision support. But meeting that challenge requires health care innovators to look to other innovators who themselves have created unique cybersecurity solutions. Among these “Think outside the box” solutions are products and services from vendors like TripleBlind, Verily, Beekeeper.AI/Microsoft, Terra, and Nvidia.

The concept of secure computing enclaves has been around for many years. Apple created its secure enclave, a subsystem built into its systems on a chip (SoC), which in turn is “an integrated circuit that incorporates multiple components into a single chip,” including an application processor, secure enclave, and other coprocessors. Apple explains that “The Secure Enclave is isolated from the main processor to provide an extra layer of security and is designed to keep sensitive user data secure even when the Application Processor kernel becomes compromised. It follows the same design principles as the SoC does—a boot ROM to establish a hardware root of trust, an AES [advanced encryption standard] engine for efficient and secure cryptographic operations, and protected memory. Although the Secure Enclave doesn’t include storage, it has a mechanism to store information securely on attached storage separate from the NAND flash storage that’s used by the Application Processor and operating system.” The secure enclave is embedded into the latest versions of Apple’s iPhone, iPad, Mac computers, Apple TV, Apple Watch, and HomePod.

While this security measure provides users with an extra layer of protection, because it’s a hardware-based solution, its uses are limited. With that in mind, several vendors have created software-based enclaves that are more readily adopted by customers. At Mayo Clinic Platform, we are deploying TripleBlind’s services to facilitate sharing data with our many external partners. It allows Mayo Clinic to test its algorithms using another organization’s data without either party losing control of its assets. Similarly, we can test an algorithm from one of our academic or commercial partners with Mayo Clinic data, or test an outside organization’s data with another outside organization’s data.

How is this “magic” performed? Of course, it’s always about the math. TripleBlind allows the use of distributed data that is accessed but never moved or revealed; it always remains one-way encrypted with no decryption possible. TripleBlind’s novel cryptographic approaches can operate on any type of data (structured or unstructured images, text, voice, video), and perform any operation, including training of and inferring from AI and ML algorithms. An organization’s data remains fully encrypted throughout the transaction, which means that a third party never sees the raw data because it is stored behind the data owner organization’s firewall. In fact, there is no decryption key available, ever. When two health care organizations partner to share data, for instance, TripleBlind software de-identifies their data via one-way encryption; then, both partners access each other’s one-way encrypted data through an Application Programming Interface (API). That means each partner can use the other’s data for training an algorithm, for example, which in turn allows them to generate a more generalizable, less biased algorithm. During a recent conversation with Riddhiman Das, CEO for TripleBlind, he explained: “To build robust algorithms, you want to be able to access diverse training data so that your model is accurate and can generalize to many types of data. Historically, health care organizations have had to send their data to one another to accomplish this goal, which creates unacceptable risks. TripleBlind performs one-way encryption from both interacting organizations, and because there is no decryption possible, you cannot reconstruct the data. In addition, the data can only be used by an algorithm for the specific purpose spelled out in the business agreement.”
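The details of TripleBlind’s cryptography are not spelled out here, so the following sketch does not attempt to reproduce them. Instead it uses additive secret sharing, a standard textbook technique, to illustrate the general idea that two organizations can compute a joint result without either one revealing its raw numbers. The hospital counts, the modulus, and all function names are assumptions made for illustration.

```python
import secrets

# A large modulus so individual shares look like random numbers (assumed value).
MODULUS = 2**61 - 1


def split_into_shares(value, n_parties=2):
    """Split `value` into random-looking shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last_share = (value - sum(shares)) % MODULUS
    return shares + [last_share]


def reconstruct(shares):
    """Recombine shares; any single share alone reveals nothing about the value."""
    return sum(shares) % MODULUS


# Two hypothetical hospitals each hold a private patient count.
hospital_a_count = 1200
hospital_b_count = 850

a_shares = split_into_shares(hospital_a_count)
b_shares = split_into_shares(hospital_b_count)

# Each computing party adds the one share it received from each hospital.
party_1_sum = (a_shares[0] + b_shares[0]) % MODULUS
party_2_sum = (a_shares[1] + b_shares[1]) % MODULUS

# Only the combined partial sums reveal the joint total.
joint_total = reconstruct([party_1_sum, party_2_sum])
print("Joint total:", joint_total)
```

Neither computing party ever holds both shares of one hospital's count, yet the joint total comes out correctly. Production systems build far richer operations, including model training, on top of primitives in this family.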

Developing innovative technological services is exciting work, with the potential to reshape the health care ecosystem worldwide. But along with the excitement is the challenge of keeping data safe and secure. Taking advantage of the many secure computing enclaves available on the market allows us to do just that.

Tuesday, August 31, 2021

Breast Cancer Screening: We Can Do Better

The three risk assessment tools now in use fall far short. Using the latest deep learning techniques, investigators are developing more personalized ways to locate women at high risk.


John Halamka, M.D., president, Mayo Clinic Platform, and Paul Cerrato, senior research analyst and communications specialist, Mayo Clinic Platform, wrote this article.

The promise of personalized medicine will eventually allow clinicians to offer individual patients more precise advice on prevention, early detection and treatment. Of course, the operative word is eventually. A closer examination of the screening tools available to detect breast cancer demonstrates that we still have a way to go before we can fulfill that promise. But with the help of better technology, we are getting closer to that realization.

Disease screening is about risk assessment. Researchers collect data on thousands of patients who develop breast cancer, for instance, and discover that the age range, family history and menstruation history of those who develop the disease differ significantly from those who remain free of it. That in turn allows policy makers to create a screening protocol that suggests women of a certain age who have experienced early menarche or late menopause are more likely to develop the malignancy. That risk assessment is consistent with the fact that more reproductive years means more exposure to the hormones that contribute to breast cancer. Similarly, there’s evidence to show that women with first degree relatives with the cancer and those with a history of ovarian cancer or HRT use are at greater risk.

Statistics like this are the basis for several breast cancer risk scoring systems, including the Gail score, the IBIS score, and the BCSC tool. The National Cancer Institute, which uses the Gail model, explains: “The Breast Cancer Risk Assessment Tool allows health professionals to estimate a woman’s risk of developing invasive breast cancer over the next 5 years and up to age 90 (lifetime risk). The tool uses a woman’s personal medical and reproductive history and the history of breast cancer among her first-degree relatives (mother, sisters, daughters) to estimate absolute breast cancer risk—her chance or probability of developing invasive breast cancer in a defined age interval.” While the screening tool saves lives, it can also be misleading. If, for example, it finds that a woman has a 1% likelihood of developing breast cancer, what that really means is a large population of women with those specific risk factors has a one in 100 risk of developing the disease. There is no way of knowing what the threat is for any one patient in that group. Similar problems exist for the International Breast Cancer Intervention Study (IBIS) score, based on the Tyrer-Cuzick Model, and the Breast Cancer Surveillance Consortium (BCSC) Risk Calculator. These 3 assessment tools can give patients a false sense of security if they don’t dive into the details. BCSC, for instance, cannot be applied to women younger than 35 or older than 74, nor does it accurately measure risk for anyone who has previously had ductal carcinoma in situ (DCIS), or had breast augmentation. Similarly, the NCI tool doesn’t accurately estimate risk in women with BRCA1 or BRCA2 mutations, as well as certain other subgroups.

During a conversation with Tufia Haddad, M.D., a Mayo Clinic medical oncologist with specialty interest in precision medicine in breast cancer and artificial intelligence, she discussed the research she and her colleagues are doing to improve the risk assessment process and identify more high-risk women. Dr. Haddad pointed out that there are numerous obstacles that prevent women from obtaining the best possible risk assessment. Too many women do not have a primary care practitioner who might use a risk tool. And those who do have a PCP are more likely to have an evaluation based on the Breast Cancer Risk Assessment tool (the Gail model). “We prefer the Tyrer-Cuzick model in part because it incorporates more personal information for each individual patient including a detailed family history, a woman’s breast density from her mammogram, as well as her history of atypia or other high risk benign breast disease,” says Dr. Haddad. Unfortunately, the Tyrer-Cuzick method requires many more data elements to assess breast cancer risk, which discourages busy clinicians from using it.

Another obstacle to using any of these risk assessment tools is the fact that they don’t readily fit into the average physician’s clinical workflow. Ideally these tools should seamlessly integrate into the EHR system. Even better would be the incorporation of AI-enhanced algorithms that automate the abstraction of the required data elements from the patient’s record into the assessment tool. For example, the algorithm would flag a family history of breast cancer, increased breast density as determined during a mammogram, as well as hormone replacement therapy and insert those risk factors into the Tyrer-Cuzick tool.

Even with this AI-enhanced approach, all of the available risk models fall short because they take a population-based approach, as we mentioned above. Dr. Haddad and her colleagues are looking to make the assessment process more individualized, as are others working in this specialty. Such a model could incorporate each patient’s previous mammography results, their genetics and benign breast biopsy findings, and much more. Adam Yala and his colleagues at MIT recently developed a mammography-based deep learning model designed to take this more sophisticated approach. Called Mirai, it was trained on a large data set from Massachusetts General Hospital and from facilities in Sweden and Taiwan. The new model generated significantly better results for breast cancer risk prediction than the Tyrer-Cuzick (TC) model.

Breast cancer risk assessment continues to evolve. And with better utilization of existing assessment tools and the assistance of deep learning, we can look forward to better patient outcomes.

Monday, August 23, 2021

Can Social Determinants of Health Predict Your Patient’s Future?

The evidence is mixed but suggests that these overlooked variables have a profound impact on each patient’s journey. 

This article was written by Tim Suther, Nicole Hobbs, Jeff McGinn, and Matt Turner of Change Healthcare; John Halamka, MD, MS, president of Mayo Clinic Platform; and Paul Cerrato, senior research analyst and communications specialist, Mayo Clinic Platform.

By one estimate, social determinants of health (SDoH) influence up to 80% of health outcomes. Although reports like this suggest that these social factors have a major impact, thought leaders continue to debate whether they can also enhance the accuracy of predictive models. Resolving that debate is far from simple because the answer depends on the type, source, and quality of the data, and on the design of the model under consideration.

In general, we derive SDoH from subjective and objective sources. Subjective data includes self-reported or clinician-collected data such as patient-reported outcomes, Z codes from ICD-10-CM that report factors that influence health status and interactions with health service providers, and other unstructured EHR data. Objective data includes individual-level and community-level data from government, public, and private (including consumer behavior) sources; it’s usually more structured and often derived from national-level datasets.
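
As a small illustration of how Z codes can surface social factors from coded encounter data, the sketch below filters a list of diagnosis codes down to those in ICD-10-CM categories Z55 through Z65, the block generally used for socioeconomic and psychosocial circumstances. The category range is stated here as an assumption, the function name is hypothetical, and the sample encounter is invented.

```python
# Assumption: SDoH-related ICD-10-CM Z codes fall in categories Z55-Z65.
# The encounter codes below are invented for illustration.

def is_sdoh_z_code(code):
    """Return True if an ICD-10-CM code falls in categories Z55 through Z65."""
    code = code.strip().upper()
    if len(code) < 3 or code[0] != "Z" or not code[1:3].isdigit():
        return False
    return 55 <= int(code[1:3]) <= 65


encounter_codes = ["E11.9", "Z59.0", "I10", "Z56.0", "Z00.00"]
sdoh_codes = [code for code in encounter_codes if is_sdoh_z_code(code)]
print(sdoh_codes)
```

Because Z codes are entered at a clinician's or coder's discretion, a filter like this only finds the social factors that someone chose to record, which is one reason the text classes them as subjective data.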

Unfortunately, the research on the value of SDoH in predictive models varies widely. Some studies report no appreciable differences when SDoH are injected into models, while others report significant enhancements to predictive power. Unsurprisingly, these varying study results depend in part on levels of reliance on traditional clinical models and, most importantly, on the types and sources of SDoH data employed in the studies.

For example, a group from Johns Hopkins Bloomberg School of Public Health demonstrated that SDoH predictive models can fail, in part because of model design and in part because EHR-level data is unstructured and inconsistently collected. They also demonstrated that depending on EHR-derived population health databases for SDoH can be problematic because the data tends to be used as a proxy for individual-level social factors. The problem lies in the fact that these proxies are often based on assumptions, not evidence. Other research supports these findings and showcases the challenges of using SDoH data from sources that traditionally struggle with the comprehensive collection and standardization of these data types.

On a more positive note, several studies and healthcare articles have reported success by relying on objectively collected and/or highly structured and consistent data. For example, one study that used EHR-derived SDoH data sources found that the addition of structured data on median income, unemployment rate, and education from trustworthy non-EHR sources enhanced their model’s health prediction granularity for some of the most vulnerable subgroups of patients. In another study, a collaboration among Stanford, Harvard, and Imperial College London found that adding structured SDoH data from the US Census, along with using machine learning techniques, improved risk prediction model accuracy for hospitalization, death, and costs. They also showed that their models based on SDoH alone, as well as those based on clinical comorbidities alone, could predict health outcomes and costs. Similarly, researchers at The Ohio State University College of Medicine added community-level and consumer behavior data not available in standard EHR data and found that it enhanced both the study of obesity prevention and its impact. Juhn et al. at Mayo Clinic tapped telephone survey data and appended housing and neighborhood characteristic data from local government sources to create a socioeconomic status index (HOUSES). They first showed that HOUSES correlated well with outcome measures and later showed that HOUSES could even serve as a predictive tool for graft failure in patients.

Patient Level SDoH + Clinical Data = Predictive Power

Incorporating social factors into the healthcare equation not only fills gaps at the point of care but also generates better healthcare predictions, provided these determinants are patient level and linked to robust clinical data. Change Healthcare, for example, has curated such an integrated national-level dataset, linking billions of historical, de-identified, distinct medical claims with patient-level social, physical, and behavioral determinants of health. One of this dataset’s most important uses is to understand the relative weight of specific patient SDoH factors, in comparison to clinical factors alone, for various therapeutic conditions, including COVID-19. For example, across Change Healthcare’s research, economic stability is repeatedly ranked as the highest or among the highest predictors of the healthcare experience. Despite this realization, most end users, including providers and payers, lack such visibility (or rely on geographic averages that are unhelpful in making accurate predictive models).

Incorporating SDoH data into predictive models holds much promise. Given the relative newness of SDoH data in predictive analytics, along with a lack of data standardization and scale, it’s not surprising to find varying degrees of success in using it to improve predictive health models. But as researchers learn more about the best types and sources of SDoH data to use, and develop models better suited to these types of data, we’re likely to see significant advances in healthcare predictive models. When the right data is combined with the right models, SDoH can be a powerful asset in predicting health, outcomes, and potential health disparities.

If you're still with us . . .

Please consider supporting Dr. Steve Parodi, Reed Abelson, and me by "voting up" our panel at the upcoming South by Southwest conference in March of 2022. Our proposed panel, "Extending the Stethoscope Into the Home," will dive into a discussion about acute health care for patients in their home and the infrastructures needed to support it. If you are inclined to vote, please do so here.