John Halamka, M.D., president, Mayo Clinic Platform, and Paul Cerrato, senior research analyst and communications specialist, Mayo Clinic Platform, wrote this article.
If Mary Putnam Jacobi were alive today, she would probably embrace artificial intelligence (AI), machine learning, and data analytics. Dr. Jacobi might best be described as the mother of modern scientific medicine, or at the very least one of its founding parents. In 1868, she was the first woman to enroll in the University of Paris School of Medicine. After graduating in 1871, this unconventional thinker arrived in the U.S., where she advocated for the inclusion of laboratory science, experimentation, and statistics as the cornerstone of modern medical practice. Equally important, Jacobi “became a powerful advocate for the equal contribution of women to medicine.” Pushing clinicians to buy into the notion that experimentation and statistics were needed for good-quality patient care may seem unimpressive today. Still, it was almost heresy in an age when the received wisdom from one’s medical school professor was all that was necessary to “demonstrate” that a treatment protocol was effective. Given her “color-outside-the-lines” approach, it’s not hard to imagine her becoming passionate about the role of AI in medicine, not as a panacea but as an indispensable adjunct to human reasoning and clinical trials.
The evidence shows that machine learning, advanced data analytics, and subgroup analysis all have the power to reinvent the way patient care is delivered and the potential to personalize that care. But as we said in an earlier blog, while there are several well-documented AI-based algorithms now available, there are also several questionable vendors that have rushed to market with little evidence to support their claims. This dilemma raises the question: How do we ensure that well-supported digital tools get the attention they deserve? Several new guidelines have been published to encourage more robust AI-related research, but clinicians also need to develop machine learning literacy. Put another way, we can all benefit from gaining a better grasp of what’s under the hood, explained in plain, jargon-free English. That was one of the goals of our last book, Reinventing Clinical Decision Support.
Our research demonstrates that clinical decision support certainly needs to be reinvented. If you look back over the last several decades, you’ll find that hundreds of clinical research studies have been misinterpreted. A review of 71 randomized clinical trials in 1978, for instance, concluded that numerous treatments were useless; a closer analysis, however, found that all 71 had generated false-negative results because the populations they studied were too small, a type II statistical error. Fast forward to 1994: a JAMA analysis revealed that 383 RCTs, many published in the world’s top academic journals, likewise jumped to the conclusion that several treatments were ineffective, once again because the trials had enrolled too few patients.
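To make the sample-size problem concrete, here is a minimal sketch of how the chance of a type II error shrinks as a trial enrolls more patients. It uses a standard normal approximation for comparing two event rates. The event rates (20% in the control arm, 15% in the treated arm) and the function name `approx_power` are hypothetical choices for illustration; they are not taken from the trials discussed above.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p_control, p_treated, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Power is the probability of detecting a real difference; the type II
    error rate (a false negative) is 1 - power. The negligible chance of
    rejecting in the wrong direction is ignored.
    """
    std_error = sqrt(
        p_control * (1 - p_control) / n_per_arm
        + p_treated * (1 - p_treated) / n_per_arm
    )
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    z_effect = abs(p_control - p_treated) / std_error
    return NormalDist().cdf(z_effect - z_critical)


# Hypothetical scenario: the treatment truly cuts the event rate from 20% to 15%.
for n in (50, 200, 1000):
    power = approx_power(0.20, 0.15, n)
    print(f"{n:>5} patients per arm: power ~ {power:.2f}, "
          f"type II error risk ~ {1 - power:.2f}")
```

Under these assumed rates, a trial with only 50 patients per arm would miss a real benefit most of the time, which is the pattern the reviews above describe.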
More recently, there’s reason to believe that the bedrock upon which many day-to-day clinical decisions rest is somewhat shaky. It’s a foundation that can be made stronger with ML-based tools like convolutional neural networks, random forest modeling, gradient boosting, and clustering. A case in point: the Look AHEAD study, published in 2013. This RCT assigned over 5,000 overweight patients with type 2 diabetes to either an intensive lifestyle modification program or a control group. The investigators' goal was to determine if the lifestyle program would reduce the incidence of cardiovascular events. The study was terminated early because there were no significant cardiovascular differences between the intervention and control groups.
An ML-fueled analysis that used random forest modeling to re-examine the Look AHEAD data turned these results upside down. During random forest analysis, a series of decision trees is created. Initially, the technique randomly splits all the available data (in this case, the stored characteristics of about 5,000 patients in the Look AHEAD study) into two halves. The first half serves as a training data set to generate hypotheses and construct the decision trees. The second half of the data serves as the testing data set.
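For readers who want to see the mechanics, the split-then-grow idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the analysis performed on the Look AHEAD data: the patient records are synthetic, every function name is hypothetical, and each "tree" is a single-split stump, whereas real random forests grow much deeper trees.

```python
import random
from collections import Counter


def make_records(n, rng):
    """Synthetic 'patients': 4 numeric features and a 0/1 outcome.

    Only feature 0 drives the outcome; 10% of labels are flipped as noise.
    """
    records = []
    for _ in range(n):
        x = [rng.random() for _ in range(4)]
        y = 1 if x[0] > 0.5 else 0
        if rng.random() < 0.1:
            y = 1 - y
        records.append((x, y))
    return records


def majority(labels):
    """Most common label, defaulting to 0 for an empty group."""
    return Counter(labels).most_common(1)[0][0] if labels else 0


def fit_stump(sample, features):
    """Fit a one-split tree: try every feature/threshold pair, keep the best."""
    best = None
    for f in features:
        for x, _ in sample:
            t = x[f]
            left = majority([y for xi, y in sample if xi[f] <= t])
            right = majority([y for xi, y in sample if xi[f] > t])
            errors = sum(
                (left if xi[f] <= t else right) != y for xi, y in sample
            )
            if best is None or errors < best[0]:
                best = (errors, f, t, left, right)
    return best[1:]  # (feature, threshold, left_label, right_label)


def fit_forest(train, n_trees, rng):
    """Grow each tree on a bootstrap resample with a random feature subset."""
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(train) for _ in train]
        features = rng.sample(range(4), 3)
        forest.append(fit_stump(sample, features))
    return forest


def predict(forest, x):
    """Every tree votes; the forest returns the majority vote."""
    votes = [left if x[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


rng = random.Random(0)
records = make_records(200, rng)
rng.shuffle(records)
half = len(records) // 2
train, test = records[:half], records[half:]  # training half, testing half

forest = fit_forest(train, n_trees=50, rng=rng)
accuracy = sum(predict(forest, x) == y for x, y in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The point of the sketch is the workflow: the forest only ever sees the training half, and its accuracy is judged on the half it has never seen.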
Using this technique, Baum et al. constructed a forest that contained 1,000 decision trees. They looked at 84 risk factors that may have been influencing patients’ response or lack of response to the intensive lifestyle modification program, including numerous characteristics that researchers rarely, if ever, consider when doing a subgroup analysis. The random forest modeling also allowed the investigators to examine how these variables interact in multiple combinations to affect clinical outcomes. In the final analysis, Baum et al. discovered that intensive lifestyle modification did prevent cardiovascular events for two subgroups: patients with an HbA1c of 6.8% or higher (poorly managed diabetes) and patients with well-controlled diabetes (HbA1c < 6.8%) and good self-reported health. That finding applied to 85% of the entire patient population studied. On the other hand, the remaining 15%, who had controlled diabetes but poor self-reported general health, responded negatively to the lifestyle modification regimen. The negative and positive responders canceled each other out in the original Look AHEAD statistical analysis, which led to the false conclusion that lifestyle modification was useless.
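The cancellation effect is simple arithmetic, and a short sketch shows how it can hide a real benefit. The 85% and 15% subgroup shares come from the analysis described above, but the effect sizes below are hypothetical numbers chosen purely so that the two groups offset each other; they are not results from the study.

```python
# Each entry: (subgroup label, share of all patients, change in event risk).
# A negative change means fewer cardiovascular events (benefit); a positive
# change means more events (harm). Effect sizes are HYPOTHETICAL.
subgroups = [
    ("responds positively", 0.85, -0.03),
    ("responds negatively", 0.15, +0.17),
]

# The overall effect seen in a pooled analysis is the share-weighted average.
overall_effect = sum(share * effect for _, share, effect in subgroups)

for label, share, effect in subgroups:
    print(f"{label}: {share:.0%} of patients, risk change {effect:+.2f}")
print(f"pooled risk change: {overall_effect:+.4f}")
```

With these assumed figures, the pooled estimate is essentially zero even though most patients benefit, which is why an analysis that looks only at the whole population can miss what is happening inside it.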
AI-enhanced data analyses like this only serve to reinforce Mary Putnam Jacobi’s contention that coloring outside the lines will propel health care in new directions. Her unconventional mindset, and ours, is best summed up in the words of Crosby, Stills & Nash: “Let your freak flag fly!”
1. Freiman JA et al. The importance of beta, the type II error and sample size in the design and interpretation of the randomized control trial. Survey of 71 "negative" trials. N Engl J Med. 1978; 299:690-694.
2. Moher D et al. Statistical power, sample size, and their reporting in randomized controlled trials. JAMA. 1994; 272:122.
3. Look AHEAD Research Group. Cardiovascular effects of intensive lifestyle intervention in type 2 diabetes. N Engl J Med. 2013; 369:145-154.
4. Baum A et al. Targeting weight loss interventions to reduce cardiovascular complications of type 2 diabetes: a machine learning-based post-hoc analysis of heterogeneous treatment effects in the Look AHEAD trial. Lancet Diabetes Endocrinol. 2017; 5:808-815.