Sepsis sneaks up on thousands of Americans every year. AI can spot it sooner.
This article originally appeared in Undark.
Ten years ago, 12-year-old Rory Staunton dove for a ball in gym class and scraped his arm. He woke up the next day with a 104 F fever, so his parents took him to the pediatrician and eventually the emergency room. They were told it was only stomach flu. Rory died three days later from sepsis. The bacteria from the scrape had infiltrated his bloodstream and caused organ failure.
“How can that happen in a modern country?” his father, Ciaran Staunton, said in a recent interview with Undark.
Every year, sepsis kills more than a quarter of a million people in the United States, more than strokes, diabetes, or lung cancer. One reason for the toll is that sepsis is not well understood and can be fatal if it isn’t caught in time. Research into sepsis prevention has been intense, but current clinical decision support systems, which use electronic tools to improve patient care, have struggled to catch the disease early, relying on disruptive pop-up alarms that are inaccurate and prone to false positives.
This could soon change. In July, Johns Hopkins researchers published a trio of studies in Nature Medicine and npj Digital Medicine showcasing an early warning system that uses artificial intelligence. The system caught 82 percent of sepsis cases and reduced deaths by nearly 20 percent. Machine learning has long been touted as a way to improve health care, but most studies demonstrating its benefits have relied on historical data; sources told Undark that no AI algorithm had previously been proven effective when applied to patients in real time. The novelty of this research, said Suchi Saria, director of the Machine Learning and Health Care Lab at Johns Hopkins University and senior author of the studies, is that “AI is implemented at the bedside, used by thousands of providers, and where we’re seeing lives saved.”
The Targeted Real-Time Early Warning System, or TREWS, scans hospitals’ electronic medical records, digital copies of patients’ medical histories, to identify signs of sepsis and alert health care professionals to patients at high risk. Albert Wu, an internal medicine physician at Johns Hopkins, said that TREWS, which leverages vast amounts of data, provides real-time patient insight and a unique level of transparency into its reasoning.
Wu said the system also offers a glimpse of the electronic future of medicine. Since their introduction in the 1960s, electronic health records have reshaped how physicians document clinical information, but decades later, these systems still serve primarily as “an electronic notepad,” he added. Saria said electronic records could be used in new ways to transform health care delivery, giving physicians an additional set of eyes and ears and helping them make better medical decisions.
It is a tempting vision, though one that Saria, who is also CEO of the company developing TREWS, has a financial interest in promoting. It also glosses over the difficulties of implementing new medical technology: providers may be reluctant to trust machine-learning tools, and these systems might not work as well outside controlled research settings. Electronic health records, moreover, come with many existing problems of their own, from burying providers in administrative work to risking patient safety through software glitches.
Saria is nevertheless optimistic. The technology exists and the data are there, she said. “We need high-quality care augmentation tools that allow providers to do more.”
Currently, there is no single test for sepsis, so health care providers must piece together a diagnosis by reviewing a patient’s medical history, conducting a physical exam, and relying on their own clinical impressions. Given that complexity, over the past decade doctors have increasingly leaned on electronic health records to help diagnose sepsis, mostly through rules-based criteria: if this, then that.
One example is the SIRS criteria, under which a patient is at high risk for sepsis if they show at least two of four abnormal signs: body temperature, heart rate, breathing rate, or white blood cell count. That breadth helps capture the many ways sepsis can present, but it also triggers many false positives. Consider a patient with a broken arm. “A computerized system might say, ‘Hey, fast heart rate, breathing rapid,’” said Cyrus Shariat, an ICU physician at Washington Hospital in California. The patient is unlikely to have sepsis but would nonetheless set off the alarm.
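The if-this-then-that character of such a screen can be sketched in a few lines of code. This is a simplified illustration using the standard adult SIRS thresholds; the function and variable names are hypothetical, and real systems also consider factors such as blood gas values and immature white cell counts:

```python
def sirs_flags(temp_c, heart_rate, resp_rate, wbc_k):
    """Count how many of the four standard adult SIRS criteria a patient meets.

    Simplified illustration only: thresholds vary by age and clinical context.
    wbc_k is the white blood cell count in thousands per microliter.
    """
    flags = [
        temp_c > 38.0 or temp_c < 36.0,   # abnormal body temperature
        heart_rate > 90,                  # elevated heart rate
        resp_rate > 20,                   # rapid breathing
        wbc_k > 12.0 or wbc_k < 4.0,      # abnormal white blood cell count
    ]
    return sum(flags)

def sirs_alert(temp_c, heart_rate, resp_rate, wbc_k):
    # "If this, then that": two or more criteria trigger an alert,
    # regardless of why the vitals are abnormal.
    return sirs_flags(temp_c, heart_rate, resp_rate, wbc_k) >= 2

# A patient in pain with a broken arm: fast heart rate, rapid breathing.
print(sirs_alert(temp_c=37.0, heart_rate=105, resp_rate=24, wbc_k=8.0))  # True
```

The broken-arm patient trips the alarm on heart rate and breathing rate alone, which is exactly the false-positive failure mode Shariat describes.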
These alerts also appear as pop-ups on providers’ computers, forcing them to stop whatever they are doing to respond. So, although rules-based systems can modestly reduce mortality, they carry a risk of alert fatigue, where health care workers start ignoring the flood of irritating reminders. It is like a fire alarm going off constantly, said M. Michael Shabot, a trauma surgeon and former chief clinical officer at Memorial Hermann Health System: you become desensitized, and “it’s not something you pay attention to.”
Electronic records aren’t very popular among doctors. In a 2018 survey, 71 percent of physicians said that the records greatly contribute to burnout, and 69 percent said that they take valuable time away from patients. A 2016 study found that, for every hour spent on patient care, physicians devote two extra hours to electronic health records and desk work. James Adams, chair of the Department of Emergency Medicine at Northwestern University, called electronic health records a “congested morass of information.”
But Adams said the industry is at an inflection point. An electronic record should no longer be merely a document created by a doctor or nurse, he said; it must be “transformed to be a clinical delivery tool.” Electronic records could then warn providers about sepsis and other conditions, but that will require more than a rules-based approach.
What doctors need, according to Shabot, is an algorithm that can integrate multiple streams of clinical data to give a clearer picture of what’s wrong.
Machine-learning algorithms use patterns in data to predict an outcome, such as a patient’s likelihood of developing sepsis. Researchers train the algorithms on existing data, from which they build a model of the world that can then make predictions on new data. Many algorithms can also adapt and improve over time without human intervention.
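That train-then-predict loop can be sketched with a toy model. This is not the TREWS algorithm, just a minimal logistic-regression illustration in plain Python; the feature values and names are invented for the example:

```python
import math

def train(rows, labels, lr=0.1, epochs=500):
    """Fit a tiny logistic-regression model by stochastic gradient descent.

    rows: feature vectors (e.g., vitals scaled to roughly 0-1);
    labels: 1 = sepsis, 0 = no sepsis. Purely illustrative, not TREWS.
    """
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
            err = p - y  # gradient of the log-loss for this example
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict_risk(model, x):
    """Return the model's estimated probability of sepsis for a new patient."""
    w, b = model
    return 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))

# Train on (hypothetical, scaled) historical encounters, then score new patients.
history = [[0.2, 0.1], [0.3, 0.2], [0.8, 0.9], [0.9, 0.7]]
outcomes = [0, 0, 1, 1]
model = train(history, outcomes)
high_risk = predict_risk(model, [0.85, 0.8])  # resembles past sepsis cases
```

The key point is the two phases: the model’s weights are fit once on historical outcomes, then applied to incoming patient data, and in a continually learning system the weights are periodically updated as new feedback arrives.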
TREWS follows this general mold. The researchers first trained the algorithm on historical electronic records from 175,000 patient encounters so it could recognize early signs of sepsis. The algorithm was then deployed in hospitals to improve patient care.
Saria, Wu and others published three studies on TREWS. The first tried to determine how accurate the system was, whether providers would actually use it, and if use led to earlier sepsis treatment. The second went a step further to see if using TREWS actually reduced patients’ mortality. And the third described what 20 providers who tested the tool thought about machine learning, including what factors facilitate versus hinder trust.
In these studies, TREWS scanned the data of patients in the emergency room and inpatient wards for signs of sepsis, including vital signs, laboratory results, medication histories, and provider notes. (Providers could do this themselves, Saria said, but it might take them 20 to 40 minutes.) Based on analysis of millions of data points, the system could flag a patient and prompt providers to either confirm sepsis or temporarily pause the alert.
“This is a colleague telling you, based on data and having looked at all the charts, why they think there’s cause for concern,” Saria said. She added that the team wants frontline providers to be able to disagree, because ultimately they have their eyes on the patient. TREWS continually learns from this feedback, which is part of what sets it apart from other electronic records tools for treating sepsis.
TREWS also does not send pop-up alerts to providers. Instead, the system takes a passive approach: alerts appear as icons on the patient list, which providers can click on later. Initially, Saria worried this might be too passive: “Providers aren’t going to listen. They won’t agree. You’re mostly going to get ignored.” Instead, clinicians responded to 89 percent of the system’s alerts. As the third study revealed through in-depth interviews, TREWS was seen as less “irritating” than the previous rules-based system.
For Saria, TREWS’ high acceptance rate shows that providers can come to trust AI tools. But Fei Wang, an associate professor of health informatics at Cornell University, is skeptical that these findings will hold up if TREWS is deployed more widely. He called the studies first-of-their-kind and the results encouraging, but noted that providers can be conservative and resistant to change. “It’s just difficult to convince physicians to use a tool they don’t know.” Until proven otherwise, any new system is a burden, and trust takes time.
TREWS’ capabilities are further limited by the fact that it only knows what has been entered into the electronic medical record; the system is not at the patient’s bedside. As one emergency department physician put it in an interview for the third study, the tool cannot see everything the provider can see, and even what it can see may be inaccurate, missing, or out of date.
But Saria said that TREWS’ strengths and limitations are complementary to those of health care providers: the algorithm can analyze large amounts of clinical data in real time, even though it remains limited by the quality of the electronic medical record. The goal, she said, is not to replace doctors, but to work alongside them and enhance their capabilities.
What impressed Zachary Lipton, an assistant professor of machine learning and operations research at Carnegie Mellon University, was not the model’s novelty but the effort it must have taken to deploy it across five hospitals and 2,000 providers over a two-year period. “In this area there is a tremendous amount of off-line research,” Lipton said, but putting a system into live clinical use requires collaboration among engineers, administrators, product designers, and systems engineers.
By demonstrating how the algorithm performed in a large clinical study, TREWS has entered an exclusive club, though its uniqueness may be temporary. Duke University’s Sepsis Watch algorithm, for example, is currently being tested across three hospitals, with results forthcoming. Sepsis Watch uses deep learning, a more complex form of machine learning than TREWS employs. Deep learning can yield more powerful predictions, but how the algorithm arrives at them is harder to discern. Computer scientists call this the black box problem: the inputs and outputs are clearly visible, but the process between them is not.
Whether this is actually a problem is up for debate. Doctors don’t always know how drugs work either, Adams said, “but at some point, we have to trust what the medicine is doing.” Lithium, for example, is a widely used, effective treatment for bipolar disorder, but nobody really understands how it works. If an AI system is equally useful, interpretability may not matter.
Wang called that a dangerous conclusion. “How can you confidently state that your algorithm is correct?” he asked. It’s hard to know when the model’s mechanics are a black box. A simpler algorithm like TREWS, which can explain its own reasoning, may be the better approach. “If you have this set of rules,” Wang said, “people can easily validate that everywhere.”
Indeed, providers trusted TREWS largely because they could see the measurements behind each alert. None of the clinicians interviewed fully understood the machine learning involved, but that wasn’t necessary. As one provider who used TREWS put it, “the extent that I can see all of the factors that are playing a role in the decision, that’s helpful to me to trust it. I don’t think my understanding must go beyond.”
However important algorithmic design may be, the results must speak for themselves. By catching 82 percent of sepsis cases and reducing the time to antibiotics by 1.85 hours, TREWS cut patient deaths by nearly one-fifth. Adams said the combination of a high-quality system, uptake by clinicians, and reduced mortality “makes it very special.”
Shariat, however, was more cautious about these findings. He noted that the studies compared sepsis patients whose TREWS alerts were acted on within three hours with those whose alerts were not; they simply show that the system works better when someone responds to it. A more robust approach would have been a randomized controlled trial, the gold standard in medical research, in which half of the patients receive TREWS and the other half do not. Shariat agreed with Saria that randomization would have been difficult because of patient safety concerns, but he said its absence “makes data less rigorous.”
Shariat is also concerned that alert fatigue could remain a problem: about two-thirds of TREWS alerts are still false positives, which risks overtreatment with fluids and antibiotics and, in turn, serious complications such as pulmonary edema or antibiotic resistance. Saria noted that TREWS’ false positive rate is lower than that of other electronic health records systems, but stressed that it will remain crucial for clinicians to use their own judgment.
The studies also carry a conflict of interest: Saria and Johns Hopkins are entitled to revenue generated by TREWS. Shariat said that if the system is a hit and sold to every hospital, it will make a lot of money: “It’s billions and trillions of dollars.”
Saria said the studies went through rigorous internal and external review to guard against conflicts of interest, and the vast majority of the study authors have no financial stake in the research. Still, Shariat said, independent validation will be crucial to confirm the findings and to show that the system generalizes.
A cautionary example comes from the Epic Sepsis Model, a widely used algorithm that scans electronic records to predict sepsis, described by David Bates, chief of general internal medicine at Brigham and Women’s Hospital. The model was first developed at a handful of hospitals and then rolled out to hundreds of others, where its performance deteriorated: it identified only 33 percent of patients with sepsis and had an 88 percent false positive rate. “You can’t predict how much performance will degrade,” Bates said, “without actually going to look.”
Despite TREWS’ potential drawbacks, Rory’s mother, Orlaith Staunton, told Undark that such a system could have saved her son’s life. Her son’s body was breaking down, she said, yet none of his doctors considered sepsis until it was too late. An early warning system that alerted them to the condition, she added, “would make the world of difference.”
After Rory’s death, the Stauntons founded the nonprofit End Sepsis to ensure that no other family would have to go through their pain. Since then, New York State has required hospitals to develop sepsis protocols, and the CDC has declared sepsis a medical emergency. None of it, Ciaran Staunton said, will bring back Rory.
The research is also personal for Saria: nearly a decade ago, her nephew died from sepsis. By the time his doctors caught it, there was little they could do. “It all happened too fast, and we lost him,” she said. Early detection is critical; minutes can be the difference between life and death. “Last year we flew helicopters on Mars,” Saria said. “But we’re still killing patients every day.”
Simar Bajaj studies the history of science at Harvard University. He is also a research fellow at Stanford University and Massachusetts General Hospital.