Abstract
False alarms in sepsis detection, frequently referred to as “Code Blue for Inefficiency,” continue to pose a significant challenge in intensive care units (ICUs), with approximately 70% of alerts identified as incorrect. An 18-month cluster randomized controlled trial conducted across 16 institutions evaluated a novel approach to integrating artificial intelligence (AI) prognostic tools into clinical practice. The intervention was a tiered alarm system that combined Epic’s Deterioration Index with a customized sepsis AI model refined using local resistance patterns. Alert intensity was stratified by patient risk: silent monitoring for low-risk patients, pager alerts for medium-risk patients, and Code Blue escalation for high-risk patients. Key outcomes were resource stewardship (avoidable ICU transfers and vasopressor days), clinician strain (NASA-TLX cognitive load), and 30-day sepsis mortality. The AI-supported protocol was associated with a 23% reduction in unnecessary ICU transfers (p < 0.01) and an 18% decrease in 30-day sepsis mortality, corresponding to one life saved for every nine patients treated (NNT = 9). However, cognitive strain among nurses increased by 14%. These findings indicate that AI can enhance efficiency and improve patient survival, but effective adoption requires workflow redesign to mitigate clinician burden.
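The risk-stratified alerting described above can be illustrated with a minimal Python sketch. It is not the trial's implementation: the 0-to-1 risk score, the two cut-off values, and the names `AlertTier` and `alert_tier` are assumptions made for illustration, since the abstract does not report how the Deterioration Index and the sepsis model were combined or which thresholds were used.

```python
from enum import Enum


class AlertTier(Enum):
    """The three alert intensities named in the abstract."""
    SILENT_MONITORING = "silent monitoring"
    PAGER_ALERT = "pager alert"
    CODE_BLUE = "Code Blue escalation"


# Hypothetical cut-offs on a 0-1 combined risk score.
# The abstract does not state the thresholds used in the trial.
MEDIUM_RISK_THRESHOLD = 0.3
HIGH_RISK_THRESHOLD = 0.7


def alert_tier(risk_score: float) -> AlertTier:
    """Map a combined risk score to an alert tier (illustrative only)."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must lie between 0 and 1")
    if risk_score >= HIGH_RISK_THRESHOLD:
        return AlertTier.CODE_BLUE
    if risk_score >= MEDIUM_RISK_THRESHOLD:
        return AlertTier.PAGER_ALERT
    return AlertTier.SILENT_MONITORING


if __name__ == "__main__":
    for score in (0.10, 0.45, 0.85):
        print(f"{score:.2f} -> {alert_tier(score).value}")
```

Under these assumed cut-offs, only scores at or above the upper threshold trigger an interruptive escalation; lower scores are handled by pager or by silent monitoring, which is how a tiered design aims to reduce the volume of high-intensity alerts.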

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) 2026 David Kanzin, Evans Dzreke, Celene Dzreke, Ramseyer Adekorang Asamoah, Godfrey Yeboah Amoah
