Evaluating AI-based comprehensive clinical decision support for sepsis and ARDS: protocol for a Clinician Turing Test

By: Angeli Gazola, A.; Bishop, N. S.; Schmid, B. E.; Pirracchio, R.; Valley, T. S.; Bhavani, S. V.; Krutsinger, D. C.; Giannini, H. M.; Lu, Y.; Ungar, L. H.; Meyer, N. J.; Kerlin, M. P.; Weissman, G. E.
Introduction

Few artificial intelligence (AI) clinical decision support systems (CDSSs) are ever evaluated in practice. Although some signal of clinical effectiveness may be needed to justify AI deployment and testing, such data are typically unavailable in early-stage research. This conundrum is especially relevant in the intensive care unit (ICU), where conditions like sepsis and acute respiratory distress syndrome (ARDS) require high-stakes decisions. Our group developed the AI ventilator assistant (AVA), a novel AI CDSS for patients with sepsis and ARDS receiving invasive mechanical ventilation. However, promising predictive performance estimates alone are not sufficient to assess AVA’s clinical safety and appropriateness prior to future evaluation and deployment. Therefore, we propose a Clinician Turing Test as a novel validation approach to determine whether clinicians can distinguish AVA-generated treatment recommendations from those enacted by real human clinicians. If AVA’s recommendations are consistently indistinguishable from those of real clinicians, thereby ‘passing’ this Turing test, this would provide a strong preclinical signal of safety and appropriateness.

Methods and analysis

This multisite, randomised, electronic, vignette-based Phase 1b study will use a Clinician Turing Test design. We aim to recruit 350 critical care clinicians, including physicians and advanced practice providers from six US hospitals. Participants will review nine clinical vignettes of patients with sepsis and ARDS derived from the Molecular Epidemiology of Severe Sepsis in the ICU cohort and an associated profile of a suggested treatment plan. For each participant–vignette combination, the source of the treatment profile will be randomly assigned (AI-generated by AVA vs the actually enacted treatment from real human clinicians) in a 1:1 allocation. The primary endpoint is the participants’ accuracy in identifying whether a treatment profile was AI-generated or human-generated, assessed using equivalence testing through a mixed-effects logistic regression model with random effects for participants and vignettes. Secondarily, a fitted binary classifier will assess discrimination ability using the C-statistic. Secondary endpoints include clinicians’ perceptions of the safety and appropriateness of the treatment profiles, confidence in distinguishing AI-generated and human-generated recommendations, interest in AI CDSSs for sepsis and ventilator management and the time to complete the survey. This novel Phase 1b design provides preliminary but essential information about an AI CDSS’s clinical appropriateness without the risk or cost of actual deployment, thereby informing decisions about future clinical implementation and evaluation in real clinical environments.
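
As a rough illustration of the design described above, the following Python sketch shows a 1:1 source allocation per participant–vignette pair, a rank-based C-statistic and a naive equivalence check of accuracy against chance. It is not the protocol's analysis code. All function names are hypothetical, simple (unblocked) randomisation is assumed, the 0.10 equivalence margin is an illustrative placeholder that the abstract does not state, and the equivalence check uses a plain Wald interval that ignores clustering, whereas the protocol specifies a mixed-effects logistic regression with random effects for participants and vignettes.

```python
import math
import random


def assign_sources(n_participants, n_vignettes, seed=0):
    """Assign each participant-vignette pair to 'AI' or 'human' with probability 1/2.

    Assumption: simple randomisation; the protocol may use a different scheme.
    """
    rng = random.Random(seed)
    return {
        (p, v): rng.choice(("AI", "human"))
        for p in range(n_participants)
        for v in range(n_vignettes)
    }


def c_statistic(scores, labels):
    """Probability that a randomly chosen AI-sourced profile (label 1) receives a
    higher score than a randomly chosen human-sourced one (label 0); ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one item of each class")
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def accuracy_equivalent_to_chance(n_correct, n_total, margin=0.10, z=1.645):
    """Naive equivalence check: does the 90% Wald interval for accuracy fall
    entirely inside 0.5 +/- margin?

    Assumptions: the margin is hypothetical, and responses are treated as
    independent (no random effects for participants or vignettes).
    """
    acc = n_correct / n_total
    se = math.sqrt(acc * (1 - acc) / n_total)
    lo, hi = acc - z * se, acc + z * se
    return (0.5 - margin) < lo and hi < (0.5 + margin), (lo, hi)


if __name__ == "__main__":
    # 350 participants x 9 vignettes, as in the planned sample.
    allocation = assign_sources(350, 9, seed=1)
    print(len(allocation), "participant-vignette pairs allocated")
```

A C-statistic near 0.5 and an accuracy interval contained within the equivalence bounds would both point towards clinicians being unable to tell the two sources apart; the properly clustered analysis would rely on the mixed-effects model named in the protocol.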

Ethics and dissemination

This protocol was approved by the Institutional Review Board of the University of Pennsylvania (Protocol #858201). Results are expected in 2026 and will be submitted for publication in peer-reviewed journals and presented at scientific conferences.

Trial registration number

NCT07025096.
