With ambient listening systems increasingly adopted in healthcare, analyzing clinician-patient conversations has become essential. The Omaha System is a standardized terminology for documenting patient care, organizing 42 health problems across four domains, with 377 associated signs/symptoms. Manually identifying these problems and mapping them to the terminology is time-consuming and labor-intensive. This study aims to automate health problem identification from clinician-patient conversations using large language models (LLMs) with retrieval-augmented generation (RAG).
Using the Omaha System framework, we analyzed 5118 utterances from 22 clinician-patient encounters in home healthcare. RAG-enhanced LLMs detected health problems and mapped them to Omaha System terminology. We evaluated different model configurations, including embedding models, context window sizes, parameter settings (top k, top p), and prompting strategies (zero-shot, few-shot, and chain-of-thought). Three LLMs—Llama 3.1-8B-Instruct, GPT-4o-mini, and GPT-o3-mini—were compared using precision, recall, and F1-score against expert annotations.
The optimal configuration used a 1-utterance context window, top k = 15, top p = 0.6, and few-shot learning with chain-of-thought prompting. GPT-4o-mini achieved the highest F1-score (0.90) for both problem and sign/symptom identification, followed by GPT-o3-mini (0.83/0.82), while Llama 3.1-8B-Instruct performed worst (0.73/0.72).
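The retrieve-then-prompt pipeline described above can be sketched minimally as follows. This is an illustrative toy, not the study's implementation: the actual system used neural embedding models and GPT-family decoders (with top k = 15, top p = 0.6 at the retrieval and decoding stages), which this sketch replaces with a bag-of-words retriever over a hypothetical three-problem subset of the Omaha System; the prompt wording and example are assumptions.

```python
import math
from collections import Counter

# Hypothetical miniature Omaha System term list (illustrative only,
# not the full 42 problems / 377 signs and symptoms).
OMAHA_PROBLEMS = {
    "Circulation": "edema irregular heart rate blood pressure cramping",
    "Skin": "lesion wound rash drainage incision",
    "Pain": "expresses discomfort grimacing guarded movement",
}

def bow_vector(text):
    """Toy bag-of-words 'embedding'; the study used learned embedding models."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(utterance, k=2):
    """RAG retrieval step: rank Omaha problem descriptions by similarity
    to the utterance and keep the top-k candidates."""
    q = bow_vector(utterance)
    scored = [(cosine(q, bow_vector(desc)), name)
              for name, desc in OMAHA_PROBLEMS.items()]
    return [name for score, name in sorted(scored, reverse=True)[:k] if score > 0]

def build_prompt(utterance, candidates):
    """Few-shot, chain-of-thought prompt assembled around the retrieved
    candidates; the worked example here is invented for illustration."""
    example = ("Utterance: 'My ankles are swollen again.'\n"
               "Reasoning: ankle swelling suggests edema, a Circulation sign.\n"
               "Answer: Circulation\n")
    return (f"Map the utterance to an Omaha System problem.\n{example}"
            f"Candidates: {', '.join(candidates)}\n"
            f"Utterance: '{utterance}'\nReasoning:")

utt = "The wound on my leg has more drainage today."
cands = retrieve(utt, k=2)
prompt = build_prompt(utt, cands)
```

In the study's configuration, `prompt` would be sent to the LLM with only the single utterance as context (the 1-utterance window); restricting the candidate set via retrieval is what keeps the mapping grounded in Omaha System terminology.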
Using the Omaha System, LLMs with RAG effectively automate health problem identification in clinical conversations. This approach can enhance documentation completeness, reduce documentation burden, and, through more comprehensive problem identification, potentially improve patient outcomes, clinical efficiency, and care delivery.
Automating health problem identification from clinical conversations can improve documentation accuracy, reduce burden, and ensure alignment with standardized frameworks like the Omaha System, enhancing care quality and continuity in home healthcare.
Identifying health problems in audio-recorded patient–nurse communication is important to improve outcomes in home healthcare patients who have complex conditions with increased risks of hospital utilization. Training machine learning classifiers for identifying problems requires resource-intensive human annotation.
To generate synthetic patient–nurse communication and to automatically annotate it for common health problems encountered in home healthcare settings using GPT-4. We also examined whether augmenting real-world patient–nurse communication with synthetic data can improve the performance of machine learning classifiers in identifying health problems.
Secondary data analysis of patient–nurse verbal communication data in home healthcare settings.
The data were collected from one of the largest home healthcare organizations in the United States. We used 23 audio recordings of patient–nurse communications from 15 patients. The audio recordings were transcribed verbatim and manually annotated for health problems (e.g., circulation, skin, pain) indicated in the Omaha System Classification scheme. Synthetic data of patient–nurse communication were generated using the in-context learning prompting method, enhanced by chain-of-thought prompting to improve the automatic annotation performance. Machine learning classifiers were applied to three training datasets: real-world communication, synthetic communication, and real-world communication augmented by synthetic communication.
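The in-context learning step above amounts to assembling a prompt that pairs a real annotated exchange with chain-of-thought instructions. A minimal sketch of such prompt construction follows; the wording, output format, and example dialogue are assumptions for illustration, not the study's actual prompt, and the call to GPT-4 itself is omitted.

```python
def synthesis_prompt(problem, example_dialogue):
    """Build an in-context learning prompt asking GPT-4 to generate a
    synthetic patient-nurse exchange annotated with an Omaha System
    problem. Chain-of-thought instructions ask the model to reason
    about plausible signs/symptoms before writing the dialogue.
    All wording here is a hypothetical stand-in for the study's prompt."""
    return (
        "You are simulating home healthcare conversations.\n"
        f"Example dialogue (problem: {problem}):\n{example_dialogue}\n"
        "Step by step: first decide which signs/symptoms of this problem "
        "could plausibly arise, then write a new patient-nurse exchange "
        "exhibiting them, and finally label it.\n"
        f"Output format: DIALOGUE: ... | PROBLEM: {problem}"
    )

prompt = synthesis_prompt(
    "Pain",
    "Nurse: How is your pain today?\nPatient: It's worse at night.",
)
```

Because the generated text arrives pre-labeled with its seed problem, the synthetic corpus needs no additional human annotation before being added to the training data.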
Average F1 scores improved from 0.62 to 0.63 after the training data were augmented with synthetic communication. The largest increase was observed with the XGBoost classifier, where the F1 score improved from 0.61 to 0.64 (about a 5% relative improvement). When trained solely on real-world communication or solely on synthetic communication, the classifiers showed comparable F1 scores of 0.62 and 0.61, respectively.
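The F1 comparisons above rest on standard precision/recall arithmetic over multi-label annotations. A small self-contained sketch of the metric, assuming micro-averaging over per-utterance label sets (the study does not state its averaging scheme, so this is one plausible choice):

```python
def prf1(gold, pred):
    """Micro-averaged precision, recall, and F1 for multi-label
    annotations. `gold` and `pred` are parallel lists of sets of
    problem labels, one set per utterance."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # correctly predicted labels
    fp = sum(len(p - g) for g, p in zip(gold, pred))  # spurious predictions
    fn = sum(len(g - p) for g, p in zip(gold, pred))  # missed gold labels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Invented toy data: three utterances, classifier misses one Pain label
# and hallucinates another.
gold = [{"Circulation"}, {"Skin", "Pain"}, set()]
pred = [{"Circulation"}, {"Skin"}, {"Pain"}]
p, r, f = prf1(gold, pred)  # each equals 2/3 here
```

Running the same metric over predictions from classifiers trained on real, synthetic, and augmented data yields the 0.61-0.64 comparisons reported above.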
Integrating synthetic data improves machine learning classifiers' ability to identify health problems in home healthcare, with performance comparable to training on real-world data alone, highlighting the potential of synthetic data in healthcare analytics.
This study demonstrates the clinical relevance of leveraging synthetic patient–nurse communication data to enhance machine learning classifier performance in identifying health problems in home healthcare settings, contributing to more accurate and efficient problem identification for home healthcare patients with complex health conditions.