Description
Chair: Julia Kempe
Physicians make critical, time-constrained decisions every day. Clinical predictive models can help physicians and administrators make decisions by forecasting clinical and operational events. Existing clinical predictive models based on structured data have limited use in everyday practice owing to complexity in data processing, model development and deployment. Here we show that unstructured clinical notes from the electronic health record can enable the training of clinical language models, which can be used as all-purpose clinical predictive engines with low-resistance development and deployment. Our approach leverages recent advances in natural language processing to train a large language model for medical language (NYUTron) and subsequently finetune it across a wide range of clinical and operational predictive tasks. We evaluated our approach within our health system on five such tasks: 30-day all-cause readmission prediction, in-hospital mortality prediction, comorbidity index prediction, length-of-stay prediction and insurance denial prediction. We show that NYUTron has an area under the curve (AUC) of 78.7-94.9%, an improvement of 5.36-14.7% in AUC over traditional models. We additionally demonstrate the benefits of pretraining on clinical text, the potential for increasing generalizability to different sites through finetuning, and the full deployment of our system in a prospective, single-arm trial. These results show the potential for using clinical language models in medicine to read alongside physicians and provide guidance at the point of care.
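The pretrain-then-finetune recipe at the heart of the talk can be sketched in miniature. The sketch below is a toy analogue only, not the actual NYUTron pipeline (which pretrains a BERT-style model on health-system notes): simple co-occurrence vectors stand in for language-model pretraining on unlabeled notes, and a logistic-regression head on frozen note embeddings stands in for task finetuning (here, a hypothetical 30-day readmission label). All notes, labels and function names are illustrative.

```python
import math

# "Pretraining" corpus: unlabeled clinical notes (illustrative toy data).
UNLABELED_NOTES = [
    "patient stable discharged home",
    "patient unstable transferred icu",
    "patient stable home follow up",
    "patient unstable icu sepsis",
]

# "Finetuning" corpus: notes paired with a binary task label
# (here, a hypothetical 30-day readmission flag).
LABELED_NOTES = [
    ("patient stable discharged home", 0),
    ("patient unstable transferred icu", 1),
    ("patient stable home", 0),
    ("patient unstable sepsis", 1),
]

def pretrain_embeddings(notes, window=2):
    """Co-occurrence word vectors as a stand-in for LM pretraining."""
    vocab = sorted({w for n in notes for w in n.split()})
    idx = {w: i for i, w in enumerate(vocab)}
    vecs = {w: [0.0] * len(vocab) for w in vocab}
    for n in notes:
        toks = n.split()
        for i, w in enumerate(toks):
            lo, hi = max(0, i - window), min(len(toks), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vecs[w][idx[toks[j]]] += 1.0
    return vecs, len(vocab)

def embed(note, vecs, dim):
    """Note embedding: mean of its (pretrained) word vectors."""
    toks = [t for t in note.split() if t in vecs]
    if not toks:
        return [0.0] * dim
    return [sum(vecs[t][k] for t in toks) / len(toks) for k in range(dim)]

def finetune(examples, vecs, dim, epochs=200, lr=0.5):
    """Logistic-regression head trained on frozen embeddings."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for note, y in examples:
            x = embed(note, vecs, dim)
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of log loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(note, vecs, dim, w, b):
    x = embed(note, vecs, dim)
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

vecs, dim = pretrain_embeddings(UNLABELED_NOTES)
w, b = finetune(LABELED_NOTES, vecs, dim)
print(predict("patient unstable icu", vecs, dim, w, b))
print(predict("patient stable home", vecs, dim, w, b))
```

The design choice the toy mirrors is the one the description emphasizes: the expensive, label-free step (pretraining on raw notes) is done once, while each predictive task needs only a lightweight supervised head, which is what keeps development and deployment low-resistance.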