Adding Machine Learning to Natural Language Processing Can Improve VTE Risk Assessment
SOURCE: HEMATOLOGYADVISOR.COM
AUG 30, 2024
In the last decade, artificial intelligence (AI) has emerged as a potential game-changer in medicine, with several branches displaying broad utility in healthcare.1 This includes machine learning (ML), where the AI system learns from data to detect patterns on its own without significant human intervention, and natural language processing (NLP), which gives AI systems the ability to recognize, generate, and manipulate human language.
Such capabilities have demonstrated the capacity to aid researchers and healthcare professionals alike by processing large amounts of data2 to support quicker and more accurate diagnosis,3 encourage novel drug discovery and development,4 and advance disease-prevention efforts at both the individual and population levels.5
In hematology, various studies have examined the use of AI for prediction, screening, diagnosis, and/or treatment of different conditions, including thrombosis, blood cancers, and cardiac arrhythmias.7 A systematic review and meta-analysis published in Blood Advances examined the concurrent use of ML and NLP for investigating cases of venous thromboembolism (VTE).8
“VTE [which includes both pulmonary embolism and deep vein thrombosis] is a leading cause of preventable death in hospitals and a crucial measure of quality and safety in healthcare,” explained corresponding author Rushad Patell, MD, at Beth Israel Deaconess Medical Center in Boston, Massachusetts. However, monitoring of affected patients is limited by the challenges of manual medical records review and diagnostic code interpretation.8
Aside from mortality, patients who experience acute pulmonary embolism also remain at risk for other complications, such as VTE recurrence, post-traumatic stress, chronic thromboembolic pulmonary hypertension, and post-thrombotic syndrome.8 As such, “improved detection can help clinicians understand better when and how VTE occurs, identify risk factors, and develop better prevention and treatment strategies.”
Although NLP could help to automate the process of medical records review and diagnostic code interpretation, rule-based NLP methods are time-consuming to implement. Combining ML and NLP together (ML-NLP) could avoid this limitation.
This analysis selected a total of 13 studies published before May 12, 2023 for systematic review from MEDLINE, EMBASE, PubMed, and the Web of Science, with 8 of the studies offering data for the meta-analysis. All included studies examined the use of ML-NLP to identify VTE diagnoses in electronic health records and met at least 13 of the 21 NLP-modified Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) items, demonstrating fair quality.
The authors noted that the highest-performing models used word vectorization rather than a “bag-of-words” representation, together with deep-learning techniques such as convolutional neural networks. Significant heterogeneity existed in the studies, and just 4 studies validated their model using an external dataset.
A total of 8 studies (with 10 different models) had a pooled sensitivity of 0.93 (95% CI, 0.88-0.96) and a specificity of 0.98 (95% CI, 0.97-0.99).8 The pooled positive predictive value (PPV) was 0.91 (95% CI, 0.87-0.94) and the negative predictive value (NPV) was 0.99 (95% CI, 0.98-0.99).
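For readers less familiar with these four measures, the short Python sketch below shows how each is computed from the counts of true and false positives and negatives. The function name and the counts are hypothetical and chosen only for illustration; they are not data from the meta-analysis, which pooled results across studies using statistical methods not reproduced here.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute four standard diagnostic measures from confusion-matrix counts.

    tp: records with VTE that the model flagged
    fp: records without VTE that the model flagged
    fn: records with VTE that the model missed
    tn: records without VTE that the model did not flag
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true VTE cases detected
        "specificity": tn / (tn + fp),  # share of non-VTE cases correctly cleared
        "ppv": tp / (tp + fp),          # share of flagged records that truly have VTE
        "npv": tn / (tn + fn),          # share of cleared records that truly lack VTE
    }


# Hypothetical counts for 1000 reviewed records (illustration only).
metrics = diagnostic_metrics(tp=90, fp=10, fn=10, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the model relative to the true diagnosis, whereas PPV and NPV describe how much a flagged or cleared record can be trusted, so the latter two also depend on how common VTE is in the records being screened.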
Further, the pooled performance of the 4 studies that used word vectorization16-18,24 was higher than that of studies that used a different approach, such as bag of words: sensitivity 0.96 (95% CI, 0.94-0.98) vs 0.91 (95% CI, 0.82-0.96), specificity 0.99 (95% CI, 0.94-0.99) vs 0.98 (95% CI, 0.96-0.99), PPV 0.96 (95% CI, 0.85-0.99) vs 0.89 (95% CI, 0.82-0.93), and NPV 0.99 (95% CI, 0.97-0.99) vs 0.98 (95% CI, 0.97-0.99).
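The difference between the two text representations can be illustrated with a toy Python sketch. It is not the method of any study in the review: the two-dimensional vectors in `TOY_EMBEDDINGS` are invented for illustration, whereas real word vectors are learned from large text collections and have many more dimensions. The sketch shows only the general idea that bag of words treats every word as unrelated to every other word, while word vectors can place related terms close together.

```python
import math
from collections import Counter


def bag_of_words(text, vocabulary):
    """Represent text as a vector of word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def average_embedding(text, embeddings):
    """Represent text as the mean of the vectors of its known words."""
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    dim = len(next(iter(embeddings.values())))
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Invented 2-dimensional vectors (assumption, illustration only).
TOY_EMBEDDINGS = {
    "clot": [0.9, 0.1],
    "thrombus": [0.85, 0.15],
    "fracture": [0.1, 0.9],
}
vocabulary = sorted(TOY_EMBEDDINGS)

# Bag of words: "clot" and "thrombus" share no counts, so similarity is zero.
bow_similarity = cosine(bag_of_words("clot", vocabulary),
                        bag_of_words("thrombus", vocabulary))

# Word vectors: the two related terms sit close together in the toy space.
embedding_similarity = cosine(average_embedding("clot", TOY_EMBEDDINGS),
                              average_embedding("thrombus", TOY_EMBEDDINGS))

print(bow_similarity, embedding_similarity)
```

In clinical notes, where the same finding may be described with different terms, a representation that captures such relatedness is one plausible reason vector-based models could perform better, although the review itself reports only the pooled results.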
These results showed that NLP with ML can be a successful method for identifying VTE. However, the researchers noted there was significant heterogeneity in the studies they examined. For example, the evaluation metrics varied across studies, and many used accuracy as their primary measure of model efficacy, among other differences.
The study authors cautioned that further research is required to move ML-NLP toward real-world implementation. “Introducing ML for VTE identification in medical records comes with its own share of challenges,” explained lead study author Barbara Lam, MD, a clinical fellow in medicine at Beth Israel Deaconess Medical Center. “One big hurdle is data standardization, since studies often differ in how they report and evaluate model performance, making it tough to compare results. There is also an issue with external validation; many models don’t perform as well when tested on new datasets, which raises questions about their reliability in real-world settings.”
“Developing and training these models is not easy either; it takes a lot of data and resources,” added co-first author Pavlina Chrysafi, MD, an internal medicine resident at Mount Auburn Hospital in Cambridge, Massachusetts. “Plus, getting these AI tools to fit smoothly into current clinical workflows can be tricky and requires proper training and preparation for clinicians.”
Overall, Dr Chrysafi noted that these tools can lead to better outcomes for patients if used correctly. “By automating the identification of health issues in records, ML-NLP can boost efficiency, cut down on errors, make large-scale disease more manageable and speed up research.”