Background: Recent advances in natural language processing (NLP) have heightened the interest of the medical community in its application to health care in general and to stroke, a medical emergency of great impact, in particular. In this rapidly evolving context, it is necessary to learn from and understand the experience already accumulated by the medical and scientific community.
Objective: The aim of this scoping review was to explore the studies conducted in the last 10 years using NLP to assist the management of stroke emergencies so as to gain insight on the state of the art, its main contexts of application, and the software tools that are used.
Methods: Data were extracted from Scopus and Medline through PubMed, using the keywords “natural language processing” and “stroke.” Primary research questions were related to the phases, contexts, and types of textual data used in the studies. Secondary research questions were related to the numerical and statistical methods and the software used to process the data. The extracted data were structured in tables and their relative frequencies were calculated. The relationships between categories were analyzed through multiple correspondence analysis.
Results: Twenty-nine papers were included in the review, with the majority being cohort studies of ischemic stroke published in the last 2 years. The majority of papers focused on the use of NLP to assist in the diagnostic phase, followed by the outcome prognosis, using text data from diagnostic reports and in many cases annotations on medical images. The most frequent approach was based on general machine learning techniques applied to the results of relatively simple NLP methods with the support of ontologies and standard vocabularies. Although smaller in number, there has been an increasing body of studies using deep learning techniques on numerical and vectorized representations of the texts obtained with more sophisticated NLP tools.
Conclusions: Studies focused on NLP applied to stroke show specific trends that can be compared to the more general application of artificial intelligence to stroke. The purpose of using NLP is often to improve processes in a clinical context rather than to assist in the rehabilitation process. The state of the art in NLP is represented by deep learning architectures, among which Bidirectional Encoder Representations from Transformers has been found to be especially widely used in the medical field in general, and for stroke in particular, with an increasing focus on the processing of annotations on medical images.
Stroke, also called “brain attack,” is a medical emergency that occurs when blood flow to a part of the brain is disrupted, either by a clot blocking an artery or by a cerebral hemorrhage due to a ruptured artery. Stroke can result in a range of symptoms and complications depending on the area of the brain that is affected, with impacts on perception, motor control (typically weakness or paralysis on one side of the body, dizziness, or difficulty with balance), or behavior (difficulty in speaking or understanding speech). It is a life-threatening emergency that requires immediate medical attention. Although mortality from stroke is decreasing in developed, high-income countries, it remains one of the leading causes of mortality and disability along with ischemic heart disease, and the prevalence of people living with the effects of stroke is increasing due to the growing and aging population.
Therefore, the economic and social costs related to the hospitalization, treatment, and recovery of stroke patients are increasing, and there is a growing demand for advanced technologies that can assist in clinical diagnosis, treatment, predictions of clinical events, intervention recommendations, rehabilitation programs, and related factors. For instance, quick diagnosis and treatment of stroke are crucial, as outcomes and prognosis improve among patients treated within the so-called “golden hour.”
In this context, novel approaches that complement and go beyond evidence-based medicine are required. Tools based on artificial intelligence (AI), with their ability to process large amounts of data, have been widely discussed in recent years as one of the proposed approaches to improve the care of stroke, assisting in diagnosis, prognosis, treatment, and prevention.
AI is an interdisciplinary science with multiple approaches, which in recent years has experienced a significant growth in the fields of machine learning (ML) and deep learning (DL). ML and DL algorithms can learn from data and improve their performance over time without being explicitly programmed, and these methods can deal with very large and complex data sets. DL is considered a recent specialization of ML, which uses artificial neural networks to extract complex representations and features from data. Throughout the manuscript, a distinction is made between DL, used for algorithms based on multilayered neural networks, and traditional ML based on other techniques.
The application of AI to the management of stroke is a topic that has gained a lot of traction in the general field of health informatics, partly owing to the considerable impact of stroke on public health and the resulting high demand for effective and efficient tools to diagnose and treat stroke. Moreover, the complexity and variety of stroke cases make it a good target for AI solutions, which are especially suited to process large amounts of data from a wide range of sources, identify patterns and trends in large data sets, and learn and adapt to new data.
A domain where those advances have produced particularly good results is natural language processing (NLP), which is a promising tool for medicine to unlock the full potential of electronic health records (EHRs), since it might be used to automatically transform clinical text into structured clinical data that can guide clinical decisions. The potential of NLP in the analysis of EHR data is particularly appealing given the great quantity of data contained in these records. Notwithstanding their importance, such data are intractable with conventional mathematical methods, since they are recorded in clinical reports, prescriptions, annotations on medical images, and generally unstructured texts.
NLP can assist in the identification of patterns and trends in large data sets, which can improve the understanding of factors that contribute to the development of diseases and can in turn help to define more effective prevention and treatment strategies. It can also be used in the analysis of particular cases to guide decisions and potentially delay or prevent the onset of the disease, and to develop intelligent systems that find relevant information in the medical literature.
Nevertheless, NLP poses particular challenges, including the protection of privacy in the extraction of data, since personal information is often mixed with other data; the variety of the quality and format of EHR data, which depend on the source and software used to collect them; and the difficulty of annotating data samples for training. Therefore, to unlock the potential of NLP in the exploitation of EHRs, researchers and developers need to combine different advanced ML techniques, apply careful data management, and gain a deep understanding of the clinical domain. There is, however, a paucity of guidance on selecting appropriate methods tailored to the health care industry.
This scoping review aimed to gather the knowledge that might help in that guidance by investigating how NLP is used to deliver a smarter health care in different phases of stroke disorders (prevention, diagnosis, treatment, and prognosis). The primary questions that served as a guide for the review are: (1) In which phases or contexts of stroke management is NLP used (prevention, diagnosis, treatment, and/or prognosis)? (2) Which are the main benefits of applying NLP to stroke management, related to clinical, social, and economic factors? and (3) What types of clinical data are collected and used by NLP in stroke management (ie, demographic data, medical notes, physical and functional examination, reports of laboratory or medical devices)?
This review also focused on the following secondary questions: (1) What NLP methods, AI algorithms, and tools are used in stroke studies? (2) Which AI techniques or frameworks are used to process and analyze the data? (3) Are there algorithms and NLP software specifically tuned for stroke? and (4) Which tools have the best performance and how do they compare to others?
The unregistered protocol for this review was created following the PRISMA-ScR (Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews) guidelines and the JBI Manual for Scoping Reviews.
The target patient population of this scoping review included adults who had had a stroke and people at risk of stroke due to a history of predisposing vascular conditions or other conditions that increase the risk of developing stroke, including mental illness or heart diseases such as a reduced ejection fraction.
The main concept of interest was the use of NLP in stroke management in public or private health care systems, including use cases and the data and technologies involved in those applications. We considered the application of NLP both for monitoring and decision-making for individual patients and for the planning of care resources in the management of stroke cases.
We were interested in any context where prevention, treatment, or rehabilitation of stroke might take place, including early detection outside or inside clinical settings, diagnosis and evaluation of cases, clinical decision-making, administration and monitoring of rehabilitation, and postrehabilitation management.
The types of evidence sources taken into account included articles from peer-reviewed journals, books, and conference papers, considering both primary research studies and systematic or scoping reviews, as well as reports from scientific, medical, or government institutions.
The search was performed in the electronic databases of Scopus and Medline through PubMed, using the keywords “natural language processing” and “stroke,” restricted to articles published in the last 10 years, between 2013 and 2022.
The results of the search were imported into the Zotero Reference Manager software (Corporation for Digital Scholarship, Virginia), which was used to filter out duplicate records. Titles and abstracts of the filtered list were screened independently by two reviewers to ascertain their eligibility according to the inclusion criteria. Disagreements were resolved in a discussion session between the reviewers to obtain a consensus.
The full text of the papers was read by two independent reviewers to extract the relevant data as described below. An internal cross-validation by three other experts on the topic was also considered. Works whose content did not meet the eligibility criteria or did not contain sufficient information to answer the primary questions were excluded, and those that reported the same results from the same study were treated as duplicates. The record of rejected works was shared between the reviewers so that each could confirm the decisions of the other.
Data Extraction and Presentation of Results
The reviewers filled out a table with the following data from each work included in the final selection: type of study, primary diagnosis, related diseases that were used either as inclusion criteria or as predictors in the data analysis, sample size (if suitable), and qualitative responses to the primary and secondary questions.
Works were classified depending on whether or not they reported experimental studies, and those that did were further subclassified as clinical trials or different types of observational studies: cross-sectional, retrospective or prospective, and cohort or case-control studies.
A dictionary of terms was defined for the tabulated records of the primary and secondary questions and their relative frequencies were calculated. In addition, the relationships between answers were analyzed in two different multiple correspondence analyses (MCAs), which can be employed to detect and represent underlying structures in categorical data sets (ie, frequent co-occurrence of specific categories in two or more variables). One of the MCAs focused on the primary questions, seeking relationships between the context of application (eg, classification of diagnostics, prognosis of outcomes) and the types of data that were processed. The other MCA focused on the secondary questions, seeking relationships between NLP methods and software tools. In both analyses, the type of AI models (general ML, DL, or rule-based algorithms) was also included as a variable. The analysis was performed in R, using the packages FactoMineR and factoextra for MCA and its graphical representation.
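As an illustration of the technique, MCA can be computed as a correspondence analysis of the indicator (one-hot) matrix of the categorical answers. The following is a minimal Python sketch under stated assumptions, not the analysis code of this review (which was run in R): the function name `mca`, the toy records, and the restriction to exactly one category per variable are all hypothetical.

```python
import numpy as np

def mca(records):
    """Sketch of MCA on records given as dicts {variable: category}.

    Assumes every record has the same variables and one category per variable.
    """
    # Column labels "variable=category" and the 0/1 indicator matrix Z
    labels = sorted({f"{var}={cat}" for rec in records for var, cat in rec.items()})
    index = {lab: j for j, lab in enumerate(labels)}
    Z = np.zeros((len(records), len(labels)))
    for i, rec in enumerate(records):
        for var, cat in rec.items():
            Z[i, index[f"{var}={cat}"]] = 1.0

    P = Z / Z.sum()          # correspondence matrix
    r = P.sum(axis=1)        # row masses (studies)
    c = P.sum(axis=0)        # column masses (categories)
    # Standardized residuals from the independence model r * c
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    inertia = sv ** 2                                  # inertia per dimension
    row_coords = (U * sv) / np.sqrt(r)[:, None]        # principal coordinates of studies
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]     # principal coordinates of categories
    return labels, row_coords, col_coords, inertia

# Hypothetical toy data: AI technique and context of application per study
toy_records = [
    {"technique": "ML", "context": "diagnosis"},
    {"technique": "ML", "context": "prognosis"},
    {"technique": "DL", "context": "prognosis"},
    {"technique": "rule-based", "context": "diagnosis"},
    {"technique": "DL", "context": "diagnosis"},
    {"technique": "rule-based", "context": "diagnosis"},
]
labels, row_coords, col_coords, inertia = mca(toy_records)
for lab, xy in zip(labels, col_coords[:, :2]):
    print(f"{lab:22s} {xy[0]: .3f} {xy[1]: .3f}")
```

Categories that tend to co-occur across records receive nearby coordinates in the first dimensions, which is the kind of proximity an MCA plot displays.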
General Description of the Studies
A total of 115 unique papers were identified out of 223 records obtained in the search; 29 studies were eventually included for data extraction and analysis after screening by title and abstract and reading of the full text (see the study flow diagram).
The general characteristics of the 29 reviewed studies (year, type of study, target diseases, and sample size), the items extracted from the primary questions, and the items extracted from the secondary questions are presented, respectively, in the first three tables below.
|Reference||Year||Type of study||Sample sizea||Type of stroke||Other conditions|
|Zhao et al ||2021||Cohort study||4914||Transient ischemic attack, hemorrhagic stroke||AFb|
|Zanotto et al ||2021||Retrospective cross-sectional cohort study||188||Ischemic stroke||AF, CADc, DMd, dyslipidemia, hypertension, smoking, othere|
|Sung et al ||2022||Retrospective cohort study||3847||Acute ischemic stroke||AF, CHFf, DM, cancer, hyperlipidemia, hypertension|
|Sung et al ||2021||Retrospective cohort study||3847||Acute ischemic stroke||AF, CHF, DM, cancer, hyperlipidemia, hypertension|
|Miller et al ||2022||Retrospective cohort study||918||Ischemic stroke||Other|
|Mayampurath et al ||2021||Cohort study||965||Acute ischemic stroke, hemorrhagic stroke||Other|
|Lineback et al ||2021||Retrospective cohort study||2855||Ischemic stroke, hemorrhagic stroke||AF, CAD, CHF, DM, cancer, hyperlipidemia, hypertension, other|
|Kogan et al ||2020||Retrospective cohort study||7149||Ischemic stroke, hemorrhagic stroke, transient ischemic attack||None|
|Heo et al ||2020||Retrospective cohort study||1810||Acute ischemic stroke||DM, dyslipidemia, hyperglycemia, hypertension, smoking, other|
|Deng et al ||2022||Feasibility study||1000 (simulated)||Hemorrhagic stroke||DM, hypertension|
|Bacchi et al ||2019||Cohort study||2201||Transient ischemic attack||None|
|Yu et al ||2021||Cohort study||1320||Ischemic stroke, hemorrhagic stroke||None|
|Wheater et al ||2019||Cohort study||2160||Ischemic stroke, hemorrhagic stroke||None|
|Sung et al ||2020||Cohort study||4640||Acute ischemic stroke||None|
|Sung et al ||2018||Feasibility study||90||Acute ischemic stroke||Hyperglycemia, other|
|Shek et al ||2021||Cohort study||2327||Stroke comorbidities||AF, CHF, DM, hypertension|
|Rannikmäe et al ||2021||Cohort study||207||Intracerebral hemorrhage, subarachnoid hemorrhage, and ischemic stroke||None|
|Ong et al ||2020||Cohort study||721||Acute ischemic stroke||None|
|Mowery et al ||2016||Cohort study||498||Ischemic stroke||CAD, CHF, DM, hypertension|
|Li et al ||2021||Cohort study||3971||Acute or subacute ischemic stroke||None|
|Leung et al ||2021||Cohort study||182||Not applicable||Other|
|Kim et al ||2019||Cohort study||3204||Acute ischemic stroke||None|
|Kent et al ||2021||Retrospective cohort study||261,960||Ischemic stroke||AF, CAD, CHF, DM, hyperlipidemia, hypertension, other|
|Lin et al ||2021||Retrospective cohort study||1700||Acute ischemic stroke||Other|
|Guan et al ||2021||Cohort study||1598||Ischemic stroke||CHF, other|
|Garg et al ||2019||Cohort study||1091||Ischemic stroke||AF, CAD, DM, hyperlipidemia, hypertension|
|Farran et al ||2022||Retrospective cohort study||16,916||Not applicable||AF|
|Elkin et al ||2021||Cohort study||96,681||Not applicable||AF|
|Bacchi et al ||2022||Cohort study||438||Ischemic stroke, hemorrhagic stroke||None|
aNumber of patients involved.
bAF: atrial fibrillation.
cCAD: coronary artery disease.
dDM: diabetes mellitus.
eOther refers to conditions that are not already listed in the table.
fCHF: congestive heart failure.
The vast majority were cohort studies that analyzed clinical aspects, along with societal or economic aspects of the disease in some cases, at the moment of data gathering. Approximately one third of the papers (n=10) also included a retrospective analysis and 2 of them were limited to feasibility studies. Although the search included a time span of 10 years, only one of the studies included in the review was older than 5 years and most studies (n=19) had been published in the last 2 years (2021 or 2022).
Most studies (n=24) focused on ischemic stroke (either acute, subacute, or transient); the second most frequent type of stroke was hemorrhagic stroke (n=9), which in the majority of cases was studied in addition to ischemic stroke rather than on its own (only 2 papers dealt exclusively with hemorrhagic stroke). Many studies considered other clinical conditions that were used to select the patients or were included as information taken into account by the models. The most common conditions were atrial fibrillation, diabetes mellitus, and hypertension; each of them was considered in one third of the reviewed papers (n=10). Conditions considered less frequently were hyper- or dyslipidemia, hyperglycemia, hypercholesterolemia, congestive heart failure, smoking, and cancer.
The sample size of the cohort studies varied widely, ranging from 182 patients to more than 260,000 patients, with a median sample size of 2160 patients. The two feasibility studies were conducted with simulated cases in one instance and with a smaller sample of 90 patients in the other.
The last of the tables below shows the frequency of each category used to classify the answers to the primary and secondary questions, except for the question about the specificity of algorithms and NLP tools for stroke, since there was little variability in those answers.
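The range and median reported above can be checked directly against the sample sizes listed in the table of study characteristics. The short Python sketch below transcribes those values by hand (the variable name `cohort_sizes` is illustrative) and leaves out the two feasibility studies, following the text.

```python
from statistics import median

# Sample sizes of the 27 cohort studies, transcribed from the table of study
# characteristics; the two feasibility studies (1000 simulated cases and
# 90 patients) are excluded, as in the text.
cohort_sizes = [
    4914, 188, 3847, 3847, 918, 965, 2855, 7149, 1810,
    2201, 1320, 2160, 4640, 2327, 207, 721, 498, 3971,
    182, 3204, 261960, 1700, 1598, 1091, 16916, 96681, 438,
]

print(len(cohort_sizes))     # number of cohort studies
print(min(cohort_sizes))     # smallest cohort
print(max(cohort_sizes))     # largest cohort
print(median(cohort_sizes))  # median sample size
```

Because the distribution is strongly right-skewed by a few very large registry-based cohorts, the median is a more informative summary here than the mean.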
|Reference||Context for NLPa use||Expected benefits||Types of clinical datab|
|Zhao et al ||Prevention and diagnosis (classification)||CLINICAL: improved triage||Demographic data, laboratory test results, medical history, medication|
|Zanotto et al ||Prognosis (outcomes)||CLINICAL: care information management, characterize patients, prediction of outcomes, risk assessment; SOCIETAL: supporting research studies; ECONOMIC: public health management||Diagnostic reports|
|Sung et al ||Prognosis (outcomes)||CLINICAL: prediction of outcomes||Annotated medical images, clinical scales, demographic data, diagnostic reports, medical history, patient treatments|
|Sung et al ||Prognosis (outcomes)||CLINICAL: prediction of outcomes, risk assessment||Annotated medical images, clinical scales, demographic data, diagnostic reports, functional outcomes data|
|Miller et al ||Prognosis (outcomes)||CLINICAL: prediction of outcomes, risk assessment||Annotated medical images, diagnostic reports|
|Mayampurath et al ||Diagnosis (classification)||CLINICAL: improved triage||Diagnostic reports|
|Lineback et al ||Prognosis (recurrence)||CLINICAL: care information management||Demographic data, diagnostic reports, medical history, medication, patient treatments|
|Kogan et al ||Prognosis (outcomes)||CLINICAL: administration of treatments, care information management, improved triage, prediction of outcomes||Demographic data, clinical scales, medical history, patient treatments, medication|
|Heo et al ||Prognosis (outcomes)||CLINICAL: prediction of outcomes||Annotated medical images, diagnostic reports|
|Deng et al ||Diagnosis (details); treatment||CLINICAL: administration of treatments||Annotated medical images, clinical scales, diagnostic reports, medical history|
|Bacchi et al ||Diagnosis (classification)||CLINICAL: stroke cause prediction||Annotated medical images, diagnostic reports, medical history, medication|
|Yu et al ||Diagnosis (details)||CLINICAL: improved triage; ECONOMIC: public health management||Annotated medical images, diagnostic reports|
|Wheater et al ||Diagnosis (classification)||CLINICAL: disease surveillance, improved triage; ECONOMIC: public health management||Annotated medical images, diagnostic reports|
|Sung et al ||Prevention and diagnosis (classification)||CLINICAL: administration of treatments, care information management, disease surveillance; ECONOMIC: public health management||Diagnostic reports|
|Sung et al ||Diagnosis (details); treatment||CLINICAL: administration of treatments||Diagnostic reports, laboratory test results, medical history|
|Shek et al ||Diagnosis (comorbidities)||CLINICAL: care information management||Demographic data, medical history|
|Rannikmäe et al ||Diagnosis (classification)||CLINICAL: improved triage||Annotated medical images, diagnostic reports|
|Ong et al ||Diagnosis (details)||CLINICAL: administration of treatments, prediction of outcomes; SOCIETAL: supporting research studies||Annotated medical images, diagnostic reports|
|Mowery et al ||Prevention||CLINICAL: risk assessment||Diagnostic reports|
|Li et al ||Diagnosis (classification)||CLINICAL: improved triage||Annotated medical images, diagnostic reports|
|Leung et al ||Diagnosis (details)||CLINICAL: care information management, characterize patients||Annotated medical images, diagnostic reports|
|Kim et al ||Diagnosis (classification)||CLINICAL: care information management, characterize patients||Annotated medical images, laboratory results, demographic data, diagnostic reports, functional outcomes data|
|Kent et al ||Prognosis (outcomes)||CLINICAL: care information management, characterize patients, stroke cause prediction||Annotated medical images, diagnostic reports|
|Lin et al ||Diagnosis (details); prognosis (recurrence)||SOCIETAL: supporting research studies||Diagnostic reports|
|Guan et al ||Diagnosis (classification)||CLINICAL: improved triage||Clinical scales, diagnostic reports|
|Garg et al ||Diagnosis (classification)||CLINICAL: improved triage, risk assessment||Annotated medical images, diagnostic reports, medical history|
|Farran et al ||Diagnosis (classification); prognosis (outcomes)||CLINICAL: stroke cause prediction, disease surveillance; ECONOMIC: public health management||Clinical scales, demographic data, medical history, patient treatments|
|Elkin et al ||Diagnosis (classification)||Not applicable||Clinical scales, demographic data|
|Bacchi et al ||Diagnosis (classification)||Not applicable||Diagnostic reports, patient treatments|
aNLP: natural language processing.
bThe definitions of clinical data types follow Jiang et al.
|Reference||AIa technique||NLPb methodsc||Other statistical methodsc||Software packagesc,d||Performance metricsc||Best performing methods|
|Zhao et al ||MLe||Regular expressions||LRf, RFg||MedTagger, Weka||PPVh, NPVi, F1, sensitivity||RF|
|Zanotto et al ||ML||Ontologies (OWLj), BERTk, BOWl, TF-IDFm||CNNn, K-NNo, RF, SVMp, naïve Bayes||spaCy||PPV, F1, sensitivity||SVM ontological rules|
|Sung et al ||ML||Negation extraction ontologies (UMLSq)||Gradient boosting||Jazzy spell checker, MetaMap, XGBoostr||AUCs, IDIt, NRIu||Not applicable|
|Sung et al ||DL||BOW, BERT (ClinicalBERT)||Not applicable||Jazzy spell checker||AUC, IDI, NRI||Not applicable|
|Miller et al ||DL rule-based||BOW, negation extraction, TF-IDF, BERT (BioClinicalBERT)||LASSOv, K-NN, RF, MLPw||scikit-learn||AUC, PPV, sensitivity, specificity||BioClinicalBERT (except for rare and continuous outcomes)|
|Mayampurath et al ||ML||N-grams (1- or 2-)||SVM||Not applicable||AUC, PPV, NPV, sensitivity, specificity||Not applicable|
|Lineback et al ||ML||N-grams (1- or 2-), TF-IDF, Word-embedding (Word2Vec)||LASSO, LR, PCAx, RF, SVM, gradient boosting, naïve Bayes||XGBoost||AUC||ML methods in general|
|Kogan et al ||ML rule-based||Not applicable||RF, gradient boosting, MLP||Not applicable||Correlations, RMSEy||Not applicable|
|Heo et al ||DL||BOW, Word-embedding (sent2vec, BioWordVec)||Decision trees, CNN, LASSO, LSTMz, MLP, RF, SVM||Quanteda, NLTKaa, Tensorflow, Keras||AUC||Document-level methods, CNN|
|Deng et al ||DL rule-based||BERT||Not applicable||Not applicable||AUC, PPV, NPV, sensitivity, specificity||Not applicable|
|Bacchi et al ||DL||BOW, negation extraction||Decision trees, CNN, LSTM, RF||Not applicable||AUC, PPV, NPV, sensitivity, specificity||CNN|
|Yu et al ||Rule-based||Regular expressions||Not applicable||CHARTextract||PPV, NPV, accuracy, sensitivity, specificity||Not applicable|
|Wheater et al ||Rule-based||Regular expressions, grammatical analysis, ontologies (custom), negation extraction||Not applicable||BRAT rapid annotation tool||PPV, sensitivity, specificity||Not applicable|
|Sung et al ||ML rule-based||Grammatical analysis (part-of-speech), negation extraction, ontologies (UMLS)||Decision trees (CARTbb), K-NN, LR, RF, SVM||Google spell checker, MetaMap, Weka||Accuracy, κ||Mixed results|
|Sung et al ||Not applicable||Grammatical analysis (part-of-speech), negation extraction, ontologies (UMLS)||Not applicable||Google spell checker, MetaMap, Stata||NPV, F1, sensitivity, specificity||Document-level methods|
|Shek et al ||DL||Grammatical analysis, Negation extraction, Ontologies (SNOMEDcc)||Not applicable||MedCAT||NPV, F1, sensitivity, specificity||Not applicable|
|Rannikmäe et al ||ML rule-based||Ontologies (UMLS)||Not applicable||SemEHR||PPV, sensitivity||Mixed results|
|Ong et al ||DL||BOW, TF-IDF, Word-embedding (GloVEdd)||Decision trees (CART), K-NN, LR, LSTM, RF||scikit-learn, Tensorflow||AUC, F1, accuracy, sensitivity, specificity||GloVE + LSTM|
|Mowery et al ||Rule-based||Regular expressions||Not applicable||pyConTexT||PPV, NPV, sensitivity, specificity||Not applicable|
|Li et al ||ML||BOW, N-gram (2- and 3-), negation extraction||RF||scikit-learn, NLTK||F1, accuracy||Not applicable|
|Leung et al ||DL rule-based||Not applicable||Not applicable||MedTagger||PPV, NPV, accuracy, sensitivity, specificity||Not applicable|
|Kim et al ||ML||N-gram (1- and 2-), TF-IDF||Decision trees, LR, naïve Bayes, RF, SVM||Quanteda||AUC, F1||Single decision trees|
|Kent et al ||DL rule-based||Ontologies (named entity recognition)||Not applicable||MedTagger||PPV, NPV, accuracy, sensitivity, specificity||Not applicable|
|Lin et al ||DL||BERT (ClinicalBERT, StrokeBERT)||Not applicable||spaCy||AUC, F1||StrokeBERT|
|Guan et al ||ML||Regular expressions, negation extraction||Decision trees (CART), K-NN, LR, RF, SVM||Quanteda||AUC, PPV, NPV, F1, accuracy, specificity||RF|
|Garg et al ||ML||BOW, N-grams (1- to 3-)||Decision trees, K-NN, stacking LR, PCA, RF, SVM, gradient boosting||cTAKES, spaCy, XGBoost||AUC, sensitivity, κ||Stacking, LR, gradient boost|
|Farran et al ||ML||Ontologies (SNOMED), negation extraction||Not applicable||MedCAT||Accuracy||Not applicable|
|Elkin et al ||ML||Ontologies (SNOMED)||Not applicable||HD-NLPee||PPV, NPV, sensitivity, specificity||Not applicable|
|Bacchi et al ||ML||BOW, N-grams (1- to 3-), negation extraction||Decision trees, LR, RF||scikit-learn, NLTK||AUC, PPV, NPV, sensitivity, specificity||RF|
aAI: artificial intelligence.
bNLP: natural language processing.
cSee brief descriptions of the NLP tools, statistical methods, software packages, and performance metrics in[ - ].
dExcluding general programming frameworks like Python or R.
eML: machine learning.
fLR: logistic regression.
gRF: random forest.
hPPV: positive predictive value.
iNPV: negative predictive value.
jOWL: Web Ontology Language.
kBERT: Bidirectional Encoder Representations from Transformers.
lBOW: bag-of-words.
mTF-IDF: term frequency-inverse document frequency.
nCNN: convolutional neural network.
oK-NN: K-nearest neighbor.
pSVM: support vector machine.
qUMLS: Unified Medical Language System.
rXGBoost: extreme gradient boosting.
sAUC: area under the curve.
tIDI: integrated discrimination index.
uNRI: Net Reclassification Index.
vLASSO: least absolute shrinkage and selection operator.
wMLP: multilayer perceptron.
xPCA: principal component analysis.
yRMSE: root mean squared error.
zLSTM: long short-term memory.
aaNLTK: Natural Language Toolkit (for Python).
bbCART: classification and regression tree.
ccSNOMED: Systematized Nomenclature of Medicine.
ddGloVE: Global Vectors for Word Representation.
eeHD-NLP: high-definition natural language processing.
|Variable and categoryb||Studies, n (%)|
|Diagnostic (classification)||13 (45)|
|Diagnostic (details)||6 (21)|
|Prognostic (outcomes)||8 (28)|
|Prognostic (recurrence)||2 (7)|
|Improved triage||9 (31)|
|Care information management||8 (28)|
|Prediction of outcomes||7 (24)|
|Administration of treatments||5 (17)|
|Risk assessment||5 (17)|
|Patient characterization||4 (14)|
|Disease surveillance||3 (10)|
|Stroke causes||3 (10)|
|Diagnostic reports||24 (83)|
|Annotated images||15 (52)|
|Medical history||10 (34)|
|Demographic data||9 (31)|
|Clinical scales||7 (24)|
|Laboratory results||3 (10)|
|Functional outcomes data||2 (7)|
|Artificial intelligence technique|
|Natural language processing tools|
|Negation extraction (NEGEX)||11 (38)|
|Bidirectional Encoder Representations from Transformers (BERT)||5 (17)|
|Regular expressions (REG-EXPR)||5 (17)|
|Grammatical analysis||4 (14)|
|Other statistical tools|
|Random forest (RF)||14 (48)|
|Decision trees||8 (28)|
|Support vector machine (SVM)||7 (24)|
|Logistic regression (LR)||7 (24)|
|K-nearest neighbor (K-NN)||6 (21)|
|Gradient boosting||4 (14)|
|Naïve Bayes||3 (10)|
|Multilayer perceptron (MLP)||3 (10)|
|Long short-term memory (LSTM)||3 (10)|
|Principal component analysis (PCA)||2 (7)|
|Based on ratios (PPVh, NPVi, F1, accuracy, sensitivity, or specificity)||23 (79)|
|Based on ROCj curves (AUCk, C-statistic)||14 (48)|
|Differential measures (NRIl, IDIm)||2 (7)|
aOnly the items that occurred more than once are reported in this table; however, since different items often overlapped in each study, the frequencies of each variable normally sum to more than 100%.
bSee brief descriptions of the NLP tools, statistical methods, software packages, and performance metrics in[ - ].
cML: machine learning.
dDL: deep learning.
eTF-IDF: term frequency-inverse document frequency.
fNLTK: Natural Language Toolkit (for Python).
gXGBoost: extreme gradient boosting.
hPPV: positive predictive value.
iNPV: negative predictive value.
jROC: receiver operating characteristic.
kAUC: area under the curve.
lNRI: Net Reclassification Index.
mIDI: integrated discrimination index.
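The percentages in the frequency table are computed per category over the number of studies, so a study that falls into several categories of the same variable is counted once in each. The Python sketch below illustrates this with hypothetical toy data (the study names, categories, and the function `category_frequencies` are illustrative only, not taken from the review).

```python
from collections import Counter

def category_frequencies(studies):
    """Return {category: (n, percent of studies)} for multi-valued answers."""
    n_studies = len(studies)
    # set() ensures a category is counted at most once per study
    counts = Counter(cat for cats in studies.values() for cat in set(cats))
    return {cat: (n, round(100 * n / n_studies)) for cat, n in counts.most_common()}

# Hypothetical toy data: types of clinical data used by four studies
toy_studies = {
    "Study A": ["diagnostic reports", "annotated images"],
    "Study B": ["diagnostic reports"],
    "Study C": ["diagnostic reports", "medical history"],
    "Study D": ["annotated images", "medical history"],
}

freqs = category_frequencies(toy_studies)
for cat, (n, pct) in freqs.items():
    print(f"{cat}: {n} ({pct})")
print("sum of percentages:", sum(pct for _, pct in freqs.values()))
```

In this toy case the percentages add up to well over 100, for the same reason given in the first footnote of the table.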
The most frequent context of stroke in which the studies were applied was the diagnostic phase, followed by the prognosis of outcomes. The potential benefit of the results on clinical processes (eg, improving the triage of patients depending on the type or severity of stroke, more efficient management of care information) was the main focus of all studies but one , which chiefly focused on the societal aspect of supporting research studies, similar to two other studies that also evaluated that aspect along with clinical applications. Five of the 29 studies (17%) also considered the potential economic benefit of NLP, in terms of reducing the costs of stroke for the public health sector.
The most frequent source of data for NLP models was diagnostic reports (n=24), followed in many cases by annotations on medical images such as radiographs and scans (n=15). General ML models were used more frequently than DL or rule-based algorithms to process the data (n=15 for ML vs n=10 papers for either DL or rule-based techniques). NLP tools, other statistical methods, and the software packages that were used to implement them highly varied across papers, although there were some associations with the AI technique and other variables (see the next subsection).
In nearly all studies, the AI architectures and algorithms had been adapted to deal with stroke-related data, except for one study that used an ML model for patients with severe mental illness at risk of stroke. One of the studies used a software tool that was specifically designed for stroke [ ], StrokeBERT, which is a language representation model based on Google’s Bidirectional Encoder Representations from Transformers (BERT) [ ]. Other studies used models that were adapted to broader medical terminology, including ClinicalBERT [ ], BioClinicalBERT [ ], and BioWordVec [ ], or models tuned with standard medical vocabularies such as Systematized Nomenclature of Medicine (SNOMED) [ ] or Unified Medical Language System (UMLS) [ ].
The methods used to compare the performance of the models were also highly varied. In the great majority of cases (n=23), they were metrics based on the ratios of true/false-positive or -negative values (positive predictive value, negative predictive value, sensitivity, specificity, F1 score, or accuracy), and many were based on the receiver operating characteristic curve (n=14). A few studies (n=2) also used measures of classification improvement such as the net reclassification index and the integrated discrimination index, and only one study used other statistics such as correlation coefficients or the root mean squared error [ ].
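As an illustration of the ratio-based metrics named above, the following minimal Python sketch derives them from the four counts of a binary confusion matrix. The counts are hypothetical and are not taken from any of the reviewed studies; the class and function names are also illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def classification_metrics(c: ConfusionCounts) -> dict:
    """Ratio-based metrics from confusion-matrix counts.

    Assumes every denominator is nonzero (no guard for empty classes).
    """
    sensitivity = c.tp / (c.tp + c.fn)  # also called recall
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)          # also called precision
    npv = c.tn / (c.tn + c.fn)
    accuracy = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for a stroke/no-stroke text classifier
counts = ConfusionCounts(tp=40, fp=10, tn=45, fn=5)
metrics = classification_metrics(counts)
```

All six values come from the same four counts, which is why studies that report different subsets of them can be hard to compare directly.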
Owing to the variety of methods and tools used in the studies, the studies rarely coincided in the method selected as the best. The only methods that were chosen as the best performing in more than one study were random forest (n=3), convolutional neural network (n=2), and BERT (n=2).
Multiple Correspondence Analysis
The two MCA plots show the proximity of the categories that exhibited the closest relationships in the first two dimensions obtained in the MCA.
The common variable used in the analysis (AI technique) was clearly distinguished in the first two dimensions of the MCA plot, which on the one hand separated rule-based techniques from ML and DL and on the other hand separated general ML from DL.
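MCA is commonly computed from an indicator (one-hot) matrix of the categorical variables, or from the corresponding table of category co-occurrences (the Burt table). The Python sketch below only builds these two inputs for a hypothetical coding of four studies on two variables; it does not perform the decomposition itself, and the data, variable names, and single-category-per-variable coding are assumptions made for illustration.

```python
# Hypothetical coding of four studies on two categorical variables
studies = [
    {"technique": "ML", "context": "diagnosis"},
    {"technique": "ML", "context": "diagnosis"},
    {"technique": "DL", "context": "prognosis"},
    {"technique": "rule-based", "context": "diagnosis"},
]

# One column per (variable, category) pair, in a fixed sorted order
categories = sorted({(var, val) for s in studies for var, val in s.items()})

# Indicator matrix Z: one row per study, 1 where the study has that category
indicator = [
    [1 if s[var] == val else 0 for var, val in categories]
    for s in studies
]

# Burt table B = Z^T Z: how often each pair of categories co-occurs
n_cat = len(categories)
burt = [
    [sum(row[i] * row[j] for row in indicator) for j in range(n_cat)]
    for i in range(n_cat)
]

ml = categories.index(("technique", "ML"))
dl = categories.index(("technique", "DL"))
diagnosis = categories.index(("context", "diagnosis"))
prognosis = categories.index(("context", "prognosis"))
```

Categories that co-occur often, such as `burt[ml][diagnosis]` in this toy data, tend to be plotted close together in the low-dimensional MCA space.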
In the first MCA, it could be observed that the studies focusing on the classification of diagnostics (often used for the triage of patients) and prospects of recurrent stroke were often those that also used ML techniques with demographic data and information on treatments. Although the other categories were less tightly related, the text associated with clinical tests and the annotations on images were related more closely to prognostics of outcomes than to other contexts of application, with annotated images also being used to ascertain details of the stroke episode. Both types of studies were frequently approached by DL and sometimes by rule-based techniques.
In the other MCA, AI techniques were separated between ML, DL, and rule-based methods in the two main dimensions of the projected space, although only general ML and DL were closely related to other items.
ML was related to NLP methods that are used in the first steps of the processing pipeline, such as the extraction of text tokens in the form of n-grams, detection of negated terms, and use of standard vocabularies. This was mostly performed with software tools such as MetaMap, MedCAT, Quanteda, and extreme gradient boosting.
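The early pipeline steps mentioned above can be sketched in a few lines of Python. This is a deliberately simplified illustration: the list of negation cues, the window size, and the example report are hypothetical, and the negation rule is far simpler than published algorithms such as ConText, which appears in the reference list.

```python
import re

# Hypothetical, deliberately tiny list of negation cues
NEGATION_CUES = {"no", "not", "without", "denies"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """All contiguous runs of n tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def negated_terms(text: str, terms: set[str], window: int = 5) -> set[str]:
    """Terms preceded by a negation cue within `window` tokens, per sentence."""
    negated = set()
    for sentence in re.split(r"[.!?]", text):
        tokens = tokenize(sentence)
        for i, tok in enumerate(tokens):
            preceding = tokens[max(0, i - window):i]
            if tok in terms and any(t in NEGATION_CUES for t in preceding):
                negated.add(tok)
    return negated


# Hypothetical report text
report = "No evidence of acute infarct. Chronic hemorrhage noted."
bigrams = ngrams(tokenize(report), 2)
negated = negated_terms(report, {"infarct", "hemorrhage"})
```

Here "infarct" is flagged as negated while "hemorrhage" is not, because the negation cue is only searched for within the same sentence.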
Conversely, DL was more associated with the usage of BERT, a language representation model based on transformers, and NLP methods applied to numerical and vectorized representations of the language tokens, such as the “bag-of-words,” term frequency-inverse document frequency word embeddings, and other word embeddings. This was chiefly performed with software packages such as TensorFlow through Keras and scikit-learn. Other software packages that are often used for NLP, such as Natural Language Processing toolkit for Python, were observed in the middle of the primary axis of the MCA plot, halfway between the general ML and DL architectures.
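To make the vectorized representations more concrete, the Python sketch below computes bag-of-words counts and term frequency-inverse document frequency weights for three hypothetical report snippets. It assumes the plain textbook weighting, tf × log(N/df); software libraries typically apply smoothed or normalized variants, so their numbers would differ.

```python
import math
from collections import Counter

# Hypothetical report snippets (not taken from the reviewed studies)
docs = [
    "acute ischemic stroke in left hemisphere",
    "no acute hemorrhage",
    "chronic ischemic changes",
]

tokenized = [d.split() for d in docs]
vocabulary = sorted({tok for doc in tokenized for tok in doc})


def bag_of_words(tokens: list[str]) -> list[int]:
    """Raw count of each vocabulary term in one document."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def idf(term: str) -> float:
    """Unsmoothed inverse document frequency: log(N / df)."""
    df = sum(1 for doc in tokenized if term in doc)
    return math.log(len(tokenized) / df)


def tf_idf(tokens: list[str]) -> list[float]:
    """Relative term frequency multiplied by inverse document frequency."""
    counts = Counter(tokens)
    return [counts[t] / len(tokens) * idf(t) for t in vocabulary]


bow = [bag_of_words(doc) for doc in tokenized]
tfidf = [tf_idf(doc) for doc in tokenized]
```

In the second snippet, "hemorrhage" receives a higher weight than "acute" because "acute" also occurs in another document and is therefore less distinctive.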
The research on AI for stroke management has gained greater interest and impact in the last few years, and the growing rate of publications found in this scoping review reveals that the same trend is occurring in research on NLP, which is a particular field of AI, applied to the same clinical condition. However, in other aspects, the studies focused on NLP show their own specific trends.
Although the search for this scoping review was very broad, and did not limit the type and phase of stroke to be studied, the vast majority of studies were focused on ischemic stroke in its acute, subacute, or transient stage, and the purpose of using NLP was to improve processes in a clinical context. This focus on clinical contexts is related to the relevance that is attributed to the unstructured information contained in EHRs (ie, in notes, reports, and annotated images) as predictors of outcomes and complications, which are crucial for proper decision-making, together with the difficulty of processing that information automatically with traditional tools. The deployment of NLP models integrated in the pipelines of an EHR, programmed to automatically ingest and process incoming records, or even the patients’ commentaries in emergency through voice-to-text [ ], may be used to identify patients at high risk and requiring prompt access to specific treatments; find signs to anticipate impending stroke; or evaluate its severity, type, and risks of complications.
Efficient triage of patients in emergency and early consultations, more accurate diagnostics, or prognostics of outcomes and recurrence were the main intended applications of NLP models in the reviewed studies. Accordingly, the main sources of information exploited by NLP algorithms were clinical data of the patients obtained from their history, especially the diagnostic reports of the current stroke episode. Administration and monitoring of rehabilitation, or postrehabilitation management, were not dealt with in the final selection of studies that were the object of the review.
NLP is itself a broad concept, which involves many types of computational techniques. In its more general sense, NLP comprises all methods and tools that can be used to analyze texts in order to represent human languages, based either on theory of language constructs, semantic mappings, or emulation of linguistic processes occurring in the human brain. The relationships between these tools, types of statistical and ML models, data sources, and applications found by the MCA help to understand how each subset of techniques can be used to solve different problems, and can also help to interpret some trends in the evolution of this technology applied to the clinical management of stroke.
Some of these methods rely on text-processing algorithms that use predefined rules and vocabularies, such as the tokenization of long texts into smaller items, categorization of those items in parts of speech, and construction of syntactic structures, and they have been widely used since long before the recent revolution of big data and DL fields. What this revolution has provided to the field of NLP is the maturity of more complex representations of language data, such as the word embeddings into large-dimensional numeric vectors and their effective processing through deep neural networks, as well as the exploitation of huge databases of texts, such as the Common Crawl data set that includes petabytes of text data, crawled monthly from dozens of billions of web pages.
In this context, the state of the art in NLP is represented by DL architectures such as GPT, XLNet, or BERT. Among these, BERT has been found to be particularly widely used in the medical field in general, and for stroke in particular, along with specialized versions fitted to these applications that improve their performance [ , ]. More basic ML algorithms and hybrid approaches with rule-based techniques are still more common than advanced DL networks in the recent research on NLP for stroke, and in some cases, tailored rule-based systems outperformed BERT and its derivatives [ , ]. Support vector machine methods were also found to perform better than BERT in one study [ ], although random forest was reported to have the best performance more frequently than any other ML method in the set of reviewed studies [ , , ]. Some of these results may seem unexpected, given the remarkable performance of DL in general, and particularly large language models (LLMs), in other areas. However, the computational complexity and large data sets needed to train LLMs can limit their current scalability, so that they do not outperform other ML methods that work better on limited training data, such as the data sets of the mentioned studies.
The prevalence of studies based on traditional ML methods over those that use DL neural networks may be partly due to the recency of the more complex DL architectures, as well as to the need for larger data sets to train those models, which raises the bar to conduct studies with that approach. However, it is also interesting to observe that the choice of the AI technique relates to the type of data that are processed and the context of application of NLP, such that DL is more closely related to studies that involve medical imaging with annotations to prognosticate the outcomes of stroke.
Taking into account these pieces of evidence, and considering the future of NLP in stroke, further development of LLMs in the biomedical field may be expected. LLMs emerged in 2018 as a class of language models that use neural networks with billions of parameters trained on huge amounts of unlabeled text data through self-supervised learning. LLMs are often based on transformers, an architecture that uses a self-attention mechanism to compute contextual relationships between the input tokens. However, innovation in the NLP field will come from the development of these models for medical specialties such as stroke. These biomedical LLMs can be trained not only with data sources from EHRs but also from scientific and clinical publications and social network posts from specialized fields. The particularity is that these models need to be trained on much larger databases than those used by classical ML algorithms to achieve adequate performance metrics. This involves combining computational resources and very large data sources, which is not always feasible with the resources available for research.
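The self-attention mechanism mentioned above can be illustrated with a minimal Python sketch of scaled dot-product attention. As a simplifying assumption, the queries, keys, and values are all set equal to the input embeddings, whereas actual transformers apply learned projections and use multiple attention heads; the two-dimensional embeddings are hypothetical.

```python
import math


def softmax(xs: list[float]) -> list[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: list[list[float]]):
    """Scaled dot-product self-attention with Q = K = V = x.

    Simplification: no learned projection matrices and a single head.
    Returns the contextualized vectors and the attention weights.
    """
    d = len(x[0])
    outputs, weights = [], []
    for q in x:
        # Similarity of this token to every token, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        # Weighted average of all token vectors
        outputs.append([sum(wj * v[c] for wj, v in zip(w, x)) for c in range(d)])
    return outputs, weights


# Hypothetical 2-dimensional embeddings for three tokens
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual, attention = self_attention(embeddings)
```

Each output vector is a weighted mixture of all input vectors, which is how the representation of a token comes to depend on its context.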
This review was conducted within the framework of the IBERUS project Technological Network of Biomedical Engineering Applied to Degenerative Pathologies of the Neuromusculoskeletal System in Clinical and Outpatient Settings (CER-20211003) and the CERVERA Network financed by the Ministry of Science and Innovation through the Center for Industrial Technological Development (CDTI), charged to the General State Budgets 2021 and the Recovery, Transformation, and Resilience Plan.
Conflicts of Interest
Categories of clinical data. DOCX File, 15 KB
Description of artificial intelligence (AI), natural language processing (NLP), and statistical tools. DOCX File, 20 KB
PRISMA-ScR checklist. PDF File (Adobe PDF File), 103 KB
- Stinear CM, Lang CE, Zeiler S, Byblow WD. Advances and challenges in stroke rehabilitation. Lancet Neurol 2020 Apr;19(4):348-360 [CrossRef] [Medline]
- Sirsat MS, Fermé E, Câmara J. Machine learning for brain stroke: a review. J Stroke Cerebrovasc Dis 2020 Oct;29(10):105162 [CrossRef] [Medline]
- Abedi V, Khan A, Chaudhary D, Misra D, Avula V, Mathrawala D, et al. Using artificial intelligence for improving stroke diagnosis in emergency departments: a practical framework. Ther Adv Neurol Disord 2020 Aug 25;13:1756286420938962 [https://journals.sagepub.com/doi/10.1177/1756286420938962] [CrossRef] [Medline]
- Thompson MP, Fanaroff AC, Parker JD, Vallabhajosyula S, Sterling MR. Focusing on the future of cardiovascular outcomes research: highlights From the American Heart Association/American Stroke Association Quality of Care and Outcomes Research 2018 Scientific Sessions. Circ Cardiovasc Qual Outcomes 2018 Jun;11(6):e004871 [CrossRef] [Medline]
- Luvizutto GJ, Silva GF, Nascimento MR, Sousa Santos KC, Appelt PA, de Moura Neto E, et al. Use of artificial intelligence as an instrument of evaluation after stroke: a scoping review based on international classification of functioning, disability and health concept. Top Stroke Rehabil 2022 Jul 11;29(5):331-346 [CrossRef] [Medline]
- Jiang F, Jiang Y, Zhi H, Dong Y, Li H, Ma S, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol 2017 Dec;2(4):230-243 [https://svn.bmj.com/lookup/pmidlookup?view=long&pmid=29507784] [CrossRef] [Medline]
- Sheikhalishahi S, Miotto R, Dudley JT, Lavelli A, Rinaldi F, Osmani V. Natural language processing of clinical notes on chronic diseases: systematic review. JMIR Med Inform 2019 Apr 27;7(2):e12239 [https://medinform.jmir.org/2019/2/e12239/] [CrossRef] [Medline]
- Adnan K, Akbar R, Khor S, Ali ABA. Role and challenges of unstructured big data in healthcare. In: Sharma N, Chakrabarti A, Balas VE, editors. Data management, analytics and innovation. Advances in intelligent systems and computing. Singapore: Springer; 2020:301-323
- Sneiderman CA, Rindflesch TC, Aronson AR. Finding the findings: identification of findings in medical literature using restricted natural language processing. Proc AMIA Annu Fall Symp 1996:239-243 [https://europepmc.org/abstract/MED/8947664] [Medline]
- Li I, Pan J, Goldwasser J, Verma N, Wong WP, Nuzumlalı MY, et al. Neural natural language processing for unstructured data in electronic health records: a review. Comput Sci Rev 2022 Nov;46:100511 [CrossRef]
- Shahid N, Rappon T, Berta W. Applications of artificial neural networks in health care organizational decision-making: a scoping review. PLoS One 2019;14(2):e0212356 [https://dx.plos.org/10.1371/journal.pone.0212356] [CrossRef] [Medline]
- Tricco AC, Lillie E, Zarin W, O'Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med 2018 Oct 02;169(7):467-473 [https://www.acpjournals.org/doi/abs/10.7326/M18-0850] [CrossRef] [Medline]
- Peters M, Godfrey C, McInerney P, Munn Z, Tricco A, Khalil H. Chapter 11: Scoping reviews. In: Aromataris E, Munn Z, editors. JBI Manual for Evidence Synthesis. Adelaide, Australia: JBI Collaboration; 2020.
- Husson F, Josse J. Multiple correspondence analysis. In: Blasius J, Greenacre M, editors. Visualization and verbalization of data. Boca Raton, FL: Chapman and Hall/CRC; 2014.
- R Core Team. R: A Language and Environment for Statistical Computing. 2020. URL: http://www.R-project.org/ [accessed 2022-12-12]
- Lê S, Josse J, Husson F. FactoMineR: an R package for multivariate analysis. J Stat Soft 2008;25(1):1-18 [CrossRef]
- Kassambara A, Mundt F. Factoextra: extract and visualize the results of multivariate data analyses. CRAN R Project. 2020. URL: https://CRAN.R-project.org/package=factoextra [accessed 2022-12-12]
- Zhao Y, Fu S, Bielinski SJ, Decker PA, Chamberlain AM, Roger VL, et al. Natural language processing and machine learning for identifying incident stroke from electronic health records: algorithm development and validation. J Med Internet Res 2021 Mar 08;23(3):e22951 [https://www.jmir.org/2021/3/e22951/] [CrossRef] [Medline]
- Zanotto BS, Beck da Silva Etges AP, Dal Bosco A, Cortes EG, Ruschel R, De Souza AC, et al. Stroke outcome measurements from electronic medical records: cross-sectional study on the effectiveness of neural and nonneural classifiers. JMIR Med Inform 2021 Nov 01;9(11):e29120 [https://medinform.jmir.org/2021/11/e29120/] [CrossRef] [Medline]
- Sung S, Hsieh C, Hu Y. Early prediction of functional outcomes after acute ischemic stroke using unstructured clinical text: retrospective cohort study. JMIR Med Inform 2022 Feb 17;10(2):e29806 [https://medinform.jmir.org/2022/2/e29806/] [CrossRef] [Medline]
- Sung S, Chen C, Pan R, Hu Y, Jeng J. Natural language processing enhances prediction of functional outcome after acute ischemic stroke. J Am Heart Assoc 2021 Dec 21;10(24):e023486 [https://www.ahajournals.org/doi/10.1161/JAHA.121.023486] [CrossRef] [Medline]
- Miller MI, Orfanoudaki A, Cronin M, Saglam H, So Yeon Kim I, Balogun O, et al. Natural language processing of radiology reports to detect complications of ischemic stroke. Neurocrit Care 2022 Aug 09;37(Suppl 2):291-302 [https://europepmc.org/abstract/MED/35534660] [CrossRef] [Medline]
- Mayampurath A, Parnianpour Z, Richards CT, Meurer WJ, Lee J, Ankenman B, et al. Improving prehospital stroke diagnosis using natural language processing of paramedic reports. Stroke 2021 Aug;52(8):2676-2679 [https://europepmc.org/abstract/MED/34162217] [CrossRef] [Medline]
- Lineback CM, Garg R, Oh E, Naidech AM, Holl JL, Prabhakaran S. Prediction of 30-day readmission after stroke using machine learning and natural language processing. Front Neurol 2021 Jul 13;12:649521 [https://europepmc.org/abstract/MED/34326805] [CrossRef] [Medline]
- Kogan E, Twyman K, Heap J, Milentijevic D, Lin JH, Alberts M. Assessing stroke severity using electronic health record data: a machine learning approach. BMC Med Inform Decis Mak 2020 Jan 08;20(1):8 [https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-019-1010-x] [CrossRef] [Medline]
- Heo TS, Kim YS, Choi JM, Jeong YS, Seo SY, Lee JH, et al. Prediction of stroke outcome using natural language processing-based machine learning of radiology report of brain MRI. J Pers Med 2020 Dec 16;10(4):286 [https://www.mdpi.com/resolver?pii=jpm10040286] [CrossRef] [Medline]
- Deng B, Zhu W, Sun X, Xie Y, Dan W, Zhan Y, et al. Development and validation of an automatic system for intracerebral hemorrhage medical text recognition and treatment plan output. Front Aging Neurosci 2022 Apr 8;14:798132 [https://europepmc.org/abstract/MED/35462698] [CrossRef] [Medline]
- Bacchi S, Zerner T, Oakden-Rayner L, Kleinig T, Patel S, Jannes J. Deep learning in the prediction of ischaemic stroke thrombolysis functional outcomes: a pilot study. Acad Radiol 2020 Feb;27(2):e19-e23 [CrossRef] [Medline]
- Yu AYX, Liu ZA, Pou-Prom C, Lopes K, Kapral MK, Aviv RI, et al. Automating stroke data extraction from free-text radiology reports using natural language processing: instrument validation study. JMIR Med Inform 2021 May 04;9(5):e24381 [https://medinform.jmir.org/2021/5/e24381/] [CrossRef] [Medline]
- Wheater E, Mair G, Sudlow C, Alex B, Grover C, Whiteley W. A validated natural language processing algorithm for brain imaging phenotypes from radiology reports in UK electronic health records. BMC Med Inform Decis Mak 2019 Sep 09;19(1):184 [https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-019-0908-7] [CrossRef] [Medline]
- Sung S, Lin C, Hu Y. EMR-based phenotyping of ischemic stroke using supervised machine learning and text mining techniques. IEEE J Biomed Health Inform 2020 Oct;24(10):2922-2931 [CrossRef]
- Sung S, Chen K, Wu DP, Hung L, Su Y, Hu Y. Applying natural language processing techniques to develop a task-specific EMR interface for timely stroke thrombolysis: A feasibility study. Int J Med Inform 2018 Apr;112:149-157 [CrossRef] [Medline]
- Shek A, Jiang Z, Teo J, Au Yeung J, Bhalla A, Richardson MP, et al. Machine learning-enabled multitrust audit of stroke comorbidities using natural language processing. Eur J Neurol 2021 Dec 29;28(12):4090-4097 [CrossRef] [Medline]
- Rannikmäe K, Wu H, Tominey S, Whiteley W, Allen N, Sudlow C, et al. Developing automated methods for disease subtyping in UK Biobank: an exemplar study on stroke. BMC Med Inform Decis Mak 2021 Jun 15;21(1):191 [https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-021-01556-0] [CrossRef] [Medline]
- Ong CJ, Orfanoudaki A, Zhang R, Caprasse FPM, Hutch M, Ma L, et al. Machine learning and natural language processing methods to identify ischemic stroke, acuity and location from radiology reports. PLoS One 2020 Jun 19;15(6):e0234908 [https://dx.plos.org/10.1371/journal.pone.0234908] [CrossRef] [Medline]
- Mowery DL, Chapman BE, Conway M, South BR, Madden E, Keyhani S, et al. Extracting a stroke phenotype risk factor from Veteran Health Administration clinical reports: an information content analysis. J Biomed Semantics 2016 May 10;7(1):26 [https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-016-0065-1] [CrossRef] [Medline]
- Li M, Lang M, Deng F, Chang K, Buch K, Rincon S, et al. Analysis of stroke detection during the COVID-19 pandemic using natural language processing of radiology reports. AJNR Am J Neuroradiol 2021 Mar;42(3):429-434 [http://www.ajnr.org/cgi/pmidlookup?view=long&pmid=33334851] [CrossRef] [Medline]
- Leung LY, Fu S, Luetmer PH, Kallmes DF, Madan N, Weinstein G, et al. Agreement between neuroimages and reports for natural language processing-based detection of silent brain infarcts and white matter disease. BMC Neurol 2021 May 11;21(1):189 [https://bmcneurol.biomedcentral.com/articles/10.1186/s12883-021-02221-9] [CrossRef] [Medline]
- Kim C, Zhu V, Obeid J, Lenert L. Natural language processing and machine learning algorithm to identify brain MRI reports with acute ischemic stroke. PLoS One 2019;14(2):e0212778 [https://dx.plos.org/10.1371/journal.pone.0212778] [CrossRef] [Medline]
- Kent DM, Leung LY, Zhou Y, Luetmer PH, Kallmes DF, Nelson J, et al. Association of silent cerebrovascular disease identified using natural language processing and future ischemic stroke. Neurology 2021 Sep 28;97(13):e1313-e1321 [https://europepmc.org/abstract/MED/34376505] [CrossRef] [Medline]
- Lin C, Hsu K, Liang C, Lee T, Liou C, Lee J, et al. A disease-specific language representation model for cerebrovascular disease research. Comput Methods Programs Biomed 2021 Nov;211:106446 [https://europepmc.org/abstract/MED/34627022] [CrossRef] [Medline]
- Guan W, Ko D, Khurshid S, Trisini Lipsanopoulos AT, Ashburner JM, Harrington LX, et al. Automated electronic phenotyping of cardioembolic stroke. Stroke 2021 Jan;52(1):181-189 [https://europepmc.org/abstract/MED/33297865] [CrossRef] [Medline]
- Garg R, Oh E, Naidech A, Kording K, Prabhakaran S. Automating ischemic stroke subtype classification using machine learning and natural language processing. J Stroke Cerebrovasc Dis 2019 Jul;28(7):2045-2051 [CrossRef] [Medline]
- Farran D, Bean D, Wang T, Msosa Y, Casetta C, Dobson R, et al. Anticoagulation for atrial fibrillation in people with serious mental illness in the general hospital setting. J Psychiatr Res 2022 Sep;153:167-173 [https://linkinghub.elsevier.com/retrieve/pii/S0022-3956(22)00350-8] [CrossRef] [Medline]
- Elkin PL, Mullin S, Mardekian J, Crowner C, Sakilay S, Sinha S, et al. Using artificial intelligence with natural language processing to combine electronic health record's structured and free text data to identify nonvalvular atrial fibrillation to decrease strokes and death: evaluation and case-control study. J Med Internet Res 2021 Nov 09;23(11):e28946 [https://www.jmir.org/2021/11/e28946/] [CrossRef] [Medline]
- Bacchi S, Gluck S, Koblar S, Jannes J, Kleinig T. Automated information extraction from free-text medical documents for stroke key performance indicators: a pilot study. Intern Med J 2022 Feb 20;52(2):315-317 [CrossRef] [Medline]
- Devlin J, Chang MW, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. arXiv. 2019 May 24. URL: https://arxiv.org/abs/1810.04805 [accessed 2022-12-12]
- Pencina MJ, D'Agostino RB, D'Agostino RB, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med 2008 Jan 30;27(2):157-72; discussion 207 [CrossRef] [Medline]
- Harkema H, Dowling JN, Thornblade T, Chapman WW. ConText: an algorithm for determining negation, experiencer, and temporal status from clinical reports. J Biomed Inform 2009 Oct;42(5):839-851 [https://linkinghub.elsevier.com/retrieve/pii/S1532-0464(09)00074-4] [CrossRef] [Medline]
- Resnick MP, LeHouillier F, Brown SH, Campbell KE, Montella D, Elkin PL. Automated modeling of clinical narrative with high definition natural language processing using Solor and Analysis Normal Form. Stud Health Technol Inform 2021 Nov 18;287:89-93 [https://europepmc.org/abstract/MED/34795088] [CrossRef] [Medline]
- Wu H, Toti G, Morley KI, Ibrahim ZM, Folarin A, Jackson R, et al. SemEHR: A general-purpose semantic search system to surface semantic data from clinical notes for tailored care, trial recruitment, and clinical research. J Am Med Inform Assoc 2018 May 01;25(5):530-537 [https://europepmc.org/abstract/MED/29361077] [CrossRef] [Medline]
- Huang K, Altosaar J, Ranganath R. ClinicalBERT: modeling clinical notes and predicting hospital readmission. arXiv. 2020 Nov 29. URL: https://arxiv.org/abs/1904.05342 [accessed 2022-12-12]
- Alsentzer E, Murphy J, Boag W, Weng WH, Jindi D, Naumann T, et al. Publicly available clinical BERT embeddings. 2019 Presented at: 2nd Clinical Natural Language Processing Workshop; June 2019; Minneapolis, MN [CrossRef]
- Zhang Y, Chen Q, Yang Z, Lin H, Lu Z. BioWordVec, improving biomedical word embeddings with subword information and MeSH. Sci Data 2019 May 10;6(1):52 [CrossRef] [Medline]
- Stearns MQ, Price C, Spackman KA, Wang AY. SNOMED clinical terms: overview of the development process and project status. Proc AMIA Symp 2001:662-666 [https://europepmc.org/abstract/MED/11825268] [Medline]
- Bodenreider O. The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Res 2004 Jan 01;32(Database issue):D267-D270 [https://europepmc.org/abstract/MED/14681409] [CrossRef] [Medline]
- Afshar M, Sharma B, Dligach D, Oguss M, Brown R, Chhabra N, et al. Development and multimodal validation of a substance misuse algorithm for referral to treatment using artificial intelligence (SMART-AI): a retrospective deep learning study. Lancet Digit Health 2022 Jun;4(6):e426-e435 [https://linkinghub.elsevier.com/retrieve/pii/S2589-7500(22)00041-3] [CrossRef] [Medline]
- Cho A, Min IK, Hong S, Chung HS, Lee HS, Kim JH. Effect of applying a real-time medical record input assistance system with voice artificial intelligence on triage task performance in the emergency department: prospective interventional study. JMIR Med Inform 2022 Aug 31;10(8):e39892 [https://medinform.jmir.org/2022/8/e39892/] [CrossRef] [Medline]
- Chowdhary KR. Natural language processing. In: Fundamentals of artificial intelligence. India: Springer; 2020:603-649
- Patel JM. Introduction to common crawl datasets. In: Getting structured data from the internet: running web crawlers/scrapers on a big data production scale. New York: Apress; 2020:277-324
- Topal MO, Bas A, van Heerden I. Exploring transformers in natural language generation: GPT, BERT, and XLNet. arXiv. 2021 Feb 16. URL: https://arxiv.org/abs/2102.08036 [accessed 2022-12-12]
- Fan L, Li L, Ma Z, Lee S, Yu H, Hemphill L. A bibliometric review of large language models research from 2017 to 2023. arXiv. 2023 Apr 03. URL: https://arxiv.org/abs/2304.02020 [accessed 2023-08-03]
|AI: artificial intelligence|
|BERT: Bidirectional Encoder Representations from Transformers|
|DL: deep learning|
|EHR: electronic health record|
|LLM: large language model|
|MCA: multiple correspondence analysis|
|ML: machine learning|
|NLP: natural language processing|
|PRISMA-ScR: Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews|
|SNOMED: Systematized Nomenclature of Medicine|
|UMLS: Unified Medical Language System|
Edited by C Lovis; submitted 03.05.23; peer-reviewed by J Heo, SF Sung; comments to author 05.06.23; revised version received 26.07.23; accepted 28.07.23; published 06.09.23
Copyright
©Helios De Rosario, Salvador Pitarch-Corresa, Ignacio Pedrosa, Marina Vidal-Pedrós, Beatriz de Otto-López, Helena García-Mieres, Lydia Álvarez-Rodríguez. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 06.09.2023.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.