This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
Natural language processing has been established as an important tool when using unstructured text data; however, most studies in the medical field have been limited to a retrospective analysis of text entered manually by humans. Little research has focused on applying natural language processing to the conversion of raw voice data generated in the clinical field into text using speech-to-text algorithms.
In this study, we investigated the promptness and reliability of a real-time medical record input assistance system with voice artificial intelligence (RMIS-AI) and compared it to the manual method for triage tasks in the emergency department.
From June 4, 2021, to September 12, 2021, RMIS-AI, using a machine learning engine trained with 1717 triage cases over 6 months, was prospectively applied in clinical practice in a triage unit. We analyzed a total of 1063 triage tasks performed by 19 triage nurses who agreed to participate. The primary outcome was the time for participants to perform the triage task.
The median time for participants to perform the triage task was 204 (IQR 155, 277) seconds with RMIS-AI and 231 (IQR 180, 313) seconds with the manual method; this difference was statistically significant (
RMIS-AI improves the promptness in performing triage tasks as compared to using the manual input method. However, to make it a reliable alternative to the conventional method, technical supplementation and additional research should be pursued.
An essential role of a hospital emergency department (ED) is to prioritize treatment for patients according to urgency and symptom severity [
Because the results derived through the triage system must be immediately recorded and shared with the medical staff in charge of the next process, a prompt triage system is crucial for an efficient ED. In addition, the results from the triage system are reported to have a significant influence on clinical outcomes [
Recent advances in machine learning and natural language processing (NLP) are a prominent development in health informatics and are relevant in emergency medicine [
This study was conducted in accordance with the revised Declaration of Helsinki and was reviewed and approved by the Institutional Review Board of Severance Hospital, South Korea (approval number 4-2020-0598).
We performed a prospective interventional study at a Level 1 ED of a tertiary hospital in northwestern Seoul (the capital city of South Korea) that receives 90,000 patient visits annually. The hospital’s ED is responsible for receiving patients who cannot be stabilized in this catchment area. Participants were recruited through an official announcement from November 1, 2020, to the end of January 2021. Among the nurses performing triage in the hospital’s ED, 19 voluntarily agreed to participate after being briefed on the contents and process of the study. All had more than 3 years of ED work experience. Candidates were excluded if they (1) withdrew their intention to participate or (2) had physical symptoms that made it difficult to wear a voice recognition microphone. Informed consent was obtained from all participants before enrollment.
Because conversations in the triage unit contain a large amount of information and noise, a device that could select and record these conversations was needed. A machine learning framework created by Selvas AI Inc (Seoul, Republic of Korea) was used in this study. The voice recognition solution provided by Selvas analyzes sound information and converts it into text, commands, and various other forms of information. Continuous word recognition engines, which recognize unstructured speech, are now applied in many fields; the speech recognition engine used in this study was developed exclusively for the medical field. In our ED, triage nurses record the results of the triage task in a triage note. This triage note consists of the following items: chief concern, past medical history, the presence of allergic diseases, and vital signs such as systolic and diastolic blood pressure, heart rate, respiratory rate, body temperature, and oxygen saturation. To train the engine, triage nurses who agreed to participate in this study performed their clinical practice wearing Bluetooth microphones (Aftershokz Aeropex, AS 800, Aftershokz LLC). Voice recordings that passed through the engine were immediately converted into textual data, without prior editing, and stored as log records. Subsequently, the NLP component of the engine was repeatedly trained to fill in the items of the triage note from the transcribed text. The Bluetooth microphone was chosen as part of a noise-resistant system suited to the ED environment, where various noises are present, and a mobile recording system was built to ensure mobility. The Bluetooth microphones, the voice recognition software, and the computers connected to them were installed in the triage unit, and voice data were recorded throughout the data collection period.
For 6 months, 1717 triage cases were collected, and the machine learning engine was trained to recognize the sound using these voice data, convert it into textual data, and perform the subsequent NLP. Consistent with the current triage note format, the system was trained to classify the chief concern of each patient into 1 of 52 categories, and the past medical history was processed into 13 categories through NLP. In the triage note used in this ED, up to 3 chief concerns and the medical history can be entered. The presence of allergic diseases was configured to be treated as a binary input, and variables representing vital signs were treated as continuous variables.
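As a rough illustration of the structured record that the NLP step has to fill, the triage note described above could be modeled as follows. This is a minimal sketch and not the RMIS-AI implementation: the class name, the field names, and the use of integer category codes are hypothetical. Only the constraints come from the text (up to 3 chief concerns drawn from 52 categories, up to 3 history items drawn from 13 categories, a binary allergy item, and continuous vital signs).

```python
from dataclasses import dataclass, field
from typing import List, Optional

N_CHIEF_CONCERN_CATEGORIES = 52  # from the triage note format in the text
N_HISTORY_CATEGORIES = 13        # from the triage note format in the text
MAX_ITEMS = 3                    # up to 3 chief concerns / history items


@dataclass
class TriageNote:
    """Hypothetical container for one triage note (field names are illustrative)."""
    chief_concerns: List[int] = field(default_factory=list)  # assumed codes 0..51
    past_history: List[int] = field(default_factory=list)    # assumed codes 0..12
    allergy: Optional[bool] = None                            # None = not recorded
    systolic_bp: Optional[float] = None
    diastolic_bp: Optional[float] = None
    heart_rate: Optional[float] = None
    respiratory_rate: Optional[float] = None
    body_temperature: Optional[float] = None
    oxygen_saturation: Optional[float] = None

    def validate(self) -> None:
        """Raise ValueError if the note violates the format constraints."""
        checks = (
            ("chief_concerns", self.chief_concerns, N_CHIEF_CONCERN_CATEGORIES),
            ("past_history", self.past_history, N_HISTORY_CATEGORIES),
        )
        for name, codes, n_categories in checks:
            if len(codes) > MAX_ITEMS:
                raise ValueError(f"{name}: at most {MAX_ITEMS} entries allowed")
            if any(not 0 <= code < n_categories for code in codes):
                raise ValueError(f"{name}: category code out of range")

    def completed_fields(self) -> List[str]:
        """Names of the items that have been filled in, in declaration order."""
        done = []
        for name, value in vars(self).items():
            if isinstance(value, list):
                if value:
                    done.append(name)
            elif value is not None:
                done.append(name)
        return done
```

A helper such as `completed_fields` is the kind of per-item check that a record completion rate could be computed from, although the text does not state how completion was actually determined.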
For accurate voice interval detection in a noisy environment, the end-point detection module of the machine learning engine was optimized. A deep neural network end-point detection module was developed that distinguishes various nonstationary noises through continuous adaptive learning on the noise picked up by the Bluetooth microphones, with the aim of detecting voice sections more accurately than the existing energy-based method. With the voice interval detection module tuned to the audio input of the Bluetooth microphones, the collected and cleaned voice database and the converted textual data were used for language model training.
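For readers unfamiliar with end-point detection, the following is a minimal sketch of a conventional energy-based detector, that is, the kind of baseline the paragraph above contrasts with the deep neural network module. It is not the Selvas engine. The function names, frame length, threshold, and minimum duration are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_voice_intervals(
    samples: Sequence[float],
    frame_len: int = 160,      # assumed: 10 ms at a 16 kHz sampling rate
    threshold: float = 0.01,   # assumed fixed energy threshold
    min_frames: int = 3,       # assumed minimum length of a voiced run
) -> List[Tuple[int, int]]:
    """Return (start_sample, end_sample) pairs whose frame energy exceeds the threshold."""
    intervals = []
    start = None
    energies = frame_energies(samples, frame_len)
    # A trailing zero-energy sentinel closes an interval that runs to the end.
    for idx, energy in enumerate(energies + [0.0]):
        if energy > threshold and start is None:
            start = idx
        elif energy <= threshold and start is not None:
            if idx - start >= min_frames:
                intervals.append((start * frame_len, idx * frame_len))
            start = None
    return intervals
```

A fixed threshold like this is easily fooled by loud nonstationary noise, which is the weakness that a learned, noise-adaptive detector is meant to address.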
From June 4, 2021, to September 12, 2021, a real-time medical record input assistance system with voice artificial intelligence (RMIS-AI), built on the trained engine, was prospectively applied to clinical practice in the triage unit, where patients first meet the medical staff. RMIS-AI is a tool that assists in recording triage notes by voice. In other words, it preserves the triage nurse’s mobility by replacing the desktop computer keyboard with voice as the means of record input. RMIS-AI was implemented on a cloud-based network separate from the hospital electronic medical record (EMR) system. During the study period, participants wearing Bluetooth microphones asked each patient detailed questions, checked vital signs, and recorded the triage data in the EMR. Simultaneously, the same data were recorded in the same format through RMIS-AI using the participants’ voices. Because the participants used closed-loop communication, in which they restated the patient’s words aloud to confirm their meaning, the information obtained from the patients was delivered in the participant’s voice rather than the patient’s. The RMIS-AI charting process was blinded to the nurses, who monitored the EMR input process as usual while performing the triage task. The contents and times of the triage logs created by the 2 methods were stored in the hospital EMR log and in cloud storage, respectively (
Two-input process of charting, RMIS-AI (real-time medical record input assistance system with voice artificial intelligence) vs manual input. EMR: electronic medical record.
The primary outcome was the time for participants to perform the triage task. It was defined as the time from the patient’s arrival at the triage unit to the completion of the triage note. We measured these times using data stored in the hospital EMR for manual input and cloud storage for RMIS-AI. The secondary outcome metrics were the record completion rate and the accuracy of RMIS-AI compared to manual input by EMR.
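The primary outcome can be illustrated with a short sketch that derives a task time from two timestamps and summarizes a set of task times as a median with quartiles. The timestamp format, the function names, and the choice of the "inclusive" quantile method are assumptions made for illustration; the text does not specify how the log timestamps were stored or which quantile definition was used.

```python
from datetime import datetime
from statistics import median, quantiles
from typing import Sequence, Tuple

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed log format


def task_seconds(arrival: str, completed: str) -> float:
    """Seconds from arrival at the triage unit to completion of the triage note."""
    start = datetime.strptime(arrival, TIMESTAMP_FORMAT)
    end = datetime.strptime(completed, TIMESTAMP_FORMAT)
    return (end - start).total_seconds()


def median_iqr(values: Sequence[float]) -> Tuple[float, float, float]:
    """Return (median, first quartile, third quartile) of the task times."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3
```

Applied separately to the times logged in the EMR and in cloud storage, `median_iqr` would give summaries in the "median (IQR)" form used to report the results.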
The sample size was calculated from the mean time taken to perform the triage task with the conventional method in 100 cases before the intervention started. We considered a mean difference of 20 seconds produced by RMIS-AI, with a standard deviation difference of 2 seconds, to be clinically significant (
During the study period, a total of 20,155 triage cases were processed at the hospital’s ED, at an average of 194 cases per day. Among them, 1209 (6%) triage tasks were performed by the participants. After 146 cases were excluded by the criteria shown in
The median time for participants to perform the triage task was 204 (IQR 155, 277) seconds with RMIS-AI and 231 (IQR 180, 313) seconds using manual input by EMR. The difference between the 2 methods was statistically significant (
The record completion rates of both methods for all triage cases are shown in
The accuracy of reproducing records by RMIS-AI for all variables is summarized in
Flowchart of case inclusion. RMIS-AI: real-time medical record input assistance system with voice artificial intelligence.
Comparison of median time for triage task, RMIS-AI (real-time medical record input assistance system with voice artificial intelligence) vs manual input.
Record completion rates of both methods.
Variable | Record completion cases, RMIS-AI (a), n (%) | Record completion cases, manual input, n (%) | P value |
Chief concern, 1st | 870 (81.84) | 1063 (100) | <.001 |
Chief concern, 2nd | 515 (48.45) | 397 (37.35) | <.001 |
Chief concern, 3rd | 230 (21.64) | 106 (9.97) | <.001 |
History of allergic episode | 257 (24.18) | 1063 (100) | <.001 |
Past medical history, 1st | 383 (36.03) | 1030 (96.90) | <.001 |
Past medical history, 2nd | 127 (11.95) | 32 (3.01) | <.001 |
Past medical history, 3rd | 27 (2.54) | 12 (1.13) | .02 |
Systolic blood pressure | 580 (54.56) | 923 (86.83) | <.001 |
Diastolic blood pressure | 578 (54.37) | 923 (86.83) | <.001 |
Pulse rate | 613 (57.67) | 925 (87.02) | <.001 |
Respiratory rate | 382 (35.94) | 923 (86.83) | <.001 |
Body temperature | 607 (57.10) | 1061 (99.81) | <.001 |
Oxygen saturation | 584 (54.94) | 926 (87.11) | <.001 |
(a) RMIS-AI: real-time medical record input assistance system with voice artificial intelligence.
Accuracy of RMIS-AI (a) compared to the manual method.
Variable | Cases with reproduction/cases with records by manual method, n/N (%) |
Chief concern | |
Complete reproduction (b) | 366/1063 (34.43) |
Partial reproduction (c) | 190/1063 (17.87) |
Failed reproduction (d) | 507/1063 (49.41) |
Past medical history | |
Complete reproduction | 226/1030 (21.94) |
Partial reproduction | 5/1030 (0.49) |
Failed reproduction | 799/1080 (73.98) |
History of allergic episode | 158/1063 (14.68) |
Systolic blood pressure | 516/923 (55.90) |
Diastolic blood pressure | 495/923 (53.63) |
Pulse rate | 352/925 (38.05) |
Respiratory rate | 340/923 (36.84) |
Body temperature | 484/1061 (45.62) |
Oxygen saturation | 465/926 (50.22) |
(a) RMIS-AI: real-time medical record input assistance system with voice artificial intelligence.
(b) All the values by manual input were reproduced by RMIS-AI.
(c) Partial values by manual input were reproduced by RMIS-AI.
(d) No values by manual input were reproduced by RMIS-AI.
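The three reproduction categories defined in the footnotes above can be expressed as a small set comparison. This is a sketch under stated assumptions, not the authors' scoring procedure: it treats each record as a set of category labels, takes the manual record as the reference, and ignores extra values that appear only in the RMIS-AI record. The function name and the string labels are hypothetical.

```python
from typing import Iterable


def classify_reproduction(manual: Iterable[str], rmis_ai: Iterable[str]) -> str:
    """Classify how well the RMIS-AI record reproduces the manual record."""
    manual_set, ai_set = set(manual), set(rmis_ai)
    if not manual_set:
        raise ValueError("no manual record to compare against")
    matched = manual_set & ai_set
    if matched == manual_set:
        return "complete"  # all manual values were reproduced
    if matched:
        return "partial"   # some, but not all, manual values were reproduced
    return "failed"        # no manual value was reproduced
```

For example, a manual record of "chest pain" and "dyspnea" against an RMIS-AI record containing only "dyspnea" would be classified as a partial reproduction.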
Interrater reliability for continuous variables between 2 methods. ICC: intraclass correlation coefficient; RMIS-AI: real-time medical record input assistance system with voice artificial intelligence.
Previous studies have shown that prolonged waiting times and crowding are factors that reduce patient satisfaction and impair safety in the ED [
The record completion rates of RMIS-AI were inferior to those of manual input by EMR in our study, especially for allergy history and past medical history. For categorical variables such as allergy history or past medical history, NLP is more difficult than for continuous variables such as systolic blood pressure and pulse rate, because they are expressed in a wide variety of phrases rather than simple utterances. Korean is an agglutinative language and one of the morphologically rich and typologically diverse languages. Auxiliaries, adverbial case markers, inconsistent word spacing, and the variety of predicate expressions that share the same meaning make NLP of Korean difficult [
We assume that the difference in our study between the variables with and without relatively favorable accuracy is due to the complexity of the NLP involved. NLP is still being developed as a field of artificial intelligence, and because there is no standardized format, its performance varies with the type and amount of training data as well as the deep learning method applied [
The reliability of the pulse rate was lower than that of the other vital signs because triage nurses record the pulse rate while watching a monitor that displays it as a continuously measured waveform, so there was a time difference between the input through RMIS-AI and the manual input. This explanation is consistent with the Bland-Altman plot, in which the error range of the input values is narrow. In addition, the low ICC value of the respiratory rate was due to the smaller amount of data and low variability.
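The Bland-Altman quantities referred to above can be computed as follows. This is a minimal sketch using the conventional definition (mean difference with limits of agreement at 1.96 standard deviations of the differences); the function name is hypothetical, and the text does not state which software or options the authors used.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def bland_altman(
    method_a: Sequence[float], method_b: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) of agreement for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least 2 paired measurements of equal length")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)  # sample SD of the paired differences
    return bias, bias - spread, bias + spread
```

Narrow limits of agreement do not guarantee a high ICC. The ICC relates between-subject variance to total variance, so a variable with little spread across patients can show a low ICC even when the 2 methods rarely disagree by much.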
NLP is a tool that can structure unstructured textual data and enable the use of unstructured voice data that historically have not been used in the medical field. Previous studies have reported that the predictive performance of clinical outcomes is improved when unstructured textual data are used for machine learning in the medical field [
This study has several limitations. First, although our study had a prospective design, a randomized controlled study is needed to obtain definitive evidence that RMIS-AI can replace the conventional method. Second, the completeness and accuracy of the triage notes produced by the current RMIS-AI are insufficient to safely replace the conventional manual input method. If the NLP for triage note recording is trained with additional training material, it can be improved
In this study, we confirmed that RMIS-AI, developed with speech-to-text (STT) and NLP technology, improved the promptness of triage tasks compared with the manual input method, but technical supplementation is required to address its current inferiority in sensitivity and accuracy. If similar studies are conducted to confirm the potential of such technologies in clinical practice, artificial intelligence could evolve into a supportive tool that improves the patient experience.
ED: emergency department
EMR: electronic medical record
ICC: intraclass correlation coefficient
NLP: natural language processing
RMIS-AI: real-time medical record input assistance system with voice artificial intelligence
STT: speech-to-text
The authors thank Medical Illustration and Design, part of the Medical Research Support Services of Yonsei University College of Medicine, for all artistic support related to this work. This study was supported by a faculty research grant of Yonsei University College of Medicine (6-2020-0117).
None declared.