Published in Vol 8, No 11 (2020): November

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/19735.
Measurement of Semantic Textual Similarity in Clinical Texts: Comparison of Transformer-Based Models

Authors of this article:

Xi Yang; Xing He; Hansi Zhang; Yinghan Ma; Jiang Bian; Yonghui Wu

Journals

  1. Wu Z, Liang J, Zhang Z, Lei J. Exploration of text matching methods in Chinese disease Q&A systems: A method using ensemble based on BERT and boosted tree models. Journal of Biomedical Informatics 2021;115:103683
  2. Al Sulaiman M, Moussa A, Abdou S, Elgibreen H, Faisal M, Rashwan M, Alzubi O. Semantic textual similarity for modern standard and dialectal Arabic using transfer learning. PLOS ONE 2022;17(8):e0272991
  3. Kalyan K, Rajasekharan A, Sangeetha S. AMMU: A survey of transformer-based biomedical pretrained language models. Journal of Biomedical Informatics 2022;126:103982
  4. Asubiaro T, Ajiferuke I. Semantic similarity-based credit attribution on citation paths: a method for allocating residual citation to and investigating depth of influence of scientific communications. Scientometrics 2022;127(11):6257
  5. Li M, Chen T, Ryu K, Jin C, Lu L. An Efficient Parallelized Ontology Network-Based Semantic Similarity Measure for Big Biomedical Document Clustering. Computational and Mathematical Methods in Medicine 2021;2021:1
  6. Park W, Siddiqui I, Chakraborty C, Qureshi N, Shin D. Scarcity-aware spam detection technique for big data ecosystem. Pattern Recognition Letters 2022;157:67
  7. Yu Z, Yang X, Sweeting G, Ma Y, Stolte S, Fang R, Wu Y. Identify diabetic retinopathy-related clinical concepts and their attributes using transformer-based natural language processing methods. BMC Medical Informatics and Decision Making 2022;22(S3)
  8. Moradi M, Blagec K, Samwald M. Improving the Robustness and Accuracy of Intelligent Clinical Text Processing: Producing Noise for Data Augmentation. SSRN Electronic Journal 2022
  9. Feng T, Qu L, Haffari G. Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue Response Generation Models by Causal Discovery. Transactions of the Association for Computational Linguistics 2023;11:511
  10. Saraswat N, Li C, Jiang M. Identifying the Question Similarity of Regulatory Documents in the Pharmaceutical Industry by Using the Recognizing Question Entailment System: Evaluation Study. JMIR AI 2023;2:e43483
  11. Wang B, Xie Q, Pei J, Chen Z, Tiwari P, Li Z, Fu J. Pre-trained Language Models in Biomedical Domain: A Systematic Survey. ACM Computing Surveys 2024;56(3):1
  12. Lyu D, Wang X, Chen Y, Wang F. Language model and its interpretability in biomedicine: A scoping review. iScience 2024;27(4):109334
  13. Aygün İ, Kaya B, Kaya M. Identifying patients in need of psychological treatment with language representation models. Multimedia Tools and Applications 2024;84(1):397
  14. Nerella S, Bandyopadhyay S, Zhang J, Contreras M, Siegel S, Bumin A, Silva B, Sena J, Shickel B, Bihorac A, Khezeli K, Rashidi P. Transformers and large language models in healthcare: A review. Artificial Intelligence in Medicine 2024;154:102900
  15. Sharmila P, Anbananthen K, Gunasekaran N, Balasubramaniam B, Chelliah D. FTLM: A Fuzzy TOPSIS Language Modeling Approach for Plagiarism Severity Assessment. IEEE Access 2024;12:122597
  16. Chen Y, Zhang C, Bai R, Sun T, Ding W, Wang R. A review of medical text analysis: Theory and practice. Information Fusion 2025;119:103024
  17. Gupta S, Thakar U, Tokekar S. A comprehensive survey on techniques for numerical similarity measurement. Expert Systems with Applications 2025;277:127235
  18. Budler L, Chen H, Chen A, Topaz M, Tam W, Bian J, Stiglic G. A Brief Review on Benchmarking for Large Language Models Evaluation in Healthcare. WIREs Data Mining and Knowledge Discovery 2025;15(2)
  19. Celikten T, Onan A. Exploring Text Similarity in Human and AI-Generated Scientific Abstracts: A Comprehensive Analysis. IEEE Access 2025;13:74313
  20. Kang S, Park H, Taira R, Kim H. Detecting Redundant Health Survey Questions by Using Language-Agnostic Bidirectional Encoder Representations From Transformers Sentence Embedding: Algorithm Development Study. JMIR Medical Informatics 2025;13:e71687

Books/Policy Documents

  1. Dramé K, Diallo G, Sambe G. Web Information Systems and Technologies.
  2. Farray Rodríguez J, Fernández-García A, Verdú E. Management of Digital EcoSystems.
  3. Avasthi S, Sanwal T, Tripathi S, Tyagi M. Mining Biomedical Text, Images and Visual Features for Information Retrieval.

Conference Proceedings

  1. Moravvej S, Joodaki M, Maleki Kahaki M, Salimi Sartakhti M. 2021 7th International Conference on Web Research (ICWR). A method Based on an Attention Mechanism to Measure the Similarity of two Sentences
  2. Faramarzi N, Dara A, Banerjee R. 2022 IEEE 10th International Conference on Healthcare Informatics (ICHI). Combining Attention-based Models with the MeSH Ontology for Semantic Textual Similarity in Clinical Notes
  3. Ban B. 2022 13th International Conference on Information and Communication Technology Convergence (ICTC). A Survey on Awesome Korean NLP Datasets
  4. Alfianto M, Priyadi Y, Laksitowening K. 2023 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT). Semantic Textual Similarity in Requirement Specification and Use Case Description based on Sentence Transformer Model
  5. Haddad N, Myshenkov K, Afanasiev G. 2024 6th International Youth Conference on Radio Electronics, Electrical and Power Engineering (REEPE). Introducing Text Analysis Algorithms in Decision Support Systems for Automated Evaluation of the Doctor Prescriptions
  6. G C, K S, D N, P G, M K. 2024 Second International Conference on Emerging Trends in Information Technology and Engineering (ICETITE). Extractive Text Summarization of Clinical Text Using Deep Learning Models
  7. Laksana M, Priyadi Y, Wibowo Y. 2025 10th International Conference on Signal Processing and Communication (ICSC). Formation of Use Case Scenario Based on Use Case Diagram Using Text Semantics for IdVar4CL Application Development