%0 Journal Article
%@ 2291-9694
%I JMIR Publications
%V 8
%N 8
%P e16948
%T Semantic Deep Learning: Prior Knowledge and a Type of Four-Term Embedding Analogy to Acquire Treatments for Well-Known Diseases
%A Arguello Casteleiro,Mercedes
%A Des Diz,Julio
%A Maroto,Nava
%A Fernandez Prieto,Maria Jesus
%A Peters,Simon
%A Wroe,Chris
%A Sevillano Torrado,Carlos
%A Maseda Fernandez,Diego
%A Stevens,Robert
%+ Department of Computer Science, University of Manchester, Kilburn Building, Oxford Road, M13 9PL, Manchester, United Kingdom, 44 161 275 6251, robert.stevens@manchester.ac.uk
%K evidence-based practice
%K artificial intelligence
%K deep learning
%K semantic deep learning
%K analogical reasoning
%K embedding analogies
%K PubMed
%D 2020
%7 6.8.2020
%9 Original Paper
%J JMIR Med Inform
%G English
%X Background: How to treat a disease remains the most common type of clinical question. Obtaining evidence-based answers from biomedical literature is difficult. Analogical reasoning with embeddings from deep learning (embedding analogies) may extract such biomedical facts, although the state of the art focuses on pair-based proportional (pairwise) analogies such as man:woman::king:queen (“queen = −man +king +woman”). Objective: This study aimed to systematically extract disease treatment statements with a Semantic Deep Learning (SemDeep) approach underpinned by prior knowledge and another type of 4-term analogy (other than pairwise). Methods: As preliminaries, we investigated Continuous Bag-of-Words (CBOW) embedding analogies in a common-English corpus with five lines of text and observed a type of 4-term analogy (not pairwise) applying the 3CosAdd formula and relating the semantic fields person and death: “dagger = −Romeo +die +died” (search query: −Romeo +die +died).
Our SemDeep approach worked with pre-existing items of knowledge (what is known) to make inferences sanctioned by a 4-term analogy (search query −x +z1 +z2) from CBOW and Skip-gram embeddings created with a PubMed systematic reviews subset (PMSB dataset). Stage 1: Knowledge acquisition. Obtaining a set of terms, candidate y, from embeddings using vector arithmetic. Some n-gram pairs obtained with the cosine and validated with evidence (prior knowledge) are the input for the 3CosAdd formula, seeking a type of 4-term analogy relating the semantic fields disease and treatment. Stage 2: Knowledge organization. Identifying the candidates sanctioned by the analogy that belong to the semantic field treatment and mapping these candidates to Unified Medical Language System (UMLS) Metathesaurus concepts with MetaMap. A concept pair is a brief disease treatment statement (biomedical fact). Stage 3: Knowledge validation. An evidence-based evaluation followed by human validation of biomedical facts potentially useful for clinicians. Results: We obtained 5352 n-gram pairs from 446 search queries by applying the 3CosAdd formula. The microaveraged performance of MetaMap for candidate y belonging to the semantic field treatment was F-measure=80.00% (precision=77.00%, recall=83.25%). We developed an empirical heuristic with some predictive power for clinical winners, that is, search queries that return candidate y with evidence of a therapeutic intent for target disease x. The search queries −asthma +inhaled_corticosteroids +inhaled_corticosteroid and −epilepsy +valproate +antiepileptic_drug were clinical winners, finding eight evidence-based beneficial treatments. Conclusions: Extracting treatments with therapeutic intent by analogical reasoning from embeddings (423K n-grams from the PMSB dataset) is an ambitious goal. Our SemDeep approach is knowledge-based, underpinned by embedding analogies that exploit prior knowledge.
Biomedical facts from embedding analogies (4-term type, not pairwise) are potentially useful for clinicians. The heuristic offers a practical way to discover beneficial treatments for well-known diseases. Learning from deep learning models does not require a massive amount of data. Embedding analogies are not limited to pairwise analogies; hence, analogical reasoning with embeddings is underexploited.
%M 32759099
%R 10.2196/16948
%U https://medinform.jmir.org/2020/8/e16948
%U https://doi.org/10.2196/16948
%U http://www.ncbi.nlm.nih.gov/pubmed/32759099