TY  - JOUR
AU  - Li, Xingyuan
AU  - Liu, Ke
AU  - Lang, Yanlin
AU  - Chai, Zhonglin
AU  - Liu, Fang
PY  - 2024
DA  - 2024/11/15
TI  - Exploring the Potential of Claude 3 Opus in Renal Pathological Diagnosis: Performance Evaluation
JO  - JMIR Med Inform
SP  - e65033
VL  - 12
KW  - artificial intelligence
KW  - Claude 3 Opus
KW  - renal pathology
KW  - diagnostic performance
KW  - large language model
KW  - LLM
KW  - performance evaluation
KW  - medical diagnosis
KW  - AI language model
KW  - diagnosis
KW  - pathology images
KW  - pathologist
KW  - clinical relevance
KW  - accuracy
KW  - language fluency
KW  - pathological diagnosis
AB  - Background: Artificial intelligence (AI) has shown great promise in assisting medical diagnosis, but its application in renal pathology remains limited. Objective: We evaluated the performance of an advanced AI language model, Claude 3 Opus (Anthropic), in generating diagnostic descriptions for renal pathological images. Methods: We carefully curated a dataset of 100 representative renal pathological images from the Diagnostic Atlas of Renal Pathology (3rd edition). The image selection aimed to cover a wide spectrum of common renal diseases, ensuring a balanced and comprehensive dataset. Claude 3 Opus generated diagnostic descriptions for each image, which were scored by 2 pathologists on clinical relevance, accuracy, fluency, completeness, and overall value. Results: Claude 3 Opus achieved a high mean score in language fluency (3.86) but lower scores in clinical relevance (1.75), accuracy (1.55), completeness (2.01), and overall value (1.75). Performance varied across disease types. Interrater agreement was substantial for relevance (κ=0.627) and overall value (κ=0.589) and moderate for accuracy (κ=0.485) and completeness (κ=0.458). Conclusions: Claude 3 Opus shows potential in generating fluent renal pathology descriptions but needs improvement in accuracy and clinical value. The AI’s performance varied across disease types. Addressing the limitations of single-source data and incorporating comparative analyses with other AI approaches are essential steps for future research. Further optimization and validation are needed for clinical applications.
SN  - 2291-9694
UR  - https://medinform.jmir.org/2024/1/e65033
UR  - https://doi.org/10.2196/65033
DO  - 10.2196/65033
ID  - info:doi/10.2196/65033
ER  - 