TY  - JOUR
AU  - Sampa, Masuda Begum
AU  - Hossain, Md Nazmul
AU  - Hoque, Md Rakibul
AU  - Islam, Rafiqul
AU  - Yokota, Fumihiko
AU  - Nishikitani, Mariko
AU  - Ahmed, Ashir
PY  - 2020
DA  - 2020/10/8
TI  - Blood Uric Acid Prediction With Machine Learning: Model Development and Performance Comparison
JO  - JMIR Med Inform
SP  - e18331
VL  - 8
IS  - 10
KW  - blood uric acid
KW  - urban corporate population
KW  - machine learning
KW  - noncommunicable diseases
KW  - Bangladesh
KW  - boosted decision tree regression model
AB  - Background: Uric acid is associated with noncommunicable diseases such as cardiovascular diseases, chronic kidney disease, coronary artery disease, stroke, diabetes, metabolic syndrome, vascular dementia, and hypertension. Therefore, uric acid is considered to be a risk factor for the development of noncommunicable diseases. Most studies on uric acid have been performed in developed countries, and the application of machine-learning approaches in uric acid prediction in developing countries is rare. Different machine-learning algorithms will work differently on different types of data in various diseases; therefore, a different investigation is needed for different types of data to identify the most accurate algorithms. Specifically, no study has yet focused on the urban corporate population in Bangladesh, despite the high risk of developing noncommunicable diseases for this population. Objective: The aim of this study was to develop a model for predicting blood uric acid values based on basic health checkup test results, dietary information, and sociodemographic characteristics using machine-learning algorithms. The prediction of health checkup test measurements can be very helpful to reduce health management costs. Methods: Various machine-learning approaches were used in this study because clinical input data are not completely independent and exhibit complex interactions. Conventional statistical models have limitations to consider these complex interactions, whereas machine learning can consider all possible interactions among input data. We used boosted decision tree regression, decision forest regression, Bayesian linear regression, and linear regression to predict personalized blood uric acid based on basic health checkup test results, dietary information, and sociodemographic characteristics. We evaluated the performance of these five widely used machine-learning models using data collected from 271 employees in the Grameen Bank complex of Dhaka, Bangladesh. Results: The mean uric acid level was 6.63 mg/dL, indicating a borderline result for the majority of the sample (normal range <7.0 mg/dL). Therefore, these individuals should be monitoring their uric acid regularly. The boosted decision tree regression model showed the best performance among the models tested based on the root mean squared error of 0.03, which is also better than that of any previously reported model. Conclusions: A uric acid prediction model was developed based on personal characteristics, dietary information, and some basic health checkup measurements. This model will be useful for improving awareness among high-risk individuals and populations, which can help to save medical costs. A future study could include additional features (eg, work stress, daily physical activity, alcohol intake, eating red meat) in improving prediction.
SN  - 2291-9694
UR  - https://medinform.jmir.org/2020/10/e18331
UR  - https://doi.org/10.2196/18331
UR  - http://www.ncbi.nlm.nih.gov/pubmed/33030442
DO  - 10.2196/18331
ID  - info:doi/10.2196/18331
ER  -