Title |
A machine-learning clustering approach for reference interval estimation of liver enzymes from hospital laboratory big-data
|
Authors |
Prakruti Dash1 & Saurav Nayak2,*
|
Affiliation |
1Department of Biochemistry, AIIMS, Bhubaneswar, India; 2Department of Biochemistry, IMS & Sum Hospital Campus 2, Phulnakhara, Bhubaneswar, Odisha, India; *Corresponding author
|
|
Prakruti Dash - E-mail: biochem_prakruti@aiimsbhubaneswar.edu.in; Saurav Nayak - E-mail: drsauravn@gmail.com
|
Article Type |
Research Article
|
Date |
Received May 1, 2025; Revised May 31, 2025; Accepted May 31, 2025; Published May 31, 2025
|
Abstract |
It is of interest to establish clinically valid reference intervals (RIs) for the liver enzymes aspartate transaminase (AST) and alanine aminotransferase (ALT) using a combination of unsupervised machine learning clustering and robust outlier detection applied to real-world laboratory big data. Four outlier detection methods were each combined with four clustering algorithms to identify homogeneous subgroups, and the largest cluster from each combination was used to estimate RIs based on percentile cut-offs. Among the tested combinations, DBSCAN with Tukey’s fences or Local Outlier Factor performed best, with the resulting intervals covering 100% of the validation data. Local Outlier Factor yielded the widest intervals, while Isolation Forest yielded the narrowest. Ultimately, the study estimated the reference intervals for AST and ALT to be 15–41 U/L and 11–46 U/L, respectively. |
Keywords |
Reference interval, clustering, machine learning
|
Citation |
Dash & Nayak, Bioinformation 21(5): 1069-1074 (2025)
|
Edited by |
P Kangueane
|
ISSN |
0973-2063
|
Publisher |
|
License |
This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.
|
|
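The workflow summarized in the abstract (outlier removal, clustering, selection of the largest cluster, percentile cut-offs) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes Tukey's fences with the conventional multiplier k = 1.5, uses a simple one-dimensional gap-based split as a stand-in for DBSCAN, and assumes the 2.5th and 97.5th percentiles as cut-offs, since the abstract does not state which percentiles were used. All function names and parameter values are hypothetical.

```python
import statistics


def tukey_filter(values, k=1.5):
    """Keep values inside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR].

    k = 1.5 is the conventional choice and an assumption here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def largest_gap_cluster(values, eps):
    """Simplified stand-in for density-based clustering (not DBSCAN itself).

    Sorted 1-D values are split wherever two neighbours are more than
    eps apart; the group with the most members is returned.
    """
    ordered = sorted(values)
    clusters, current = [], [ordered[0]]
    for v in ordered[1:]:
        if v - current[-1] <= eps:
            current.append(v)
        else:
            clusters.append(current)
            current = [v]
    clusters.append(current)
    return max(clusters, key=len)


def reference_interval(values, lower_pct=2.5, upper_pct=97.5):
    """Percentile cut-offs by linear interpolation (assumed 2.5th/97.5th)."""
    ordered = sorted(values)

    def percentile(p):
        pos = (len(ordered) - 1) * p / 100
        i = int(pos)
        frac = pos - i
        if i + 1 < len(ordered):
            return ordered[i] + frac * (ordered[i + 1] - ordered[i])
        return ordered[i]

    return percentile(lower_pct), percentile(upper_pct)


# Synthetic enzyme activities in U/L (illustrative only, not study data)
results = list(range(10, 51)) + [400, 900]
cleaned = tukey_filter(results)
main_group = largest_gap_cluster(cleaned, eps=2)
interval = reference_interval(main_group)
```

In practice the clustering step would be replaced by a library implementation of the chosen algorithm, and the interval would be checked against a held-out validation set as the abstract describes.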