

Title

A machine-learning clustering approach for reference interval estimation of liver enzymes from hospital laboratory big-data

 

Authors

Prakruti Dash1 & Saurav Nayak2,*

 

Affiliation

1Department of Biochemistry, AIIMS, Bhubaneswar, India; 2Department of Biochemistry, IMS & Sum Hospital Campus 2, Phulnakhara, Bhubaneswar, Odisha, India; *Corresponding author

 

Email

Prakruti Dash - E-mail: biochem_prakruti@aiimsbhubaneswar.edu.in

Saurav Nayak - E-mail: drsauravn@gmail.com

 

Article Type

Research Article

 

Date

Received May 1, 2025; Revised May 31, 2025; Accepted May 31, 2025; Published May 31, 2025

 

Abstract

This study aims to establish clinically valid reference intervals (RIs) for the liver enzymes aspartate transaminase (AST) and alanine aminotransferase (ALT) by combining unsupervised machine-learning clustering with robust outlier detection on real-world laboratory big data. Each of four outlier detection methods was paired with each of four clustering algorithms to identify homogeneous subgroups, and the largest cluster from each combination was used to estimate RIs from percentile cut-offs. Among the combinations tested, DBSCAN with either Tukey’s fences or Local Outlier Factor performed best, with the resulting intervals covering 100% of the validation data. Local Outlier Factor produced the widest intervals and Isolation Forest the narrowest. The estimated reference intervals were 15–41 U/L for AST and 11–46 U/L for ALT.
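The pipeline described in the abstract (outlier removal, clustering, selection of the largest cluster, percentile cut-offs) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the percentile cut-offs, the fence multiplier or the clustering parameters, so the 2.5th and 97.5th percentiles, the multiplier k = 1.5, and the eps and min_samples values below are assumptions. The function names are hypothetical, and the one-dimensional DBSCAN is a simplified stand-in written with the standard library only.

```python
from bisect import bisect_left, bisect_right
from collections import Counter


def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in 0..100) of an already sorted list."""
    if not sorted_vals:
        raise ValueError("no data")
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def tukey_filter(values, k=1.5):
    """Return the sorted values lying inside Tukey's fences (k = 1.5 is assumed)."""
    s = sorted(values)
    q1, q3 = percentile(s, 25), percentile(s, 75)
    iqr = q3 - q1
    return [v for v in s if q1 - k * iqr <= v <= q3 + k * iqr]


def dbscan_1d(sorted_vals, eps, min_samples):
    """Simplified DBSCAN for sorted 1-D data; returns one label per value, -1 = noise."""
    n = len(sorted_vals)
    # A core point has at least min_samples values (itself included) within eps.
    is_core = [
        bisect_right(sorted_vals, v + eps) - bisect_left(sorted_vals, v - eps) >= min_samples
        for v in sorted_vals
    ]
    labels = [-1] * n
    cluster, prev_core = -1, None
    for i in range(n):
        if is_core[i]:
            # In sorted 1-D data a gap larger than eps between cores starts a new cluster.
            if prev_core is None or sorted_vals[i] - sorted_vals[prev_core] > eps:
                cluster += 1
            labels[i] = cluster
            prev_core = i
    cores = [i for i in range(n) if is_core[i]]
    for i in range(n):
        if not is_core[i]:
            # Border points join the cluster of the closest core within eps, if any.
            near = [c for c in cores if abs(sorted_vals[c] - sorted_vals[i]) <= eps]
            if near:
                labels[i] = labels[min(near, key=lambda c: abs(sorted_vals[c] - sorted_vals[i]))]
    return labels


def reference_interval(values, eps=2.0, min_samples=5, lower=2.5, upper=97.5):
    """Outlier removal, clustering, largest cluster, then percentile cut-offs."""
    kept = tukey_filter(values)
    labels = dbscan_1d(kept, eps, min_samples)
    clustered = [lab for lab in labels if lab != -1]
    if not clustered:
        raise ValueError("no cluster found; adjust eps or min_samples")
    largest = Counter(clustered).most_common(1)[0][0]
    members = [v for v, lab in zip(kept, labels) if lab == largest]
    return percentile(members, lower), percentile(members, upper)


# Synthetic enzyme values in U/L (illustrative only, not study data).
synthetic = list(range(10, 51)) + [300, 400]
low, high = reference_interval(synthetic, eps=2, min_samples=3)
print(f"Estimated interval: {low:.1f} to {high:.1f} U/L")
```

With real laboratory data the result depends heavily on eps and min_samples, which would need tuning against a validation set, as the abstract's comparison across combinations suggests.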

 

Keywords

Reference interval, clustering, machine learning

 

Citation

Dash & Nayak, Bioinformation 21(5): 1069-1074 (2025)

 

Edited by

P Kangueane

 

ISSN

0973-2063

 

Publisher

Biomedical Informatics

 

License

This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.