A machine-learning approach for nonalcoholic steatohepatitis susceptibility estimation

Ghadiri, Fatemeh; Husseini, Abbas Ali; Öztaş, Oğuzhan

dc.contributor.author	Ghadiri, Fatemeh
dc.contributor.author	Husseini, Abbas Ali
dc.contributor.author	Öztaş, Oğuzhan
dc.date.accessioned	2023-11-04T10:26:27Z
dc.date.available	2023-11-04T10:26:27Z
dc.date.issued	2022	en_US
dc.identifier.issn	0254-8860
dc.identifier.issn	0975-0711
dc.identifier.uri	https://hdl.handle.net/11363/6222
dc.description.abstract	Background: Nonalcoholic steatohepatitis (NASH), a severe form of nonalcoholic fatty liver disease, can lead to advanced liver damage and has become an increasingly prominent health problem worldwide. Predictive models for early identification of high-risk individuals could help identify preventive and interventional measures. Traditional epidemiological models are based on statistical analysis and have limited predictive power. In the current study, a novel machine-learning approach was developed for individual NASH susceptibility prediction using candidate single nucleotide polymorphisms (SNPs). Methods: A total of 245 NASH patients and 120 healthy individuals were included in the study. Genotypes of candidate SNPs, namely two SNPs in the cytochrome P450 family 2 subfamily E member 1 (CYP2E1) gene (rs6413432, rs3813867), two SNPs in the glucokinase regulator (GCKR) gene (rs780094, rs1260326), and the rs738409 SNP in the patatin-like phospholipase domain-containing 3 (PNPLA3) gene, together with gender, were used to develop models for identifying at-risk individuals. To predict an individual's susceptibility to NASH, nine different machine-learning models were constructed. These models involved two feature-selection methods, Chi-square and support vector machine recursive feature elimination (SVM-RFE), and three classification algorithms: k-nearest neighbor (KNN), multi-layer perceptron (MLP), and random forest (RF). All nine models were trained on 80% of the data from both the NASH patients and the healthy controls and then tested on the remaining 20% of both groups. The models were compared in terms of accuracy, precision, sensitivity, and F-measure. Results: Among all nine machine-learning models, the KNN classifier with all features as input showed the highest performance, with an F-measure of 86% and an accuracy of 79%. Conclusions: Machine learning based on genomic variation may be applicable for estimating an individual's susceptibility to developing NASH among high-risk groups with a high degree of accuracy, precision, and sensitivity.	en_US
dc.language.iso	eng	en_US
dc.publisher	SPRINGER INDIA, 7TH FLOOR, VIJAYA BUILDING, 17, BARAKHAMBA ROAD, NEW DELHI 110 001, INDIA	en_US
dc.relation.isversionof	10.1007/s12664-022-01263-2	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
dc.subject	Algorithm	en_US
dc.subject	Artificial intelligence	en_US
dc.subject	Disease susceptibility	en_US
dc.subject	Fatty liver	en_US
dc.subject	Gene	en_US
dc.subject	Machine learning	en_US
dc.subject	Neural network model	en_US
dc.subject	Nonalcoholic fatty liver disease	en_US
dc.subject	Nonalcoholic steatohepatitis	en_US
dc.subject	Single nucleotide polymorphism	en_US
dc.subject	Support vector machine	en_US
dc.title	A machine-learning approach for nonalcoholic steatohepatitis susceptibility estimation	en_US
dc.type	article	en_US
dc.relation.ispartof	Indian Journal of Gastroenterology	en_US
dc.department	Vocational School of Health Services	en_US
dc.authorid	https://orcid.org/0000-0001-5249-2914	en_US
dc.identifier.volume	41	en_US
dc.identifier.issue	5	en_US
dc.identifier.startpage	475	en_US
dc.identifier.endpage	482	en_US
dc.relation.publicationcategory	Article - International Peer-Reviewed Journal - Institutional Faculty Member	en_US
dc.institutionauthor	Husseini, Abbas Ali
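
The experimental design summarized in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code. It assumes that the nine models arise from three feature settings (all features, Chi-square, SVM-RFE) crossed with three classifiers, and that the 80/20 split was applied to each group separately with simple rounding. The function names and the confusion-matrix counts in the usage note are hypothetical.

```python
from itertools import product

# Cohort sizes as stated in the abstract.
N_NASH, N_CONTROL = 245, 120

# Assumption: "nine models" = 3 feature settings x 3 classifiers.
FEATURE_SETTINGS = ["all features", "Chi-square", "SVM-RFE"]
CLASSIFIERS = ["KNN", "MLP", "RF"]


def model_grid():
    """Enumerate every (feature setting, classifier) combination."""
    return list(product(FEATURE_SETTINGS, CLASSIFIERS))


def split_sizes(n, train_fraction=0.8):
    """Return (train, test) counts for one group.

    The rounding rule is an assumption; the abstract only states 80%/20%.
    """
    n_train = round(n * train_fraction)
    return n_train, n - n_train


def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, sensitivity and F-measure from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_measure": f_measure,
    }
```

Under these assumptions, `model_grid()` yields nine combinations, and `split_sizes(N_NASH)` and `split_sizes(N_CONTROL)` give the per-group training and test counts. Calling `binary_metrics` with made-up counts such as `binary_metrics(tp=8, fp=2, fn=2, tn=8)` shows how the four reported measures relate to one another. These counts are not taken from the study.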