Population Specific Classification of Colorectal Cancer with Meta-Analysis of Metagenomic Data

Temiz, Mustafa; Yousef, Malik; Bakir-Gungor, Burcu

Access

info:eu-repo/semantics/closedAccess

Date

2023

Author

Temiz, Mustafa
Yousef, Malik
Bakir-Gungor, Burcu

Abstract

Advances in next-generation sequencing and "-omics" technologies make it possible to characterize the human gut microbiome. While some of these microorganisms are important regulators of our immune system, modulation of the microbiota leads to a variety of diseases. Colorectal cancer (CRC), the third most common cancer worldwide, is caused by genetic mutations, environmental conditions, and abnormalities in the gut microbiota. Using various machine learning methods, this study performs a meta-analysis of species-level metagenomic datasets from different populations and aims to build classification models that can help in CRC diagnosis. On 9 metagenomic datasets from 8 countries, 3 meta-analyses are performed: within-population, cross-population, and leave-one-dataset-out (LODO), in which one dataset is held out for testing and the rest are used for training. For CRC classification, 4 classification algorithms are used: Random Forest (RF), Logitboost, Adaboost, and Decision Tree (DT). The best performance, an AUC of 0.98, was obtained with the Random Forest algorithm in the cross-population evaluation, using JP as the training dataset and JPN as the test dataset.
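
The LODO scheme and the AUC metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `lodo_splits` and `roc_auc`, and the dataset names "A", "B", "C", are hypothetical and chosen only for illustration. The classifier itself is left out; any model that produces scores (for example a Random Forest) could be trained on the training indices and scored on the held-out indices.

```python
from collections import defaultdict


def lodo_splits(dataset_labels):
    """Yield (held_out, train_idx, test_idx), holding out one dataset at a time."""
    by_dataset = defaultdict(list)
    for i, name in enumerate(dataset_labels):
        by_dataset[name].append(i)
    for held_out in sorted(by_dataset):
        test_idx = by_dataset[held_out]
        # Training set: every sample that belongs to any other dataset.
        train_idx = sorted(
            i for name, idx in by_dataset.items() if name != held_out for i in idx
        )
        yield held_out, train_idx, test_idx


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical assignment of five samples to three datasets (assumption, not study data).
labels = ["A", "A", "B", "B", "C"]
splits = list(lodo_splits(labels))
```

Each element of `splits` names the held-out dataset and gives the sample indices for training and testing, so the per-dataset AUC can be computed by calling `roc_auc` on the held-out labels and the model's scores for those samples.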

Source

2023 Innovations in Intelligent Systems and Applications Conference, ASYU 2023

Link

https://doi.org/10.1109/ASYU58738.2023.10296760
https://hdl.handle.net/20.500.12573/2087

Collections

Computer Engineering Department Collection [305]
Scopus Indexed Publications Collection [1614]