NUST Institutional Repository

Single-Cell Profiling and Machine Learning: An Integrative Approach for HCV, SARS-CoV-2 and HIV Classification

dc.contributor.author Naveed, Sawera
dc.date.accessioned 2024-07-04T07:46:45Z
dc.date.available 2024-07-04T07:46:45Z
dc.date.issued 2024
dc.identifier.other 400473
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/44528
dc.description.abstract Viruses, as obligate intracellular parasites, pose significant challenges to global health. Their diverse forms and mechanisms of infection present diagnostic and therapeutic complexities, and they exhibit high heterogeneity and complexity at both the molecular and cellular levels. They often spread to distant organs, which complicates diagnosis, prognosis and treatment. Much remains to be explored, especially at the cellular level, to find the main underlying causes and counter the diseases or infections that viruses cause. The major issue at hand is comprehending the underlying molecular mechanisms of viruses, with a particular focus on cellular heterogeneity, cell type identification, and gene regulation dynamics across various viruses. Single-cell RNA sequencing (scRNA-Seq) is a compelling technology that has revolutionized genomics in recent years by allowing gene expression patterns to be studied at the single-cell level. This study applied scRNA-Seq to healthy and diseased samples of HCV, SARS-CoV-2 and HIV. The examination yielded valuable insights into cellular heterogeneity at the level of individual cells. In HCV, CD8+ T cells are found in greater abundance, followed by monocytes. For SARS-CoV-2, the most prominently occurring cell type is CD8+ T cells followed by NK T cells and CD4+ T cells, B cells and CD8+ T cells for HIV. Four main cell types (CD8+ T cells, CD4+ T cells, monocytes and NK cells) are present in all three diseases. In addition, differential gene expression (DGE) analysis and enrichment analysis provide insights into the prevalence of gene expression within different cell populations, along with the enriched biological pathways and functional categories associated with those genes. Furthermore, machine learning classification models are built to distinguish one disease from another on the basis of the identified cell types.
The Random Forest (RF) model achieves an accuracy of 93.9%, the SVM 94.9%, and the RNN an optimal accuracy of 92.8%. SVM outperformed RF and RNN on every model evaluation metric, suggesting that it gives the best result in classifying the identified cell types into their respective classes for the three viral diseases. These findings significantly enhance the understanding of the intricate molecular aspects underlying the selected viral diseases. Future prospects encompass ongoing research to refine the understanding of viruses and their variants and to develop more targeted and effective treatments, ultimately enhancing the quality of life for individuals grappling with these infections and long-term diseases. en_US
dc.description.sponsorship Supervised By Dr. Rehan Zafar Paracha en_US
dc.language.iso en_US en_US
dc.publisher School of Interdisciplinary Engineering and Sciences (SINES) en_US
dc.title Single-Cell Profiling and Machine Learning: An Integrative Approach for HCV, SARS-CoV-2 and HIV Classification en_US
dc.type Thesis en_US
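
The abstract compares the RF, SVM and RNN classifiers by accuracy and by other evaluation metrics that it does not name. As a minimal sketch of how such a three-class comparison might be scored, the Python below computes accuracy and per-class precision, recall and F1 from a confusion matrix. The choice of metrics is an assumption, the function names are hypothetical, and the counts are invented for illustration; they are not results from the thesis.

```python
# Illustrative sketch only: the confusion-matrix counts are invented
# and are NOT results reported in the thesis.
CLASSES = ["HCV", "SARS-CoV-2", "HIV"]

# Rows are true classes, columns are predicted classes (hypothetical counts).
CONFUSION = [
    [90, 6, 4],
    [5, 92, 3],
    [2, 3, 95],
]


def accuracy(matrix):
    """Fraction of all samples that lie on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_metrics(matrix, labels):
    """Return precision, recall and F1 for each class label."""
    metrics = {}
    for i, label in enumerate(labels):
        true_positive = matrix[i][i]
        predicted_as_class = sum(row[i] for row in matrix)  # column sum
        actually_in_class = sum(matrix[i])                  # row sum
        precision = true_positive / predicted_as_class if predicted_as_class else 0.0
        recall = true_positive / actually_in_class if actually_in_class else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


if __name__ == "__main__":
    print(f"accuracy: {accuracy(CONFUSION):.3f}")
    for label, scores in per_class_metrics(CONFUSION, CLASSES).items():
        print(label, {name: round(value, 3) for name, value in scores.items()})
```

Scoring each model's confusion matrix this way would allow a side-by-side comparison of the kind the abstract summarizes, although the thesis itself may have used different metrics or tooling.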


This item appears in the following Collection(s)

  • MS [159]
