Using transfer learning from prior reference knowledge to improve the clustering of single-cell RNA-Seq data
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Mieth, Bettina | - |
dc.contributor.author | Hockley, James R. F. | - |
dc.contributor.author | Goernitz, Nico | - |
dc.contributor.author | Vidovic, Marina M-C | - |
dc.contributor.author | Mueller, Klaus-Robert | - |
dc.contributor.author | Gutteridge, Alex | - |
dc.contributor.author | Ziemek, Daniel | - |
dc.date.accessioned | 2021-08-31T19:37:51Z | - |
dc.date.available | 2021-08-31T19:37:51Z | - |
dc.date.created | 2021-06-19 | - |
dc.date.issued | 2019-12-30 | - |
dc.identifier.issn | 2045-2322 | - |
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/60864 | - |
dc.description.abstract | In many research areas scientists are interested in clustering objects within small datasets while making use of prior knowledge from large reference datasets. We propose a method to apply the machine learning concept of transfer learning to unsupervised clustering problems and show its effectiveness in the field of single-cell RNA sequencing (scRNA-Seq). The goal of scRNA-Seq experiments is often the definition and cataloguing of cell types from the transcriptional output of individual cells. To improve the clustering of small disease- or tissue-specific datasets, for which the identification of rare cell types is often problematic, we propose a transfer learning method to utilize large and well-annotated reference datasets, such as those produced by the Human Cell Atlas. Our approach modifies the dataset of interest while incorporating key information from the larger reference dataset via Non-negative Matrix Factorization (NMF). The modified dataset is subsequently provided to a clustering algorithm. We empirically evaluate the benefits of our approach on simulated scRNA-Seq data as well as on publicly available datasets. Finally, we present results for the analysis of a recently published small dataset and find improved clustering when transferring knowledge from a large reference dataset. Implementations of the method are available at https://github.com/nicococo/scRNA. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | NATURE PUBLISHING GROUP | - |
dc.subject | SEQUENCING DATA | - |
dc.subject | HETEROGENEITY | - |
dc.subject | ARCHITECTURE | - |
dc.subject | DIVERSITY | - |
dc.subject | DEFINES | - |
dc.title | Using transfer learning from prior reference knowledge to improve the clustering of single-cell RNA-Seq data | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Mueller, Klaus-Robert | - |
dc.identifier.doi | 10.1038/s41598-019-56911-z | - |
dc.identifier.scopusid | 2-s2.0-85077220581 | - |
dc.identifier.wosid | 000508985100036 | - |
dc.identifier.bibliographicCitation | SCIENTIFIC REPORTS, v.9 | - |
dc.relation.isPartOf | SCIENTIFIC REPORTS | - |
dc.citation.title | SCIENTIFIC REPORTS | - |
dc.citation.volume | 9 | - |
dc.type.rims | ART | - |
dc.type.docType | Article | - |
dc.description.journalClass | 1 | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Science & Technology - Other Topics | - |
dc.relation.journalWebOfScienceCategory | Multidisciplinary Sciences | - |
dc.subject.keywordPlus | SEQUENCING DATA | - |
dc.subject.keywordPlus | HETEROGENEITY | - |
dc.subject.keywordPlus | ARCHITECTURE | - |
dc.subject.keywordPlus | DIVERSITY | - |
dc.subject.keywordPlus | DEFINES | - |
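
The abstract describes factorizing a large reference dataset with NMF, using that information to modify the small dataset of interest, and then passing the modified data to a clustering algorithm. The block below is a minimal sketch of that idea under stated assumptions, not the authors' implementation (see the linked repository for that). The function names `nmf`, `project`, and `transfer`, the multiplicative-update solver, the `mix` parameter, and the synthetic Poisson counts are all hypothetical choices made for illustration. It also assumes both matrices are genes x cells and share the same gene order.

```python
import numpy as np

def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Factorize non-negative X (genes x cells) as W @ H with multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps   # gene dictionary, genes x k
    H = rng.random((k, X.shape[1])) + eps   # cell loadings, k x cells
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def project(X, W, n_iter=200, seed=0, eps=1e-9):
    """Fit non-negative loadings H for a fixed dictionary W so that W @ H approximates X."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], X.shape[1])) + eps
    WtX, WtW = W.T @ X, W.T @ W
    for _ in range(n_iter):
        H *= WtX / (WtW @ H + eps)
    return H

def transfer(X_ref, X_target, k, mix=0.5):
    """Blend the target data with its reconstruction from the reference dictionary.

    mix=0 returns the target unchanged; mix=1 returns the pure reconstruction.
    This blending rule is an assumption made for this sketch.
    """
    W, _ = nmf(X_ref, k)          # learn the dictionary on the large reference set
    H_t = project(X_target, W)    # express target cells in that dictionary
    return mix * (W @ H_t) + (1.0 - mix) * X_target

# Synthetic stand-ins for a large reference set and a small target set.
rng = np.random.default_rng(1)
X_ref = rng.poisson(2.0, size=(50, 300)).astype(float)
X_target = rng.poisson(2.0, size=(50, 20)).astype(float)

X_mod = transfer(X_ref, X_target, k=4, mix=0.5)
print(X_mod.shape)
```

The modified matrix `X_mod` keeps the shape of the target data, so it can be handed to any clustering routine in place of the original counts. How strongly to weight the reference reconstruction is a tuning decision that this sketch leaves to the caller.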