Cookies on this website
We use cookies to ensure that we give you the best experience on our website. If you click 'Continue' we'll assume that you are happy to receive all cookies and you won't see this message again. Click 'Find out more' for information on how to change your cookie settings.

BACKGROUND: Advances in single cell genomics provide a way of routinely generating transcriptomics data at the single cell level. A frequent requirement of single cell expression analysis is the identification of novel patterns of heterogeneity across single cells that might explain complex cellular states or tissue composition. To date, classical statistical analysis tools have being routinely applied, but there is considerable scope for the development of novel statistical approaches that are better adapted to the challenges of inferring cellular hierarchies. RESULTS: We have developed a novel agglomerative clustering method that we call pcaReduce to generate a cell state hierarchy where each cluster branch is associated with a principal component of variation that can be used to differentiate two cell states. Using two real single cell datasets, we compared our approach to other commonly used statistical techniques, such as K-means and hierarchical clustering. We found that pcaReduce was able to give more consistent clustering structures when compared to broad and detailed cell type labels. CONCLUSIONS: Our novel integration of principal components analysis and hierarchical clustering establishes a connection between the representation of the expression data and the number of cell types that can be discovered. In doing so we found that pcaReduce performs better than either technique in isolation in terms of characterising putative cell states. Our methodology is complimentary to other single cell clustering techniques and adds to a growing palette of single cell bioinformatics tools for profiling heterogeneous cell populations.

Original publication

DOI

10.1186/s12859-016-0984-y

Type

Journal article

Journal

BMC Bioinformatics

Publication Date

22/03/2016

Volume

17

Keywords

Gene expression, Hierarchical clustering, Single cell RNA-Seq, Algorithms, Animals, Cell Line, Cluster Analysis, Humans, Principal Component Analysis, RNA, Sequence Analysis, RNA, Single-Cell Analysis, Transcriptome