Assessment-of-log-transformation-and-z-scores-for-scRNA-seq-data-analysis

Normie

Team

Roshan Sharma @ NYGC
Heather Geiger @ NYGC
Ravneet Kaur @ Emory
Vincent Liu @ MSKCC

Problem

Numerous methods have been proposed for the normalization of single-cell RNA-seq (scRNA-seq) data. Yet these methods have not been thoroughly benchmarked to assess their robustness and their influence on downstream analysis. Normalization is a critical step in the analysis pipeline because it accounts for unwanted biological and technical effects that mask the signal of interest.

Proposed Solution and Methods

Here we benchmark the performance of commonly used scRNA-seq normalization methods, listed in the Methods Tested section, on four 10x datasets. Because each normalization method has its own objectives, which other methods may not share, we believe that no single metric can fully represent the quality of any given method. We therefore explore three properties of the resulting normalization: robustness, correlation to library size (number of transcripts per cell), and influence on gene expression.
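As an illustration of the library-size property, the sketch below computes, for every gene, the Pearson correlation between library size and normalized expression. This is a minimal example under our own assumptions, not the code in this repository: the function name `library_size_correlation`, the simulated counts, and the simple median scaling used for comparison are all hypothetical.

```python
import numpy as np

def library_size_correlation(raw_counts, normalized):
    """Per-gene Pearson correlation between library size and normalized expression.

    raw_counts, normalized: arrays of shape (n_cells, n_genes).
    Genes with zero variance after normalization are assigned a correlation of 0.
    """
    library_size = raw_counts.sum(axis=1).astype(float)
    ls_centered = library_size - library_size.mean()
    x_centered = normalized - normalized.mean(axis=0)
    cov = ls_centered @ x_centered
    denom = np.sqrt((ls_centered ** 2).sum() * (x_centered ** 2).sum(axis=0))
    return np.divide(cov, denom, out=np.zeros_like(cov), where=denom > 0)

# Simulated data (assumption): 200 cells, 50 genes, cells differ only in depth.
rng = np.random.default_rng(0)
depth = rng.uniform(0.5, 2.0, size=200)
gene_rate = rng.uniform(1.0, 20.0, size=50)
counts = rng.poisson(depth[:, None] * gene_rate[None, :])

# Median scaling, used here only as a simple reference normalization.
lib = counts.sum(axis=1, keepdims=True)
scaled = counts / lib * np.median(lib)

r_raw = library_size_correlation(counts, counts.astype(float))
r_scaled = library_size_correlation(counts, scaled)
print("mean correlation, raw counts:   ", r_raw.mean())
print("mean |correlation|, median norm:", np.abs(r_scaled).mean())
```

A normalization that removes the depth effect should pull these per-gene correlations toward zero, whereas the raw counts remain strongly correlated with library size.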

To assess the robustness of a normalization method, we first downsample the counts so that every cell is left with a pre-specified percentage of its counts or fewer. For each normalization method, we then assess the correlation of the normalized, downsampled data to the normalized full data. Downsampling can deliver a more realistic representation of what cellular expression profiles would look like at similar count depths.

We look at this correlation for highly variable genes and differentially expressed genes on several data sets and compare it to library size. We also examine the correlation of genes before and after normalization through their mean and standard deviation; this gives us the set of differentially expressed genes.
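The robustness check can be sketched as follows. This is a minimal example under stated assumptions rather than the repository's implementation: we draw transcripts without replacement so that each cell keeps exactly the requested fraction of its counts (rounded down), and we use median scaling followed by `log1p` as a stand-in normalization. The names `downsample_counts`, `median_log_normalize`, and `robustness` are hypothetical, and the data are simulated.

```python
import numpy as np

def downsample_counts(counts, fraction, rng):
    """Keep floor(fraction * total) transcripts per cell, sampled without replacement."""
    counts = np.asarray(counts, dtype=np.int64)
    out = np.empty_like(counts)
    for i, cell in enumerate(counts):
        target = int(np.floor(fraction * cell.sum()))
        out[i] = rng.multivariate_hypergeometric(cell, target)
    return out

def median_log_normalize(counts):
    """Stand-in normalization (assumption): scale to median library size, then log1p."""
    lib = counts.sum(axis=1, keepdims=True)
    scaled = counts / np.maximum(lib, 1) * np.median(lib)
    return np.log1p(scaled)

def robustness(counts, fraction, normalize, rng):
    """Pearson correlation between normalized full and normalized downsampled data."""
    full = normalize(counts)
    down = normalize(downsample_counts(counts, fraction, rng))
    return np.corrcoef(full.ravel(), down.ravel())[0, 1]

# Simulated data (assumption): 200 cells, 50 genes.
rng = np.random.default_rng(0)
depth = rng.uniform(0.5, 2.0, size=200)
gene_rate = rng.uniform(1.0, 20.0, size=50)
counts = rng.poisson(depth[:, None] * gene_rate[None, :])

for fraction in (0.8, 0.5, 0.1):
    score = robustness(counts, fraction, median_log_normalize, rng)
    print(f"fraction kept = {fraction:.1f}, correlation = {score:.3f}")
```

Any other normalization can be passed in place of `median_log_normalize`, as long as it maps a cells-by-genes count matrix to a matrix of the same shape.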

Methods Tested

Median
Median + Log
Median + Log + Z-score
Median + Log + Linear Regression
GLMPCA
scVI
scran
sctransform
Linnorm
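The first four entries in this list build on one another, so a rough sketch may help show how the steps compose. Everything below is an assumption made for illustration: the function names are hypothetical, the log step uses `log1p` (natural log with a pseudocount of 1), and the regression step removes a linear dependence on log library size, which is one plausible choice of covariate that the list itself does not specify.

```python
import numpy as np

def median_normalize(counts):
    """Scale every cell to the median library size."""
    lib = counts.sum(axis=1, keepdims=True)
    return counts / np.maximum(lib, 1) * np.median(lib)

def log_transform(x):
    """Log transform with a pseudocount of 1 (assumption)."""
    return np.log1p(x)

def z_score(x):
    """Center each gene to mean 0 and scale to unit standard deviation."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    return (x - mean) / np.where(std > 0, std, 1.0)

def regress_out_library_size(x, counts):
    """Residuals of a per-gene linear fit on log library size (assumed covariate).

    Requires every cell to have at least one count.
    """
    lib = np.log(counts.sum(axis=1))
    design = np.column_stack([np.ones_like(lib), lib])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    return x - design @ coef

# Simulated data (assumption): 200 cells, 50 genes.
rng = np.random.default_rng(0)
depth = rng.uniform(0.5, 2.0, size=200)
gene_rate = rng.uniform(1.0, 20.0, size=50)
counts = rng.poisson(depth[:, None] * gene_rate[None, :])

median = median_normalize(counts)                          # Median
median_log = log_transform(median)                         # Median + Log
median_log_z = z_score(median_log)                         # Median + Log + Z-score
median_log_lr = regress_out_library_size(median_log, counts)  # Median + Log + Linear Regression
```

The remaining methods (GLMPCA, scVI, scran, sctransform, Linnorm) come from their own packages and are not sketched here.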

Datasets Used

Dropbox

10x PBMC 3k

Issues

We had no coffee.
Lunches would have been great too.

Lessons Learned

Lesson 1: Setting up environments for the various methods is always a pain.

Lesson 2: Running R code from Python, and vice versa, is always a pain.

NCBI-Codeathons/Assessment-of-log-transformation-and-z-scores-for-scRNA-seq-data-analysis
