BF528 | Applications in Translational Bioinformatics Final Project

daisyhan97/bf528

Single Cell RNA-Seq Analysis of Pancreatic Cells

BF528 | Applications in Translational Bioinformatics Final Project

The pancreas is a complex organ composed of a diverse set of cell types. Proper pancreatic function is required to maintain healthy metabolism, and pancreatic dysfunction leads to serious illnesses. In their 2016 study, "A Single-Cell Transcriptomic Map of the Human and Mouse Pancreas Reveals Inter- and Intra-cell Population Structure", Baron et al. performed single-cell RNA sequencing on post-mortem pancreatic cells from four human donors and on two mouse models to better understand the cellular diversity of the pancreas. Their analysis identified previously known cell types as well as rare and novel cell-type subpopulations, and produced a more detailed characterization of the diversity within those cell types. In this project, we attempt to replicate their primary findings using current analytical methodology and software packages.

This is a continuation of BF528 Project 4, which can be found here

Project Goals:

  • Process the barcode reads of a single cell sequencing dataset
  • Perform cell-by-gene quantification of UMI counts
  • Perform quality control on a UMI counts matrix
  • Analyze the UMI counts to identify clusters and marker genes for distinct cell type populations
  • Ascribe biological meaning to the clustered cell types and identify novel marker genes associated with them
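As a rough illustration of what "cell-by-gene quantification of UMI counts" means, the sketch below counts the distinct UMIs observed for each (cell barcode, gene) pair. It is a simplified stand-in, not the Salmon Alevin algorithm: it assumes exact-match deduplication with no barcode or UMI error correction, and the function name `umi_count_matrix` and the toy reads are hypothetical.

```python
from collections import defaultdict


def umi_count_matrix(records):
    """Count distinct UMIs per (cell barcode, gene).

    `records` is an iterable of (barcode, umi, gene) tuples, one per
    mapped read. Reads that share barcode, UMI, and gene are treated as
    PCR duplicates and counted once (assumption: exact matching only).
    """
    seen = defaultdict(set)
    for barcode, umi, gene in records:
        seen[(barcode, gene)].add(umi)

    # Reshape into a nested dict: barcode -> gene -> distinct UMI count.
    counts = defaultdict(dict)
    for (barcode, gene), umis in seen.items():
        counts[barcode][gene] = len(umis)
    return dict(counts)


# Toy input (hypothetical barcodes and UMIs, real gene symbols).
reads = [
    ("AAAC", "TTG", "INS"),
    ("AAAC", "TTG", "INS"),  # duplicate of the read above, counted once
    ("AAAC", "CCA", "INS"),
    ("AAAC", "GGT", "GCG"),
    ("CCGT", "TTG", "GCG"),
]
matrix = umi_count_matrix(reads)
print(matrix)
```

Each row of the resulting structure corresponds to one cell barcode and each entry to one gene, which is the shape of the UMI counts matrix that the later quality-control and clustering steps consume.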

Original Analysis:

Baron M, Veres A, Wolock SL, et al. A Single-Cell Transcriptomic Map of the Human and Mouse Pancreas Reveals Inter- and Intra-cell Population Structure. Cell Syst. 2016;3(4):346-360.e4. doi:10.1016/j.cels.2016.08.011

Repository Contents and Suggested Workflow

  1. salmon_prep.qsub - Using the GENCODE v37 human genome, this file creates a reference index of the human transcriptome and a transcript-to-gene map, to be used with Salmon Alevin
  2. whitelist.qsub - This file generates a whitelist of barcodes that meet a particular minimum sequencing depth threshold. The mean number of reads for each file was used as the threshold
  3. salmon_alevin.qsub - Runs the Salmon Alevin program, both on individual files and on all files simultaneously. More information on Salmon Alevin can be found here.
  4. data_curator.R - Generates summary statistics on the UMI counts matrices generated by Salmon Alevin, including cumulative distribution plots of the distinct UMIs per barcode
  5. programmer.R - Loads the UMI count matrix generated above and processes it using the Seurat standard pre-processing workflow. Low-quality cells are filtered out, and the remaining cells are clustered into subpopulations
  6. analyst.R - Classifies the cell subpopulations into distinct cell types, using the marker genes provided in the Supplementary Data of Baron et al. Marker genes for each of these cell types are retained, and a list of novel marker genes is exported for further analysis
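Two of the steps above, the mean-based barcode whitelist (step 2) and the cumulative distribution of distinct UMIs per barcode (step 4), can be sketched in a few lines. This is a minimal illustration in Python, not the code in `whitelist.qsub` or `data_curator.R`. It assumes the threshold is the mean read count per barcode within one file and that barcodes at or above the mean are kept. The function names and toy data are hypothetical.

```python
from collections import Counter
from statistics import mean


def whitelist_by_mean(barcodes):
    """Keep barcodes whose read count is at least the mean read count.

    `barcodes` holds one barcode string per read from a single file.
    Returns (sorted whitelist, threshold). Assumption: ">=" comparison.
    """
    counts = Counter(barcodes)
    threshold = mean(counts.values())
    keep = sorted(bc for bc, n in counts.items() if n >= threshold)
    return keep, threshold


def cumulative_fractions(umis_per_barcode):
    """Empirical CDF of distinct UMIs per barcode.

    Returns (value, fraction) pairs, where fraction is the share of
    barcodes with at most `value` distinct UMIs.
    """
    values = sorted(umis_per_barcode)
    n = len(values)
    cdf = []
    for i, v in enumerate(values, start=1):
        # Emit one point per distinct value, at its last occurrence.
        if i == n or values[i] != v:
            cdf.append((v, i / n))
    return cdf


# Toy data: three barcodes with 5, 3, and 1 reads (mean = 3).
reads = ["AAAC"] * 5 + ["CCGT"] * 3 + ["GGTA"] * 1
keep, threshold = whitelist_by_mean(reads)
print(keep, threshold)

cdf = cumulative_fractions([1, 1, 2, 4])
print(cdf)
```

A mean threshold is simple to compute but is pulled upward by a few very deep barcodes, so inspecting the cumulative distribution alongside it is a reasonable sanity check on how many barcodes survive.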

Final Report: A Single-Cell Transcriptomic Map of the Human and Mouse Pancreas Reveals Inter- and Intra-cell Population Structure
