How to analyze public Single cell RNAseq datasets in Seurat? #4165

Closed
Biobarani22 opened this issue Mar 1, 2021 · 6 comments

Comments

@Biobarani22

Biobarani22 commented Mar 1, 2021

I'm trying to look through a published dataset to check gene expression for my own project. I am having trouble finding materials on how to analyze single-cell RNA sequencing data from the NCBI GEO database with Seurat in R. This dataset contains multiple samples, each with its own barcodes.tsv.gz, genes.tsv.gz and matrix.mtx.gz files. How do I give these files to Seurat as input and analyze those multiple samples? An example of the file naming is given below for reference. Your help is much appreciated. Thanks in advance!

GEO accession number of this dataset:

GSE128423, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE128423 

The files look like this:

GSM3674224_std1.barcodes.tsv.gz		GSM3674231_ctrl_16May.matrix.mtx.gz	GSM3674239_b1.genes.tsv.gz
GSM3674224_std1.genes.tsv.gz		GSM3674232_ctrl_26May.barcodes.tsv.gz	GSM3674239_b1.matrix.mtx.gz
GSM3674224_std1.matrix.mtx.gz		GSM3674232_ctrl_26May.genes.tsv.gz	GSM3674240_b2.barcodes.tsv.gz
GSM3674225_std2.barcodes.tsv.gz		GSM3674232_ctrl_26May.matrix.mtx.gz	GSM3674240_b2.genes.tsv.gz
GSM3674225_std2.genes.tsv.gz		GSM3674233_ctrl_7Jun.barcodes.tsv.gz	GSM3674240_b2.matrix.mtx.gz
GSM3674225_std2.matrix.mtx.gz		GSM3674233_ctrl_7Jun.genes.tsv.gz	GSM3674241_b3.barcodes.tsv.gz
GSM3674226_std3.barcodes.tsv.gz		GSM3674233_ctrl_7Jun.matrix.mtx.gz	GSM3674241_b3.genes.tsv.gz
GSM3674226_std3.genes.tsv.gz		GSM3674234_ctrl_8May.barcodes.tsv.gz	GSM3674241_b3.matrix.mtx.gz
GSM3674226_std3.matrix.mtx.gz		GSM3674234_ctrl_8May.genes.tsv.gz	GSM3674242_b4.barcodes.tsv.gz

Biobarani22 changed the title from "Public Single cell RNAseq datasets" to "How to analyze public Single cell RNAseq datasets in Seurat?" on Mar 1, 2021
@bassanio

bassanio commented Mar 2, 2021

@samuel-marsh
Collaborator

Hi,

I'm not a member of the dev team, but hopefully I can be helpful. If I understand correctly, you are asking how to import those files (since Read10X expects 10X-formatted files without any such file prefixes), not about the analysis itself, which is what @bassanio pointed you to?

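If so, one solution that works with the current package version is to rename those files, but you will first need to create a directory for each sample and move that sample's files into it. Be careful here: once the prefixes are removed, files from different samples share the same names.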
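For anyone landing here, a minimal sketch of that rename-into-directories step in base R follows. `organize_geo_10x` is a hypothetical helper, not a Seurat function, and the directory names in the usage comments are placeholders. It copies rather than moves, so the downloaded files stay untouched. One assumption to verify against your Seurat version: for gzipped input, Read10X looks for `features.tsv.gz` rather than `genes.tsv.gz`, so the sketch renames the gene table accordingly.

```r
# Hypothetical helper (not part of Seurat): copy GEO-style files such as
# "GSM3674224_std1.barcodes.tsv.gz" into one directory per sample and
# drop the sample prefix so each directory looks like 10X output.
organize_geo_10x <- function(in_dir, out_dir) {
  pattern <- "^(.+)\\.(barcodes\\.tsv|genes\\.tsv|matrix\\.mtx)\\.gz$"
  files   <- list.files(in_dir, pattern = pattern)
  samples <- sub(pattern, "\\1", files)  # e.g. "GSM3674224_std1"
  kinds   <- sub(pattern, "\\2", files)  # e.g. "barcodes.tsv"

  # Assumption: gzipped gene tables need the name features.tsv.gz
  kinds[kinds == "genes.tsv"] <- "features.tsv"

  for (i in seq_along(files)) {
    sample_dir <- file.path(out_dir, samples[i])
    dir.create(sample_dir, recursive = TRUE, showWarnings = FALSE)
    file.copy(file.path(in_dir, files[i]),
              file.path(sample_dir, paste0(kinds[i], ".gz")))
  }

  # Named vector of sample directories, one entry per sample
  dirs <- file.path(out_dir, unique(samples))
  names(dirs) <- unique(samples)
  dirs
}

# Usage (placeholder paths; requires Seurat):
# sample_dirs <- organize_geo_10x("GSE128423_RAW", "GSE128423_by_sample")
# counts <- Seurat::Read10X(data.dir = sample_dirs)
# data_S <- Seurat::CreateSeuratObject(counts = counts)
```

Passing a named vector to `data.dir` makes Read10X prefix each cell barcode with the sample name, so cells stay traceable to their sample after loading.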

Alternatively, you can see a few solutions currently being discussed in #4101 to allow for easier reading in of files from GEO or other repos.

Best,
Sam

@Biobarani22
Author

@bassanio,

Thank you very much. I will check it out.

@Biobarani22
Author

@samuel-marsh,

Thank you very much. I understood and separated the files as you suggested, and it worked. However, I now have another issue: scaling fails with a memory-exhausted error. If you have a solution, please help. Thanks once again.

Sample code is here:

data_S <- ScaleData(data_S)
Centering and scaling data matrix
Error: vector memory exhausted (limit reached?)

@samuel-marsh
Collaborator

Hi,

That error means that the memory R needs for this dataset exceeds what is available on your computer/server/cloud. The only solutions are to select a smaller portion of the data for the analysis or to move the analysis to a platform that has more memory available.
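As a rough sketch of the first option, one way to take a smaller portion is to keep a random subset of cells before scaling. `pick_cells` is a hypothetical helper written for this example, and the target of 5000 cells is an arbitrary number, not a recommendation; `data_S` is the object from the error above.

```r
# Hypothetical helper: choose a reproducible random subset of cell barcodes.
pick_cells <- function(cell_names, n, seed = 1) {
  set.seed(seed)  # fixed seed so the same cells are chosen on every run
  sample(cell_names, size = min(n, length(cell_names)))
}

# Usage with Seurat (requires the Seurat package and an existing data_S):
# keep       <- pick_cells(colnames(data_S), n = 5000)
# data_small <- subset(data_S, cells = keep)
# data_small <- ScaleData(data_small)
```

Random subsetting keeps the sketch simple; subsetting by sample or by condition may suit the biological question better.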

Best,
Sam

@Biobarani22
Author

@samuel-marsh,

OK, understood. Thanks a lot for your valuable suggestions.

timoast closed this as completed Mar 5, 2021