Help with bc_transform_func for GEX job in Snakemake #522

Open
ptk1601 opened this issue Dec 5, 2024 · 4 comments

ptk1601 commented Dec 5, 2024

Hello!

I am trying to fix a discrepancy between the barcodes in my GEX and ATAC data. My GEX barcodes are currently formatted as stat3ff_AAACAGCCATTGACAT-1, while my ATAC barcodes look like AAACAGCCATTGACAT-1_stat3ff. What lambda function can I pass as bc_transform_func to reconcile them? Unfortunately, when I try to split on underscores, SCENIC+/Python doesn't seem to recognize them.
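For reference, one possible shape for such a lambda is sketched below. It assumes the barcode itself never contains an underscore, so the last underscore always separates the sample tag from the barcode. The name `gex_to_atac` is hypothetical, and how the expression has to be quoted in the Snakemake config is not shown in this thread, so check the SCENIC+ documentation for that part.

```python
# Hypothetical helper (not part of SCENIC+): turn "<sample>_<barcode>"
# into "<barcode>_<sample>" by splitting on the LAST underscore only.
# Assumption: the barcode part (e.g. "AAACAGCCATTGACAT-1") has no underscore.
gex_to_atac = lambda bc: "_".join(reversed(bc.rsplit("_", 1)))

print(gex_to_atac("stat3ff_AAACAGCCATTGACAT-1"))
# AAACAGCCATTGACAT-1_stat3ff
```

A barcode without any underscore passes through unchanged, which makes it easy to spot names that were never prefixed in the first place.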

ptk1601 commented Dec 5, 2024

I've fixed this issue just by changing the format of the cell_names part of the cistopic_obj, but I am now getting this error. Is there anything I can do to fix this?

File "/gpfs/gsfs12/users/kimpt/conda/envs/scenicplus/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 6168, in _raise_if_missing
raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Index(['CACATTAAGTTCCTGC-1', 'TCATTACTCTCACAAA-1', 'TTAGCAATCACACAGT-1',\n 'TTGGGTTAGCTTCTCA-1', 'TCTTGTCCAAAGCTCC-1', 'CTAAATGTCATGCGTG-1',\n 'TGAGGTGCATAAAGCA-1', 'GACATAGAGACAACGA-1', 'TAATGGACACCGGCTA-1',\n 'GGGTTATTCGGTTTCC-1',\n ...\n 'CTAATCCGTATACTGG-1', 'GCCTACTTCCCTGGAA-1', 'CCCGCAACAACACCTA-1',\n 'ACAGTATGTTTGGTTC-1', 'AATCATCCAAGCTTTG-1', 'ATGAGCCGTGGATTGC-1',\n 'AGGTCAAAGATGGAGC-1', 'TGCCATTGTGTGCAAC-1', 'TTATAGCCATACCCGG-1',\n 'CTAATAGTCCCAGTAG-1'],\n dtype='object', length=18273)] are in the [columns]"

I also got a message recommending that I call .obs_names_make_unique. Could that be related to this error?
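A quick way to narrow this down is to check the two name lists for duplicates and for overlap before running the pipeline. The sketch below uses made-up barcode lists and a hypothetical `strip_sample_tag` helper; it only illustrates the check and makes no claim about where SCENIC+ performs its lookup.

```python
from collections import Counter

def strip_sample_tag(barcode, tag="stat3ff"):
    """Remove a leading 'tag_' or a trailing '_tag' from a barcode (hypothetical helper)."""
    if barcode.startswith(tag + "_"):
        return barcode[len(tag) + 1:]
    if barcode.endswith("_" + tag):
        return barcode[:-(len(tag) + 1)]
    return barcode

# Made-up example names; in practice these would come from the GEX and ATAC objects.
gex_names = [
    "stat3ff_AAACAGCCATTGACAT-1",
    "stat3ff_CACATTAAGTTCCTGC-1",
    "stat3ff_CACATTAAGTTCCTGC-1",  # deliberate duplicate
]
atac_names = ["AAACAGCCATTGACAT-1_stat3ff", "CACATTAAGTTCCTGC-1_stat3ff"]

# Names that occur more than once (the situation the uniqueness warning refers to).
duplicates = [name for name, n in Counter(gex_names).items() if n > 1]

# Overlap of the names as written, and of the bare barcodes with the tag removed.
raw_overlap = set(gex_names) & set(atac_names)
core_overlap = {strip_sample_tag(n) for n in gex_names} & {strip_sample_tag(n) for n in atac_names}

print("duplicates:", duplicates)
print("shared names as written:", len(raw_overlap))
print("shared bare barcodes:", len(core_overlap))
```

If the names share nothing as written but the bare barcodes do overlap, the mismatch is purely one of formatting. Duplicated names are a separate problem, and renaming them to make them unique changes the names again.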

sheyiphunmi commented Dec 6, 2024

Have you considered formatting the cell_names (indices) in your GEX object instead of the cistopic_obj?

NOTE:
I am not a part of the scenic+ team; I am just a user who reads some of the issues.

ptk1601 commented Dec 6, 2024

> Have you considered formatting the cell_names (indices) in your GEX object instead of the cistopic_obj?
>
> NOTE: I am not a part of the scenic+ team; I am just a user who reads some of the issues.

Hi! Thank you so much for the suggestion. I am now working with the GEX object. I have copied the cell barcodes to a column in the GEX object, yet SCENIC+ still cannot seem to find them...

@sheyiphunmi

Copying the barcodes to a new column is probably not enough, because that doesn't change the row index (the row names) of your metadata or of your GEX count matrix.

For instance, the index of your metadata (adata_atac.obs) will not be changed. Do you have a Seurat or SCE version of this data? If yes, I suggest formatting the row names of your count matrix before converting to an "h5ad" object.
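The difference between adding a column and renaming the index can be sketched with a plain pandas DataFrame standing in for the `.obs` table. The data, the `n_genes` column, and the `to_atac_format` helper are all made up for illustration. For an AnnData object the analogous step would be assigning the new names to `adata.obs_names`, which is an assumption about your setup rather than something shown in this thread.

```python
import pandas as pd

def to_atac_format(barcode):
    """Hypothetical helper: '<sample>_<barcode>' -> '<barcode>_<sample>'."""
    return "_".join(reversed(barcode.rsplit("_", 1)))

# Stand-in for a GEX metadata table indexed by cell barcode (made-up data).
obs = pd.DataFrame(
    {"n_genes": [1200, 950]},
    index=["stat3ff_AAACAGCCATTGACAT-1", "stat3ff_TTGGGTTAGCTTCTCA-1"],
)

# Copying reformatted barcodes into a column leaves the index untouched,
# so lookups by cell name still use the old format.
obs["barcode_fixed"] = [to_atac_format(bc) for bc in obs.index]
index_after_column = list(obs.index)

# Renaming the index itself is what changes the cell names.
obs.index = obs.index.map(to_atac_format)
index_after_rename = list(obs.index)

print(index_after_column)
print(index_after_rename)
```

Only the second step changes what a lookup by cell name sees, which is why copying barcodes into a column had no visible effect.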
