InvalidIndexError: (slice(None, None, None), 30751) while using coverage plot #493

yojetsharma · 2024-10-25T18:07:09Z

genes_violin_plot = 'MAFIP'
region='chr9:39287963-39288463'
# Create a coverage plot
fig = coverage_plot(
    SCENICPLUS_obj=scplus_obj,  # Your SCENICPLUS object
    bw_dict=bw_dict,
    region=region,
    genes_violin_plot=genes_violin_plot,
    gene_height=1,
    exon_height=4,
    meta_data_key='broader_annotation',  # Adjust based on your metadata structure
    figsize=(8, 10)  # Adjust figure size as needed
)
scplus_obj.gene_names:
Index(['MIR1302-2HG', 'AL627309.1', 'AL627309.5', 'AL627309.4', 'AP006222.2',
       'AL669831.2', 'LINC01409', 'FAM87B', 'LINC01128', 'LINC00115',
       ...
       'MAFIP', 'AC011043.1', 'AC011043.2', 'AC011841.1', 'AL354822.1',
       'AL592183.1', 'AC240274.1', 'AC004556.3', 'AC007325.4', 'AC007325.2'],
      dtype='object', length=30761)

The text was updated successfully, but these errors were encountered:

SeppeDeWinter · 2024-11-11T14:17:47Z

Hi @yojetsharma

Can you provide the full error output please?

All the best,

Seppe

yojetsharma · 2024-11-11T16:26:33Z

The code in question was run using a different pr_annotation file (i will get back to you on that complete shortly). But the code I tried recently gave the following error (pr_annotation file was the same that is used for pycistopic analysis for this):

from scenicplus.plotting.coverageplot import coverage_plot
scenicplus.plotting.coverageplot.coverage_plot(
    scplus_obj,
    bw_dict=bw_dict,
    region='chr14:100725892-100739224',
    meta_data_key='broader_annotation',
    pr_consensus_bed=pr_consensus_bed,
    pr_gtf=pr_annotation,
    genes_violin_plot='DLK1'
)

/home/praghu/yojetsharma/.conda/envs/scenicplus/lib/python3.11/site-packages/pyranges/methods/intersection.py:52: FutureWarning: In a future version, df.iloc[:, i] = newvals will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either df[df.columns[i]] = newvals or, if columns are non-unique, df.isetitem(i, newvals)
scdf.loc[:, "Start"] = new_starts
/home/praghu/yojetsharma/.conda/envs/scenicplus/lib/python3.11/site-packages/pyranges/methods/intersection.py:53: FutureWarning: In a future version, df.iloc[:, i] = newvals will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either df[df.columns[i]] = newvals or, if columns are non-unique, df.isetitem(i, newvals)
scdf.loc[:, "End"] = new_ends

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[75], line 2
      1 from scenicplus.plotting.coverageplot import coverage_plot
----> 2 scenicplus.plotting.coverageplot.coverage_plot(
      3     scplus_obj,
      4     bw_dict=bw_dict,
      5     region='chr14:100725892-100739224',
      6     meta_data_key='broader_annotation',
      7     pr_consensus_bed=pr_consensus_bed,
      8     pr_gtf=pr_annotation,
      9     genes_violin_plot='DLK1'
     10 )

File ~/.conda/envs/scenicplus/lib/python3.11/site-packages/scenicplus/plotting/coverageplot.py:288, in coverage_plot(SCENICPLUS_obj, bw_dict, region, genes_violin_plot, genes_arcs, gene_height, exon_height, meta_data_key, pr_consensus_bed, region_bed_height, pr_gtf, pr_interact, bw_ymax, color_dict, cmap, plot_order, figsize, fontsize_dict, gene_label_offset, arc_rad, arc_lw, cmap_violinplots, violinplots_means_color, violinplots_edge_color, violoinplots_alpha, width_ratios_dict, height_ratios_dict, sort_vln_plots, add_custom_ax)
    283 gtf_region_intersect = pr_gtf.intersect(pr_region)
    284 # only keep exon and gene info
    285 gtf_region_intersect = gtf_region_intersect[np.logical_and(
    286     np.logical_or(gtf_region_intersect.Feature == 'gene',
    287                   gtf_region_intersect.Feature == 'exon'),
--> 288     gtf_region_intersect.gene_type == 'protein_coding')]
    289 # iterate over all genes in intersect
    290 ax = axs_bw[subplot_idx]

File ~/.conda/envs/scenicplus/lib/python3.11/site-packages/pyranges/pyranges.py:269, in PyRanges.__getattr__(self, name)
    244 """Return column.
    245 
    246 Parameters
   (...)
    264 Name: Start, dtype: int32
    265 """
    267 from pyranges.methods.attr import _getattr
--> 269 return _getattr(self, name)

File ~/.conda/envs/scenicplus/lib/python3.11/site-packages/pyranges/methods/attr.py:67, in _getattr(self, name)
     65     return pd.concat([df[name] for df in self.values()])
     66 else:
---> 67     raise AttributeError("PyRanges object has no attribute", name)

AttributeError: ('PyRanges object has no attribute', 'gene_type')

yojetsharma · 2024-12-15T08:02:13Z

Hi @SeppeDeWinter,
Any suggestions on how to prevent the above error?

SeppeDeWinter · 2024-12-16T07:23:30Z

Hi @yojetsharma

Can you show the output of

It looks like it does not have a gene_type column.

pr_annotation

Best,

Seppe

yojetsharma · 2024-12-31T18:20:28Z

Hello @SeppeDeWinter
Hope you're doing well and Happy New Year.
This is how the pr_annotation looks like now after renaming the column name:

	Chromosome	Start	End	gene_type	Score	Strand	Transcript_type	Transcription_Start_Site
0	GL000009.2	58375	58376	NaN	.	-	protein_coding	58375
1	GL000194.1	115017	115018	NaN	.	-	protein_coding	115017
2	GL000194.1	115054	115055	MAFIP	.	-	protein_coding	115054
3	GL000195.1	49163	49164	NaN	.	-	protein_coding	49163
4	GL000213.1	139654	139655	NaN	.	-	protein_coding	139654
...	...	...	...	...	...	...	...	...
87556	chrY	6911751	6911752	AMELY	.	-	protein_coding	6911751
87557	chrY	6872607	6872608	AMELY	.	-	protein_coding	6872607
87558	chrY	21918031	21918032	RBMY1E	.	-	protein_coding	21918031
87559	chrY	24047968	24047969	CDY1B	.	-	protein_coding	24047968
87560	chrY	24048018	24048019	CDY1B	.	-	protein_coding

And now after running the following:

import pyranges as pr

pr_annotation = pd.read_table(
        os.path.join("/home/praghu/yojetsharma/131024/", "qc", "tss.bed") ##from pycisTopic workflow
    ).rename(
        {"Name": "Gene", "# Chromosome": "Chromosome"}, axis = 1)
pr_annotation["Transcription_Start_Site"] = pr_annotation["Start"]
pr_annotation = pr.PyRanges(pr_annotation)
pr_annotation
pr_annotation.to_csv('genome_annotation.tsv', sep='\t')
from scenicplus.plotting.coverageplot import coverage_plot
bigwig_dir = '/home/praghu/yojetsharma/pseudobulk_bw_files/'
bw_dict = {x.replace('.bw', ''): os.path.join(bigwig_dir, x) for x in os.listdir(bigwig_dir) if '.bw' in x}
pr_consensus_bed = pr.read_bed('/home/praghu/yojetsharma/consensus_regions.bed')
scenicplus.plotting.coverageplot.coverage_plot(
    scplus_obj,
    bw_dict=bw_dict,
    region='chr14:100725892-100739224',
    meta_data_key='broader_annotation',
    pr_consensus_bed=pr_consensus_bed,
    pr_gtf=pr_annotation,
    genes_violin_plot='DLK1'
)

Gives the following error:

File ~/miniconda3/envs/scenicplus/lib/python3.11/site-packages/scenicplus/plotting/coverageplot.py:286, in coverage_plot(SCENICPLUS_obj, bw_dict, region, genes_violin_plot, genes_arcs, gene_height, exon_height, meta_data_key, pr_consensus_bed, region_bed_height, pr_gtf, pr_interact, bw_ymax, color_dict, cmap, plot_order, figsize, fontsize_dict, gene_label_offset, arc_rad, arc_lw, cmap_violinplots, violinplots_means_color, violinplots_edge_color, violoinplots_alpha, width_ratios_dict, height_ratios_dict, sort_vln_plots, add_custom_ax)
    283 gtf_region_intersect = pr_gtf.intersect(pr_region)
    284 # only keep exon and gene info
    285 gtf_region_intersect = gtf_region_intersect[np.logical_and(
--> 286     np.logical_or(gtf_region_intersect.Feature == 'gene',
    287                   gtf_region_intersect.Feature == 'exon'),
    288     gtf_region_intersect.gene_type == 'protein_coding')]
    289 # iterate over all genes in intersect
    290 ax = axs_bw[subplot_idx]

File ~/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pyranges/pyranges.py:269, in PyRanges.__getattr__(self, name)
    244 """Return column.
    245 
    246 Parameters
   (...)
    264 Name: Start, dtype: int32
    265 """
    267 from pyranges.methods.attr import _getattr
--> 269 return _getattr(self, name)

File ~/miniconda3/envs/scenicplus/lib/python3.11/site-packages/pyranges/methods/attr.py:67, in _getattr(self, name)
     65     return pd.concat([df[name] for df in self.values()])
     66 else:
---> 67     raise AttributeError("PyRanges object has no attribute", name)

AttributeError: ('PyRanges object has no attribute', 'Feature')

Am I going wrong with my pr_annotation file?

yojetsharma · 2024-12-31T20:34:53Z

Since the previous error showed that "Feature" issue, I tried to solve that by downloading the gtf from https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_32/gencode.v32.annotation.gtf.gz . Then created using that created pr_annotation as follows:

# Read the GTF content into a DataFrame
pr_annotation = pd.read_csv(io.StringIO(content), sep='\t', comment='#', header=None,
                            names=['Chromosome', 'Source', 'Feature', 'Start', 'End', 'Score', 'Strand', 'Frame', 'Attributes'])

# Process the Attributes column to extract gene_id and gene_name
pr_annotation['gene_id'] = pr_annotation['Attributes'].str.extract('gene_id "(.*?)"')
pr_annotation['gene_name'] = pr_annotation['Attributes'].str.extract('gene_name "(.*?)"')
pr_annotation['gene_type'] = pr_annotation['Attributes'].str.extract('gene_type "(.*?)"')
pr_annotation['transcript_type'] = pr_annotation['Attributes'].str.extract('transcript_type "(.*?)"')
pr_annotation['transcript_namne'] = pr_annotation['Attributes'].str.extract('transcript_name "(.*?)"')

# Add the Transcription_Start_Site column
pr_annotation["Transcription_Start_Site"] = pr_annotation["Start"]

# Convert to PyRanges object
pr_annotation = pr.PyRanges(pr_annotation)
pr_annotation

And this is how it looks like now:


Chromosome | Source | Feature | Start | End | Score | Strand | Frame | gene_id | gene_name | gene_type | transcript_type | transcript_namne | Transcription_Start_Site
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
chr1 | HAVANA | gene | 11869 | 14409 | . | + | . | ENSG00000223972.5 | DDX11L1 | transcribed_unprocessed_pseudogene | NaN | NaN | 11869
chr1 | HAVANA | transcript | 11869 | 14409 | . | + | . | ENSG00000223972.5 | DDX11L1 | transcribed_unprocessed_pseudogene | lncRNA | DDX11L1-202 | 11869
chr1 | HAVANA | exon | 11869 | 12227 | . | + | . | ENSG00000223972.5 | DDX11L1 | transcribed_unprocessed_pseudogene | lncRNA | DDX11L1-202 | 11869
chr1 | HAVANA | exon | 12613 | 12721 | . | + | . | ENSG00000223972.5 | DDX11L1 | transcribed_unprocessed_pseudogene | lncRNA | DDX11L1-202 | 12613
chr1 | HAVANA | exon | 13221 | 14409 | . | + | . | ENSG00000223972.5 | DDX11L1 | transcribed_unprocessed_pseudogene | lncRNA | DDX11L1-202 | 13221
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ...
chrY | HAVANA | exon | 57214350 | 57214397 | . | - | . | ENSG00000227159.8_PAR_Y | DDX11L16 | unprocessed_pseudogene | unprocessed_pseudogene | DDX11L16-201 | 57214350
chrY | HAVANA | exon | 57213880 | 57213964 | . | - | . | ENSG00000227159.8_PAR_Y | DDX11L16 | unprocessed_pseudogene | unprocessed_pseudogene | DDX11L16-201 | 57213880
chrY | HAVANA | exon | 57213526 | 57213602 | . | - | . | ENSG00000227159.8_PAR_Y | DDX11L16 | unprocessed_pseudogene | unprocessed_pseudogene | DDX11L16-201 | 57213526
chrY | HAVANA | exon | 57213204 | 57213357 | . | - | . | ENSG00000227159.8_PAR_Y | DDX11L16 | unprocessed_pseudogene | unprocessed_pseudogene | DDX11L16-201 | 57213204
chrY | HAVANA | exon | 57212184 | 57213125 | . | - | . | ENSG00000227159.8_PAR_Y | DDX11L16 | unprocessed_pseudogene | unprocessed_pseudogene | DDX11L16-201 | 57212184

Then followed the same code as above to generate the coverage_plot:

from scenicplus.plotting.coverageplot import coverage_plot
bigwig_dir = '/home/praghu/yojetsharma/pseudobulk_bw_files/'
bw_dict = {x.replace('.bw', ''): os.path.join(bigwig_dir, x) for x in os.listdir(bigwig_dir) if '.bw' in x}
pr_consensus_bed = pr.read_bed('/home/praghu/yojetsharma/consensus_regions.bed')
scenicplus.plotting.coverageplot.coverage_plot(
    scplus_obj,
    bw_dict=bw_dict,
    region='chr14:100725892-100739224',
    meta_data_key='broader_annotation',
    pr_consensus_bed=pr_consensus_bed,
    pr_gtf=pr_annotation
)

I removed the genes_violin_plot parameter as that was giving SliceError.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

InvalidIndexError: (slice(None, None, None), 30751) while using coverage plot #493

InvalidIndexError: (slice(None, None, None), 30751) while using coverage plot #493

yojetsharma commented Oct 25, 2024

SeppeDeWinter commented Nov 11, 2024

yojetsharma commented Nov 11, 2024 •

edited

Loading

yojetsharma commented Dec 15, 2024

SeppeDeWinter commented Dec 16, 2024

yojetsharma commented Dec 31, 2024 •

edited

Loading

yojetsharma commented Dec 31, 2024 •

edited

Loading

InvalidIndexError: (slice(None, None, None), 30751) while using coverage plot #493

InvalidIndexError: (slice(None, None, None), 30751) while using coverage plot #493

Comments

yojetsharma commented Oct 25, 2024

SeppeDeWinter commented Nov 11, 2024

yojetsharma commented Nov 11, 2024 • edited Loading

yojetsharma commented Dec 15, 2024

SeppeDeWinter commented Dec 16, 2024

yojetsharma commented Dec 31, 2024 • edited Loading

yojetsharma commented Dec 31, 2024 • edited Loading

yojetsharma commented Nov 11, 2024 •

edited

Loading

yojetsharma commented Dec 31, 2024 •

edited

Loading

yojetsharma commented Dec 31, 2024 •

edited

Loading