Non-Ensembl GTF testing #110
@MarieLataretu actually I think that's all for now. This branch works with my custom GTF-styled annotation file. Sure, things could be generalized further, but I think it's fine for now. If you agree, we can merge (how? should we first merge the current master into that branch? o_O)
bin/deseq2.R
Outdated
plot.heatmap.top_fc <- function(out.dir, resFold, trsf_data, trsf_type, ntop, pcutoff='', samples.info=df.samples.info, genes.info=df.gene.anno) {
  selected.ensembl.ids <- row.names(resFold[order(resFold$log2FoldChange, decreasing=TRUE), ])[1:ntop]
  # check how many elements are in the dataframe
  # if less elements are in the dataframe than selected by ntop, reduce ntop
  if (length(resFold$log2FoldChange) < ntop) {
    ntop = length(resFold$log2FoldChange)
  }
  if (ntop > 1) {
    selected.ensembl.ids <- row.names(resFold[order(resFold$log2FoldChange, decreasing=TRUE), ])[1:ntop]
I fixed that also in #123
:D
I just ran the test profile (so normal Ensembl annotation):
The non-Ensembl test is scheduled for today.
My first guess would be to merge this branch into master (there were no changes in master in this module).
Ah I see. Just checked #123 and agree, we could merge the changes from #123 into this branch #110, check if everything still works (I can also do a test again with my customized GTF file) and then merge into master?
What is the difference from a normal Ensembl annotation? Right now we still expect 'gene
Yep, the main difference I checked was that there is no ENSxxxxxxx ID. So the structure of my GTF is still a valid hierarchical GTF and looks like this:
So I think that at least any validly formatted GTF file with a gene and an exon feature and a gene_id in the attribute column should work.
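The minimal requirement stated above (gene and exon features, each with a gene_id) could be checked with something like the following sketch. This is only an illustration, not code from the workflow: the helper name check_gtf_minimal and the example records are made up here, and it assumes the standard 9-column, tab-separated GTF layout.

```r
# Hypothetical helper (not part of the workflow): checks that a GTF contains
# gene and exon features and that each of those records has a gene_id.
check_gtf_minimal <- function(gtf_lines) {
  # Drop comment/header lines and empty lines.
  gtf_lines <- gtf_lines[!grepl("^#", gtf_lines) & nzchar(gtf_lines)]
  fields <- strsplit(gtf_lines, "\t", fixed = TRUE)
  # A GTF record is expected to have 9 tab-separated columns.
  n_cols <- vapply(fields, length, integer(1))
  if (length(fields) == 0 || any(n_cols != 9)) return(FALSE)
  feature <- vapply(fields, `[`, character(1), 3)  # column 3: feature type
  attrs <- vapply(fields, `[`, character(1), 9)    # column 9: attributes
  has_gene_id <- grepl('gene_id "[^"]+"', attrs)
  needed <- feature %in% c("gene", "exon")
  all(c("gene", "exon") %in% feature) && all(has_gene_id[needed])
}

# Made-up records without any ENSxxxxxxx-style IDs.
example_gtf <- c(
  "chr1\tcustom\tgene\t100\t900\t.\t+\t.\tgene_id \"geneA\";",
  "chr1\tcustom\ttranscript\t100\t900\t.\t+\t.\tgene_id \"geneA\"; transcript_id \"geneA.t1\";",
  "chr1\tcustom\texon\t100\t400\t.\t+\t.\tgene_id \"geneA\"; transcript_id \"geneA.t1\";"
)
check_gtf_minimal(example_gtf)
```

A file that passes such a check may still lack optional attributes like gene_biotype, so downstream scripts would need to handle those separately.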
…ab/rnaflow into MarieLataretu/issue116
Marie lataretu/issue116
These three commands run without errors, the
From my side we can merge into master!
Yeah! I scrolled through the changes again - from my side please merge! Then we also do a new release? And should we reply to Ahmed (#116) again? I think this issue thread started the whole transcript/exon input change
Okay! Looks like a minor release to me.
Yes, at least the non-Ensembl part should work!
I'm testing the workflow for a genome/annotation that is not derived from Ensembl. Still, the GTF follows the Ensembl structure (gene, transcript, exon).
The test worked surprisingly well :) I just ran into a general issue with a plotting function that failed because, after p-value filtering, fewer genes remained than defined by ntop.
Please leave this PR open: I will check the output and improve some of the scripts so they still plot meaningful output even if we cannot extract information such as gene_biotype from the GTF. Also, we don't want to link to Ensembl with a GTF that does not have matching IDs.
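To illustrate the ntop problem mentioned above: indexing with [1:ntop] on fewer than ntop rows yields NA row names, which is presumably what broke the plot. Below is a minimal sketch of clamping ntop first. The helper name select_top_fc and the toy data frame are assumptions for illustration only; the function in bin/deseq2.R is plot.heatmap.top_fc and does more than this.

```r
# Hypothetical helper: returns the row names of the top `ntop` rows by
# log2FoldChange, never asking for more rows than the data frame has.
select_top_fc <- function(res, ntop) {
  ntop <- min(ntop, nrow(res))            # clamp to the number of rows left
  if (ntop < 1) return(character(0))      # nothing survived the filtering
  ordered <- res[order(res$log2FoldChange, decreasing = TRUE), , drop = FALSE]
  head(row.names(ordered), ntop)
}

# Toy result table: only three genes remain after filtering.
res <- data.frame(
  log2FoldChange = c(0.5, 2.1, -1.3),
  row.names = c("geneA", "geneB", "geneC")
)
select_top_fc(res, 50)  # three IDs, no NA padding
```

The snippet under review additionally skips the heatmap unless ntop > 1; a caller of this sketch could apply the same guard to the length of the returned vector.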