-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add minimum number of cells (3) for merge #801
Conversation
Merge changes from #799 into main
belt and suspenders
Tests passed, ready for review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just one suggestion for R code if you want it!
.filter{it[2].exists() && it[2].size() > 0} | ||
// only include libraries that have been processed through scpca-nf and have at least 3 cells | ||
.filter{it[2].exists() && it[2].size() > 0 && Utils.getMetaVal(it[3], "processed_cells") >= 3} | ||
.map{it[0..2]} // remove metadata file from tuple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gather 0..2
syntax is the groovy equivalent of R's 0:2
(...ok, 1:3
what with indexing from 1 😄 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct. I could have done a number of things to get rid of the last element. Probably the best is it.dropRight(1)
to remove the last element, but this seemed fine here.
Co-authored-by: Stephanie Spielman <stephanie.spielman@gmail.com>
Following #799, here I am also filtering the merged data to not include samples with 2 or fewer cells. This turns out to be necessary for the calculation of tehe grouped PCA (at least that is what I think is going on).
I took a bit of a belt and suspenders approach here, checking both in the metadata during the workflow itself and in the R script where I check the SCEs before merging.
There is also a little cleanup in one of the
Util
functions so it doesn't throw an error if a file doesn't exist.I did a small test with this and it seemed to work, but a larger test is still in progress, which is why I am filing this as a draft.