This repository has been archived by the owner on Jun 21, 2023. It is now read-only.
Updates on selection strategy, and v12 added "stranded" files. #374
Purpose/implementation Section
What scientific question is your analysis addressing?
A quick look at the stranded samples added in v12, to try to determine why the new samples were clustering with the poly-A samples from the earlier data.
What was your approach?
I incorporated the additions and changes from d40097c, which are part of #366, into a new file (02-selection-strategies-update.rmd). I then looked for genes whose expression was correlated with cluster assignments based on the UMAP data.
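For reviewers who want a concrete picture of the correlation step, here is a minimal sketch in Python. It is not the notebook's code (the notebook is R Markdown). It assumes a hypothetical layout of one expression vector per gene plus one cluster label per sample, and all gene names and values are made-up toy data. It scores each gene by the Pearson correlation between its expression and a 0/1 indicator of membership in one cluster.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant vector: correlation is undefined, report 0
    return cov / sqrt(var_x * var_y)


def rank_genes_by_cluster(expression, clusters, target):
    """Rank genes by |correlation| with membership in the `target` cluster."""
    membership = [1.0 if c == target else 0.0 for c in clusters]
    scores = {gene: pearson(values, membership)
              for gene, values in expression.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Toy data (hypothetical): six samples in two clusters, three genes.
clusters = ["A", "A", "A", "B", "B", "B"]
expression = {
    "gene_low_in_A": [0.1, 0.2, 0.1, 5.0, 5.2, 4.9],
    "gene_flat": [3.0, 3.1, 2.9, 3.0, 3.1, 2.9],
    "gene_high_in_A": [6.0, 6.1, 5.9, 1.0, 1.2, 0.9],
}
ranked = rank_genes_by_cluster(expression, clusters, target="A")
for gene, score in ranked:
    print(f"{gene}\t{score:+.3f}")
```

Genes at the top of the ranking are the ones whose expression best separates the chosen cluster from the rest. The sign of the score says whether expression is higher or lower inside that cluster.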
Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.
Which areas should receive a particularly close look?
This is an intermediate analysis that I do not expect to maintain long term, so the current quality of the plots should be sufficient. If people would like to see a more refined analysis, this can be extended.
Is there anything that you want to discuss further?
We should discuss the next steps, given that the data do not seem to be what we had hoped to obtain with the re-sequencing. We should definitely seek clarification on the precise methods that were employed in generating these data.
Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?
n/a
Results
What types of results are included (e.g., table, figure)?
A notebook and some figures.
What is your summary of the results?
The main result is this figure showing expression levels of individual genes, divided by UMAP cluster and colored by library preparation method.
The main takeaway is that histones and noncoding RNAs have much lower expression in the cluster that contains poly-A samples, which is as expected. However, the new stranded samples have the same biases, indicating that they may have been subjected to poly-A selection during preparation.
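The comparison behind the figure can also be summarized numerically. The sketch below is an illustration in Python under assumed inputs, not the notebook's code: for a single gene it takes per-sample expression values, UMAP cluster labels, and library preparation labels, and returns the mean expression for each (cluster, preparation) pair. The function name, labels, and numbers are all hypothetical toy values.

```python
from collections import defaultdict


def group_means(values, clusters, methods):
    """Mean expression for each (cluster, library prep) combination."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, cluster, method in zip(values, clusters, methods):
        key = (cluster, method)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Toy data (hypothetical) for one gene across six samples.
values = [0.0, 2.0, 1.0, 3.0, 8.0, 10.0]
clusters = [1, 1, 1, 1, 2, 2]
methods = ["poly-A", "poly-A", "stranded", "stranded", "stranded", "stranded"]

means = group_means(values, clusters, methods)
for (cluster, method), mean in sorted(means.items()):
    print(f"cluster {cluster}\t{method}\t{mean:.2f}")
```

If stranded samples that fall in the poly-A cluster have means close to the poly-A samples and far from the other stranded samples, that is the pattern described above.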
Reproducibility Checklist
analyses/README.md
.