Get top explanatory features from PCoA loadings #541
Comments
Great, thanks for suggesting. Some considerations:
I think it could be useful to create a method and put it to use, so that we gain some experience with it before deciding how to develop it further.
One reason for not making this too simple is to avoid application by users who don't understand the methods and their implications. Therefore, one option is to make a first version (including the visualization style) just for ordinary PCA, for which proper component loadings exist. We can then use that to streamline the procedure and develop the ideal visualization, while considering how to do this for the non-linear methods.
Thank you for the thorough remarks! Considering what you said, perhaps would it be more convenient to extend Do you think adding this functionality to
If these two cannot be solved in one go, could separate issues at least be opened for each?
We could first try to add support for reducedDim directly to getCrossAssociation. If that makes the function too complex, we could create a new function that still uses getCrossAssociation internally. (Better to have one workhorse that we keep improving.)
I quickly checked getCrossAssociation, and I think it would take me a very long time to familiarise myself with it and add the reducedDim functionality. Unless you would like to give it a try, I might opt for a new function specific to the assay-to-dimred comparison.
It seems best to go with an option that uses getCrossAssociation internally. The getCrossAssociation-based solution could be used to correlate reducedDims not just with assay features but also with colData. That is also relevant (e.g. which diet components from colData correlate best with the first PC axis, and how to visualize that). But the main motivation currently seems to be the ability to characterize the driving features on each PC axis for PCA and PCoA, and potentially also for NMDS. I would not recommend UMAP and t-SNE by default, and interpretation this way would be even more problematic for those methods. This still makes me wonder whether we could first focus on providing rigorous and theoretically justified visualization solutions for PCA and PCoA (or one of these)? Links to the solutions are provided above.
It sounds like a good option. And while we don't have a solution that uses getCrossAssociation, for now we could calculate correlations on the fly within the visualisation function?
Hmm, there shouldn't be a need to calculate correlations for the rigorous option (PCA and PCoA)? We can work through an example, but essentially we can dig out the variable loadings for PCA and PCoA from the method itself: directly for PCA, and a bit more indirectly for PCoA.
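To illustrate the point that PCA loadings come out of the method itself, with no correlation step: the loadings of the first component are the leading eigenvector of the feature covariance matrix. Below is a minimal sketch in plain Python (not the package's R code), using power iteration for brevity. The function names, feature names, and toy data are all hypothetical. The more indirect PCoA route, which starts from a distance matrix, is not shown.

```python
from math import sqrt

def covariance_matrix(rows):
    """Sample covariance of features; rows are samples, columns are features."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_pc_loadings(rows, iterations=500):
    """Leading eigenvector of the covariance matrix via power iteration.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    The sign of the returned vector is arbitrary, as with any PCA loading.
    """
    cov = covariance_matrix(rows)
    p = len(cov)
    v = [1.0 / sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical toy data: 5 samples x 3 features.
feature_names = ["taxonA", "taxonB", "taxonC"]
samples = [
    [10.0, 2.0, 1.0],
    [20.0, 4.1, 1.2],
    [30.0, 5.9, 0.9],
    [40.0, 8.2, 1.1],
    [50.0, 9.8, 1.0],
]

loadings = first_pc_loadings(samples)
# Rank features by the absolute size of their loading on the first component.
ranked = sorted(zip(feature_names, loadings), key=lambda fl: abs(fl[1]), reverse=True)
for name, loading in ranked:
    print(name, round(loading, 3))
```

Ranking by absolute loading is what makes this the "rigorous" route for PCA: the values are part of the decomposition rather than a post hoc association measure.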
@ElySeraidarian can you close this once it is solved?
Hi! Currently, when performing ordination (PCoA), there is no straightforward way to retrieve the original features that contribute the most to the reduced dimensions. Every time, the user would need to manually correlate features to PCoA loadings for example like this:
I would propose creating a function that takes the dimred of interest, correlates its loadings with an assay.type of interest, and returns the top n features for every dimension in the form of a data frame. If relevant, both get and add methods can be developed.
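The idea can be sketched roughly as follows. This is an illustration in plain Python rather than the R package code, and it is not the draft mentioned in this issue: the function name `top_features_per_axis`, its parameters, and the toy data are hypothetical. It computes the Pearson correlation between every feature and every ordination axis, then keeps the n features with the largest absolute correlation per axis.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def top_features_per_axis(assay, scores, n=2):
    """Return, for each axis, the n features most correlated with it.

    assay:  {feature name: [value per sample]}
    scores: {axis name: [ordination coordinate per sample]}
    """
    result = {}
    for axis, coords in scores.items():
        cors = [(feat, pearson(vals, coords)) for feat, vals in assay.items()]
        # Strong negative correlations are as informative as positive ones.
        cors.sort(key=lambda fc: abs(fc[1]), reverse=True)
        result[axis] = cors[:n]
    return result

# Hypothetical toy data: 3 features and 2 ordination axes over 5 samples.
assay = {
    "taxonA": [1, 2, 3, 4, 5],
    "taxonB": [5, 4, 3, 2, 1],
    "taxonC": [2, 9, 1, 7, 3],
}
scores = {
    "PC1": [-2.0, -1.0, 0.0, 1.0, 2.0],
    "PC2": [0.5, -1.0, 1.0, -0.8, 0.3],
}

top = top_features_per_axis(assay, scores, n=2)
for axis, feats in top.items():
    print(axis, [(f, round(r, 2)) for f, r in feats])
```

Returning the signed correlation alongside each feature name keeps the direction of the association available for later visualization.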
I have a working draft, so a PR about this can be opened soon. Do you think that getPrincipalFeatures (in reference to principal components) would be a suitable name? Where is the best place to define it, maybe in summaries.R next to the other getFeatures utilities?