Confusions about RunCCA result #535

ysbioinfo · 2018-06-09T09:28:36Z

Hi,
Sorry for asking such a simple question; I am not strong in statistics and am confused about the RunCCA process.
I read your NBT paper (2018). It seems that you run CCA on two matrices, X and Y, which have the same number of rows (genes), n, but different numbers of columns (cells), m and p. CCA then returns vectors u and v, from which you define the metagenes for downstream alignment. Based on my understanding of CCA, the vectors it returns should have different lengths: u should have length m and v length p, i.e. each gives the coefficients of a linear combination of the cells in its group. However, in the result of RunCCA (object@dr$cca), I see not only a weight for each cell but also a weight for each gene. Why is there also a linear combination over genes? Where do these weights come from?
Thank you very much!

andrewwbutler · 2018-06-18T14:28:43Z

We define the gene loadings for CCA by multiplying the scaled expression matrix by the cell embeddings. In the paper, this is described in the first equation under the "Identification of rare non-overlapping subpopulations" subsection in the Online Methods (A = Xu).
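
To make the dimensions concrete, here is a minimal sketch of that relationship in plain Python. It is not Seurat's code. It assumes that the canonical vectors can be approximated by the leading singular vectors of XᵀY, uses a simple power iteration in place of whatever solver RunCCA actually calls, and runs on random toy data; every name in it (`leading_singular_pair`, `gene_loadings_x`, and so on) is hypothetical. The point is only that u has one entry per cell (length m), while A = Xu has one entry per gene (length n).

```python
import math
import random


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def normalize(x):
    norm = math.sqrt(sum(a * a for a in x))
    return [a / norm for a in x]


def leading_singular_pair(K, iters=200, seed=1):
    """Approximate the leading left/right singular vectors of K
    by power iteration (an illustrative stand-in, not Seurat's solver)."""
    rng = random.Random(seed)
    Kt = transpose(K)
    v = normalize([rng.gauss(0, 1) for _ in range(len(K[0]))])
    for _ in range(iters):
        u = normalize(matvec(K, v))
        v = normalize(matvec(Kt, u))
    return u, v


# Toy data (assumption): n genes, m cells in dataset X, p cells in dataset Y.
n, m, p = 6, 4, 5
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(m)] for _ in range(n)]  # n x m
Y = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]  # n x p

# K = X^T Y has shape m x p: one row per X cell, one column per Y cell.
Xt = transpose(X)
K = [[sum(Xt[i][g] * Y[g][j] for g in range(n)) for j in range(p)]
     for i in range(m)]

# Cell-level weights: u has length m, v has length p.
u, v = leading_singular_pair(K)

# Gene-level weights: A = X u (and likewise Y v) has length n.
gene_loadings_x = matvec(X, u)
gene_loadings_y = matvec(Y, v)

print(len(u), len(v), len(gene_loadings_x), len(gene_loadings_y))
```

The per-cell vectors correspond to the cell embeddings and the per-gene vectors to the gene loadings described above, which is why the CCA result holds both kinds of weights.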


mojaveazure added the Analysis Question label Jun 11, 2018

andrewwbutler closed this as completed Jun 18, 2018

mojaveazure pushed a commit that referenced this issue Mar 16, 2021

Merge pull request #535 from satijalab/feat/check_nn.name

855e271

check nn.name input type
