Selecting high-quality regulons #58

liz-is · 2022-11-01T09:32:16Z

Hi SCENIC+ team,

What would you advise as the best way to select high-quality regulons for downstream analysis and visualisation?

In the main tutorial you suggest using the correlation between TF expression and region-based regulon AUC, calculated with TF_cistrome_correlation(). In the preprint, however, you most often seem to select regulons using the correlation between gene-based regulon AUC and region-based regulon AUC. Could you comment on the key considerations for using one or the other, and on how you choose a threshold? Any threshold will be somewhat arbitrary, of course, but it would be helpful to know the main points to consider :)

Also, I wanted to calculate the correlation between gene-based and region-based regulon AUC for my data, but I couldn't find a function for this. I looked in the cistromes module and wasn't sure where else to look. If I understand correctly, TF_cistrome_correlation() can only calculate correlations between TF expression and either gene-based or region-based AUC. Could you point me to the function for this, if there is one, or share code for the calculation if it's not included in the package?

Thanks a lot in advance!

SeppeDeWinter · 2022-11-01T10:05:06Z

Maintainer

Hi @liz-is

Thanks for the nice question.

In our experience, both work quite well for selecting high-quality eRegulons.

We also use AUC vs AUC correlation mostly because gene-based AUC values are less sparse and more continuous than TF expression, so there is a bit more power to calculate correlations. However, correlation based on TF expression also works quite well (certainly when you generate pseudobulk gene expression values).

At the moment we don't have a function in the package to calculate AUC vs AUC correlations, but it would not be difficult to add. If I find some time, I'll try to add some code for it.

In the meantime, @cbravo93 might be able to give you a code snippet, though it will probably be in R. Also feel free to calculate it yourself; I'm happy to give feedback on (and potentially include) your code.

Hope this helps.

Best,

Seppe
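
For readers who want to try the AUC vs AUC correlation discussed above, here is a minimal sketch in plain pandas. It is not part of the SCENIC+ package and not the maintainers' code. The function name `auc_vs_auc_correlation`, the toy data and the 0.7 cutoff are hypothetical. It assumes you already have two tables (cells or pseudobulks as rows, eRegulons as columns) and that the columns of both tables use the same eRegulon key; if the gene-based and region-based tables label the same eRegulon differently, rename the columns to a shared key first.

```python
import pandas as pd


def auc_vs_auc_correlation(gene_auc: pd.DataFrame,
                           region_auc: pd.DataFrame,
                           method: str = "pearson") -> pd.Series:
    """Correlate gene-based and region-based AUC per eRegulon.

    Hypothetical helper: both inputs are assumed to be
    (cells or pseudobulks) x eRegulons tables whose columns
    already use the same eRegulon keys.
    """
    # Keep only observations and eRegulons present in both tables.
    shared_rows = gene_auc.index.intersection(region_auc.index)
    shared_cols = gene_auc.columns.intersection(region_auc.columns)
    g = gene_auc.loc[shared_rows, shared_cols]
    r = region_auc.loc[shared_rows, shared_cols]
    # Column-wise correlation between matching columns of the two tables.
    rho = g.corrwith(r, axis=0, method=method)
    return rho.sort_values(ascending=False)


# Toy data, made up for illustration only.
gene_auc = pd.DataFrame(
    {"TF_A": [0.1, 0.2, 0.3, 0.4], "TF_B": [0.4, 0.1, 0.3, 0.2]},
    index=["c1", "c2", "c3", "c4"],
)
region_auc = pd.DataFrame(
    {"TF_A": [0.2, 0.4, 0.6, 0.8], "TF_B": [0.1, 0.4, 0.2, 0.3]},
    index=["c1", "c2", "c3", "c4"],
)

rho = auc_vs_auc_correlation(gene_auc, region_auc)
# 0.7 is an arbitrary example cutoff, not a recommended threshold.
selected = rho[rho > 0.7].index.tolist()
print(rho)
print(selected)
```

Passing `method="spearman"` gives a rank-based correlation instead, which may be worth comparing when AUC values are skewed.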

YH-Zheng · 2024-05-09T03:38:38Z

Hi @SeppeDeWinter

I have a question about selecting high-quality eRegulons by correlation. I noticed that you choose a different correlation threshold for different datasets, so how high does the correlation need to be for a good threshold? In my data, when I subset to one cell type (for example CD4 T cells out of all PBMCs), the change in background increases the correlation of the same eRegulons. With all cell types as background, these eRegulons are only selected if I lower the threshold to 0.4, whereas with the subset I can use a threshold of 0.6. Many of them have been reported in the literature, but I worry that a threshold of 0.4 across all cell types is too low. I also noticed that, although the correlation is only 0.4, the p-value of these eRegulons is very small, almost 0. Can the p-value be used as a screening criterion?
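
A note on the p-value observation: for a Pearson correlation, the p-value depends on the number of observations as well as on the correlation itself, so with thousands of cells a modest correlation can give a p-value close to 0. The sketch below illustrates this with the standard t-statistic for a correlation coefficient. The helper name `approx_pearson_p_value` and the sample sizes are hypothetical, and it uses a normal approximation to the t distribution (only reasonable for large n) to stay within the standard library.

```python
import math


def approx_pearson_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for a Pearson correlation r from n observations.

    Assumption: uses a normal approximation to the t distribution,
    which is only reasonable when n is large.
    """
    # t-statistic for testing r against zero correlation.
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    # Two-sided tail probability of a standard normal at |t|.
    return math.erfc(abs(t) / math.sqrt(2.0))


# Same correlation, different (made-up) numbers of observations.
p_small_n = approx_pearson_p_value(0.4, 30)
p_large_n = approx_pearson_p_value(0.4, 5000)
print(p_small_n, p_large_n)
```

With the correlation held at 0.4, the p-value shrinks by many orders of magnitude as n grows, so with many cells it reflects the cell count more than the strength of the correlation.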
