Save PCs for projection #3759
Replies: 5 comments
-
Note The following post was exported from discuss.hail.is, a forum for asking questions about Hail which has since been deprecated. (Oct 10, 2016 at 05:36) jbloom said:Thanks for the request! In fact, you can do this already using aggregables in the Hail expression language. Run the following from the cloned mkdir example
# assign samples at random to ref panel or study group
hail \
importvcf src/test/resources/sample.vcf \
splitmulti \
annotatesamples expr -c 'sa.inRefPanel = pcoin(.5)' \
write -o example/sample.vds
hail \
read -i example/sample.vds \
filtersamples expr -c 'sa.inRefPanel' --keep \
write -o example/refPanel.vds
hail \
read -i example/sample.vds \
filtersamples expr -c 'sa.inRefPanel' --remove \
write -o example/study.vds
# compute and export AF and PC loadings for reference panel
hail \
read -i example/refPanel.vds \
variantqc \
exportvariants -c 'Variant = v, refPanelAF = va.qc.AF' \
-o example/refPanelAF.tsv \
pca -k 3 \
-o example/refPanel.scores.tsv \
-l example/refPanel.loadings.tsv \
-e example/refPanel.evals.tsv
# annotate variants with AF and PC loadings from reference panel
# combine loadings into array annotation
# annotate samples with PC scores, random sex, random case-control status
# run linear regression of case-control status against sex and PC scores
# compute and export polygenic risk
hail \
read -i example/study.vds \
variantqc \
annotatevariants table example/refPanelAF.tsv \
-e Variant \
-c 'va.refPanelAF = table.refPanelAF' \
--impute \
annotatevariants table example/refPanel.loadings.tsv \
-e 'Variant(chrom, pos, ref, alt)' \
-t 'chrom: String' \
-c 'va = merge(va, select(table, PC1, PC2, PC3))' \
--impute \
annotatevariants expr -c 'va.PCs = [va.PC1, va.PC2, va.PC3]' \
annotatesamples expr -c 'sa.PCs = gs.map(g => let p = va.refPanelAF in if (p == 0 || p == 1) [0.0, 0.0, 0.0] else (g.gt - 2 * p) / sqrt(2 * p * (1 - p)) * va.PCs).sum(), sa.isFemale = pcoin(.5), sa.isCase = pcoin(.5)' \
linreg \
-y sa.isCase \
-c sa.isFemale,sa.PCs[0],sa.PCs[1],sa.PCs[2] \
exportsamples -c 'Sample = s, polyRisk = gs.map(g => g.gt.toDouble.orElse(2 * va.qc.AF) * va.linreg.beta).sum()' \
-o example/polyRisk.tsv \
printschema \
write -o example/study.pcs.linreg.risk.vds As described in the PCA docs, for each sample, the projection is given by the product of the 1 x m vector of standardized genotypes from the study and the m x 3 matrix of PC loadings from the reference panel: 'sa.PCs = gs.map(g => let p = va.refPanelAF in if (p == 0 || p == 1) [0.0, 0.0, 0.0] else (g.gt - 2 * p) / (2 * p * (1 - p)) * va.PCs).sum()' Following linear regression and the risk computation, the key elements of the resulting annotation schema in Sample annotation schema:
sa: Struct {
...
PCs: Array[Double],
isFemale: Boolean,
isCase: Boolean
}
Variant annotation schema:
va: Struct {
...
PCs: Array[Double],
linreg: Struct {
beta: Double,
se: Double,
tstat: Double,
pval: Double
} Note that polygenic risk does not appear as a sample annotation because we directly exported the result of the computation; we could have instead annotated by polygenic risk and then export of the resulting annotation. We have several changes coming that should improve this workflow and will update this thread accordingly. |
Beta Was this translation helpful? Give feedback.
-
Note The following post was exported from discuss.hail.is, a forum for asking questions about Hail which has since been deprecated. (Oct 12, 2016 at 17:09) bavila said:This works well. I think the genotypes should be normalized as |
Beta Was this translation helpful? Give feedback.
-
Note The following post was exported from discuss.hail.is, a forum for asking questions about Hail which has since been deprecated. (Oct 14, 2016 at 00:40) jbloom said:Yes, made the change. And more are coming since we just changed PCA to directly annotate rather than export. |
Beta Was this translation helpful? Give feedback.
-
Note The following post was exported from discuss.hail.is, a forum for asking questions about Hail which has since been deprecated. (May 12, 2020 at 15:52) danking said: |
Beta Was this translation helpful? Give feedback.
-
Note The following post was exported from discuss.hail.is, a forum for asking questions about Hail which has since been deprecated. (May 12, 2020 at 15:53) danking said: |
Beta Was this translation helpful? Give feedback.
-
Note
The following post was exported from discuss.hail.is, a forum for asking questions about Hail which has since been deprecated.
(Oct 06, 2016 at 19:01) bavila said:
I want to save the PCs from a (reference) set I run a PCA on, then project another set of samples onto that PC space.
Beta Was this translation helpful? Give feedback.
All reactions