
4. Cleaning

Sander W. van der Laan edited this page Dec 10, 2019 · 2 revisions

After parsing and harmonization, the reformatted data are cleaned according to the settings in metagwastoolkit.conf. Cleaning settings include:

  • MAF, minimum minor allele frequency required to keep a variant (default: "0.005")
  • MAC, minimum minor allele count required to keep a variant (default: "30")
  • HWE, Hardy-Weinberg equilibrium p-value threshold below which variants are dropped (default: "1E-6")
  • INFO, minimum imputation quality score required to keep a variant (default: "0.3")
  • BETA, maximum effect size allowed for any variant (default: "10")
  • SE, maximum standard error allowed for any variant (default: "10")
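
To make the combined effect of these thresholds concrete, the following is a minimal Python sketch of the filtering logic, not the toolkit's actual implementation. The field names (`maf`, `mac`, `hwe_p`, `info`, `beta`, `se`), the function `passes_cleaning`, and the example variants are all hypothetical. It also assumes that thresholds are inclusive and that the BETA limit applies to the absolute effect size; the toolkit may handle either differently.

```python
# Illustrative sketch only: field names, inclusive boundaries, and the use of
# abs(beta) are assumptions, not taken from the MetaGWASToolKit source.
DEFAULTS = {
    "MAF": 0.005,   # minimum minor allele frequency
    "MAC": 30,      # minimum minor allele count
    "HWE": 1e-6,    # drop variants with an HWE p-value below this
    "INFO": 0.3,    # minimum imputation quality score
    "BETA": 10,     # maximum (absolute) effect size
    "SE": 10,       # maximum standard error
}


def passes_cleaning(variant, settings=DEFAULTS):
    """Return True if a variant (a dict of numbers) survives every filter."""
    return (
        variant["maf"] >= settings["MAF"]
        and variant["mac"] >= settings["MAC"]
        and variant["hwe_p"] >= settings["HWE"]
        and variant["info"] >= settings["INFO"]
        and abs(variant["beta"]) <= settings["BETA"]
        and variant["se"] <= settings["SE"]
    )


# Made-up example variants, one passing and three failing different filters.
variants = [
    {"id": "rs1", "maf": 0.12, "mac": 480, "hwe_p": 0.43, "info": 0.98, "beta": 0.05, "se": 0.02},
    {"id": "rs2", "maf": 0.001, "mac": 4, "hwe_p": 0.80, "info": 0.95, "beta": 0.30, "se": 0.25},
    {"id": "rs3", "maf": 0.20, "mac": 800, "hwe_p": 1e-9, "info": 0.90, "beta": 0.01, "se": 0.03},
    {"id": "rs4", "maf": 0.05, "mac": 200, "hwe_p": 0.10, "info": 0.85, "beta": -14.2, "se": 11.5},
]

kept = [v["id"] for v in variants if passes_cleaning(v)]
print(kept)  # only rs1 passes all six filters
```

In this sketch a variant must pass every filter to be kept, so rs2 is dropped on MAF and MAC, rs3 on HWE, and rs4 on BETA and SE.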

The resulting file, dataset.cdat, will be used for downstream plotting and analysis.
