All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Note that CoMorMent containers are organized using several GitHub repositories:
- https://github.com/comorment/containers - .sif files, public reference data, documentation, common scripts
- https://github.com/comorment/reference - private reference data with access restricted to CoMorMent collaborators
All of the above repositories are covered by this CHANGELOG. They will have the same version tags on GitHub. In addition, we have repositories containing specific tools, e.g. https://github.com/comorment/HDL, which will be covered by their own CHANGELOG.md file.
To identify the version of a .sif file, run the `md5sum <container>.sif` command and find the MD5 checksum in the list below.
If no MD5 checksum is listed for a release, the container has not changed since the previous release.
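The checksum lookup described above can also be scripted. The following is a minimal Python sketch, not part of the project's tooling: the function names `md5_of_file` and `identify_container` are hypothetical, and the `KNOWN_CHECKSUMS` table holds only two of the checksums listed in this file for illustration.

```python
import hashlib

# Illustrative subset of the checksums listed in this changelog
# (assumption: extend this table by hand with the entries you care about).
KNOWN_CHECKSUMS = {
    "4e295149f3a5e25588cc4a1f1d39876c": "singularity/gwas.sif",
    "3d69fc2168ef98d1eda3da05391cd6e4": "singularity/r.sif",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Container images are large, so avoid loading them into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identify_container(path, known=KNOWN_CHECKSUMS):
    """Return the changelog entry matching the file's checksum, or None."""
    return known.get(md5_of_file(path))
```

Calling `identify_container("gwas.sif")` returns the matching entry from the table, or `None` when the checksum is not listed there.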
- Add R packages AER, MendelianRandomization, gwasurvivr
- Add R packages lightgbm, EFAtools, RiskScorescvd, glmnet, survival, caret, PooledCohort, genio, HyPrColoc
- Add Python3 packages miniwdl, miniwdl-slurm, dxpy
- Add unit test runs as part of the GitHub Actions workflow for building Docker containers
- Add Python packages `imbalanced-learn`, `lightgbm`, `openpyxl` + `PRSice_linux` binary to `python3.sif` container
- Add Conda environment file for project dependencies
- Add Python packages `scikit-survival`, `pandas-plink`, `numba`, `xmltodict`, `pyliftover`, `configparser`, `intervaltree` to `python3.sif` container
- Add `Haplin`, `WSpiller/MVMR`, `noahlorinczcomi/MRBEE` R packages to `r.sif` container
- Add container build and push actions for all containers:
  - Action should trigger builds on pushes and pull requests targeting the main branch.
  - Should build and push Docker and Singularity images for new tags with `v*.*.*` pattern in main branch.
- Revise installation and usage documentation for images.
- Buttons added to README.md for Docker build status.
- Added options `--extract`, `--extract-step1`, `--extract-step2`, `--exclude`, `--exclude-step1`, and `--exclude-step2` to the `gwas.py` script to enable inclusion and exclusion of SNPs
- Added support for additional customization through `config.yaml` file for association analyses
- Added Rstudio-server and R packages info to `r.sif` container documentation
- update R to 4.4.1 in `r.sif` container (from 4.0.5); update R packages to Posit/CRAN/BioConductor dated 2024.09.01; BioConductor version 3.19 (from 3.12)
- update testing scripts to support both Docker and Singularity containers
- Update REGENIE binary to version 3.6 in `gwas.sif` container
- Update LDAK binary to version 6 in `gwas.sif` (from 5.2)
- Rebuilt `gwas.sif` container with md5sum checksum: `4e295149f3a5e25588cc4a1f1d39876c singularity/gwas.sif`
- Compile regenie with `HAS_BOOST_IOSTREAM=1` and `HTSLIB_PATH` options
- Change LDpred2 usage example to use the OpenSNP based datasets
- Bundle of sphinx documentation build updates/restructures
- Refer to the project as "COSGAP-containers"
- Minor changes to documentation + suggestion of TOC
- migrate online documentation to cosgap.readthedocs.io
- updated documentation to reflect the new project name
- added references/urls to software tables in the documentation for singularity containers
- update citation info
- Fixed missing ORAS CLI with `ubuntu-latest` runners in GitHub Actions
- Fixed broken unit test in `tests/test_LDpred2/scripts/ld.sh`
- Fixed broken unit test `tests/test_gwas.py::test_gwas_metal` with Apptainer "sandbox" mode
- Workaround for pandas import before scipy in python codes via `export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH`
- Fixed brittle tests if `TMPDIR` is not `/tmp`
- Removed Saige support and Saige-related files
- Fixed parsing of `IID` field in `pheno.dict`
- Fixed issue with files with different suffixes produced by plink2 for binary phenotypes in `gwas.py`
- Added scripts to analyze and filter bigSNPR LD matrices (`scripts/pgs/LDpred2/analyzeLD.R`, `scripts/pgs/LDpred2/splitLD.R`).
- Rebuilt `r.sif` container with md5sum checksum: `3d69fc2168ef98d1eda3da05391cd6e4 singularity/r.sif`
- added `CC-GWAS` R package to `r.sif` container
- Fixed parsing of `--genomic-build hg18/hg38` in `ldpred2.R`
- Added `samtools 1.19.2`, `bedtools 2.31.1`, `liftOver (latest)` to `gwas.sif` container
- Added corresponding unit tests
- Updated the following binaries (not listing apt package updates) in the `gwas.sif` build:
  - bcftools to 1.19
  - bolt to 2.4.1
  - gcta to 1.94.1
  - gctb to 2.04.3
  - htslib to 1.19.1
  - king to 2.3.2
  - minimac4 to 4.1.6
  - plink to v1.90b7.2 64-bit (11 Dec 2023)
  - plink2 to v2.00a5.10LM 64-bit Intel (5 Jan 2024)
  - plink2_avx2 to v2.00a5.10LM AVX2 Intel (5 Jan 2024)
  - PRSice_linux to 2.3.5
  - regenie to 3.4.1
  - vcftools to git SHA: d511f469e87c2ac9779bcdc3670b2b51667935fe (0.1.17dev)
- Rebuilt `gwas.sif` w. md5sum checksum: `a775f4216b15b731471821d0c2a0da43 singularity/gwas.sif`
- updated installer scripts
- Broken `docker/scripts/build_docker.sh` script
- Added `gdb` debugger, `ldak` and `snptest` binaries to `gwas.sif` container
- Added tests for `ldak` and `snptest` binaries in `gwas.sif` container
- updated `metal` to version `2020-05-05` in `gwas.sif`
- updated `qctool` to `v2.2.2` and added related binaries `inthinnerator`, `hptest`, `ldbird` and `selfmap` to `gwas.sif`
- rebuilt `gwas.sif` (md5 checksum `b6104b58d21f862f9d61a86d9d4802a6`)
- Fixed broken ReadTheDocs documentation build
- Added `<containers>/scripts/pgs/pgs_toolkit`, a Python toolkit for computing PGS using LDpred2, PRSice2 or PLINK
- Added `<containers>/docker/scripts/build_docker.sh` script replacing corresponding build statement in `Makefile`
- Added test for `gcta`
- Updated `r.sif` build with many additional R packages, with corresponding updates to build recipes and tests
- Use `https://packagemanager.posit.co/cran/__linux__/focal/2023-02-16` as main R package repo
- `r.sif` md5 checksum: `1280ba24d99664d450b2e4c4a9c00587 singularity/r.sif`
- Updated GitHub workflow versions to current versions
- removed logging of `docker build ...` in `docker/Makefile` (issues with piping to `tee` in case of build errors)
- Added phasing/imputation tools `beagle`, `duohmm`, `eagle`, `shapeit5`, `switchError` to `gwas.sif` container + updated tests
- Fix issue that shell script wouldn't capture failing statements
- Updated `gwas.sif` Dockerfile and installed shell scripts (misc. dependencies updates, installing `gcta` version 1.93.3beta2)
- Rebuilt `gwas.sif` using Docker `--no-cache` option to fix missing `minimac4` binary, w. md5 checksum: `a1dd235221902741bf5773945a584e47 singularity/gwas.sif`
- Removed unused `install_miniconda.sh` script from `src/scripts` folder
- User-set directory option for temporary files during LDpred2 runs, by default `base::tempdir()`
- Added `--genomic-build hg18/hg19/hg38` option to `ldpred2.R` to use correct LD reference meta file `pos` column name
- Added a feature to read and convert BGEN (.bgen) files to `scripts/pgs/LDpred2/createBackingFile.R`
- User-set directory for temporary files during LDpred2 runs, by default `base::tempdir()`
- Ignore LDpred2 `--col-bp <column>` arg in case `--merge-by-rsid` is used
- Updated LDpred2 README file
- Update regenie to v3.2.8
- #187 - Regression in gwas.py in handling of info, maf, hwe and geno filters
- Removed time-consuming genotype missingness check from `ldpred2.R`.
- Fixed misc. issues with cross references in online documentation
- Added unittest for uppercase chromosome column name in sumstats files, that may also contain chromosomes encoded as character(s)
- Fixed issue with character encoding in sumstats files, in case chromosome column name is uppercase.
- Added to `ldpred2.R`: Multi-threading of `snp_ldsc`, arguments for parameters to `snp_ldpred2_auto`, and alternative effective sample-size calculation through `--n-cases` and `--n-controls`.
- Solved error due to case-sensitive handling of `--col-chr` in `ldpred2.R` and naming of diagnostic plot when using `--name-score`.
- Added `RELEASES.md` file explaining steps needed to make releases.
- Added `PRSice_linux` to `r.sif`
- Added tests for `gwas.py`
- Added package `GWASTools` to `r.sif`.
- Added confidence intervals to qq plots created by `gwas.py` using `GWASTools` R package.
- Added status badges and citation.cff file
- Updated file and folder layout, fixing minor documentation issues. Moving from `m2r2` to `Myst-parser` for Sphinx-generated online docs.
- Rebuilt the R container
- `5ecbfc50f96bc6b25f61858927283e2d singularity/r.sif`
- Rebuilt the R container: `23d195a10b84603b15d0e8c42df40fbd singularity/r.sif`
- Set version file info to 1.2.dev (was 0.1.1dev)
- Fixed bad parsing of arbitrary length list of args in `usecases/LDpred2/complementSumstats.R`
- Made `usecases/LDpred2/complementSumstats.R` write output file by default, not stdout.
- Fixed print statement in `usecases/LDpred2/complementSumstats.R` causing crash w. `--file-output` arg.
- Fixed `ldpred2.R` script in case `--file-pheno`/`--col-pheno`/`--col-pheno-from-fam` args were used, by removing these options altogether.
- Use packagemanager.rstudio.com/cran/linux/focal/2023-02-16 as main R package repo
- `gwas.py --variance-standardize` option now throws an error when applied to columns with no variance
- Removed redundant `usecases/LDpred2_tutorial` files
- Python code max line length of 120 chars, ignore number of newlines between functions
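The `gwas.py --variance-standardize` entry above describes an error being raised for columns with no variance. The reason such a check is needed is that standardization divides by the standard deviation, which is zero for a constant column. The snippet below is a minimal sketch of that idea only; the function name `variance_standardize` is hypothetical, the use of the population standard deviation is an assumption, and this is not the code used in `gwas.py`.

```python
import statistics


def variance_standardize(values):
    """Scale values to zero mean and unit (population) variance.

    Raises ValueError for a constant column, since dividing by a zero
    standard deviation is undefined.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot variance-standardize a column with no variance")
    return [(value - mean) / std for value in values]
```

Failing loudly here avoids silently writing `NaN` or infinite values into a phenotype or covariate column.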
Maintenance/feature release with the following main software incorporated into each container:
container | OS/tool | version | license |
---|---|---|---|
hello.sif | ubuntu | 20.04 | Creative Commons CC-BY-SA version 3.0 UK licence |
hello.sif | plink | v1.90b6.18 64-bit (16 Jun 2020) | GPLv3 |
gwas.sif | ubuntu | 20.04 | Creative Commons CC-BY-SA version 3.0 UK licence |
gwas.sif | plink | v1.90b6.18 64-bit (16 Jun 2020) | GPLv3 |
gwas.sif | plink2 | v2.00a3.6LM 64-bit Intel (14 Aug 2022) | GPLv3 |
gwas.sif | plink2_avx2 | v2.00a3.6LM AVX2 Intel (24 Jan 2020) | GPLv3 |
gwas.sif | PRSice_linux | 2.3.3 (2020-08-05) | GPLv3 |
gwas.sif | simu_linux | v0.9.4 | GPLv3 |
gwas.sif | bolt | v2.4 July 22, 2022 | GPLv3 |
gwas.sif | gcta64 | version 1.93.2 beta Linux | GPLv3 |
gwas.sif | gctb | 2.02 | MIT |
gwas.sif | qctool | 2.0.6, revision 18b8f17 | Boost |
gwas.sif | king | 2.2.9 - (c) | permissive |
gwas.sif | metal | version released on 2011-03-25 | - |
gwas.sif | vcftools | 0.1.17 | GPLv3 |
gwas.sif | bcftools | 1.12 (using htslib 1.12) | MIT/Expat/GPLv3 |
gwas.sif | flashpca_x86-64 | 2.0 | GPLv3 |
gwas.sif | regenie | v2.0.2.gz | MIT/Boost |
gwas.sif | GWAMA | 2.2.2 | BSD-3-Clause |
gwas.sif | minimac4 | v4.1.0 | GPLv3 |
gwas.sif | bgenix | 1.1.7 | Boost |
gwas.sif | cat-bgen | same version as bgenix | Boost |
gwas.sif | edit-bgen | same version as bgenix | Boost |
gwas.sif | HTSlib | 1.12 | MIT/Expat/Modified-BSD |
gwas.sif | shapeit4.2 | v4.2.2 | MIT |
python3.sif | ubuntu | 20.04 (LTS) | Creative Commons CC-BY-SA version 3.0 UK licence |
python3.sif | python3 | python 3.10.6 + numpy, pandas, etc. | PSF |
python3.sif | LDpred | 1.0.11 | MIT |
python3.sif | python_convert | github commit bcde562 | GPLv3 |
python3.sif | plink | v1.90b6.18 64-bit (16 Jun 2020) | GPLv3 |
r.sif | ubuntu | 20.04 | Creative Commons CC-BY-SA version 3.0 UK licence |
r.sif | R | 4.0.5 (2021-03-31) + data.table, ggplot, etc. | misc |
r.sif | gcta64 | version 1.93.2 beta Linux | GPLv3 |
r.sif | PRSice_linux | 2.3.3 (2020-08-05) | GPLv3 |
r.sif | rareGWAMA | dajiangliu/rareGWAMA@72e962d | - |
r.sif | GenomicSEM | GenomicSEM/GenomicSEM@bcbbaff | GPLv3 |
r.sif | TwoSampleMR | MRCIEU/TwoSampleMR@c174107 | unknown/MIT |
r.sif | GSMR | v1.0.9 | GPL>=v2 |
r.sif | snpStats | v1.40.0 | GPLv3 |
saige.sif | ubuntu | 16.04 | Creative Commons CC-BY-SA version 3.0 UK licence |
saige.sif | SAIGE | version 0.43 | GPLv3 |
Main changes since release version 1.0.0:
- add option to append `usecases/LDpred2/ldpred.R` score output to an existing file
- add script `usecases/LDpred2/complementSumstats.R` to append chromosome and position to summary statistics
- add polygenic score output tests for `usecases/LDpred2/ldpred.R`
- add `usecases/LDpred2/imputeGenotypes.R` for imputing genotypes using R-package bigSNPR
- add `usecases/LDpred2/calculateLD.R` for calculating LD using R-package bigSNPR.
- add autobuilt online documentation from repository sources at https://comorment-containers.readthedocs.io/en/latest/
- add R libraries for LDpred2 analysis to `r.sif` + corresponding example.
- add tests for `metal` and `qctool` in `gwas.sif` build
- add basic GitHub actions from https://github.com/precimed/container_template.git
- add `FaST-LMM` (version 0.6.3) to future `python3.sif`, and corresponding test
- add `shapeit4.2` binary (shapeit4 v.4.2.2) and HTSlib (1.11) to future `gwas.sif` builds, and corresponding test
- added additional tests for software in `gwas.sif`, `python3.sif` builds
- add version identifiers for all explicitly installed software across `hello.sif`, `gwas.sif`, `python3.sif`, `r.sif`, listed in docker/README.md
- replaced Ubuntu 18.04 with 20.04 (LTS) as base image for `hello.sif`, `gwas.sif`, `python3.sif`
- replaced `src/scripts/install_miniconda3.sh` by `src/scripts/install_mambaforge.sh` which is now used in future `python3.sif` builds
- add tests for bgenix and Minimac4 software in `gwas.sif`, removing build-time dependencies for these from container
- add basic test that KING software runs in `gwas.sif`
- add Dockerfiles and install scripts for `gwas.sif`, `hello.sif`, `python3.sif`, `r.sif`, `saige.sif` from gwas.
- add CHANGELOG.md (this file)
- add `gwas.py --analysis saige` option, allowing to run SAIGE analysis
- add `gwas.py --analysis figures` option, using R qqman for QQ and manhattan plots
- add `gwas.py --pheno-sep` and `--dict-sep` options to specify delimiter for the phenotype file and phenotype dictionary file
- add package `qqman` to `r.sif`
- add package `yaml` to `python3.sif`
- add `gctb_2.0_tutorial.zip` reference files under `reference/examples/gctb_2.0_tutorial`
- add `config.yaml` file with configuration options, which can be specified via `gwas.py --config` option
- add `--chunk-size-bp` and `--bim` options, allowing to run SAIGE analysis in smaller chunks
- add `--keep` and `--remove` options to `gwas.py`, allowing to keep and remove subsets of individuals from analysis; these options work similarly to the corresponding plink2 options.
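To illustrate what running an analysis "in smaller chunks" by base-pair size can mean, here is a minimal sketch that splits a position range into consecutive windows. The function name `chunk_ranges` and the inclusive-coordinate convention are assumptions made for this example; it does not reproduce how `gwas.py` implements `--chunk-size-bp`.

```python
def chunk_ranges(start_bp, end_bp, chunk_size_bp):
    """Split the inclusive range [start_bp, end_bp] into consecutive windows.

    Each window covers at most chunk_size_bp positions; the last window
    may be shorter.
    """
    if chunk_size_bp <= 0:
        raise ValueError("chunk_size_bp must be positive")
    ranges = []
    chunk_start = start_bp
    while chunk_start <= end_bp:
        chunk_end = min(chunk_start + chunk_size_bp - 1, end_bp)
        ranges.append((chunk_start, chunk_end))
        chunk_start = chunk_end + 1
    return ranges
```

For example, `chunk_ranges(1, 10, 4)` yields `[(1, 4), (5, 8), (9, 10)]`, and each window could then be submitted as a separate job.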
- rebuilt the following containers following version pinning in Dockerfiles, install scripts, etc. (see above additions):
  - `bb7a8e0b977e29e03067d75d19803913 singularity/gwas.sif`
  - `11ac9e8fe69df07d650bd5e1e7cdeee5 singularity/hello.sif`
  - `c78d57397471ee802d37837ca5f8b797 singularity/python3.sif`
  - `e8f26b23a8b44f15f3dfff2b02623780 singularity/r.sif`
  - `a3f1d8411e1e3cf8670551b7f334a58d singularity/saige.sif`
- `usecases/LDpred2/ldpred2.R` error when sumstats contain characters in chromosome column.
- use `afterok` spec instead of `afterany` in SLURM dependencies so that next steps of the pipeline don't run if a previous step has failed (fix #26)
- use SLURM's `cpus_per_task=1` for SAIGE step2, because it doesn't support `--nThreads` (see saigegit/SAIGE#9)
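The difference between `afterok` and `afterany` mentioned above matters when chaining jobs: `afterok` starts the dependent job only if the listed jobs finished successfully, whereas `afterany` starts it once they have terminated, regardless of exit status. The following sketch shows how such a dependency can be expressed on an `sbatch` command line. The helper name `sbatch_command` and the script names are hypothetical, and this is not the code `gwas.py` uses to generate its SLURM scripts.

```python
def sbatch_command(script, depends_on=None, dependency_type="afterok"):
    """Build an sbatch argument list, optionally chained to earlier job IDs."""
    # --parsable makes sbatch print just the job ID, which is convenient
    # when the ID is needed for the next step's dependency.
    command = ["sbatch", "--parsable"]
    if depends_on:
        job_ids = ":".join(str(job_id) for job_id in depends_on)
        command.append(f"--dependency={dependency_type}:{job_ids}")
    command.append(script)
    return command


# Hypothetical two-step pipeline: step2 should run only if job 101 succeeded.
step2 = sbatch_command("step2.sh", depends_on=[101])
```

The resulting list can be passed to `subprocess.run` on a system where SLURM is installed.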
- removed `--geno-impute` from `usecases/LDpred2/ldpred2.R`. Functionality replaced by `--geno-impute-zero` and `usecases/LDpred2/imputeGenotypes.R`
- removed misc. source/data files in /tools/* from container builds
- removed unused `libquadmath0` library from builds (affecting future `gwas.sif`, `hello.sif`, and `python3.sif` builds)
- the following command-line options are removed; instead, they can be specified via `config.yaml` file: `--slurm-job-name`, `--slurm-account`, `--slurm-time`, `--slurm-cpus-per-task`, `--slurm-mem-per-cpu`, `--module-load`, `--comorment-folder`, `--singularity-bind`. Note that `config.yaml` file is now required.
- `gwas.py --analysis loci manh qq` options are removed (fix #22)
- `--bed-fit`, `--bed-test`, `--bgen-fit`, `--bgen-test` options of `gwas.py` are removed; use new options `--geno-fit-file` and `--geno-file` instead
- remove `regenie.sif` and `regenie3.sif`, because regenie software is also included in `gwas.sif`
- remove MiXeR package from `python3.sif` container, because MiXeR is now available as a separate container (https://github.com/comorment/mixer). This is also where you will find MiXeR's use-cases.
- MAGMA, LAVA and ldblock software is moved to https://github.com/comorment/magma. MAGMA reference files are also moved to this repository.
- ipsychcnv.sif and enigma-cnv.sif are moved to https://github.com/comorment/iPsychCNV; enigma-cnv.sif is also available at https://github.com/ENIGMA-git/ENIGMA-CNV/tree/main/CNVCalling/containers
- tryggve_query.sif is moved to https://github.com/comorment/Tryggve_psych
- `matlabruntime.sif` container is moved to https://github.com/comorment/matlabruntime. pleioFDR reference files are also moved to this repository.
- initial release of the following containers:
  - `70502c11d662218181ac79a846a0937a enigma-cnv.sif`
  - `1ddd2831fcab99371a0ff61a8b2b0970 gwas.sif`
  - `b02fe60c087ea83aaf1b5f8c14e71bdf hello.sif`
  - `1ab5d82cf9d03ee770b4539bda44a5ba ipsychcnv.sif`
  - `6d024aed591d8612e1cc628f97d889cc ldsc.sif`
  - `2e638d1acb584b42c6bab569676a92f8 matlabruntime.sif`
  - `331688fb4fb386aadaee90f443b50f8c python3.sif`
  - `cdbfbddc9e5827ad9ef2ad8d346e6b82 r.sif`
  - `b8c1727227dc07e3006c0c8070f4e22e regenie.sif`
  - `97f75a45a39f0a2b3d728f0b8e85a401 regenie3.sif`
  - `20e01618bfb4b0825ef8246c5a63aec5 saige.sif`
  - `5de579f750fb5633753bfda549822a32 tryggve_query.sif`
Here is the list of tools available in prebuilt containers:
container | tool | version |
---|---|---|
hello.sif | demo example | |
gwas.sif | plink | v1.90b6.18 64-bit (16 Jun 2020) |
gwas.sif | plink2 | v2.00a2.3LM 64-bit Intel (24 Jan 2020) |
gwas.sif | plink2_avx2 | v2.00a2.3LM AVX2 Intel (24 Jan 2020) |
gwas.sif | PRSice_linux | 2.3.3 (2020-08-05) |
gwas.sif | simu_linux | Version v0.9.4 |
gwas.sif | bolt | v2.3.5 March 20, 2021 |
gwas.sif | gcta64 | version 1.93.2 beta Linux |
gwas.sif | gctb | GCTB 2.02 |
gwas.sif | qctool | version: 2.0.6, revision 18b8f17 |
gwas.sif | king | KING 2.2.6 - (c) |
gwas.sif | metal | version released on 2011-03-25 |
gwas.sif | vcftools | VCFtools (0.1.17) |
gwas.sif | bcftools | Version: 1.12 (using htslib 1.12) |
gwas.sif | flashpca_x86-64 | flashpca 2.0 |
gwas.sif | regenie | REGENIE v2.0.2.gz |
gwas.sif | GWAMA | GWAMA_v2.2.2.zip |
gwas.sif | magma | magma_v1.09a_static.zip |
gwas.sif | shapeit2 | Version : v2.r904 |
gwas.sif | impute4 | impute4.1.2_r300.3 |
gwas.sif | minimac4 | Version: 1.0.2; Built: Fri Sep 3 13:25:51 |
gwas.sif | bgenix | version: 1.1.7, revision |
gwas.sif | cat-bgen | same version as bgenix |
gwas.sif | edit-bgen | same version as bgenix |
python3.sif | python3 | python 3.10 + standard packages (numpy, pandas, etc) |
python3.sif | ldpred | ? |
python3.sif | mixer | mixer v1.3 |
python3.sif | python_convert | github commit bcde562f0286f3ff271dbb54d486d4ca1d40ae36 |
r.sif | R | version 4.0.3 + standard packages (data.table, ggplot, etc) |
r.sif | seqminer | ? |
r.sif | rareGWAMA | ? |
r.sif | GenomicSEM | ? |
r.sif | TwoSampleMR | ? |
r.sif | GSMR | v1.0.9 |
r.sif | LAVA | ? |
r.sif | LAVA partitioning | ? |
saige.sif | SAIGE | version 0.43 |
enigma-cnv.sif | PennCNV | version 1.0.5 |
ldsc.sif | LDSC | version 1.0.1 |
ipsychcnv.sif | ???? | missing Dockerfile |
matlabruntime.sif | ???? | work in progress |
regenie.sif | ???? | ? |
regenie3.sif | ???? | ? |