Phylogenetic analyses of CRF19 spread in Cuba and worldwide

evolbioinfo/HIV1-CRF19_Cuba

HIV1-CRF19_Cuba

Phylogenetic analyses of CRF19 spread in Cuba and worldwide.

Citation

Zhukova A, Voznica J, Dávila Felipe M, To T-H, Pérez L, Martínez Y, et al. (2021) Cuban history of CRF19 recombinant subtype of HIV-1. PLoS Pathog 17(8): e1009786. doi:10.1371/journal.ppat.1009786

Analysis pipelines

The snakemake folder contains Snakemake [Köster et al., 2012] pipelines for reconstructing the evolutionary history of HIV-1 in Cuba.

Data

Multiple sequence alignments

Metadata

  • data/datasets/metadata.tab contains the combined metadata for the CU and LA sequences used in these analyses (produced with the Snakefile_datasets pipeline). It contains the following columns:
    • id -- identifier of the sequence used in these analyses (corresponds to the tree tips);
    • POL Accession Number -- GenBank accession number for the pol part of the sequence;
    • ENV Accession Number -- GenBank accession number for the env part of the sequence;
    • source -- source of the data: LA or CU;
    • province_of_diagnostics -- province of diagnostics for CU sequences, abroad for LA;
    • country_code -- two-letter ISO code (ISO 3166-1 alpha-2) for the country of sampling;
    • country -- country of sampling;
    • intregion -- generalized location of sampling, includes the following values: Australia, Cuba, Eastern Africa, Eastern Asia, Middle Africa, Northern America, Northern Europe, Russia, South America, Southern Africa, Southern Asia, Southern Europe, Western Africa, Western Asia, Western Europe;
    • subregion -- generalized location of sampling, includes the following values: Asia, Australia and New Zealand, Cuba, Europe, Northern America, Russia, South America, Sub-Saharan Africa;
    • region -- generalized location of sampling, includes the following values: Africa, Americas, Asia, Cuba, Europe, Oceania;
    • sample_date -- date of sampling;
    • diagnostics_date -- date of diagnostics;
    • gender -- gender of the infected individual: F or M;
    • sexuality -- declared behavioural category: Bisexual, HT (heterosexual), or MSM (men who have sex with men);
    • treated -- treatment status: naive or treated;
    • subtype_annotated -- pre-annotated HIV-1 subtype;
    • subtype_jpHMM -- HIV-1 subtype detected by jpHMM;
    • subtype_consensus -- HIV-1 subtype annotations used in this study: pre-annotated ones when compatible with jpHMM (e.g. matching jpHMM breakpoints for CRF19), otherwise jpHMM prediction;
    • subtype_tree -- HIV-1 subtype detected by the phylogenetic trees in this study (same as the consensus for the majority of the sequences, see Phylogenies);
    • subtype_CRF_19_D -- whether the sequence is compatible with CRF19 in its D part according to jpHMM: D if yes, empty otherwise;
    • subtype_CRF_19_G -- whether the sequence is compatible with CRF19 in its G part according to jpHMM: G if yes, empty otherwise;
    • subtype_CRF_19_A1 -- whether the sequence is compatible with CRF19 in its A1 part according to jpHMM: A1 if yes, empty otherwise.
  • data/datasets/iTOL_colorstrip-subtype.txt contains an iTOL-compatible colourstrip representing HIV-1 subtype detected by the phylogenetic trees in this study;
  • data/datasets/D_CRF_19/metadata.drms.tab and data/datasets/D_CRF_19/metadata.drugs.tab contain the surveillance drug resistance mutation (DRM) and antiretroviral (ARV) metadata for the D+CRF_19 sequences, extracted with Sierra (see the Snakefile_datasets pipeline);
  • data/datasets/D_CRF_19/lsd2.dates, data/datasets/A1_CRF_19/lsd2.dates, data/datasets/G_CRF_19/lsd2.dates contain the dates and constraints used to date the D/A1/G phylogenies with LSD2.
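
The metadata columns described above can be explored with a few lines of Python. The following is a minimal sketch, not part of the repository's pipelines: it assumes that metadata.tab is tab-delimited with a header row (suggested by the extension, but not stated here), and the helper names read_metadata, count_by and crf19_parts are hypothetical. The two example rows are made up for illustration and are not taken from the real dataset.

```python
import csv
import io
from collections import Counter

# Column names as listed in the metadata description above.
CRF19_COLUMNS = ("subtype_CRF_19_D", "subtype_CRF_19_G", "subtype_CRF_19_A1")


def read_metadata(handle):
    """Read a tab-delimited table (assumed format) into a dict keyed by the id column."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["id"]: row for row in reader}


def count_by(metadata, column):
    """Count non-empty values of one column across all sequences."""
    return Counter(row[column] for row in metadata.values() if row[column])


def crf19_parts(row):
    """List the CRF19 parts (D, G, A1) a sequence is marked as compatible with."""
    return [row[col] for col in CRF19_COLUMNS if row.get(col)]


# Made-up rows with a subset of the columns, for illustration only.
example = (
    "id\tsource\tcountry_code\tsubtype_CRF_19_D\tsubtype_CRF_19_G\tsubtype_CRF_19_A1\n"
    "seq1\tCU\tCU\tD\tG\tA1\n"
    "seq2\tLA\tES\tD\t\t\n"
)
metadata = read_metadata(io.StringIO(example))

print(count_by(metadata, "source"))
print(crf19_parts(metadata["seq1"]))
print(crf19_parts(metadata["seq2"]))
```

To run the same helpers on the real file, replace the io.StringIO example with open("data/datasets/metadata.tab", newline=""), provided the file does follow the assumed tab-delimited layout.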

Phylogenies

Timetrees

BEAST
