
v2.0.0

@kevinlibuit kevinlibuit released this 16 Feb 23:47
· 104 commits to main since this release
a6df039

This major release renames workflows to utilize the TheiaCoV tag (previously Titan) and adds five new workflows for public health viral genomics.

Workflow names changed and modifications made:

  • Titan_Augur_Prep → TheiaCoV_Augur_Prep
  • Titan_Augur_Run → TheiaCoV_Augur_Run
    • Allow subsampling via user-defined builds.yml file
    • Update default nextstrain docker image (nextstrain/base:build-20210127T135203Z → nextstrain/base:build-20210218T081251)
  • Titan_ClearLabs → TheiaCoV_ClearLabs
    • Update default consensus task docker container image (quay.io/staphb/artic-ncov2019:1.3.0 → quay.io/staphb/artic-ncov2019:1.3.0-medaka-1.4.3)
      • Note: quay.io/staphb/artic-ncov2019:1.3.0 & quay.io/staphb/artic-ncov2019-epi2me are both compatible alternative docker images
    • Use of fastq-scan rather than fastqc to calculate number of reads and pairs
    • Allow for use of a user-defined reference genome for consensus genome assembly
      • reference_genome consensus task input variable
  • Titan_Illumina_PE → TheiaCoV_Illumina_PE
    • Default minimum coverage changed from 20x to 100x (ivar consensus and ivar variants tasks)
    • Use of fastq-scan rather than fastqc to calculate number of reads and pairs
    • Allow for use of a user-defined reference genome for consensus genome assembly
      • reference_genome workflow input variable
  • Titan_Illumina_SE → TheiaCoV_Illumina_SE
    • Default minimum coverage changed from 20x to 100x (ivar consensus and ivar variants tasks)
    • Use of fastq-scan rather than fastqc to calculate number of reads and pairs
    • Allow for use of a user-defined reference genome for consensus genome assembly
      • reference_genome workflow input variable
  • Titan_ONT → TheiaCoV_ONT
    • Update default consensus task docker container image (quay.io/staphb/artic-ncov2019:1.3.0-medaka-1.4.3 → quay.io/staphb/artic-ncov2019-epi2me)
      • Note: quay.io/staphb/artic-ncov2019:1.3.0 & quay.io/staphb/artic-ncov2019:1.3.0-medaka-1.4.3 are both compatible alternative docker images
    • Use of fastq-scan rather than fastqc to calculate number of reads and pairs
    • Allow for use of a user-defined reference genome for consensus genome assembly
      • reference_genome consensus task input variable
  • Titan_FASTA → TheiaCoV_FASTA
  • Titan-GC → TheiaCoV-GC
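
To picture what raising the default minimum coverage from 20x to 100x means for the Illumina workflows, here is a minimal Python sketch. It assumes the threshold acts as a per-position depth cutoff below which the consensus base is masked as N. The function name mask_low_depth and the toy data are hypothetical, and this is an illustration rather than the ivar implementation.

```python
def mask_low_depth(bases, depths, min_depth):
    """Return a consensus string with positions below min_depth replaced by N.

    Hypothetical helper for illustration only; not part of ivar or TheiaCoV.
    """
    if len(bases) != len(depths):
        raise ValueError("bases and depths must have the same length")
    return "".join(b if d >= min_depth else "N" for b, d in zip(bases, depths))


# Toy data: six called bases and the read depth observed at each position.
bases = "ACGTAC"
depths = [250, 120, 99, 20, 19, 100]

old_default = mask_low_depth(bases, depths, min_depth=20)   # previous default
new_default = mask_low_depth(bases, depths, min_depth=100)  # v2.0.0 default

print(old_default)
print(new_default)
```

Under this assumption, samples with moderate sequencing depth may show more masked positions with the new default than with the old one.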

Workflows Added:

  • TheiaCoV_Validate
    • Workflow that allows for the rapid comparison of critical output values generated by differing versions of TheiaCoV workflows for SARS-CoV-2 genomic characterization for bioinformatics validation purposes
  • TheiaCoV_DistanceTree
    • Workflow that allows for Augur distance trees to be generated without refinement
  • Workflows for SARS-CoV-2 Wastewater Data Analysis
    • Freyja_FASTQ
      • Workflow that allows running of the Freyja software with raw paired-end fastq files
        • This workflow generates the alignment required as input to the freyja variants command; the resulting output is then analyzed with freyja demix
    • Freyja_Plot
      • Workflow to visualize Freyja outputs using the freyja plot command
    • TheiaCoV_WWVC
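
The kind of comparison described for TheiaCoV_Validate can be pictured as a field-by-field diff of the outputs produced by two workflow versions. The sketch below is a rough Python illustration under that assumption. The function compare_outputs, the field names, and the sample values are all hypothetical and are not taken from the workflow itself.

```python
def compare_outputs(old, new, fields):
    """Return {field: (old_value, new_value)} for every field that differs.

    Hypothetical helper for illustration; not the TheiaCoV_Validate logic.
    """
    return {
        field: (old.get(field), new.get(field))
        for field in fields
        if old.get(field) != new.get(field)
    }


# Made-up outputs for one sample, as produced by two workflow versions.
previous_version = {"lineage": "BA.1", "clade": "21K", "percent_coverage": 98.7}
current_version = {"lineage": "BA.1.1", "clade": "21K", "percent_coverage": 98.7}

differences = compare_outputs(
    previous_version,
    current_version,
    fields=["lineage", "clade", "percent_coverage"],
)
print(differences)
```

An empty result would indicate that the selected values agree between the two versions.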

Other modifications:

  • Default docker images updated for Pangolin (staphb/pangolin:3.1.11-pangolearn-2021-08-24 → quay.io/staphb/3.1.20-pangolearn-2022-02-02), VADR (staphb/vadr:1.3 → quay.io/staphb/1.4.1-models-1.3-2), and Nextclade (nextstrain/nextclade:1.3.0 → nextstrain/nextclade:1.10.3), along with the Nextclade dataset tag (2021-06-25T00:00:00Z → 2022-02-07T12:00:00Z), in all TheiaCoV workflows for SARS-CoV-2 genomic characterization (TheiaCoV_ClearLabs, TheiaCoV_FASTA, TheiaCoV_Illumina_PE, TheiaCoV_Illumina_SE, and TheiaCoV_ONT)
    • NOTE: To incorporate Nextclade ≥v1.10.0, nextclade_one_sample was modified in ways that render it incompatible with older versions of Nextclade.
  • Inclusion of S-gene coverage calculation in all TheiaCoV workflows for SARS-CoV-2 genomic characterization that incorporate an alignment step (TheiaCoV_ClearLabs, TheiaCoV_Illumina_PE, TheiaCoV_Illumina_SE, and TheiaCoV_ONT)
  • Mercury_Batch now requires Array[String] (i.e. gcp_uri) for the sra_reads input (previously Array[File]); this avoids localizing the reads into the VM before moving them to the transfer bucket for SRA read submission, drastically decreasing runtime
    • This modification means that a zipped file of reads for web portal submission is no longer produced if a gcp_bucket is not specified; instead, users are encouraged to use the zip_column_content workflow from the Theiagen Terra_Utilities repository to generate these files.
  • Implementation of a repository style guide
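
As a rough illustration of the S-gene coverage calculation mentioned above, the Python sketch below computes the percentage of S-gene positions that reach a minimum read depth. It assumes the S-gene coordinates of the SARS-CoV-2 reference NC_045512.2 (21563 to 25384, 1-based and inclusive) and a per-position depth table. The function name, the min_depth parameter, and the toy data are hypothetical, and the workflows may compute this value differently.

```python
# S-gene coordinates on NC_045512.2 (1-based, inclusive).
S_GENE_START = 21563
S_GENE_END = 25384


def s_gene_percent_coverage(depths, min_depth=1,
                            start=S_GENE_START, end=S_GENE_END):
    """Percentage of positions in [start, end] with depth >= min_depth.

    depths maps a 1-based genome position to its read depth; positions
    missing from the mapping are treated as depth 0.
    Hypothetical helper for illustration; not the workflow's own code.
    """
    gene_length = end - start + 1
    covered = sum(
        1 for pos in range(start, end + 1) if depths.get(pos, 0) >= min_depth
    )
    return 100.0 * covered / gene_length


# Toy data: only the first half of the S gene has any reads.
half_point = S_GENE_START + (S_GENE_END - S_GENE_START + 1) // 2
toy_depths = {pos: 30 for pos in range(S_GENE_START, half_point)}

percent = s_gene_percent_coverage(toy_depths, min_depth=1)
print(percent)
```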