Poirot RD WGS¶
Mapping reads and BAM file processing¶
When GPUs are available, Poirot can be configured to use Nvidia's Parabricks for read mapping with the fq2bam tool. This tool performs read mapping with a GPU-accelerated version of BWA-MEM, followed by sorting and duplicate marking. See the alignment hydra-genetics module or parabricks hydra-genetics module documentation for more details on the software. Default hydra-genetics settings/resources are used if no configuration is specified.
When only CPUs are available, Poirot can be configured to perform the read mapping, sorting and duplicate marking on CPU:
- read mapping with BWA-MEM
- read sorting with Samtools sort
- duplicate marking with Picard MarkDuplicates
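The three CPU steps above can be sketched as plain command lines. This is a minimal illustration, not the rules Poirot actually runs: the function name, the output file names, the thread count, the read-group string and the `picard` wrapper executable are all assumptions made for this example. The commands are only built as strings, not executed.

```python
import shlex


def cpu_mapping_commands(sample, ref, fq1, fq2, threads=8):
    """Build shell commands for map -> sort -> mark duplicates (hypothetical helper)."""
    # Read group passed to bwa mem; "\\t" stays a literal backslash-t for bwa to expand.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    sorted_bam = f"{sample}.sorted.bam"   # assumed output name
    dedup_bam = f"{sample}.dedup.bam"     # assumed output name

    bwa = ["bwa", "mem", "-t", str(threads), "-R", read_group, ref, fq1, fq2]
    # "-" tells samtools sort to read the alignments from stdin.
    sort = ["samtools", "sort", "-@", str(threads), "-o", sorted_bam, "-"]
    # Legacy Picard argument syntax (I=, O=, M=); assumes a "picard" wrapper on PATH.
    markdup = [
        "picard", "MarkDuplicates",
        f"I={sorted_bam}", f"O={dedup_bam}", f"M={sample}.dup_metrics.txt",
    ]
    return [
        shlex.join(bwa) + " | " + shlex.join(sort),
        shlex.join(markdup),
    ]


if __name__ == "__main__":
    for command in cpu_mapping_commands("sample1", "ref.fasta", "R1.fastq.gz", "R2.fastq.gz"):
        print(command)
```

Piping BWA-MEM straight into Samtools sort avoids writing an unsorted intermediate file. In the pipeline itself these steps are handled by the hydra-genetics alignment module.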
Variant Calling¶
See the snv_indels hydra-genetics module or parabricks hydra-genetics module documentation for more details on the software used for variant calling, the annotation hydra-genetics module for annotation, the filtering hydra-genetics module for filtering, and the cnv hydra-genetics module for CNV calling. Default hydra-genetics settings/resources are used if no configuration is specified.
Annotation of variant calls¶
Variant calls for SNVs, indels and SVs can be annotated with Ensembl's VEP tool. However, this is optional and can be set in the config. See the section on running the pipeline for details.
SNV and INDELs¶
- Parabricks DeepVariant when run on GPU or Google's DeepVariant when run on CPU
- GLnexus
  - Used to create a multisample VCF file that is analysed with Peddy
  - Used to create the trio VCF files used for UPD analysis
Mitochondrial short variants¶
- GATK mitochondrial short variant discovery (SNVs + indels). See the hydra-genetics documentation for the mitochondrial pipeline.
CNVs and SVs¶
- CNV callers
  - CNVpytor (see the hydra-genetics documentation) and ExomeDepth
- Structural variant callers
- Mobile elements
  - MELT, which calls ALU, HERVK, LINE1 and SVA mobile elements
- Merging and filtering of SV VCF files
  - SVDB merge is used to merge the Tiddit, Manta and CNVpytor VCF files (see the hydra-genetics documentation)
  - SVDB query is used to annotate the merged VCF with frequency information from local SV databases
  - The SVDB-merged VCF is annotated with gnomAD v4.0 AF using the Ensembl Variant Effect Predictor
  - The annotated SV VCF files are filtered on gnomAD AF and on the frequency of each SV in the local SVDB databases
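The frequency-based filtering described above can be illustrated with a small sketch that works on the INFO column of a VCF record. The INFO keys (`gnomAD_AF`, `FRQ`) and both thresholds are assumptions chosen for this example, not Poirot's configured values. VEP normally writes its annotations inside the `CSQ` field, so in practice the gnomAD value would first have to be extracted from there.

```python
def parse_info(info):
    """Parse a VCF INFO string into a dict; flag entries map to True."""
    fields = {}
    for entry in info.split(";"):
        if not entry or entry == ".":
            continue
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


def max_frequency(fields, key):
    """Largest numeric value stored under key, or 0.0 if missing or unparsable."""
    value = fields.get(key)
    if not isinstance(value, str):
        return 0.0
    numbers = []
    for item in value.split(","):
        try:
            numbers.append(float(item))
        except ValueError:
            continue  # skip "." and other non-numeric placeholders
    return max(numbers, default=0.0)


def is_rare(info, max_gnomad_af=0.01, max_local_frq=0.05,
            gnomad_key="gnomAD_AF", local_key="FRQ"):
    """Keep an SV only if it is rare in gnomAD and in the local database.

    Key names and thresholds are hypothetical; a variant without an
    annotation is treated as frequency 0 and therefore kept.
    """
    fields = parse_info(info)
    return (max_frequency(fields, gnomad_key) <= max_gnomad_af
            and max_frequency(fields, local_key) <= max_local_frq)


if __name__ == "__main__":
    for info in ("SVTYPE=DEL;gnomAD_AF=0.001;FRQ=0.01",
                 "SVTYPE=DUP;gnomAD_AF=0.2;FRQ=0.01"):
        print(info, "->", "keep" if is_rare(info) else "drop")
```

Treating a missing annotation as frequency 0 keeps novel SVs in the output, which is usually the safer default for rare disease analysis.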
Repeat expansions¶
- Calling and QC of repeat expansion calls
  - Expansion Hunter estimates the size of short tandem repeats from Illumina WGS data (see the hydra-genetics documentation)
- QC of repeat expansion calls
  - REViewer is a tool for visualising the read support for the calls made by Expansion Hunter (see the hydra-genetics documentation)
- Annotation and determination of the pathogenic status of repeats with STRanger (see the hydra-genetics documentation)
Regions Of Homozygosity¶
SMN Copy Number¶
- SMNCopyNumberCaller and [Hydra genetics documentation](https://hydra-genetics-cnv-sv.readthedocs.io/en/latest/softwares/#smncopynumbercaller)
UniParental Disomy¶
QC¶
See the qc hydra-genetics module documentation for more details on the software used for quality control. Default hydra-genetics settings/resources are used if no configuration is specified.
Poirot produces a MultiQC report for the entire sequencing run to make QC tracking easier. The report starts with a general statistics table showing the most important QC values, followed by additional QC data and diagrams. The entire MultiQC HTML file is interactive, and you can filter, highlight, hide or export data using the ToolBox at the right edge of the report.
- The MultiQC-report contains QC data from the following programs:
Coverage for genes and gene panels.¶
Results are written to an Excel spreadsheet with one tab for each gene panel.
To implement
- GATK CNV germline caller
- Continued work on SV calling and filtering
- Mobile elements
- Several sex checks
  - samtools idxstats helps with determining sex and can reveal XXY samples and females with a highly homozygous chrX (make a table with predicted sex based on this)
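As a rough sketch of how a sex prediction could be derived from `samtools idxstats` output (tab-separated columns: reference name, length, mapped reads, unmapped reads), the example below compares the read density on chrX and chrY with the autosomal mean. The function name, the contig names and both thresholds are assumptions for illustration and would need tuning on real samples. Read counts alone cannot show chrX homozygosity, so that check is not covered here.

```python
def sex_from_idxstats(text, x_name="chrX", y_name="chrY",
                      min_two_x=0.75, min_y=0.1):
    """Predict sex chromosomes from idxstats text (thresholds are hypothetical)."""
    density = {}
    for line in text.strip().splitlines():
        name, length, mapped, _unmapped = line.split("\t")
        if int(length) > 0:  # skips the trailing "*" line for unplaced reads
            density[name] = int(mapped) / int(length)

    def is_autosome(name):
        core = name[3:] if name.startswith("chr") else name
        return core.isdigit()

    autosomes = [d for n, d in density.items() if is_autosome(n)]
    if not autosomes or sum(autosomes) == 0:
        raise ValueError("no autosomal reads found in idxstats output")
    mean_auto = sum(autosomes) / len(autosomes)

    # Ratios are relative to diploid autosomes: about 1.0 for two X copies,
    # about 0.5 for one; Y is usually lower than 0.5 in practice.
    x_ratio = density.get(x_name, 0.0) / mean_auto
    y_ratio = density.get(y_name, 0.0) / mean_auto
    two_x = x_ratio >= min_two_x
    has_y = y_ratio >= min_y

    if two_x and has_y:
        call = "XXY"
    elif two_x:
        call = "XX"
    elif has_y:
        call = "XY"
    else:
        call = "unclear"
    return {"x_ratio": x_ratio, "y_ratio": y_ratio, "call": call}


if __name__ == "__main__":
    example = "chr1\t1000\t1000\t0\nchrX\t1000\t500\t0\nchrY\t1000\t400\t0\n*\t0\t0\t12\n"
    print(sex_from_idxstats(example))
```

Running this per sample and collecting the `call` values would give the predicted-sex table mentioned above.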