Getting Started
This page walks you through building the Docker image and setting up Nextflow to run the GWAS pipeline. If you are running this pipeline from ADWB, skip the Installation section and go to the Basic Usage section.
Installation
Requirements
The pipeline requires the following software and has been tested on Ubuntu Linux 18.04:
Nextflow
Docker
Java JRE 8
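A quick way to confirm the prerequisites are installed is to check that each command is on your PATH. The snippet below is a minimal sketch; the helper name check_tool is hypothetical, and it only checks presence, not versions.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
# check_tool is a hypothetical helper, not part of the pipeline.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check the three prerequisites; keep going even if one is missing.
for tool in nextflow docker java; do
    check_tool "$tool" || true
done
```

If a tool is reported as missing, install it before continuing with the steps below.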
To enable Nextflow to manage memory resources within Docker containers, set the following option in /etc/default/grub, then run sudo update-grub and reboot for the change to take effect:
GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
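To check whether the option is already present before editing the file, something like the following could be used. This is a sketch only; the function name grub_has_memory_opts is hypothetical, and it assumes both options appear on a single GRUB_CMDLINE_LINUX line.

```shell
#!/bin/sh
# Return 0 when the file sets both kernel options on a GRUB_CMDLINE_LINUX line.
# grub_has_memory_opts is a hypothetical helper, not part of the pipeline.
grub_has_memory_opts() {
    grep '^GRUB_CMDLINE_LINUX=' "$1" \
        | grep 'cgroup_enable=memory' \
        | grep -q 'swapaccount=1'
}

grub_file=/etc/default/grub
if [ -f "$grub_file" ] && grub_has_memory_opts "$grub_file"; then
    echo "options present in $grub_file"
else
    echo "options not found in $grub_file"
fi
```

On Ubuntu, edits to this file are applied to the boot configuration by sudo update-grub and take effect after a reboot.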
Clone the Repository
The repository for the pipeline can be found at https://github.com/michael-ta/longitudinal-GWAS-pipeline. Run the command below to clone it:
git clone https://github.com/michael-ta/longitudinal-GWAS-pipeline.git
Build the Docker Image
This repository comes with the Dockerfile and local resources needed to build the Docker image used by the
pipeline. After cloning the repository, you can build the image with:
cd longitudinal-GWAS-pipeline
sudo docker build --build-arg BUILD_VAR=$(date +%Y%m%d-%H%M%S) -t gwas-pipeline .
The parameter BUILD_VAR sets the environment variable IMAGE_BUILD_VAR to the date and time of the current
build. This can be used to track different versions of the Docker image once built. Using the default
nextflow.config the pipeline will launch containers using a local image with the label gwas-pipeline. This
behavior can be adjusted by changing the nextflow.config or setting the option at runtime. For more details,
see the Nextflow Configuration page.
Note: invoking sudo is not necessary if your user has previously been added to the docker group.
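To illustrate the build tag described above, the sketch below generates a value in the same format as the build command and shows, as a comment, one way the value might be read back from a built image. The read-back command assumes the image provides printenv and does not define an entrypoint that overrides it; neither is confirmed by this page.

```shell
#!/bin/sh
# Generate a tag in the same format that the build command passes as BUILD_VAR.
build_var=$(date +%Y%m%d-%H%M%S)
echo "BUILD_VAR would be: $build_var"

# After building, the stored value could be inspected with a command like this
# (assumption: printenv exists in the image and no entrypoint interferes):
#   sudo docker run --rm gwas-pipeline printenv IMAGE_BUILD_VAR
```

Because the value is a sortable timestamp, comparing it across machines gives a rough way to tell which image is newer.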
Basic Usage
Once the Docker image is built, the pipeline can be called by running the following command from within the locally cloned repository:
sudo nextflow gwas-pipeline.nf \
--input_vcf "example/data/genetic/*.vcf" \
--covarfile "examples/basic/covar.tsv" \
--phenofile "examples/basic/pheno.tsv"
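The --input_vcf pattern is quoted, so the shell passes it to Nextflow unexpanded and will not warn you if it matches nothing. A small pre-flight check along these lines can catch that before launching. It is a sketch; count_vcfs is a hypothetical helper, and the directory is taken from the example command above.

```shell
#!/bin/sh
# Count the *.vcf files directly inside a directory.
# count_vcfs is a hypothetical helper, not part of the pipeline.
count_vcfs() {
    find "$1" -maxdepth 1 -type f -name '*.vcf' | wc -l | tr -d ' '
}

# Directory part of the --input_vcf pattern used in the example command.
vcf_dir="example/data/genetic"
if [ -d "$vcf_dir" ] && [ "$(count_vcfs "$vcf_dir")" -gt 0 ]; then
    echo "ready: $(count_vcfs "$vcf_dir") VCF file(s) in $vcf_dir"
else
    echo "no VCF files found in $vcf_dir"
fi
```

The same kind of check can be applied to the covariate and phenotype files with a plain [ -f ... ] test.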
The outputs from the pipeline will be saved to the directory defined by the environment variable
GWAS_OUTPUT_DIR in the Nextflow configuration. After running the pipeline on the basic example files,
the output directory will contain the following folders and files:
results/
cor_timestamp/
*.linear
plots/
*.
cache/
p1_run_cache/
p2_qc_pipeline_cache/
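To locate the association results after a run, the output directory can be searched for the *.linear files listed above. The sketch below assumes GWAS_OUTPUT_DIR is also exported in your shell (this page only says it is defined in the Nextflow configuration) and falls back to the current directory otherwise; list_linear_results is a hypothetical helper.

```shell
#!/bin/sh
# List every *.linear result file under a directory, sorted by path.
# list_linear_results is a hypothetical helper, not part of the pipeline.
list_linear_results() {
    find "$1" -type f -name '*.linear' | sort
}

# Assumption: GWAS_OUTPUT_DIR is exported in the shell; otherwise search ".".
out_dir=${GWAS_OUTPUT_DIR:-.}
list_linear_results "$out_dir"
```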