Input File Format
Input Genotype
The pipeline requires genotype information as VCF files, either uncompressed (.vcf) or gzip-compressed (.vcf.gz).
If missing variants have been imputed, specify the optional parameter --r2thres to filter out poorly imputed variants.
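Before launching a run, it can help to confirm that a genotype file opens and contains variant records. The following is a minimal sketch, not part of the pipeline: the helper names open_vcf and count_variants are hypothetical, and the sketch assumes only that VCF header lines start with "#" and that a ".gz" suffix indicates gzip compression.

```python
import gzip
from pathlib import Path


def open_vcf(path):
    """Open a .vcf or .vcf.gz file in text mode (hypothetical helper)."""
    path = Path(path)
    if path.suffix == ".gz":
        # gzip.open in "rt" mode yields decoded text lines.
        return gzip.open(path, "rt")
    return open(path, "r")


def count_variants(path):
    """Count data records, i.e. non-empty lines that do not start with '#'."""
    with open_vcf(path) as handle:
        return sum(
            1 for line in handle
            if line.strip() and not line.startswith("#")
        )
```

A call such as count_variants("genotypes.vcf.gz") (an illustrative file name) returns the number of variant records, which gives a quick sanity check that the file is readable in either form.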
Input Covariates
Covariates for each subject can be passed to the model via a tab-delimited file (*.tsv).
Note: the PLINK-style columns #FID and PHENO can be populated with 0.
For both cross-sectional and longitudinal analyses, the pipeline expects covariates in the following format:
#FID IID SEX PHENO study_arm apoe4 levodopa_usage age_at_baseline
0 sid-1 1 0 control 0 0 35
0 sid-2 1 0 control 0 0 40
0 sid-3 0 0 control 1 0 32
.
.
.
0 sid-98 1 0 PD 1 0 55
0 sid-99 0 0 PD 0 1 66
0 sid-100 1 0 PD 0 0 58
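A covariate file in this layout can be checked before it is handed to the pipeline. The sketch below is illustrative only: load_covariates is a hypothetical helper, and treating #FID, IID, SEX and PHENO as required columns is an assumption taken from the example header above, not a documented rule of the pipeline.

```python
import csv

# Assumption: these four columns, taken from the example header, must be present.
REQUIRED_COLUMNS = ("#FID", "IID", "SEX", "PHENO")


def load_covariates(path):
    """Read a tab-delimited covariate file into {IID: row_dict}."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        header = reader.fieldnames or []
        missing = [name for name in REQUIRED_COLUMNS if name not in header]
        if missing:
            raise ValueError(f"missing columns: {missing}")
        rows = {}
        for row in reader:
            iid = row["IID"]
            if iid in rows:
                # Each subject is expected to appear once in this sketch.
                raise ValueError(f"duplicate IID: {iid}")
            rows[iid] = row
    return rows
```

All values are kept as strings; converting columns such as age_at_baseline to numbers is left to the caller, since the pipeline's own type handling is not described here.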
Input Phenotype / Outcomes
Phenotypes and measured outcomes can be passed to the pipeline via a tab-delimited file (*.tsv).
For cross-sectional analysis, the pipeline expects a minimum of two columns in the following format:
IID y
sid-1 1
sid-2 0
sid-3 1
.
.
.
sid-98 0
sid-99 0
sid-100 1
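The two-column minimum for the cross-sectional layout can be verified with a short script. This is a sketch under stated assumptions rather than the pipeline's own loader: load_phenotypes is a hypothetical helper, and it assumes IID is the first column, as in the example above.

```python
import csv


def load_phenotypes(path):
    """Read a tab-delimited phenotype file into {IID: {outcome: value}}."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        columns = reader.fieldnames or []
        # Assumption: IID comes first, followed by one or more outcome columns.
        if len(columns) < 2 or columns[0] != "IID":
            raise ValueError("expected IID plus at least one outcome column")
        outcomes = {}
        for row in reader:
            iid = row.pop("IID")
            outcomes[iid] = row  # remaining keys are the outcome columns
    return outcomes
```

Outcome values are returned as strings, so a binary outcome such as y appears as "0" or "1" until the caller converts it.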