Command-line interface¶
Association analysis¶
Models the association between phenotypes and genotypes, accepting additional covariates and parameters to account for population structure and relatedness between samples. Users can choose between single-trait and multi-trait models, and between simple linear and linear mixed model set-ups.
usage: runAssociation [-h] [-p FILE_PHENO] [--pheno_delim PHENO_DELIM]
[-g FILE_GENOTYPES] [--geno_delim GENOTYPES_DELIM] -o
OUTDIR (-st | -mt) (-lm | -lmm) [-n NAME] [-v]
[-k FILE_RELATEDNESS]
[--kinship_delim RELATEDNESS_DELIM] [-cg FILE_CG]
[--cg_delim CG_DELIM] [-cn FILE_CN]
[--cn_delim CN_DELIM] [-pcs FILE_PCS]
[--pcs_delim PCS_DELIM] [-c FILE_COVARIATES]
[--covariate_delim COVARIATE_DELIM]
[-adjustP {bonferroni,effective,None}]
[-nrpermutations NRPERMUTATIONS] [-fdr FDR] [-seed SEED]
[-tr {scale,gaussian}] [-reg] [-traitset TRAITSTRING]
[-nrpcs NRPCS]
[--file_samplelist FILE_SAMPLELIST | --samplelist SAMPLELIST]
[--plot] [-colourS COLOURS] [-colourNS COLOURNS]
[-alphaS ALPHAS] [-alphaNS ALPHANS] [-thr THR]
[--version]
Basic required arguments¶
-p, --file_pheno | |
Path [string] to [(N+1) x (P+1)] .csv file of [P] phenotypes with [N] samples (first column: sample IDs, first row: phenotype IDs). Default: None | |
--pheno_delim | Delimiter of phenotype file. Default: “,” |
-g, --file_geno | |
Genotype file: either [S x N] .csv file (first column: SNP IDs, first row: sample IDs) or PLINK-formatted genotypes (.bim/.fam/.bed). Default: None | |
--geno_delim | Delimiter of genotype file (if not in plink format). Default: “,” |
-o, --outdir | Path [string] of output directory; user needs writing permission. Default: None |
-st, --singletrait | |
Set flag to conduct a single-trait association analysis. Default: False | |
-mt, --multitrait | |
Set flag to conduct a multi-trait association analysis. Default: False | |
-lm, --lm | Set flag to use a simple linear model for the association analysis |
-lmm, --lmm | Set flag to use a linear mixed model for the association analysis |
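As a rough illustration of the input layouts described above, the following Python sketch writes a toy phenotype file (samples in rows, phenotype IDs in the first row) and a toy genotype file (SNPs in rows, sample IDs in the first row), then assembles a minimal single-trait, simple-linear-model call. The file names, IDs, values, the 0/1/2 genotype coding and the empty corner cell of each header row are assumptions made for illustration; the command is only printed, not executed.

```python
import csv
import shlex
import tempfile
from pathlib import Path

# Hypothetical toy data: 3 samples, 2 phenotypes, 2 SNPs.
samples = ["ID1", "ID2", "ID3"]
phenotypes = {"trait1": [0.1, 1.2, -0.3], "trait2": [2.0, 0.5, 1.1]}
genotypes = {"snp1": [0, 1, 2], "snp2": [1, 1, 0]}  # 0/1/2 coding is an assumption

workdir = Path(tempfile.mkdtemp())
pheno_file = workdir / "pheno.csv"
geno_file = workdir / "geno.csv"

# Phenotype file: first row = phenotype IDs, first column = sample IDs.
with open(pheno_file, "w", newline="") as fh:
    writer = csv.writer(fh, delimiter=",")
    writer.writerow([""] + list(phenotypes))  # empty corner cell is an assumption
    for i, sample in enumerate(samples):
        writer.writerow([sample] + [vals[i] for vals in phenotypes.values()])

# Genotype file: first row = sample IDs, first column = SNP IDs.
with open(geno_file, "w", newline="") as fh:
    writer = csv.writer(fh, delimiter=",")
    writer.writerow([""] + samples)
    for snp, calls in genotypes.items():
        writer.writerow([snp] + calls)

# Minimal call: one of -st/-mt and one of -lm/-lmm must be given.
cmd = [
    "runAssociation",
    "-p", str(pheno_file),
    "-g", str(geno_file),
    "-o", str(workdir / "results"),  # hypothetical output directory
    "-st",
    "-lm",
]
print(shlex.join(cmd))
```

With `-lmm` instead of `-lm`, a kinship file (`-k`) would also have to be supplied, as described under the optional files.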
Output arguments¶
-n, --name | Name (used for output file naming). Default: |
-v, --verbose | [bool]: should analysis progress be displayed. Default: False |
Optional files¶
-k, --file_kinship | |
Path [string] to [N x (N+1)] file of kinship/relatedness matrix with [N] samples (first row: sample IDs). Required when --lmm/-lmm. Default: None | |
--kinship_delim | |
Delimiter of kinship file. Default: “,” | |
-cg, --file_cg | Required for large phenotype sizes when --lmm/-lmm; computed via runLiMMBo; specifies file name for genetic trait covariance matrix (rows: traits, columns: traits). Default: None |
--cg_delim | Delimiter of Cg file. Default: “,” |
-cn, --file_cn | Required for large phenotype sizes when --lmm/-lmm; computed via runLiMMBo; specifies file name for non-genetic trait covariance matrix (rows: traits, columns: traits). Default: None |
--cn_delim | Delimiter of Cn file. Default: “,” |
-pcs, --file_pcs | |
Path [string] to [N x PCs] file of principal components from genotypes to be included as covariates (first column: sample IDs, first row: PC IDs). Default: None | |
--pcs_delim | Delimiter of PCs file. Default: “,” |
-c, --file_cov | Path [string] to [(N+1) x C] file of covariates matrix with [N] samples and [C] covariates (first column: sample IDs, first row: covariate IDs). Default: None |
--covariate_delim | |
Delimiter of covariates file. Default: “,” |
Optional association parameters¶
-adjustP, --adjustP | |
Possible choices: bonferroni, effective, None. Method to adjust single-trait p-values for multiple hypothesis testing when running multiple single-trait GWAS: bonferroni or effective number of tests (Galwey, 2009; http://onlinelibrary.wiley.com/doi/10.1002/gepi.20408/abstract). Default: None | |
-nrpermutations, --nrpermutations | |
Number of permutations for computing empirical p-values; 1/nrpermutations is the lowest significance level that can be tested. Default: None | |
-fdr, --fdr | FDR threshold for computing empirical FDR. Default: None |
-seed, --seed | Seed [int] to initiate random number generation for permutations. Default: 256 |
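The two quantities mentioned above can be made concrete with a small Python sketch: a plain Bonferroni adjustment and the smallest empirical p-value that a given number of permutations can resolve. This is a generic illustration, not the tool's implementation; taking the number of tests to be the number of traits is an assumption, the p-values are made up, and the "effective number of tests" method is not shown.

```python
def bonferroni(pvalues, n_tests):
    """Multiply each p-value by the number of tests, capping at 1."""
    return [min(1.0, p * n_tests) for p in pvalues]


def min_empirical_pvalue(nr_permutations):
    """Smallest p-value resolvable with the given number of permutations."""
    if nr_permutations < 1:
        raise ValueError("nr_permutations must be a positive integer")
    return 1.0 / nr_permutations


# Hypothetical p-values of one SNP across four single-trait analyses.
pvalues = [0.001, 0.02, 0.3, 0.5]
adjusted = bonferroni(pvalues, n_tests=len(pvalues))
print(adjusted)

# With 1000 permutations, nothing below 0.001 can be distinguished.
print(min_empirical_pvalue(1000))
```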
Optional data processing parameters¶
-tr, --transform_method | |
Possible choices: scale, gaussian. Choose type [string] of data preprocessing: scale (mean center, divide by sd) or gaussian (inverse normalise). Default: “scale” | |
-reg, --reg_covariates | |
[bool]: should covariates be regressed out? Default: False |
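To make the two transform choices concrete, here is a minimal Python sketch of both operations on a single trait. It is not taken from the tool's source: the use of the population standard deviation, the rank-to-quantile mapping `rank / (n + 1)` and the tie-breaking by input order are assumptions chosen for illustration.

```python
from statistics import NormalDist, mean, pstdev


def scale(values):
    """Mean-center and divide by the (population) standard deviation."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def inverse_normalise(values):
    """Rank-based inverse normal transform (ties broken by input order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    nd = NormalDist()  # standard normal
    return [nd.inv_cdf(r / (n + 1)) for r in ranks]


# Hypothetical, skewed trait values.
trait = [2.0, 4.0, 4.5, 10.0, 30.0]
scaled = scale(trait)
gaussianised = inverse_normalise(trait)
print(scaled)
print(gaussianised)
```

The scaled values keep the skew of the input, whereas the rank-based transform spreads them symmetrically around zero regardless of the outlier.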
Optional subsetting options¶
-traitset, --traitset | |
Comma-separated list of traits, hyphen-separated trait range, or a combination of both [string], specifying the traits (trait columns) to choose; None selects all traits. Default: None | |
-nrpcs, --nrpcs | |
Number of first PCs to choose. Default: 10 | |
--file_samplelist | |
Path [string] to file with samplelist for sample selection, with one sample ID per line. Default: None | |
--samplelist | Comma-separated list [string] of samples IDs to restrict analysis to, e.g. ID1,ID2,ID5,ID9,ID10. Default: None |
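The trait-set syntax can be sketched as a small parser. The function below is a hypothetical helper written for illustration, not the tool's own code; treating ranges as inclusive on both ends and returning the numbers exactly as written (without converting to zero-based indices) are assumptions.

```python
def parse_traitset(traitstring):
    """Expand a string such as '1,3-5,8' into a list of trait numbers."""
    selected = []
    for part in traitstring.split(","):
        part = part.strip()
        if "-" in part:
            # Hyphen-separated range, assumed inclusive on both ends.
            start, end = (int(x) for x in part.split("-"))
            if end < start:
                raise ValueError(f"invalid trait range: {part}")
            selected.extend(range(start, end + 1))
        else:
            selected.append(int(part))
    return selected


print(parse_traitset("1,3-5,8"))
```

A value such as `-traitset 1,3-5,8` would then correspond to five trait columns under these assumptions.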
Plot arguments¶
Arguments for depicting GWAS results as a Manhattan plot
--plot | Set flag if results of association analysis should be depicted as Manhattan and quantile-quantile plots |
-colourS, --colourS | |
Colour of significant points in Manhattan plot | |
-colourNS, --colourNS | |
Colour of non-significant points in Manhattan plot | |
-alphaS, --alphaS | |
Transparency of significant points in Manhattan plot | |
-alphaNS, --alphaNS | |
Transparency of non-significant points in Manhattan plot | |
-thr, --threshold | |
Significance threshold; when --fdr is specified, the empirical FDR is used as threshold |
Version¶
--version | show program’s version number and exit |
Variance decomposition¶
Estimates the genetic and non-genetic trait covariance matrix parameters of a linear mixed model with random genetic and non-genetic effects via a bootstrapping-based approach.
usage: runVarianceEstimation [-h] [-p FILE_PHENO] [--pheno_delim PHENO_DELIM]
[-k FILE_RELATEDNESS]
[--kinship_delim RELATEDNESS_DELIM] -o OUTDIR
[-c FILE_COVARIATES]
[--covariate_delim COVARIATE_DELIM] [-seed SEED]
-sp S [-r RUNS] [-t]
[--minCooccurrence MINCOOCCURRENCE]
[-i ITERATIONS] [-cpus CPUS]
[-tr {scale,gaussian}] [-reg]
[-traitset TRAITSTRING]
[--file_samplelist FILE_SAMPLELIST | --samplelist SAMPLELIST]
[-dontSaveIntermediate] [-v] [--version]
Basic required arguments¶
-p, --file_pheno | |
Path [string] to [(N+1) x (P+1)] .csv file of [P] phenotypes with [N] samples (first column: sample IDs, first row: phenotype IDs). Default: None | |
--pheno_delim | Delimiter of phenotype file. Default: “,” |
-k, --file_kinship | |
Path [string] to [N x (N+1)] file of kinship/relatedness matrix with [N] samples (first row: sample IDs). Required when --lmm/-lmm. Default: None | |
--kinship_delim | |
Delimiter of kinship file. Default: “,” | |
-o, --outdir | Path [string] of output directory; user needs writing permission. Default: None |
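A minimal variance-decomposition call might be assembled as in the Python sketch below. Only flags from the usage line above are used; the file names, output directory and the subsample size of 5 are placeholders, and the command is printed rather than executed.

```python
import shlex

cmd = [
    "runVarianceEstimation",
    "-p", "pheno.csv",     # hypothetical phenotype file
    "-k", "kinship.csv",   # hypothetical kinship file
    "-o", "results/vd",    # hypothetical output directory
    "-sp", "5",            # subsample size; shown as required in the usage line
]
print(shlex.join(cmd))
```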
Optional files¶
-c, --file_cov | Path [string] to [(N+1) x C] file of covariates matrix with [N] samples and [C] covariates (first column: sample IDs, first row: covariate IDs). Default: None |
--covariate_delim | |
Delimiter of covariates file. Default: “,” |
Bootstrapping parameters¶
-seed, --seed | Seed [int] used to generate bootstrap matrix. Default: 234 |
-sp, --smallp | Size [int] of phenotype subsamples used for variance decomposition. Default: None |
-r, --runs | Total number [int] of bootstrap runs. Default: None |
-t, --timing | [bool]: should variance decomposition be timed. Default: False |
--minCooccurrence | |
Minimum count [int] of the pairwise sampling of any given trait pair. Default: 3 | |
-i, --iterations | |
Number [int] of iterations for variance decomposition attempts. Default: 10 | |
-cpus, --cpus | Number [int] of available CPUs for parallelisation of variance decomposition steps. Default: None |
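The interplay of subsample size and minimum co-occurrence can be illustrated with a small Python sketch: random trait subsets of size s are drawn until every pair of traits has been sampled together at least a minimum number of times. This is a simplified stand-in, not the tool's actual sampling scheme; the function name and the stopping rule are assumptions, and how `-r` interacts with `--minCooccurrence` is not covered here.

```python
import random
from itertools import combinations


def draw_bootstraps(n_traits, s, min_cooccurrence=3, seed=234):
    """Draw trait subsets of size s until each trait pair co-occurs often enough."""
    if not 2 <= s <= n_traits:
        raise ValueError("need 2 <= s <= n_traits")
    rng = random.Random(seed)
    # Count how often each unordered trait pair was sampled together.
    counts = {pair: 0 for pair in combinations(range(n_traits), 2)}
    subsets = []
    while min(counts.values()) < min_cooccurrence:
        subset = sorted(rng.sample(range(n_traits), s))
        subsets.append(subset)
        for pair in combinations(subset, 2):
            counts[pair] += 1
    return subsets, counts


# Hypothetical setting: 6 traits, subsamples of 3 traits each.
subsets, counts = draw_bootstraps(n_traits=6, s=3)
print(len(subsets), "subsets drawn; minimum pair count:", min(counts.values()))
```

Smaller subsamples relative to the total number of traits need more draws before every pair reaches the minimum count.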
Optional data processing parameters¶
-tr, --transform_method | |
Possible choices: scale, gaussian. Choose type [string] of data preprocessing: scale (mean center, divide by sd) or gaussian (inverse normalise). Default: “scale” | |
-reg, --reg_covariates | |
[bool]: should covariates be regressed out? Default: False |
Optional subsetting options¶
-traitset, --traitset | |
Comma-separated list of traits, hyphen-separated trait range, or a combination of both [string], specifying the traits (trait columns) to choose; None selects all traits. Default: None | |
--file_samplelist | |
Path [string] to file with samplelist for sample selection, with one sample ID per line. Default: None | |
--samplelist | Comma-separated list [string] of samples IDs to restrict analysis to, e.g. ID1,ID2,ID5,ID9,ID10. Default: None |
Output arguments¶
-dontSaveIntermediate, --dontSaveIntermediate | |
Set to suppress saving intermediate variance components. Default: True | |
-v, --verbose | [bool]: should analysis step description be printed. Default: False |
Version¶
--version | show program’s version number and exit |