Documentation
MLM Analysis
Description
DISSECT can perform mixed linear model (MLM) analysis, using genotype data to compute the genetic relationship matrix (GRM). This approach is based on fitting the equation:
\[\mathbf{y}=\mathbf{X}\mathbf{\beta}+\mathbf{W}\mathbf{u}+\mathbf{\epsilon}\]
where y is a vector of phenotypes, \(\mathbf{\beta}\) is a vector of fixed effects with design matrix X, u is a vector of SNP effects distributed as \(\mathbf{u} \sim N\left(0, \mathbf{I}\sigma_u^2\right)\), I is the identity matrix, and \(\mathbf{\epsilon}\) is a vector of residual effects distributed as \(\mathbf{\epsilon} \sim N\left(0, \mathbf{I}\sigma_\epsilon^2\right)\). W is a genotype matrix defined by the equation:
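\[w_{ik}=\frac{s_{ik} - 2p_k}{\sqrt{2p_k\left(1-p_k\right)}}\]
where \(s_{ik}\) is the number of copies of the reference allele of SNP k in individual i, and \(p_k\) is the frequency of the reference allele of SNP k. Under this model, the variance of y is:
\[\mathrm{var}\left(\mathbf{y}\right)=\mathbf{W}\mathbf{W}^{T}\sigma_u^2+\mathbf{I}\sigma_\epsilon^2=\mathbf{A}\sigma_g^2+\mathbf{I}\sigma_\epsilon^2\]
where A is the genetic relationship matrix (GRM), \(\mathbf{A}=\mathbf{W}\mathbf{W}^{T}/N\) for N SNPs, and \(\sigma_g^2=N\sigma_u^2\) is the genetic variance.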
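The definition of W can be illustrated with a minimal Python sketch. It standardizes a small, made-up table of allele counts and builds a relationship matrix from it, assuming the common normalisation \(\mathbf{A}=\mathbf{W}\mathbf{W}^{T}/N\) with N the number of SNPs. The function names and the toy data are hypothetical and are not part of DISSECT.

```python
import math


def standardize_genotypes(counts):
    """Return (W, p) for counts[i][k] = copies of the reference allele
    of SNP k in individual i (0, 1 or 2).

    Allele frequencies are estimated from the sample itself. A SNP with
    p_k equal to 0 or 1 would give a zero denominator, so such SNPs must
    be filtered out beforehand.
    """
    n_indiv = len(counts)
    n_snps = len(counts[0])
    p = [sum(row[k] for row in counts) / (2.0 * n_indiv) for k in range(n_snps)]
    w = [
        [
            (row[k] - 2.0 * p[k]) / math.sqrt(2.0 * p[k] * (1.0 - p[k]))
            for k in range(n_snps)
        ]
        for row in counts
    ]
    return w, p


def relationship_matrix(w):
    """A = W W^T / N, with N the number of SNPs (assumed normalisation)."""
    n_snps = len(w[0])
    return [[sum(a * b for a, b in zip(wi, wj)) / n_snps for wj in w] for wi in w]


# Toy data: 4 individuals, 3 SNPs.
counts = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [1, 2, 1],
]
W, p = standardize_genotypes(counts)
A = relationship_matrix(W)
```

Each column of W has mean zero by construction, and A is symmetric with one row and one column per individual.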
This model allows the computation of heritabilities, individual breeding values, SNP effect sizes, and genetic correlations between traits, among others. To perform the analysis, genotype files and/or GRM files must be provided. The computational performance of the analysis can be greatly improved by using a diagonalized GRM.
Examples
Compute the heritability of a trait, using the genotypes in the file genotypes.
dissect --reml --bfile genotypes --pheno individuals.phenos --out results
Compute the heritability of a trait using a precomputed GRM.
dissect --reml --grm grmfile --pheno individuals.phenos --out results
Compute heritability, individual breeding values and SNP BLUPs. Although a GRM file is specified, genotype files must also be specified when computing SNP BLUPs. In this case, a file with a list of genotype files is specified.
dissect --reml --grm grmfile --bfile-list genotypes.list --pheno individuals.phenos --indiv-blup --snp-blup --blue --initial-h2 0.7 --out results
Perform a regional analysis using the regions defined in the file regions. Compute individual BLUPs and BLUEs.
dissect --reml --grm grmfile --bfile-list genotypes.list --pheno individuals.phenos --groups regions --indiv-blup --blue --out results
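When these commands are launched from a script, it can help to assemble the argument list programmatically and check the documented constraints first. The following Python sketch does this for a univariate run. The helper `build_reml_command` is hypothetical, only uses options listed on this page, and does not execute anything; the file names are placeholders.

```python
import shlex


def build_reml_command(pheno, out, grm=None, bfile=None, bfile_list=None,
                       indiv_blup=False, snp_blup=False, blue=False,
                       initial_h2=None):
    """Return the argument list for a `dissect --reml` run (hypothetical helper)."""
    has_genotypes = bfile is not None or bfile_list is not None
    if grm is None and not has_genotypes:
        raise ValueError("provide genotype files and/or a GRM file")
    if snp_blup and not has_genotypes:
        raise ValueError("--snp-blup requires --bfile or --bfile-list")

    args = ["dissect", "--reml"]
    if grm is not None:
        args += ["--grm", grm]
    if bfile is not None:
        args += ["--bfile", bfile]
    if bfile_list is not None:
        args += ["--bfile-list", bfile_list]
    args += ["--pheno", pheno]
    if indiv_blup:
        args.append("--indiv-blup")
    if snp_blup:
        args.append("--snp-blup")
    if blue:
        args.append("--blue")
    if initial_h2 is not None:
        args += ["--initial-h2", str(initial_h2)]
    args += ["--out", out]
    return args


cmd = build_reml_command(
    pheno="individuals.phenos", out="results", grm="grmfile",
    bfile_list="genotypes.list", indiv_blup=True, snp_blup=True,
    blue=True, initial_h2=0.7,
)
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, assuming the `dissect` executable is on the PATH.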
Options
Analysis Options
--reml | Perform MLM analysis. |
--bivar-reml | Perform bivariate MLM analysis. |
Input Options
--bfile f | Specify a genotypes file. |
--bfile-list f | Specify a file with a list of genotypes files. |
--grm f | Specify the GRM file. |
--pheno f | Specify the phenotypes file. |
--covar f | Specify the discrete covariates file. |
--covars f1 f2 | Specify the discrete covariates files for bivariate analysis when the covariates differ between the two traits. |
--qcovar f | Specify the quantitative covariates file. |
--qcovars f1 f2 | Specify the quantitative covariates files for bivariate analysis when the quantitative covariates differ between the two traits. |
Output Options
--out f | Specify the base name for output files. |
Other
--blue | Compute BLUEs. |
--indiv-blup | Compute BLUPs of individuals. |
--snp-blup | Compute BLUPs of SNPs. This option requires the --bfile or --bfile-list option. |
--groups f | Perform regional analysis. The regions will be those specified in the groups file, f. Diagonalized GRMs cannot be used with this analysis. |
--reml-maxit n | Specify the maximum number of REML iterations. |
--variance-constrain x | A factor used when constraining variances. If a variance falls below zero, it is moved to a new positive value, computed as a function of the initial value of the variance and this factor. (default: 1e-6) |
--pheno-col n | Specify which column of the phenotypes file to use. (default n = 1) |
--pheno-cols n1 n2 | Specify which columns of the phenotypes file to use in bivariate analysis. (default n1 = 1, n2 = 2) |
--initial-h2 x | Specify an initial value for h2. (default x = 0.5) |
--initial-h2s x1 x2 | Specify initial h2 values for the two traits in a bivariate analysis. (default x1 = 0.5, x2 = 0.5) |
--no-environment-cov | Exclude environment covariance in bivariate REML analysis. |
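To make the role of an initial h2 value concrete, the sketch below splits the sample variance of a phenotype into starting genetic and residual variances, using the usual definition \(h^2=\sigma_g^2/(\sigma_g^2+\sigma_\epsilon^2)\). How DISSECT actually derives its starting values from --initial-h2 is not described on this page, so the split shown here is an assumption for illustration only, and the function names are hypothetical.

```python
def initial_variances(phenotypes, h2=0.5):
    """Split the sample variance of the phenotypes into (genetic, residual)
    starting values. Assumed scheme, not taken from DISSECT itself."""
    n = len(phenotypes)
    mean = sum(phenotypes) / n
    var_y = sum((y - mean) ** 2 for y in phenotypes) / (n - 1)
    return h2 * var_y, (1.0 - h2) * var_y


def heritability(var_g, var_e):
    """h2 = genetic variance / (genetic variance + residual variance)."""
    return var_g / (var_g + var_e)


# Toy phenotypes with sample variance 2.5.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
var_g, var_e = initial_variances(y, h2=0.7)
h2_check = heritability(var_g, var_e)
```

By construction, `heritability(var_g, var_e)` recovers the h2 value that was used for the split.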
Output description
Different output files can be generated from MLM analysis depending on the specified options.
Variances file
file extension: .reml
This file contains the estimated variances and heritabilities. Each variance has two values: the first is its estimated value and the second is the estimated error.
Means file
file extension: .blue.mean
This file contains the estimated value and error of the mean.
Quantitative covariates file
file extension: .blue.quantitative
This file contains the estimated values of the slopes for the quantitative covariates.
Individual BLUPs file
file extension: .blup.indiv
This file has a header line and contains the BLUP for each individual. Columns are:
FID | Family ID |
IID | Individual ID |
GRM | BLUPs for each individual |
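A file with these columns can be loaded with a few lines of Python. The sketch below assumes the file is whitespace-delimited with the header line FID IID GRM; the delimiter is an assumption, since this page does not state it. The function name and the sample values are made up for illustration.

```python
import io


def read_indiv_blups(handle):
    """Map (FID, IID) to the BLUP stored in the GRM column.

    Assumes whitespace-delimited text with a header line naming the columns.
    """
    header = handle.readline().split()
    blups = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        row = dict(zip(header, fields))
        blups[(row["FID"], row["IID"])] = float(row["GRM"])
    return blups


# Stand-in for open("results.blup.indiv"); the values are invented.
sample = io.StringIO(
    "FID IID GRM\n"
    "fam1 ind1 0.25\n"
    "fam1 ind2 -0.10\n"
    "fam2 ind3 0.05\n"
)
blups = read_indiv_blups(sample)

# Individuals ordered from highest to lowest BLUP.
ranked = sorted(blups, key=blups.get, reverse=True)
```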
SNP BLUPs file
file extension: .blup.snps
This file has a header line and contains the BLUP for each SNP. Columns are:
SNP | SNP name |
ALLELE | Reference allele |
BLUP | SNP BLUP |
STDEV | SNP standard deviation (i.e. \(\sqrt{2p(1-p)}\) where p is the reference allele frequency) |
MEAN | SNP mean (i.e. 2*p where p is the reference allele frequency) |
NBLUP | SNP BLUP divided by SNP standard deviation (i.e. BLUP/STDEV) |
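The relation between these columns can be checked with a short Python sketch. It derives MEAN, STDEV and NBLUP from a reference allele frequency and a BLUP, then computes a genetic score for one individual as the sum over SNPs of (allele count - MEAN) * NBLUP. Reading BLUP as the effect of the standardized genotype \(w_{ik}\) is an assumption inferred from the NBLUP definition, and the function names and numbers are hypothetical.

```python
import math


def snp_columns(p, blup):
    """Derive the MEAN, STDEV and NBLUP columns from the reference allele
    frequency p and the SNP BLUP."""
    stdev = math.sqrt(2.0 * p * (1.0 - p))
    return {"MEAN": 2.0 * p, "STDEV": stdev, "BLUP": blup, "NBLUP": blup / stdev}


def genetic_score(allele_counts, snps):
    """Sum over SNPs of (allele count - MEAN) * NBLUP.

    This equals the sum of w_ik * BLUP_k, assuming BLUP is the effect of
    the standardized genotype (an assumption, see the text above).
    """
    return sum((s - snp["MEAN"]) * snp["NBLUP"] for s, snp in zip(allele_counts, snps))


# Two made-up SNPs: (reference allele frequency, BLUP).
snps = [snp_columns(0.5, 0.02), snp_columns(0.2, -0.01)]

# One individual carrying 2 and 0 copies of the reference alleles.
score = genetic_score([2, 0], snps)
```

Working with NBLUP avoids standardizing the genotypes explicitly, because the division by STDEV is already folded into the effect.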