Documentation

MLM Analysis

Description

DISSECT can perform mixed linear model (MLM) analysis, using genotype data to compute the Genetic Relationship Matrix. This approach is based on fitting the equation:

\[\mathbf{y}=\mathbf{X}\mathbf{\beta}+\mathbf{W}\mathbf{u}+\mathbf{\epsilon}\]

where \(\mathbf{y}\) is a vector of phenotypes, \(\mathbf{\beta}\) is a vector of fixed effects, \(\mathbf{u}\) is a vector of SNP effects distributed as \(\mathbf{u}\sim\mathbf{N}\left(0, \mathbf{I}\sigma_u^2\right)\), \(\mathbf{I}\) is the identity matrix, and \(\mathbf{\epsilon}\) is a vector of residual effects distributed as \(\mathbf{\epsilon}\sim\mathbf{N}\left(0, \mathbf{I}\sigma_\epsilon^2\right)\). \(\mathbf{W}\) is a genotype matrix defined by the equation:

\[w_{ik}=\frac{ \left(s_{ik} - 2p_k\right) }{ \sqrt{2p_k\left(1-p_k\right)} }\]

where \(s_{ik}\) is the number of copies of the reference allele of SNP k carried by individual i, and \(p_k\) is the frequency of the reference allele of SNP k. Under this model, the variance of y is:
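The standardization that defines W can be illustrated with a short sketch. It is not DISSECT's implementation: the function name `standardize_genotypes` and the toy counts are hypothetical, and the allele frequencies are assumed to be estimated from the same sample.

```python
import math

def standardize_genotypes(s):
    """Return (W, p) with w[i][k] = (s[i][k] - 2*p_k) / sqrt(2*p_k*(1 - p_k)).

    s is a list of rows, one per individual, holding reference allele
    counts (0, 1 or 2) for each SNP. p_k is estimated from the sample
    itself, which is an assumption of this sketch. Monomorphic SNPs
    (p_k equal to 0 or 1) would divide by zero and must be filtered first.
    """
    n_indiv = len(s)
    n_snps = len(s[0])
    # Reference allele frequency of each SNP: mean count divided by 2.
    p = [sum(row[k] for row in s) / (2.0 * n_indiv) for k in range(n_snps)]
    w = [
        [(row[k] - 2.0 * p[k]) / math.sqrt(2.0 * p[k] * (1.0 - p[k]))
         for k in range(n_snps)]
        for row in s
    ]
    return w, p

# Hypothetical toy data: 4 individuals, 2 SNPs.
counts = [
    [0, 1],
    [1, 2],
    [2, 1],
    [1, 0],
]
W, p = standardize_genotypes(counts)
```

Because the frequencies come from the sample, every column of W is centred on zero; the scaling uses the expected variance \(2p_k(1-p_k)\), so the sample variance of a column is close to, but not necessarily exactly, one.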

\[\text{var}\left(\mathbf{y}\right)=\mathbf{A}\sigma_g^2+\mathbf{I}\sigma_\epsilon^2\]

where A is the genetic relationship matrix (GRM).
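The step from the model to this variance can be written out. The following is a sketch that assumes the usual definition of the GRM, \(\mathbf{A}=\mathbf{W}\mathbf{W}^T/N\), with N the number of SNPs, and that \(\mathbf{u}\) and \(\mathbf{\epsilon}\) are independent; N and the relation between \(\sigma_g^2\) and \(\sigma_u^2\) are not stated in the text above:

\[\text{var}\left(\mathbf{y}\right)=\mathbf{W}\,\text{var}\left(\mathbf{u}\right)\mathbf{W}^T+\text{var}\left(\mathbf{\epsilon}\right)=\mathbf{W}\mathbf{W}^T\sigma_u^2+\mathbf{I}\sigma_\epsilon^2=\mathbf{A}\sigma_g^2+\mathbf{I}\sigma_\epsilon^2,\qquad \sigma_g^2=N\sigma_u^2\]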

This model allows the computation of heritabilities, individual breeding values, SNP effect sizes, and genetic correlations between traits, among other quantities. To perform the analysis, genotype files and/or GRM files must be provided. The computational performance of the analysis can be greatly improved by using a diagonalized GRM.
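One way to see why a diagonalized GRM can help is sketched below. It assumes that diagonalization refers to the eigendecomposition \(\mathbf{A}=\mathbf{U}\mathbf{\Lambda}\mathbf{U}^T\), with \(\mathbf{U}\) orthogonal and \(\mathbf{\Lambda}\) diagonal; both symbols are introduced here for illustration only. Rotating the phenotypes by \(\mathbf{U}^T\) gives:

\[\text{var}\left(\mathbf{U}^T\mathbf{y}\right)=\mathbf{U}^T\left(\mathbf{A}\sigma_g^2+\mathbf{I}\sigma_\epsilon^2\right)\mathbf{U}=\mathbf{\Lambda}\sigma_g^2+\mathbf{I}\sigma_\epsilon^2\]

Under this assumption the covariance of the rotated data is diagonal, so inverting it at each REML iteration is an element-wise operation rather than a full matrix inversion.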

Examples

Compute the heritability of a trait using the genotypes in the genotypes file.

dissect --reml --bfile genotypes --pheno indiviuduals.phenos --out results

Compute the heritability of a phenotype using a precomputed GRM.

dissect --reml --grm grmfile --pheno indiviuduals.phenos --out results

Compute heritability, individual breeding values and SNP BLUPs. Although a GRM file is specified, the genotypes must also be provided when computing SNP BLUPs. In this case, they are given as a file containing a list of genotype files.

dissect --reml --grm grmfile --bfile-list genotypes.list --pheno indiviuduals.phenos --indiv-blup --snp-blup --blue --initial-h2 0.7 --out results

Perform a regional analysis using the regions defined in the file regions. Compute individual BLUPs and BLUEs.

dissect --reml --grm grmfile --bfile-list genotypes.list --pheno indiviuduals.phenos --groups regions --indiv-blup --blue --out results

Options

Analysis Options

--reml Perform MLM analysis.
--bivar-reml Perform bivariate MLM analysis.

Input Options

--bfile f Specify a genotypes file.
--bfile-list f Specify a file with a list of genotypes files.
--grm f Specify the GRM file.
--pheno f Specify the phenotypes file.
--covar f Specify the discrete covariates file.
--covars f1 f2 Specify the discrete covariates files for bivariate analysis when the covariates differ between the two traits.
--qcovar f Specify the quantitative covariates file.
--qcovars f1 f2 Specify the quantitative covariates files for bivariate analysis when the quantitative covariates differ between the two traits.

Output Options

--out f Specify the base name for output files.

Other

--blue Compute BLUEs.
--indiv-blup Compute BLUPs of individuals.
--snp-blup Compute BLUPs of SNPs. This option requires the use of the --bfile or --bfile-list options.
--groups f Perform regional analysis. The regions will be those specified in the groups file, f. Diagonalized GRMs cannot be used with this analysis.
--reml-maxit n Specify maximum REML iterations.
--variance-constrain x A factor used when constraining variances. If a variance falls below zero, it is moved to a new positive value, computed as a function of the initial value of that variance and this factor. (default: 1e-6)
--pheno-col n Specify which column of the phenotypes file to use. (default n = 1)
--pheno-cols n1 n2 Specify which columns of the phenotypes file to use in a bivariate analysis. (default n1 = 1, n2 = 2)
--initial-h2 x Specify an initial value for h2. (default x = 0.5)
--initial-h2s x1 x2 Specify an initial h2 value for traits in a bivariate analysis. (default x1 = 0.5, x2 = 0.5)
--no-environment-cov Exclude environment covariance in bivariate REML analysis.

Output description

Different output files can be generated from MLM analysis depending on the specified options.

Variances file

file extension: .reml

This file contains the estimated variances and heritabilities. Each variance is reported with two values: the first is its estimate, and the second is the estimated error of that estimate.
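As a reminder of how a heritability relates to the variances of the single-trait model, the sketch below assumes the usual definition \(h^2=\sigma_g^2/\left(\sigma_g^2+\sigma_\epsilon^2\right)\). The function name and the numeric values are hypothetical; the layout of the .reml file is not described here, so the sketch does not attempt to read it.

```python
def heritability(var_g, var_e):
    """Return h2 = var_g / (var_g + var_e) for a single-trait model.

    var_g is the genetic variance and var_e the residual variance.
    This is the standard definition and is assumed, not taken from
    DISSECT's source code.
    """
    total = var_g + var_e
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_g / total

# Hypothetical variance estimates, for illustration only.
h2 = heritability(0.6, 0.4)
```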

Means file

file extension: .blue.mean

This file contains the estimated value and error of the mean.

Quantitative covariates file

file extension: .blue.quantitative

This file contains the estimated values of the slopes for the quantitative covariates.

Individual BLUPs file

file extension: .blup.indiv

File with header indicating the BLUP for each individual. Columns are:

FID Family ID
IID Individual ID
GRM BLUPs for each individual

SNP BLUPs file

file extension: .blup.snps

File with header indicating the BLUP for each SNP. Columns are:

SNP SNP name
ALLELE Reference allele
BLUP SNP BLUP
STDEV SNP standard deviation (i.e. \(\sqrt{2p(1-p)}\) where p is the reference allele frequency)
MEAN SNP mean (i.e. 2*p where p is the reference allele frequency)
NBLUP SNP BLUP divided by SNP standard deviation (i.e. BLUP/STDEV)
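The relation between these columns can be sketched as follows. The column formulas come from the definitions above; using NBLUP to score an individual directly from raw allele counts is an assumption of this sketch, based on the definition of \(w_{ik}\), and the function names and numbers are hypothetical.

```python
import math

def snp_columns(p, blup):
    """Return (MEAN, STDEV, NBLUP) for one SNP.

    p is the reference allele frequency and blup the SNP BLUP,
    following the column definitions of the .blup.snps file.
    """
    mean = 2.0 * p
    stdev = math.sqrt(2.0 * p * (1.0 - p))
    return mean, stdev, blup / stdev

def genetic_score(counts, freqs, blups):
    """Sum over SNPs of (count - MEAN) * NBLUP.

    This equals the sum of w_ik * BLUP_k with w_ik as defined for the
    genotype matrix W; treating it as a predicted genetic value is an
    assumption of this sketch.
    """
    score = 0.0
    for s, p, b in zip(counts, freqs, blups):
        mean, stdev, nblup = snp_columns(p, b)
        score += (s - mean) * nblup
    return score

# Hypothetical frequencies and BLUPs for two SNPs, and one individual
# carrying 2 and 0 copies of the reference alleles.
freqs = [0.5, 0.2]
blups = [0.1, -0.04]
score = genetic_score([2, 0], freqs, blups)
```

Dividing the BLUP by STDEV once, as NBLUP does, means the standardized genotypes never have to be formed explicitly: only the raw counts and the MEAN column are needed.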