Documentation

Principal Components Analysis

Description

DISSECT can perform principal component analysis (PCA). This type of analysis can be very computationally demanding when the number of individuals and the number of markers are large. DISSECT's parallelization makes it possible to analyze very large datasets. PCA is performed by diagonalizing the genetic relationship matrix (GRM).
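To illustrate what diagonalizing a GRM means, here is a minimal Python sketch on toy data. It is not DISSECT's implementation. The GRM definition used here (mean over markers of the products of genotypes standardized by allele frequency) is a common convention and is only assumed to apply. Power iteration stands in for a full eigendecomposition and recovers just the leading eigenvalue and eigenvector. The function names and the toy genotypes are hypothetical.

```python
import math


def grm(genotypes):
    """Build a GRM from allele counts; genotypes[i][j] is 0, 1 or 2
    for individual i at marker j (assumed convention, see lead-in)."""
    n, m = len(genotypes), len(genotypes[0])
    z = [[0.0] * m for _ in range(n)]
    for j in range(m):
        # Allele frequency of marker j; assumed strictly between 0 and 1.
        p = sum(row[j] for row in genotypes) / (2.0 * n)
        sd = math.sqrt(2.0 * p * (1.0 - p))
        for i in range(n):
            z[i][j] = (genotypes[i][j] - 2.0 * p) / sd
    # Entry (i, k) is the mean over markers of the product of
    # standardized genotypes of individuals i and k.
    return [[sum(z[i][j] * z[k][j] for j in range(m)) / m for k in range(n)]
            for i in range(n)]


def leading_eigenpair(a, iterations=200):
    """Power iteration on a symmetric matrix: returns the largest
    eigenvalue and a unit-length eigenvector for it."""
    n = len(a)
    v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary non-constant start
    for _ in range(iterations):
        w = [sum(a[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    av = [sum(a[i][k] * v[k] for k in range(n)) for i in range(n)]
    value = sum(v[i] * av[i] for i in range(n))  # Rayleigh quotient
    return value, v


# Toy data: 4 individuals x 4 markers, forming two groups of two.
genotypes = [
    [0, 0, 1, 2],
    [0, 1, 0, 2],
    [2, 2, 1, 0],
    [2, 1, 2, 0],
]
a = grm(genotypes)
value, vector = leading_eigenpair(a)
print(value, vector)
```

In this toy example the entries of the leading eigenvector have one sign for the first two individuals and the opposite sign for the last two, which is how a principal component separates groups of related individuals.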

Examples

Perform a PCA using the genotypes stored in the genotypes file.

dissect --pca --bfile genotypes --out results

Perform a PCA using a precomputed GRM.

dissect --pca --grm grmfile --out results

Options

Analysis Options

--pca Perform a PCA.

Input Options

--bfile f Specify a genotypes file.
--bfile-list f Specify a file with a list of genotypes files.
--grm f Specify the GRM file.

Output Options

--out f Specify the base name for output files.

Others

--num-eval n Specify the number of eigenvectors/eigenvalues that will be stored.

Output description

Two files are generated after a PCA: one with the eigenvectors and one with the eigenvalues.

Eigenvalues file

file extension: .pca.eigenvalues

This file contains the largest estimated eigenvalues. The number of eigenvalues stored is set by the --num-eval option.

Eigenvectors file

file extension: .pca.eigenvectors

The first two columns contain the family ID and the individual ID, respectively. The following columns contain the eigenvectors: each column holds the eigenvector for the corresponding eigenvalue in the eigenvalues file. The number of eigenvectors stored is set by the --num-eval option.
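The column layout described above can be read with a few lines of Python. This is a minimal sketch, assuming the file is whitespace-delimited and has no header line; neither detail is stated here, so check it against your own output. The function name and the sample rows are hypothetical and serve only as illustration.

```python
def parse_eigenvectors(lines):
    """Parse rows of 'family_id individual_id ev1 ev2 ...' into tuples.

    Assumes whitespace-delimited columns and no header line."""
    records = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        family_id, individual_id = fields[0], fields[1]
        # Remaining columns: one entry per stored eigenvector.
        loadings = [float(x) for x in fields[2:]]
        records.append((family_id, individual_id, loadings))
    return records


# Made-up rows in the assumed layout, with two eigenvectors stored.
sample = [
    "FAM1 IND1 0.50 -0.10",
    "FAM1 IND2 0.45 0.20",
    "FAM2 IND3 -0.95 -0.10",
]
records = parse_eigenvectors(sample)
for family_id, individual_id, loadings in records:
    print(family_id, individual_id, loadings)
```

With real output, the same function can be given an open file handle instead of the sample list, for example a file named results.pca.eigenvectors if the run used --out results (the exact file name is an assumption based on the extension given above).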