Ometer — Correlation analysis

Correlation analysis estimates the correlation between variables of a data set. Ometer can calculate two types of correlation: Pearson correlation, and Spearman (rank) correlation.
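To make the difference between the two coefficients concrete, here is a minimal sketch in plain Python. It is not Ometer's implementation; the function names `pearson`, `ranks` and `spearman` are hypothetical, and the handling of ties (average ranks) is an assumption. Spearman correlation is computed here as the Pearson correlation of the ranks.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman (rank) correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]   # monotonic but not linear in x
print(pearson(x, y), spearman(x, y))
```

For a monotonic but non-linear relationship such as this one, the rank correlation is exactly 1 while the Pearson coefficient is slightly lower.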

Correlation analysis is sometimes used to reveal networks between variables, where two variables are linked when their correlation is above a threshold or has a P-value below a threshold. Ometer produces lists of such correlations and also creates an input file for the program Graphviz, which can then be used to draw a network diagram.
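The idea of turning thresholded correlations into a Graphviz input file can be sketched as follows. This is only an illustration: the function name `correlation_network_dot`, the graph name, the edge labels and the use of the absolute value of the correlation are all assumptions, and the file that Ometer actually writes may look different.

```python
def correlation_network_dot(correlations, threshold=0.8):
    """Return Graphviz DOT text linking variables whose |r| reaches the threshold.

    `correlations` maps a pair of variable names to a correlation coefficient.
    Comparing the absolute value is an assumption, so that strong negative
    correlations are linked as well.
    """
    lines = ["graph correlations {"]
    for (first, second), r in sorted(correlations.items()):
        if abs(r) >= threshold:
            lines.append(f'    "{first}" -- "{second}" [label="{r:.2f}"];')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical coefficients, for illustration only.
example = {("A", "B"): 0.93, ("A", "C"): 0.10, ("B", "C"): -0.85}
dot_text = correlation_network_dot(example, threshold=0.8)
print(dot_text)
```

Saved to a file, text of this kind can be rendered with one of the Graphviz layout programs, for example `dot -Tpng network.dot -o network.png`.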

The data file can contain missing values (i.e. empty fields); in that case samples that do not contain data for a certain variable will not be used in estimating that variable's correlation coefficients (though the sample will still be used to estimate the correlation coefficients of the other variables).
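The treatment of missing values described above amounts to using, for each pair of variables, only the samples that have data for both. A minimal sketch, assuming empty fields are represented as `None` (the function name `complete_pairs` and the data are hypothetical):

```python
def complete_pairs(x, y):
    """Keep only the samples where both variables have a value."""
    kept = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    return [a for a, _ in kept], [b for _, b in kept]

# Four samples, three variables; None marks an empty field.
data = {
    "A": [1.0, None, 3.0, 4.0],
    "B": [2.0, 1.5, None, 8.0],
    "C": [0.5, 0.7, 0.9, 1.1],
}

for first, second in [("A", "B"), ("A", "C"), ("B", "C")]:
    xs, ys = complete_pairs(data[first], data[second])
    print(first, second, "samples used:", len(xs))
```

The second sample has no value for A, so it is left out of every correlation involving A, yet it still contributes to the correlation between B and C.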

Usage

Examples of usage:

ometer -m correlation -o result input.dat
This will analyze the data in file input.dat and write the results to the following files:

All correlation-relevant options:

-m correlation or --method=correlation : required!
--proj-dimension int : dimension to project data
--proj-variance double : minimum variance kept in projection
--samples-center : centers all samples before analysis (i.e. subtracts the mean value of each sample)
--vars-center : centers all variables before analysis (i.e. subtracts the mean value of each variable)
--samples-z : transforms samples into z-scores before analysis (i.e. subtracts the mean value, and divides by standard deviation of each sample)
--vars-z : transforms variables into z-scores before analysis (i.e. subtracts the mean value, and divides by standard deviation of each variable)
--samples-total double : scales all samples to the same given total before analysis
--biplot-dim int : dimension of the biplot to produce, must be 2 or 3
--biplot-alpha double : weighting factor; if 1 the biplot is variables-weighted, if 0 it is samples-weighted, and anything in between gives different weights to variables and samples. With alpha=0.5 the biplot is symmetric.
--biplot-vars string : filename of a text file containing the names of the variables to include in the biplot. Names must be listed one per line, and only these variables will be included in the biplot (the PCA is carried out with all variables, of course).
-f int or --format=int : input file format: variables in rows = 1, variables in columns = 2
-o string or --output=string : name for output files
--proj-2dplots : create 2D data projection plots
--plot-png : gnuplot output in PNG format, only relevant if --proj-2dplots is given
--verbose int : print detailed information, higher values provide more information
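The difference between the sample-wise and variable-wise transformations listed above (--samples-center, --vars-center, --samples-z, --vars-z, --samples-total) can be illustrated with a small sketch. It assumes a table with one row per sample and one column per variable; the helper names are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which one Ometer uses.

```python
import statistics

def center(values):
    """Subtract the mean."""
    mean = statistics.fmean(values)
    return [v - mean for v in values]

def z_scores(values):
    """Subtract the mean and divide by the standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # n - 1 denominator (assumption)
    return [(v - mean) / sd for v in values]

def scale_to_total(values, total):
    """Rescale so that the values add up to the given total."""
    current = sum(values)
    return [v * total / current for v in values]

def transpose(table):
    return [list(column) for column in zip(*table)]

def per_sample(table, func):
    """Apply func to each row (sample)."""
    return [func(row) for row in table]

def per_variable(table, func):
    """Apply func to each column (variable)."""
    return transpose(per_sample(transpose(table), func))

table = [[1.0, 2.0, 3.0],    # sample 1
         [4.0, 6.0, 8.0]]    # sample 2

print(per_sample(table, center))      # analogous to --samples-center
print(per_variable(table, center))    # analogous to --vars-center
print(per_sample(table, lambda row: scale_to_total(row, 100.0)))  # analogous to --samples-total 100
```

Passing `z_scores` instead of `center` to the same two helpers gives the analogue of --samples-z and --vars-z.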

All tab-delimited text files can be loaded directly into spreadsheet programs. The HTML report file should be viewed in a web browser.
