Description: Perform various matrix manipulations on matrices in RDB or white-space delimited format.
Usage:
gist-matrix [options] -matrix1 <filename> [-matrix2 <filename>]
Input:
- -matrix1 <filename> - a tab-delimited matrix file containing floating point values, with labels in the first row and column.
- -matrix2 <filename> - similar to matrix1. This option is only necessary for some of the operations listed below.
Output: Write to standard output the result of performing the requested operation on the given matrix or matrices. The output is in the same format as the input.
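As an illustration of the default input layout described above, the following Python sketch parses a tab-delimited matrix with labels in the first row and column. It is not part of gist-matrix: the function name read_labeled_matrix and the sample data are hypothetical, and treating the top-left corner cell as a discardable placeholder is an assumption.

```python
def read_labeled_matrix(text):
    """Parse a tab-delimited matrix with labels in the first row and column.

    Returns (column_labels, row_labels, values). The top-left corner cell
    is assumed to be a placeholder and is discarded.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split("\t")
    col_labels = header[1:]
    row_labels = []
    values = []
    for line in lines[1:]:
        fields = line.split("\t")
        row_labels.append(fields[0])
        row = [float(x) for x in fields[1:]]
        if len(row) != len(col_labels):
            raise ValueError(
                f"row {fields[0]!r} has {len(row)} values, expected {len(col_labels)}"
            )
        values.append(row)
    return col_labels, row_labels, values


# Hypothetical two-by-two example in the labeled, tab-delimited layout.
sample = "corner\tc1\tc2\nr1\t1.0\t2.5\nr2\t-3.0\t4.0\n"
cols, rows, matrix = read_labeled_matrix(sample)
print(cols, rows, matrix)
```

The length check on each row is a convenience of this sketch; as noted under Bugs, the program itself performs little argument checking.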
Options:
- -operation <option> - Specify the operation to be performed on the given matrix or matrices. These include the following:
- none - Do not modify the matrix.
- size - Print the dimensions of the matrix, row by column.
- randomize - Fill the matrix with random numbers between 0 and 1.
- covariance - Compute the covariance matrix.
- square - Multiply the matrix by itself (must be a square matrix).
- invert - Invert the matrix (must be a square matrix).
- binarize - Convert all entries in the matrix to -1 if less than a given threshold value, 1 otherwise.
- transpose - Transpose the matrix.
- eigenvectors - Compute the eigenvectors of the given matrix (must be symmetric across the diagonal).
- eigenvalues - Compute the eigenvalues of the given matrix (must be symmetric across the diagonal).
- eigenvector1 - Compute only the first eigenvector of the given matrix (symmetry not required).
- symdiag - Symmetrize the matrix by averaging across the diagonal.
- adddiag - Add a given value (specified using the -value option) to the diagonal of the matrix.
- getrowsums - Extract the row sums.
- getcolsums - Extract the column sums.
- getdiag - Extract the diagonal of the matrix.
- add - Add two matrices together (requires -matrix2 option).
- normalizerow - Compute the sum of each row in a given matrix, and divide each entry in the matrix by the sum. If matrix2 is given, then the division is carried out on that matrix.
- normalizecol - Same as normalizerow, but works vertically.
- rescalerow - Linearly rescale the values in each row so that the minimum is -1 and the maximum is 1.
- rescalecol - Same as previous, but for columns.
- normalize - Iteratively normalize the matrix rows and columns so that they sum to 1.0.
- zeromeanrow - Set the mean of each row in the matrix to zero.
- zeromeancol - Set the mean of each column in the matrix to zero.
- varone - Set the variance of each row in the matrix to one.
- colvars - Compute the variance of each column.
- scalarmult - Multiply the matrix by a scalar value (specified via the -value option).
- dotmult - Multiply corresponding entries in two matrices of the same dimensionality.
- posdef - Force a symmetric matrix to be positive definite by adding the negative of the smallest eigenvalue to the diagonal.
- row-correlation - Compute all pairwise correlations between rows in a matrix. Admits a second, optional argument to compute correlations between rows in one matrix and rows in a second.
- col-correlation - Compute all pairwise correlations between columns in a matrix. Like the previous option, admits a second matrix.
- jacard - Compute all pairwise Jaccard similarities between rows in a matrix. The Jaccard similarity is J(X,Y) = \sum_i min(X_i/Y_i, Y_i/X_i).
- shuffle - Randomly shuffle the entries in the matrix.
- alignment - Compute the alignment score between two square matrices of the same dimensionality. This score is <M1,M2> / \sqrt(<M1,M1> <M2,M2>).
- diagonal - Compute the alignment score between a given matrix and the corresponding diagonal matrix (i.e., the matrix with zeroes for off-diagonal elements and ones for diagonal elements).
- label-align - Compute the alignment score between a given matrix and a label matrix that is created from a given label file. The label file (which is specified using the -matrix2 option) must be a two-column file, of the type used by compute-weights, with labels of 1 and -1. The program multiplies the label column by its transpose and then calculates the alignment between the given matrix (specified by -matrix1) and the square label matrix.
- scalediag - Take a square matrix as input, and replace the diagonal elements with the sum of each row, minus the given diagonal element.
- center - Center a given kernel in feature space. The second kernel is centered using feature means from the first kernel. The first kernel must be square, and the second must have the same columns as the first.
- setdiag - Set the entries on the diagonal of a given square matrix to a given value.
- diffusion - Perform a diffusion operation on the given symmetric matrix, using the specified value as the diffusion constant.
- euclidean - Convert a square kernel matrix to a Euclidean distance matrix using d(x,y) = sqrt(k(x,x) - 2k(x,y) + k(y,y)).
- int-dim - Estimate the intrinsic dimensionality of a square kernel matrix. This is done by converting the kernel to a Euclidean distance matrix, and then computing the squared mean inter-object distance divided by the corresponding variance.
- modav - Compute the mean off-diagonal absolute value in a square matrix.
- -width <value> - Total number of digits in each output value.
- -precision <value> - Number of digits to the right of the decimal point in each output value.
- -value <value> - Specify a value to be used in the matrix operation. When normalizing, this value is the precision to which the normalization is carried. With "adddiag," this value is the number to be added to the diagonal of the given matrix. With "scalarmult," it is the scalar value to multiply the matrix by.
- -seed <value> - Set the seed for the random number generator. By default the seed is set from the clock. The random number generator is only used in conjunction with the "normalize" option.
- -raw - Do not require on input (or produce on output) labels on each row and column. Also, use space delimiters rather than tabs between entries.
- -rdb - Allow the program to read and create RDB formatted files, which contain an additional format line after the first line of text.
- -verbose 1|2|3|4|5 - Set the verbosity level of the output to stderr. The default level is 2.
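The formulas given for the alignment and euclidean operations can be sketched in a few lines of Python. This is an illustration only, not the program's implementation: it assumes that <A,B> denotes the Frobenius inner product (the sum of products of corresponding entries), which the text above does not state, and the function names and sample matrices are hypothetical.

```python
import math


def frobenius(a, b):
    """Assumed meaning of <A,B>: sum of products of corresponding entries."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def alignment(m1, m2):
    """Alignment score <M1,M2> / sqrt(<M1,M1> <M2,M2>)."""
    return frobenius(m1, m2) / math.sqrt(frobenius(m1, m1) * frobenius(m2, m2))


def kernel_to_euclidean(k):
    """Distance matrix with d(x,y) = sqrt(k(x,x) - 2k(x,y) + k(y,y))."""
    n = len(k)
    # max(0.0, ...) guards against tiny negative values caused by rounding;
    # this safeguard is an addition of the sketch, not a documented behavior.
    return [
        [math.sqrt(max(0.0, k[i][i] - 2 * k[i][j] + k[j][j])) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 2x2 kernel and the identity matrix used by the "diagonal" operation.
kernel = [[2.0, 1.0], [1.0, 2.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

self_alignment = alignment(kernel, kernel)        # a matrix aligned with itself
diagonal_alignment = alignment(kernel, identity)  # as in the "diagonal" operation
distances = kernel_to_euclidean(kernel)
print(self_alignment, diagonal_alignment, distances)
```

Under this reading, a matrix aligned with itself scores 1, and the score falls as the two matrices diverge.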
Bugs: Very little checking of command line arguments is performed.