gist-sigmoid

Usage: gist-sigmoid <train labels> <train predictions> <test predictions>

Description:

Fit a sigmoid function to the discriminant values produced by an SVM, and use the sigmoid to compute probabilities. This program is based upon pseudocode given in "Probabilistic outputs for support vector machines and comparison to regularized likelihood methods" by Platt.
Typically, to use Gist, you begin with a training set of data, a corresponding set of labels, and a test set of unannotated data that you would like to classify. To use gist-sigmoid, follow these steps:

Divide your set of labeled training data into an SVM training set and a sigmoid training set. The sigmoid training set can be smaller than the SVM training set; for example, you might randomly extract 10% of your training set to be used in sigmoid training.
Run compute-weights on the SVM training set.
Run classify on the sigmoid training set, using the weights generated from the SVM training set.
Run classify on the unannotated set of test data, also using the weights generated from the SVM training set. This is the data set for which you would like to get probability estimates.
Finally, run gist-sigmoid using the true labels of the sigmoid training set, the sigmoid training set predictions generated by classify, and the predictions for the unannotated test set.
It is important that the three data sets -- SVM training set, sigmoid training set, and unannotated test set -- be disjoint. Otherwise, the probability estimates that you obtain from gist-sigmoid will be skewed.
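
The split in the first step can be sketched in Python as follows. This is only an illustration and not part of Gist: the function name split_for_sigmoid, the example identifiers, and the fixed random seed are all assumptions made for this example. The 10% fraction comes from the suggestion above.

```python
import random

def split_for_sigmoid(example_ids, sigmoid_fraction=0.1, seed=0):
    """Randomly split labeled examples into two disjoint sets.

    Returns (svm_training_ids, sigmoid_training_ids). Hypothetical helper,
    not a Gist program.
    """
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(example_ids)
    rng.shuffle(shuffled)
    n_sigmoid = max(1, round(len(shuffled) * sigmoid_fraction))
    # The first n_sigmoid shuffled examples go to sigmoid training,
    # the rest to SVM training, so the two sets cannot overlap.
    return shuffled[n_sigmoid:], shuffled[:n_sigmoid]

# Made-up identifiers standing in for the labeled training examples.
labeled_ids = [f"example{i}" for i in range(100)]
svm_ids, sigmoid_ids = split_for_sigmoid(labeled_ids)
```

The two lists of identifiers can then be used to select the matching rows of the data and label files before running compute-weights and classify.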

Inputs:

train labels - an SVM label file, similar to the input provided to compute-weights.
train predictions - an SVM predictions file corresponding to the given train labels, as produced by classify. Note that, for the probability estimates to be valid, the sigmoid training set should be disjoint from the examples used to train the SVM.
test predictions - an SVM predictions file in which the discriminant values are to be converted into probabilities.

Output:

The program prints to standard output a version of the test predictions file with an additional column. This column contains probabilities corresponding to each of the given discriminant values. The parameters of the sigmoid (A and B) are included in an additional line at the end of the header. The formula for converting a discriminant value X into a probability is 1 / (1 + exp(A * X + B)). Also, the second column of the predictions file contains binary class predictions based upon the probabilities (using a threshold of 50%), rather than based upon the discriminants.
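
As a minimal sketch, the conversion formula and the 50% threshold can be written in Python as shown below. The function names are hypothetical, the class labels 1 and -1 are an assumption about the predictions file, and the handling of a probability of exactly 0.5 is a guess rather than documented behavior. The two branches are algebraically identical to 1 / (1 + exp(A * X + B)); they only avoid overflow in exp for large arguments.

```python
import math

def discriminant_to_probability(x, a, b):
    """Return 1 / (1 + exp(a * x + b)), computed without overflow."""
    z = a * x + b
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)           # same value as 1 / (1 + exp(z))
    return 1.0 / (1.0 + math.exp(z))

def classify_by_probability(x, a, b, threshold=0.5):
    """Binary class from the probability; labels 1 / -1 are assumed."""
    return 1 if discriminant_to_probability(x, a, b) >= threshold else -1

# Example with made-up sigmoid parameters A = -2.0 and B = 0.0.
p = discriminant_to_probability(1.0, -2.0, 0.0)
```

With a negative A, as in this made-up example, larger discriminant values map to higher probabilities.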

Options:

-algorithm [platt|lin] Selecting the lin option activates an alternative optimization routine. This is based upon pseudocode provided by Lin and colleagues (Lin H, Lin C, Weng R. "A note on Platt's probabilistic outputs for support vector machines", Technical report, Department of Computer Science and Information Engineering, National Taiwan University, 2003). This code was supplied by Michael E. Matheny (mmatheny@dsg.harvard.edu).
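
For orientation, the kind of routine described in that note (Newton's method with a backtracking line search on the regularized log-likelihood) can be sketched in Python as below. This is an illustrative sketch and not the Gist source code: the function names, default tolerances, and toy data are assumptions, and the actual implementation may differ in its details.

```python
import math

def _objective(decs, targets, a, b):
    """Negative log-likelihood of the sigmoid, in a numerically stable form."""
    total = 0.0
    for f, t in zip(decs, targets):
        z = a * f + b
        if z >= 0:
            total += t * z + math.log1p(math.exp(-z))
        else:
            total += (t - 1.0) * z + math.log1p(math.exp(z))
    return total

def fit_sigmoid(decs, labels, max_iter=100, min_step=1e-10,
                sigma=1e-12, eps=1e-5):
    """Fit A and B so that P(positive | x) = 1 / (1 + exp(A * x + B))."""
    n_pos = sum(1 for y in labels if y > 0)
    n_neg = len(labels) - n_pos
    # Smoothed targets in place of hard 0/1 labels.
    hi = (n_pos + 1.0) / (n_pos + 2.0)
    lo = 1.0 / (n_neg + 2.0)
    targets = [hi if y > 0 else lo for y in labels]
    a, b = 0.0, math.log((n_neg + 1.0) / (n_pos + 1.0))
    fval = _objective(decs, targets, a, b)
    for _ in range(max_iter):
        # Gradient (g1, g2) and Hessian (h11, h21, h22); sigma keeps H invertible.
        h11 = h22 = sigma
        h21 = g1 = g2 = 0.0
        for f, t in zip(decs, targets):
            z = a * f + b
            if z >= 0:
                e = math.exp(-z)
                p, q = e / (1.0 + e), 1.0 / (1.0 + e)
            else:
                e = math.exp(z)
                p, q = 1.0 / (1.0 + e), e / (1.0 + e)
            d2 = p * q
            h11 += f * f * d2
            h22 += d2
            h21 += f * d2
            d1 = t - p
            g1 += f * d1
            g2 += d1
        if abs(g1) < eps and abs(g2) < eps:
            break                       # gradient is small enough: converged
        # Newton direction: solve H * (da, db) = -(g1, g2).
        det = h11 * h22 - h21 * h21
        da = -(h22 * g1 - h21 * g2) / det
        db = -(-h21 * g1 + h11 * g2) / det
        gd = g1 * da + g2 * db
        step = 1.0
        while step >= min_step:         # backtracking line search
            new_a, new_b = a + step * da, b + step * db
            new_f = _objective(decs, targets, new_a, new_b)
            if new_f < fval + 1e-4 * step * gd:
                a, b, fval = new_a, new_b, new_f
                break
            step /= 2.0
        if step < min_step:
            break                       # line search failed; keep current A, B
    return a, b

# Toy sigmoid training set: discriminants and true labels (made-up values).
toy_decs = [-2.0, -1.5, -1.0, -0.5, 0.3, -0.2, 0.5, 1.0, 1.5, 2.0]
toy_labels = [-1, -1, -1, -1, -1, 1, 1, 1, 1, 1]
A, B = fit_sigmoid(toy_decs, toy_labels)
```

The fitted A and B would then be plugged into the conversion formula given in the Output section.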

Calls: none
