scfg_train [options] [-grammar ifile] [-corpus ifile] [-method string {inout}] [-passes int {50}] [-startpass int {0}] [-spread int] [-checkpoint int] [-heap int {210000}] [-o ofile]
scfg_train takes a stochastic context-free grammar (SCFG) and trains its rule probabilities with respect to a given bracketed corpus using the inside-outside algorithm. This is basically an implementation of Pereira and Schabes (1992). Note that using this program properly may require months of CPU time.
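To make the idea concrete, here is a minimal Python sketch of the inside pass restricted by a bracketing, which is the core quantity that inside-outside training re-estimates from. It is not the code of scfg_train: the dictionary-based grammar representation, the function names, and the half-open span convention are all assumptions made for illustration, and the grammar is assumed to be in Chomsky normal form.

```python
from collections import defaultdict


def crosses(span, brackets):
    """Return True if span (i, j) partially overlaps any bracket (a, b)."""
    i, j = span
    return any(a < i < b < j or i < a < j < b for a, b in brackets)


def inside_probability(words, lexical, binary, brackets=(), start="S"):
    """Inside pass for a CNF grammar, skipping spans that cross a bracket.

    Assumed (hypothetical) representation:
      lexical: {(A, word): prob} for rules A -> word
      binary:  {(A, B, C): prob} for rules A -> B C
      brackets: half-open word index pairs (a, b) taken from the corpus
    """
    n = len(words)
    # beta[(i, j, A)] = probability that A derives words[i:j]
    beta = defaultdict(float)
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                beta[(i, i + 1, a)] += p
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            if crosses((i, j), brackets):
                continue  # span is incompatible with the bracketing
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    left = beta.get((i, k, b), 0.0)
                    right = beta.get((k, j, c), 0.0)
                    if left and right:
                        beta[(i, j, a)] += p * left * right
    return beta[(0, n, start)]


# Toy ambiguous grammar: S -> S S (0.5), S -> a (0.5)
lexical = {("S", "a"): 0.5}
binary = {("S", "S", "S"): 0.5}
sentence = ["a", "a", "a"]
unbracketed = inside_probability(sentence, lexical, binary)
bracketed = inside_probability(sentence, lexical, binary, brackets=[(0, 2)])
```

In this toy case the bracket over the first two words rules out the right-branching analysis, so the bracketed inside probability is half the unbracketed one. Excluding crossing spans in this way also reduces the number of chart cells that have to be filled.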
-grammar ifile Grammar file, one rule per line.
-corpus ifile Corpus file, one bracketed sentence per line.
-method string {inout} Method for training: inout.
-passes int {50} Number of training passes.
-startpass int {0} Start training at pass N.
-spread int Spread training data over N passes.
-checkpoint int Save grammar every N passes.
-heap int {210000} Set size of Lisp heap, needed for large corpora.
-o ofile Output file for trained grammar.
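
As an illustration of how the options above combine, the following Python sketch assembles a command line that could be handed to a process launcher. Only flags listed above are used. The helper name scfg_train_command and the file names initial.gram, train.brackets and trained.gram are hypothetical, and the sketch only builds and prints the command rather than running it.

```python
import shlex


def scfg_train_command(grammar, corpus, output, passes=50, checkpoint=None):
    """Build an argv list for scfg_train from the documented options."""
    argv = [
        "scfg_train",
        "-grammar", grammar,
        "-corpus", corpus,
        "-method", "inout",
        "-passes", str(passes),
    ]
    if checkpoint is not None:
        # Save the grammar every N passes so a long run can be inspected.
        argv += ["-checkpoint", str(checkpoint)]
    argv += ["-o", output]
    return argv


# Hypothetical file names, chosen only for this example.
cmd = scfg_train_command(
    "initial.gram", "train.brackets", "trained.gram", checkpoint=5
)
command_line = shlex.join(cmd)
print(command_line)
```

Given the long run times mentioned above, setting -checkpoint is a reasonable precaution, and the resulting list can be passed to subprocess.run(cmd, check=True) once the real file paths are filled in.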