Building Synthetic Voices
Chapter 3. A Practical Speech Synthesis System

3.6. Extracting features from utterances

Many of the training techniques described in the following chapters extract basic features (via pathnames) from a set of utterances. The easiest way to do this is with the Festival script festival/examples/dumpfeats. It takes a list of features/pathnames, given either directly as a list or in a file, and saves their values for a given set of items in a single feature file (or in one file per utterance). Call festival/examples/dumpfeats with the argument -h for more details.

For example, suppose that for every segment in all utterances we want its duration, its name, the name of the segment preceding it, and the name of the segment following it:

dumpfeats -feats '(segment_duration name p.name n.name)' \
-relation Segment -output dur.feats festival/utts/*.utt

To save the features in separate files, one for each utterance, include "%s" in the output filename; it will be filled in with the utterance fileid. Thus, to dump all features named in the file duration.featnames, we would call

dumpfeats -feats duration.featnames -relation Segment \
-output feats/%s.dur festival/utts/*.utt
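The "%s" substitution can be pictured with a small Python sketch. This is only an illustration, not the code dumpfeats itself uses: the helper output_path is hypothetical, and it assumes that the fileid is the utterance file's base name without its extension. The utterance name kal_0001 is likewise made up.

```python
from pathlib import PurePosixPath


def output_path(template: str, utt_file: str) -> str:
    """Return the feature file name for one utterance file.

    Assumption: the fileid is the base name of the utterance file
    with its extension removed (e.g. kal_0001 for kal_0001.utt).
    """
    fileid = PurePosixPath(utt_file).stem
    if "%s" in template:
        # One output file per utterance.
        return template.replace("%s", fileid)
    # No "%s": every utterance goes to the same single file.
    return template


per_utt = output_path("feats/%s.dur", "festival/utts/kal_0001.utt")
single = output_path("dur.feats", "festival/utts/kal_0001.utt")
```

Under that assumption, the first call names a separate file for the utterance, while the second leaves the single output filename unchanged.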

The file duration.featnames should contain the features/pathnames one per line (without the opening and closing parentheses).
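The relationship between the two ways of giving features can be sketched in Python. The helper names below are hypothetical and not part of Festival; the sketch simply turns a parenthesised list, as passed directly to -feats, into the one-per-line text expected in a file such as duration.featnames.

```python
def featlist_to_names(feats: str) -> list[str]:
    """Split a parenthesised feature list into individual names."""
    inner = feats.strip()
    if inner.startswith("(") and inner.endswith(")"):
        inner = inner[1:-1]  # drop the opening and closing parentheses
    return inner.split()


def featnames_text(feats: str) -> str:
    """Return file contents with one feature/pathname per line."""
    return "\n".join(featlist_to_names(feats)) + "\n"


names = featlist_to_names("(segment_duration name p.name n.name)")
text = featnames_text("(segment_duration name p.name n.name)")
```

Writing text to duration.featnames (for instance with pathlib.Path("duration.featnames").write_text(text)) would give a file requesting the same four features as the first example.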

Other features and other specific code (e.g. selecting a voice that uses an appropriate phone set) can be included in this process by naming a Scheme file with the -eval option.

The dumped feature files consist of one line for each item in the named relation, containing the requested feature values separated by whitespace. For example:

0.399028 pau 0 sh
0.08243 sh pau iy
0.07458 iy sh hh
0.048084 hh iy ae
0.062803 ae hh d
0.020608 d ae y
0.082979 y d ax
0.08208 ax y r
0.036936 r ax d
0.036935 d r aa
0.081057 aa d r
...
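A file in this format is straightforward to read back for later training steps. The following Python sketch assumes the four features requested in the first example (segment_duration, name, p.name, n.name), in that order; the SegmentRow type and parse_dur_feats function are illustrative names, not part of Festival.

```python
from typing import NamedTuple


class SegmentRow(NamedTuple):
    duration: float  # segment_duration
    name: str        # name
    prev_name: str   # p.name
    next_name: str   # n.name


def parse_dur_feats(text: str) -> list[SegmentRow]:
    """Parse whitespace-separated feature lines into typed rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or incomplete lines
        dur, name, prev_name, next_name = fields
        rows.append(SegmentRow(float(dur), name, prev_name, next_name))
    return rows


sample = """0.399028 pau 0 sh
0.08243 sh pau iy
0.07458 iy sh hh
"""
rows = parse_dur_feats(sample)
mean_duration = sum(r.duration for r in rows) / len(rows)
```

Note that in the sample the first segment has no predecessor, and its p.name field appears as 0, so code consuming these files should treat that value as a placeholder and not as a phone name.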
