Time Domain Synthesis
This synthesis technique uses snippets of actual recorded speech. These
snippets can range in size from a single phoneme up to a complete word. The
unit usually chosen is called a "diphone" and extends from the center
of a given phoneme to the center of the next phoneme in the word. The diphones
are spliced or blended together to form continuous speech. The drawbacks to
this technique are:
- Only one voice per diphone database. Each new voice requires an
entirely new database, and each database is in the megabyte range.
- Programmers or users cannot create new voices.
- Limited pitch range. Not good for singing.
- Limited speaking rate range.
- Not all diphones in the database will splice together nicely, resulting in
gurgles, false consonants and other strange discontinuities.
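
The splicing step described above can be illustrated with a minimal sketch. It is not SoftVoice's or any particular product's implementation. It assumes each diphone is stored as a plain list of audio samples, and that joins are smoothed with a simple linear cross-fade. The function names, the "_" silence marker, and the "a-b" diphone naming are hypothetical and chosen only for illustration.

```python
def diphone_names(phonemes, silence="_"):
    """List the diphones needed to speak a phoneme sequence.

    Each diphone spans the center of one phoneme to the center of the
    next, so a word of N phonemes padded with silence needs N + 1 units.
    The "a-b" naming and the "_" silence marker are assumptions.
    """
    padded = [silence] + list(phonemes) + [silence]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def splice(units, overlap):
    """Join sample lists, cross-fading `overlap` samples at each boundary.

    A linear cross-fade is one simple way to blend units. It hides small
    mismatches but cannot repair diphones whose pitch or spectrum differ
    widely at the join, which is where audible discontinuities come from.
    """
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming unit
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        out.extend(unit[n:])
    return out


if __name__ == "__main__":
    # Hypothetical phoneme symbols for the word "cat".
    print(diphone_names(["k", "ae", "t"]))
    # Two toy "recordings": a constant 1.0 followed by a constant 0.0.
    print(splice([[1.0] * 4, [0.0] * 4], overlap=2))
```

Because every sample in the output comes from a stored recording, the voice, pitch, and speaking rate are fixed by the database, which is the root of the limitations listed above.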