Text to Speech Software and Voices

I’m trying to understand a little more about Text to Speech technologies, and came across a couple of helpful links: a Speech synthesis page from Wikipedia and a Text-to-Speech page at SNOW. It appears there are a number of different text to speech software packages available.

The issue seems to be further complicated in that at least some of the software/voice packages appear to require a license based on how many times a user distributes a voice recording.

My last concern is how to select a voice to use, since there appear to be a number of different options available. To save you some time, if you are looking for AT&T’s Natural Voices, the first page I came across was the AT&T Text to Speech Research Lab, which took me to the official AT&T Natural Voices page and from there to Wizzard Software, where you can actually purchase a product, although I still wasn’t sure exactly what…

Anyways, I am looking for some help on this one: does anyone know of any good resources, or have a favorite piece of software or favorite voice? If not, I don’t know how updated this is, but it looks like a good place to start.

As a sidenote, I wasn’t even aware of the W3C Speech Synthesis Markup Language (SSML) – looks really interesting.
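To give a feel for what SSML looks like, here is a minimal sketch in Python that builds a small SSML document using only the standard library. The `speak`, `break`, and `emphasis` elements and the namespace URI come from the W3C SSML 1.0 recommendation; the `build_ssml` helper and its parameters are hypothetical names made up for this illustration, and whether a given TTS engine accepts the result depends on how much of SSML that engine supports.

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the W3C SSML 1.0 recommendation.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(sentence, emphasized, pause_ms=500, lang="en-US"):
    """Return an SSML string: a sentence, a pause, then an emphasized word.

    Hypothetical helper for illustration only; not part of any TTS product.
    """
    # Root <speak> element with the attributes SSML 1.0 expects.
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang},
    )
    speak.text = sentence + " "

    # <break> inserts a pause of the given length.
    ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})

    # <emphasis> asks the engine to stress the enclosed text.
    emphasis = ET.SubElement(speak, "emphasis", {"level": "strong"})
    emphasis.text = emphasized

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Welcome to the demo.", "Enjoy!", pause_ms=300))
```

The resulting string would be handed to whatever engine you end up choosing; engines differ in which SSML elements they honor, so treat this as a starting point for experimenting rather than something guaranteed to work everywhere.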

  1. I’ve spent lots of time working with Text-To-Speech (TTS) software and dealing with the many vendors that provide TTS engines. The company I work for, Conversive, Inc. ( http://www.conversive.com/ ), develops customer service portals using NLP, and many of our deployments use animated characters with Speech Synthesis. We use only Microsoft Speech API 5.0 (SAPI 5.0) compatible engines, to simplify development and lip-sync.

    SAPI 5 voices sound far superior to the older SAPI 4 voices supported by MS Agent. If you don’t want to spend much and just want to experiment with TTS technology, I recommend Cepstral, but their voices are somewhat robotic sounding. The next step up is the AT&T voices offered by Wizzard: better than Cepstral, but still not as good as the really great sounding voices from Nuance and Loquendo, which cost thousands of dollars for a one-year server license. If you want to try out SAPI 4.0 voices, you can download the old L&H SAPI 4.0 voices for free. You can use SAPI 4 in any MS Agent compatible program, including Conversive’s Verbot 4 product at http://www.verbots.com/

    It’s really a shame that there aren’t more freely available high-quality TTS engines out there. I know there are some open source projects developing TTS, but from what I’ve heard they all sound like Microsoft Sam.