Text to Speech Software and Voices

I’m trying to understand a little more about Text to Speech technologies, and came across a couple of helpful links: a Speech synthesis page from Wikipedia and a Text-to-Speech page at SNOW. It appears there are a number of different text to speech software packages available.

The issue seems to be further complicated in that at least some of the software/voice packages appear to require a license based on how many times a user distributes a voice recording.

My last concern is how to select a voice to use: it looks like a number of different options are available. To save you some time, if you are looking for AT&T’s natural voices, the first page I came across was their AT&T Text to Speech Research Lab, which took me to their official AT&T Natural Voices page and from there to Wizzard Software, where you can actually purchase a product, although I still wasn’t sure exactly what…

Anyways, I am looking for some help on this one: does anyone know of any good resources, or have a favorite piece of software or favorite voice? If not, I don’t know how updated this is, but it looks like a good place to start.

As a side note, I wasn’t even aware of the W3C Speech Synthesis Markup Language (SSML); it looks really interesting.
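For anyone else new to SSML, here is a rough sketch of what a small SSML document looks like, built with Python's standard library. The `speak`, `break`, and `prosody` elements and the namespace come from the W3C specification; the function name `build_ssml` and the sample text are just made up for illustration, and whether a given TTS engine accepts this markup depends on the engine.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"
# Standard XML namespace, needed for the xml:lang attribute.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml(greeting: str, detail: str) -> str:
    """Hypothetical helper: return a small SSML document as a string.

    The greeting is spoken normally, followed by a half-second pause,
    then the detail text is spoken at a slow rate.
    """
    # Serialize SSML elements without a prefix (default namespace).
    ET.register_namespace("", SSML_NS)

    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": "en-US"},
    )
    speak.text = greeting

    # <break> inserts a pause; time="500ms" is an arbitrary example value.
    ET.SubElement(speak, f"{{{SSML_NS}}}break", {"time": "500ms"})

    # <prosody> adjusts how the enclosed text is spoken.
    prosody = ET.SubElement(speak, f"{{{SSML_NS}}}prosody", {"rate": "slow"})
    prosody.text = detail

    # ElementTree escapes special characters such as "&" automatically.
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Hello from AT&T Natural Voices.", "This part is read slowly."))
```

The resulting string would then be handed to whatever engine you are testing, assuming that engine supports SSML input.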

  1. I’ve spent lots of time working with Text-To-Speech (TTS) software and dealing with the many vendors that provide TTS engines. The company I work for, Conversive, Inc. ( http://www.conversive.com/ ), develops customer service portals using NLP, and many of our deployments use animated characters with Speech Synthesis. We use only Microsoft’s Speech API 5.0 (SAPI 5.0) compatible engines, to simplify development and lip-sync.

    SAPI 5 voices sound far superior to the older SAPI 4 voices supported by MS Agent. If you don’t want to spend much and just want to experiment with TTS technology, I recommend Cepstral, but their voices are somewhat robotic sounding. The next step up is the AT&T voices offered by Wizzard: better than Cepstral, but still not as good as the really great sounding voices from Nuance and Loquendo, which cost thousands of dollars for a one-year server licence. If you want to try out free SAPI 4.0 voices, you can download the old L&H SAPI 4.0 voices, and you can use SAPI 4 in any MS Agent compatible program, including Conversive’s Verbot 4 product at http://www.verbots.com/

    It’s really a shame that there aren’t more freely available high-quality TTS engines out there. I know there are some open source projects developing TTS, but from what I’ve heard they all sound like Microsoft Sam.
