AMAZON IS WORKING on a new ‘newscaster voice’ for its Alexa-powered speakers.
The e-tail giant has been working on a better cadence and tone for the native Alexa voice to make it sound more natural, and a little less robotic, when delivering news.
The technique being used is called “neural text-to-speech”, or NTTS for short. It uses machine learning to create more natural voices without picking apart the words into silos of syllables, which have no meaning and therefore can’t be contextualised by the AI.
In fact, that older pick-apart approach, known as “concatenative speech synthesis”, has been around for so long that it’s not that dissimilar to early speech experiments on 8-bit machines like the ZX Spectrum (remember the Currah μSpeech, anyone?).
But now, in the age of even relatively primitive artificial intelligence, the words can be strung together in a way that allows the system to apply voice patterns more appropriate to the matter at hand, and it learns to do so far more effectively than a human could teach it the rules. After all, we don’t consciously think about how we say what we say; we’ve simply learnt it, so explaining it becomes pretty tricky.
The result is a significant improvement. You certainly won’t mistake it for a real person – but it should reduce some OF those annoYING emPHAsises that it currently has.
What it does do, however, is open up the possibility of a whole Bremnersworth of voice styles, such as a story-telling mode and a more authoritative tone for warnings.
If that sounds like a huge leap from where we are now, it’s worth knowing that Amazon actually created the newscaster voice using a few hours of training. And once you have a formula for success, doing it again and again should be a cinch.
The newscaster voice should be out for US users, maybe further afield too, in a few weeks.
Could be worse. It could be this guy. μ
Source: Inquirer