If there’s one company that is completely serious about finding as many uses for artificial intelligence as it can, it’s Google. In a research paper published in December, the company describes a brand new text-to-speech system named Tacotron 2, which it claims can imitate almost perfectly the way humans speak when reading.
This is the second time Google has made an official announcement regarding a system that uses two deep neural networks at the same time. In this instance, the first neural network translates the text into a spectrogram, which is a visual way to represent audio frequencies over a period of time. The spectrogram is then fed into the second network, WaveNet, a system created by DeepMind, Google’s AI research lab. WaveNet reads the chart and generates the voice.
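To make the idea of a spectrogram more concrete, here is a minimal Python sketch that slices a sound into short frames and measures how strong each frequency is in every frame. It is not Tacotron 2 or WaveNet, and it is not taken from Google's paper; the `spectrogram` function, the frame sizes and the 1000 Hz test tone are all assumptions chosen purely for illustration.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame holds one magnitude per frequency bin."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # A Hann window softens the frame edges to reduce spectral leakage.
        windowed = [
            s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        # Naive discrete Fourier transform, keeping non-negative frequencies only.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            total = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


# Hypothetical test signal: a pure 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]

spec = spectrogram(tone)

# Find the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), "frames,", len(spec[0]), "bins, peak near", peak_hz, "Hz")
```

Each row of `spec` is one slice of time and each column is one frequency band, which is the "chart" the second network would read. A real system would use a fast Fourier transform and far more data, but the shape of the result is the same.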
The research paper includes some samples, some of which are generated by the AI and some of which are spoken by an actual person. Frankly, it’s impossible to tell which comes from whom.
Tacotron 2 is also very good at pronouncing difficult words and names. It takes punctuation into account and stresses words that are capitalized.
In short, Google has managed to create an AI so good that you can’t tell whether the voice is human or machine. This is a great addition to Google’s many voice services, such as the Google Assistant on your phone or the Google Translate reading feature. At this point, the system only comes with one particular female voice, and the work would have to be redone to accommodate a different voice, whether female or male. But if they’ve done it once, they can do it again.
The issues with Google’s creation
Some issues arise from this technological advancement. It’s clear that voice imitation by AI and robots has reached near perfection. In many areas of the world, customer support jobs have been replaced with automated helpers to save money. Sometimes you can tell you’re not talking to another human, and sometimes you can’t. When you’re dealing with your home assistant or smartphone assistant, it’s obvious that you’re not speaking to another human, but that’s not the case for customer support, for instance.
Should we get a notification that we’re not talking to a human? How would you react to such a notification? Would you trust the help it gives you, or ask to be put through to a human? Theoretically, the automated helper should be able to offer solutions as good as a human’s, if not better, because it can draw on everything it has been given, including the things a human may forget. It’s a question each of us has to answer personally, but we should keep in mind that as the capabilities of AI and robotics grow, we’re going to see this type of situation more and more often, so it’s perhaps best if we learn to adapt to the future.