New Microsoft technology lets 3D copies of a real person speak any language
About a week ago, Julia White, a representative of the corporation, demonstrated a new technology at a conference. It not only creates a fairly realistic hologram (in virtual reality) but also lets that hologram speak a given language. The voice's tone, volume, timbre, and other parameters are taken from the person the hologram was made from. The listener thus sees a virtual copy of another person, and this copy speaks the language the listener needs.
The technology was made possible by combining two different solutions: mixed reality and neural text-to-speech. It may help remove the communication barriers that still exist. The Internet has enabled people to communicate in real time, and now there is an opportunity for them to speak the same language as well.
The corporation solved the task in stages. The first stage was the creation of a realistic, full-height hologram of White. To achieve this, she visited a specialized Microsoft laboratory, where her talk was recorded in English. The recording was volumetric, so that a three-dimensional model of the person could be built from it.
Once this stage was complete, any owner of a Microsoft HoloLens headset could watch her talk. Work then began on copying White’s voice and translating her speech into Japanese using text-to-speech technology based on neural networks. The result was excellent: the voice parameters were transferred almost perfectly, at least as far as that is possible given that the final speech was in Japanese, which sounds very different from English.
Naturally, this is only a demonstration, and one that took quite a long time to prepare. But like any technology, it will become more efficient and easier to use over time. Microsoft plans to keep improving and extending the project.
At first, its application will be limited to specific cases. For example, as 3D glasses spread, performances by famous artists or speeches by political leaders will become more popular. Viewers will be able to see them up close, and they will speak the viewer's native language.
You can also imagine lectures organized in a similar way. Moreover, it is safe to assume that turning a person into a hologram that speaks the audience's language will become a matter of several hours, not days. The main requirements are equipment for recording performances in 3D and a neural network that is able to “translate” the speaker’s speech.