Microsoft Unveils an Artificial Intelligence That Can Imitate Human Voices

Microsoft has introduced its new artificial intelligence model, VALL-E. The system can imitate a human voice from a sample just three seconds long.

We know that artificial intelligence has advanced and gained popularity recently. Systems such as Midjourney and DALL-E, which create images from text, and models such as ChatGPT, which answers whatever we ask, have made an impact all over the world. Now a brand new artificial intelligence move has arrived from Microsoft.

The US technology giant has introduced "VALL-E", an artificial intelligence model that can create speech from text. The system, which could revolutionize artificial intelligence, is said to be able to imitate human voices easily. Of course, this kind of technology has also raised some concerns.

It can imitate voices using only a 3-second sample

According to Ars Technica, VALL-E can imitate a person's voice from only a three-second audio sample. In fact, its abilities are not limited to this: the artificial intelligence can even produce results that match the speaker's emotional tone.

Microsoft describes VALL-E as a language model and states that it builds on a technology called "EnCodec", which Meta introduced in October 2022. Unlike similar systems we normally see, the model works from both text and audio. Basically, it analyzes how a person sounds, uses EnCodec to break this information into separate components, and matches them against its training data. As a result, it produces different sentences by imitating the voice in the sample.

In a paper on the model, the researchers state that they trained VALL-E on 60,000 hours of English-language audio recordings from more than 7,000 speakers. It is said that for the system to produce a good result, the voice in the sample should be close to a voice in the training data.

Microsoft has released some samples from VALL-E on GitHub. In these examples, the artificial intelligence sounds robotic in places, but in others it is surprisingly realistic. The examples also show that VALL-E preserves the speaker's tone and can even reproduce the acoustic environment. For example, if the original speaker is speaking in an echoing place, the system produces sound accordingly.

This kind of technology is not without its risks.

Of course, this kind of technology is somewhat alarming. Malicious people can make it look as if someone said something they never said, or can impersonate others, which may lead to an increase in incidents such as fraud. You can think of these as the same risks posed by deepfakes, which have become popular lately. Because of these risks, Microsoft has not made the code open source. However, we can say that similar technologies may bring the same risks.

Source:
https://arstechnica.com/information-technology/2023/01/microsofts-new-ai-can-simulate-anyones-voice-with-3-seconds-of-audio/

