Q3: Summarize the following article in your own words for nontechnical readers interested in current voice cloning technologies. Reduce it to roughly 25% of the original length. Include an in-text citation and an end-of-text reference in IEEE style.
Modern Social Engineering Voice Cloning Technologies
by S. Sokolov, O. M. Alimov October 9, 2020
Published in IEEE Xplore (pp. 513-514) doi: 10.1109/EIConRus49466.2020.9038954
Over the past few years, technologies capable of recognizing the human voice have made tremendous advances. Each person's voice biometrics are unique, which allows the voice to serve as an authenticator for a specific user. In 1952, Bell introduced the first voice recognizer, capable of recognizing the digits 0 through 9. This was the first major step in the development of speech recognition technology. Today, voice recognition technologies are used widely, both in security systems and in everyday and entertainment settings. They make common tasks easier and are gradually becoming an integral part of life.
The first mass-produced speech recognition product, Dragon Dictate, was released in 1990. IBM introduced ViaVoice in 1994, and Bell introduced the Val telephone system in 1995 to automate dispatching and call routing. Currently, the most popular speech recognition technologies are found in smartphones, which since 2011 have increasingly integrated digital (or virtual) assistants.
Over time, the set of features has gradually grown, although the functions and availability of services depend on the language, country, and region. Modern speech recognition technologies are also used in banking systems (voice authorization), security systems, "smart homes", household appliances, social services for people with disabilities, in-car systems, and other applications.
Nowadays, the number of services using voice control is growing steadily. Many of the largest digital technology companies have introduced, or are preparing to introduce, a digital assistant. This list includes Alibaba (AliGenie), Amazon (Alexa), Apple (Siri), Facebook (M / Aloha), Google (Assistant), and Samsung (Bixby). Virtual assistants are still in their infancy, but companies are already setting development directions aimed at different goals. Meanwhile, the "smart home" systems depicted in films and on television in the 1980s and 1990s have ceased to be expensive and inaccessible. So far, this kind of control has not moved far beyond the original model: people simply use their voices to switch on what they once switched on by hand. With further development, however, these systems will be able to expand their capabilities.
Current technology allows not only recognizing the speech of a particular person but also imitating a target voice. Neural networks that synthesize the voices of specific people are actively gaining popularity. A neural-network-based system can clone a human voice by analyzing even a very short fragment of source material. Such programs not only imitate human speech very convincingly but can also introduce their own peculiarities into it, such as an accent.