Entrepreneur, mathematician and Vice President of the Russian Academy of Business and Entrepreneurship Stan Yurchenko knows firsthand how rapidly technology and digitalization are changing our world. He was one of the developers and organizers of the automated IT system that the Government of Moscow still uses to account for utility payments by Moscow residents. In the 1990s, Stan Yurchenko took part in the development of one of the first speech recognition systems in Russia. Today he invests in high-tech projects, including projects related to blockchain and financial technologies. In his column, Stan Yurchenko discusses the rapid growth of speech recognition technologies.
It is sometimes very hard to keep track of new technologies. Parents of teenagers know this particularly well, as new applications with indecipherable functions keep appearing on their children's cell phones. When parents ask about them, the kids look at them as if they had come straight from the Stone Age and landed in the midst of our civilization.
Progress is in the details. For instance, it seems like just the other day that the world moved to credit card payments and cash became a thing of the past. Then the need to pull the card out of the wallet went too: long live contactless payments! You can simply tap your phone at the terminal. Now there is a new step into the future: Face ID as a means of biometric authentication.
Or take another example. A voice assistant used to be a fixture of sci-fi movies. Now it is a routine part of our lives. These technologies already provide a great benefit to society: for example, they are used in caring for the severely ill.
By now, many problems in natural speech recognition have been solved. It is no longer necessary to spend a long time training a system for a particular speaker, and ridiculous mistakes have become a thing of the past. It is fair to say that speech recognition has become a successful part of the broader IT industry.
I still remember how it all started. Back when I was a student, I took part in a pioneering project in this area. When I was a senior at the Moscow Institute of Physics and Technology [MIPT], four of us, friends and like-minded thinkers, were eager to solve a problem that no one had solved at that time: how to make a computer ‘understand’ natural human speech. It was a classic ‘garage project’, involving soldering irons, sleepless nights and disorganized living. They worked in much the same way in Silicon Valley back in those days. (Editor’s note: In 1992-1995, while studying at the MIPT, Stan Yurchenko organized and participated in a project to create speech recognition systems. Together with his fellow students he received direct investments from an American company. The developers achieved stable speech recognition results using the computing power of the time (PC 80286).) The MIPT education gave us a very strong mathematical grounding. We used Fourier transforms and Markov models, but we believed that success lay in recognizing pairs of kinakemes. A kinakeme is, in a sense, a minimal linguistic unit. Kinakemes are the quarks of speech, just as complicated and hard to analyze as the quarks of which protons, neutrons and many other particles are composed. It may very well be that present-day speech recognition systems include pieces of our code.
In the last 30 years, the cost and availability of computing power have changed radically. These days, technology allows speech recognition with an accuracy of up to 90%. This is quite enough for routine applications, but even that is not the limit. I think that the kinakeme approach will allow speech recognition to reach nearly 100% accuracy on present-day equipment. And that would apply to any speech, even indistinct, muted or accented. The number of areas where the technology can be used is unlimited.
Forensics, for instance, has great hopes for the development of speech recognition systems. Some believe that voice identification may be more effective than fingerprints. Voice biometrics technologies are already used by law enforcement agencies in many countries. In Russia, they will become part of the biometrics system that was launched in 2018.
One of the promising areas of technological development is simultaneous automated translation of human speech from one language to another. For now, this works quite crudely; computers lack the imagery of human thinking needed to translate metaphors and comparisons accurately or to convey the semantic musicality of speech. Artificial intelligence, however, is growing by leaps and bounds and, I think, in 15-20 years every one of us will have an earbud that translates speech from any language to any other on the fly, something similar to the Babel Fish from the famous The Hitchhiker’s Guide to the Galaxy by Douglas Adams.
We live in fantastic times when, thanks to technologies that enable us to put the collective effort of thousands and thousands of minds all over the world into development, the technologies themselves are growing in geometric progression. What belonged to science fiction yesterday seems routine today. Speech recognition systems are inseparably linked with artificial intelligence systems. It is quite likely that in the near future we will have to solve the problem of separating communication speech, used for exchanging thoughts and ideas with people, from command speech, used for interacting with computers. It may well be that a special artificial language will have to be invented, something similar to Esperanto, specifically for communicating with digital systems or even, in the future, digital forms of life. And then, how soon will computers be ready to compete with people for mastery of language in a poetry or rap battle, and how soon will they start winning?
Stan Yurchenko, entrepreneur, mathematician, a graduate of the MIPT, Vice President of the Russian Academy of Business and Entrepreneurship
Stan Yurchenko was born in Tolyatti, Russia in 1970.
In 1993 Stan Yurchenko graduated from the MIPT (Department of Control and Applied Mathematics).
In 2000 he received a Diploma in Management and Economics, also from the MIPT.
In 2011 he received an MBA from St. Petersburg University and an Executive MBA from HEC (École des Hautes Études Commerciales de Paris), one of the oldest business schools in Europe.
In 1999-2003 Stan Yurchenko held the office of IT Director at the Central Agency of Air Service of Russia.
In 2004-2006 he developed the Unified Information and Payment Center Automated Control System in Moscow. This is a utilities billing, payment and accounting system that is still used as part of the ‘My Documents’ Multiservice Center, which provides state services.
In 2006-2013 Stan Yurchenko was the CEO of the Urban Information Technologies Center. In 2011 he received the ‘Russian of the Year’ award for creating a billing and payment system in Moscow.
In 2013-2015 he was Deputy Head of the Unified Information and Payment Center State Institution.
Since 2016 Stan Yurchenko has been investing in technology projects, including attracting foreign investment to Russia to create Competence and Technology Development Centers.