While Microsoft and Google spar over whose AI chatbot is better, chatbots are not the only use for machine learning and language models. Beyond the rumored plan to showcase more than 20 AI-powered products at this year's annual I/O conference, Google is working toward its goal of building an AI language model that supports 1,000 different languages. In an update released on Monday, Google shared more information about the Universal Speech Model (USM), which it described as a "critical first step" toward that goal.
Last November, the company announced its plan to create a language model supporting the world's 1,000 most widely spoken languages, and also revealed USM. Google describes USM as "a family of state-of-the-art speech models" with 2 billion parameters, trained on 12 million hours of speech and 28 billion sentences of text spanning more than 300 languages. YouTube already uses USM to generate closed captions, and the model also supports automatic speech recognition (ASR), which can automatically detect and transcribe languages including English, Mandarin, Amharic, Cebuano, Assamese, and more.
Now, Google says USM supports more than 100 languages and will serve as a "foundation" for building an even broader system. Meta is developing a similar AI translation tool, which is still in its early stages. You can read more about USM and how it works in the research paper Google has published.
One potential application of this technology is real-time detection and translation displayed in augmented reality glasses, right before your eyes, as in the concept Google demonstrated at last year's I/O. That technology still seems a way off, however, and Google's garbled rendering of Arabic during the I/O demo shows how easy it is to get things wrong.