Photo courtesy of Visual China
AI Growing Pains
Editor’s note: In recent years, artificial intelligence has developed rapidly and its fields of application keep expanding. At the same time, AI "rollover" cases, in which a system fails in public, frequently climb the trending-search lists. This edition therefore launches the column "AI Growing Pains", which focuses on AI "rollover" events: it examines the phenomenon, analyzes the causes, explores solutions, and looks ahead to AI’s growth.
Intern reporter Dai Xiaopei
Fans asking referees to wear hats or wigs after a match is probably unprecedented.
At the end of October, Scottish fans experienced an "unforgettable" football match. In the Scottish Championship match between Inverness Caledonian Thistle and Ayr United, the AI camera on the sideline paid no attention to whether players were passing or attacking with the ball. Instead, it followed a linesman and gave him center-stage close-ups from time to time. It turned out that the AI camera had mistaken the linesman’s bald head for the football and chased it frantically for the whole game.
Fans who watched the 90-minute game at home spent most of it looking not at the ball but at a bald head. Many netizens joked that they had watched the match and seen nothing at all.
Why did the AI camera take a bald head for a football? What needs to be done to keep AI from making similar mistakes? And does a "rollover" incident mean that AI is "weak"?
An unwitting "provocation" from a bald linesman
From taking part directly in sports events to recording athletes’ performance, broadcasting competitions live, and analyzing athletes’ health, AI is becoming the darling of the sports world. A few months ago, FC Barcelona also joined hands with the video technology company Pixellot to create an artificial intelligence coaching solution.
Yet AI, which has made great strides in sports, ran into an unwitting "provocation" from a bald linesman. His head was so shiny in the sunlight that the AI camera could not tell the ball from the head. Before the match, the home club had said that its AI tracking technology would deliver clear live pictures to the home of every season ticket holder, so that fans unable to attend home games because of the COVID-19 epidemic would not miss a single match.
The cameras used for the live broadcast reportedly belong to a multi-camera system provided by Pixellot, the same company that works with Barcelona. The system is powered by NVIDIA graphics processors (GPUs) and can capture video at resolutions up to 8K. The cameras are installed in fixed positions and need no camera operator. To capture key moments, Pixellot collected hundreds of thousands of hours of sports video and trained its algorithms on NVIDIA GPUs in local workstations.
Abundant data, deep learning algorithms, and high-performance GPU computing: the system had all three of the major forces that drive AI forward. So why did Pixellot’s AI camera "roll over"?
After the match, the club and the technology company that makes the cameras reviewed what had happened. The problem seems clear: a football is similar in size and shape to a human head, and direct sunlight left the AI camera "confused". The home club responded that it was aware of the problem, hoped it would not happen again, and would make improvements before the next match to give viewers a better experience.
Pixellot also said the problem is not difficult to solve, since existing object detection and tracking technology is mature. The company had not considered the influence of bald heads at the design stage, so it needs to collect data on footballs and bald heads and fine-tune the algorithm to eliminate the interference.
Some technicians said that training an AI camera for live ball games requires not only a "this is a ball" data set but also a "this is not a ball" data set. Bald heads, bright white shoes, lighting, balls on a training ground next to the pitch, and balls that players use to warm up are all sources of interference that need to be considered when training the AI.
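The "this is a ball" and "this is not a ball" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Pixellot’s pipeline: the names `Sample`, `build_training_set`, and `label_counts`, the image IDs, and the source tags are all hypothetical, and the distractor categories are simply the ones listed by the technicians.

```python
import random
from dataclasses import dataclass

# Hypothetical distractor categories, taken from the list in the article.
HARD_NEGATIVE_SOURCES = [
    "bald_head",
    "white_shoe",
    "lighting",
    "ball_on_adjacent_pitch",
    "warm_up_ball",
]


@dataclass(frozen=True)
class Sample:
    image_id: str  # placeholder for a cropped image patch
    source: str    # where the patch came from
    label: int     # 1 = "this is a ball", 0 = "this is not a ball"


def build_training_set(ball_ids, negatives_by_source, seed=0):
    """Combine match-ball crops with explicitly labeled distractor crops."""
    samples = [Sample(i, "match_ball", 1) for i in ball_ids]
    for source, ids in negatives_by_source.items():
        samples.extend(Sample(i, source, 0) for i in ids)
    random.Random(seed).shuffle(samples)  # mix positives and negatives
    return samples


def label_counts(samples):
    """Count negatives (0) and positives (1) so an imbalance is easy to spot."""
    counts = {0: 0, 1: 0}
    for sample in samples:
        counts[sample.label] += 1
    return counts


# Made-up IDs: three match-ball crops and one crop per distractor category.
demo = build_training_set(
    ball_ids=["ball_001", "ball_002", "ball_003"],
    negatives_by_source={src: [f"{src}_001"] for src in HARD_NEGATIVE_SOURCES},
)
print(label_counts(demo))
```

The point of the sketch is that the distractors carry an explicit negative label instead of being left out of training altogether.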
Poor AI "eyesight" is the norm
Although the performance of AI cameras can be improved by increasing data "feeding", strengthening training and improving algorithms, some professionals believe that with the gradual expansion of AI application scenarios, such "rollover" incidents will exist for a long time.
"It is normal for AI to ‘roll over’. It would be strange if it did not," Huang Tiejun, an EECS professor and president of the Beijing Zhiyuan Artificial Intelligence Research Institute, said frankly in an interview with a Science and Technology Daily reporter.
Huang Tiejun believes that, on the surface, the AI camera’s failure may be due to insufficient training beforehand, but the main reason is that current computer recognition systems are trained only on specific data. In this case, a neural network trained on a large number of football videos has surpassed humans at identifying footballs, but what was overlooked is that the same network is even more sensitive to bald heads.
For such a system, randomly misidentifying unknown objects, or "turning a blind eye" to them, is common.
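A toy example may help show why a system trained only on specific data mislabels things it has never seen. The sketch below is a simple nearest-centroid classifier, not the neural network used in the broadcast. The class names and the (brightness, roundness) numbers are invented for illustration, and `classify` is a hypothetical helper.

```python
import math

# Invented (brightness, roundness) features in [0, 1]; purely illustrative.
TRAINING_CENTROIDS = {
    "ball": (0.90, 0.95),
    "pitch": (0.30, 0.10),
    "player": (0.50, 0.40),
}


def classify(features):
    """Closed-set nearest centroid: always answers with a known class."""
    return min(
        TRAINING_CENTROIDS,
        key=lambda name: math.dist(features, TRAINING_CENTROIDS[name]),
    )


# A sunlit bald head was never part of training, but it is bright and round.
sunlit_bald_head = (0.85, 0.90)
print(classify(sunlit_bald_head))
```

Because the classifier has no "none of the above" option, the unfamiliar object is forced into whichever known class it most resembles, which here is the ball.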
Machine vision aims to give machines visual perception, so that they can perceive a scene much as a biological vision system does. It involves optical imaging, image processing, analysis and recognition, execution, and other components.
"If we treat the camera as AI’s ‘eyes’, then in real scenes AI still has a long way to go before it can tell a football from a bald head the way human eyes do," Huang Tiejun said.
When will AI reach the end of this road, or even arrive at the point where its eye surpasses the human eye?
It depends on when machine vision will bridge the gap with biological vision. "The brain in the skull senses the outside world in real time through more than 3 million nerve fibers, of which there are more than 1 million behind each eye." Huang Tiejun said, "The machine vision developed to this day is still a drop in the bucket compared with the biological vision system that took hundreds of millions of years to evolve."
The human eye is highly adaptable and can identify targets in complex, changing environments. Backed by advanced intelligence, it can use logical analysis and reasoning to recognize changing targets and summarize patterns. Machine vision, although it can use artificial neural network technology, cannot yet identify changing targets well. Constrained by hardware, current general-purpose image acquisition systems also have poor color resolution.
"Compared with biological visual neural networks, the visual neural networks of artificial intelligence are far behind in structure and scale, so their functions are much worse," Huang Tiejun said. "In real applications, a machine vision ‘rollover’ is no accident. Identifying a bald head as a football is just one case. Similar problems actually exist in large numbers."
Huang Tiejun said: "This time, the technology provider can patch the flaw of mistaking a bald head for a football, but many more flaws remain. Fooling face recognition systems with adversarial images is only the tip of the iceberg."
Different technical routes are racing
"Machine vision based on deep learning has made great progress in image recognition, but it has not really solved the perception problem." Huang Tiejun believes that deep learning is far from capturing the complexity of the human visual system.
Deep learning is built on training with big data of images and videos, which is far from biological vision’s active perception of a dynamic world, and it still cannot escape its dependence on computing power. For example, if the video frame rate is raised from 30 to 30,000 frames per second, the computing power required by deep learning must increase 1,000-fold.
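The 1,000-fold figure follows from simple arithmetic if each frame is assumed to cost the same amount of computation. That is a simplifying assumption made here for illustration, and `compute_scale` is a hypothetical helper name.

```python
def compute_scale(base_fps, target_fps):
    """Relative compute needed, assuming a constant cost per frame."""
    return target_fps / base_fps


# Going from 30 to 30,000 frames per second.
print(compute_scale(30, 30_000))  # 1000.0
```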
Biological neural networks are spiking neural networks, which are better suited to processing visual information. Huang Tiejun believes that the hope for a fresh start in machine vision lies in learning from the neural network structure and information processing mechanisms of biological vision systems, and in establishing a new body of brain-like visual information processing theory and technology.
Experts said there are two main technical routes for developing machine vision. The first is to build a powerful intelligent system by collecting more data and strengthening training. The second is to imitate the biological nervous system: copy nature’s template, work out the structure and even the mechanisms of the biological nervous system, and develop future intelligence on that basis.
Huang Tiejun believes that the second path is more effective than the first. "In the short term, the first route produces results more easily. But in the long run, starting from biological neural networks is more direct and gives more confidence of reaching the goal."
At present, most AI scholars support the first path, which develops artificial intelligence, machine vision included, through "big data + big computing power". Huang Tiejun took the road less traveled because he firmly believes that biological visual neural networks hold great untapped potential. "The biological brain is the product of hundreds of millions of years of evolution, and it is the best prior structure. Powerful intelligence must rely on complex structures and stand on the shoulders of evolution. It seems difficult, but it is the fastest way."
Turing, the "father of computer science", long ago made clear his regard for the biological brain. In early 1943, when Shannon suggested that "cultural things" could be instilled in an electronic brain, Turing retorted in public: "No, I’m not interested in building a powerful brain. What I want is just an ordinary brain, just like the head of the chairman of AT&T."
Should machine vision, and artificial intelligence more broadly, be developed from scratch, by imitating biological neural networks, or in some other way? There is no conclusion yet. On their different tracks, the approaches to artificial intelligence are all accelerating.
There are many cases of AI "rollover": the practicality of Google’s AI eye disease diagnosis system dropped sharply in Thailand, for example, and Tencent’s AI translation became a laughingstock at the Boao Forum in 2018. Even so, the AI era is accelerating and unstoppable.
"It must be admitted that AI has indeed solved many practical problems, and it will gradually take over some human functions. But we must not exaggerate it. It is still far from the intelligence we imagine and needs more breakthroughs." Huang Tiejun said that AI technology still has a long way to go, which calls for keeping an open mind and bridging the gap between "experimental simulation" and the "real world".