Disney method relies on similarities in appearances across classes of objects
Seen from any angle, a horse looks like a horse. But it doesn't look the same from every angle. Scientists at Disney Research have developed a method to help computer vision systems avoid the confusion associated with changes in perspective, such as the marked difference in a horse's appearance from the side and from the front.
Alina Kuznetsova and fellow Disney researchers devised a system that estimates the pose of an object, based in part on similarities in how different types of objects appear from the same angle. The machine learning method proved so effective that the researchers demonstrated, for the first time, that it could predict the pose of an object it had never seen before.
"Sometimes orientation is really important to know," said Leonid Sigal, a senior research scientist at Disney Research. "A self-driving car, for instance, would be better able to negotiate traffic safely if it can anticipate the directions that other cars and buses on the road appear to be headed."
Moreover, the method he and his colleagues developed can not only predict the orientation, or pose, of an object but can also use its knowledge of pose to help identify the object, making it useful for a wide variety of computer vision applications.
The researchers will present their method at the Association for the Advancement of Artificial Intelligence conference, Feb. 12-17 in Phoenix, Arizona.
Figuring out how objects look from different angles comes naturally to people but is a challenging problem in computer vision.
"People draw inferences from other things they have seen," Sigal said. "If they know what a bicycle looks like from various angles, that can help them predict what a motorcycle might look like in different poses, because of the visual similarities among these two objects."
Kuznetsova, a Ph.D. student at Leibniz University Hannover in Germany who worked as an intern with Disney Research, and Sung Ju Hwang, a former post-doctoral researcher at Disney now on the faculty of Ulsan National Institute of Science and Technology in Korea, relied on a similar intuition as they developed their method.
A side view of a horse, for instance, has more in common with the side view of a cow than it does with the front view of a horse. The researchers used these similarities, which are shared across different categories of objects, to develop the metric learning approach at the heart of the predictive method.
When shown a computer mouse for the first time, for instance, the method recognized its vague similarities with the shape of a car, which helped it identify the sides, front and back of the mouse.
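The intuition behind sharing pose information across categories can be illustrated with a toy sketch. This is not the researchers' implementation: the feature vectors, the `LABELED` list and the `predict_pose` function are hypothetical, and a plain Euclidean distance stands in for the learned metric. The vectors are hand-picked so that viewpoint, rather than category, dominates the distance, which is the kind of behavior a metric learning approach would aim to produce.

```python
import math

# Hypothetical toy features: (category, pose, feature vector).
# The numbers are invented for illustration; in a real system they would
# come from image features passed through a learned metric.
LABELED = [
    ("horse", "side",  (1.0, 0.1, 0.2)),
    ("horse", "front", (0.1, 1.0, 0.2)),
    ("cow",   "side",  (0.9, 0.2, 0.4)),
    ("cow",   "front", (0.2, 0.9, 0.4)),
]


def distance(a, b):
    """Euclidean distance, standing in here for a learned metric."""
    return math.dist(a, b)


def predict_pose(query, labeled=LABELED):
    """Return the pose of the nearest labeled example, regardless of category."""
    _, pose, _ = min(labeled, key=lambda item: distance(query, item[2]))
    return pose


# In this toy space, a horse's side view sits closer to a cow's side view
# than to the same horse's front view.
horse_side, horse_front, cow_side = LABELED[0][2], LABELED[1][2], LABELED[2][2]
print(distance(horse_side, cow_side) < distance(horse_side, horse_front))

# A made-up side view of a category that has no labeled examples at all.
unseen_side = (0.95, 0.15, 0.6)
print(predict_pose(unseen_side))
```

Because the lookup ignores the category label, an object from a category with no labeled examples can still borrow a pose label from whichever known view it most resembles.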