Stopping by Lijun Yin’s Binghamton University lab is like taking a field trip to the future. And there’s always something new to see.
One day, Yin describes how facial-recognition software could be used to treat children with autism. Another time, he builds a digital 3D likeness of himself from just two photographs. During a third visit, a graduate student advances the slides in a PowerPoint presentation using only his eyes. When children come by, Yin shows them how to create a brief animated clip using his software just by speaking into a microphone.
Some researchers focus on one topic, probing deeper and deeper over decades until they have exhaustive knowledge of its challenges and solutions. Yin, on the other hand, constantly finds new applications for what he knows. His ideas may one day advance fields as diverse as education, healthcare, entertainment and homeland security.
“Our research is motivated,” he says, “by a desire to improve computers to provide something good for our society. I try to use my sophisticated technology to make computers easier to use by the nontechnical person.”
Yin, a computer scientist who also studied electrical engineering, says his fundamentally interdisciplinary work relies on psychology and mathematics as well. He speaks about the possibilities for technology to improve robotics and plastic surgery as though they’re intrinsically related. That’s because, to his mind at least, they are.
Yin wants to enable computers to understand inputs from humans that go beyond the traditional keyboard and mouse.
“Our research in computer graphics and computer vision tries to make using computers easier,” he says. “Can we find a more comfortable, intuitive and intelligent way to use the computer? It should feel like you’re talking to a friend. This could also help disabled people use computers the way everyone else does.”
Yin’s team has developed ways to provide information to the computer based on where a user is looking as well as through gestures or speech. One of the basic challenges in this area is “computer vision.” That is, how can a simple webcam work more like the human eye? Can camera-captured data be used to recognize a real-world object? Can it be used to “see” the user and “understand” what the user wants to do?
To some extent, that’s already possible. Witness one of Yin’s graduate students giving a PowerPoint presentation and using only his eyes to highlight content on various slides. When Yin demonstrated this technology for Air Force experts last year, the only hardware he brought was a webcam attached to a laptop computer.
Yin says the next step would be enabling the computer to recognize a user’s emotional state. He works with a well-established set of six basic emotions — anger, disgust, fear, joy, sadness and surprise — and is experimenting with different ways the computer can distinguish among them. Is there enough data in the way the lines around the eyes change? Could focusing on the user’s mouth provide sufficient clues? What happens if the user’s face is only partially visible, perhaps turned to one side?
“Computers only understand zeroes and ones,” Yin says. “Everything is about patterns. We want to find out how to recognize each emotion using only the most important features.”
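The pattern-matching idea Yin describes can be sketched in miniature. Suppose each face has already been reduced to a handful of numeric features — say, a measure of change around the eyes and the curvature of the mouth, the kinds of cues Yin mentions. A toy classifier can then label a new face with whichever stored emotion pattern lies nearest in that feature space. This is purely an illustrative sketch with made-up numbers, not Yin’s actual algorithm:

```python
import math

# Hypothetical reference patterns: each face reduced to two illustrative
# features (eye-region change, mouth curvature), labeled with one of the
# six basic emotions Yin works with. All values here are invented.
PATTERNS = [
    ((0.9, 0.8), "joy"),
    ((0.7, -0.6), "anger"),
    ((0.2, -0.9), "sadness"),
    ((0.8, 0.1), "surprise"),
    ((0.5, -0.4), "fear"),
    ((0.3, -0.7), "disgust"),
]

def classify(features):
    """Return the emotion whose stored pattern is nearest in feature space."""
    return min(PATTERNS, key=lambda p: math.dist(features, p[0]))[1]
```

Real systems use far richer features and far more sophisticated statistical models, but the core question is the same one Yin poses: which few features carry enough information to separate the six patterns reliably?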
He’s partnering with Binghamton University psychologist Peter Gerhardstein to explore ways this work could benefit children with autism. Many people with autism have difficulty interpreting others’ emotions; therapists sometimes use photographs of people to teach children how to understand when someone is happy or sad and so forth. Yin could produce not just photographs, but three-dimensional avatars that are able to display a range of emotions. Given the right pictures, Yin could even produce avatars of a child’s family members for use in this type of therapy.
Yin and Gerhardstein’s previous collaboration led to the creation of a 3D facial expression database, which includes 100 subjects with 2,500 facial expression models. The database is available at no cost to the nonprofit research community and has become a worldwide testbed for those working on related projects in fields such as biomedicine, law enforcement and computer science.
Once Yin became more interested in human-computer interaction, he naturally grew more excited about the possibilities for artificial intelligence.
“We want not only to create a virtual-person model, we want to understand a real person’s emotions and feelings,” Yin says. “We want the computer to be able to understand how you feel, too. That’s hard, even harder than my other work.”
Imagine if a computer could understand when people are in pain. Some people’s gestures and facial expressions may change. Some may ask a doctor for help. But others — young children, for instance — cannot express themselves or are unable to speak for some reason. Yin wants to develop an algorithm that would enable a computer to determine when someone is in pain based only on a photograph.
Yin describes that healthcare application and, almost in the next breath, points out that the same system that could identify pain might also be used to figure out when someone is lying. Perhaps a computer could offer insights like the ones provided by Tim Roth’s character, Dr. Cal Lightman, on the television show Lie to Me. The fictional character is a psychologist with an expertise in tracking deception who often partners with law-enforcement agencies.
“This technology,” Yin says, “could help us to train the computer to do facial-recognition analysis in place of experts.”
A pragmatic approach
Yin may dream big, but he’s also mindful of the limitations imposed by the real world when it comes to his ideas.
His pragmatic approach to the challenges of computing in everyday life dates back to graduate school, when he helped develop the MPEG-4 standards now used in digital video. The goal there was to save bandwidth by compressing raw data as much as possible without losing information or quality.
These days, he hopes to make it easier to identify suspects passing through security checkpoints at airports. But he knows that for such a security algorithm to be useful, a low-resolution camera must be able to do advanced detection work. There’s no way to bring his laboratory’s elaborate six-camera setup into every airport. And there’s no way to have each passenger pose at exactly the right distance from the camera to be identified.
Yin’s goal is to create a facial-recognition algorithm that would be able to pick a person out of a crowd, given only front and side photographs of the individual. And it has to work even when that person passes a camera at another angle.
“Our current work,” he says, “uses a regular camera system to do this challenging job.”
About Lijun Yin
Lijun Yin, associate professor of computer science and director of the Graphics and Image Computing Laboratory, joined the Binghamton University faculty in 2001. He earned a doctorate from the University of Alberta in 2000, after receiving undergraduate and master’s degrees from schools in China. His research has been sponsored by the National Science Foundation, the Air Force Research Laboratory and the New York State Office of Science, Technology and Academic Research.