UOC R&I Talk
What is your background and expertise?
I studied Mathematics at the University of Barcelona (UB) and then completed a doctoral thesis in computer science at the Universitat Autònoma de Barcelona (UAB), in the field of artificial intelligence (AI). In particular, my thesis was on computer vision, the branch that attempts to simulate the human visual system: we try to build machines that have visual perception and can understand visual information such as images and videos.
You are part of the SUNAI research group: what does your work consist of?
SUNAI stands for Scene Understanding and Artificial Intelligence Lab. What we do is computer vision: the recognition and classification of images. There are four of us with doctorates leading the group, and we work on emotion recognition, among other things. David Masip and I have been studying the recognition of facial expressions, and I am currently working more on scene recognition, that is, on interpreting the scene as a whole. Carles Ventura is also part of the group and works on semantic segmentation, which means accurately detecting the different objects and parts within a scene. Xavier Baró studies the recognition of body posture and gestures. He also works a little on faces and on interpreting images using depth information in addition to the image itself. Ultimately, almost everything we do is related to recognizing what people do, how they feel and how they behave.
You have had stays in prestigious institutions such as MIT in Boston: how do you assess your international experiences?
In terms of research, my thesis was more related to facial recognition and classification, but I gradually became more interested in global scene recognition. At the Massachusetts Institute of Technology (MIT), in Boston, there is a very good scene recognition and interpretation group. In fact, they are pioneers who, within computer vision, defined the problem of general scene recognition and of how to use context to understand specific things in a scene. I had met them at conferences, so I contacted them and managed to arrange a visit, which was initially for three months but lasted three years. It was a very interesting experience that has given me a very different perspective on research from the one I had before I had ever left here.
How important is it to learn from the experience of other institutions in research like yours?
I think it is important in general to work with people abroad. In any field there is someone who is the leading expert, and that person is unlikely to be in your city. If you want to train and you are interested in that expert’s knowledge, the best thing you can do is work with him or her.
What apps currently have facial or spatial recognition?
There are already commercial products that analyse your facial expressions to recognize emotions. Companies are developing software that tries to recognize emotions through microexpressions or more explicit expressions. This has many uses: from marketing companies that analyse your facial expression while you are watching an advertisement to see how the information is making you feel, to medical applications for people with autism or Alzheimer’s, where monitoring facial expression in response to a specific input can be very useful and can help with diagnosis or with monitoring the patient over the course of the disease.
You are one of the winners of the UOC Research Showcase 2017: what does the work you submitted consist of?
The work I submitted to the Research Showcase was an article on emotion recognition. It differs from previous work because it analyses the complete image, both the person and the context, to determine emotional states. As I said, the most advanced technology for recognizing emotions is facial expression analysis. What we posit in this project is that, although it is very good to analyse facial expression, which transmits a great deal of information, the context is fundamental to accurately understanding people’s emotional states. Just as a whole sentence gives an explicit meaning to each word, a person’s context gives meaning to what they express with their face. You cannot just extract the face from a normal situation and hope to understand perfectly how that person feels. We humans do not do this. We are trying to design systems that model the context, understood as everything that surrounds people: what they are doing, their posture, the objects nearby, their environment, whether there are other people around them. What we do is extract, process and use all this information to complement facial expression in order to determine a person’s emotional state more reliably.
Should we be concerned about control of emotions?
In general, any technological advance has both positive and negative potential uses. This is inevitable. I do not think we can stop scientific progress because of potential misuse, but it is our responsibility to make good use of everything we develop. Technologies for recognizing people’s emotional states, for example, have many positive applications, because there are many situations in which machines need to interact with people or help them. I think it is very important that these machines can understand that people feel differently and can approach them according to their emotional state.
Why is it important to tell society about the research you are doing?
It is important for society to understand why research is carried out, what it is about and why it is useful. In the end, the researcher’s objective is to progress, discover and invent new things, and we all hope to provide some kind of service to society. I find it satisfying when people can understand the use of what we are doing and why we invest time in one thing rather than another.
Can you recommend a book on your field?
What I do is connected to the perceptual system: I create machines that interpret the world around us. On perception, there is a book called Blink, by Malcolm Gladwell, about how we perceive and how unconscious processes affect our reactions, judgements and decisions.