FEATURE 18 November 2022

Mirella Lapata in seven

In the latest of a regular Impact series, Mirella Lapata, professor at the School of Informatics at the University of Edinburgh, discusses natural language processing and the explosion of data.

Mirella Lapata is a professor at the School of Informatics at the University of Edinburgh, and affiliated with the Institute for Communication and Collaborative Systems and the Edinburgh Natural Language Processing Group. She is also a fellow of the Royal Society of Edinburgh, the Association for Computational Linguistics, and Academia Europaea.

1. What is the main focus of your current research?

It focuses on natural language processing (NLP), specifically the representation, extraction and generation of semantic information from structured and unstructured data, involving text, images, video, and large-scale knowledge bases. My goal is to get computers to understand, reason with and generate natural language.

2. How do you see your role in pushing forward NLP?

I was recently awarded a UK Research and Innovation Turing AI World-Leading Researcher Fellowship to work on reasoning, a skill that machines still lack and that humans are perfectly capable of. As humans, we have mastered the art of correlating and integrating different types of information from different sources, and of re-using acquired experience and expertise by transferring it to radically different challenges and domains. I am hoping to build new models that can do exactly that.

3. Do you foresee the next few years as being ones of upheaval for the data analytics industry?

As data gets bigger, processing it and storing it will become more challenging. Computers will have to get faster, computing should become cheaper, and our models will have to be better.

4. How do you think NLP will develop in the coming years?

Over the past few years, deep learning has brought a revolution in NLP, producing remarkable results. Tasks such as machine translation or sentiment analysis have made huge leaps forward with respect to earlier state-of-the-art systems. There has also been substantial global commercial activity in the deployment of digital assistants (such as Amazon’s Alexa) and smart home devices (such as Google Nest). In the future, we will see more efforts to analyse very long documents, such as books, to answer questions based on their content, and to create summaries for each chapter, for the book as a whole, or for an entire book series. We have already seen developments that I did not think would have been possible 10 years ago.

5. What do you see as the biggest barrier to getting insight from unstructured data?

Most data on the web, and within individual organisations, is in an unstructured format. The biggest barrier will be coming up with algorithms and tools that can jointly process different types of unstructured data (such as images together with text or video) and draw conclusions from incomplete and noisy data.

6. Do you think the insights industry does a good job of using text analytics to its full potential?

There has been an explosion of data recently, which has made the job of the insights industry more challenging – especially as the data is not just text, but involves multiple modalities, including audio, video and images. In addition, consumer behaviour is evolving faster nowadays than in the past, and changes in behaviour mean that analyses become obsolete quite fast, as the consumers and the data move on to the next trend.

7. What impact has Covid-19 had on your industry?

Covid-19 has had a huge impact on universities and, by extension, on NLP research. It is fair to say that universities have not yet returned to a pre-pandemic normality. Interacting with colleagues and students, attending workshops and seminars, and exchanging ideas have all been hampered by the pandemic. It is not easy to know what different labs are working on and to keep abreast of new developments. The entire community has been compartmentalised, and the collective momentum has slowed down significantly.

This article was first published in the October 2022 issue of Impact.
