Expanding language technology to support global needs
Explore how LivePerson’s multilingual NLP makes artificial intelligence more useful across the world
LivePerson is on a mission to meet the needs of more people globally, which is why we are continuously working to improve our Conversational AI technology as it becomes a bigger part of how consumers interact with brands.
A significant amount of state-of-the-art human language technology has been developed specifically for English. At LivePerson, we value computational linguistics and developing language technologies that enable people to use their own natural language and dialect when interacting with brands. Over the past two years, we have extended LivePerson’s suite of natural language processing (NLP) and natural language understanding (NLU) models to support customer service chatbots in 10 human languages and their dialects spoken across the Americas, Europe, Asia, and the Pacific Islands. Dutch, English, French, German, Indonesian, Italian, Japanese, Portuguese, Spanish, and Tagalog are now part of our offerings.
Multilingual natural language processing is about much more than translating training data into new languages and retraining the models. In fact, we ran multiple experiments that demonstrated the importance of curating data and adapting approaches when extending a piece of technology to a new language. Throughout this work, we found that language expansion is about supporting new use cases for new cultures and settings that use different languages. We discovered unique challenges while developing technology for these languages, which demonstrated the importance of working with naturally occurring language data and consulting with language experts throughout the process.
New use cases
In most cases, our English models were initially developed with the United States, Great Britain, and Australia in mind. Of course, industries vary internationally, so a part of extending our English insurance NLU models to Japanese was understanding the differences between standard coverages in the US, UK, Australia, and Japan.
We began by translating our English training data to Japanese and then evaluated the resulting model on data from consumers and brands in Japan. In this process, we learned that while bicycles are generally not insured at all in the US, insurance is required for bicycles in Japan. Because of this finding, we generated new training data to cover this use case, as it could not be directly transferred from translated English data.
New deployment settings for our expanded language technology
Deploying NLP models for new languages often means they will be used in new countries. Models that protect customer privacy must be adapted for these countries. While the language plays an important role in training a named entity recognition (NER) model, it is also important to ensure the model is trained on localized addresses and other identifiers.
As an example, extending our personally identifiable information (PII) redaction model for Portuguese meant taking into account two new address formats: one for Portugal and another for Brazil. In some cases, we ensured our model covered all necessary settings by training on data from the appropriate countries and languages, while in other cases, we used regular expressions to hard-code certain formats.
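To make the regular-expression approach concrete, here is a minimal sketch that masks postal codes, one small part of an address. It is an illustration, not LivePerson's redaction model: the function name, the mask token, and the choice to handle only postal codes are assumptions. The formats themselves are the standard ones (NNNN-NNN in Portugal, NNNNN-NNN for a Brazilian CEP).

```python
import re

# Illustrative postal-code patterns only. A production redaction model
# would cover many more identifiers (street names, phone numbers, etc.).
POSTAL_CODE_PATTERNS = {
    "PT": re.compile(r"\b\d{4}-\d{3}\b"),   # Portugal: NNNN-NNN
    "BR": re.compile(r"\b\d{5}-?\d{3}\b"),  # Brazil (CEP): NNNNN-NNN, hyphen optional
}


def redact_postal_codes(text: str, country: str, mask: str = "[POSTAL_CODE]") -> str:
    """Replace postal codes in the given country's format with a mask token.

    `country` is a hypothetical key ("PT" or "BR") chosen for this sketch.
    """
    return POSTAL_CODE_PATTERNS[country].sub(mask, text)


print(redact_postal_codes("Moro na Rua Augusta 100, 1100-053 Lisboa", "PT"))
print(redact_postal_codes("Av. Paulista 1000, São Paulo, CEP 01310-100", "BR"))
```

Note that the Portuguese pattern does not match a Brazilian CEP, because the word boundary cannot fall between two digits. This is one small reason a pattern written for one country cannot simply be reused for another country that shares the language.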
Differences in the availability of models and data
Sometimes we leverage external resources like data and NLP models to develop our products, but these become increasingly difficult to find as we venture further from the very high-resourced languages that are commonly studied in NLP research. In the case of Indonesian, we were unable to find a spaCy model (spaCy is an open-source library whose pretrained models perform natural language processing tasks like NER, part-of-speech (POS) tagging, and dependency parsing), so we trained our own.
Because linguistic structure varies from language to language and there isn’t a one-to-one mapping between words in different languages, it’s not sufficient to simply translate linguistically annotated data into new languages and map the annotations. Instead, data must be annotated by experts in the language with knowledge of linguistic structure. Acquiring this data was no easy task.
The importance of naturally occurring language data
One of the biggest challenges was finding conversational data for all of these languages to evaluate our NLU models. Still, evaluating with data that matched our customer use cases was a top priority. Without conversational examples in a particular language, it’s not uncommon for NLP practitioners to train and evaluate using data that was translated into the language from English. This can be problematic because the evaluation setting will not reveal problems like those described above that are likely to be encountered in the deployment setting. We found that evaluating on naturally occurring data was critical in understanding when we could leverage machine translation for training vs. when we needed to modify our algorithms or find better training data for model development.
Similarly, at frequent stages throughout model development, we consulted with language experts, linguists, and native speakers of each language to understand why our models behaved the way they did. This was critical to catching errors and raising performance.
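The comparison between translated and naturally occurring evaluation data described in this section can be sketched as follows. This is a hypothetical illustration, not LivePerson's evaluation tooling: the function names, the intent labels, the `min_drop` threshold, and the toy keyword classifier (with English stand-in utterances) are all assumptions. It computes per-intent accuracy on both evaluation sets and flags intents that are missing from the translated set or that score noticeably worse on natural data.

```python
from collections import defaultdict


def per_intent_accuracy(examples, predict):
    """Accuracy per gold intent. `examples` holds (utterance, gold_intent) pairs."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for utterance, gold in examples:
        total[gold] += 1
        if predict(utterance) == gold:
            correct[gold] += 1
    return {intent: correct[intent] / total[intent] for intent in total}


def coverage_gaps(translated_eval, natural_eval, predict, min_drop=0.2):
    """Map each flagged intent to (translated_accuracy_or_None, natural_accuracy)."""
    translated = per_intent_accuracy(translated_eval, predict)
    natural = per_intent_accuracy(natural_eval, predict)
    gaps = {}
    for intent, natural_acc in natural.items():
        translated_acc = translated.get(intent)
        # Flag intents that only appear in natural data, or that drop sharply.
        if translated_acc is None or translated_acc - natural_acc >= min_drop:
            gaps[intent] = (translated_acc, natural_acc)
    return gaps


# Toy stand-in for a trained intent classifier (assumption for this sketch).
def toy_predict(utterance):
    return "file_claim" if "claim" in utterance else "get_quote"


translated_eval = [("I want to file a claim", "file_claim"),
                   ("give me a quote", "get_quote")]
natural_eval = [("I want to file a claim", "file_claim"),
                ("insure my bicycle", "bicycle_insurance")]

print(coverage_gaps(translated_eval, natural_eval, toy_predict))
```

In this toy setup, the translated set alone would report perfect accuracy, while the natural set surfaces an intent the translated data never contained, much like the bicycle insurance case described earlier.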
Setting the bar for global language technology
Because English has been at the center of NLP for a long time, NLP model performance on English data is very good, and our expectations for these models are correspondingly high. As we expand our research and model development to other languages, though, performance generally isn’t as high. This presents us with a difficult decision: do we lower the bar of acceptable performance for these languages, or do we hold off on sharing our models until performance is on par with English models?
The impressive performance of English models has taken a lot of time, both to develop models and to generate data and identify shortcomings through use, and the only way to reach competitive performance for new languages is to use models developed for those languages. At LivePerson, we maintain a high bar for our products and have invested time and resources into making our multilingual models as good as possible with the data available. Now that they are being used by our brands, we are collecting feedback and data to iterate on these models and make them even better over time.
Throughout 2021 and 2022, LivePerson’s data science and engineering teams have released models for encoding, PII masking, language detection, intent classification, intentfulness detection, and topic detection for nine new languages. During this time, we have experimented with a number of approaches and published some of that research.
With new models in use, we are actively looking for opportunities to improve them and raise their performance. At the same time, we know that nine languages is a drop in the bucket of the roughly 7,000 languages spoken throughout the world. We look forward to extending our language technology support over the coming years.
This article on LivePerson’s human language technology expansion wouldn’t have been possible without the following contributors: George Bonev, Xi Chen, Xin Chen, Daniel Gilliam, Durga Gudipati, Andrew LaBonte, Gitit Kehat, Leanne Rolston, Jadin Tredup, Jian Wang, and Dominic Widdows.