Meta has taken another step towards creating a universal language translator.
The company has open-sourced an AI model that translates more than 200 languages, many of which are not supported by existing systems.
The research is part of a Meta initiative launched earlier this year.
“We call this project No Language Left Behind, and the AI modeling techniques we’ve used from NLLB help us create high-quality translations on Facebook and Instagram for languages spoken by billions of people around the world,” Meta CEO Mark Zuckerberg said in a Facebook post.
NLLB focuses on lower-resource languages, such as Maori or Maltese. Most people in the world speak these languages, but they lack the training data that AI translation systems typically need.
Meta’s new model is designed to overcome this challenge.
To do this, the researchers first interviewed speakers of underserved languages to understand their needs. They then developed a new data mining technique that generates training sentences for low-resource languages.
They then trained their model on a mix of the mined data and human-translated data.
The result is NLLB-200 — a massive multilingual translation system for 202 languages.
The team assessed the model’s performance against the FLORES-101 dataset, which evaluates translations of low-resource languages.
“Despite doubling the number of languages, our final model performs 40% better than the prior art on Flores-101,” the study authors wrote.
The techniques have already improved machine translation on Facebook, Instagram and Wikipedia. Meta has also open-sourced all its benchmarks, data scripts and models to support further research.
Of course, the research could also benefit Meta.
Open source for everyone
Zuckerberg’s relentless drive to grow has recently run up against hurdles. In February, Facebook lost daily users for the first time in its 18-year history.
If Meta can improve the quality of its translations, it can make its apps attractive to a wider user base.
Inevitably, the company expects the research to play a big part in the metaverse, where there are growing concerns about inclusion. But it could also benefit the company’s existing apps.
Translation errors have long caused problems for Meta. In 2017, the Israeli Police arrested a Palestinian after Facebook translated a “good morning” message as “attack them.”
The company also struggles to keep an eye on disinformation and hate speech in languages with fewer resources.
The new research could reduce these risks and improve user experiences. To Meta’s credit, the company has also given rivals a chance to take advantage of the work. Open sourcing the models will hopefully also support speakers of languages that are underserved or under threat.
In this case, what benefits Meta could also benefit humanity. It also brings the fantasy of a universal translator closer to reality.