Meta Platforms (META.O), the parent company of Facebook, on Tuesday unveiled an AI model capable of translating and transcribing speech in dozens of languages. The model could serve as a building block for tools that enable real-time communication across language divides.
In a blog post, the company said its SeamlessM4T model supports translation between text and speech in nearly 100 languages, as well as full speech-to-speech translation for 35 languages, combining capabilities that had previously been available only in separate models.
Meta Chief Executive Mark Zuckerberg has said he envisions such tools facilitating connections between users around the world in the metaverse, the set of interconnected virtual worlds on which he is betting the company's future.
Meta is making the model available to the public for non-commercial use, the blog post said.
The world's largest social media company has released a flurry of mostly free AI models this year, including a large language model called Llama that poses a serious challenge to proprietary models sold by Microsoft-backed (MSFT.O) OpenAI and Google (GOOGL.O).
An open AI ecosystem works to Meta's advantage, Zuckerberg says, since the company gains more by effectively crowd-sourcing the creation of consumer-facing tools for its social platforms than it would by charging for access to the models.
Still, Meta faces the same legal questions as the rest of the industry over the training data used to build its models.
In July, comedian Sarah Silverman and two other authors filed copyright infringement lawsuits against both Meta and OpenAI, accusing the companies of using their books as training data without permission.
In a research paper, Meta researchers said they gathered audio training data for the SeamlessM4T model from 4 million hours of "raw audio originating from a publicly available repository of crawled web data," without disclosing which repository.
A Meta spokeswoman did not respond to questions about the provenance of the audio data.
The text data came from datasets created last year that pulled content from Wikipedia and other websites, according to the research paper.