Meta releases largest open source AI language model to date
Update May 23, 2022:
Meta is now releasing the larger OPT-66B language model, following the OPT models with 125 million to 30 billion parameters (see news below). According to Meta, it is the largest open-source AI language model to date that is accessible without restrictions.
You can download the 66 billion parameter model and all smaller variants from the OPT GitHub page. As before, the largest model with 175 billion parameters is accessible only upon request.
Original article from May 4, 2022:
Meta’s AI department publishes an extensively trained language model to drive research, particularly on the vulnerabilities of language AI.
OpenAI’s GPT-3 language model was a milestone in the development of natural language AI systems. For the first time, an AI model was able to formulate short texts so believably and coherently that humans could no longer recognize them as AI-generated. The model subsequently proved flexible enough for many other applications, such as code generation, with only minor fine-tuning.
In GPT-3, OpenAI saw a business model for the first time and therefore offers the language model via an API. The model itself is not available to researchers, although such access would benefit scientific progress.
Meta is now stepping into this gap, offering researchers and selected partners a large AI language model to study free of charge.
Meta OPT: Large-scale AI language model for research
The “Open Pre-trained Transformer” (OPT) is comparable in performance and size to OpenAI’s GPT-3, according to Meta’s researchers. Like GPT-3, the largest version of OPT has 175 billion parameters. However, according to Meta, training was much more efficient: it is said to have caused only one-seventh of GPT-3’s CO₂ footprint.
Meta plans to make the largest OPT model with 175 billion parameters available on request to selected partners from research, industry, governments and civil society organizations. This will enable them to analyze, for example, the weights and connections of the neural network during language processing.
Meta makes the smaller OPT models with 125 million to 30 billion parameters freely available. Together with the models, Meta also publishes the training code and documentation of the training process. Under the license agreement, the models may not be used commercially.
Bridging the gap between industry and research
Meta hopes that the scientific community will find “clear guidelines” for dealing with the problems of large AI language models, in particular bias and toxicity in AI-generated language. Past research has shown, among other things, that GPT-3 not only reinforces existing biases in its training data but also generates new ones.
“Without direct access to these models, researchers are also limited in their ability to design detection and mitigation strategies for possible harm, which leaves detection and mitigation in the hands of only those with sufficient capital to access models of this scale,” Meta’s AI Team writes.
The issues described above delay or hinder the widespread use of AI language models in practice. Meta, on the other hand, is keen to deploy suitable models on its platforms as quickly and safely as possible, for example for social media moderation or voice assistance in VR and AR and on its smart home devices.
Releasing OPT therefore makes sense for Meta as a way to accelerate scientific progress. In addition, the release casts the company in a positive light among the scarce next generation of AI scientists.
“We believe the entire AI community — academic researchers, civil society, policymakers, and industry — must work together to develop clear guidelines around responsible AI in general and responsible LLMs in particular, given their centrality in many downstream language applications,” Meta researchers write.
A much broader segment of the AI community needs access to large AI language models, they say, to “conduct reproducible research and collectively drive the field forward.”