TOP 10 COMPETITORS OF DEEPMIND GATO AI TO KNOW IN 2022


SOURCE: ANALYTICSINSIGHT.NET
MAY 28, 2022

DeepMind is known as one of the top AI companies, with a long-standing commitment to solving artificial intelligence. It focuses on advancing machine learning, engineering, simulation, and computing infrastructure across a range of programs. DeepMind has now launched Gato, a new multi-modal artificial intelligence system that can perform over 600 different tasks, and the global tech market has taken notice of this impressive all-in-one machine learning model.

Gato is gaining popularity as a multi-modal, multi-task, multi-embodiment generalist policy that can play Atari games, chat, and much more. It has become a tough competitor to models such as OpenAI's GPT-3 and Meta's OPT. On that note, let's explore the top competitors of DeepMind's Gato AI in 2022.

BERT (Bidirectional Encoder Representations from Transformers)

BERT is a technique for NLP pre-training developed by Google. It utilizes the Transformer, a neural network architecture based on a self-attention mechanism for language understanding. The Transformer was originally developed to address the problem of sequence transduction, or neural machine translation, which makes it well suited to any task that transforms an input sequence into an output sequence, such as speech recognition or text-to-speech conversion.
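For readers who want to try BERT directly, the snippet below is a minimal sketch that uses the Hugging Face transformers library (an assumed setup, not something specified in this article) to obtain contextual embeddings from a pre-trained checkpoint.

```python
# A minimal sketch of using a pre-trained BERT encoder via the Hugging Face
# transformers library (assumed to be installed); model name and usage here
# are illustrative.
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("DeepMind's Gato can play Atari and chat.", return_tensors="pt")
outputs = model(**inputs)

# Contextual token embeddings produced by the self-attention layers.
print(outputs.last_hidden_state.shape)   # (batch, sequence_length, hidden_size)
```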

OPT from Meta AI

Recently, the Meta AI team shared access to its OPT large language model (LLM) with the scientific and academic research community. The team released both the pre-trained models and the code required to train and use them, a first for a language technology system of this magnitude. According to the team, OPT-175B was built with energy efficiency in mind: training a model of this size required only 1/7th the carbon footprint of GPT-3. It is one of the top competitors of DeepMind's Gato.
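Since the weights and code are publicly available, a small OPT checkpoint can be loaded in a few lines. The sketch below is illustrative; the model ID and generation settings are assumptions rather than details from the article.

```python
# A minimal sketch of loading one of the smaller released OPT checkpoints from
# the Hugging Face hub; model ID and settings are assumptions for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

inputs = tokenizer("Large language models are", return_tensors="pt")
generated = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```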

RoBERTa (Robustly Optimized BERT Pretraining Approach)

RoBERTa is an optimized method for pre-training a self-supervised NLP system. It builds on BERT's language masking strategy, in which the system learns to predict intentionally hidden sections of text. RoBERTa modifies key hyperparameters in BERT, such as training with much larger mini-batches and removing BERT's next-sentence pre-training objective. RoBERTa is known to outperform BERT on the individual tasks of the General Language Understanding Evaluation (GLUE) benchmark and can be used for downstream NLP tasks such as question answering, dialogue systems, and document classification.
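The masked-language-modelling objective described above can be seen in action with a fill-mask pipeline, sketched below under the assumption that the Hugging Face transformers library is installed.

```python
# A minimal sketch of RoBERTa's masked-language-modelling objective in action,
# using the Hugging Face fill-mask pipeline (an assumed setup for illustration).
from transformers import pipeline

unmasker = pipeline("fill-mask", model="roberta-base")

# RoBERTa predicts the intentionally hidden token; <mask> is RoBERTa's mask token.
for prediction in unmasker("The capital of France is <mask>."):
    print(prediction["token_str"], round(prediction["score"], 3))
```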

Switch Transformer from Google Brain

Last year, the team of researchers at Google Brain open-sourced the Switch Transformer, a natural-language-processing (NLP) AI model. The model scales up to 1.6 trillion parameters and improves training time up to 7x compared to the T5 NLP model, with comparable accuracy. According to the paper published on arXiv, the Switch Transformer uses a mixture-of-experts (MoE) paradigm that combines several expert feed-forward networks within each Transformer block, routing each token to a single expert. It is one of the top competitors of Gato.
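The toy layer below sketches the routing idea in simplified form: a router picks one expert per token, so only a fraction of the parameters are active for any given token. It is an illustrative PyTorch toy, not Google's implementation.

```python
# A highly simplified sketch of Switch Transformer-style top-1 mixture-of-experts
# routing; sizes and structure are toy assumptions for illustration.
import torch
import torch.nn as nn

class ToySwitchLayer(nn.Module):
    def __init__(self, d_model=16, num_experts=4):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)   # routing scores per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(num_experts)
        )

    def forward(self, tokens):                           # tokens: (n, d_model)
        probs = torch.softmax(self.router(tokens), dim=-1)
        expert_idx = probs.argmax(dim=-1)                # top-1 routing decision
        out = torch.zeros_like(tokens)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                # Scale each expert output by its router probability.
                out[mask] = expert(tokens[mask]) * probs[mask, i].unsqueeze(-1)
        return out

layer = ToySwitchLayer()
print(layer(torch.randn(8, 16)).shape)                   # torch.Size([8, 16])
```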

OpenAI’s GPT-3

GPT-3 is a transformer-based NLP model that performs translation, question answering, poetry composition, and cloze tasks, along with tasks that require on-the-fly reasoning such as unscrambling words. Moreover, with its recent advancements, GPT-3 is used to write news articles and generate code. GPT-3 can manage statistical dependencies between different words. It has 175 billion parameters and is trained on 45 TB of text sourced from all over the internet, making it one of the biggest pre-trained NLP models available.
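A task such as unscrambling a word can be posed to GPT-3 through OpenAI's API. The sketch below uses the openai Python client as it existed around the time of this article (pre-1.0); the engine name, prompt, and settings are illustrative assumptions, and a paid API key is required.

```python
# A minimal sketch of calling GPT-3 via the 2022-era OpenAI Python client;
# engine name, prompt, and settings are illustrative assumptions.
import openai

openai.api_key = "YOUR_API_KEY"

response = openai.Completion.create(
    engine="text-davinci-002",          # a GPT-3 family engine available in 2022
    prompt="Unscramble the word: 'tca' ->",
    max_tokens=5,
    temperature=0,
)
print(response.choices[0].text.strip())
```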

GPT-NeoX from EleutherAI

The buzz in the AI world started when EleutherAI open-sourced its large language model (LLM) GPT-NeoX-20B, which consists of 20 billion parameters. The model was pre-trained with EleutherAI's GPT-NeoX framework on CoreWeave GPU infrastructure, using a cluster of 96 state-of-the-art NVIDIA A100 Tensor Core GPUs for distributed training, and GPT-NeoX-20B performs very well compared to its counterparts that are available for public access.

XLNet

Denoising autoencoding-based language models such as BERT can achieve better performance than conventional autoregressive models for language modeling, but they come with their own limitations. XLNet introduces a generalized autoregressive pre-training method that offers two benefits: it enables learning bidirectional context by considering all permutations of the factorization order, and it overcomes the limitations of BERT thanks to its autoregressive formulation.
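The toy sketch below illustrates the permutation idea conceptually (it is not XLNet's actual code): for a sampled factorization order, each token is predicted from the tokens that come before it in that order, so across many orders every token ends up conditioning on context from both sides.

```python
# A conceptual sketch of permutation language modeling: sample a factorization
# order and predict each token from the tokens already "seen" in that order.
import random

tokens = ["New", "York", "is", "a", "city"]     # toy sequence
order = list(range(len(tokens)))
random.shuffle(order)                            # sample a factorization order

for step, pos in enumerate(order):
    seen_positions = order[:step]                # positions visible at this step
    context = [tokens[p] for p in sorted(seen_positions)]
    print(f"predict {tokens[pos]!r} at position {pos} given context {context}")
```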

ALBERT (A Lite BERT for Self-supervised Learning of Language Representations)

Increasing the size of pre-trained language models generally improves performance on downstream tasks. However, larger models also bring longer training times and GPU/TPU memory limitations. To address this problem, Google presented ALBERT, a lite version of BERT (Bidirectional Encoder Representations from Transformers) that uses parameter-reduction techniques to lower memory consumption and speed up training.

T5 (Text-to-Text Transfer Transformer)

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodologies, and practices. With T5, the Google research team proposes a unified approach to transfer learning in NLP, reframing every task as a text-to-text problem, with the goal of setting a new state of the art in the field.
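The text-to-text framing means every task is expressed as feeding in text with a task prefix and generating text back, as in the sketch below (which assumes the Hugging Face transformers library and the public "t5-small" checkpoint).

```python
# A minimal sketch of T5's text-to-text framing: translation is just a prefixed
# input string and a generated output string. Setup is assumed for illustration.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: The house is wonderful.",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```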

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

The authors from Microsoft Research propose DeBERTa with two main improvements over BERT, namely disentangled attention and an enhanced mask decoder. In DeBERTa, each token/word is represented by two vectors that encode its content and its relative position, respectively. The self-attention mechanism in DeBERTa computes content-to-content, content-to-position, and also position-to-content attention, whereas self-attention in BERT is equivalent to having only the first two components.
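The simplified numpy sketch below illustrates how a single disentangled attention score is assembled from those three terms. The vectors and the relative-position lookup are toy assumptions for illustration, not DeBERTa's actual implementation.

```python
# A simplified sketch of DeBERTa-style disentangled attention for one token pair:
# the score sums content-to-content, content-to-position, and position-to-content
# terms, scaled by 1/sqrt(3d). All vectors here are random toy stand-ins.
import numpy as np

d = 8                                    # toy hidden size
rng = np.random.default_rng(0)

q_c_i = rng.normal(size=d)               # content query of token i
k_c_j = rng.normal(size=d)               # content key of token j
k_r = rng.normal(size=d)                 # key embedding of relative position (i - j)
q_r = rng.normal(size=d)                 # query embedding of relative position (j - i)

content_to_content = q_c_i @ k_c_j
content_to_position = q_c_i @ k_r
position_to_content = k_c_j @ q_r

score = (content_to_content + content_to_position + position_to_content) / np.sqrt(3 * d)
print(score)                             # unnormalised attention score between i and j
```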
