GPT-J-6B: 6B JAX-Based Transformer

Open-source 6B-parameter JAX Transformer rivaling GPT-3 Curie.

About GPT-J-6B: 6B JAX-Based Transformer

GPT-J-6B is an open-source, 6-billion-parameter language model built on the Mesh Transformer JAX codebase. Aimed at researchers, developers, and enthusiasts, it performs comparably to GPT-3 Curie (6.7B) on a range of downstream tasks. The codebase demonstrates scalable model parallelism using JAX's xmap, and the model can be tried through a Colab notebook or a web demo.
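
To make the idea of model parallelism concrete, here is a minimal sketch in JAX. It is not the GPT-J code and does not use xmap. It only illustrates the underlying idea, under the assumption that a dense layer's weight matrix is split column-wise so that each device holds one slice. The function name `sharded_dense` and the toy shapes are hypothetical, and the "devices" are simulated with a Python list on a single host.

```python
import jax
import jax.numpy as jnp


def sharded_dense(x, w, n_shards):
    """Compute x @ w by splitting w column-wise into n_shards slices.

    Hypothetical helper for illustration only: each slice stands in for
    the part of the weight matrix that one device would hold.
    """
    w_shards = jnp.split(w, n_shards, axis=1)
    # Every simulated device multiplies the full input by its own slice.
    partial_outputs = [x @ w_shard for w_shard in w_shards]
    # Concatenation plays the role of gathering results across devices.
    return jnp.concatenate(partial_outputs, axis=-1)


key_x, key_w = jax.random.split(jax.random.PRNGKey(0))
x = jax.random.normal(key_x, (2, 8))   # toy batch of 2, hidden size 8
w = jax.random.normal(key_w, (8, 16))  # toy dense-layer weights

y_sharded = sharded_dense(x, w, n_shards=4)
y_reference = x @ w  # unsharded computation for comparison
matches = bool(jnp.allclose(y_sharded, y_reference, atol=1e-5))
```

In a real multi-device setup, each slice would live on a separate accelerator and the final concatenation would be a collective operation. The sketch only checks that the sharded computation agrees with the unsharded one.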

Pricing Plans
Open Source
$0
