Large Language Models for Code
Large language models (LLMs) perform well on a wide range of natural language processing and software engineering tasks. This page catalogs release dates, institutions, access points, and papers for influential LLMs that have been applied to software tasks.
Decoder-only Models
GPT-1
- Release Date: 2018-06
- Institute: OpenAI
- Paper: Improving Language Understanding by Generative Pre-Training
GPT-2
- Release Date: 2019-02
- Institute: OpenAI
- Paper: Language Models are Unsupervised Multitask Learners
GPT-3
- Release Date: 2020-05
- Institute: OpenAI
- Paper: Language Models are Few-Shot Learners
Codex
- Release Date: 2021-08
- Institute: OpenAI
- Paper: Evaluating Large Language Models Trained on Code
GPT-NeoX
- Release Date: 2022-04
- Access: ckpt
- Paper: GPT-NeoX-20B: An Open-Source Autoregressive Language Model
GPT-Neo
- Release Date: 2021-03
- Source: Github
CodeGen
- Release Date: 2022-03
- Paper: CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
InstructGPT
- Release Date: 2022-01
- Paper: Training language models to follow instructions with human feedback
CodeGeeX
- Release Date: 2023
- Paper: CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X
GPT-J
- Release Date: 2021-06
- Access: GPT-J-6B, GPT4All-J
- Paper: GPT-J-6B: 6B JAX-Based Transformer
LLaMA
- Release Date: 2023-02
- Institute: Meta
- Paper: LLaMA: Open and Efficient Foundation Language Models
ChatGPT
StableLM-Alpha
- Release Date: 2023-04
- Access: StableLM-Alpha
- Blog: Stability AI Launches the First of its StableLM Suite of Language Models
InCoder
- Release Date: 2022-04
- Authors: Daniel Fried et al.
- Paper: InCoder: A Generative Model for Code Infilling and Synthesis
GPT-4
- Release Date: 2023-03
- Institute: OpenAI
- Paper: GPT-4 Technical Report
WizardCoder
- Access: [WizardCoder](https://github.com/nlpxucan/WizardLM)
- Release Date: 2023
- Paper: WizardCoder: Empowering Code Large Language Models with Evol-Instruct
PanGu-Coder
- Part of: PanGu-α
- Release Date: 2021-04
- Paper: PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
OPT
- Release Date: 2022-05
- Access: api, ckpt
- Paper: OPT: Open Pre-trained Transformer Language Models
StarCoder
- Release Date: 2023-05
- Access: starcoder
- Papers: StarCoder: A State-of-the-Art LLM for Code, StarCoder: May the source be with you!
SantaCoder
- Release Date: 2023-01
- Access: santacoder
- Paper: SantaCoder: don’t reach for the stars!
PaLM
- Release Date: 2022-04
- Institute: Google
- Paper: PaLM: Scaling Language Modeling with Pathways
Vicuna
- Release Date: 2023-03
- Blog: Link
Flan-UL2
- Release Date: 2023-03
- Institute: Google
- Blog: Flan-UL2 Blog
CPM-Bee
- Release Date: 2022-10
- Institute: Tsinghua University (OpenBMB)
- Paper: CPM: A Large-scale Generative Chinese Pre-trained Language Model
MT-NLG
- Release Date: 2022-01
- Institute: Microsoft, NVIDIA
- Paper: Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
GLM
- Release Date: 2022-10
- Institute: Tsinghua University
- Paper: GLM-130B: An Open Bilingual Pre-trained Model
YaLM
- Release Date: 2022-06
- Institute: Yandex
- Blog: YaLM Blog
Alpaca
- Release Date: 2023-03
- Institute: Stanford University
- Access: Alpaca GitHub
RWKV-4
- Release Date: 2022-09
- Institute: Independent (BlinkDL)
- Access: RWKV-4 GitHub
Sparrow
- Release Date: 2022-09
- Institute: DeepMind
- Paper: Improving alignment of dialogue agents via targeted human judgements
Falcon
- Release Date: 2023-05
- Institute: Technology Innovation Institute (TII)
- Access: Falcon Homepage
Code Llama
- Release Date: 2023
- Institute: Meta
- Paper: Code Llama: Open Foundation Models for Code
RedPajama-INCITE
- Release Date: Not specified
- Blog: RedPajama-INCITE Blog
DeciCoder-1B
- Release Date: 2023-08
- Institute: Deci AI
- Blog: DeciCoder Blog
OpenLLaMA
- Release Date: 2023-05
- Institute: Not specified
- Access: OpenLLaMA Access
CodeGPT
- Release Date: 2021
- Paper: CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Encoder-only Models
BERT
- Release Date: 2018-10
- Institute: Google
- Paper: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
ALBERT
- Release Date: 2019
- Paper: ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
RoBERTa
- Release Date: 2019
- Paper: RoBERTa: A Robustly Optimized BERT Pretraining Approach
CodeBERT
- Release Date: 2020-04
- Institute: Microsoft
- Paper: CodeBERT: A Pre-Trained Model for Programming and Natural Languages
GraphCodeBERT
- Release Date: 2020-09
- Access: GraphCodeBERT
- Paper: GraphCodeBERT: Pre-training Code Representations with Data Flow
Encoder-decoder Models
AlphaCode
- Release Date: 2022-02
- Access: AlphaCode
- Institute: DeepMind
T5
- Release Date: 2019
- Paper: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
- Checkpoint: Link
CodeT5
- Release Date: 2021
- Access: CodeT5
- Paper: CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
CodeT5+
- Release Date: 2023-05
- Access: CodeT5+
- Paper: CodeT5+: Open Code Large Language Models for Code Understanding and Generation
UniXcoder
- Release Date: 2022
- Access: UniXcoder on Hugging Face
- Paper: UniXcoder: Unified Cross-Modal Pre-training for Code Representation
PLBART
- Release Date: 2021
- Paper: Unified Pre-training for Program Understanding and Generation
CodeReviewer
- Release Date: 2022
- Access: CodeReviewer
- Paper: Automating Code Review Activities by Large-Scale Pre-training