Papers across LLMs

GPT-3.5
GPT-4
ChatGPT
Codex

GPT-3.5

Toward Less Hidden Cost of Code Completion with Acceptance and Ranking Models, 2021, link
A Lightweight Framework for High-Quality Code Generation, 2023
A New Era in Software Security: Towards Self-Healing Software via Large Language Models and Formal Verification, 2023, link
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets, 2023
Addressing Compiler Errors: Stack Overflow or Large Language Models?, 2023
ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation, 2023
Communicative Agents for Software Development, 2023
Deductive Verification of Chain-of-Thought Reasoning, 2023
Demystifying GPT Self-Repair for Code Generation, 2023
Enhancing Automated Program Repair through Fine-tuning and Prompt Engineering, 2023
Exploring the Effectiveness of Large Language Models in Generating Unit Tests, 2023, link
Improving Few-shot Prompts with Relevant Static Analysis Products, 2023, link
Natural Language Commanding via Program Synthesis, 2023, link
Predicting Code Coverage without Execution, 2023
Prompt Sapper: A LLM-Empowered Production Tool for Building AI Chains, 2023
Semantic Compression with Large Language Models, 2023, link
The FormAI Dataset: Generative AI in Software Security Through the Lens of Formal Verification, 2023
ToolCoder: Teach Code Generation Models to Use API Search Tools, 2023, link
Towards Generating Functionally Correct Code Edits from Natural Language Issue Descriptions, 2023, link
Towards Understanding the Capability of Large Language Models on Code Clone Detection: A Survey, 2023
Understanding Large Language Model Based Fuzz Driver Generation, 2023
VeriGen: A Large Language Model for Verilog Code Generation, 2023
WizardCoder: Empowering Code Large Language Models with Evol-Instruct, 2023

GPT-4

Addressing Compiler Errors: Stack Overflow or Large Language Models?, 2023
Automatic Model Selection with Large Language Models for Reasoning, 2023, link
ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation, 2023
Code Generation Tools (Almost) for Free? A Study of Few-shot, Pre-trained Language Models on Code, 2022, link
Demystifying GPT Self-Repair for Code Generation, 2023
Evaluating ChatGPT and GPT-4 for Visual Programming, 2023
Is GPT-4 a Good Data Analyst?, 2023, link
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation, 2023, link
Predicting Code Coverage without Execution, 2023
SelfEvolve: A Code Evolution Framework via Large Language Models, 2023, link
Semantic Compression with Large Language Models, 2023, link
Towards Understanding the Capability of Large Language Models on Code Clone Detection: A Survey, 2023
Understanding Large Language Model Based Fuzz Driver Generation, 2023
Understanding the Effectiveness of Large Language Models in Code Translation, 2023
VeriGen: A Large Language Model for Verilog Code Generation, 2023
WizardCoder: Empowering Code Large Language Models with Evol-Instruct, 2023

ChatGPT

A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT, 2023, link
A Study on Prompt Design, Advantages and Limitations of ChatGPT for Deep Learning Program Repair, 2023, link
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets, 2023
ALGO: Synthesizing Algorithmic Programs with Generated Oracle Verifiers, 2023, link
An Analysis of the Automatic Bug Fixing Performance of ChatGPT, 2023, link
Analysis of ChatGPT on Source Code, 2023
Automatic Model Selection with Large Language Models for Reasoning, 2023, link
ChatGPT Prompt Patterns for Improving Code Quality, Refactoring, Requirements Elicitation, and Software Design, 2023, link
ChatGPT: A Study on Its Utility for Ubiquitous Software Engineering Tasks, 2023, link
ChatUniTest: A ChatGPT-Based Automated Unit Test Generation Tool, 2023, link
Comparing Software Developers with ChatGPT: An Empirical Investigation, 2023, link
Deductive Verification of Chain-of-Thought Reasoning, 2023
Detecting Phishing Sites Using ChatGPT, 2023
Enabling Programming Thinking in Large Language Models Toward Code Generation, 2023, link
Evaluating AIGC Detectors on Code Content, 2023, link
Evaluating ChatGPT and GPT-4 for Visual Programming, 2023
Evaluating the Code Quality of AI-Assisted Code Generation Tools: An Empirical Study on GitHub Copilot, Amazon CodeWhisperer, and ChatGPT, 2023, link
Explainable Automated Debugging via Large Language Model-Driven Scientific Debugging, 2023, link
Exploring the Effectiveness of LLMs in Automated Logging Generation: An Empirical Study, 2023
Exploring the Robustness of Large Language Models for Solving Programming Problems, 2023
Extending the Frontier of ChatGPT: Code Generation and Debugging, 2023
Finding Failure-Inducing Test Cases with ChatGPT, 2023, link
Improving ChatGPT Prompt for Code Generation, 2023, link
Is ChatGPT the Ultimate Programming Assistant - How Far Is It?, 2023, link
Is This Snippet Written by ChatGPT? An Empirical Study with a CodeBERT-Based Classifier, 2023
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation, 2023, link
Keep the Conversation Going: Fixing 162 Out of 337 Bugs for $0.42 Each Using ChatGPT, 2023, link
LLM Is Like a Box of Chocolates: The Non-determinism of ChatGPT in Code Generation, 2023
LMPA: Improving Decompilation by Synergy of Large Language Model and Program Analysis, 2023, link
No More Manual Tests? Evaluating and Improving ChatGPT for Unit Test Generation, 2023, link
Refining ChatGPT-Generated Code: Characterizing and Mitigating Code Quality Issues, 2023
Self-Collaboration Code Generation via ChatGPT, 2023, link
SelfEvolve: A Code Evolution Framework via Large Language Models, 2023, link
Stack Over-Flowing with Results: The Case for Domain-Specific Pre-Training Over One-Size-Fits-All Models, 2023
The Scope of ChatGPT in Software Engineering: A Thorough Investigation, 2023, link
The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation, 2023, link
Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation, 2023, link
ToolCoder: Teach Code Generation Models to Use API Search Tools, 2023, link

Codex

ALGO: Synthesizing Algorithmic Programs with Generated Oracle Verifiers, 2023, link
Analysis of ChatGPT on Source Code, 2023
Automatic Model Selection with Large Language Models for Reasoning, 2023, link
Enabling Programming Thinking in Large Language Models Toward Code Generation, 2023, link
Explainable Automated Debugging via Large Language Model-Driven Scientific Debugging, 2023, link
Exploring the Robustness of Large Language Models for Solving Programming Problems, 2023
SelfEvolve: A Code Evolution Framework via Large Language Models, 2023, link
Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation, 2023, link
