Top Researchers in LM4SE

This page tracks the top researchers in language models for software engineering (LM4SE) over the past decade, based on their publication records at top SE conferences. We analyzed all papers from ICSE, ASE, FSE, and ISSTA since 2012 to compile a list of prolific authors advancing the state of the art in LM4SE.
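
The counting step described above can be sketched roughly as follows. This is a minimal illustration, not the script used to build this page: the record layout, the `rank_authors` helper, and the alphabetical tie-breaking rule are all assumptions, and the sample records carry only the author names this page lists for each title rather than full author lists.

```python
from collections import Counter

VENUES = {"ICSE", "ASE", "FSE", "ISSTA"}  # venues named in the text
START_YEAR = 2012                          # cutoff named in the text

# Illustrative records: (title, venue, year, authors). The authors shown
# are only the names this page lists for each title, not full author lists.
PAPERS = [
    ("Deep API learning", "FSE", 2016, ["Hongyu Zhang", "Xiaodong Gu"]),
    ("Diet code is healthy: simplifying programs for pre-trained models of code",
     "FSE", 2022, ["Hongyu Zhang", "Xiaodong Gu", "Beijun Shen"]),
    ("Cross-Domain Deep Code Search with Meta Learning",
     "ICSE", 2022, ["Hongyu Zhang", "Xiaodong Gu", "Beijun Shen"]),
    ("Log Parsing: How Far Can ChatGPT Go?", "ASE", 2023, ["Hongyu Zhang"]),
]


def rank_authors(papers, venues=VENUES, start_year=START_YEAR):
    """Return (author, paper_count) pairs, most prolific first (hypothetical helper)."""
    counts = Counter()
    for _title, venue, year, authors in papers:
        # Keep only papers from the selected venues and time window.
        if venue in venues and year >= start_year:
            counts.update(set(authors))  # count each author once per paper
    # Sort by descending count, breaking ties alphabetically (an assumption).
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


for author, n in rank_authors(PAPERS):
    print(f"{author}: {n}")
```

A real run would load every in-scope paper from the four venues. It would also need to reconcile name variants of the same author before counting, which this sketch does not attempt.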

Researchers

Hongyu Zhang

Log-based Anomaly Detection Without Log Parsing, ASE 2021
Log Parsing: How Far Can ChatGPT Go?, ASE 2023
Deep API learning, FSE 2016
Predicting Node failure in cloud service systems, FSE 2018
Robust log-based anomaly detection on unstable log data, FSE 2019
No more fine-tuning? an experimental evaluation of prompt tuning in code intelligence, FSE 2022
Diet code is healthy: simplifying programs for pre-trained models of code, FSE 2022
You see what I want you to see: poisoning vulnerabilities in neural code search, FSE 2022
A Novel Neural Source Code Representation Based on Abstract Syntax Tree, ICSE 2019
Retrieval-based Neural Source Code Summarization, ICSE 2020
Cross-Domain Deep Code Search with Meta Learning, ICSE 2022
Improving Fault Localization and Program Repair with Deep Semantic Features and Transferred Knowledge, ICSE 2022
On the Evaluation of Neural Code Summarization, ICSE 2021
What Do They Capture? - A Structural Analysis of Pre-Trained Language Models for Source Code, ICSE 2022
Where is Your App Frustrating Users?, ICSE 2022
Keeping Pace with Ever-Increasing Data: Towards Continual Learning of Code Intelligence Models, ICSE 2023
Template-based Neural Program Repair, ICSE 2023
CoCoSoDa: Effective Contrastive Learning for Code Search, ICSE 2022
Towards Efficient Fine-Tuning of Pre-trained Code Models: An Experimental Study and Beyond, ISSTA 2023
Detecting Condition-Related Bugs with Control Flow Graph Neural Network, ISSTA 2023

David Lo

Xin Xia

Yang Liu

Zhenchang Xing

Gabriele Bavota

Tien Nhut Nguyen

Yi Li

Qing Wang

Chetan Arora

Lionel Briand

Lin Shi

Xiaofei Xie

Ge Li

Zhi Jin

Lingming Zhang

Mehrdad Sabetzadeh

Michael R. Lyu

What Makes Good In-Context Demonstrations for Code Intelligence Tasks with LLMs?, ASE 2023
Generative Type Inference for Python, ASE 2023
REEF: A Framework for Collecting Real-World Vulnerabilities and Fixes, ASE 2023
No more fine-tuning? an experimental evaluation of prompt tuning in code intelligence, FSE 2022
BiasAsker: Measuring the Bias in Conversational AI System, FSE 2023
AEON: a method for automatic evaluation of NLP test cases, ISSTA 2022
ARCLIN: Automated API Mention Resolution for Unformatted Texts, ICSE 2022

Neel Sundaresan

IntelliCode compose: code generation using transformer, FSE 2020
Program merge conflict resolution via neural transformers, FSE 2021
DeepDev-PERF: a deep learning-based approach for improving software performance, FSE 2022
Exploring and evaluating personalized models for code generation, FSE 2022
InferFix: End-to-End Program Repair with LLMs, FSE 2023
AdaptivePaste: Intelligent Copy-Paste in IDE, FSE 2023
Learning to Reduce False Positives in Analytic Bug Detectors, ICSE 2022

Bowen Xu

Lin Tan

Bugram: Bug detection with n-gram language models, ASE 2016
CURE: Code-Aware Neural Machine Translation for Automatic Program Repair, ICSE 2021
Revisiting Learning-based Commit Message Generation, ICSE 2023
Impact of Code Language Models on Automated Program Repair, ICSE 2023
CoCoNuT: combining context-aware neural translation models using ensemble for program repair, ISSTA 2020
How Effective Are Neural Networks for Fixing Security Vulnerabilities, ISSTA 2023

Denys Poshyvanyk

Xiaodong Gu

Answering Software Deployment Questions via Neural Machine Reading at Scale, ASE 2022
InfeRE: Step-by-Step Regex Generation via Chain of Inference, ASE 2023
Deep API learning, FSE 2016
Diet code is healthy: simplifying programs for pre-trained models of code, FSE 2022
Self-Supervised Query Reformulation for Code Search, FSE 2023
Cross-Domain Deep Code Search with Meta Learning, ICSE 2022

Toufique Ahmed

Few-shot training LLMs for project-specific code-summarization, ASE 2022
Better Patching Using LLM Prompting, via Self-Consistency, ASE 2023
Learning type annotation: is big data enough?, FSE 2021
NatGen: generative pre-training by “naturalizing” source code, FSE 2022
Multilingual training for Software Engineering, ICSE 2021
Recommending Root-Cause and Mitigation Steps for Cloud Incidents using Large Language Models, ICSE 2023

Cuiyun Gao

Pinjia He

Machine translation testing via pathological invariance, FSE 2020
Log Parsing with Generalization Ability under New Log Types, FSE 2023
BiasAsker: Measuring the Bias in Conversational AI System, FSE 2023
Automated Testing and Improvement of Named Entity Recognition Systems, FSE 2023
Structure-Invariant Testing for Machine Translation, ICSE 2019
AEON: a method for automatic evaluation of NLP test cases, ISSTA 2022

Collin McMillan

Chunyang Chen

Michele Tufano

Beijun Shen

Lancer: Your Code Tell Me What You Need, ASE 2019
Answering Software Deployment Questions via Neural Machine Reading at Scale, ASE 2022
InfeRE: Step-by-Step Regex Generation via Chain of Inference, ASE 2023
Diet code is healthy: simplifying programs for pre-trained models of code, FSE 2022
Cross-Domain Deep Code Search with Meta Learning, ICSE 2022

Xing Hu

Haoyu Wang

Prem Devanbu

Few-shot training LLMs for project-specific code-summarization, ASE 2022
Better Patching Using LLM Prompting, via Self-Consistency, ASE 2023
Learning type annotation: is big data enough?, FSE 2021
NatGen: generative pre-training by “naturalizing” source code, FSE 2022
Multilingual training for Software Engineering, ICSE 2021

Shing-Chi Cheung

Chunqiu Steven Xia

Antonio Mastropaolo

Chaozheng Wang

Zeyu Sun

A syntax-guided edit decoder for neural program repair, FSE 2021
Automatic Testing and Improvement of Machine Translation, ICSE 2019
FIRA: Fine-Grained Graph-Based Code Change Representation for Automated Commit Message Generation, ICSE 2022
Improving Machine Translation Systems via Isotopic Replacement, ICSE 2022
Tare: Type-Aware Neural Program Repair, ICSE 2023

Sallam Abualhaija

Xu Wang

A Novel Neural Source Code Representation Based on Abstract Syntax Tree, ICSE 2019
Retrieval-based Neural Source Code Summarization, ICSE 2020
Improving Fault Localization and Program Repair with Deep Semantic Features and Transferred Knowledge, ICSE 2022
Template-based Neural Program Repair, ICSE 2023
Detecting Condition-Related Bugs with Control Flow Graph Neural Network, ISSTA 2023

Hailong Sun

A Novel Neural Source Code Representation Based on Abstract Syntax Tree, ICSE 2019
Retrieval-based Neural Source Code Summarization, ICSE 2020
Improving Fault Localization and Program Repair with Deep Semantic Features and Transferred Knowledge, ICSE 2022
Template-based Neural Program Repair, ICSE 2023
Detecting Condition-Related Bugs with Control Flow Graph Neural Network, ISSTA 2023

Xudong Liu

A Novel Neural Source Code Representation Based on Abstract Syntax Tree, ICSE 2019
Retrieval-based Neural Source Code Summarization, ICSE 2020
Improving Fault Localization and Program Repair with Deep Semantic Features and Transferred Knowledge, ICSE 2022
Template-based Neural Program Repair, ICSE 2023
Detecting Condition-Related Bugs with Control Flow Graph Neural Network, ISSTA 2023

Luca Pascarella

Vincent Hellendoorn

Yao Wan

Guandong Xu

Tegawendé F. Bissyandé

Assessing the Generalizability of Code2vec Token Embeddings, ASE 2019
Evaluating Representation Learning of Code Changes for Predicting Patch Correctness in Program Repair, ASE 2020
Natural Language to Code: How Far Are We?, FSE 2023
CodeGrid: A Grid Representation of Code, ISSTA 2023

Shaohua Wang

Jianjun Zhao

Songqiang Chen

Xiaoyuan Xie

Shangqing Liu

Jia Li

EditSum: A Retrieve-and-Edit Framework for Source Code Summarization, ASE 2021
ZC3: Zero-Shot Cross-Language Code Clone Detection, ASE 2023
SkCoder: A Sketch-based Approach for Automatic Code Generation, ICSE 2023

Jidong Ge

Pengyu Nie

CoditT5: Pretraining for Source Code and Natural Language Editing, ASE 2022
On the naturalness of hardware descriptions, FSE 2020
Multilingual Code Co-evolution using Large Language Models, FSE 2023
Learning Deep Semantics for Test Completion, ICSE 2023

Junyi Jessy Li

CoditT5: Pretraining for Source Code and Natural Language Editing, ASE 2022
On the naturalness of hardware descriptions, FSE 2020
Multilingual Code Co-evolution using Large Language Models, FSE 2023
Learning Deep Semantics for Test Completion, ICSE 2023

Miloš Gligorić

CoditT5: Pretraining for Source Code and Natural Language Editing, ASE 2022
On the naturalness of hardware descriptions, FSE 2020
Multilingual Code Co-evolution using Large Language Models, FSE 2023
Learning Deep Semantics for Test Completion, ICSE 2023

Shangwen Wang

Junjie Chen

Learning to Construct Better Mutation Faults, ASE 2022
Natural Test Generation for Precise Testing of Question Answering Software, ASE 2022
Robust log-based anomaly detection on unstable log data, FSE 2019
On the Evaluation of Neural Code Summarization, ICSE 2021

Qihao Zhu

Learning to Construct Better Mutation Faults, ASE 2022
A syntax-guided edit decoder for neural program repair, FSE 2021
FIRA: Fine-Grained Graph-Based Code Change Representation for Automated Commit Message Generation, ICSE 2022
Tare: Type-Aware Neural Program Repair, ICSE 2023

Xiwei Xu

Anh Tuan Nguyen

Xin Peng

Yiling Lou

Simone Scalabrino

Rocco Oliveto

Wenxuan Wang

What Makes Good In-Context Demonstrations for Code Intelligence Tasks with LLMs?, ASE 2023
Generative Type Inference for Python, ASE 2023
BiasAsker: Measuring the Bias in Conversational AI System, FSE 2023
AEON: a method for automatic evaluation of NLP test cases, ISSTA 2022

Kevin Moran

Jian Zhang

Ming Wen

Zhendong Su

On the localness of software, FSE 2014
Machine translation testing via pathological invariance, FSE 2020
Stochastic Optimization of Program Obfuscation, ICSE 2017
Structure-Invariant Testing for Machine Translation, ICSE 2019

Premkumar T. Devanbu

On the localness of software, FSE 2014
Are deep neural networks the best choice for modeling source code?, FSE 2017
On the naturalness of proofs, FSE 2018
On the naturalness of software, ICSE 2012

Miltiadis Allamanis

Learning natural coding conventions, FSE 2014
Mining idioms from source code, FSE 2014
Suggesting accurate method and class names, FSE 2015
AdaptivePaste: Intelligent Copy-Paste in IDE, FSE 2023

Charles Sutton

Learning natural coding conventions, FSE 2014
Mining idioms from source code, FSE 2014
Suggesting accurate method and class names, FSE 2015
Big Code != Big Vocabulary: Open-Vocabulary Models for Source Code, ICSE 2020

Yepang Liu

OASIS: prioritizing static analysis warnings for Android apps based on app user reviews, FSE 2017
Natural Language to Code: How Far Are We?, FSE 2023
CCT5: A Code-Change-Oriented Pre-trained Model, FSE 2023
Demystifying Privacy Policy of Third-Party Libraries in Mobile Apps, ICSE 2023

Dongmei Zhang

Robust log-based anomaly detection on unstable log data, FSE 2019
On the Evaluation of Neural Code Summarization, ICSE 2021
CoCoSoDa: Effective Contrastive Learning for Code Search, ICSE 2022
Towards Efficient Fine-Tuning of Pre-trained Code Models: An Experimental Study and Beyond, ISSTA 2023

Lu Zhang

A syntax-guided edit decoder for neural program repair, FSE 2021
Automatic Testing and Improvement of Machine Translation, ICSE 2019
Improving Machine Translation Systems via Isotopic Replacement, ICSE 2022
Tare: Type-Aware Neural Program Repair, ICSE 2023

Sumit Gulwani

Junjie Wang

Lingxiao Jiang

Saad Ezzini

Mark Harman

A study of equivalent and stubborn mutation operators using human analysis of equivalence, ICSE 2014
Automatic Testing and Improvement of Machine Translation, ICSE 2019
Improving Machine Translation Systems via Isotopic Replacement, ICSE 2022
Who Judges the Judge: An Empirical Study on Online Judge Tests, ISSTA 2023

Thibaud Lutellier

Yanlin Wang

On the Evaluation of Neural Code Summarization, ICSE 2021
CoCoSoDa: Effective Contrastive Learning for Code Search, ICSE 2022
Towards Efficient Fine-Tuning of Pre-trained Code Models: An Experimental Study and Beyond, ISSTA 2023
RefBERT: A Two-Stage Pre-trained Framework for Automatic Rename Refactoring, ISSTA 2023