Overview

CodeXGLUE is a benchmark for evaluating and comparing models on code intelligence tasks. It comprises 14 datasets across 10 diversified code intelligence tasks, including code-to-code translation, code summarization, and natural language code search, grouped into four scenarios: code-code, text-code, code-text, and text-text. Microsoft Research Asia, Developer Division, and Bing created it jointly. The benchmark targets tasks that improve developer productivity, such as code completion, defect detection, and code repair. It supports the development of models built on pre-trained architectures such as CodeBERT and CodeGPT, and it provides baseline models and pipelines so that researchers can participate in open challenges and contribute to the advancement of code intelligence.
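As a rough illustration of how the ten tasks fall into the four scenarios, the grouping can be sketched as a small lookup table. The names `SCENARIOS` and `scenario_of` are hypothetical and not part of any CodeXGLUE tooling; the task labels follow the benchmark's published task list, and a few of them (clone detection, cloze test, documentation translation) are not mentioned in the text above.

```python
# Illustrative only: a plain mapping of CodeXGLUE tasks to scenarios.
# SCENARIOS and scenario_of are hypothetical names, not a CodeXGLUE API.
SCENARIOS = {
    "code-code": [
        "clone detection",
        "defect detection",
        "cloze test",
        "code completion",
        "code repair",
        "code translation",
    ],
    "text-code": [
        "code search",
        "code generation",
    ],
    "code-text": [
        "code summarization",
    ],
    "text-text": [
        "documentation translation",
    ],
}


def scenario_of(task: str) -> str:
    """Return the scenario a task belongs to (case-insensitive match)."""
    wanted = task.strip().lower()
    for scenario, tasks in SCENARIOS.items():
        if wanted in tasks:
            return scenario
    raise KeyError(f"unknown task: {task!r}")


if __name__ == "__main__":
    print(scenario_of("Code Summarization"))
    print(scenario_of("Defect Detection"))
```

A flat dictionary is enough here because each task belongs to exactly one scenario; a reverse index would only be worth building if lookups were frequent.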

Common tasks

Code Completion, Code Translation, Code Search, Defect Detection, Code Summarization, Code Understanding, Code Generation, Model Evaluation