Sourcify
Effortlessly find and manage open-source dependencies for your projects.

A 15B parameter model trained on 600+ programming languages, designed for code generation and understanding.

StarCoder2-15B is a 15-billion-parameter language model trained on over 600 programming languages from The Stack v2 dataset. It uses a Transformer decoder architecture with grouped-query and sliding-window attention, enabling context windows of up to 16,384 tokens. The model was trained with the Fill-in-the-Middle objective on over 4 trillion tokens using NVIDIA's NeMo framework on NVIDIA DGX H100 systems. StarCoder2-15B excels at code generation, producing code snippets from surrounding context; note that it is not an instruction-tuned model and does not respond well to natural-language instructions. Quantized versions using bitsandbytes reduce its memory footprint, making it accessible on hardware ranging from CPU-only machines to multi-GPU setups.
Explore all tools that specialize in generating code snippets, explaining code logic, refactoring code, or code completion. StarCoder2-15B's focus on these domains lets it deliver optimized results for each of these tasks.
Utilizes Grouped Query Attention (GQA) to improve inference speed and reduce memory footprint compared to multi-head attention.
Implements a sliding window attention mechanism to handle long context windows of 16,384 tokens while reducing computational complexity.
Trained with the Fill-in-the-Middle (FIM) objective, enabling the model to better handle code completion tasks.
Supports quantization techniques (8-bit and 4-bit) using bitsandbytes, reducing memory footprint without significant performance degradation.
Supports multi-GPU configurations for faster training and inference via the Hugging Face Accelerate library.
Leverages bfloat16 precision for training and inference, providing a balance between accuracy and computational efficiency.
Install transformers from source: pip install git+https://github.com/huggingface/transformers.git
Import AutoModelForCausalLM and AutoTokenizer from transformers.
Load the model with AutoModelForCausalLM.from_pretrained('bigcode/starcoder2-15b') and the tokenizer with AutoTokenizer.from_pretrained('bigcode/starcoder2-15b').
Move the model to the desired device (CPU or GPU).
Encode the input code snippet using the tokenizer.
Generate code using model.generate(**inputs).
Decode the generated output using tokenizer.decode(outputs[0]).
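The steps above can be sketched end to end as follows, assuming transformers and torch are installed. The example prompt is hypothetical; any code fragment works, since the model continues code rather than following instructions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-15b"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load tokenizer and model, then move the model to the chosen device.
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16
).to(device)

# Encode a code prompt and generate a continuation.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=64)

# Decode the generated tokens back into text.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

On CPU-only machines this runs but is slow; combining it with the quantized loading path described earlier reduces the memory requirement substantially.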
Verified feedback from other users.
"Generally positive sentiment highlighting the model's code generation capabilities and efficiency, but some users report occasional inaccuracies."

End-to-end typesafe APIs made easy.

Page speed monitoring with Lighthouse, focusing on user experience metrics and data visualization.

Topcoder is a pioneer in crowdsourcing, connecting businesses with a global talent network to solve technical challenges.

Explore millions of Discord Bots and Discord Apps.

Build internal tools 10x faster with an open-source low-code platform.

Open-source RAG evaluation tool for assessing accuracy, context quality, and latency of RAG systems.

AI-powered synthetic data generation for software and AI development, ensuring compliance and accelerating engineering velocity.