StarCoder2-15B
A 15B parameter model for code generation trained on 600+ programming languages.

StarCoder2-15B is a 15-billion-parameter language model designed for code generation. It was trained on over 600 programming languages from The Stack v2 dataset (honoring opt-out requests), on 4+ trillion tokens with the Fill-in-the-Middle objective, using the NVIDIA NeMo framework on NVIDIA DGX H100 systems. The architecture uses Grouped Query Attention and has a 16,384-token context window with sliding window attention of 4,096 tokens. It is not an instruction-following model; it excels at generating code snippets from surrounding context. Fine-tuning scripts are available in the StarCoder2 GitHub repository, and quantized versions are available through bitsandbytes for efficient memory usage.
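As a sketch of the quantized path: the snippet below loads the checkpoint in 4-bit via the standard transformers + bitsandbytes integration. It assumes a CUDA-capable GPU and downloads the full checkpoint on first run, so treat it as illustrative rather than something to execute casually.

```python
# Sketch: loading StarCoder2-15B in 4-bit with bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4 bits
    bnb_4bit_compute_dtype=torch.bfloat16,  # run matmuls in bf16
)

tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder2-15b")
model = AutoModelForCausalLM.from_pretrained(
    "bigcode/starcoder2-15b",
    quantization_config=quant_config,
    device_map="auto",  # let Accelerate place layers on available GPUs
)
```

Swapping `load_in_4bit` for `load_in_8bit=True` gives the 8-bit variant mentioned above, trading a bit of memory savings for slightly higher fidelity.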
StarCoder2-15B targets the common code-centric tasks: code generation, code completion, code translation, debugging, and refactoring.
Grouped Query Attention: improves inference speed and reduces memory consumption by letting groups of query heads share key/value heads.
Sliding Window Attention: lets the model cover the full 16,384-token context window at reduced computational cost by restricting each token's attention to a 4,096-token sliding window.
Fill-in-the-Middle: the model is trained to fill in missing code segments, enhancing its ability to understand and generate code in various contexts.
Quantization: supports 8-bit and 4-bit quantization via bitsandbytes, reducing memory footprint and enabling deployment on resource-constrained devices.
Multi-GPU scaling: scales training and inference across multiple GPUs using the Accelerate library.
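To illustrate how the Fill-in-the-Middle objective is used at inference time: the prompt wraps the known prefix and suffix in FIM sentinel tokens and asks the model to produce the middle. The token names below (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`) are the sentinels used by the StarCoder family; verify them against the checkpoint's tokenizer config before relying on them.

```python
# Build a Fill-in-the-Middle prompt. The sentinel tokens are assumed from
# the StarCoder family's tokenizer; check tokenizer.special_tokens_map
# for the checkpoint you actually load.
prefix = "def fibonacci(n):\n    "
suffix = "\n    return a"

fim_prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"
print(fim_prompt)
# The model generates the missing middle (e.g. a loop computing the
# n-th Fibonacci number), stopping where the suffix resumes.
```

This prompt is tokenized and passed to `model.generate` exactly like a plain completion prompt; only the sentinel layout tells the model to infill rather than continue.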
Install transformers from source: pip install git+https://github.com/huggingface/transformers.git
Install the supporting libraries: pip install accelerate bitsandbytes
Import the classes: import torch; from transformers import AutoTokenizer, AutoModelForCausalLM
Load the tokenizer: tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder2-15b")
Load the model: model = AutoModelForCausalLM.from_pretrained("bigcode/starcoder2-15b", device_map="auto", torch_dtype=torch.bfloat16)
Encode the input: inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(model.device)
Generate output: outputs = model.generate(inputs)
Decode the output: print(tokenizer.decode(outputs[0]))
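Because StarCoder2 is a base model rather than an instruction-following one, generation typically runs past the snippet you asked for. A common post-processing step is to truncate the decoded text at the next top-level definition. A minimal sketch (`truncate_completion` and the stop sequences are illustrative choices, not part of the model's API):

```python
# Cut a raw model completion at the first top-level stop sequence, so the
# caller gets only the function that was being completed. `raw` below is a
# hypothetical decoded output, not a real model response.
def truncate_completion(raw: str, stop_sequences=("\ndef ", "\nclass ", "\n\n\n")):
    cut = len(raw)
    for stop in stop_sequences:
        idx = raw.find(stop)
        if idx != -1:
            cut = min(cut, idx)  # keep everything before the earliest stop
    return raw[:cut]

raw = 'def print_hello_world():\n    print("Hello, world!")\n\ndef unrelated():\n    pass'
print(truncate_completion(raw))
# → def print_hello_world():
#       print("Hello, world!")
```

In practice you would apply this to `tokenizer.decode(outputs[0], skip_special_tokens=True)` before showing the completion to a user.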
User feedback: "Users praise its code generation capabilities, but some note its limitations in complex tasks."