Text Generation
Transformers
Safetensors
llama
code
conversational
Eval Results (legacy)
text-generation-inference
Instructions to use budecosystem/code-millenials-8b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use budecosystem/code-millenials-8b with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="budecosystem/code-millenials-8b") messages = [ {"role": "user", "content": "Who are you?"}, ] pipe(messages)# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("budecosystem/code-millenials-8b") model = AutoModelForCausalLM.from_pretrained("budecosystem/code-millenials-8b") messages = [ {"role": "user", "content": "Who are you?"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt", ).to(model.device) outputs = model.generate(**inputs, max_new_tokens=40) print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:])) - Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use budecosystem/code-millenials-8b with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "budecosystem/code-millenials-8b" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "budecosystem/code-millenials-8b", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker
docker model run hf.co/budecosystem/code-millenials-8b
- SGLang
How to use budecosystem/code-millenials-8b with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "budecosystem/code-millenials-8b" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "budecosystem/code-millenials-8b", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "budecosystem/code-millenials-8b" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "budecosystem/code-millenials-8b", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }' - Docker Model Runner
How to use budecosystem/code-millenials-8b with Docker Model Runner:
docker model run hf.co/budecosystem/code-millenials-8b
File size: 4,797 Bytes
6725209 b676165 49b7c6f 599355b 49b7c6f 599355b 49b7c6f 6725209 599355b 49b7c6f 599355b 49b7c6f 599355b 49b7c6f 599355b 49b7c6f 599355b 49b7c6f 599355b 49b7c6f 599355b 49b7c6f 599355b 49b7c6f 599355b 49b7c6f 85d673e 49b7c6f 599355b 49b7c6f 3be3888 49b7c6f 3be3888 49b7c6f 3be3888 49b7c6f 3be3888 49b7c6f b676165 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 | ---
license: llama3
library_name: transformers
tags:
- code
model-index:
- name: Code Millenials
results:
- task:
type: text-generation
dataset:
name: HumanEval
type: openai_humaneval
metrics:
- type: pass@1
value: 0.671
name: pass@1
verified: false
---
# Bud Code Millenials 8B
Welcome to our Code Model repository! Our model is specifically fine-tuned for code generation tasks. Bud Millenial Code Gen open-source models are currently the State of the Art (SOTA) for code generation, beating all the existing models of all sizes. We have achieved a HumanEval value of 80.48 @ Pass 1, beating proprietary models like Gemini Ultra, Claude, GPT-3.5 etc. by a large margin, and on par with GPT-4 (HumanEval ~ 82. Ref. WizardCoder). Our proprietary model (Bud Code Jr) beats GPT-4 as well with a HumanEval value of 88.2 & a context size of 168K, we will be releasing an API for Researchers, Enterprises, and potential Partners by January 2024 end. If interested, please reach out to jithinvg@bud.studio
### News 🔥🔥🔥
- [2024/04/21] We released **Code Millenials 8B** , which achieves the **67.1 pass@1** on the [HumanEval Benchmarks](https://github.com/openai/human-eval).
- [2024/01/09] We released **Code Millenials 3B** , which achieves the **56.09 pass@1** on the [HumanEval Benchmarks](https://github.com/openai/human-eval).
- [2024/01/09] We released **Code Millenials 1B** , which achieves the **51.82 pass@1** on the [HumanEval Benchmarks](https://github.com/openai/human-eval).
- [2024/01/03] We released **Code Millenials 34B** , which achieves the **80.48 pass@1** on the [HumanEval Benchmarks](https://github.com/openai/human-eval).
- [2024/01/02] We released **Code Millenials 13B** , which achieves the **76.21 pass@1** on the [HumanEval Benchmarks](https://github.com/openai/human-eval).
### HumanEval
<p align="center" width="100%">
<a ><img src="https://raw.githubusercontent.com/BudEcosystem/code-millenials/main/assets/result.png" alt="CodeMillenials" style="width: 100%; min-width: 300px; display: block; margin: auto;"></a>
</p>
For the millenial models, the eval script in the github repo is used for the above result.
Note: The humaneval values of other models are taken from the official repos of [WizardCoder](https://github.com/nlpxucan/WizardLM), [DeepseekCoder](https://github.com/deepseek-ai/deepseek-coder), [Gemini](https://deepmind.google/technologies/gemini/#capabilities) etc.
### Models
| Model | Checkpoint | HumanEval (+) | MBPP (+) |
|---------|-------------|---------------|----------|
|Code Millenials 34B | <a href="https://huggingface.co/budecosystem/code-millenials-34b" target="_blank">HF Link</a> | 80.48 (75) | 74.68 (62.9) |
|Code Millenials 13B | <a href="https://huggingface.co/budecosystem/code-millenials-13b" target="_blank">HF Link</a> | 76.21 (69.5) | 70.17 (57.6) |
|Code Millenials 8B | <a href="https://huggingface.co/budecosystem/code-millenials-8b" target="_blank">HF Link</a> | 67.1 (61.6) | - |
|Code Millenials 3B | <a href="https://huggingface.co/budecosystem/code-millenials-3b" target="_blank">HF Link</a> | 56.09 (52.43) | 55.13 (47.11) |
|Code Millenials 1B | <a href="https://huggingface.co/budecosystem/code-millenials-1b" target="_blank">HF Link</a> | 51.82 (48.17) | 53.13 (44.61) |
### 🚀 Quick Start
Inference code using the pre-trained model from the Hugging Face model hub
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("budecosystem/code-millenials-8b")
model = AutoModelForCausalLM.from_pretrained("budecosystem/code-millenials-8b")
template = """You are an exceptionally intelligent coding assistant that consistently delivers accurate and reliable responses to user instructions.
### Instruction: {instruction}
### Response:"""
instruction = <Your code instruction here>
prompt = template.format(instruction=instruction)
inputs = tokenizer(prompt, return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))
```
## Training details
The model is trained of 8 A100 80GB for approximately 50hrs.
| Hyperparameters | Value |
| :----------------------------| :-----: |
| per_device_train_batch_size | 8 |
| gradient_accumulation_steps | 1 |
| epoch | 3 |
| steps | 8628 |
| learning_rate | 2e-5 |
| lr schedular type | cosine |
| warmup ratio | 0.1 |
| optimizer | adamw |
| fp16 | True |
| GPU | 8 A100 80GB |
### Important Note
- **Bias, Risks, and Limitations:** Model may sometimes make errors, produce misleading contents, or struggle to manage tasks that are not related to coding. |