Model Card for SFT-Bakti-8B-Base-MultiTurn-Chatbot
Model Details
Model Description
This model is a fine-tuned version of aitfindonesia/Bakti-8B-Base, designed specifically for multi-turn conversation in Indonesian. It was trained with the Unsloth library for faster, more memory-efficient training, using LoRA (Low-Rank Adaptation).
The model is optimized to retain context across multiple turns of a conversation, making it suitable for interview simulations, customer support, and general-purpose Indonesian assistants.
- Developed by: DTP Fine Tuning Team
- Model type: Causal Language Model (Fine-tuned Qwen2/3 architecture)
- Language(s) (NLP): Indonesian
- License: Apache 2.0
- Finetuned from model: aitfindonesia/Bakti-8B-Base
Uses
Direct Use
The model is designed for:
- Multi-turn chat interactions in Indonesian.
- Question Answering (QA) requiring context from previous turns.
- Roleplay interactions (e.g., interview scenarios).
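Multi-turn use depends on resending the whole conversation on every turn. The sketch below shows one way to keep that history as a list of role/content messages. The helper name `add_turn`, the accepted role names, and the Indonesian example sentences are illustrative assumptions, not part of the model's documented interface.

```python
from typing import Dict, List

Message = Dict[str, str]


def add_turn(history: List[Message], role: str, content: str) -> List[Message]:
    """Return a new history with one more message appended (hypothetical helper)."""
    # ASSUMPTION: the chat template accepts these role names.
    if role not in {"system", "user", "assistant"}:
        raise ValueError(f"unexpected role: {role!r}")
    return history + [{"role": role, "content": content}]


# Illustrative interview-style exchange; the wording is made up for this sketch.
history: List[Message] = []
history = add_turn(history, "user", "Halo, saya siap untuk wawancara.")
history = add_turn(history, "assistant", "Baik. Ceritakan pengalaman kerja Anda.")
history = add_turn(history, "user", "Saya bekerja dua tahun sebagai analis data.")

# The full list, not only the last message, is what the model should see.
# With Transformers, such a list is typically rendered through
# tokenizer.apply_chat_template(history, add_generation_prompt=True).
print(len(history), history[-1]["role"])
```

Returning a new list on each call, rather than mutating in place, makes it easier to keep separate histories for parallel conversations.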
Out-of-Scope Use
- The model should not be relied on for factually accurate output without RAG (Retrieval-Augmented Generation), as hallucinations are possible.
- Not intended for code generation tasks.
Training Details
Training Data
Dataset: dtp-fine-tuning/dtp-multiturn-interview-valid-15k
- Split: Train (90%) / Test (10%)
- Format: Multi-turn conversation format.
- Max Length: 2048 tokens
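A 90/10 split like the one listed above can be reproduced with a seeded shuffle. This is a minimal sketch, not the procedure actually used for this dataset: the function name `split_indices`, the seed, and the total of 15,000 examples (suggested only by the dataset name) are assumptions.

```python
import random
from typing import List, Tuple


def split_indices(n: int, test_fraction: float = 0.1, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle indices 0..n-1 and return (train, test) index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded, so the split is repeatable
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


# ASSUMPTION: 15,000 examples, taken from the "15k" in the dataset name.
train_idx, test_idx = split_indices(15_000)
print(len(train_idx), len(test_idx))
```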
Training Procedure
The model was fine-tuned with Unsloth on a single NVIDIA A100 (80GB) GPU. The base model is loaded with 4-bit NF4 quantization and trained via QLoRA, which reduces memory usage while maintaining performance.
Training Hyperparameters
- Training regime: QLoRA (4-bit quantization with FP16 precision)
- Optimizer: AdamW 8-bit
- Learning Rate: $2 \times 10^{-5}$
- Scheduler: Linear with 5% warmup
- Batch Size: 8 per device (Gradient Accumulation: 4)
- Epochs: 2
- LoRA Config:
  - Rank ($r$): 16
  - Alpha ($\alpha$): 32
  - Dropout: 0.05
  - Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
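A few quantities follow directly from the hyperparameters listed above: the effective batch size, the LoRA scaling factor $\alpha / r$, and an approximate step count for the schedule. The sketch below works them out. The training-set size is an assumption (15,000 examples inferred from the dataset name, 90% of them used for training), so the step and warmup counts are estimates only, and trainers can differ in how they round.

```python
import math

# Values taken from the hyperparameter list.
per_device_batch = 8
grad_accum_steps = 4
num_gpus = 1            # single A100, per the training procedure
epochs = 2
warmup_ratio = 0.05
lora_r, lora_alpha = 16, 32

# ASSUMPTION: 15,000 examples in total, 90% used for training.
num_train_examples = 15_000 * 9 // 10

# Sequences contributing to each optimizer update.
effective_batch = per_device_batch * grad_accum_steps * num_gpus

# Optimizer steps, rounding the last partial batch up.
steps_per_epoch = math.ceil(num_train_examples / effective_batch)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

# Standard LoRA scaling applied to the low-rank update.
lora_scale = lora_alpha / lora_r

print(effective_batch, lora_scale, total_steps, warmup_steps)
```

Only the effective batch size (32) and the scaling factor (2.0) depend solely on values stated in this card; the step counts change with the real dataset size.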
Hardware
- GPU: NVIDIA A100 80GB PCIe
- VRAM Usage: Peak allocation approx. 19GB (about 23% of the 80GB available), kept low by 4-bit loading.
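As a rough sanity check on the memory figures, the arithmetic below compares the footprint of the weights at 4 bits and at 16 bits per parameter. It assumes exactly 8 billion parameters (read off the "8B" in the model name) and ignores quantization constants, LoRA adapters, optimizer state, and activations, so it is a lower bound on the weights alone rather than a reconstruction of the reported peak.

```python
# ASSUMPTION: parameter count taken from the "8B" in the model name.
params = 8e9

# Weight storage in decimal gigabytes at two precisions.
weights_4bit_gb = params * 4 / 8 / 1e9
weights_fp16_gb = params * 16 / 8 / 1e9

# Reported peak allocation relative to the card's 80 GB.
peak_gb, total_gb = 19, 80
peak_fraction = peak_gb / total_gb

print(weights_4bit_gb, weights_fp16_gb, round(peak_fraction * 100, 2))
```

Under these assumptions the 4-bit weights take about a quarter of the space of FP16 weights, which leaves most of the reported peak to activations, adapters, and optimizer state.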
Evaluation
Results
The model converges on the multi-turn dataset, with final training and evaluation losses close to each other.
- Final Train Loss: $\approx 0.42$
- Final Eval Loss: $\approx 0.41$
Note: The model outperforms the standard Qwen3-8B baseline on this specific Indonesian dataset, achieving lower loss values faster.
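The reported losses can be read as perplexities, which some readers find easier to compare. The conversion below assumes the losses are mean per-token cross-entropy in nats, the usual convention for causal language model training but not stated in this card.

```python
import math

# Final losses as reported in the Results list (approximate values).
train_loss = 0.42
eval_loss = 0.41

# ASSUMPTION: losses are mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
train_ppl = math.exp(train_loss)
eval_ppl = math.exp(eval_loss)

print(f"train perplexity ~ {train_ppl:.2f}, eval perplexity ~ {eval_ppl:.2f}")
```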
Environmental Impact
- Hardware Type: NVIDIA A100 80GB
- Compute Region: asia-east1
- Carbon Emitted: 0.31
Framework Versions
- Unsloth
- PEFT
- Transformers
- TRL