
webbigdata/Qwen3-0.6B_WBD

A lightweight Japanese-enhanced model built by continued training of Qwen3-0.6B, with improved Japanese language ability, reasoning, and everyday conversational ability.
Designed primarily to run entirely in the browser, on smartphones, and on edge devices.


News

  • Browser demo released: a demo that runs entirely in the browser, with no installation and no server required → webbigdata SLM Demo
  • Verified on a smartphone: 17.20 t/s measured on the AQUOS sense4 basic released in 2020 (Snapdragon 720G / 3 GB RAM) → demo video
  • Quantized version for smartphones released: a 4-bit quantized version built with executorch → dahara1/Qwen3-0.6B-executorch-jp

Model Overview

| Item | Details |
|---|---|
| Base Model | Qwen/Qwen3-0.6B |
| Parameters | approx. 600 million (0.6B) |
| License | Apache 2.0 |
| Languages | Japanese / English |
| Training | SFT, RL, 8-bit quantization |
| Developer | dahara1@webbigdata |

Browser Demo

No installation, no server required. Try it directly in your browser.

👉 https://webbigdata.jp/slm/

Browser Demo Screenshot

Fully client-side inference via WASM + llama.cpp. A 610 MB model (0.6B parameters, 8-bit quantized) runs inference entirely in the browser.


Features

  • Stronger Japanese fundamentals: continued training on proprietary data strengthens Japanese vocabulary, knowledge, and expressiveness
  • Improved reasoning: reinforcement learning (RL) improves logical reasoning ability
  • Improved everyday Japanese conversation: trained with the goal of natural Japanese conversation
    Note: as a 0.6B model, it has limits in long conversations spanning many turns
  • Runs fully in the browser: WASM + llama.cpp let it run in the browser with no server
  • Verified on a smartphone: with executorch, 17.20 t/s measured on a budget handset released in 2020 (Snapdragon 720G / 3 GB RAM)
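Because long multi-turn conversations are a stated limitation, a client may want to cap how much history it sends with each request. The following is a minimal sketch of one way to do that, not part of this model's tooling; the helper name `trim_history` and the default of three turns are illustrative assumptions.

```python
def trim_history(messages, max_turns=3):
    """Keep system messages plus the most recent `max_turns` user turns.

    `messages` is an OpenAI-style chat list of {"role": ..., "content": ...}
    dicts. `max_turns` is an illustrative default, not a value recommended
    by the model authors.
    """
    system = [m for m in messages if m["role"] == "system"]
    dialogue = [m for m in messages if m["role"] != "system"]
    if max_turns <= 0:
        return system

    # Cut at a user message so the kept history never starts mid-turn.
    user_idx = [i for i, m in enumerate(dialogue) if m["role"] == "user"]
    start = user_idx[-max_turns] if len(user_idx) >= max_turns else 0
    return system + dialogue[start:]


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "u1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "u2"},
    {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "u3"},
]
trimmed = trim_history(history, max_turns=2)
print([m["content"] for m in trimmed])
```

Cutting at a user message, rather than slicing a fixed number of entries, keeps each retained assistant reply paired with the question it answered.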

Benchmark Results

Japanese Benchmarks

| Model | JCommonsenseQA | JNLI | JSTS | JSQuAD | Average |
|---|---|---|---|---|---|
| Qwen3-0.6B-Q8_0 (baseline) | 62.40% | 32.20% | 17.20% | 76.00% | 46.95% |
| Qwen3-0.6B_WBD (this model) | 59.60% | 72.60% | 35.60% | 82.00% | 62.45% |

Continued training raised the average score from 46.95% to 62.45% (+15.5pt). JNLI (natural language inference) in particular improved substantially, by +40.4pt.

The slight drop on JCommonsenseQA is because the added knowledge and vocabulary led to more cases where the model wavers over subtle nuances.
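The averages and point differences quoted above can be checked directly from the four per-task scores. The snippet below is only a worked arithmetic check using the numbers in the table; the variable and function names are illustrative.

```python
# Scores (%) copied from the Japanese benchmark table.
scores = {
    "Qwen3-0.6B-Q8_0": {"JCommonsenseQA": 62.40, "JNLI": 32.20, "JSTS": 17.20, "JSQuAD": 76.00},
    "Qwen3-0.6B_WBD": {"JCommonsenseQA": 59.60, "JNLI": 72.60, "JSTS": 35.60, "JSQuAD": 82.00},
}


def average(model):
    """Unweighted mean of the four task scores, rounded to two decimals."""
    values = scores[model].values()
    return round(sum(values) / len(values), 2)


def delta(task):
    """Point difference (this model minus baseline) for one task."""
    return round(scores["Qwen3-0.6B_WBD"][task] - scores["Qwen3-0.6B-Q8_0"][task], 2)


baseline_avg = average("Qwen3-0.6B-Q8_0")
wbd_avg = average("Qwen3-0.6B_WBD")
avg_gain = round(wbd_avg - baseline_avg, 2)
deltas = {task: delta(task) for task in scores["Qwen3-0.6B_WBD"]}

print(baseline_avg, wbd_avg, avg_gain)
print(deltas)
```

The average is an unweighted mean over the four tasks, so the large JNLI gain outweighs the small JCommonsenseQA loss.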

Comparison with Other Models

Other Japanese-specialized models of similar size exist, such as NTT's tsuzumi (0.6B), but few publish concrete JCommonsenseQA, JNLI, JSTS, and JSQuAD scores, so a direct comparison on the same benchmarks has not yet been possible. The evaluation conditions for this model are published so that the results can be reproduced.

M-IFEval (Japanese instruction-following ability)

| Model | prompt-level (strict) | instruction-level (strict) |
|---|---|---|
| Qwen3-0.6B-Q8_0 | 0.366 | 0.420 |
| Qwen3-0.6B_WBD | 0.238 | 0.314 |

On the M-IFEval drop: the evaluation set mixes in tasks that fit poorly with Japanese-focused training, such as "translate into a language other than English".
On Japanese-specific tasks (keyword presence checks, character-count constraints, numbered lists, and so on), the model shows competitive performance.


Smartphone Performance

A 4-bit quantized version built with executorch makes it possible to run the model on a smartphone.

Test device:

| Item | Details |
|---|---|
| Model | AQUOS sense4 basic A003SH |
| Release date | November 19, 2020 (a budget smartphone released five years ago) |
| OS | Android 12 |
| SoC | Qualcomm Snapdragon 720G (octa-core) |
| RAM | 3 GB |
| Speed | 17.20 t/s |
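To put the measured throughput in practical terms, the time to generate a reply scales roughly linearly with its length. The sketch below is a back-of-the-envelope estimate only: it ignores prompt processing and model load time, and the reply lengths are arbitrary examples rather than measurements.

```python
def estimated_seconds(num_tokens, tokens_per_second=17.20):
    """Rough generation time at a fixed decode speed (default: the 17.20 t/s reported above)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# Example reply lengths (hypothetical, chosen only for illustration).
for n in (50, 200, 500):
    print(f"{n} tokens: about {estimated_seconds(n):.1f} s")
```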

📹 Demo video (YouTube Shorts)

Note: Running on a smartphone currently requires transferring files from a PC over a cable. The model is not yet distributed as an app for general users. The iPhone build has only been verified in a simulator.

Quantized version for smartphones: dahara1/Qwen3-0.6B-executorch-jp


How to Run

Using llama.cpp

Download the llama.cpp package for your hardware.
Tools that support GGUF files, such as Ollama and LM Studio, can also run the model.

Run from the CLI (Linux/Mac)

./llama-cli -hf webbigdata/Qwen3-0.6B_WBD --ctx-size 4096 --temp 0.7 --top-p 0.8 --top-k 20 --min-p 0.01 --repeat-penalty 1.05

Start llama-server and access it from a browser

./llama-server -hf webbigdata/Qwen3-0.6B_WBD --host 0.0.0.0 --port 8080 --ctx-size 4096 --temp 0.7 --top-p 0.8 --top-k 20 --min-p 0.01 --repeat-penalty 1.05

Open http://127.0.0.1:8080/ in your browser.

Access from a Python script (OpenAI-compatible API)

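from openai import OpenAI

# Point the OpenAI client at the local llama-server instance.
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="dummy",  # placeholder; the client requires some key to be set
)

response = client.chat.completions.create(
    model="webbigdata/Qwen3-0.6B_WBD",
    messages=[
        # System prompt: "You are a helpful assistant."
        {"role": "system", "content": "あなたは親切なアシスタントです。"},
        # User message: "Hello!"
        {"role": "user", "content": "こんにちは!"},
    ],
    stream=True,  # receive the reply incrementally
)

# Print each text fragment as it arrives; skip chunks that carry no text.
for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)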
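
If the full reply is needed as a single string rather than printed fragment by fragment, the streamed chunks can be accumulated first. This is a minimal sketch: `collect_stream` and `fake_chunk` are hypothetical helpers, and the stand-in chunks only mimic the `choices[0].delta.content` shape used in the script above so the example runs without a server.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices at all
            continue
        text = chunk.choices[0].delta.content
        if text:  # skip None and empty deltas
            parts.append(text)
    return "".join(parts)


def fake_chunk(text):
    """Stand-in object with the same attribute shape as a streamed chunk."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


demo_chunks = [fake_chunk("こん"), fake_chunk(None), SimpleNamespace(choices=[]), fake_chunk("にちは")]
reply = collect_stream(demo_chunks)
print(reply)
```

With a real server, the `response` object returned by `client.chat.completions.create(..., stream=True)` would be passed in place of `demo_chunks`.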

Recommended Qwen3 Parameter Settings

With greedy decoding (deterministic generation such as Temperature=0), Qwen3 is prone to problems such as repetitive output, so sampling (Temperature > 0) is strongly recommended.

| Parameter | Recommended value |
|---|---|
| Temperature | 0.7 |
| Top_P | 0.8 |
| Top_K | 20 |
| Min_P | 0.01 |
| Repetition Penalty | 1.05 |
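
When the sampling settings are sent per request instead of on the server command line, they can be bundled into keyword arguments for the OpenAI Python client. This is a sketch under assumptions: `temperature` and `top_p` are standard chat-completion fields, while passing `top_k`, `min_p`, and `repeat_penalty` through the client's `extra_body` relies on llama-server accepting those extra fields in the request body, which should be verified against the llama.cpp version in use. The helper name `sampling_kwargs` is illustrative.

```python
# Recommended values from the table above.
RECOMMENDED_SAMPLING = {
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
    "min_p": 0.01,
    "repeat_penalty": 1.05,
}

# Fields the OpenAI client accepts as regular keyword arguments.
STANDARD_FIELDS = ("temperature", "top_p")


def sampling_kwargs(params=None):
    """Split sampling settings into standard kwargs and an `extra_body` dict."""
    params = dict(RECOMMENDED_SAMPLING if params is None else params)
    if params["temperature"] <= 0:
        # Greedy decoding is discouraged for Qwen3 (see the note above).
        raise ValueError("temperature must be > 0; greedy decoding is not recommended")
    kwargs = {name: params[name] for name in STANDARD_FIELDS}
    # Assumption: the server reads these non-standard fields from the request body.
    kwargs["extra_body"] = {k: v for k, v in params.items() if k not in STANDARD_FIELDS}
    return kwargs


request_kwargs = sampling_kwargs()
print(request_kwargs)
```

The result would be used as `client.chat.completions.create(model=..., messages=..., **sampling_kwargs())`.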

Quantized Variants

| Variant | Description | Link |
|---|---|---|
| executorch 4-bit | For running on smartphones | dahara1/Qwen3-0.6B-executorch-jp |

Training Data

Private datasets collected and synthesized by webbigdata.


Acknowledgments

  • Qwen/Qwen3-0.6B: base model and prompt template
  • llama.cpp: inference engine
  • wllama: WebAssembly
  • Hugging Face: model hosting

Developer

@misc{dahara2025Qwen3-0.6B_WBD,
  author       = {dahara1@webbigdata},
  title        = {Qwen3-0.6B_WBD - Japanese-Enhanced Continual Learning Model},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/webbigdata/Qwen3-0.6B_WBD}},
  abstract     = {A lightweight Japanese-enhanced model based on Qwen3-0.6B, designed to run in browsers and on smartphones.},
}