---
license: apache-2.0
language:
- en
library_name: transformers.js
tags:
- code
- python
- maincoder
- code-generation
- reinforcement-learning
- mcpo
- onnx
pipeline_tag: text-generation
base_model: Maincode/Maincoder-1B
---

# Maincoder 1B - ONNX (Quantized, WebGPU)

This is a **quantized ONNX** version of [Maincode/Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B), optimized for in-browser inference with [Transformers.js](https://huggingface.co/docs/transformers.js) and WebGPU.

## Quantization

- **Format:** ONNX with int4 (MatMulNBits) quantization
- **Original model size:** ~5 GB (fp32)
- **Quantized model size:** ~1.5 GB (q4)
- **Quantization method:** `MatMulNBitsQuantizer` from `onnxruntime` with block_size=32, symmetric quantization

All tensor data is embedded in a single `.onnx` file (no external data files) for browser compatibility.
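
To make the settings above concrete, here is a minimal sketch of symmetric block-wise 4-bit quantization with a block size of 32. It is an illustration only and not the `onnxruntime` implementation: the helper names (`quantizeBlock`, `dequantizeBlock`, `quantize`), the choice of mapping the largest magnitude to 7, and the plain-array storage are all assumptions made for this example.

```javascript
// Simplified illustration of symmetric block-wise 4-bit quantization.
// NOT onnxruntime's code: scale choice, rounding, and storage are assumptions.
const BLOCK_SIZE = 32; // matches the block_size stated above

// Quantize one block of float weights to signed 4-bit integers plus one scale.
function quantizeBlock(block) {
  let maxAbs = 0;
  for (const w of block) maxAbs = Math.max(maxAbs, Math.abs(w));
  // One scale per block; an all-zero block gets scale 1 to avoid dividing by 0.
  const scale = maxAbs === 0 ? 1 : maxAbs / 7;
  // Symmetric: no per-block offset, values clamped to the range [-8, 7].
  const q = Array.from(block, (w) =>
    Math.max(-8, Math.min(7, Math.round(w / scale)))
  );
  return { scale, q };
}

// Recover approximate float weights from a quantized block.
function dequantizeBlock({ scale, q }) {
  return q.map((v) => v * scale);
}

// Split a flat weight array into blocks and quantize each one independently.
function quantize(weights) {
  const blocks = [];
  for (let i = 0; i < weights.length; i += BLOCK_SIZE) {
    blocks.push(quantizeBlock(weights.slice(i, i + BLOCK_SIZE)));
  }
  return blocks;
}

// Example: quantize 64 made-up weights and reconstruct them.
const weights = Array.from({ length: 64 }, (_, i) => Math.sin(i) * 0.1);
const restored = quantize(weights).flatMap(dequantizeBlock);
```

Because each block carries its own scale, an outlier weight only degrades the precision of the 32 values it shares a block with, which is the usual motivation for small block sizes.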

## Usage with Transformers.js

```javascript
import { AutoModelForCausalLM, AutoTokenizer } from "@huggingface/transformers";

// Load the 4-bit (q4) weights and run them on WebGPU.
const model = await AutoModelForCausalLM.from_pretrained(
  "shreyask/Maincoder-1B-ONNX-web",
  { dtype: "q4", device: "webgpu" }
);
const tokenizer = await AutoTokenizer.from_pretrained(
  "shreyask/Maincoder-1B-ONNX-web"
);

const messages = [
  { role: "system", content: "You are Maincoder, an expert code generation assistant." },
  { role: "user", content: "Write a binary search function in Python" },
];

// Apply the chat template and tokenize; return_dict yields input_ids and attention_mask.
const input = tokenizer.apply_chat_template(messages, {
  add_generation_prompt: true,
  return_dict: true,
});

// Generation stops at either of the listed end-of-sequence token IDs.
const output = await model.generate({
  ...input,
  max_new_tokens: 1024,
  eos_token_id: [151643, 151645],
});

// Decode the returned token IDs (prompt included) back to text.
const text = tokenizer.batch_decode(output, { skip_special_tokens: true })[0];
console.log(text);
```
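
The snippet above assumes the browser exposes WebGPU. A page may want to check for it first, and the sketch below shows one possible way to do that. `pickDevice` is a hypothetical helper, not part of Transformers.js, and whether the q4 weights in this repository run acceptably on the `"wasm"` fallback is an assumption that has not been verified here.

```javascript
// Hypothetical helper: pick a Transformers.js device string based on WebGPU support.
// The navigator-like object is a parameter so the logic can be exercised outside a browser.
async function pickDevice(nav = globalThis.navigator) {
  // navigator.gpu is only defined where WebGPU is exposed.
  if (!nav || !nav.gpu) return "wasm";
  try {
    // requestAdapter() resolves to null when no suitable GPU adapter is available.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm";
  } catch {
    // Treat any failure while probing the GPU as "WebGPU unavailable".
    return "wasm";
  }
}
```

The result could then be passed as the `device` option, for example `{ dtype: "q4", device: await pickDevice() }`.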

## Base Model

This is a quantized conversion of [Maincode/Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B). See the base model card for training details, benchmarks, and intended use.