
	---
	license: apache-2.0
	language:
	- en
	library_name: transformers.js
	tags:
	- code
	- python
	- maincoder
	- code-generation
	- reinforcement-learning
	- mcpo
	- onnx
	pipeline_tag: text-generation
	base_model: Maincode/Maincoder-1B
	---

	# Maincoder 1B — ONNX (Quantized, WebGPU)

	This is a quantized ONNX version of [Maincode/Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B), optimized for in-browser inference with [Transformers.js](https://huggingface.co/docs/transformers.js) and WebGPU.

	## Quantization

	- Format: ONNX with int4 (MatMulNBits) quantization
	- Original model size: ~5 GB (fp32)
	- Quantized model size: ~1.5 GB (q4)
	- Quantization method: `MatMulNBitsQuantizer` from `onnxruntime`, with `block_size=32` and symmetric quantization

	All tensor data is embedded in a single `.onnx` file (no external data files) for browser compatibility.
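To make "block_size=32, symmetric quantization" concrete, here is a minimal sketch of block-wise symmetric 4-bit quantization in plain JavaScript. It is an illustration only: the function names are hypothetical, and the rounding, zero-point handling, and bit packing used by `onnxruntime` may differ. The idea is that each block of 32 weights shares one scale, and each weight is stored as a small signed integer.

```javascript
// Illustrative sketch only; not the onnxruntime implementation.
// Each block of `blockSize` weights shares one scale; values map to 4-bit signed ints.

function quantizeBlock(block) {
  // Symmetric: the range is centered on zero, so only a scale is needed.
  const maxAbs = Math.max(...block.map((x) => Math.abs(x)));
  const scale = maxAbs === 0 ? 1 : maxAbs / 7; // largest magnitude maps to +/-7
  const q = block.map((x) => {
    const r = Math.round(x / scale);
    return Math.max(-8, Math.min(7, r)); // clamp to the int4 range [-8, 7]
  });
  return { q, scale };
}

function quantizeWeights(weights, blockSize = 32) {
  const blocks = [];
  for (let i = 0; i < weights.length; i += blockSize) {
    blocks.push(quantizeBlock(weights.slice(i, i + blockSize)));
  }
  return blocks;
}

function dequantizeWeights(blocks) {
  // Reconstruction: multiply each stored integer by its block's scale.
  return blocks.flatMap(({ q, scale }) => q.map((v) => v * scale));
}
```

Under this scheme the reconstruction error per weight is at most half of that block's scale, which is why a smaller block size generally trades file size for accuracy.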

	## Usage with Transformers.js

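	```javascript
	import { AutoModelForCausalLM, AutoTokenizer } from "@huggingface/transformers";

	// Load the 4-bit quantized weights and run them on the GPU through WebGPU.
	const model = await AutoModelForCausalLM.from_pretrained(
	  "shreyask/Maincoder-1B-ONNX-web",
	  { dtype: "q4", device: "webgpu" }
	);

	const tokenizer = await AutoTokenizer.from_pretrained(
	  "shreyask/Maincoder-1B-ONNX-web"
	);

	const messages = [
	  { role: "system", content: "You are Maincoder, an expert code generation assistant." },
	  { role: "user", content: "Write a binary search function in Python" },
	];

	// Apply the chat template and tokenize; return_dict yields input_ids and attention_mask.
	const input = tokenizer.apply_chat_template(messages, {
	  add_generation_prompt: true,
	  return_dict: true,
	});

	const output = await model.generate({
	  ...input,
	  max_new_tokens: 1024,
	  eos_token_id: [151643, 151645],
	});

	// Drop the prompt tokens and decode only the newly generated text.
	const generated = output.slice(null, [input.input_ids.dims[1], null]);
	const text = tokenizer.batch_decode(generated, { skip_special_tokens: true })[0];
	console.log(text);
	```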
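The example above assumes WebGPU is available. As a hedged sketch, the helper below picks a device string before loading the model. `pickDevice` is a hypothetical name, not part of Transformers.js, and this check only tests for the `navigator.gpu` entry point; a more thorough check would also call `navigator.gpu.requestAdapter()` and confirm that it resolves to a non-null adapter.

```javascript
// Hypothetical helper: choose "webgpu" when the WebGPU entry point exists, else "wasm".
// `nav` defaults to the global navigator and can be replaced with a mock for testing.
function pickDevice(nav = globalThis.navigator) {
  const hasWebGPU = Boolean(nav && "gpu" in nav && nav.gpu);
  return hasWebGPU ? "webgpu" : "wasm";
}
```

The result can be passed as the `device` option, for example `{ dtype: "q4", device: pickDevice() }`. Whether a model of this size runs acceptably on the WASM backend has not been verified here, so treat the fallback as a starting point.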

	## Base Model

	This is a quantized conversion of [Maincode/Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B). See the base model card for training details, benchmarks, and intended use.