Article Open-source DeepResearch – Freeing our search agents +3 m-ric, albertvillanova, merve, thomwolf, clefourrier • Feb 4, 2025 • 1.32k
Running Featured 85 Distilling 100B+ Models 40x Faster with TRL 📝 85 TRL distillation for 100B+ teachers, 40x faster
Marco-MoE Collection A suite of multilingual MoE models with highly sparse architectures • 5 items • Updated Apr 8 • 17
Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 902