Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs Paper • 2502.12982 • Published Feb 18, 2025 • 19
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents Paper • 2604.07429 • Published 4 days ago • 8
PokeGym: A Visually-Driven Long-Horizon Benchmark for Vision-Language Models Paper • 2604.08340 • Published 3 days ago • 3
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver Paper • 2604.08377 • Published 3 days ago • 218
MolmoWeb: Open Visual Web Agent and Open Data for the Open Web Paper • 2604.08516 • Published 3 days ago • 33
OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks Paper • 2604.08539 • Published 3 days ago • 39
Post • Great experience yesterday at PyTorch Conf Europe in Paris 🇫🇷 We (w/ @kashif) talked about training LLMs through interaction, using trajectories across games, browsers, or simulators. Room was packed, a clear sign of interest in where RL post-training is heading. Sharing the slides! 🤓 https://drive.google.com/file/d/16k7YRnf5EJEo0XjXGlRJ_hVeLoFWKyNP/view?usp=sharing
INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling Paper • 2604.07209 • Published 4 days ago • 27
VenusBench-Mobile: A Challenging and User-Centric Benchmark for Mobile GUI Agents with Capability Diagnostics Paper • 2604.06182 • Published Feb 6 • 2
FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling Paper • 2604.06916 • Published 4 days ago • 16
Experience Transfer for Multimodal LLM Agents in Minecraft Game Paper • 2604.05533 • Published 5 days ago • 9
Action Images: End-to-End Policy Learning via Multiview Video Generation Paper • 2604.06168 • Published 5 days ago • 10
Scientific Graphics Program Synthesis via Dual Self-Consistency Reinforcement Learning Paper • 2604.06079 • Published 5 days ago • 4
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models Paper • 2604.04707 • Published 6 days ago • 197
FileGram: Grounding Agent Personalization in File-System Behavioral Traces Paper • 2604.04901 • Published 6 days ago • 38