EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation Paper • 2605.23271 • Published 11 days ago • 78
AgentLens: Revealing The Lucky Pass Problem in SWE-Agent Evaluation Paper • 2605.12925 • Published 20 days ago • 3
SkillOS: Learning Skill Curation for Self-Evolving Agents Paper • 2605.06614 • Published 26 days ago • 46
HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation Paper • 2604.28196 • Published Apr 30 • 72
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 503
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 630