FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published 15 days ago • 309
SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning Paper • 2603.23483 • Published 11 days ago • 60
SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning Paper • 2603.23483 • Published 11 days ago • 60
SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning Paper • 2603.23483 • Published 11 days ago • 60
SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models Paper • 2603.16859 • Published 18 days ago • 248
SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models Paper • 2603.16859 • Published 18 days ago • 248
SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models Paper • 2603.16859 • Published 18 days ago • 248
Helios: Real Real-Time Long Video Generation Model Paper • 2603.04379 • Published about 1 month ago • 178
Learning Query-Specific Rubrics from Human Preferences for DeepResearch Report Generation Paper • 2602.03619 • Published Feb 3 • 28
iFSQ: Improving FSQ for Image Generation with 1 Line of Code Paper • 2601.17124 • Published Jan 23 • 33
PixelDiT: Pixel Diffusion Transformers for Image Generation Paper • 2511.20645 • Published Nov 25, 2025 • 35
In-Video Instructions: Visual Signals as Generative Control Paper • 2511.19401 • Published Nov 24, 2025 • 32
Plan-X: Instruct Video Generation via Semantic Planning Paper • 2511.17986 • Published Nov 22, 2025 • 18