AI & ML interests

multi-modal foundation models

Recent Activity

geoffreychen777  updated a collection 5 days ago
mvp-engine
geoffreychen777  updated a dataset 5 days ago
mvp-lab/mvp-engine-llm-dev-data
geoffreychen777  published a dataset 5 days ago
mvp-lab/mvp-engine-llm-dev-data

oliveryanzuolu posted an update 20 days ago
Excited to share RAVEN, my first PhD project. Paper, code, and models are all released.

RAVEN targets real-time autoregressive video generation. Instead of simply appending future chunks, we train the model to better remember and use its own generated history, which leads to more realistic and natural long-horizon videos.

Technically, RAVEN repacks self-rollouts into interleaved clean historical endpoints and noisy denoising states, aligning training-time attention with inference-time extrapolation.
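
To make the interleaving idea concrete, here is a rough sketch of one plausible attention mask for a sequence laid out as [clean_0, noisy_0, clean_1, noisy_1, ...]. The post does not spell out the exact layout or masking rule, so everything here is an assumption for illustration: the function name `interleaved_mask`, its parameters, and the rule that a noisy chunk sees only strictly earlier clean chunks plus itself. It is not taken from the RAVEN code.

```python
import numpy as np

def interleaved_mask(num_chunks: int, tokens_per_chunk: int) -> np.ndarray:
    """Boolean mask (True = query may attend to key) for the assumed
    layout [clean_0, noisy_0, clean_1, noisy_1, ...]."""
    # Chunk index of every token, and a flag marking the noisy half of each chunk.
    chunk = np.repeat(np.arange(num_chunks), 2 * tokens_per_chunk)
    noisy = np.tile(np.repeat([False, True], tokens_per_chunk), num_chunks)

    q_chunk, k_chunk = chunk[:, None], chunk[None, :]
    q_noisy, k_noisy = noisy[:, None], noisy[None, :]

    # Clean keys: visible to clean queries of the same or a later chunk,
    # and to noisy queries of a strictly later chunk (history only).
    clean_visible = ~k_noisy & np.where(q_noisy, k_chunk < q_chunk, k_chunk <= q_chunk)
    # Noisy keys: visible only to noisy queries inside the same chunk.
    noisy_visible = k_noisy & q_noisy & (k_chunk == q_chunk)
    return clean_visible | noisy_visible

mask = interleaved_mask(num_chunks=3, tokens_per_chunk=2)
```

Under this assumed rule, each noisy chunk is denoised against clean history only, which mirrors how a chunk would be generated at inference time from previously produced frames.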

We also introduce CM-GRPO: reformulating consistency-model sampling as a conditional Gaussian transition kernel lets online RL directly optimize the sampler transition used at inference.
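
A minimal sketch of why a Gaussian transition kernel helps, assuming the usual multistep consistency sampling step (predict a clean sample, then re-add Gaussian noise at level sigma): each step then has a closed-form log-probability, which is what a GRPO-style objective needs. The helper names below (`transition_log_prob`, `group_advantages`, `clipped_surrogate`) are hypothetical, and the objective shown is the generic group-relative clipped surrogate, not necessarily the exact CM-GRPO loss.

```python
import numpy as np

def transition_log_prob(x_next, mean, sigma):
    """Log-density of x_next under N(mean, sigma^2 I), summed over all dims.
    `mean` stands in for the model's predicted clean sample (assumption)."""
    d = x_next.size
    sq = np.sum((x_next - mean) ** 2)
    return -0.5 * sq / sigma**2 - d * np.log(sigma) - 0.5 * d * np.log(2 * np.pi)

def group_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of rollouts for the same prompt."""
    rewards = np.asarray(rewards, dtype=np.float64)
    return (rewards - rewards.mean()) / (rewards.std() + eps)

def clipped_surrogate(logp_new, logp_old, adv, clip=0.2):
    """PPO/GRPO-style clipped objective for a single transition."""
    ratio = np.exp(logp_new - logp_old)
    return np.minimum(ratio * adv, np.clip(ratio, 1 - clip, 1 + clip) * adv)

# Toy usage: three rollouts in one group, with made-up rewards.
adv = group_advantages([1.0, 2.0, 3.0])
```

Because the log-probability is taken over the same transition the sampler executes, the gradient of this surrogate acts on the inference-time sampler itself rather than on a surrogate training process.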

Project Page: https://yanzuo.lu/raven
Paper: https://arxiv.org/abs/2605.15190
Code: https://github.com/mvp-ai-lab/RAVEN
Model: mvp-lab/RAVEN