Andrew Zhao
andrewzh
AI & ML interests
Reinforcement Learning, Agents
Recent Activity
upvoted a paper 5 days ago
HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning
upvoted a paper 11 days ago
Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-based Optimizers
upvoted a paper about 2 months ago
Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning