Junkang Wu
junkang0909
AI & ML interests
LLM alignment
Recent Activity
upvoted a paper 6 days ago
On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation
upvoted a paper 6 months ago
EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning
authored a paper 6 months ago
Aligning Multimodal LLM with Human Preference: A Survey
Organizations
None yet