Efficient Inference

Long-Context Modeling

Other

GRPO-VPS: Enhancing Group Relative Policy Optimization with Verifiable Process Supervision for Effective Reasoning

Jingyi Wang, Lei Zhu✉, Tengjin Weng, Song-Li Wu, Haochen Tan, Jierun Chen, Chaofan Tao, Haoli Bai, Lu Hou, Lifeng Shang, Xiao-Ping Zhang✉

ICLR 2026 Workshop on Logical Reasoning of Large Language Models, 2026