Ph.D. Student · Rutgers University

Yang Zhou

Researching multimodal LLMs, reinforcement learning, agents, and machine learning.

I am a third-year Ph.D. student in the Department of Computer Science at Rutgers University, supervised by Prof. Dimitris N. Metaxas in the Center for Computational Biomedicine Imaging and Modeling. Before that, I obtained my M.S. degree in Control Science and Engineering from the University of Science and Technology of China.

Multimodal LLMs · RL · Agent · Machine Learning

About

A concise overview of my current research, academic background, and training.

Education

Experience

Publications

Selected publications in multimodal language modeling, agents, reinforcement learning, and computer vision.

MLLM & LLM

Evaluating LLMs When They Do Not Know the Answer: Statistical Evaluation of Mathematical Reasoning via Comparative Signals

Zihan Dong, Zhixian Zhang, Yang Zhou, Can Jin, Ruijia Wu, Linjun Zhang

(arXiv, 2026; submitted to ICML 2026)

LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation

Yang Zhou, Shiyu Zhao, Yuxiao Chen, Zhenting Wang, Can Jin, Dimitris N. Metaxas

(CVPR 2026)

Multi-Agent

What Makes Autonomous Transferable Agent Skill Generation Work?

Yang Zhou, Zihan Dong, Zhenting Wang, Can Jin, Shiyu Zhao, Bangwei Guo, Difei Gu, Mu Zhou, Dimitris N. Metaxas

(Submitted to NeurIPS 2026)

M3-Bench: Multi-Modal, Multi-Hop, Multi-Threaded Tool-Using MLLM Agent Benchmark

Yang Zhou, Mingyu Zhao, Zhenting Wang, Difei Gu, Bangwei Guo, Ruosong Ye, Ligong Han, Can Jin, Dimitris N. Metaxas

(arXiv, 2025)

CAMEL: A Framework for Finding the Scaling Laws of Agents

Ziyi Yang, ... Yang Zhou, ...

(Submitted to ICML 2026)

RL

Improving LLM Reinforcement Learning Efficiency via Importance-Sampled Difficulty Estimation and Stratified Training

Yang Zhou, Can Jin, Yanting Yang, Shiyu Zhao, Dimitris N. Metaxas

(Submitted to NeurIPS 2026)

AIRL-S: Unifying Reinforcement Learning and Search-Based Test-Time Scaling via Adversarial Inverse Reinforcement Learning

Can Jin, Yang Zhou, Qixin Zhang, Hongwu Peng, Di Zhang, Zihan Dong, Marco Pavone, Ligong Han, Zhang-Wei Hong, Tong Che, Dimitris N. Metaxas

(Submitted to ICML 2026)

Your Reward Function for RL is Your Best PRM for Search: Unifying RL and Search-Based TTS

Can Jin, Yang Zhou, Qixin Zhang, Hongwu Peng, Di Zhang, Marco Pavone, Ligong Han, Zhang-Wei Hong, Tong Che, Dimitris N. Metaxas

(arXiv, 2025)

Computer Vision & Medical Imaging

RadAlign: Advancing Radiology Report Generation with Vision-Language Concept Alignment

Difei Gu, Yunhe Gao, Yang Zhou, Mu Zhou, Dimitris Metaxas

(MICCAI 2025)

A Multimodal Spatio-Temporal GCN Model with Enhancements for Isolated Sign Recognition

Yang Zhou, Zhaoyang Xia, Yuxiao Chen, Carol Neidle, Dimitris N. Metaxas

(LREC-COLING 2024)

A Review of Convolutional Neural Network Architectures and Their Optimizations

Shuang Cong, Yang Zhou

(Artificial Intelligence Review, IF 9.588/Q1)

Diffusion Models for Sign Language Video Anonymization

Zhaoyang Xia, Yang Zhou, Ligong Han, Carol Neidle, Dimitris N. Metaxas

(LREC-COLING 2024)

Other Collaborative Work (2 papers)

LUCID-SAE: Learning Unified Vision-Language Sparse Codes for Interpretable Concept Discovery

Difei Gu, Yunhe Gao, Gerasimos Chatzoudis, Zihan Dong, Guoning Zhang, Bangwei Guo, Yang Zhou, Mu Zhou, Dimitris N. Metaxas

(arXiv, 2026; submitted to ICML 2026)

K-Prism: A Knowledge-Guided and Prompt Integrated Universal Medical Image Segmentation Model

Bangwei Guo, Yunhe Gao, Meng Ye, Difei Gu, Yang Zhou, Leon Axel, Dimitris Metaxas

(arXiv, 2025)

Earlier Work (5 papers)

Modeling & Data Analysis

A Multistory Building Evacuation Model Based on Multiple-Factor Analysis

Yang Zhou, Zichuan Fan

(Advances in Civil Engineering, IF 1.924/Q3)

Signals & Systems

Quasi-Dispersion of Air-Coupled Ultrasonic Signal for Angle-Dependent Reception

Zichuan Fan, Yang Zhou, Tianhao Qie

(Measurement, IF 5.131/Q1)

Multiple Reflective Signal Reception in Gas Flow Measurement Using Air-Coupled Leaky Lamb Waves

Zichuan Fan, Tianhao Qie, Yang Zhou

(Measurement, IF 5.131/Q1)

Teaching & Talks

Courses and presentations that reflect my academic service and communication experience.

Characteristics of Lamb Waves and Their Leakage Waves

Presenter, ICCAR International Academic Conference (May 2019)

Beijing, China

Skills & Honors

Research interests, technical background, and selected honors.

Research & Technical Areas

Research

Multimodal LLMs · Reinforcement Learning · Agent Systems · Machine Learning

Programming

Python · C++ · Java

Systems & Tools

Linux · Windows · MATLAB · Office Suite

Honors & Recognition

2019 Invention Patents (China)

2019 Utility Model Patent (China)

2019 Mathematical Contest in Modeling (COMAP) — Meritorious Winner