Selected Publications
For a complete list of publications, please visit my Google Scholar profile
Note: * denotes equal contribution
Vision-Language Models & VLA (3)

Video-STAR: Reinforcing Zero-shot Video Understanding with Tools
Yuan Z., Qu X., Qian C., Chen R., Tang J., Sun L., Chu X., Zhang D., Wang Y., Cai Y., Li S.
Video-STAR proposes a novel framework that reinforces zero-shot video understanding through tool-use agents with multi-turn reasoning.

AutoDrive-R²: Incentivizing Reasoning and Self-Reflection Capacity for VLA Model in Autonomous Driving
Featured by AutoDrive Heart (自动驾驶之心)
Yuan Z., Tang J., Luo J., Chen R., Qian C., Sun L., Cai Y., Zhang D., Li S.
AutoDrive-R² introduces a reasoning and self-reflection framework for Vision-Language-Action models in autonomous driving scenarios.

Reasoning-VLA: A Fast and General Vision-Language-Action Reasoning Model for Autonomous Driving
Zhang D.*, Yuan Z.*, Chen Z., Liao C., Chen Y., Shen F., Zhou Q., Chua T.
Reasoning-VLA presents a fast and general VLA reasoning model optimized for real-time autonomous driving applications.
Diffusion Models (1)

ADE-CoT: …
Yuan Z., et al.
3D Vision (6)

DVP-MVS++: Synergize Depth-Normal-Edge and Harmonized Visibility Prior for Multi-View Stereo
Yuan Z., Zhang D., Li Z., Qian C., Chen J., Chen Y., Chen K., Mao T., Li Z., Jiang H., Wang Z.
DVP-MVS++ advances multi-view stereo through synergistic depth-normal-edge and visibility prior modeling.

SED-MVS: Segmentation-Driven and Edge-Aligned Deformation Multi-View Stereo with Depth Restoration and Occlusion Constraint
Yuan Z., Yang Z., Cai Y., Wu K., Liu M., Zhang D., Jiang H., Li Z., Wang Z.
SED-MVS introduces segmentation-driven and edge-aligned deformation for robust multi-view stereo with depth restoration.

DVP-MVS: Synergize Depth-Edge and Visibility Prior for Multi-View Stereo
Yuan Z., Luo J., Shen F., Li Z., Liu C., Mao T., Wang Z.

MSP-MVS: Multi-granularity segmentation prior guided multi-view stereo
Yuan Z., Liu C., Shen F., Li Z., Luo J., Mao T., Wang Z.

SD-MVS: Segmentation-driven deformation multi-view stereo with spherical refinement and EM optimization
Yuan Z., Cao J., Li Z., Jiang H., Wang Z.

TSAR-MVS: Textureless-aware segmentation and correlative refinement guided multi-view stereo
Yuan Z., Cao J., Wang Z., Li Z.