👋 About Me

I am currently pursuing my Ph.D. at the Institute of Computing Technology, Chinese Academy of Sciences, advised by Prof. Zhaoqi Wang. Concurrently, I serve as a Research Intern at AMAP, Alibaba, where I work closely with Xiangxiang Chu. I am deeply grateful for the opportunity to collaborate with exceptional researchers including Prof. Shuo Li, Prof. Yujun Cai, and Prof. Yiwei Wang, as well as Prof. Zhengzhong Tu, Prof. Manling Li, and Prof. Liang Lin. Their mentorship and insights have profoundly shaped my academic journey.

My research interests include Vision-Language Models (VLMs), Large Language Models (LLMs), Embodied Agents, Multimodal AI, and 3D Vision. I have published 18+ papers at top international AI conferences such as NeurIPS, ICLR, ICCV, and AAAI.

I will be graduating with my Ph.D. in June 2026 at the age of 26 and am now exploring postdoctoral opportunities starting Fall 2026. If you are interested in my profile, feel free to contact me via email (📧 yuanzhenlong21b[at]ict[dot]ac[dot]cn) or WeChat (📧 YZL20000224).

📚 Research Interests

  • Foundation Models & Pre-training 🔥🔥
    • Vision-Language Models (VLMs) / Vision-Language Action (VLA) / Spatial Intelligence
  • Model Enhancement & Post-training 🔥🔥
    • Reasoning & Alignment / Tool-Augmented RL / NLP-Enhanced Training
  • Model Interpretation 🔥🔥
    • Mechanistic Interpretability / Factuality, Truthfulness, and Social Good
  • Real-World Applications 🔥🔥
    • Embodied Agents / AI for Science / Biomedical Engineering

🔥 Main News

  • 2025.10: 🎉🎉 We propose Video-STAR, which is now available on arXiv!
  • 2025.08: 🎉🎉 Our work AutoDrive-R² was reported by AutoDrive Heart (自动驾驶之心).
  • 2025.08: 🎉🎉 We propose AutoDrive-R², which is now available on arXiv!
  • 2025.06: 🎉🎉 We propose DVP-MVS++, which is now available on arXiv!
  • 2025.05: 🎉🎉 Our work SED-MVS has been accepted by TCSVT 2025.
  • 2024.12: 🎉🎉 We propose SED-MVS, which is now available on arXiv!
  • 2024.12: 🎉🎉 Our work DVP-MVS has been accepted by AAAI 2025.
  • 2024.12: 🎉🎉 Our work MSP-MVS has been accepted by AAAI 2025.
  • 2024.08: 🎉🎉 We propose DVP-MVS, which is now available on arXiv!
  • 2024.08: 🎉🎉 We propose MSP-MVS, which is now available on arXiv!
  • 2024.05: 🎉🎉 Our work TSAR-MVS has been accepted by PR 2024.
  • 2024.01: 🎉🎉 We propose TSAR-MVS, which is now available on arXiv!
  • 2023.12: 🎉🎉 Our work SD-MVS has been accepted by AAAI 2024.
  • 2023.09: 🎉🎉 We propose SD-MVS, which is now available on arXiv!

📝 Main Publications

Multimodal LLMs Post-Training

Preprint

Video-STAR: Reinforcing Zero-shot Video Understanding with Tools

Yuan, Z., Qu, X., Qian, C., Chen, R., Tang, J., Sun, L., Chu, X., Zhang, D., Wang, Y., Cai, Y., Li, S.

[Paper]

Preprint

AutoDrive-R²: Incentivizing Reasoning and Self-Reflection Capacity for VLA Model in Autonomous Driving

Yuan, Z., Tang, J., Luo, J., Chen, R., Qian, C., Sun, L., Cai, Y., Zhang, D., Li, S.

[Paper]

3D Vision

TCSVT

DVP-MVS++: Synergize Depth-Normal-Edge and Harmonized Visibility Prior for Multi-View Stereo

Yuan, Z., Zhang, D., Li, Z., Qian, C., Chen, J., Chen, Y., Chen, K., Mao, T., Li, Z., Jiang, H., Wang, Z.

IEEE Transactions on Circuits and Systems for Video Technology (IEEE TCSVT) (Under Review), 2025.

[Paper]

TCSVT

SED-MVS: Segmentation-Driven and Edge-Aligned Deformation Multi-View Stereo with Depth Restoration and Occlusion Constraint

Yuan, Z., Yang, Z., Cai, Y., Wu, K., Liu, M., Zhang, D., Jiang, H., Li, Z., Wang, Z.

IEEE Transactions on Circuits and Systems for Video Technology (IEEE TCSVT), 2025.

[Paper]

AAAI

DVP-MVS: Synergize Depth-Edge and Visibility Prior for Multi-View Stereo

Yuan, Z., Luo, J., Shen, F., Li, Z., Liu, C., Mao, T., Wang, Z.

AAAI Conference on Artificial Intelligence (AAAI), 2025.

[Paper] [Code]

AAAI

MSP-MVS: Multi-Granularity Segmentation Prior Guided Multi-View Stereo

Yuan, Z., Liu, C., Shen, F., Li, Z., Luo, J., Mao, T., Wang, Z.

AAAI Conference on Artificial Intelligence (AAAI), 2025.

[Paper] [Code]

AAAI

SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical Refinement and EM Optimization

Yuan, Z., Cao, J., Li, Z., Jiang, H., Wang, Z.

AAAI Conference on Artificial Intelligence (AAAI), 2024.

[Paper] [Code]

PR

TSAR-MVS: Textureless-Aware Segmentation and Correlative Refinement Guided Multi-View Stereo

Yuan, Z., Cao, J., Wang, Z., Li, Z.

Pattern Recognition (PR), 2024.

[Paper] [Code]

📖 All Publications

🏆 Awards and Service

  • 2024.12 Lenovo Enterprise Scholarship (Top 3%)
  • 2025.10 ICT National Scholarship (Top 5%)
  • Conference Reviewer: NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, AAAI
  • Journal Reviewer: IJCV, TIP, TMM, TNNLS, TCSVT, PR