Yuhao Chen

My ultimate vision is to empower individuals to create their own games and movies through accessible, powerful computer vision algorithms. To realize this, my current research focuses on advancing computer vision in diverse domains, including food computing, 3D scene understanding, robotic action analysis, and sports analytics. My work spans the full spectrum of computer vision, from 2D and 3D perception to video understanding and generative modeling, pushing the boundaries of what intelligent visual systems can achieve.

Work Experience and Education

I am a Research Assistant Professor at the Vision and Image Processing Lab (VIP) at the University of Waterloo, specializing in Computer Vision. I joined the VIP lab in 2020 as a postdoctoral fellow under the supervision of Professor Alexander Wong, was promoted to Research Associate in 2022, and became a Research Assistant Professor in 2023. I earned my B.A.Sc. and Ph.D. degrees in Electrical and Computer Engineering from Purdue University in 2015 and 2019, respectively, where I was a member of the Video and Image Processing (VIPER) laboratory under the supervision of Professor Edward J. Delp.

Prospective Students/Postdocs

I’m looking for MASc/PhD students and Postdocs for Computer Vision in Construction. Candidates with a background in SLAM, NeRF, Gaussian Splatting, or Robotics are encouraged to apply.

Our lab is also looking for students in Remote Sensing, to be supervised by Professor David Clausi.

Professional Services

Chair, CVPR 2025 MetaFood Workshop
Chair, CVPR 2024 MetaFood Workshop
Co-Chair, IAAI-24
Co-Chair, IAAI-23
Co-Chair, CVIS-23
Co-Chair, CVIS-22
Co-Chair, CVIS-21

News

I am excited to announce that I will be attending CVPR 2025! Here are the papers I will be presenting:

Main Conference

Wu, F., & Chen, Y. (2025). FruitNinja: 3D Object Interior Texture Generation with Gaussian Splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

MetaFood Workshop

Viswanath, S., Shah, K., Xi, P., Wong, A., & Chen, Y. (2025). FoodVideoQA: A Novel Baseline Framework for Dietary Monitoring. In MetaFood Workshop at CVPR 2025.
Li, J., Pena Cantu, F. J., Yu, E., Wong, A., Cui, Y., & Chen, Y. (2025). SAMJAM: Zero-Shot Video Scene Graph Generation for Egocentric Kitchen Videos. In MetaFood Workshop at CVPR 2025.
Valdes, J., Liu, S., Yang, S., Chen, Y., Wong, A., & Xi, P. (2025). Food Degradation Analysis Using Multimodal Fuzzy Clustering. In MetaFood Workshop at CVPR 2025.
Wang, E., & Chen, Y. (2025). FoodTrack: Estimating Handheld Food Portions with Egocentric Video. In MetaFood Workshop at CVPR 2025.
Lee, Y. H., & Chen, Y. (2025). Dietary Intake Estimation via Continuous 3D Reconstruction of Food. In MetaFood Workshop at CVPR 2025.
Tan, K., Yang, F., & Chen, Y. (2025). 6D Pose Estimation on Spoons and Hands. In MetaFood Workshop at CVPR 2025.

CVSports Workshop

Khanna, D., Bright, J., Chen, Y., & Zelek, J. (2025). SportMamba: Adaptive Non-Linear Multi-Object Tracking with State Space Models for Team Sports. In CVSports Workshop at CVPR 2025.
Salass, L., Bright, J., Nazemi, A., Chen, Y., Zelek, J., & Clausi, D. (2025). Ice Hockey Puck Localization Using Contextual Cues. In CVSports Workshop at CVPR 2025.

Computer Vision in the Wild Workshop

Bright, J., Wang, Z., Chen, Y., Rambhatla, S., Clausi, D. A., & Zelek, J. S. (2025). Gen4D: Synthesizing Humans and Scenes in the Wild. In Computer Vision in the Wild Workshop at CVPR 2025.

Women in Computer Vision Workshop

Buzko, K., Clausi, D., & Chen, Y. (2025). Generative Video Editing: From Unconfident to Confident. In Women in Computer Vision Workshop at CVPR 2025.
Buzko, K., Clausi, D., & Chen, Y. (2025). HAIKYU: Hockey Action Identification and Keypose Understanding. In Women in Computer Vision Workshop at CVPR 2025.

3D-LLM/VLA Workshop

Ali, M. Q., Nair, S., Wong, A., Cui, Y., & Chen, Y. (2025). GraphPad: Inference-Time 3D Scene Graph Updates for Embodied Question Answering. In 3D-LLM/VLA Workshop at CVPR 2025.