Changan Chen 陈昌安

I am a postdoctoral researcher at the Stanford Vision and Learning Lab (SVL), hosted by Prof. Fei-Fei Li and Prof. Ehsan Adeli. I received my PhD from UT Austin, advised by Prof. Kristen Grauman. I am broadly interested in building machine learning models that perceive the world through multiple modalities and interact with it. Currently, I work on multimodal perception and generation for 3D scenes and humans.

Previously, I spent five months working with Prof. Andrea Vedaldi and Dr. Natalia Neverova at FAIR, London. I was also a visiting researcher at FAIR for two years, working with Prof. Kristen Grauman. During my undergrad, I spent a wonderful year working with Prof. Greg Mori on sports video analysis and efficient deep learning, eight months working with Prof. Alexandre Alahi on social navigation in crowds, and eight months working with Prof. Manolis Savva on relational graph reasoning for navigation.

My first name is pronounced as /tʃæn'æn/ with the g being silent.

Research opportunities: I am happy to collaborate with motivated undergraduate and master's students at Stanford. I am also happy to answer questions about my research. If you are interested, please send me an email.

CV | E-Mail | Google Scholar | Github | Twitter | Dissertation

Photo credit: Jasmin Zhang

News
May 2024 Joining the Stanford Vision and Learning Lab as a postdoctoral researcher!
May 2024 I defended my PhD dissertation 4D Audio-Visual Learning: A Visual Perspective of Sound Propagation and Production!
March 2024 We are organizing the first Multimodalities for 3D Scenes (M3DS) workshop at CVPR 2024!
March 2024 We are organizing the fifth Embodied AI workshop at CVPR 2024!
October 2023 Organizing the second AV4D workshop at ICCV 2023!
June 2023 Giving one keynote talk at the Ambient AI Workshop, ICASSP 2023, and one at the Sight and Sound Workshop, CVPR 2023
Feb 2023 Co-organizing Embodied AI Workshop and the 3rd SoundSpaces Challenge at CVPR 2023!
Jan 2023 Co-organizing L3DAS23: Learning 3D Audio Sources for Audio-Visual Extended Reality at ICASSP 2023!
October 2022 We are organizing the first AV4D: Visual Learning of Sounds in Spaces workshop at ECCV 2022!
July 2022 Joining FAIR London for a summer internship!
March 2022 I am very honored to receive the 2022 Adobe Research Fellowship!
Feb 2022 Organizing the second SoundSpaces Challenge at the Embodied AI Workshop, CVPR 2022!
Feb 2021 Organizing the first SoundSpaces Challenge at the Embodied AI Workshop, CVPR 2021!
May 2020 Joining Facebook AI Research as a visiting researcher
Publications

[NEW] HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness
Zihui Xue, Mi Luo, Changan Chen, Kristen Grauman
arXiv 2024
paper | project

[NEW] Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
Changan Chen*, Puyuan Peng*, Ami Baid, Sherry Xue, Wei-Ning Hsu, David Harwath, Kristen Grauman
ECCV 2024 (Oral)
paper | project | data | code

[NEW] Sim2Real Transfer for Audio-Visual Navigation with Frequency-Adaptive Acoustic Field Prediction
Changan Chen*, Jordi Ramos*, Anshul Tomar*, Kristen Grauman
IROS 2024
paper | project

[NEW] ActiveRIR: Active Audio-Visual Exploration for Acoustic Environment Modeling
Arjun Somayazulu, Sagnik Majumder, Changan Chen, Kristen Grauman
IROS 2024
paper | project

[NEW] SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos
Changan Chen, Kumar Ashutosh, Rohit Girdhar, David Harwath, Kristen Grauman
CVPR 2024
paper | project

[NEW] Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, ..., Changan Chen, ..., Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray
CVPR 2024 (Oral)
website | paper | video

Self-Supervised Visual Acoustic Matching
Arjun Somayazulu, Changan Chen, Kristen Grauman
NeurIPS 2023
paper | project

Replay: Multi-modal Multi-view Acted Videos for Casual Holography
Roman Shapovalov*, Yanir Kleiman*, Ignacio Rocco*, David Novotny, Andrea Vedaldi, Changan Chen, Filippos Kokkinos, Ben Graham, Natalia Neverova
ICCV 2023
paper | project | data

Novel-View Acoustic Synthesis
Changan Chen, Alexander Richard, Roman Shapovalov, Vamsi Krishna Ithapu, Natalia Neverova, Kristen Grauman, Andrea Vedaldi
CVPR 2023
paper | project | code | data

Learning Audio-Visual Dereverberation
Changan Chen, Wei Sun, David Harwath, Kristen Grauman
ICASSP 2023
paper | project | code

Retrospectives on the Embodied AI Workshop
Matt Deitke, Dhruv Batra, Yonatan Bisk, Tommaso Campari, Angel X. Chang, Devendra Singh Chaplot, Changan Chen, Claudia Pérez D'Arpino, Kiana Ehsani, Ali Farhadi, Li Fei-Fei, Anthony Francis, Chuang Gan, Kristen Grauman, David Hall, Winson Han, Unnat Jain, Aniruddha Kembhavi, Jacob Krantz, Stefan Lee, Chengshu Li, Sagnik Majumder, Oleksandr Maksymets, Roberto Martín-Martín, Roozbeh Mottaghi, Sonia Raychaudhuri, Mike Roberts, Silvio Savarese, Manolis Savva, Mohit Shridhar, Niko Sünderhauf, Andrew Szot, Ben Talbot, Joshua B. Tenenbaum, Jesse Thomason, Alexander Toshev, Joanne Truong, Luca Weihs, Jiajun Wu
arXiv 2022
paper | website

SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
Changan Chen*, Carl Schissler*, Sanchit Garg*, Philip Kobernik, Alexander Clegg, Paul Calamia, Dhruv Batra, Philip W Robinson, Kristen Grauman
NeurIPS 2022
Distinguished Paper Award at EgoVis Workshop, CVPR 2024
paper | project | website | code

Few-Shot Audio-Visual Learning of Environment Acoustics
Sagnik Majumder, Changan Chen*, Ziad Al-Halah*, Kristen Grauman
NeurIPS 2022
paper | project

Visual Acoustic Matching
Changan Chen, Ruohan Gao, Paul Calamia, Kristen Grauman
CVPR 2022 (Oral)
paper | video | project | code

Sound Adversarial Audio-Visual Navigation
Yinfeng Yu, Wenbing Huang, Fuchun Sun, Changan Chen, Yikai Wang, Xiaohong Liu
ICLR 2022
paper | project | code

Semantic Audio-Visual Navigation
Changan Chen, Ziad Al-Halah, Kristen Grauman
CVPR 2021
paper | project | code

Learning to Set Waypoints for Audio-Visual Navigation
Changan Chen, Sagnik Majumder, Ziad Al-Halah, Ruohan Gao, Santhosh K. Ramakrishnan, Kristen Grauman
ICLR 2021
paper | project | code

VisualEchoes: Spatial Image Representation Learning through Echolocation
Ruohan Gao, Changan Chen, Carl Schissler, Ziad Al-Halah, Kristen Grauman
ECCV 2020
paper | project | code

SoundSpaces: Audio-Visual Navigation in 3D Environments
Changan Chen*, Unnat Jain*, Carl Schissler, Sebastia Vicenc Amengual Gari, Ziad Al-Halah, Vamsi Krishna Ithapu, Philip Robinson, Kristen Grauman
ECCV 2020 (Spotlight)
paper | project | code | website

Relational Graph Learning for Crowd Navigation
Changan Chen*, Sha Hu*, Payam Nikdel, Greg Mori, Manolis Savva
IROS 2020
paper | code

Crowd-Robot Interaction: Crowd-aware Robot Navigation with Attention-based Deep Reinforcement Learning
Changan Chen, Yuejiang Liu, Sven Kreiss, Alexandre Alahi
ICRA 2019
paper | code

Constraint-Aware Deep Neural Network Compression
Changan Chen, Frederick Tung, Naveen Vedula, Greg Mori
ECCV 2018
paper | code

Invited Talks
Dec 2023 Invited talk at NYU, "4D Audio-Visual Perception: Simulating, Synthesizing and Navigating with Sounds in Spaces" (Slides)
June 2023 Keynote talk at PerDream Workshop, ICCV 2023, "Audio-Visual Embodied AI: From Simulating to Navigating with Sounds in Spaces" (Slides)
June 2023 Keynote talk at Sight and Sound Workshop, CVPR 2023, "Novel-view Acoustic Synthesis" (Slides)
June 2023 Keynote talk at Ambient AI Workshop, ICASSP 2023, "Visual-acoustic Learning" (Slides)
Feb 2023 Invited talk at Texas Acoustics, "Visual Learning of Sound in Spaces" (Slides)
Jan 2023 Invited talk at MIT, "Visual Learning of Sound in Spaces" (Slides)
Nov 2022 Invited talk at FAIR, Meta AI, "Visual Learning of Sound in Spaces" (Slides)
June 2022 Oral talk at CVPR 2022, "Visual Acoustic Matching" (Slides)
June 2021 Invited talk at Facebook Reality Labs, "Learning Audio-Visual Dereverberation" (Slides)
June 2021 Invited talk at EPIC Workshop, CVPR 2021, "Semantic Audio-Visual Navigation" (Slides)
Sept. 2020 Invited talk at CS391R: Robot Learning at UT Austin, "Audio-Visual Navigation" (Slides)
Dec. 2018 Invited talk at SwissAI Meetup, "Navigation in Crowds: From 2D Navigation to Visual Navigation"
Nov. 2018 Invited talk at Swiss Machine Learning Day, "Crowd-aware Robot Navigation with Attention-based DRL"
Affiliations
           
ZJU, China (2014-2016)
SFU, Canada (2016-2019)
EPFL, Switzerland (2018)
FAIR, USA & UK (2020-2022)
UT Austin, USA (2019-2024)
Stanford, USA (2024-present)

Template credits: Unnat, Jon