Kan Jen Cheng 鄭堪任

I am an incoming PhD student in Computer Science at the University of Maryland, College Park (Fall 2026), advised by Professor Ming C. Lin. Previously, I received my bachelor's degree from UC Berkeley, where I worked on audio-visual research in Berkeley Speech Group (BAIR), advised by Professor Gopala Anumanchipalli.

My research interests lie at the intersection of audio-visual learning, multimodal perception, and generative modeling. While recent advancements in LLMs have mastered language, I view text as a reduction of reality. Instead, I aim to build multimodal systems that perceive the world through the synergy of sight and sound, grounded in spatial awareness and physical understanding.

If you would like to discuss my research or potential collaborations, feel free to contact me. I'm always open to connect and collaborate.

Kan Jen Cheng

Research Philosophies

My work is driven by two core philosophies: human-centered perception, where I model speech characteristics, affective dynamics, and joint cognitive attention to capture how humans naturally experience the world; and creative media, where I develop tools that offer precise, object-level control for content creation.

Future Directions

As an audio-visual researcher, I have witnessed the power of multimodal synergy, yet I realize that correlation alone is insufficient. True perception requires understanding the physical laws, such as geometry, dynamics, and material interactions, that govern the spaces where sight and sound coexist. Consequently, my future research will focus on spatial and physical learning to contribute to the development of a comprehensive world model. I aim to move beyond surface-level alignment to construct digital twins that not only mimic the appearance of the environment but also simulate its underlying physical reality. By grounding audio-visual generation in these physical truths, we can enable agents to reason about the world through a unified sensory experience.

Research

Professional Experiences

Berkeley AI Research (BAIR), CA, U.S.

Research Assistant • Spring 2024 - Spring 2026

With: Gopala Anumanchipalli

Education

Ph.D.

Fall 2026 - Present

University of Maryland, College Park, MD, U.S.

Ph.D. in Computer Science

B.A.

Fall 2020, Spring 2022 - Fall 2023

University of California, Berkeley, CA, U.S.

B.A. in Computer Science