Zhuowan Li (李卓婉)
I am a final-year PhD student in the Computer Science department at Johns Hopkins University, co-advised by Prof. Alan Yuille and Benjamin Van Durme. I am a member of the CCVL lab.
I received my B.E. degree from Tsinghua University in 2018, where I double-majored in Electronic Engineering and Journalism and Communication. I have also had wonderful internships at Amazon AWS, Meta AI, Adobe Research and SenseTime.
My research focuses on computer vision and natural language processing. My work relates to both large-scale pretraining and compositional models. I am also interested in model diagnosis, including robustness, generalization, and compositionality. I believe that the joint learning of vision and language offers mutual benefits.
In my spare time, I am a big fan of outdoor sports, including rock climbing (recovering from a finger injury), snowboarding, skiing, hiking, and mountaineering.
CV  / 
Google Scholar  / 
Twitter  / 
Github
|
Email: zli110 at jhu dot edu
|
On the job market: I am interested in both industry and academic positions. Don’t hesitate to email me if there is a potential fit.
|
News
[June 2023] I will attend CVPR 2023 in person in Vancouver. Let me know if you want to talk with me!
[May 2023] Started as an applied scientist intern at Amazon AWS.
[May 2023] Invited talk at the Computational Cognitive Science Lab at MIT.
[February 2023] Super-CLEVR was accepted to CVPR 2023 as a Highlight.
Last updated: 2023/10/29.
|
|
3D-Aware Visual Question Answering about Parts, Poses and Occlusions
Xingrui Wang,
Wufei Ma,
Zhuowan Li,
Adam Kortylewski,
Alan Yuille
NeurIPS, 2023
arXiv /
code and dataset
|
|
Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning
Zhuowan Li,
Xingrui Wang,
Elias Stengel-Eskin,
Adam Kortylewski,
Wufei Ma,
Benjamin Van Durme,
Alan Yuille
CVPR (Highlight, top 2.5%), 2023
project page /
arXiv /
code and dataset
|
|
Localization vs. Semantics: How Can Language Benefit Visual Representation Learning?
Zhuowan Li,
Cihang Xie,
Benjamin Van Durme,
Alan Yuille
Under submission, 2023
arXiv /
code (to be released)
|
|
Visual Commonsense in Pretrained Unimodal and Multimodal Models
Chenyu Zhang,
Benjamin Van Durme,
Zhuowan Li*,
Elias Stengel-Eskin*
NAACL (Oral), 2022
arXiv /
code and dataset
|
|
SwapMix: Diagnosing and Regularizing the Over-Reliance on Visual Context in Visual Question Answering
Vipul Gupta,
Zhuowan Li,
Adam Kortylewski,
Chenyu Zhang,
Yingwei Li,
Alan Yuille
CVPR, 2022
arXiv /
code
|
|
Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images
Zhuowan Li,
Elias Stengel-Eskin,
Yixiao Zhang,
Cihang Xie,
Quan Tran,
Benjamin Van Durme,
Alan Yuille
ICCV, 2021
arXiv /
code
|
|
Context-Aware Group Captioning via Self-Attention and Contrastive Features
Zhuowan Li,
Quan Tran,
Long Mai,
Zhe Lin,
Alan Yuille
CVPR, 2020
arXiv /
project page
|
|
FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification
Yixiao Ge*,
Zhuowan Li*,
Haiyu Zhao,
Guojun Yin,
Shuai Yi,
Xiaogang Wang,
Hongsheng Li
NeurIPS, 2018
arXiv /
project page /
code
|
|