Zhuowan Li (李卓婉)

I am a final-year PhD student in the Computer Science department at Johns Hopkins University, co-advised by Prof. Alan Yuille and Benjamin Van Durme. I am a member of the CCVL lab.

I received my B.E. degree from Tsinghua University in 2018, where I double-majored in Electronic Engineering and Journalism and Communication. I have also enjoyed internships at Amazon AWS, Meta AI, Adobe Research, and SenseTime.

My research focuses on computer vision and natural language processing. My work relates to both large-scale pretraining and compositional models. I am also interested in model diagnosis, including robustness, generalization, and compositionality. I believe that the joint learning of vision and language offers mutual benefits.

In my spare time, I am a big fan of outdoor sports, including rock climbing (currently recovering from a finger injury), snowboarding, skiing, hiking, and mountaineering.

CV  /  Google Scholar  /  Twitter  /  Github


Email: zli110 at jhu dot edu

On the job market: I am currently on the job market and interested in both industry and academic positions. Don't hesitate to email me if there is a potential fit.

  • [June 2023] I will attend CVPR 2023 in person at Vancouver. Let me know if you want to talk with me!
  • [May 2023] Started as an applied scientist intern at Amazon AWS.
  • [May 2023] Invited talk at the Computational Cognitive Science Lab at MIT.
  • [February 2023] Super-CLEVR is accepted by CVPR 2023 as Highlight.
  • Last updated: 2023/10/29.

    3D-Aware Visual Question Answering about Parts, Poses and Occlusions
    Xingrui Wang, Wufei Ma, Zhuowan Li, Adam Kortylewski, Alan Yuille
    NeurIPS, 2023
    arXiv / code and dataset

    Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning
    Zhuowan Li, Xingrui Wang, Elias Stengel-Eskin, Adam Kortylewski, Wufei Ma, Benjamin Van Durme, Alan Yuille
    CVPR (Highlight, top 2.5%), 2023
    project page / arXiv / code and dataset

    Localization vs. Semantics: How Can Language Benefit Visual Representation Learning?
    Zhuowan Li, Cihang Xie, Benjamin Van Durme, Alan Yuille
    Under submission, 2023
    arXiv / code (to be released)

    Visual Commonsense in Pretrained Unimodal and Multimodal Models
    Chenyu Zhang, Benjamin Van Durme, Zhuowan Li*, Elias Stengel-Eskin*
    NAACL (Oral), 2022
    arXiv / code and dataset

    SwapMix: Diagnosing and Regularizing the Over-Reliance on Visual Context in Visual Question Answering
    Vipul Gupta, Zhuowan Li, Adam Kortylewski, Chenyu Zhang, Yingwei Li, Alan Yuille
    CVPR, 2022
    arXiv / code

    Calibrating Concepts and Operations: Towards Symbolic Reasoning on Real Images
    Zhuowan Li, Elias Stengel-Eskin, Yixiao Zhang, Cihang Xie, Quan Tran, Benjamin Van Durme, Alan Yuille
    ICCV, 2021
    arXiv / code

    Context-Aware Group Captioning via Self-Attention and Contrastive Features
    Zhuowan Li, Quan Tran, Long Mai, Zhe Lin, Alan Yuille
    CVPR, 2020
    arXiv / project page

    FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification
    Yixiao Ge*, Zhuowan Li*, Haiyu Zhao, Guojun Yin, Shuai Yi, Xiaogang Wang, Hongsheng Li
    NeurIPS, 2018
    arXiv / project page / code

    Website theme stolen from Jon Barron.