Dongyao Zhu

Researcher, Engineer
Open to research collaborations!

dongyaoz32 [AT] outlook [dot] com

Research Interests

I am interested in multi-modal generative models and human-centered AI. I focus on efficient, data-centric approaches for improving and accelerating the spatial reasoning ability of multi-modal foundation models. I also study societal value alignment for large language models, covering values such as diversity, equity, and inclusion, controlled through precise human interactions.

Multi-Modality: vision large language models, spatial reasoning, visual token efficiency
Human-Centered Studies: societal AI, social value alignment
Applications: audio, music, sports simulation, computer vision, natural language processing

Publications

indicates equal contribution.

PICACO: Pluralistic In-Context Value Alignment of LLMs via Total Correlation Optimization

Han Jiang, Dongyao Zhu, Zhihua Wei, Xiaoyuan Yi, Ziang Xiao, Xing Xie

Rethinking Data Distillation: Do Not Overlook Calibration

Dongyao Zhu, Bowen Lei, Jie Zhang, Yanbo Fang, Yiqun Xie, Ruqi Zhang, Dongkuan Xu

International Conference on Computer Vision (ICCV), 2023

Efficient Informed Proposals for Discrete Distributions via Newton's Series Approximation

Yue Xiang, Dongyao Zhu, Bowen Lei, Dongkuan Xu, Ruqi Zhang

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

Projects

Real-Time Accent Conversion
Converts English utterances from one accent to another in 5 seconds, with no parallel data required.
VQ-VAE-WaveNet
VQ-VAE with WaveNet decoder based on TensorFlow (picture source: https://avdnoord.github.io/homepage/vqvae/)
MindAudio
An audio research toolkit.
MindONE
A generative AI repository, including Stable Diffusion 2.1 and a ChatGPT detector (MPU-RoBERTA).

Vitæ

Available upon request.

  • North Carolina State University 2024 - Present
    Ph.D. Student
  • Microsoft Research Asia 2023 - 2024
    Research Assistant
  • Huawei Technologies Co., Ltd. 2021 - 2023
    Software Engineer
    Distributed Data Lab, 2012 Labs
  • Phonetic AI, Inc. 2020 - 2021
    Deep Learning Engineer
  • University of California, San Diego 2016 - 2020
    B.Sc. Student
    B.Sc. in Computer Science
    B.A. in Economics
    Provost Honors

Services

Reviewer
  • CVPR 2026, ICLR 2026, NeurIPS 2025, AAAI (2024, 2025), International Workshop on Resource-Efficient Learning for Knowledge Discovery (2023)

Teaching
  • CSC 230, CSC 116, CSC 492 @ NCSU
  • CSE 130 @ UCSD, principles and paradigms of programming languages

Website Design
  • This website is adapted from Martin Saveski. Many thanks!