Xu Zhang
I am a PhD student at Illinois Institute of Technology. Previously, I received my M.S. degree in Information and Communication Engineering and my B.S. degree in Telecommunication Engineering from Huazhong University of Science and Technology (HUST), Wuhan, China. My current research focuses on adversarial robustness and multimodal large language model (MLLM) jailbreaks, with the goal of improving the reliability and security of modern learning systems.
More details are available in my CV (PDF).
Research Interests
My research centers on two closely related themes:
- Adversarial Robustness: understanding failure modes under worst-case perturbations, and designing training objectives and architectures that improve robustness without unduly sacrificing standard performance.
- MLLM Jailbreaks: analyzing why multimodal models exhibit jailbreak vulnerabilities (e.g., objective conflicts and representation gaps), and developing practical mitigation strategies that better align safety and utility.
More broadly, I aim to build learning systems whose behavior remains dependable under adversarial pressure and shifting objectives, and to develop mechanisms that make safety alignment more stable in real-world deployment.
Selected Publication
This work studies the robustness–accuracy tradeoff in mixture-of-experts models and proposes a dual-model approach to improve robustness while preserving strong standard performance.
Paper (OpenReview)