Attentive Class Activation Token (AttCAT)
A novel Transformer explanation technique, Attentive Class Activation Tokens (AttCAT), leveraging encoded features, their gradients, and their attention weights to generate faithful and confident explanations of a Transformer's output.
Qiang, Y., Pan, D., Li, C., Li, X., Jang, R. and Zhu, D. (2022) AttCAT: Explaining Transformers via Attentive Class Activation Tokens. In the proceedings of the Thirty-Sixth Conference on Neural Information Processing Systems (NeurIPS-22), New Orleans, LA, USA.
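For intuition, a minimal sketch of an AttCAT-style score on toy tensors; the variable names, shapes, and single-head, single-layer setting are illustrative assumptions, not the paper's implementation (which aggregates across heads and layers):

import numpy as np

# Toy setup: 4 tokens, hidden size 8, illustrative values only.
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))        # encoded token features
grad_h = rng.normal(size=(4, 8))   # d(class logit)/d(h), e.g. from backprop
attn = rng.dirichlet(np.ones(4))   # attention weight each token receives (toy)

# Class activation per token: gradient-weighted features, summed over the hidden dim.
cat = (grad_h * h).sum(axis=1)

# Attentive class activation token score: weight each token's activation by its attention.
attcat = attn * cat
print("per-token AttCAT-style scores:", attcat)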
Adversarial Gradient Integration (AGI) explains the attribution of each pixel or token to the DNN's class prediction by integrating gradients along the paths from adversarial examples to the test example for the target class.
Pan, D., Li, X. and Zhu, D. (2021) Explaining Deep Neural Network Models with Adversarial Gradient Integration. In the proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI-21), Montreal, Canada.
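A simplified sketch of the idea on a toy linear classifier: it crafts an adversarial example by gradient steps and then integrates gradients along a straight path back to the test input (the paper integrates along the adversarial trajectory itself); all names, step sizes, and step counts are illustrative assumptions:

import torch

# Toy differentiable "model": a fixed linear classifier over 5 features, 3 classes.
torch.manual_seed(0)
W = torch.randn(3, 5)

def logits(x):
    return x @ W.T

x = torch.randn(5)            # test example
target = 2                    # class to explain

# Step 1: craft a simple adversarial example by descending the target-class logit.
x_adv = x.clone()
for _ in range(20):
    x_adv = x_adv.detach().requires_grad_(True)
    logits(x_adv)[target].backward()
    x_adv = x_adv - 0.1 * x_adv.grad      # push away from the target class

# Step 2: integrate gradients along a straight path from x_adv back to x.
steps, attribution = 50, torch.zeros_like(x)
for i in range(1, steps + 1):
    point = (x_adv + (x - x_adv) * i / steps).detach().requires_grad_(True)
    logits(point)[target].backward()
    attribution += point.grad / steps
attribution = attribution * (x - x_adv)   # scale by the path displacement
print("per-feature attribution:", attribution)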
An interpretable neural network architecture that enables interpretable feature mapping by dissecting latent layers guided by aspects, with demonstrated applications in explainable recommender systems.
Pan, D., Li, X., Li, X. and Zhu, D. (2020) Explainable recommendation via interpretable feature mapping and evaluating explainability. In the proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI-20), Yokohama, Japan.
This work introduces a novel transferable attack against in-context learning (ICL) that hijacks LLMs into generating the target response or jailbreaking. We also propose a defense strategy against hijacking attacks that adds extra clean demonstrations, enhancing the robustness of LLMs during ICL.
Qiang, Y., Zhou, X. and Zhu, D. (2023) Hijacking Large Language Models via Adversarial In-Context Learning. arXiv preprint (arXiv:2311).
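A purely illustrative sketch of the prompt-level setting: a placeholder suffix is appended to the in-context demonstrations, and extra clean demonstrations are added as the defense; the suffix, demo texts, and task are made up for illustration, not triggers or data from the paper:

# Illustrative only: shows how in-context demos could be perturbed and how extra
# clean demos are appended as a defense. The suffix below is a placeholder.
clean_demos = [("The movie was wonderful.", "positive"),
               ("I hated every minute.", "negative")]
adv_suffix = "<adv-tokens>"   # hypothetical adversarial suffix

def build_prompt(demos, query):
    lines = [f"Review: {x}\nSentiment: {y}" for x, y in demos]
    return "\n\n".join(lines + [f"Review: {query}\nSentiment:"])

# Attack: append the adversarial suffix to each demonstration input.
poisoned = [(x + " " + adv_suffix, y) for x, y in clean_demos]
attack_prompt = build_prompt(poisoned, "A decent film overall.")

# Defense: add extra clean demonstrations to dilute the poisoned ones.
defended_prompt = build_prompt(clean_demos + poisoned, "A decent film overall.")
print(attack_prompt)
print(defended_prompt)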
A novel gradient-guided backdoor trigger learning (GBTL) algorithm that efficiently identifies adversarial triggers which evade detection by conventional defenses while maintaining content integrity.
Qiang, Y., Zhou, X., Zade, S.Z., Roshani, M.A., Khanduri, P., Zytko, D. and Zhu, D. (2024) Learning to Poison Large Language Models During Instruction Tuning. arXiv preprint arXiv:2402.13459.
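A toy sketch of gradient-guided trigger selection in the spirit of GBTL: it scores single-token swaps of the trigger by a first-order approximation of the loss change; the bag-of-embeddings "model", the surrogate loss, and all sizes are assumptions for illustration only:

import torch

torch.manual_seed(0)
vocab, dim = 50, 16
emb = torch.randn(vocab, dim)             # toy token embedding table
w = torch.randn(dim)                      # toy scorer over a summed embedding

prompt_ids = torch.tensor([3, 7, 11])     # fixed instruction tokens
trigger_id = 5                            # current trigger token (to be improved)

# Loss the trigger should minimize (stand-in for the attacker's target-output loss).
def loss(trigger_vec):
    x = emb[prompt_ids].sum(0) + trigger_vec
    return -(x @ w)

trig = emb[trigger_id].clone().requires_grad_(True)
loss(trig).backward()

# First-order score of swapping the trigger to each vocabulary token.
swap_scores = (emb - emb[trigger_id]) @ trig.grad
best = int(torch.argmin(swap_scores))     # most loss-reducing single-token swap
print("proposed trigger token id:", best)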
A new Probabilistically Compact (PC) loss that directly enlarges the probability gap between the true class and false classes to improve the adversarial robustness of DNNs.
Li, X., Li, X., Pan, D. and Zhu, D. (2021) Improving adversarial robustness via probabilistically compact loss with logit constraints. In the proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), virtual conference.
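A minimal sketch of the probability-gap term, assuming a margin parameter xi; the paper's full formulation also includes logit constraints, which are omitted here:

import torch
import torch.nn.functional as F

def pc_loss(logits, targets, xi=0.25):
    # Illustrative probabilistically-compact-style loss: penalize any false class
    # whose probability comes within a margin xi of the true class's probability.
    probs = F.softmax(logits, dim=1)
    p_true = probs.gather(1, targets.unsqueeze(1))          # (N, 1)
    margins = F.relu(probs + xi - p_true)                   # (N, C)
    margins.scatter_(1, targets.unsqueeze(1), 0.0)          # ignore the true class
    return margins.sum(dim=1).mean()

logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
print(pc_loss(logits, targets))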
We developed Debiased Self-Attention (DSA), a fairness-through-blindness approach that enforces the ViT to eliminate spurious features correlated with sensitive attributes for bias mitigation. Adversarial examples are leveraged to locate and mask the spurious features in the input image patches, with attention-weight alignment.
Qiang, Y., Li, C., Khanduri, P. and Zhu, D. (2024) Fairness-Aware Vision Transformer via Debiased Self-Attention. In the proceedings of the European Conference on Computer Vision (ECCV-24).
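A toy sketch of the masking idea, assuming random patch embeddings and a linear sensitive-attribute probe: input gradients locate the patches most useful for predicting the sensitive attribute, those patches are masked, and a KL term stands in for attention-weight alignment; none of this is the authors' code:

import torch

torch.manual_seed(0)
patches = torch.randn(16, 32, requires_grad=True)   # 16 patch embeddings, dim 32 (toy)
w_sens = torch.randn(32)                             # toy sensitive-attribute head

sens_logit = (patches @ w_sens).mean()
sens_logit.backward()
saliency = patches.grad.norm(dim=1)                  # per-patch sensitivity

mask = torch.ones(16)
mask[saliency.topk(4).indices] = 0.0                 # blind the most spurious patches
debiased_patches = patches.detach() * mask.unsqueeze(1)

# Attention-alignment idea: keep the attention distribution over remaining patches
# close to the original one (toy attention vectors here).
attn_orig = torch.softmax(torch.randn(16), dim=0)
attn_masked = torch.softmax(torch.randn(16), dim=0)
align_loss = torch.nn.functional.kl_div(attn_masked.log(), attn_orig, reduction="sum")
print(saliency.topk(4).indices, align_loss)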
(Figure: attribution map generated by CIA)
A novel generative data augmentation approach that creates counterfactual samples to make the sensitive attribute and the target attribute d-separated, achieving fairness, while the interpolation path supports attribution-based explainability.
Qiang, Y., Li, C., Brocanelli, M. and Zhu, D. (2022) Counterfactual Interpolation Augmentation (CIA): A Unified Approach to Enhance Fairness and Explainability of DNN. In the proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI-22), Messe Wien, Vienna, Austria.
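A minimal sketch of counterfactual interpolation augmentation, assuming the counterfactual x_cf (with the sensitive attribute flipped) is already available, e.g., from a generative model; here both points are random toy vectors:

import numpy as np

rng = np.random.default_rng(0)
x, x_cf, label = rng.normal(size=16), rng.normal(size=16), 1

# Augment with points along the interpolation path; the label is kept fixed so the
# target becomes independent of the sensitive attribute, and the intermediate points
# can also serve as a path for attribution-style explanations.
alphas = np.linspace(0.0, 1.0, 6)
augmented = [((1 - a) * x + a * x_cf, label) for a in alphas]
print(len(augmented), "augmented samples")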
A novel reweighted logistic loss for multi-class classification that improves the ordinary logistic loss by focusing learning on hard non-target classes (target vs. non-target classes in the one-vs.-all setting).
Li, X., Li, X., Pan, D. and Zhu, D. (2020) On the learning behavior of logistic and softmax losses for deep neural networks. In the proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), New York, USA.
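An illustrative one-vs.-all sketch that up-weights hard non-target classes by their predicted probability; the exponent gamma and the exact weighting rule are assumptions, not the paper's formula:

import torch
import torch.nn.functional as F

def reweighted_ova_logistic_loss(logits, targets, gamma=2.0):
    # One-vs.-all binary logistic loss per class, with non-target classes
    # up-weighted by how confidently they are (wrongly) predicted.
    onehot = F.one_hot(targets, logits.size(1)).float()
    p = torch.sigmoid(logits)
    bce = F.binary_cross_entropy(p, onehot, reduction="none")
    weights = torch.where(onehot.bool(), torch.ones_like(p), p.detach() ** gamma + 1.0)
    return (weights * bce).sum(dim=1).mean()

logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
print(reweighted_ova_logistic_loss(logits, targets))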
Existing DNN predictive models either rely on pre-defined patient subgroups or are one-size-fits-all. We developed a novel Deep Mixture Neural Network based predictive model for patient stratification and group-specific risk factor prioritization.
Li, X., Zhu, D.* and Levy, P. (2020) Predicting clinical outcomes with patient stratification via deep mixture neural networks. American Medical Informatics Association (AMIA-20) Summit on Clinical Research Informatics, Houston, USA. (Best Student Paper Award; *Corresponding Author) PubMed 32477657
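A minimal mixture-of-experts sketch in that spirit: a gating network soft-assigns each patient to a subgroup and each subgroup has its own risk-prediction expert; layer sizes, the number of groups, and the class name are illustrative assumptions:

import torch
import torch.nn as nn

class DeepMixtureNet(nn.Module):
    def __init__(self, in_dim=20, n_groups=3, hidden=32):
        super().__init__()
        # Gating network: soft subgroup assignment per patient.
        self.gate = nn.Sequential(nn.Linear(in_dim, n_groups), nn.Softmax(dim=1))
        # One risk-prediction expert per subgroup.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(n_groups))

    def forward(self, x):
        weights = self.gate(x)                                   # (N, G) subgroup weights
        preds = torch.cat([e(x) for e in self.experts], dim=1)   # (N, G) per-group risks
        return (weights * preds).sum(dim=1), weights

x = torch.randn(5, 20)
risk, assignment = DeepMixtureNet()(x)
print(risk.shape, assignment.shape)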
Many behavioral disorders, e.g., obesity, are multi-faceted health outcomes whose risk factors are highly specific to certain subpopulation groups residing in different geospatial districts. We developed a Multi-Task Learning (MTL) approach for prioritizing multi-level risk factors.
Wang, Dong, M., Towner, E. and Zhu, D. (2019) Prioritization of multi-level risk factors for obesity. In the proceedings of the 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM-19), 1065-1072.
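A compact multi-task sketch under simplifying assumptions: one linear head per geospatial district over a shared feature space, with risk factors ranked by coefficient magnitude; the real model and ranking rule differ:

import torch
import torch.nn as nn

torch.manual_seed(0)
n_features, n_districts = 12, 4
X = torch.randn(100, n_features)
Y = torch.randn(100, n_districts)                 # one outcome column per district (toy)

heads = nn.Linear(n_features, n_districts)        # shared input, one output per task
opt = torch.optim.Adam(heads.parameters(), lr=0.05)
for _ in range(200):
    opt.zero_grad()
    loss = ((heads(X) - Y) ** 2).mean()
    loss.backward()
    opt.step()

# Prioritize risk factors per district by |coefficient|.
for d in range(n_districts):
    top = heads.weight[d].abs().topk(3).indices.tolist()
    print(f"district {d}: top risk-factor indices {top}")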
We designed a deep 3D CNN applied to fetal blood-oxygen-level-dependent (BOLD) resting-state fMRI data to isolate the variation in fMRI signals related to the age effect.
Li, X., Hect, J., Thompson, J. and Zhu, D. (2020). Interpreting age effects of human fetal brain from spontaneous fMRI using deep 3D convolutional neural networks. IEEE International Symposium on Biomedical Imaging (ISBI-20), Iowa City, USA.
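A minimal 3D-CNN sketch for volumetric input predicting a scalar such as age; channel counts, volume size, and depth are illustrative and far smaller than any real fetal-fMRI model:

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
    nn.Conv3d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool3d(1), nn.Flatten(),
    nn.Linear(16, 1),                          # scalar output, e.g., predicted age
)
volume = torch.randn(2, 1, 16, 16, 16)         # (batch, channel, depth, height, width)
print(model(volume).shape)                     # torch.Size([2, 1])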
We present Vispi, an automatic medical image interpretation system that first annotates an image by classifying and localizing common thoracic diseases with visual support, and then generates a report with an attentive LSTM model.
Li, X., Cao, R. and Zhu, D. (2020) Vispi: Automatic visual perception and interpretation of chest X-rays. In the proceedings of the Medical Imaging with Deep Learning (MIDL-20) conference, Montreal, Canada.
Deng, N., Puetter, A., Zhang, K., Johnson, K., Zhao, Z., Taylor, C., Flemington, E. and Zhu, D. (2011) Isoform-level microRNA-155 Target Prediction using RNA-seq. Nucleic Acids Res., doi: 10.1093/nar/gkr042.
Judeh, T., Johnson, C., Kumar, A. and Zhu, D. (2013) TEAK: Topological Enrichment Analysis frameworK for detecting activated biological subpathways. Nucleic Acids Res., doi: 10.1093/nar/gks1299.