Zhi-ning Liu
你好 / Hello / 안녕하세요 / こんにちは / Bonjour / Hola (and more)!
Ph.D. Candidate · University of Illinois Urbana-Champaign (UIUC)
Advised by Prof. Hanghang Tong at IDEA-ISAIL Lab
I do research and build open-source systems for Data-centric Trustworthy AI, with a focus on reliable LLM/VLM reasoning, graph and time-series data mining, class-imbalanced learning, meta ensemble learning, and AI ethics/fairness.
LLM/VLM Reasoning: agentic reasoning, evidence grounding, mechanistic interpretability
Graph & Time Series Mining: scalable model fusion, graph neural networks, temporal forecasting
Class-imbalanced Learning: data curation, efficient ensemble methods, few-shot learning
Trustworthy AI: data-encoded unfairness, bias attribution, AI ethics and morality
Education

University of Illinois Urbana-Champaign
Ph.D. in Computer Science · 2022 - 2026 (expected)
Department of Computer Science · Advisor: Prof. Hanghang Tong

Jilin University
M.Eng. in Computer Science · 2019 - 2022
School of Artificial Intelligence · Advisor: Prof. Yi Chang
B.Sc. in Computer Science · 2015 - 2019
Tang-Aoqing Honors Program · Computer Science
Experience

Amazon
Applied Scientist II · Palo Alto, CA · starting June 2026
Amazon Ads · LLM for recommendation
Applied Scientist Intern · Palo Alto, CA · May - Dec 2025
Amazon Ads · Vision-language model reasoning reliability -> ICLR 2026, ICML 2026, ACL 2026
Applied Scientist Intern · Palo Alto, CA · May - Aug 2024
Amazon Rufus · RAG-based language model reasoning -> ACL 2025
Applied Scientist Intern · Seattle, WA · May - Aug 2023
Amazon Search · Multi-task learning for partially ordered entity ranking
Microsoft Research
Research Intern · Beijing · Aug 2018 - June 2019
Machine Learning Group · Extreme class-imbalanced learning -> ICDE 2020, NeurIPS 2020
News
Apr 2026
💼 Joining Amazon: I will join Amazon as an Applied Scientist starting June 2026 : )
Apr 2026
🎉 ACL'26: Two main papers and one findings paper accepted to ACL 2026.
Jan 2026
🇧🇷 ICLR'26: Four papers accepted to ICLR 2026. See you in Brazil.
Oct 2025
👀 VLM Perception: VLMs can see the image, but still may not use it. [PDF]
May 2025
🏆 Award: Honored to receive the C.L. and Jane Liu Award.
May 2025
💼 Intern@Amazon: Back to the Bay Area again.
Jan 2025
🎉 ICLR'25: One co-author paper on test-time adaptation for graph structural shift accepted. [PDF]
May 2024
💼 Intern@Amazon: Starting my Applied Scientist Internship in the Bay Area.
Mar 2024
⚖️ FAccT'24: Group Fairness via Group Consensus, with Eunice Chan. [PDF]
May 2023
🏥 KDD'23: Web-based Long-term Spine Treatment Outcome Forecasting, with Hangting Ye. [PDF]
May 2023
💼 Intern@Amazon: Starting my Applied Scientist Internship in Seattle.
Mar 2022
🎓 Starting Ph.D.@UIUC: I will join Prof. Hanghang Tong's group at UIUC in Fall 2022.
Apr 2020
📦 Open-source: Awesome-Imbalanced-Learning, a curated list of imbalanced learning resources.
Jul 2019
🎓 Graduation@JilinU: Received my B.Sc. from the Tang-Aoqing Honors Program in Science, Jilin University.
Sep 2018
💼 Intern@Microsoft: Starting my internship at Microsoft Research Asia. Supervisors: Dr. Jiang Bian and Dr. Wei Cao.
Publications
Publications, preprints, and submissions, sorted by year. See the full list on Google Scholar.
-
Do VLMs Have a Moral Backbone? A Study on the Fragile Morality of Vision-Language Models
ACL Findings 2026 · Key Insight: VLM moral judgments are fragile when visual evidence and textual cues create subtle ethical tension.
-
MORALISE: A Structured Benchmark for Moral Alignment in Visual Language Models
ICML 2026 · Key Insight: Structured visual moral scenarios reveal alignment failures hidden by coarse benchmark scores.
-
Agentic Reasoning for Large Language Models
arXiv preprint, 2026 · Key Insight: A unified taxonomy connects planning, tool use, memory, reflection, and self-improvement in agentic reasoning.
-
Mixture of Sequence: Theme-aware Mixture-of-Experts for Long-Sequence Recommendation
WebConf 2026 (Oral) · Key Insight: Theme-aware experts let recommender models specialize over long, shifting user behavior sequences.
-
Seeing but Not Believing: Probing the Disconnect Between Visual Attention and Answer Correctness in VLMs
ICLR 2026 · Key Insight: Correct visual attention does not guarantee correct visual-language reasoning.
-
Continual Low-Rank Adapters for LLM-based Generative Recommender Systems
ICLR 2026 · Key Insight: Continually evolving LoRA adapters preserve generative recommendation quality as user data shifts.
-
Language in the Flow of Time: Time-Series-Paired Texts Weaved into a Unified Temporal Narrative
ICLR 2026 · Key Insight: Time-series and paired text can be aligned into unified temporal narratives for multimodal reasoning.
-
TRQA: Time Series Reasoning Question And Answering Benchmark
In submission, 2026 · Key Insight: Time-series QA should test compositional reasoning over temporal patterns, not just forecasting accuracy.
-
PLANETALIGN: A Comprehensive Python Library for Benchmarking Network Alignment
ICLR 2026 · Key Insight: A unified Python library makes network alignment algorithms easier to compare, reproduce, and extend.
-
Flow Matching Meets Biology and Life Science: A Survey
npj Artificial Intelligence 2026 · Key Insight: Flow matching provides a flexible generative lens for molecular, cellular, and biological modeling tasks.
-
ReMix: Reinforcement Routing for Mixtures of LoRAs in LLM Finetuning
arXiv preprint, 2026 · Key Insight: Reinforcement routing learns how to select and combine LoRA experts during LLM finetuning.
-
Inference Scaling of LLM Ensembling: Bridging Token Spaces with Token Translation
In submission, 2026 · Key Insight: Token translation bridges heterogeneous vocabularies so LLM ensembles can scale at inference time.
-
WAPITI: A Watermark for Finetuned Open-Source LLMs
In submission, 2026 · Key Insight: Finetuned open-source LLMs can retain detectable ownership signals without sacrificing utility.
-
AdaFuse: Adaptive Ensemble Decoding for Large Language Models
ACL 2026 (Main) · Key Insight: Adaptive ensemble decoding fuses multiple LLM outputs during generation for stronger test-time reasoning.
-
Mem-Gallery: Benchmarking Multimodal Long-Term Conversational Memory for MLLM Agents
ACL 2026 (Main) · Key Insight: Multimodal agents still struggle to store, retrieve, and use long-term conversational memory.
-
CLIMB: Class-imbalanced Learning Benchmark on Tabular Data
NeurIPS 2025 · Key Insight: A standardized benchmark exposes when class-imbalanced tabular methods actually generalize.
-
Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting
ICML 2025 · Key Insight: Sample-level adaptive fusion lets heterogeneous forecasters complement each other instead of competing in silos.
-
SelfElicit: Your Language Model Secretly Knows Where the Relevant Evidence is
ACL 2025 (Main) · Key Insight: LLMs can self-elicit where relevant evidence lies, reducing reliance on external retrieval heuristics.
-
ClimateBench-M: A Multi-Modal Climate Data Benchmark with a Simple Generative Method
CIKM 2025 · Key Insight: Climate modeling benefits from benchmarks that jointly test time-series, image, and generative signals.
-
Not All Voices Are Rewarded Equally: Probing and Repairing Reward Models across Human Diversity
EMNLP Findings 2025 · Key Insight: Reward models can encode demographic preference gaps, and targeted repair can reduce those disparities.
-
LLM-RecG: A Semantic Bias-Aware Framework for Zero-Shot Sequential Recommendation
RecSys 2025 · Key Insight: Semantic group information helps LLM recommenders recognize and reduce bias in zero-shot settings.
-
Matcha: Mitigating Graph Structure Shifts with Test-Time Adaptation
ICLR 2025 · Key Insight: Test-time adaptation can mitigate graph structure shifts without retraining on the target graph.
-
THeGCN: Temporal Heterophilic Graph Convolutional Network
In submission, 2026 · Key Insight: Temporal heterophily requires graph convolutions that model changing cross-class neighborhood patterns.
-
BackTime: Backdoor Attacks on Multivariate Time Series Forecasting
NeurIPS 2024 (Spotlight) · Key Insight: Time-series forecasters can be compromised by backdoor triggers embedded in multivariate temporal patterns.
-
AIM: Attributing, Interpreting, Mitigating Data-encoded Unfairness
KDD 2024 · Key Insight: Data-encoded unfairness can be attributed, interpreted, and mitigated before it becomes model behavior.
-
Class-Imbalanced Graph Learning without Class Rebalancing
ICML 2024 · Key Insight: Bias-aware graph learning can handle class imbalance without naive resampling or class reweighting.
-
Group Fairness via Group Consensus
FAccT 2024 · Key Insight: Group consensus offers a practical fairness signal when protected groups disagree in complex ways.
-
Graph Mixup on Approximate Gromov-Wasserstein Geodesics
ICML 2024 · Key Insight: Gromov-Wasserstein geodesics create topology-aware graph mixup paths for stronger augmentation.
-
Ensuring User-side Fairness in Dynamic Recommender Systems
WWW 2024 · Key Insight: Dynamic recommenders need fairness constraints that evolve with users, items, and exposure patterns.
-
Hierarchical Multi-Marginal Optimal Transport for Network Alignment
AAAI 2024 · Key Insight: Hierarchical multi-marginal optimal transport aligns multiple networks through shared structural signals.
-
Taming Over-Smoothing Representation on Heterophilic Graphs
Information Sciences, 2023 · Key Insight: Heterophilic graphs require representation smoothing to be controlled rather than blindly increased.
-
Web-based Long-term Spine Treatment Outcome Forecasting
KDD 2023 · Key Insight: Web-based modeling can support long-term spine treatment outcome forecasting from real clinical data.
-
UADB: Unsupervised Anomaly Detection Booster
ICDE 2023 · Key Insight: A boosting wrapper can improve unsupervised anomaly detectors without relying on anomaly labels.
-
A Survey of Explainable Graph Neural Networks for Cyber Malware Analysis
IEEE BigData 2022 · Key Insight: Explainable GNN techniques can make cyber malware analysis more transparent and actionable.
-
MESA: Boost Ensemble Imbalanced Learning with Meta-sampler
NeurIPS 2020 · Key Insight: A meta-sampler can learn how to construct better ensemble training sets for imbalanced data.
-
Self-paced Ensemble for Highly Imbalanced Massive Data Classification
ICDE 2020 · Key Insight: Self-paced sampling lets ensembles learn from massive imbalanced data from easy cases to harder ones.
Selected Awards & Honors
C.L. and Jane Liu Award
UIUC, 2025
Top 10 Honorary Graduates (Highest Honor)
Jilin University, 2022
National Scholarship
Ministry of Education of China, 2020
National Scholarship
Ministry of Education of China, 2019
Gallery
Fun Facts
🌿 My Name
In Chinese, "Zhi Ning" has a gently feminine feel: Zhi (芷) means fragrant herb, and Ning (宁) means peace and tranquility. Many friends expected to meet a cute girl because of the name, and were mildly disappointed when we actually met.
🎮 Games
I enjoy nearly every kind of video game (though I'm not necessarily good at them): shooters, strategy, 4X, RPGs, roguelikes, and more. Favorites include Battlefield, Civilization, Stellaris, GTA, The Witcher, DiRT, Homeworld, Metro, BioShock, and Borderlands.
🎨 Making Things
Making things look nice and satisfying, from this page to paper figures, makes me happy. Drawing was one of my favorite things before my twenties, and I may have been better suited to design than computer science.