Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.
Pages
Posts
publications
National College Students’ Innovation and Entrepreneurship Program — Development of an Intelligent Aesthetic Education System Based on Large Language Models and Image Generation Models.
Type: project, Nankai University, 2023
To develop teenagers' comprehensive abilities in intelligence, emotion, willpower, and aesthetic appreciation, and to support their all-round development, we intend to build an intelligent art education virtual platform based on large language models (LLMs) and image generation models (Stable Diffusion). The platform will establish virtual reality models of famous artists from ancient and modern times. Users can interact with these artists, learn about their history, appreciate their works, and improve their aesthetic taste.
A Unified Framework Via Correction for Offline Safe Reinforcement Learning
Type: publication, Submitted to NeurIPS 2024, 2024
Offline safe reinforcement learning (Safe RL) aims to learn an optimal policy from previously collected datasets that maximizes the expected reward while satisfying given cost constraints. Directly applying Safe RL in the offline setting can fail due to the extrapolation error caused by out-of-distribution actions. Moreover, since the offline dataset may come from unsafe policies, the cost-aware learning process can still be guided toward unsafe trajectories generated by the behavioral policy. To address the dual challenge of extrapolation error and safety constraints, we introduce the Cost-Corrected Markov Decision Process (CC-MDP). It corrects unsafe policy learning through reward redistribution and a cost penalty, transforming a safe offline learning problem into a pure offline learning problem without cost constraints. We theoretically demonstrate that a CC-MDP has the same optimal value function as its corresponding constrained MDP (CMDP) in the offline setting. To demonstrate our framework, we combine CC-MDP with common offline RL algorithms. Experiments on various offline Safe RL tasks show that pure offline RL algorithms can achieve competitive rewards while satisfying constraints under our CC-MDP framework.
Enhance the Safety in Reinforcement Learning by ADRC Lagrangian Methods
Type: publication, Submitted to NeurIPS 2025, 2025
Safe reinforcement learning (Safe RL) aims to maximize rewards while satisfying cost constraints. Lagrangian-based methods are widely used to address safety concerns, but existing approaches, including PID and classical Lagrangian methods, often suffer from oscillations due to parameter sensitivity and inherent phase lag, leading to frequent safety violations during training. Active Disturbance Rejection Control (ADRC), with its adaptive and robust control capabilities, provides a compelling alternative by effectively managing parameter uncertainties and dynamic system behaviors. Leveraging these advantages, we introduce ADRC Lagrangian methods that reduce oscillations and enhance parameter robustness, positioning classical and PID Lagrangian methods as special cases within our framework. Experiments demonstrate that our method reduces safety violations by up to 74%, constraint violation magnitudes by 89%, and average costs by 67%, establishing its effectiveness for Safe RL in complex environments.
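For readers unfamiliar with the baselines named in this abstract, here is a minimal Python sketch of a PID-style Lagrange multiplier update. It is not the ADRC method from the paper; it only illustrates the classical and PID Lagrangian baselines that the abstract says are special cases. The class name, the gains `kp`, `ki`, `kd`, and the `cost_limit` value are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PIDLagrangian:
    """Illustrative PID-style multiplier update (hypothetical names and gains)."""

    kp: float = 0.0           # proportional gain
    ki: float = 0.1           # integral gain (acts as the multiplier step size)
    kd: float = 0.0           # derivative gain
    cost_limit: float = 25.0  # assumed cost budget per episode
    integral: float = 0.0     # accumulated (non-negative) integral term
    prev_cost: float = 0.0    # cost observed at the previous update

    def update(self, episode_cost: float) -> float:
        """Return the new multiplier given the latest measured episode cost."""
        error = episode_cost - self.cost_limit
        # Integral term, projected onto [0, inf) so slack cannot accumulate.
        self.integral = max(0.0, self.integral + self.ki * error)
        # Derivative term reacts only to increases in cost.
        derivative = max(0.0, episode_cost - self.prev_cost)
        self.prev_cost = episode_cost
        # The multiplier itself is also kept non-negative.
        return max(0.0, self.kp * error + self.integral + self.kd * derivative)


# With kp = kd = 0 this is projected gradient ascent on the multiplier,
# i.e. the classical Lagrangian update.
classical = PIDLagrangian(kp=0.0, ki=0.1, kd=0.0, cost_limit=25.0)
lam_after_violation = classical.update(35.0)  # cost above the limit
lam_after_recovery = classical.update(15.0)   # cost below the limit
```

In a training loop the returned multiplier would typically weight the cost term in the policy objective. How it is combined with the reward is left out here because the abstract does not specify it.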
AtomDisc: An Interpretable Atom-level Tokenizer that Boosts Molecular LLMs and Reveals Structure–Property Relationships
Type: publication, Submitted to Nature Machine Intelligence, 2025
Recent advances in large language models (LLMs) have spurred growing interest in their application to molecular modeling and property prediction. However, existing molecular LLMs either rely solely on SMILES strings—thus ignoring rich atomic and structural context—or introduce molecular features via auxiliary adapters, which fails to achieve true modality integration and interpretability. Here, we present AtomDisc, an interpretable atom-level tokenizer that discretizes local chemical environments into structure-aware tokens directly embedded within SMILES sequences. This unified representation enables LLMs to jointly model chemical syntax and atomic structure, providing both fine-grained performance gains and unprecedented interpretability. Through systematic case studies and attribution analysis, AtomDisc not only achieves state-of-the-art results on molecular generation and property prediction tasks, but also reveals new structure–property relationships, demonstrating its potential for AI-driven scientific discovery in chemistry.
talks
Talk on Introduction to Artificial Intelligence Course
Published:
In the “Introduction to Artificial Intelligence” course, we conducted paper readings in groups. Our topic was related to image generation. We read several classic papers on image generation and selected the four most classic ones to complete a literature review. Our review was rated as an excellent major assignment. As the group leader, I reported our research progress in class. You can find the corresponding slides here and the paper here.
Talk on Principles of Compilers
Published:
In the Principles of Compilers course, I completed a project for the Open Topic section that involved implementing classic lexical analysis algorithms. Specifically, I implemented Thompson's construction and the powerset construction in C++, and began work on Hopcroft's algorithm.
You can view the slides here.
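As a rough illustration of the powerset construction mentioned in the compilers talk above, here is a minimal sketch. The course project was written in C++. This sketch uses Python for brevity and is not that code. The NFA encoding (a dict keyed by `(state, symbol)`, with `None` standing for epsilon) and all function names are assumptions of this example.

```python
from collections import deque

EPS = None  # label used for epsilon transitions in this sketch


def epsilon_closure(states, delta):
    """Return every NFA state reachable from `states` using only epsilon moves."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def powerset_construction(start, accepting, delta, alphabet):
    """Build a DFA whose states are sets of NFA states."""
    d_start = epsilon_closure({start}, delta)
    d_trans, seen, queue = {}, {d_start}, deque([d_start])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            moved = set()
            for s in current:
                moved.update(delta.get((s, symbol), ()))
            if not moved:
                continue  # no target: treated as an implicit dead state
            target = epsilon_closure(moved, delta)
            d_trans[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A DFA state accepts if it contains at least one accepting NFA state.
    d_accepting = {subset for subset in seen if subset & accepting}
    return d_start, d_accepting, d_trans


def dfa_accepts(dfa, word):
    """Run the DFA on `word`; missing transitions reject."""
    start, accepting, trans = dfa
    state = start
    for ch in word:
        state = trans.get((state, ch))
        if state is None:
            return False
    return state in accepting


# Hand-built NFA for the regular expression ab*:
# 0 --a--> 1 --eps--> 2, and 2 --b--> 2, with state 2 accepting.
nfa_delta = {(0, "a"): {1}, (1, EPS): {2}, (2, "b"): {2}}
dfa = powerset_construction(0, {2}, nfa_delta, alphabet="ab")
```

Calling `dfa_accepts(dfa, "abb")` then checks a word against the resulting automaton. Minimizing it, for example with Hopcroft's algorithm, would be a separate step.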
teaching
Introduction to Artificial Intelligence
Major Course, Nankai University, 2025
I am serving as a teaching assistant in the Introduction to Artificial Intelligence course for Spring 2024.