Zijian Chen 「陈子健」

I am currently a first-year PhD student in the Multimedia Lab at Shanghai Jiao Tong University (SJTU), advised by Prof. Guangtao Zhai and Prof. Wenjun Zhang. Previously, I received my M.E. degree from East China University of Science and Technology in June 2023 and my B.S. degree in EE from Wenzhou University in June 2020.

I'm generally interested in image/video quality assessment, large multimodal models (especially their all-round evaluation), and AI4Humanity (oracle bone character processing). My ultimate goal is to build models with human-like general intelligence that can seamlessly understand, generate, and reason across multiple modalities.

I'm always eager to communicate and collaborate, so feel free to contact me!

Email: zijian.chen@sjtu.edu.cn          

Email  /  Google Scholar  /  Github  /  Zhihu  /  Team Web

News
Preprints

* denotes equal contribution.

Just Noticeable Difference for Large Multimodal Models
Zijian Chen, Yuan Tian, Yuze Sun, Wei Sun, Zicheng Zhang, Weisi Lin, Guangtao Zhai, Wenjun Zhang
arXiv, 2025.   (NEW)
project page / Code / arXiv

In this paper, we propose a novel concept, LMM-JND, to quantify the perceptual redundancy of LMMs, together with a pipeline for determining it. We also construct a large-scale dataset, named VPA-JND, which contains 21.5k reference images with over 489k stimuli across 12 distortion types, to facilitate LMM-JND studies.

PuzzleBench: A Fully Dynamic Evaluation Framework for Large Multimodal Models on Puzzle Solving
Zeyu Zhang*, Zijian Chen*, Zicheng Zhang, Yuze Sun, Yuan Tian, Ziheng Jia, Chunyi Li, Xiaohong Liu, Xiongkuo Min, Guangtao Zhai
arXiv, 2025.   (NEW)
arXiv

In this paper, we construct PuzzleBench, a dynamic and scalable benchmark comprising 11,840 VQA samples. It features six carefully designed puzzle tasks targeting three core LMM competencies: visual recognition, logical reasoning, and context understanding.

Mitigating Long-tail Distribution in Oracle Bone Inscriptions: Dataset, Model, and Benchmark
Jinhao Li*, Zijian Chen*, Runze Jiang, Tingzhu Chen, Changbo Wang, Guangtao Zhai
ACM MM, 2025.   (Oral Presentation)
project page / arXiv

In this paper, we present Oracle-P15K, a structure-aligned dataset for oracle bone inscription (OBI) generation and denoising, consisting of 14,542 images infused with domain knowledge from OBI experts. Building on this dataset, we propose OBIDiff, a diffusion-model-based pseudo-OBI generator, to achieve realistic and controllable OBI generation.

Publications -- MLLM/LLM

* denotes equal contribution.

-----
Publications -- Image/Video Quality Assessment/Low-level

* denotes equal contribution.

Joint Luminance-Chrominance Learning for Image Debanding
Zijian Chen, Wei Sun, Jun Jia, Ru Huang, Fangfang Lu, Ying Chen, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang
IEEE Transactions on Circuits and Systems for Video Technology, 2025.   (NEW)

In this paper, we propose a unified deep neural network that explicitly disentangles the luminance and chrominance channels, and simultaneously recovers intensity gradients and color discontinuity from detection-free measurement in an end-to-end manner.

GAIA: Rethinking Action Quality Assessment for AI-Generated Videos
Zijian Chen, Wei Sun, Yuan Tian, Jun Jia, Zicheng Zhang, Jiarui Wang, Ru Huang, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang
NeurIPS, 2024.   (Spotlight Presentation)
project page / dataset (extraction code: s277) / Chinese summary: Zhihu

In this work, we construct GAIA, a Generic AI-generated Action dataset, by conducting a large-scale subjective evaluation from a novel causal reasoning-based perspective, resulting in 971,244 ratings across 9,180 video-action pairs. We also evaluate a suite of popular text-to-video models on their ability to generate visually rational actions.

Study of Subjective and Objective Naturalness Assessment of AI-Generated Images
Zijian Chen, Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Ru Huang, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang
IEEE Transactions on Circuits and Systems for Video Technology, 2025.   (NEW)
paper / dataset / code

In this work, we construct the AI-Generated Image Naturalness (AGIN) dataset and propose the Joint Objective Image Naturalness evaluaTor (JOINT) to automatically assess the naturalness of AI-generated images (AIGIs) in a way that aligns with human opinions.

BAND-2k: Banding Artifact Noticeable Database for Banding Detection and Quality Assessment
Zijian Chen, Wei Sun, Jun Jia, Fangfang Lu, Zicheng Zhang, Jing Liu, Ru Huang, Xiongkuo Min, Guangtao Zhai
IEEE Transactions on Circuits and Systems for Video Technology, 2024.  
project page / dataset

In this work, we build the Banding Artifact Noticeable Database (BAND-2k), which consists of 2,000 banding images generated by 15 compression and quantization schemes.

FS-BAND: A frequency-sensitive banding detector
Zijian Chen, Wei Sun, Zicheng Zhang, Ru Huang, Fangfang Lu, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang
IEEE International Symposium on Circuits and Systems (ISCAS), 2024.  
project page

In this paper, we develop a no-reference banding evaluator for banding detection and quality assessment by leveraging the frequency characteristics of banding artifacts.

Publications -- Oracle Bone Inscriptions (OBI) Processing

* denotes equal contribution.

OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones?
Zijian Chen, Tingzhu Chen, Wenjun Zhang, Guangtao Zhai
ICLR, 2025.   (NEW)
project page / dataset (extraction code: 7adv) / Chinese summary: Zhihu

In this work, we introduce OBI-Bench, a holistic benchmark crafted to systematically evaluate large multi-modal models (LMMs) on whole-process oracle bone inscriptions (OBI) processing tasks demanding expert-level domain knowledge and deliberate cognition.

OBIFormer: A fast attentive denoising framework for oracle bone inscriptions
Jinhao Li, Zijian Chen, Tingzhu Chen, Zhiji Liu, Changbo Wang
Displays, 2025.   (NEW)
project page

In this work, we propose OBIFormer, a fast attentive framework for high-precision denoising of oracle bone inscriptions.

Publications -- Early Works

* denotes the sole student author.

Spatial-temporal correlation graph convolutional networks for traffic forecasting
Ru Huang (Master's Advisor), Zijian Chen*, Guangtao Zhai, Jianhua He, Xiaoli Chu
IET Intelligent Transport Systems, 2023.  

In this work, we propose a novel architecture, named spatial-temporal correlation graph convolutional networks (STCGCN), for traffic prediction.

A Graph Entropy Measure From Urelement to Higher-Order Graphlets for Network Analysis
Ru Huang (Master's Advisor), Zijian Chen*, Guangtao Zhai, Jianhua He, Xiaoli Chu
IEEE Transactions on Network Science and Engineering, 2022.  

In this work, we introduce an unbiased graphlet estimation strategy to obtain both urelement and higher-order statistics for network analysis.

Reviewer Service
  • Annual Conference on Neural Information Processing Systems (NeurIPS 2025)
  • International Conference on Learning Representations (ICLR 2025)
  • ACM Multimedia (ACM MM 2025)
  • IEEE Transactions on Multimedia (TMM 2025)
  • IEEE Intelligent Transportation Systems Magazine (ITSM 2024)
Talks
  • [2024.11] AI+Virtual Simulation: Empowering Display Device Development (The 16th China Display Academic Conference)
Blogs
Awards
  • [2023] Excellent Graduates in Shanghai (for postgraduates)
  • [2021] National Second Prize (National Graduate Electronics Design Contest)
  • [2019] First Prize in Zhejiang Province (National Undergraduate Electronics Design Contest)

Updated in Jun. 2025

Thanks to Jon Barron for this amazing website template.