Preprints
* denotes equal contribution, and † denotes corresponding author.
|
|
Just Noticeable Difference for Large Multimodal Models
Zijian Chen, Yuan Tian, Yuze Sun, Wei Sun, Zicheng Zhang, Weisi Lin, Guangtao Zhai†, Wenjun Zhang
arXiv, 2025. (NEW)
project page
/
Code
/
arXiv
In this paper, we propose LMM-JND, a novel concept that quantifies the perceptual redundancy of LMMs, together with a pipeline for determining it. We also construct a large-scale dataset, VPA-JND, which contains 21.5k reference images and over 489k stimuli across 12 distortion types, to facilitate LMM-JND studies.
|
|
PuzzleBench: A Fully Dynamic Evaluation Framework for Large Multimodal Models on Puzzle Solving
Zeyu Zhang*, Zijian Chen*, Zicheng Zhang, Yuze Sun, Yuan Tian, Ziheng Jia, Chunyi Li, Xiaohong Liu, Xiongkuo Min, Guangtao Zhai
arXiv, 2025. (NEW)
arXiv
In this paper, we construct PuzzleBench, a dynamic and scalable benchmark comprising 11,840 VQA samples. It features six carefully designed puzzle tasks targeting three core LMM competencies: visual recognition, logical reasoning, and context understanding.
|
|
Mitigating Long-tail Distribution in Oracle Bone Inscriptions: Dataset, Model, and Benchmark
Jinhao Li*, Zijian Chen*, Runze Jiang, Tingzhu Chen†, Changbo Wang†, Guangtao Zhai†
ACM MM, 2025. (Oral Presentation)
project page
/
arXiv
In this paper, we present Oracle-P15K, a structure-aligned OBI dataset for OBI generation and denoising, consisting of 14,542 images infused with domain knowledge from OBI experts. Building on this dataset, we propose OBIDiff, a diffusion model-based pseudo-OBI generator, to achieve realistic and controllable OBI generation.
|
Publications -- MLLM/LLM
|
Publications -- Image/Video Quality Assessment/Low-level
|
|
Joint Luminance-Chrominance Learning for Image Debanding
Zijian Chen, Wei Sun, Jun Jia, Ru Huang, Fangfang Lu, Ying Chen, Xiongkuo Min, Guangtao Zhai†, Wenjun Zhang
IEEE Transactions on Circuits and Systems for Video Technology, 2025. (NEW)
In this paper, we propose a unified deep neural network that explicitly disentangles the luminance and chrominance channels and simultaneously recovers intensity gradients and color discontinuities from detection-free measurements in an end-to-end manner.
|
|
GAIA: Rethinking Action Quality Assessment for AI-Generated Videos
Zijian Chen, Wei Sun†, Yuan Tian, Jun Jia, Zicheng Zhang, Jiarui Wang, Ru Huang, Xiongkuo Min†, Guangtao Zhai†, Wenjun Zhang
NeurIPS, 2024. (Spotlight Presentation)
project page
/
dataset (extraction code: s277)
/
Chinese-language summary: Zhihu
In this work, we construct GAIA, a Generic AI-generated Action dataset, through a large-scale subjective evaluation from a novel causal reasoning-based perspective, yielding 971,244 ratings across 9,180 video-action pairs. We also evaluate a suite of popular text-to-video models on their ability to generate visually rational actions.
|
|
Study of Subjective and Objective Naturalness Assessment of AI-Generated Images
Zijian Chen, Wei Sun†, Haoning Wu, Zicheng Zhang, Jun Jia, Ru Huang, Xiongkuo Min†, Guangtao Zhai†, Wenjun Zhang
IEEE Transactions on Circuits and Systems for Video Technology, 2025. (NEW)
paper
/
dataset
/
code
In this work, we construct the AI-Generated Image Naturalness (AGIN) dataset and propose the Joint Objective Image Naturalness evaluaTor (JOINT), which automatically assesses the naturalness of AI-generated images (AIGIs) in a way that aligns with human opinions.
|
|
BAND-2k: Banding Artifact Noticeable Database for Banding Detection and Quality Assessment
Zijian Chen, Wei Sun, Jun Jia, Fangfang Lu, Zicheng Zhang, Jing Liu, Ru Huang, Xiongkuo Min†, Guangtao Zhai†
IEEE Transactions on Circuits and Systems for Video Technology, 2024.
project page
/
dataset
In this work, we build the Banding Artifact Noticeable Database (BAND-2k), which consists of 2,000 banding images generated by 15 compression and quantization schemes.
|
Publications -- Oracle Bone Inscriptions (OBI) Processing
|
Publications -- Early Works
* denotes the sole student author.
|
- Annual Conference on Neural Information Processing Systems (NeurIPS 2025)
- International Conference on Learning Representations (ICLR 2025)
- ACM Multimedia (ACM MM 2025)
- IEEE Transactions on Multimedia (TMM 2025)
- IEEE Intelligent Transportation Systems Magazine (ITSM 2024)
|
- [2024.11] AI+Virtual Simulation: Empowering Display Device Development (The 16th China Display Academic Conference)
|
- [2023] Excellent Graduates in Shanghai (for postgraduates)
- [2021] National Second Prize (National Graduate Electronics Design Contest)
- [2019] First Prize in Zhejiang Province (National Undergraduate Electronics Design Contest)
|
Updated in Jun. 2025
Thanks to Jon Barron for this amazing website template.