|
sDFT: Scaling Diffusion Field Transformers on Images, Videos, and 3D Data
Kangfu Mei,
Mo Zhou,
Vishal M. Patel
Work in Progress
PDF /
project page
A generative field model that can generate image, video, and 3D data with a single unified network architecture. In multi-task learning, this unification lets a 3D prior boost video generation, with a model 10 times smaller than SORA.
|
|
Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
Kangfu Mei,
Zhengzhong Tu,
Mauricio Delbracio,
Hossein Talebi,
Vishal M. Patel,
Peyman Milanfar
arXiv 2024
arXiv
The first work to thoroughly investigate the scaling properties of the recently popular latent diffusion models (e.g., the representative Stable Diffusion).
|
|
CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation
Kangfu Mei,
Mauricio Delbracio,
Hossein Talebi,
Zhengzhong Tu,
Vishal M. Patel,
Peyman Milanfar
CVPR 2024
project page /
arXiv
Faster conditional diffusion that produces high-quality images with 1-4 sampling steps.
|
|
VIDM: Video Implicit Diffusion Models
Kangfu Mei,
Vishal M. Patel
AAAI, 2023 (Oral Presentation)
project page /
arXiv
Video generation diffusion models that use an implicit motion condition.
|
|
Deep Semantic Statistics Matching (D2SM) Denoising Network
Kangfu Mei,
Vishal M. Patel,
Rui Huang
ECCV, 2022
project page /
arXiv /
poster
A new general plug-and-play component for denoising.
|
|
Latent Feature-Guided Diffusion Models for Shadow Removal
Kangfu Mei, Luis Figueroa, Zhe Lin, Zhihong Ding, Scott Cohen,
Vishal M. Patel
WACV, 2023
project page /
code /
demo /
arXiv
We (together with Adobe) conducted this very early exploration of applying diffusion models to shadow removal. The work proposes the first instance-level shadow removal task.
|
|
LTT-GAN: Looking Through Turbulence by Inverting GANs
Kangfu Mei,
Vishal M. Patel
IEEE Journal of Selected Topics in Signal Processing [IF: 7.695]
arXiv
The first turbulence mitigation algorithm that can clearly recover face images captured at a range of 300 meters.
|
|
AttaNet: Attention-Augmented Network for Fast and Accurate Scene Parsing
Qi Song, Kangfu Mei,
Rui Huang
AAAI, 2021
code /
arXiv
Strip Attention Module (SAM) and Attention Fusion Module (AFM) are proposed for enhancing
the accuracy of semantic segmentation networks with limited computational complexity.
|
|
Higher-resolution network for image demosaicing and enhancing
Kangfu Mei, Juncheng Li, Jiajie Zhang, Haoyu Wu, Jie Li, Rui Huang
ICCV AIM RAW to RGB mapping challenge, 2019
code /
arXiv
For the first time, a neural ISP has outperformed a traditional ISP (like Huawei's mobile ISP) and achieved visual quality comparable to that of a DSLR.
|
|
Progressive Feature Fusion Network for Realistic Image Dehazing
Kangfu Mei, Aiwen Jiang, Juncheng Li, Mingwen Wang
ACCV, 2019
code /
arXiv
PFFNet was the first dehazing network to use a fully end-to-end neural network architecture without a physical gating unit. More than 100 works (as of 2024) have adopted our strategy and shown its effectiveness.
|
|
Multi-scale Residual Network for Image Super-resolution
Juncheng Li, Faming Fang,
Kangfu Mei, Guixu Zhang
ECCV, 2018
code /
bibtex
We built MSRN in 2018. It quickly became a fundamental component of many image restoration works and has more than 800 citations as of 2024.
|