A record of papers I have read or reproduced since 2020 that were beneficial to my work.

My paper notes: 2021 2022

My Xmind notes:

AI Art (TalkingHead/Text2Image/Text2Video etc.)

Hand Mocap


hmr-survey by tinatiansjz

Hand3DResearch by SeanChenxy

Human-Video-Generation by yule-li

HelloFace by becauseofAI

awesome-NeRF by koolo233

awesome-ai-painting by hua1995116

Awesome-Face-Restoration by TaoWangzj

3D Face Reconstruction

Year Name Paper Codes
2018 3DDFA Face Alignment in Full Pose Range: A 3D Total Solution official
2019 Deep3DFaceRecon Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set unofficial
2020 3DDFA_V2 Towards Fast, Accurate and Stable 3D Dense Face Alignment official
2020 Detailed3DFace FaceScape: a Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction official
2021 DECA Detailed Expression Capture and Animation official
Imperial College London
2019 GANFit Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction official
2021 TBGAN Synthesizing Coupled 3D Face Modalities by Trunk-Branch Generative Adversarial Networks official
2021 OSTeC One-Shot Texture Completion official
2021 Fast-GANFit Fast-GANFIT: Generative Adversarial Network for High Fidelity 3D Face Reconstruction
2021 AvatarMe AvatarMe: Realistically Renderable 3D Facial Reconstruction "in-the-wild" official
2021 AvatarMe++ AvatarMe++: Facial Shape and BRDF Inference with Photorealistic Rendering-Aware GANs

3D Human Digitization

Year Name Paper Codes
2019 speech2gesture Learning Individual Styles of Conversational Gesture official
2020 Monoport Monoport: Monocular Volumetric Human Teleportation official
2020 PIFuHD PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization Meta
2021 iPERCore Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis official
2021 ContactHumanDynamics Contact and Human Dynamics from Monocular Video Stanford
2021 HuMoR HuMoR: 3D Human Motion Model for Robust Pose Estimation Stanford
2021 MeTRAbs MeTRAbs: Metric-Scale Truncation-Robust Heatmaps for Absolute 3D Human Pose Estimation official
2022 DeepMotion official

Motion Capture & Driven

Year Name Paper Codes
2021 ParameterizedMotion Learning a family of motor skills from a single motion clip official
2021 1165048017 Blog official
2021 TDPT official

Human Kp Estimation

Year Name Paper Codes
2D Kp
2018 AlphaPose RMPE: Regional Multi-Person Pose Estimation official
3D Kp
2019 mvpose Fast and Robust Multi-Person 3D Pose Estimation from Multiple Views ZJU3DV
2022 PoseTriplet Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision official
2021 MediaPipe official
2021 mmpose official

Human Motion Estimation

inverse dynamics

Year Name Paper Codes
2020 6D rotation On the Continuity of Rotation Representations in Neural Networks official
Body Model
2015 SMPL SMPL: A Skinned Multi-Person Linear Model official
2019 SMPL-X SMPL-X: A new joint 3D model of the human body, face and hands together official
Image Based
2017 VNect VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera Max Planck
2019 SPIN Learning to Reconstruct 3D Human Pose and Shape via Model-fitting in the Loop official
2021 PyMAF PyMAF: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop official
2021 MeshGraphormer Mesh Graphormer microsoft
2021 ROMP Monocular, One-stage, Regression of Multiple 3D People official
2021 DynaBOA Out-of-Domain Human Mesh Reconstruction via Dynamic Bilevel Online Adaptation official
2021 PARE PARE: Part Attention Regressor for 3D Human Body Estimation Max Planck
2022 PoseTriplet PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision official
2020 PhysCap PhysCap: Physically Plausible Monocular 3D Motion Capture in Real Time Max Planck
2021 HybrIK HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation official
2021 Physics-based Human Motion Estimation Physics-based Human Motion Estimation and Synthesis from Videos Nvidia
2021 SimPoE SimPoE: Simulated Character Control for 3D Human Pose Estimation Meta
2021 imGHUM imGHUM: Implicit Generative Models of 3D Human Shape and Articulated Pose Google
2021 PoseAug PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation official
2022 MuJoCo deepmind
Temporal Based
2020 VIBE VIBE: Video Inference for Human Body Pose and Shape Estimation official
2021 TCMR TCMR: Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video official
2021 maed MAED: Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation official
Full Body
2021 FrankMocap A Strong and Easy-to-use Single View 3D Hand+Body Pose Estimator Meta
2021 PIXIE Collaborative Regression of Expressive Bodies using Moderation official
2022 Hand4Whole Accurate 3D Hand Pose Estimation for Whole-Body 3D Human Mesh Estimation official
Multi Views
2020 3D Human Pose Estimation 3D Human Pose Estimation using Multi Camera official
2020 Learnable Triangulation Learnable Triangulation of Human Pose official
2020 Epipolar Transformers Epipolar Transformers official
2020 VoxelPose VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment microsoft
2020 EasyMocap Motion Capture from Internet Videos ZJU3DV
2021 PlaneSweepPose Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo official
2021 freemocap official
2022 Generalizable Human Pose Triangulation Generalizable Human Pose Triangulation official

Hand Estimation

Year Name Paper Codes
2017 MANO Embodied Hands: Modeling and Capturing Hands and Bodies Together official
2020 MediaPipe MediaPipe Hands: On-device Real-time Hand Tracking Google
2021 MocapNETv3 Towards Holistic Real-time Human 3D Pose Estimation using MocapNETs official
2021 S2HAND S2HAND: Model-based 3D Hand Reconstruction via Self-Supervised Learning Tencent


Year Name Paper Codes
2021 MLP-Mixer MLP-Mixer: An all-MLP Architecture for Vision official
2021 Noisy Student Self-training with Noisy Student improves ImageNet classification official
2021 ImageNet-21K ImageNet-21K Pretraining for the Masses official
2021 MicroNet MicroNet: Improving Image Recognition with Extremely Low FLOPs official
2021 RepVGG RepVGG: Making VGG-style ConvNets Great Again official
2022 ConvNeXt A ConvNet for the 2020s official

Face Detection

Year Name Paper Codes
2016 MTCNN Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks unofficial
2020 DSFD DSFD: Dual Shot Face Detector official
2021 SCRFD Sample and Computation Redistribution for Efficient Face Detection official

Face Swap

Year Name Paper Codes
2019 FSGAN FSGAN: Subject Agnostic Face Swapping and Reenactment official
2020 Disney High-Resolution Neural Face Swapping for Visual Effects unofficial
2020 FaceShifter FaceShifter: Towards High Fidelity And Occlusion Aware Face Swapping unofficial
2021 SimSwap SimSwap: An Efficient Framework For High Fidelity Face Swapping official
2021 InfoSwap Information Bottleneck Disentanglement for Identity Swapping official
2021 ShapeEditer ShapeEditer: a StyleGAN Encoder for Face Swapping
2021 HifiFace HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping unofficial
2022 MobileFaceSwap MobileFaceSwap: A Lightweight Framework for Video Face Swapping baidu
2022 Stitch it in Time Stitch it in Time: GAN-Based Facial Editing of Real Videos official

Image2Image Translation

Year Name Paper Codes
2019 SPADE Semantic Image Synthesis with Spatially-Adaptive Normalization Nvidia
2021 OASIS You Only Need Adversarial Supervision for Semantic Image Synthesis official
2017 pix2pix Image-to-Image Translation with Conditional Adversarial Networks official
2018 pix2pixHD High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs nvidia
2018 vid2vid Video-to-Video Synthesis nvidia
anime face
2019 TalkingHeadAnime official
2021 TalkingHeadAnime2 official
2022 EasyVtuber official

Image Generation

Year Name Paper Codes
2020 ALAE Adversarial Latent Autoencoders official
2020 GANSpace GANSpace: Discovering Interpretable GAN Controls official
2021 Cartoon-StyleGAN Fine-tuning StyleGAN2 for Cartoon Face Generation official
2021 Barbershop GAN-based Image Compositing using Segmentation Masks official
2021 GANs N' Roses GANs N' Roses: Stable, Controllable, Diverse Image to Image Translation official
2021 PTI PTI: Pivotal Tuning for Latent-based editing of Real Images official
2021 sefa Closed-Form Factorization of Latent Semantics in GANs Genforce
2021 StyleMapGAN StyleMapGAN: Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing NAVER AI
2021 SuperStyleNet SuperStyleNet: Deep Image Synthesis with Superpixel Based Style Encoder official
2021 Chunkmogrify Real Image Inversion via Segments Adobe
2021 encoder4editing Designing an Encoder for StyleGAN Image Manipulation official
2021 Projected GANs Projected GANs Converge Faster official
2022 CLIPasso Semantically-Aware Object Sketching official


Year Name Paper Codes
2019 StyleGAN A Style-Based Generator Architecture for Generative Adversarial Networks Nvidia
2019 StyleGAN2 Analyzing and Improving the Image Quality of StyleGAN Nvidia
2021 stylegan2-ada Training Generative Adversarial Networks with Limited Data Nvidia
2021 StyleGAN3 Alias-Free Generative Adversarial Networks Nvidia
2021 SemanticGAN Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization Nvidia

Neural Head & Body

Year Name Paper Codes
2020 FirstOrder First Order Motion Model for Image Animation official
2021 NeuralRecon NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video ZJU3DV
2021 StyleGestures Style-controllable speech-driven gesture synthesis using normalising flows official
2021 face-vid2vid One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing Nvidia Project unofficial unofficial-2
2022 DaGAN Depth-Aware Generative Adversarial Network for Talking Head Video Generation official


Year Name Paper Codes
2020 NeRF Representing Scenes as Neural Radiance Fields for View Synthesis Berkeley
2021 Neural Body Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans ZJU3DV
2021 AD-NeRF AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis official

Object Detection

Year Name Paper Codes
2020 100 Days of Hands Understanding Human Hands in Contact at Internet Scale official
2021 YOLOX YOLOX: Exceeding YOLO Series in 2021 Megvii

Super Resolution

Year Name Paper Codes
2018 ESRGAN ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks official
2020 DFDNet Blind Face Restoration via Deep Multi-scale Component Dictionaries official
2021 Real-ESRGAN Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data official
2021 GFPGAN GFP-GAN: Towards Real-World Blind Face Restoration with Generative Facial Prior official
2021 GPEN GAN Prior Embedded Network for Blind Face Restoration in the Wild official
2022 SwinIR Image Restoration Using Swin Transformer official
2022 VRT A Video Restoration Transformer official

ViT Transformer

Year Name Paper Codes
2017 google Attention Attention Is All You Need official
2020 ViT An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale Google
2021 Token Labeling All Tokens Matter: Token Labeling for Training Better Vision Transformers official
2021 Tokens-to-Token ViT Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet official
2021 MAE Masked Autoencoders Are Scalable Vision Learners Meta


Year Name Paper Codes
2017 AdaIN Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization official
2018 lpips The Unreasonable Effectiveness of Deep Features as a Perceptual Metric OpenAI
2020 IBA Restricting the Flow: Information Bottlenecks for Attribution official
2021 Focal Frequency Loss Focal Frequency Loss for Image Reconstruction and Synthesis official
2022 ffcv MIT


Year Name Paper Codes
2020 3d photo inpainting 3D Photography using Context-aware Layered Depth Inpainting official
2021 ParameterizedMotion Learning a family of motor skills from a single motion clip official
2021 AnimeInterp Deep Animation Video Interpolation in the Wild SenseTime
2021 DALLE Zero-Shot Text-to-Image Generation OpenAI


