This work presents the FaceX framework, a novel facial generalist model capable of handling diverse facial tasks simultaneously.
To achieve this goal, we first formulate a unified facial representation for a broad spectrum of facial editing tasks, which macroscopically decomposes a face into fundamental identity, intra-personal variation, and environmental factors. Based on this, we introduce Facial Omni-Representation Decomposing (FORD) for seamless manipulation of various facial components, microscopically decomposing the core aspects of most facial editing tasks. Furthermore, to enhance generation quality and accelerate training, we leverage the prior of a pretrained StableDiffusion (SD) model and design Facial Omni-Representation Steering (FORS), which first assembles the unified facial representations and then steers the SD-aware generation process through an efficient Facial Representation Controller (FRC).
Our versatile FaceX achieves competitive performance compared to elaborate task-specific models on popular facial editing tasks. The full code will be available soon.
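To make the decompose-then-assemble idea concrete, the following is a minimal illustrative sketch, not the FaceX implementation. It assumes each of the three macroscopic factors (identity, intra-personal variation, environment) is a plain feature vector. The names `FacialRepresentation`, `swap_face`, `reenact`, and `assemble` are hypothetical, and the concatenation in `assemble` merely stands in for whatever conditioning signal FORS actually builds for the FRC.

```python
from dataclasses import dataclass, replace
from typing import Tuple

# Hypothetical stand-in for a learned feature vector.
Vector = Tuple[float, ...]


@dataclass(frozen=True)
class FacialRepresentation:
    """Illustrative container for the three macroscopic factors."""
    identity: Vector      # who the person is
    variation: Vector     # intra-personal variation
    environment: Vector   # environmental factors


def swap_face(source: FacialRepresentation,
              target: FacialRepresentation) -> FacialRepresentation:
    """Face swapping as recombination: source identity, everything else from target."""
    return replace(target, identity=source.identity)


def reenact(source: FacialRepresentation,
            driving: FacialRepresentation) -> FacialRepresentation:
    """Reenactment as recombination: driving variation, everything else from source."""
    return replace(source, variation=driving.variation)


def assemble(rep: FacialRepresentation) -> Vector:
    """Assumed assembly step: concatenate the factors into one conditioning vector."""
    return rep.identity + rep.variation + rep.environment


source = FacialRepresentation(identity=(1.0,), variation=(2.0,), environment=(3.0,))
target = FacialRepresentation(identity=(4.0,), variation=(5.0,), environment=(6.0,))

swapped = swap_face(source, target)
reenacted = reenact(source, target)
condition = assemble(swapped)
```

Under these assumptions, different editing tasks reduce to choosing which factors come from which input before a single shared generation step, which is the intuition behind handling many tasks with one model.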
@misc{han2023generalist,
title={A Generalist FaceX via Learning Unified Facial Representation},
author={Yue Han and Jiangning Zhang and Junwei Zhu and Xiangtai Li and Yanhao Ge and Wei Li and Chengjie Wang and Yong Liu and Xiaoming Liu and Ying Tai},
year={2023},
eprint={2401.00551},
archivePrefix={arXiv},
primaryClass={cs.CV}
}