BodyReLux

Abstract

Relighting human performances is a fundamental task in post-production and content creation. We present BodyReLux, a subject-specific, video diffusion-based framework for relighting full-body human performances in a temporally consistent way. Our model is trained on a hybrid dataset of pixel-aligned video relighting pairs covering diverse combinations of lighting conditions, performances, and viewpoints. To acquire such a dataset, we combine traditional static One-Light-at-a-Time (OLAT) capture with a novel dynamic performance capture in which two smoothly varying lighting sequences are rapidly interleaved. Because the lighting alternates above the human flicker-fusion threshold, the interleaving does not appear to strobe. We train our video relighting model starting from a pretrained text-to-video model to leverage its generative priors for producing high-quality videos. To achieve accurate lighting control, we introduce a new lighting conditioning method that represents each light source as a token. We further condition on sequences of lighting using masked attention to support dynamic lighting control. Together with a carefully designed data augmentation pipeline, we achieve photorealistic, robust, and temporally consistent video relighting of subject-specific human performances.

Method Overview

We capture static OLAT data and bi-packed video data of a subject moving inside a large LED sphere. This yields a dataset of video relighting training tuples, each consisting of two pixel-aligned videos under different lighting conditions and their corresponding lighting sequences. We train a video diffusion model with a novel lighting conditioning module that supports dynamic lighting control.
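One way to picture masked attention over a lighting sequence is as a boolean mask that lets the video tokens of a given frame attend only to the lighting tokens of that same frame. The sketch below builds such a mask in plain Python. It is an illustration under stated assumptions, not the BodyReLux implementation: the function name, the frame-major token ordering, and the one-to-one mapping between video frames and lighting steps are all hypothetical (a real video diffusion backbone may compress time in its latent space, which would change the mapping).

```python
def build_lighting_mask(num_frames, tokens_per_frame, lights_per_frame):
    """Return mask[q][k], True where video token q may attend to light token k.

    Assumption (hypothetical layout): tokens are stored frame-major, so video
    token q belongs to frame q // tokens_per_frame and lighting token k
    belongs to frame k // lights_per_frame.
    """
    num_queries = num_frames * tokens_per_frame
    num_keys = num_frames * lights_per_frame
    mask = []
    for q in range(num_queries):
        q_frame = q // tokens_per_frame
        # A video token only sees the light tokens of its own frame.
        row = [(k // lights_per_frame) == q_frame for k in range(num_keys)]
        mask.append(row)
    return mask


# Two frames, two video tokens per frame, three light tokens per frame.
mask = build_lighting_mask(num_frames=2, tokens_per_frame=2, lights_per_frame=3)
for row in mask:
    print("".join("1" if allowed else "0" for allowed in row))
```

In an attention layer, entries marked False would typically be set to negative infinity before the softmax, so each frame is driven only by its own lighting state.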

Data Capture

Our hybrid dataset combines two complementary capture strategies: (1) static OLAT images providing highly controllable illumination, and (2) dynamic bi-pack video sequences with smoothly varying paired lighting conditions. The bi-pack approach rapidly interleaves two lighting patterns above the human flicker-fusion threshold, enabling comfortable capture without strobing while producing pixel-aligned video relighting pairs.
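To make the interleaving concrete, the following minimal sketch splits a bi-pack recording into its two lighting streams. It assumes, hypothetically, that even-indexed frames were lit by the first lighting sequence and odd-indexed frames by the second, and that each frame carries a matching lighting record. The function name and data layout are illustrative only, and any further processing the authors apply to obtain pixel alignment is not described here and is not modeled.

```python
def split_bipack(frames, lighting):
    """Split an interleaved capture into two (video, lighting) streams.

    Assumption: frames[0], frames[2], ... belong to lighting sequence A and
    frames[1], frames[3], ... belong to lighting sequence B.
    """
    if len(frames) != len(lighting):
        raise ValueError("each frame needs exactly one lighting record")
    video_a, video_b = frames[0::2], frames[1::2]
    light_a, light_b = lighting[0::2], lighting[1::2]
    # Drop a trailing unpaired frame so both streams have equal length.
    n = min(len(video_a), len(video_b))
    return (video_a[:n], light_a[:n]), (video_b[:n], light_b[:n])


# Placeholder strings stand in for image frames and lighting states.
frames = ["f0", "f1", "f2", "f3", "f4"]
lighting = ["A0", "B0", "A1", "B1", "A2"]
stream_a, stream_b = split_bipack(frames, lighting)
print(stream_a)
print(stream_b)
```

Each index in the two returned streams then forms one candidate relighting pair: the same moment of the performance seen under two different lighting conditions.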

Static OLAT Capture
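Because light transport is linear, a static subject captured under individual lights can be relit under any combination of those lights by taking a weighted sum of the OLAT images. The sketch below shows that sum on toy data. It assumes radiometrically linear pixel values and uses flat lists in place of real images; the function name is hypothetical, and whether the BodyReLux pipeline synthesizes its static training pairs in exactly this way is not stated on this page.

```python
def relight_from_olat(olat_images, weights):
    """Return the weighted sum of OLAT images.

    olat_images: list of images, each a flat list of linear pixel values.
    weights: one intensity per light, in the same order as olat_images.
    """
    if len(olat_images) != len(weights):
        raise ValueError("need exactly one weight per OLAT image")
    relit = [0.0] * len(olat_images[0])
    for image, weight in zip(olat_images, weights):
        for i, value in enumerate(image):
            relit[i] += weight * value
    return relit


# Two lights, two pixels: light 0 only reaches pixel 0, light 1 only pixel 1.
olat = [[1.0, 0.0], [0.0, 2.0]]
relit = relight_from_olat(olat, weights=[0.5, 0.25])
print(relit)
```

For colored lighting, the same sum would be applied per color channel with per-channel weights.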

Dynamic Bi-Pack Capture

Results

Comparisons

BibTeX

@article{ma2026bodyrelux,
  title     = {BodyReLux: Temporally Consistent Full-Body Video Relighting},
  author    = {Ma, Li and He, Mingming and Yu, Xueming and George, David M. and Ta{\c{s}}el, Ahmet Levent and Debevec, Paul and Philip, Julien},
  journal   = {ACM Transactions on Graphics (Proceedings of SIGGRAPH 2026)},
  year      = {2026},
  volume    = {45},
  number    = {4},
  publisher = {ACM}
}