We present a frame interpolation algorithm that synthesizes multiple intermediate frames from two input images with large in-between motion. Recent methods use multiple networks to estimate optical flow or depth, plus a separate network dedicated to frame synthesis. Such pipelines are often complex and require optical flow or depth ground truth, which is scarce. In this work, we present a single unified network that is distinguished by a multi-scale feature extractor sharing weights across all scales and that is trainable from frames alone.
2022: F. Reda, Janne Kontkanen, Eric Tabellion, Deqing Sun, C. Pantofaru, B. Curless
https://arxiv.org/pdf/2202.04901v1.pdf
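
The idea of a multi-scale feature extractor that shares weights at all scales can be illustrated with a toy sketch. This is not the authors' network: a single fixed 3x3 kernel stands in for the learned convolutional layers, the pyramid uses plain 2x2 average pooling, and every name here (`downsample`, `build_pyramid`, `conv3x3`, `extract_features`) is hypothetical. The only point it makes is that the same weights are applied to every pyramid level, so a large motion at fine resolution is seen by the same filters as a small motion at coarse resolution.

```python
from typing import List

Image = List[List[float]]  # single-channel image as nested lists (assumption)


def downsample(img: Image) -> Image:
    """Halve the resolution with 2x2 average pooling (odd edges are dropped)."""
    h, w = len(img), len(img[0])
    return [
        [
            (img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
            for x in range(0, w - 1, 2)
        ]
        for y in range(0, h - 1, 2)
    ]


def build_pyramid(img: Image, levels: int) -> List[Image]:
    """Return [full resolution, 1/2, 1/4, ...] with `levels` entries."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def conv3x3(img: Image, kernel: Image) -> Image:
    """Zero-padded 3x3 correlation followed by ReLU; output size equals input size."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = max(acc, 0.0)
    return out


def extract_features(img: Image, kernel: Image, levels: int) -> List[Image]:
    """Apply the SAME kernel (shared weights) to every level of the pyramid."""
    return [conv3x3(level, kernel) for level in build_pyramid(img, levels)]
```

A usage sketch under the same assumptions: `extract_features(image, kernel, 3)` on an 8x8 image returns three feature maps of sizes 8x8, 4x4, and 2x2, all produced by one set of weights. In a trainable version the kernel would be replaced by learned parameters, and sharing them across levels means the parameter count does not grow with the number of scales.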