Delta Distillation for Efficient Video Processing
Amirhossein Habibian (Qualcomm AI Research)
Efstratios Gavves (QUVA Lab)
Haitam Ben Yahia (Qualcomm AI Research)
Davide Abati (Qualcomm AI Research)
Fatih Porikli (Qualcomm AI Research)
ECCV 2022
Summary
Delta Distillation is a technique for speeding up video processing. Suppose you have a stream of frames and a neural network that processes them one by one, and suppose this model is very accurate but computationally expensive. How can you approximate the same results at a lower cost? With Delta Distillation, the first frame (the keyframe) is processed as usual, whereas each successive frame is represented as its difference with respect to the keyframe (a delta). Because video sequences are highly redundant over time, deltas convey far less information than raw frames and can be processed with a smaller model. This is exactly what Delta Distillation does: for every layer of the original network (the teacher), the delta representation is computed with respect to the keyframe and processed by a sibling layer (the student) designed to be much cheaper.
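The keyframe/delta bookkeeping described above can be illustrated with a toy example. The following is a minimal sketch for a single linear layer, not the authors' implementation: the function names, the low-rank form of the student, and the squared-error objective are all assumptions made for illustration.

```python
# Toy sketch of delta-based inference for ONE linear layer (no bias).
# Every name below is hypothetical and chosen for illustration only.

def matvec(weights, x):
    """Dense linear layer: multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def student_layer(down, up, delta):
    """Cheap sibling layer, assumed here to be a low-rank bottleneck."""
    return matvec(up, matvec(down, delta))


def delta_inference(frames, teacher_w, down, up):
    """Teacher on the keyframe, student on the deltas of all later frames."""
    key_in = frames[0]
    key_out = matvec(teacher_w, key_in)  # expensive path, run once
    outputs = [key_out]
    for frame in frames[1:]:
        # Input delta with respect to the keyframe.
        delta_in = [a - b for a, b in zip(frame, key_in)]
        delta_out = student_layer(down, up, delta_in)  # cheap path
        # Output for this frame = keyframe output + estimated output delta.
        outputs.append([k + d for k, d in zip(key_out, delta_out)])
    return outputs


def delta_distillation_loss(teacher_w, down, up, key_in, frame):
    """Squared error between the teacher's output delta and the student's
    estimate (an assumed training objective, shown only for illustration)."""
    delta_in = [a - b for a, b in zip(frame, key_in)]
    teacher_delta = [
        t - k for t, k in zip(matvec(teacher_w, frame), matvec(teacher_w, key_in))
    ]
    student_delta = student_layer(down, up, delta_in)
    return sum((t - s) ** 2 for t, s in zip(teacher_delta, student_delta))


# Rank-1 teacher, so that a rank-1 student can match it exactly in this toy case.
teacher_w = [[2, 4], [1, 2]]
down = [[1, 2]]   # 1 x 2 projection
up = [[2], [1]]   # 2 x 1 expansion
frames = [[1, 2], [2, 2], [1, 3]]

outputs = delta_inference(frames, teacher_w, down, up)
reference = [matvec(teacher_w, f) for f in frames]  # teacher run on every frame
```

Here `outputs` equals `reference` only because the toy teacher is linear and rank-1. Real layers are nonlinear and the student merely approximates the teacher's deltas, which is where a small accuracy drop can come from. At this toy size the bottleneck saves no multiplications; any saving would appear when the student is much narrower than the teacher layer.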
Citation
@inproceedings{deltadist,
  title={Delta Distillation for Efficient Video Processing},
  author={Habibian, Amirhossein and Ben Yahia, Haitam and Abati, Davide and Gavves, Efstratios and Porikli, Fatih},
  booktitle={Proceedings of the 17th European Conference on Computer Vision},
  year={2022},
  organization={Springer}
}
Results
We show through extensive experiments that delta distillation outperforms feature distillation for comparable student architectures and delivers state-of-the-art results for efficient video segmentation and object detection. Delta distillation consistently reduces computational cost by a factor of roughly 2× across all backbones, with little or no drop in accuracy. Moreover, we observe that the proposed procedure improves the temporal consistency of the distilled model.
* Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc.
