Pruning Blocks for CNN Compression and Acceleration via Online Ensemble Distillation
In this paper, we propose an online ensemble distillation (OED) method to automatically prune blocks/layers of a target network by transferring the knowledge from a strong teacher in an end-to-end manner. To accomplish this, we first introduce a soft mask to scale the output of each block in the target network and enforce the sparsity of the mask by