Can Gradient Descent Prune Neural Networks?

Original title: Neural Network Pruning by Gradient Descent

Authors: Zhang Zhang, Ruyi Tao, Jiang Zhang

This article introduces a neural network pruning technique aimed at the ballooning size of deep learning models. The authors' framework, built on the Gumbel-Softmax trick, optimizes network weights and structure simultaneously with stochastic gradient descent. The results are striking: on the MNIST dataset, the method preserves high accuracy while keeping only about 0.15% of the original network parameters. Beyond compression, the approach improves interpretability: feature importance can be read directly from the pruned network, offering insights into feature symmetry and the information pathways that drive predictions. The core idea is intuitive, selecting only the essential features and exploiting patterns in the data to reach extreme sparsity. More than a pruning strategy, it points toward interpretable machine learning and a more efficient, transparent future for deep learning models.
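
To make the idea concrete, here is a minimal sketch, not the authors' released code, of learning a binary pruning mask jointly with the weights via the Gumbel-Softmax trick. The layer design, sparsity penalty, and hyperparameters are illustrative assumptions rather than the paper's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GumbelPrunedLinear(nn.Module):
    """Linear layer whose weights are gated by a learned, near-binary mask."""
    def __init__(self, in_features, out_features, tau=1.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Two logits per weight: "keep" vs. "drop".
        self.mask_logits = nn.Parameter(torch.zeros(out_features, in_features, 2))
        self.tau = tau  # Gumbel-Softmax temperature

    def forward(self, x):
        if self.training:
            # Straight-through Gumbel-Softmax sample: hard 0/1 mask in the
            # forward pass, differentiable surrogate in the backward pass.
            mask = F.gumbel_softmax(self.mask_logits, tau=self.tau, hard=True)[..., 0]
        else:
            # At test time, keep a weight only if "keep" is the likelier state.
            mask = (self.mask_logits[..., 0] > self.mask_logits[..., 1]).float()
        return F.linear(x, self.weight * mask, self.bias)

# Weights and mask logits are trained together by ordinary gradient descent.
model = nn.Sequential(GumbelPrunedLinear(784, 64), nn.ReLU(),
                      GumbelPrunedLinear(64, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def loss_fn(logits, targets, sparsity_weight=1e-4):
    # Task loss plus a penalty on the expected fraction of kept weights,
    # which pushes the learned masks toward extreme sparsity.
    keep_prob = torch.cat([
        torch.softmax(m.mask_logits, dim=-1)[..., 0].flatten()
        for m in model if isinstance(m, GumbelPrunedLinear)
    ])
    return F.cross_entropy(logits, targets) + sparsity_weight * keep_prob.mean()
```

In this sketch, interpretability falls out of the mask: after training, the surviving "keep" entries in the first layer indicate which input pixels (features) the network actually relies on.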

Original article: https://arxiv.org/abs/2311.12526