Understanding the Efficacy of Over-Parameterization in Neural Networks: Mechanisms, Theories, and Practical Implications

Introduction

Deep neural networks (DNNs) have become the cornerstone of modern artificial intelligence, driving advancements in computer vision, natural language processing, and myriad other domains. A key, albeit counter-intuitive, property of contemporary DNNs is their immense over-parameterization: these models often contain orders of magnitude more parameters than the number of training examples, yet they generalize remarkably well to unseen data. This phenomenon stands in stark contrast to classical statistical learning theory, which posits that models with excessive complexity relative to the available data are prone to overfitting and poor generalization. Intriguingly, empirical evidence shows that increasing the number of parameters in DNNs can lead ...
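To make the scale of over-parameterization concrete, the following minimal sketch counts the parameters of a small fully connected network and compares that count to a hypothetical training set size; the layer widths and dataset size are illustrative assumptions, not figures from any particular study.

```python
# Minimal sketch: comparing the parameter count of a small multilayer
# perceptron (MLP) to the size of a hypothetical training set.
# All sizes below are illustrative assumptions.

def mlp_param_count(layer_sizes):
    """Count the weights and biases of a fully connected network.

    layer_sizes: dimensions of each layer, e.g. [784, 2048, 2048, 10]
                 for input, two hidden layers, and output.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total

n_train = 50_000  # hypothetical number of training examples
params = mlp_param_count([784, 2048, 2048, 10])

print(f"parameters:        {params:,}")          # ~5.8 million
print(f"training examples: {n_train:,}")
print(f"params / example:  {params / n_train:.0f}x")  # ~116x
```

Even this modest architecture carries roughly two orders of magnitude more parameters than training examples, which is exactly the regime where classical capacity-based bounds would predict severe overfitting.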