Abstract

After giving an introduction to deep learning, I will discuss how deep networks learn. This can be analysed and understood, in part, using concepts from dynamical-systems and random-matrix theory [1]. For deep neural networks, the maximal finite-time Lyapunov exponent forms geometrical structures in input space, akin to coherent structures in dynamical systems such as turbulent flow. Ridges of large positive exponents divide input space into different regions that the network associates with different classes in a classification task. The ridges visualise the geometry that deep networks construct in input space, and help to quantify how the learning depends on the network depth and width [2].

[1] B. Mehlig, Machine Learning with Neural Networks, Cambridge University Press (2021).
[2] L. Storm, H. Linander, J. Bec, K. Gustavsson and B. Mehlig, Finite-time Lyapunov exponents of deep neural networks, Phys. Rev. Lett. 132 (2024) 057301.
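
As an illustration (not part of the original abstract), the sketch below shows one way the maximal finite-time Lyapunov exponent of a deep network can be estimated, assuming it is taken as 1/L times the logarithm of the largest singular value of the input-output Jacobian after L layers. The tanh architecture, layer widths, and JAX implementation are illustrative assumptions, not the setup of Ref. [2].

# Minimal sketch: maximal finite-time Lyapunov exponent (FTLE) of a small
# fully connected network, taken here as (1/L) * log sigma_max(Jacobian).
# Architecture and widths are assumed for illustration only.
import jax
import jax.numpy as jnp

def init_params(key, widths):
    """Random weights and zero biases for a fully connected network."""
    params = []
    for w_in, w_out in zip(widths[:-1], widths[1:]):
        key, sub = jax.random.split(key)
        W = jax.random.normal(sub, (w_out, w_in)) / jnp.sqrt(w_in)
        params.append((W, jnp.zeros(w_out)))
    return params

def forward(params, x):
    """Feed-forward map with tanh activations (assumed architecture)."""
    for W, b in params:
        x = jnp.tanh(W @ x + b)
    return x

def max_ftle(params, x):
    """Largest FTLE at input x: (1/L) log of the top singular value of the Jacobian."""
    J = jax.jacfwd(lambda z: forward(params, z))(x)
    sigma_max = jnp.linalg.svd(J, compute_uv=False)[0]
    return jnp.log(sigma_max) / len(params)

key = jax.random.PRNGKey(0)
params = init_params(key, [2, 64, 64, 64, 2])

# Evaluate the exponent on a grid of two-dimensional inputs; ridges of
# large positive values trace the boundaries the network builds in input space.
xs = jnp.linspace(-2.0, 2.0, 50)
grid = jnp.stack(jnp.meshgrid(xs, xs), axis=-1).reshape(-1, 2)
ftle = jax.vmap(lambda x: max_ftle(params, x))(grid)
print(ftle.reshape(50, 50).shape)

Plotting the resulting 50 x 50 field as a heat map would show the ridge structures described in the abstract; for a trained classifier the ridges separate regions associated with different classes.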


Document information

Published on 21/07/25

Volume Plenary Talks, 2025
DOI: 10.23967/data-driven.2025.024
Licence: CC BY-NC-SA
