6.5. Deep neural network algorithms inspired by statistical physics and information theory
Large amounts of data, cheap computation, and efficient algorithms are driving the impressive performance and adoption of robust deep learning architectures. However, building, maintaining, and extending these systems remains largely an art and requires considerable trial and error. Learning and inference methods have a long history of being inspired by, and derived from, the principles of statistical physics and information theory [151, 152]. We summarize examples that advance this theme, deriving NN algorithms from a confluence of ideas in statistical physics and information theory [153] and feeding them back into core MSM methods by prescribing new computational techniques for deep neural networks. (A) Generalization in deep NNs: one approach uses algebraic topology [154, 155] to characterize the space of functions reachable by stochastic training dynamics on data, in order to build computationally efficient architectures and algorithms to train them [156-158]; a simplified illustration of the topological machinery involved is sketched below. (B) Characterizing the quality of representations and the performance of encoders and decoders: recent work proposes exploiting principles of representation learning to formulate variational approaches for assessing deep learning algorithms [16], with guarantees on the performance of the final model; a sketch of one such variational estimator follows the topological example.
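As a concrete, if much simpler, illustration of the topological tools referred to in (A), the sketch below computes the 0-dimensional persistent homology of a point cloud (for instance, a layer's activations) under the Vietoris-Rips filtration. It is offered only as an orientation to the kind of multi-scale summary such methods produce; the constructions in [154-158] operate on spaces of reachable functions and are considerably more elaborate, and the function and variable names here are our own.

```python
import numpy as np
from itertools import combinations

def zeroth_persistence(points):
    """0-dimensional persistent homology of a point cloud under the
    Vietoris-Rips filtration: record the scales at which connected
    components merge as the distance threshold grows (equivalently,
    a single-linkage merge tree).

    The sorted 'death' values summarize multi-scale cluster structure
    and can be compared across layers or across training stages.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):  # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Filtration edges, processed in order of increasing length.
    edges = sorted((np.linalg.norm(points[i] - points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    deaths = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:               # two components merge at this scale
            parent[ri] = rj
            deaths.append(dist)
    return np.array(deaths)        # n - 1 finite persistence values

# Toy usage: two well-separated clusters; the largest death value
# reflects the gap between them, the small ones the within-cluster scale.
rng = np.random.default_rng(0)
cloud = np.vstack([rng.normal(0, 0.1, size=(20, 2)),
                   rng.normal(3, 0.1, size=(20, 2))])
print(zeroth_persistence(cloud)[-3:])
```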
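For (B), the sketch below estimates an InfoNCE-style variational lower bound on the mutual information between two learned representations of paired samples, one common way to score how much information an encoder preserves without training a downstream model. It is a minimal illustration under our own assumptions (a fixed cosine-similarity critic, illustrative names and parameters); the specific variational formulation in [16] may differ.

```python
import numpy as np

def infonce_lower_bound(z_x, z_y, temperature=0.1):
    """InfoNCE-style variational lower bound on I(X; Y), estimated from
    paired representations z_x[i] <-> z_y[i] with a fixed cosine critic.

    A larger bound indicates that the encoders preserve more shared
    information; the bound itself is capped at log(K) for K pairs.
    """
    # Cosine-similarity critic between all pairs of representations.
    z_x = z_x / np.linalg.norm(z_x, axis=1, keepdims=True)
    z_y = z_y / np.linalg.norm(z_y, axis=1, keepdims=True)
    scores = (z_x @ z_y.T) / temperature          # (K, K) critic matrix
    K = scores.shape[0]

    # Row-wise log-softmax: the positive pair sits on the diagonal,
    # the remaining K - 1 entries act as negative samples.
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return np.log(K) + np.mean(np.diag(log_probs))

# Toy usage: two noisy views of the same latent samples. The estimate
# approaches log(256) ~ 5.5 as the views become more strongly coupled.
rng = np.random.default_rng(0)
latent = rng.normal(size=(256, 16))
view_a = latent + 0.1 * rng.normal(size=latent.shape)
view_b = latent + 0.1 * rng.normal(size=latent.shape)
print(infonce_lower_bound(view_a, view_b))
```

In practice the critic would itself be parameterized and trained, and the resulting bound tracked over encoder training to diagnose representation quality.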