Combining Bayesian Inference with Neural Networks

It can be confusing at first to reconcile the principles of Bayesian inference with the framework of neural networks, especially given the size of modern architectures and the nature of stochastic gradient-based optimization, neither of which is usually covered in Bayesian statistics resources.

I wrote this document for a talk on Bayesian ML that I gave at my company, in which I tried to derive the connection step by step: how do you go from basic Bayesian inference principles to a tractable approximation for NNs that can be computed in practice?

The PDF can be found here.