Deep Belief Networks (DBNs)

  • DBNs were one of the first nonconvolutional models to successfully admit training of deep architectures.
  • They are generative models with several layers of latent (hidden) variables; the latent variables are typically binary, while the visible units may be binary or real-valued.
  • Every unit in each layer is connected to every unit in each neighboring layer; there are no intra-layer connections. The connections between the top two layers are undirected, while all other connections are directed, pointing toward the data.
  • The probability distribution represented by a DBN is defined by the undirected joint over the top two hidden layers together with the directed, top-down conditionals for the layers below (a runnable sampling sketch follows this list):

    $P(h^{(l)}, h^{(l-1)}) \propto \exp\left(b^{(l)\top}h^{(l)} + b^{(l-1)\top}h^{(l-1)} + h^{(l-1)\top}W^{(l)}h^{(l)}\right)$

    $P(h_i^{(k)} = 1 \mid h^{(k+1)}) = \sigma\left(b_i^{(k)} + W_{:,i}^{(k+1)\top}h^{(k+1)}\right) \quad \forall i,\ \forall k \in 1, \dots, l-2$

    $P(v_i = 1 \mid h^{(1)}) = \sigma\left(b_i^{(0)} + W_{:,i}^{(1)\top}h^{(1)}\right) \quad \forall i$
  • In the case of real-valued visible units, we substitute $\mathbf{v} \sim \mathcal{N}\left(\mathbf{v};\, b^{(0)} + W^{(1)\top}h^{(1)},\, \beta^{-1}\right)$, with $\beta$ diagonal for tractability.
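As a concrete illustration of these equations, below is a minimal NumPy sketch of ancestral sampling from an already-trained binary DBN: block Gibbs sampling in the undirected top RBM, followed by a single top-down pass through the directed sigmoid layers. The function name, weight-shape convention, and layer sizes are hypothetical choices for this sketch, not part of the original material.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bernoulli(p):
    """Sample a vector of independent Bernoulli variables with means p."""
    return (rng.random(p.shape) < p).astype(float)

def sample_dbn(W, b, n_gibbs=200):
    """Ancestral sampling from a binary DBN with l = len(W) hidden layers.

    Convention (an assumption of this sketch): W[k] has shape
    (n_k, n_{k+1}) and couples layer h^(k) below (with h^(0) = v) to
    layer h^(k+1) above; b[k] is the bias vector of layer k.
    """
    l = len(W)
    # (1) Block Gibbs sampling in the top RBM, whose undirected joint over
    #     (h^(l-1), h^(l)) is the exp(...) expression above. A finite chain
    #     only approximates an exact draw from that joint.
    h_lo = bernoulli(np.full(b[l - 1].shape, 0.5))
    for _ in range(n_gibbs):
        h_hi = bernoulli(sigmoid(b[l] + W[l - 1].T @ h_lo))
        h_lo = bernoulli(sigmoid(b[l - 1] + W[l - 1] @ h_hi))
    # (2) A single top-down pass through the directed sigmoid layers,
    #     applying P(h_i^(k) = 1 | h^(k+1)) layer by layer down to v.
    h = h_lo
    for k in range(l - 2, -1, -1):
        h = bernoulli(sigmoid(b[k] + W[k] @ h))
    return h  # a sample of the visible units v = h^(0)

# Hypothetical sizes: 6 visible units, hidden layers of 4 and 3 units.
sizes = [6, 4, 3]
W = [rng.normal(0.0, 0.1, (sizes[k], sizes[k + 1])) for k in range(len(sizes) - 1)]
b = [np.zeros(n) for n in sizes]
v = sample_dbn(W, b)
```

Note that the final (k = 0) step of the downward pass is exactly the visible-unit equation $P(v_i = 1 \mid h^{(1)})$; intermediate layers reuse the same sigmoid conditional.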
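For real-valued visible units, only that final step changes: $\mathbf{v}$ is drawn from the Gaussian above rather than from a Bernoulli. A sketch reusing the names from the previous block, where `beta_diag` is a hypothetical vector holding the diagonal of the precision matrix $\beta$:

```python
def sample_visible_gaussian(h1, W1, b0, beta_diag):
    """Draw v ~ N(b^(0) + W^(1) h^(1), beta^{-1}) with diagonal beta.

    A diagonal precision makes the visible units conditionally
    independent given h^(1), which is what keeps this draw tractable.
    """
    mean = b0 + W1 @ h1             # W1 = W[0] in the convention above
    std = np.sqrt(1.0 / beta_diag)  # per-unit standard deviations
    return mean + std * rng.normal(size=mean.shape)
```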

Tags

Data Science