
Sparse Autoencoders

  • It is an autoencoder whose training criterion adds a sparsity penalty $\Omega(h)$ on the code layer $h$ to the reconstruction error: $L(x, g(f(x))) + \Omega(h)$, where $h = f(x)$ is the encoder output and $g(h)$ the decoder output (see the code sketch after this list).
  • It is a framework that approximates maximum likelihood training of a generative model with hidden variables.
  • Consider a model with visible variables $x$ and hidden variables $h$, with an explicit joint distribution $p_{model}(x, h) = p_{model}(h)\, p_{model}(x \mid h)$. The log-likelihood can be decomposed as

    $$\log p_{model}(x) = \log \sum_h p_{model}(h, x)$$

    We can think of the autoencoder as approximating this sum with a point estimate at just one highly likely value of $h$. With this chosen $h$, we are maximizing

    $$\log p_{model}(h, x) = \log p_{model}(h) + \log p_{model}(x \mid h)$$

    Choosing a Laplace prior over each code unit, $p_{model}(h_i) = \frac{\lambda}{2} e^{-\lambda |h_i|}$, the log-prior becomes an absolute-value penalty:

    $$\Omega(h) = \lambda \sum_i |h_i|$$

    $$-\log p_{model}(h) = \sum_i \left( \lambda |h_i| - \log \frac{\lambda}{2} \right) = \Omega(h) + \text{const}$$
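As a concrete illustration of the penalized criterion above, here is a minimal PyTorch sketch. The layer sizes, the ReLU code layer, the squared-error reconstruction term, and the value of $\lambda$ (`lam`) are illustrative assumptions, not part of the original note.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Autoencoder trained with L(x, g(f(x))) + Omega(h)."""

    def __init__(self, n_inputs: int, n_code: int):
        super().__init__()
        self.encoder = nn.Linear(n_inputs, n_code)  # f
        self.decoder = nn.Linear(n_code, n_inputs)  # g

    def forward(self, x):
        h = torch.relu(self.encoder(x))  # code layer h = f(x)
        return self.decoder(h), h        # reconstruction g(h), code h


def sparse_ae_loss(x, x_hat, h, lam=1e-3):
    reconstruction = ((x_hat - x) ** 2).mean()  # L(x, g(f(x))), squared error
    penalty = lam * h.abs().sum(dim=1).mean()   # Omega(h) = lambda * sum_i |h_i|
    return reconstruction + penalty


# One training step on a random batch (shapes are illustrative)
model = SparseAutoencoder(n_inputs=784, n_code=64)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(32, 784)

optimizer.zero_grad()
x_hat, h = model(x)
loss = sparse_ae_loss(x, x_hat, h)
loss.backward()
optimizer.step()
```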
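Setting `lam` to zero recovers a plain autoencoder; raising it trades reconstruction accuracy for sparser codes, matching the MAP view above, in which $\lambda$ is the parameter of the Laplace prior on $h$.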
Updated 2021-07-23

Tags

Data Science