
Log-Likelihood Gradient

The gradient of the log-likelihood, which is used in maximum likelihood estimation, can be decomposed as

$$\nabla_{\boldsymbol \theta} \log p(\mathbf{x}; \boldsymbol \theta) = \nabla_{\boldsymbol \theta} \log \tilde{p}(\mathbf{x}; \boldsymbol \theta) - \nabla_{\boldsymbol \theta} \log Z(\boldsymbol \theta)$$

where $\tilde{p}(\mathbf{x}; \boldsymbol \theta)$ is the unnormalized probability density and $Z(\boldsymbol \theta)$ is the partition function. This is the well-known decomposition into the positive phase and the negative phase of learning. Because the partition function depends on the parameters, learning models by maximum likelihood is particularly difficult: the negative phase $\nabla_{\boldsymbol \theta} \log Z(\boldsymbol \theta)$ equals the expectation $\mathbb{E}_{\mathbf{x} \sim p(\mathbf{x})}\left[\nabla_{\boldsymbol \theta} \log \tilde{p}(\mathbf{x}; \boldsymbol \theta)\right]$, which in general requires sampling from the model itself.
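As a sanity check, the decomposition can be verified numerically on a toy discrete energy-based model. This is a minimal sketch, not any particular model from the literature: the state space, feature map `f`, and parameter vector `theta` are all assumptions chosen for illustration, with $\tilde{p}(x; \theta) = \exp(\theta \cdot f(x))$ so that the positive phase is $f(x)$ and the negative phase is $\mathbb{E}_p[f(X)]$.

```python
import numpy as np

# Toy discrete EBM over states {0, 1, 2, 3} (all choices here are illustrative):
# unnormalized density p~(x; theta) = exp(theta . f(x)).
states = np.arange(4)

def f(x):
    # Hypothetical feature map for this example.
    return np.array([x, x**2], dtype=float)

theta = np.array([0.3, -0.1])
F = np.stack([f(x) for x in states])  # feature matrix, shape (4, 2)

def log_p(x, theta):
    # Normalized log-density: log p~(x) - log Z(theta).
    logits = F @ theta
    return f(x) @ theta - np.log(np.sum(np.exp(logits)))

x = 2
probs = np.exp(F @ theta)
probs /= probs.sum()                  # model distribution p(x; theta)

# Positive phase: grad_theta log p~(x) = f(x).
# Negative phase: grad_theta log Z = E_p[f(X)] = probs @ F.
grad = f(x) - probs @ F

# Check the decomposition against central finite differences.
eps = 1e-6
num_grad = np.array([
    (log_p(x, theta + eps * e) - log_p(x, theta - eps * e)) / (2 * eps)
    for e in np.eye(2)
])
assert np.allclose(grad, num_grad, atol=1e-5)
```

Because the state space is tiny, the negative-phase expectation is computed exactly by enumeration; in realistic models it is intractable and is instead approximated by sampling from the model.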


Updated 2021-07-22

References


Tags

Data Science

Related