Theory

Probit Model

An alternative to the softmax function for modeling categorical probabilities is the probit model, which assumes that the raw outputs $\mathbf{o}$ are corrupted versions of the true labels $\mathbf{y}$. The corruption is modeled as additive noise $\boldsymbol{\epsilon}$ drawn from a normal distribution: $\mathbf{y} = \mathbf{o} + \boldsymbol{\epsilon}$, where $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$. Although the noise interpretation is conceptually appealing, the probit model tends to work less well in practice than softmax regression and leads to a harder optimization problem.
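To make the noise model concrete, the following minimal sketch estimates class probabilities under one common reading of the probit model: the predicted class is whichever component of $\mathbf{o} + \boldsymbol{\epsilon}$ is largest. That argmax rule, the function names `probit_class_probs` and `softmax`, the example logits, and the Monte Carlo sample count are all assumptions made for illustration; none of them comes from the text above.

```python
import numpy as np


def probit_class_probs(o, sigma=1.0, num_samples=100_000, seed=0):
    """Monte Carlo estimate of P(class i has the largest noisy output).

    Assumption: a class "wins" when o_i + eps_i is the maximum, with
    eps_i ~ N(0, sigma^2) drawn independently for each class.
    """
    rng = np.random.default_rng(seed)
    o = np.asarray(o, dtype=float)
    # One row of independent Gaussian noise per simulated draw.
    eps = rng.normal(0.0, sigma, size=(num_samples, o.size))
    winners = np.argmax(o + eps, axis=1)
    # Fraction of draws in which each class had the largest noisy output.
    counts = np.bincount(winners, minlength=o.size)
    return counts / num_samples


def softmax(o):
    """Numerically stable softmax, shown for comparison."""
    o = np.asarray(o, dtype=float)
    z = np.exp(o - o.max())
    return z / z.sum()


o = np.array([2.0, 1.0, 0.1])  # hypothetical raw outputs
p_probit = probit_class_probs(o)
p_softmax = softmax(o)
print("probit (Monte Carlo):", p_probit)
print("softmax:             ", p_softmax)
```

Both functions return valid probability vectors that rank the classes in the same order here, but the probit estimate needs sampling (or Gaussian integrals) where the softmax is a closed-form expression. That difference is one way to see why the probit model gives a less convenient optimization problem.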

Updated 2026-05-03

Tags

D2L

Dive into Deep Learning @ D2L