Logistic regression gradient descent
If we only have two features, $x_1$ and $x_2$, then to minimize the loss function we can apply gradient descent to update $w_1$, $w_2$, and $b$. To compute the derivatives of $\mathcal L$ with respect to $w_1$, $w_2$, and $b$, we first need the derivatives of $\mathcal L$ with respect to $a$ and $z$.
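For reference, the derivation below assumes the standard logistic regression setup, which is the setup implied by the derivatives that follow:

$$z = w_1 x_1 + w_2 x_2 + b, \qquad a = \sigma(z) = \frac{1}{1+e^{-z}}, \qquad \mathcal L(a, y) = -\big(y\log a + (1-y)\log(1-a)\big)$$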
$$\begin{aligned} \frac{d\mathcal L (a, y)}{dz} & = \frac{d\mathcal L (a, y)}{da}\frac{da}{dz} \\ & = \left(-\frac{y}{a}+\frac{1-y}{1-a}\right) \cdot a(1-a) = a-y \end{aligned}$$

$$\frac{d\mathcal L (a, y)}{dw_1} = \frac{d\mathcal L (a, y)}{dz}\frac{dz}{dw_1} = (a-y)\,x_1$$

$$\frac{d\mathcal L (a, y)}{dw_2} = \frac{d\mathcal L (a, y)}{dz}\frac{dz}{dw_2} = (a-y)\,x_2$$

$$\frac{d\mathcal L (a, y)}{db} = \frac{d\mathcal L (a, y)}{dz}\frac{dz}{db} = (a-y)\cdot 1 = a-y$$
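Gradient descent then updates each parameter in the direction opposite its derivative, e.g. $w_1 := w_1 - \alpha\,\frac{d\mathcal L}{dw_1}$, where $\alpha$ is the learning rate. Below is a minimal sketch of a single update for one training example using the derivatives above; the function name, learning rate, and example values are illustrative assumptions, not part of the original derivation.

```python
import math

def sigmoid(z):
    """Logistic function: maps z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def gradient_descent_step(w1, w2, b, x1, x2, y, alpha=0.1):
    """One gradient-descent update for logistic regression with
    two features, using the derivatives derived above."""
    # Forward pass: z = w1*x1 + w2*x2 + b, a = sigmoid(z)
    z = w1 * x1 + w2 * x2 + b
    a = sigmoid(z)

    # Backward pass: dL/dz = a - y, then chain rule per parameter
    dz = a - y
    dw1 = dz * x1   # dL/dw1 = (a - y) * x1
    dw2 = dz * x2   # dL/dw2 = (a - y) * x2
    db = dz         # dL/db  = (a - y)

    # Step against the gradient
    w1 -= alpha * dw1
    w2 -= alpha * dw2
    b -= alpha * db
    return w1, w2, b

# Example: one update from zero-initialized parameters (illustrative values)
w1, w2, b = gradient_descent_step(w1=0.0, w2=0.0, b=0.0, x1=1.5, x2=-0.5, y=1)
print(w1, w2, b)
```

In practice this step would be repeated over many examples (or averaged over a batch to optimize the cost function), but the per-parameter derivatives are exactly the ones computed above.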