Formula

Logistic Regression Gradient Descent Derivation

With only two features, $x_1$ and $x_2$, we can minimize the loss function by applying gradient descent to update $w_1$, $w_2$, and $b$. Here $z = w_1 x_1 + w_2 x_2 + b$ and $a = \sigma(z)$. To compute the derivatives of $\mathcal{L}(a, y)$ with respect to $w_1$, $w_2$, and $b$, we first need its derivatives with respect to $a$ and $z$.

Derivative with respect to $a$:

$$\mathcal{L}(a, y) = -(y \log(a) + (1 - y) \log(1 - a)) \Rightarrow \frac{d\mathcal{L}(a, y)}{da} = -\frac{y}{a}+\frac{1-y}{1-a}$$

Derivative with respect to $z$, by the chain rule:

$$a = \sigma(z) = \frac{1}{1 + e^{-z}} \Rightarrow \frac{da}{dz} = a(1-a)$$

$$\begin{aligned} \frac{d\mathcal{L}(a, y)}{dz} & = \frac{d\mathcal{L}(a, y)}{da}\frac{da}{dz} \\ & = \left(-\frac{y}{a}+\frac{1-y}{1-a}\right)(a(1-a)) = a-y \end{aligned}$$

Derivatives with respect to the parameters, using $\frac{dz}{dw_1} = x_1$, $\frac{dz}{dw_2} = x_2$, and $\frac{dz}{db} = 1$:

$$\begin{aligned} \frac{d\mathcal{L}(a, y)}{dw_1} & = \frac{d\mathcal{L}(a, y)}{dz}\frac{dz}{dw_1} = (a-y)x_1 \end{aligned}$$

$$\begin{aligned} \frac{d\mathcal{L}(a, y)}{dw_2} & = \frac{d\mathcal{L}(a, y)}{dz}\frac{dz}{dw_2} = (a-y)x_2 \end{aligned}$$

$$\begin{aligned} \frac{d\mathcal{L}(a, y)}{db} & = \frac{d\mathcal{L}(a, y)}{dz}\frac{dz}{db} = (a-y) \cdot 1 = a-y \end{aligned}$$
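The derivation can be checked numerically. The following is a minimal Python sketch for a single training example, not a reference implementation. The function names, the learning rate `alpha`, and the finite-difference step `eps` are illustrative assumptions that do not appear in the derivation itself. It compares the analytic gradients $(a-y)x_1$, $(a-y)x_2$, and $a-y$ against central finite differences, and performs one gradient descent update.

```python
import math


def sigmoid(z):
    """Logistic function: a = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def loss(w1, w2, b, x1, x2, y):
    """Cross-entropy loss L(a, y) for one example with two features."""
    a = sigmoid(w1 * x1 + w2 * x2 + b)
    return -(y * math.log(a) + (1 - y) * math.log(1 - a))


def gradients(w1, w2, b, x1, x2, y):
    """Analytic gradients from the derivation: dL/dz = a - y."""
    a = sigmoid(w1 * x1 + w2 * x2 + b)
    dz = a - y
    return dz * x1, dz * x2, dz  # dL/dw1, dL/dw2, dL/db


def numerical_gradients(w1, w2, b, x1, x2, y, eps=1e-6):
    """Central finite differences, used only to sanity-check the formulas.

    eps is an assumed step size, not part of the derivation.
    """
    dw1 = (loss(w1 + eps, w2, b, x1, x2, y) - loss(w1 - eps, w2, b, x1, x2, y)) / (2 * eps)
    dw2 = (loss(w1, w2 + eps, b, x1, x2, y) - loss(w1, w2 - eps, b, x1, x2, y)) / (2 * eps)
    db = (loss(w1, w2, b + eps, x1, x2, y) - loss(w1, w2, b - eps, x1, x2, y)) / (2 * eps)
    return dw1, dw2, db


def gradient_step(w1, w2, b, x1, x2, y, alpha=0.1):
    """One gradient descent update; alpha is an assumed learning rate."""
    dw1, dw2, db = gradients(w1, w2, b, x1, x2, y)
    return w1 - alpha * dw1, w2 - alpha * dw2, b - alpha * db


if __name__ == "__main__":
    # Arbitrary example values chosen for illustration.
    params = (0.5, -0.3, 0.1)
    x1, x2, y = 1.0, 2.0, 1
    print("analytic: ", gradients(*params, x1, x2, y))
    print("numerical:", numerical_gradients(*params, x1, x2, y))
    print("updated:  ", gradient_step(*params, x1, x2, y))
```

If the derivation is right, the analytic and numerical gradients should agree to several decimal places, and the loss evaluated at the updated parameters should be lower than at the starting point for a sufficiently small `alpha`.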

Updated 2026-05-16

Tags

Data Science
