What happens if we initialize the weights of a feed forward network to 0s?
If we initialize the weights as matrices of 0s, then every neuron in a layer computes the same output and receives the same gradient, so no matter how long you train the network, the neurons in each layer remain identical copies of one another and gradient descent stays stuck at the same point. This is usually called the symmetry breaking problem: random initialization is needed precisely to break this symmetry.
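A minimal NumPy sketch makes this concrete (the toy data and layer sizes are illustrative, not from the question): with all-zero weights the hidden activations are zero, so the weight gradients are zero too, and many gradient-descent steps move nothing except the output bias.

```python
import numpy as np

# Toy data: 8 samples, 3 features, scalar regression target.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = rng.normal(size=(8, 1))

# All-zero initialization for both layers.
W1 = np.zeros((3, 4)); b1 = np.zeros(4)
W2 = np.zeros((4, 1)); b2 = np.zeros(1)

for step in range(100):
    # Forward pass: tanh hidden layer, linear output, MSE loss.
    h = np.tanh(X @ W1 + b1)            # all zeros while W1, b1 are zeros
    pred = h @ W2 + b2
    grad_pred = 2 * (pred - y) / len(X)

    # Backward pass (manual backprop).
    gW2 = h.T @ grad_pred               # zero, because h is zero
    gb2 = grad_pred.sum(axis=0)
    gh = grad_pred @ W2.T               # zero, because W2 is zero
    gz = gh * (1 - h ** 2)              # tanh derivative
    gW1 = X.T @ gz                      # zero
    gb1 = gz.sum(axis=0)                # zero

    # Gradient-descent step.
    for p, g in ((W1, gW1), (b1, gb1), (W2, gW2), (b2, gb2)):
        p -= 0.1 * g

# After 100 steps only the output bias has moved; all weights are still 0,
# so every hidden neuron is still identical to every other.
print("max |W1| =", np.abs(W1).max())   # 0.0
print("max |W2| =", np.abs(W2).max())   # 0.0
print("b2 =", b2)                        # ~mean(y): the only thing learned
```

With a small constant nonzero initialization instead of exact zeros, the weights would move, but all neurons in a layer would update in lockstep and remain copies of one another; either way, only random initialization breaks the symmetry.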

Tags
Data Science