Learn Before
Leaky Units and Other Strategies for Multiple Time Scales
As mentioned in the introduction to the long-term dependencies problem, there are three broad ways to address long-term dependencies. One of them, "leaky units and other multiple-time-scale strategies," is introduced in this unit.
This idea has led to numerous approaches; three of them are introduced here.
1 - Adding Skip Connections through Time
Adding direct connections from variables in the distant past to current variables is one way to obtain coarse time scales. In an ordinary recurrent network, the gradient may vanish or explode exponentially with respect to the number of time steps. Introducing recurrent connections with a time delay of d mitigates this problem: the derivatives now decrease exponentially as a function of τ/d rather than τ. Since both delayed and single-step connections exist, the gradient may still explode exponentially in τ, but the delayed connections allow the learning algorithm to capture longer dependencies.
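To make the idea concrete, here is a minimal sketch (not from any library; all names are illustrative) of a recurrent update in which the hidden state at step t depends on both h[t-1] and h[t-d], i.e. a skip connection through time with delay d:

```python
import numpy as np

def rnn_with_time_skips(x, d=3, hidden=4, seed=0):
    """Toy RNN forward pass with an extra delay-d recurrent connection.

    The state h[t] receives input from h[t-1] (the usual delay-1 path)
    and from h[t-d] (the skip connection through time).
    """
    rng = np.random.default_rng(seed)
    T, n_in = x.shape
    W_in = rng.normal(scale=0.1, size=(hidden, n_in))
    W_rec = rng.normal(scale=0.1, size=(hidden, hidden))   # delay-1 path
    W_skip = rng.normal(scale=0.1, size=(hidden, hidden))  # delay-d path
    h = np.zeros((T + 1, hidden))
    for t in range(1, T + 1):
        # Before step d the delayed state does not exist yet; use zeros.
        h_skip = h[t - d] if t - d >= 0 else np.zeros(hidden)
        h[t] = np.tanh(W_in @ x[t - 1] + W_rec @ h[t - 1] + W_skip @ h_skip)
    return h[1:]

states = rnn_with_time_skips(np.ones((10, 2)))
print(states.shape)  # (10, 4)
```

During backpropagation, the skip path contributes Jacobian products over roughly τ/d factors instead of τ, which is the source of the slower exponential decay described above.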
2 - Leaky Units and a Spectrum of Different Time Scales
The idea is similar to a running (moving) average, like the shadow variables maintained by TensorFlow's exponential moving average. A running average accumulates information over a long period of time. For the long-term dependency problem, we can use such an average in place of the raw values, so that more long-term information is taken into account as the iterations proceed.
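A leaky unit is exactly this kind of running average applied to a hidden state: u_t = α·u_{t−1} + (1−α)·v_t. The sketch below (illustrative, standalone code) shows how the decay factor α sets the time scale: α near 1 remembers the distant past, α near 0 discards it quickly.

```python
import numpy as np

def leaky_unit(v, alpha):
    """Leaky (running-average) unit: u_t = alpha * u_{t-1} + (1 - alpha) * v_t.

    alpha close to 1 -> long memory (slow time scale);
    alpha close to 0 -> short memory (fast time scale).
    """
    u = np.zeros_like(v, dtype=float)
    u[0] = (1 - alpha) * v[0]
    for t in range(1, len(v)):
        u[t] = alpha * u[t - 1] + (1 - alpha) * v[t]
    return u

# A single impulse at t=0, then silence: how long is it remembered?
signal = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
slow = leaky_unit(signal, alpha=0.9)  # decays slowly; the impulse persists
fast = leaky_unit(signal, alpha=0.1)  # decays quickly; the impulse is gone
```

Using a spectrum of different α values across units gives the network state components that operate on a spectrum of different time scales.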
3 - Removing Connections
Unlike adding skip connections, which adds edges so that a unit can select favorable dependencies on its own, removing connections actively removes the length-1 connections, replacing them with longer ones and forcing the units to operate on a longer time scale.
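One common way to realize this, sketched below under my own naming (not from any library), is to group units into different time scales: a "fast" group keeps the usual delay-1 recurrence, while a "slow" group has its length-1 connection removed and updates only through a length-d connection, i.e. only once every d steps.

```python
import numpy as np

def two_timescale_states(x, d=3):
    """Toy two-group recurrence.

    'fast' units keep the ordinary delay-1 recurrence.
    'slow' units have no length-1 connection: they update only through a
    length-d connection, so their state changes only every d steps and is
    held constant in between.
    """
    T = len(x)
    fast = np.zeros(T)
    slow = np.zeros(T)
    for t in range(T):
        prev_fast = fast[t - 1] if t > 0 else 0.0
        fast[t] = np.tanh(0.5 * prev_fast + x[t])
        if t % d == 0:
            # Slow group: only a delay-d connection feeds the update.
            prev_slow = slow[t - d] if t - d >= 0 else 0.0
            slow[t] = np.tanh(0.5 * prev_slow + x[t])
        else:
            slow[t] = slow[t - 1]  # state held between updates
    return fast, slow

fast, slow = two_timescale_states(np.ones(7), d=3)
```

Gradients flowing through the slow group cross only one Jacobian per d time steps, so dependencies of length τ cost roughly τ/d factors instead of τ.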
Tags
Deep Learning (in Machine learning)
Data Science
Related
Applications of RNN
RNN Basic Structure
RNN Extensions and Types
Loss Function for RNN
RNNs(Recurrent Neural Networks) vs HMMs (Hidden Markov Models)
RNNs vs Feedforward Neural Networks
Hybrid of Convolutional and Recurrent Neural Network
Why is an RNN (Recurrent Neural Network) used for machine translation, say translating English to French? (Check all that apply.)
RNN Problem
Different types of RNN (in terms of input/output)
Long Term Dependencies Problem
Modeling Sequences Conditioned on Context with RNNs
Leaky Units and Other Strategies for Multiple Time Scales
Convolutional Recurrent Neural Network (CRNN)
Pooling Layer in RNN
Inability of RNNs to Carry Forward Critical Information
Stacked RNNs
Bidirectional RNNs