Formalising Curriculum Learning
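Let $z$ be a random variable representing an example for the learner (possibly an $(x, y)$ pair for supervised learning). Let $P(z)$ be the target training distribution from which the learner should ultimately learn a function of interest. Let $0 \leq W_{\lambda}(z) \leq 1$ be the weight applied to example $z$ at step $\lambda$ in the curriculum sequence, with $0 \leq \lambda \leq 1$, and $W_{1}(z) = 1$. The corresponding training distribution at step $\lambda$ is $Q_{\lambda}(z) \propto W_{\lambda}(z) P(z)$ for all $z$, normalised so that $\int Q_{\lambda}(z)\,dz = 1$; hence $Q_{1}(z) = P(z)$.
Consider a monotonically increasing sequence of $\lambda$ values, starting from $\lambda = 0$ and ending at $\lambda = 1$. The corresponding sequence of distributions $Q_{\lambda}$ is called a curriculum if the entropy of these distributions increases, $H(Q_{\lambda}) < H(Q_{\lambda + \epsilon})$ for all $\epsilon > 0$, and $W_{\lambda}(z)$ is monotonically increasing in $\lambda$, i.e., $W_{\lambda + \epsilon}(z) \geq W_{\lambda}(z)$ for all $z$ and all $\epsilon > 0$.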
The result is a sequence of training distributions. Early in the sequence, the weights favor simpler examples that can be learned relatively easily. As training proceeds, the weights are adapted so that harder examples become more likely to enter training, which is why the entropy of the training distribution increases.
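The conditions can be checked numerically on a small discrete example. The sketch below is illustrative only: it assumes a four-example dataset with a uniform target distribution, a hypothetical easy/hard split, and a hypothetical weighting in which easy examples always have weight 1 and hard examples have weight equal to the curriculum step. The training distribution at each step is taken to be proportional to weight times target probability. None of the function names come from the original text.

```python
import numpy as np

def weights(is_easy, lam):
    # Hypothetical weighting: easy examples keep weight 1,
    # hard examples get weight lam, so every weight is 1 at lam = 1.
    return np.where(is_easy, 1.0, lam)

def training_distribution(p, w):
    # Reweight the target distribution and normalise so it sums to 1.
    q = w * p
    return q / q.sum()

def entropy(q):
    # Shannon entropy in nats; zero-probability entries contribute nothing.
    nz = q[q > 0]
    return float(-(nz * np.log(nz)).sum())

def is_curriculum(p, is_easy, lambdas):
    ws = [weights(is_easy, lam) for lam in lambdas]
    hs = [entropy(training_distribution(p, w)) for w in ws]
    # Condition 1: entropy strictly increases along the sequence.
    entropy_increases = all(h0 < h1 for h0, h1 in zip(hs, hs[1:]))
    # Condition 2: no example's weight ever decreases.
    weights_monotone = all(bool(np.all(w0 <= w1)) for w0, w1 in zip(ws, ws[1:]))
    # The final step must recover the target distribution.
    ends_at_target = bool(np.allclose(training_distribution(p, ws[-1]), p))
    return entropy_increases and weights_monotone and ends_at_target

p = np.array([0.25, 0.25, 0.25, 0.25])           # assumed uniform target
is_easy = np.array([True, True, False, False])   # assumed easy/hard split
lambdas = np.linspace(0.0, 1.0, 5)               # steps from 0 to 1

print(is_curriculum(p, is_easy, lambdas))
```

Under these assumptions the sequence starts with all mass on the two easy examples and ends at the uniform target, so both conditions hold. The entropy condition is not automatic: with the same weighting but a target such as `[0.1, 0.1, 0.4, 0.4]`, which puts most mass on the hard examples, the entropy peaks partway through and the check fails.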
Tags
Data Science