Down-Weighting Auxiliary Data from a Different Distribution
When an additional training source has a very different distribution from the dev/test set, or when it is much larger than the target-distribution data, the auxiliary examples can be given a lower weight in the training objective. The model then no longer has to fit both sources equally well, which can reduce the network size and computation needed to do well on both auxiliary and target-distribution examples.
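As a rough illustration of what "lower weight" can mean in practice, the sketch below scales each auxiliary example's squared error by a factor beta before summing. The function names, the squared-error loss, and the choice of beta as the ratio of target to auxiliary example counts are all assumptions made for this example; the weight would normally be treated as a hyperparameter and tuned on the dev set.

```python
def example_weights(is_auxiliary, beta):
    """Return one weight per example: beta for auxiliary data, 1.0 for target data."""
    return [beta if aux else 1.0 for aux in is_auxiliary]


def weighted_squared_error(predictions, labels, weights):
    """Sum of per-example squared errors, each scaled by its weight."""
    return sum(w * (p - y) ** 2 for p, y, w in zip(predictions, labels, weights))


# Toy setup mirroring a 40x imbalance: 2 target examples, 80 auxiliary examples.
n_target, n_aux = 2, 80
is_auxiliary = [False] * n_target + [True] * n_aux

# Assumption: start from beta = n_target / n_aux (here 1/40) so that both
# sources contribute the same total weight, then tune beta on the dev set.
beta = n_target / n_aux
weights = example_weights(is_auxiliary, beta)

target_weight_total = sum(w for w, aux in zip(weights, is_auxiliary) if not aux)
aux_weight_total = sum(w for w, aux in zip(weights, is_auxiliary) if aux)
```

With this starting value the 80 auxiliary examples carry about the same total weight as the 2 target examples, so the auxiliary source no longer dominates the objective. Setting beta to 0 would discard the auxiliary data entirely, while any positive beta keeps it in play at reduced influence.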
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Related
Avoid Randomly Shuffling Mixed-Source Data into Dev/Test Sets
Include Some Target-Distribution Examples in Training Alongside Auxiliary Data
Training Dev Set
Error Table Across Two Data Distributions and Three Error Types
Data Mismatch Between Training and Dev Set Distributions
Limited Practical Scope of Domain Adaptation for Different Data Distributions
Domain Adaptation for Different Data Distributions
Website Images and Mobile Phone Pictures as a Distribution Mismatch Example
Random 70/30 Train/Test Split Can Fail Under Distribution Shift
Learn After
Tuning Auxiliary Data Weight with the Dev Set
Which condition makes it appropriate to down-weight auxiliary training data according to Andrew Ng?
Down-weighting auxiliary data can reduce the computational burden of training a model on both auxiliary and target-distribution examples.
Re-weighting is needed only when you suspect the additional data has a very _____ distribution than the dev/test set.
Match each training scenario to its key implication for handling auxiliary data.
Order the reasoning steps a practitioner should follow when deciding whether to down-weight large-scale auxiliary training data.
In the 40x internet vs. mobile image scenario, what is the main cost of training equally on both sources without down-weighting?
Giving auxiliary data a lower weight in training is equivalent to removing it from the training set entirely.
By giving internet images a much lower _____, you don't have to build as massive a neural network.
Match each concept to its correct description in the context of auxiliary data down-weighting.
Order the causal steps explaining why down-weighting auxiliary data allows a smaller neural network to be used.