Learn Before
Neural Network Capacity and Irrelevant Training Data
Question: Besides wasting computational resources, what other negative impact does including unhelpful training data (such as scanned text in a detector for casual photos) have on a neural network?
Sample answer: Including unhelpful training data wastes the neural network's representational capacity by forcing it to learn features that do not help on the target dev/test distribution.
Key points:
- Wastes the neural network's representational capacity.
- Forces the network to allocate parameters/capacity to irrelevant features.
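The key points above suggest leaving such data out before training. A minimal sketch of one way to do that, assuming each example carries a provenance label: the `Example` class, the `source` field, and the source names are hypothetical and do not come from Machine Learning Yearning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    path: str
    source: str  # hypothetical provenance label attached at collection time
    label: int


# Assumption: sources judged to resemble the dev/test distribution.
RELEVANT_SOURCES = frozenset({"mobile_photo", "internet_photo"})


def select_training_examples(examples, relevant_sources=RELEVANT_SOURCES):
    """Keep only examples whose source is in relevant_sources."""
    return [ex for ex in examples if ex.source in relevant_sources]


candidates = [
    Example("img_001.jpg", "mobile_photo", 1),
    Example("img_002.jpg", "internet_photo", 0),
    Example("scan_001.png", "scanned_document", 0),  # unlike dev/test data
]

training_set = select_training_examples(candidates)
```

Filtering by source is only a stand-in for the judgment call itself: deciding which sources count as relevant still depends on how closely they resemble the dev/test distribution.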
Rubric: The response must state that unhelpful training data wastes the neural network's representational capacity.
References
Machine Learning Yearning (Deeplearning.ai)
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Related
In the cat-detector example, why does Machine Learning Yearning recommend excluding scanned historical documents that look nothing like the dev/test distribution?
True or False: Adding more training data always improves model performance when training, dev, and test sets share the same distribution.
According to Machine Learning Yearning, if the dev error curve has _____ (i.e., flattened out), adding more training data will not help you reach your performance goal.
Why does Machine Learning Yearning recommend leaving out training data that has no benefit for your model?
True or False: According to Machine Learning Yearning, adding more training data can actually hurt model performance.
If the dev error curve has _____, you can immediately tell that adding more training data won't help reach your performance goal.
Match each concept to its correct description in Machine Learning Yearning's discussion of when adding training data does not help.
Order the steps for using a learning curve to decide whether to collect more training data, as described in Machine Learning Yearning.
In the cat-detector example, why should a large collection of scanned historical documents be excluded from training?
True or False: According to Machine Learning Yearning, inspecting the learning curve can prevent wasting months collecting data that turns out not to help.
According to Machine Learning Yearning, data that has no _____ should be left out of training for computational reasons.
Match each training data scenario to the recommended action from Machine Learning Yearning.
Order the reasoning steps for evaluating whether a new data source (e.g., scanned documents) should be added to training, per Machine Learning Yearning.
Analyzing Computational and Representational Costs of Unhelpful Training Data
Evaluating the Inclusion of Historical Document Scans in a Casual Cat-Detector Model
Neural Network Capacity and Irrelevant Training Data