Checking a Mismatch Hypothesis on Training Dev Subsets
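If both the training set and the training dev set include audio recorded in a car, double-check the system's performance on that subset in each set. If the system does well on car data in the training set but not on car data in the training dev set, that further validates the hypothesis that getting more car data would help. The training dev set comes from the same distribution as the training set but is not trained on, so such a gap suggests the system has not yet generalized from the car examples it has already seen.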
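The subset check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Example` record, its `source` tag, and the helper names `subset_error` and `car_gap` are hypothetical, and the toy counts are invented to show the arithmetic, not measured results.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Example:
    source: str    # hypothetical tag for recording condition, e.g. "car" or "other"
    correct: bool  # whether the system's output matched the label


def subset_error(examples: List[Example], source: str) -> Optional[float]:
    """Error rate on the examples carrying the given source tag (None if there are none)."""
    subset = [ex for ex in examples if ex.source == source]
    if not subset:
        return None
    return sum(not ex.correct for ex in subset) / len(subset)


def car_gap(train: List[Example], train_dev: List[Example],
            source: str = "car") -> Optional[float]:
    """Training dev error minus training error, both restricted to one subset."""
    train_err = subset_error(train, source)
    dev_err = subset_error(train_dev, source)
    if train_err is None or dev_err is None:
        return None  # the subset is missing from one of the sets, so no comparison
    return dev_err - train_err


# Invented toy data: 10 car clips in each set, plus some non-car clips.
train = ([Example("car", True)] * 9 + [Example("car", False)]
         + [Example("other", True)] * 5)
train_dev = ([Example("car", True)] * 6 + [Example("car", False)] * 4
             + [Example("other", True)] * 5)

train_car_err = subset_error(train, "car")    # 1 error out of 10 car clips
dev_car_err = subset_error(train_dev, "car")  # 4 errors out of 10 car clips
gap = car_gap(train, train_dev)
```

A clearly positive gap on the car subset is the pattern the text describes as further validating the hypothesis. How large a gap counts as meaningful is a judgment call, and with a small subset the estimate will be noisy.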
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Related
In a cat detector system with 10,000 user-uploaded images and 200,000 internet images, how should you split the target-distribution data?
True or False: We should use target-distribution training data alongside auxiliary data to help the network learn on it.
Adding target-distribution data to training means it contains data from the _____ distribution.
Match the cat detector data components with their distribution properties.
Order the decision process to allocate voice navigation data.
What is a benefit of having in-car audio in both training and training dev sets?
True or False: A training set from a different distribution than the dev/test set should still be used for learning.
Good performance on training but not on training dev validates the _____ hypothesis.
Match the performance outcomes on in-car audio with their implications.
Order the process of setting up and using a cat detector with mixed data.
Learn After
What does it indicate when a model performs well on car audio in the training set but poorly on car audio in the training dev set?
If both your training set and training dev set contain car audio, you should evaluate your system's performance specifically on that car-audio subset.
If a model does well on car audio in the training set but poorly on car audio in the training dev set, this validates the hypothesis that getting more _____ data would help.
When training and training dev sets both include car-recorded audio, what action should you take to investigate the data mismatch hypothesis?
If a model performs well on car audio in the training set but poorly on car audio in the training dev set, this further validates the hypothesis that getting more car data would help.
If the system does well on car data in the training set but not on car data in the _____, this further validates the mismatch hypothesis.
Match each observation about car-audio subset performance to its implication for the data mismatch hypothesis.
Order the diagnostic steps for using a shared subset (e.g., car audio) to check the data mismatch hypothesis.
What conclusion should you draw if your model achieves high accuracy on car audio in the training set but low accuracy on car audio in the training dev set?
Checking performance on a shared subset in the training and training dev sets can only refute—never further validate—a data mismatch hypothesis.
Ng recommends double-checking the system's performance on the car-audio _____ when both the training and training dev sets include car-recorded audio.
Match each key term in the mismatch hypothesis checking procedure to its correct description.
Order the reasoning steps for deciding whether to collect more car data, starting from suspecting a mismatch to reaching a validated conclusion.
Analyzing Mismatch Hypotheses via Training and Training Dev Subsets
Diagnosing Speech Recognition Performance in Car Audio Subsets
Hypothesis Validation from Training to Training Dev Subset Performance