Deep learning train/dev/test split
Data fed into deep learning algorithms should be split so that the large majority of it (well above the traditional 70%) goes to the training set. This differs from classic machine learning practice, where roughly 70% of the data was given to the training set. The difference comes from the fact that far more data is now available, and deep learning algorithms keep improving as more data is added, whereas the improvement of traditional machine learning algorithms levels off after a certain amount of data has been fed in.
For example, with a dataset of 1 million examples, you might decide that 10k examples are enough to evaluate which algorithm is better, giving a split of 98% train, 1% dev, 1% test.
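The 98/1/1 split above can be sketched with a small helper. This is a minimal illustration, not a prescribed implementation: the function name and default fractions are assumptions, and in practice you might use a library utility instead.

```python
import numpy as np

def train_dev_test_split(n_examples, dev_frac=0.01, test_frac=0.01, seed=0):
    """Shuffle example indices and carve off small dev/test sets,
    leaving the bulk (98% by default) for training."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_examples)
    n_dev = int(n_examples * dev_frac)
    n_test = int(n_examples * test_frac)
    dev = idx[:n_dev]                    # 1% for the dev set
    test = idx[n_dev:n_dev + n_test]     # 1% for the test set
    train = idx[n_dev + n_test:]         # remaining 98% for training
    return train, dev, test

train, dev, test = train_dev_test_split(1_000_000)
print(len(train), len(dev), len(test))  # 980000 10000 10000
```

Shuffling before splitting matters: if the data is ordered (e.g. by class or by collection date), a contiguous split would give dev/test sets that are not representative of the training distribution.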