Learn Before
Steps of Fine-Tuning in Transfer Learning
Fine-tuning is a common technique in transfer learning that consists of four main steps. First, a neural network, known as the source model, is pre-trained on a large source dataset. Second, a target model is created by copying the architecture and parameters of the source model, excluding the output layer. The output layer is excluded because it is assumed to be closely related to the specific labels of the source dataset, making it inapplicable to the target dataset. Third, a new, randomly initialized output layer is added to the target model to match the number of categories in the target dataset. Finally, the target model is trained on the target dataset; the new output layer is trained from scratch while the parameters of the other layers are fine-tuned.
0
2
Tags
Data Science
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
D2L
Dive into Deep Learning @ D2L
Related
Benefits of Pre-training
Model Training Strategy for Medical Imaging
A research team is developing a model to classify rare plant diseases from a small, specialized dataset of only 500 leaf images. They are considering several training strategies. Which of the following strategies best demonstrates an understanding of how to leverage an initial, general-purpose training phase to overcome the limitation of a small dataset?
Rationale for Pre-training with General Data
Steps of Fine-Tuning in Transfer Learning
Learn After
POPULAR PRE-TRAINED MODELS
Number of Layers to Freeze When Using a Pre-trained Model for Transfer Deep Learning
Implementing Freezing Layers When Using a Pre-trained Model for Transfer Deep Learning
An engineer needs to build a model to classify 15 types of local wildflowers using a custom dataset of only 900 images. They select a very deep and complex neural network that was previously trained on a dataset of over a million general-purpose images (e.g., animals, vehicles, household objects). The engineer's strategy is to retrain all layers of this complex network from scratch, using only their small wildflower dataset. What is the most likely outcome of this strategy?
You are tasked with building an image classifier for a new, specialized task (e.g., identifying specific types of industrial equipment), but you only have a small, custom dataset. You decide to adapt a model that has already been trained on a very large, general image dataset. Arrange the following steps in the correct logical order to implement this strategy.
Adapting a Pre-trained Network for a New Task
Hot Dog Recognition Example in Fine-Tuning
Computational Benefits of Freezing Pretrained Parameters