Transferring knowledge of a PTM to downstream NLP tasks
- Choose an appropriate pre-training task, model architecture, and corpus. Currently, language modeling is the most popular pre-training task and can efficiently solve a wide range of NLP problems. However, different pre-training tasks have their own biases and transfer with different effectiveness to different downstream tasks. Besides, the data distribution of the downstream task should be close to that of the corpus the PTM was trained on.
- Choose appropriate layers. In a deep pre-trained model, different layers capture different kinds of information, such as part-of-speech tags, syntactic parses, long-range dependencies, semantic roles, and coreference; the first sketch after this list shows how to extract representations from a chosen layer.
- There are two common ways of model transfer: feature extraction, where the pre-trained parameters are frozen, and fine-tuning, where the pre-trained parameters are unfrozen and updated on the downstream task. In the feature-extraction approach, the pre-trained model is treated as an off-the-shelf feature extractor; it is then important to expose its internal layers, as they typically encode the most transferable representations. The second sketch below contrasts the two approaches.
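To make the layer-choice point concrete, here is a minimal sketch that pulls out per-layer hidden states so a downstream task can probe or pool whichever layer transfers best. It assumes the Hugging Face transformers library with PyTorch, `bert-base-uncased` as the PTM, and layer 9 as the layer of interest; all of these are illustrative choices, not part of the original text.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Illustrative choices: bert-base-uncased as the PTM, layer 9 as the probe layer.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

inputs = tokenizer("PTMs encode transferable knowledge.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.hidden_states is a tuple of 13 tensors for bert-base:
# the embedding output plus one per Transformer layer,
# each of shape (batch, seq_len, hidden_size).
layer_repr = outputs.hidden_states[9]
sentence_vec = layer_repr.mean(dim=1)  # simple mean pooling over tokens
```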
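The second sketch contrasts the two transfer modes under the same assumptions (PyTorch plus transformers); the `SentimentClassifier` class, its three-class head, and the learning rates are hypothetical examples, not anything prescribed by the text.

```python
import torch
from torch import nn
from transformers import AutoModel

class SentimentClassifier(nn.Module):
    """Hypothetical 3-class sentiment head on top of a PTM."""

    def __init__(self, freeze_encoder: bool = True):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("bert-base-uncased")
        if freeze_encoder:
            # Feature extraction: the PTM's parameters stay frozen.
            for p in self.encoder.parameters():
                p.requires_grad = False
        self.head = nn.Linear(self.encoder.config.hidden_size, 3)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] token representation
        return self.head(cls)

# Feature extraction: only the new head is optimized.
clf = SentimentClassifier(freeze_encoder=True)
optimizer = torch.optim.AdamW(
    (p for p in clf.parameters() if p.requires_grad), lr=1e-3
)

# Fine-tuning instead: pass freeze_encoder=False and use a much smaller
# learning rate (e.g. 2e-5), since all pre-trained parameters are updated.
```

In practice, feature extraction is cheaper (fewer trainable parameters, smaller memory footprint), while fine-tuning usually reaches higher downstream accuracy when enough labeled data is available.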