Learn Before
Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks
Mahabadi, R. K., Ruder, S., Dehghani, M., & Henderson, J. (2021). Parameter-efficient multi-task fine-tuning for transformers via shared hypernetworks. arXiv preprint arXiv:2106.04489.
Tags
Data Science
Related
Delta Tuning
Instruction Fine-Tuning
Selecting an Efficient Fine-Tuning Strategy
A research lab needs to adapt a single, very large pre-trained language model (100B+ parameters) to 50 highly specialized downstream tasks. The lab's primary constraint is minimizing storage and computational cost: creating and storing 50 fully fine-tuned copies of the model is not feasible. Which fine-tuning strategy would be the most effective solution to this specific problem?
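For reference, below is a minimal sketch (assuming PyTorch) of the kind of parameter-efficient approach the scenario points toward, in the spirit of the cited paper: the large base model stays frozen and stored once, while a small shared hypernetwork generates per-task adapter weights from learned task embeddings. All class names, dimensions, and the simplified adapter placement are illustrative assumptions, not the paper's exact implementation.

import torch
import torch.nn as nn

D_MODEL, D_ADAPTER, D_TASK = 768, 64, 32   # hidden size, adapter bottleneck, task-embedding size (illustrative)
NUM_TASKS = 50                             # e.g. the 50 downstream tasks in the scenario

class AdapterHypernetwork(nn.Module):
    """Generates down- and up-projection adapter weights for a task from its task embedding."""
    def __init__(self):
        super().__init__()
        self.task_embeddings = nn.Embedding(NUM_TASKS, D_TASK)
        # One linear generator per (flattened) weight matrix, shared across all tasks.
        self.gen_down = nn.Linear(D_TASK, D_MODEL * D_ADAPTER)
        self.gen_up = nn.Linear(D_TASK, D_ADAPTER * D_MODEL)

    def forward(self, task_id: torch.Tensor):
        z = self.task_embeddings(task_id)                    # (D_TASK,)
        w_down = self.gen_down(z).view(D_ADAPTER, D_MODEL)   # down-projection weights
        w_up = self.gen_up(z).view(D_MODEL, D_ADAPTER)       # up-projection weights
        return w_down, w_up

def adapter_forward(hidden, w_down, w_up):
    """Bottleneck adapter with a residual connection; the surrounding transformer stays frozen."""
    return hidden + torch.relu(hidden @ w_down.T) @ w_up.T

# Usage: only the shared hypernetwork is trained; the 100B-parameter base
# model would remain frozen and be stored a single time.
hypernet = AdapterHypernetwork()
hidden_states = torch.randn(4, 128, D_MODEL)     # (batch, seq_len, d_model)
w_down, w_up = hypernet(torch.tensor(7))         # adapter weights for task 7
out = adapter_forward(hidden_states, w_down, w_up)
print(out.shape)                                 # torch.Size([4, 128, 768])
trainable = sum(p.numel() for p in hypernet.parameters())
print(f"trainable parameters shared across all tasks: {trainable:,}")

In this setup the per-task storage cost is only a small task embedding (plus the shared hypernetwork), rather than a full fine-tuned copy of the model, which is what makes the strategy attractive under the constraints described above.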
A development team is exploring different methods to adapt a large pre-trained language model for various applications. Match each of the following scenarios with the most appropriate fine-tuning strategy.