1Cademy - A development team is building a large-scale language model and has a fixed budget for the computational resources required for training. They observe that their current model, which has a moderately complex architecture, stops improving its performance even when they continue training it on their existing large dataset. To achieve a significant leap in the models capabilities, which of the following approaches represents the most effective use of their limited computational budget?

Project Alpha: Aims to train a model on a dataset ten times larger than any previously used, using a well-established architecture that has known limitations with very long text inputs.
Project Beta: Aims to develop a novel model architecture capable of processing entire books as a single input, but due to the experimental nature and computational cost of this new design, i

Learn Before

Core Topics in LLM Development and Scaling

Multiple Choice

A development team is building a large-scale language model and has a fixed budget for the computational resources required for training. They observe that their current model, which has a moderately complex architecture, stops improving its performance even when they continue training it on their existing large dataset. To achieve a significant leap in the model's capabilities, which of the following approaches represents the most effective use of their limited computational budget?

Updated 2025-10-06

Contributors are:

Who are from:

Learn Before

Related