1Cademy - A team of engineers is evaluating a new language model's reasoning capabilities. They use an assessment method where the model must choose the single correct answer from a set of provided options for each question. Which of the following represents a primary limitation of this evaluation method for gauging the model's genuine comprehension?

Learn Before

Multiple-Choice Question Answering

Multiple Choice

A team of engineers is evaluating a new language model's reasoning capabilities. They use an assessment method where the model must choose the single correct answer from a set of provided options for each question. Which of the following represents a primary limitation of this evaluation method for gauging the model's genuine comprehension?

Updated 2025-09-26

