Commonsense reasoning is a significant challenge for Large Language Models. Tasks of this type require the model to draw logical inferences from implicit, real-world knowledge that the prompt does not state explicitly, so models often make errors even when the instructions are clear.

Commonsense Reasoning as a Challenging Task for LLMs

A researcher is designing a test to specifically evaluate a Large Language Model's commonsense reasoning capabilities, which rely on implicit, real-world knowledge not explicitly stated in the prompt. Which of the following prompts would be the most effective for this specific purpose?

Consider the following interaction with a Large Language Model. Analyze the model's response and explain why it represents a failure of commonsense reasoning, a known challenge for such systems.

Analysis of a Commonsense Reasoning Failure

An AI assistant is given the following prompt: 'I poured a glass of milk and heated it in the microwave for one minute. I then put a standard-sized ice cube into the hot milk. What will happen to the ice cube?' The AI responds: 'The ice cube will slowly cool the milk down, but it will likely remain mostly solid for a long time.' Evaluate the quality of the AI's response. In your evaluation, explain why this specific type of task is challenging for a language model, referencing the underlying principles of how these models process information.
