Learn Before
Challenging Reasoning Tasks for LLMs
Large Language Models (LLMs) struggle with tasks that require arithmetic and commonsense reasoning. These challenges arise because such problems often depend on implicit knowledge or logical deduction that is not explicitly provided in the prompt. Consequently, even with clear and precise instructions, LLMs may generate incorrect answers when the solution requires information beyond what is directly stated.
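To see what "implicit knowledge" means here, consider the kind of state-tracking deduction these problems require. The snippet below is a minimal illustrative sketch (not a model of how an LLM computes): it makes explicit the inference that removing an object from a container changes the container's contents, a step the prompt never states outright.

```python
# The commonsense step left implicit in the prompt, written as explicit
# state tracking. The box starts with two balls.
box = {"red ball", "blue ball"}

# "I take the red ball out and put it on the table." -- the unstated
# consequence is that the box no longer contains the red ball.
box.discard("red ball")

# Answering "What is left in the box?" requires this inferred state.
print(box)  # {'blue ball'}
```

A model that merely echoes the objects mentioned in the prompt, without performing this state update, will answer that both balls are still in the box.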
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.3 Prompting - Foundations of Large Language Models
Related
Challenging Reasoning Tasks for LLMs
Self-Refinement in LLMs
Model Ensembling for Text Generation
Output Ensembling
Retrieval-Augmented Generation (RAG)
LLM Tool Use with External APIs
Evolution of the Concept of Alignment in NLP
Analyze the two scenarios below, each showing an incorrect output from a language model. Which scenario provides the clearest example of a failure caused by the model's lack of implicit knowledge, rather than a simple factual error in its training data?
Analyzing an LLM's Reasoning Failure
Limitations of Pre-trained Knowledge in Standard LLMs
Explaining an LLM's Reasoning Error
Learn After
GSM8K Benchmark
Insufficiency of Simple Demonstrations for LLM Reasoning Tasks
A user gives a language model the following prompt: 'I have a box that contains a red ball and a blue ball. I take the red ball out and put it on the table. What is left in the box?' The model responds: 'The box contains a red ball and a blue ball.' Which of the following best analyzes the likely cause of the model's incorrect answer?
Commonsense Reasoning as a Challenging Task for LLMs
In-Context Learning (ICL)
The Challenge of Multi-Step Logical Inference for LLMs in Arithmetic Reasoning
Language Model Scheduling Error Analysis
Predicting LLM Reasoning Flaws