Stabilizing an LLM Workflow for Multi-Step Policy Compliance Decisions
You are deploying an internal LLM assistant to help procurement analysts decide whether a proposed vendor contract requires (a) a standard review, (b) an enhanced review, or (c) an automatic rejection. The decision depends on multiple interdependent rules (e.g., data types handled, cross-border transfers, subcontractors, and exception clauses). In pilot testing, a single prompt that asks for the final decision often misses a key condition; a zero-shot “Let’s think step by step” prompt sometimes produces a long rationale but forgets to clearly state the final decision; and when you add a few demonstrations, the model becomes more consistent but still makes occasional early-step mistakes that cascade into the wrong outcome.
Design a prompting workflow (you may describe it as a sequence of LLM calls) that uses: (1) in-context learning via demonstrations, (2) explicit problem decomposition in a least-to-most progression, and (3) an iterative self-refinement loop. Your answer must explain how information flows from one step to the next, how you would prevent or detect cascading errors from early steps, and how you would ensure the model always outputs an unambiguous final decision label (a/b/c) even when using chain-of-thought style reasoning.

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.3 Prompting - Foundations of Large Language Models
Ch.5 Inference - Foundations of Large Language Models
Ch.1 Pre-training - Foundations of Large Language Models
Related
Application of CoT Prompting on GSM8K Benchmark
Structuring Logical Reasoning Steps for Demonstrations
Zero-Shot Chain-of-Thought (CoT) Prompting
Application of CoT to Algebraic Calculation Problems
Benefits of Chain-of-Thought (CoT) Prompting
Incomplete Answers from Zero-Shot CoT Prompts
Chain-of-Thought as a Search Process
Supervising Intermediate Reasoning Steps for LLM Alignment
Limitations of Simple Chain-of-Thought Prompting
Creating a CoT Prompt by Incorporating Reasoning Steps
Alternative Trigger Phrases for Zero-Shot CoT Prompting
Incomplete Answers as a Potential Issue in Zero-Shot CoT Prompting
A developer is trying to improve a language model's ability to solve multi-step word problems. They compare two prompting strategies.
Strategy 1: Provide the model with a new word problem and ask for the final answer directly.
Strategy 2: Provide the model with a new word problem, but first show it an example of a similar problem where the solution is explicitly broken down into logical, sequential steps before reaching the final conclusion.
Why is Strategy 2 generally more effective for improving the model's reasoning on complex tasks?
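The contrast between the two strategies can be made concrete as two prompt strings; the word problems below are illustrative, not from the source.

```python
problem = ("A store sells pens at $2 each. Ben buys 4 pens and pays "
           "with a $10 bill. How much change does he get?")

# Strategy 1: ask for the final answer directly.
strategy_1 = f"Q: {problem}\nA:"

# Strategy 2: prepend one worked demonstration whose solution is broken
# into explicit sequential steps, so the model imitates that pattern
# (few-shot chain-of-thought prompting).
strategy_2 = (
    "Q: A baker makes 3 cakes a day for 5 days and sells 12 of them. "
    "How many cakes are left?\n"
    "A: Step 1: 3 cakes/day x 5 days = 15 cakes. "
    "Step 2: 15 - 12 = 3 cakes left. The answer is 3.\n\n"
    f"Q: {problem}\nA:"
)
```

Both prompts end at `A:`, but the demonstration in `strategy_2` conditions the model to emit intermediate steps before its conclusion.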
Improving a Prompt for a Multi-Step Problem
Few-Shot Chain-of-Thought (CoT) Prompting
Practical Limitations of Chain-of-Thought Prompting
The primary benefit of a prompting technique that demonstrates a step-by-step reasoning process is that it permanently modifies the language model's internal weights, making it inherently better at solving similar problems in the future, even without the detailed prompt.
Designing a Prompting Workflow for a High-Stakes, Multi-Step Task
Choosing and Justifying a Prompting Strategy Under Context and Quality Constraints
Diagnosing and Redesigning a Prompting Approach for a Decomposed Workflow
Stabilizing an LLM Workflow for Multi-Step Policy Compliance Decisions
Debugging a Multi-Step LLM Workflow for Contract Clause Risk Triage
Designing a Robust Prompting Workflow for Multi-Step Root-Cause Analysis with Limited Examples
You’re building an internal LLM assistant to help ...
Your team is rolling out an internal LLM assistant...
You’re leading an internal enablement team buildin...
You’re building an internal LLM workflow to produc...
Example of One-Shot Chain-of-Thought (CoT) Prompting
Problem-Solving Scenarios for Chain-of-Thought Prompting
Self-Consistency Method
Sub-problem Generation in Least-to-Most Prompting
Improving Least-to-Most Prompting with Advanced Techniques
Improving Problem Decomposition in Least-to-Most Prompting
An AI developer needs a large language model to solve a complex, multi-step logic puzzle that requires deducing a final answer from a series of interdependent clues. Initial attempts to solve the puzzle by providing the full puzzle and a few examples of other solved puzzles have consistently failed. Which of the following prompting strategies is the most effective next step, and why?
Analyzing a Problem-Solving Approach
A language model is tasked with solving the following logic puzzle: 'Sarah, David, and Emily are a doctor, a lawyer, and an engineer. The doctor is Emily's sister. David is not the lawyer.' To solve this complex problem, it is broken down into a series of simpler, sequential sub-problems. Arrange the following sub-problems in the correct logical order that builds towards the final solution.
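The puzzle in this item has a checkable ground truth, which a brute-force search makes explicit. The sketch below assumes the usual reading that "Emily's sister" rules out both Emily and David as the doctor.

```python
from itertools import permutations

# Each permutation assigns the three people to (doctor, lawyer, engineer).
people = ["Sarah", "David", "Emily"]
solutions = []
for doctor, lawyer, engineer in permutations(people):
    # Clue 1: the doctor is Emily's sister, so the doctor is neither
    # Emily nor David -- it must be Sarah.
    if doctor != "Sarah":
        continue
    # Clue 2: David is not the lawyer.
    if lawyer == "David":
        continue
    solutions.append({"doctor": doctor, "lawyer": lawyer,
                      "engineer": engineer})

print(solutions)
```

Only one assignment survives both clues, which is why the sequential sub-problem ordering (resolve the doctor first, then the lawyer, then the engineer by elimination) works.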
Example of Final Problem Solving in Least-to-Most Prompting
Example of Self-Refinement in Machine Translation
Three-Step Framework for Self-Refinement in LLMs
Ideal Self-Refinement without Additional Training
Fine-Tuning LLMs for Self-Refinement Tasks
Task-Specific Models as an Alternative for Refinement
Self-Refinement as an LLM Alignment Issue
Self-Reflection in LLMs
A developer is using a large language model to generate a Python function for a complex data analysis task. The developer's workflow is as follows:
- The model generates an initial version of the function.
- The developer then prompts the same model, providing the initial function and asking it to 'act as a senior code reviewer, identify potential bugs or inefficiencies, and explain how to fix them.'
- Based on the model's feedback, a final, improved version of the function is produced.
This iterative process of generating an output, using the model to critique its own output, and then improving it based on that critique is best described as:
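The control flow the developer describes can be sketched generically. The function below is a minimal illustration, not a library API; the `generate`, `critique`, and `refine` callables stand in for LLM calls, and the `"NO ISSUES"` sentinel is an assumed convention for stopping early.

```python
def self_refine(task, generate, critique, refine, max_rounds=3):
    """Generate -> critique -> refine loop: the model drafts an output,
    reviews its own draft, and revises it until the critique passes or
    the round budget runs out."""
    draft = generate(task)
    for _ in range(max_rounds):
        feedback = critique(task, draft)
        if "NO ISSUES" in feedback.upper():
            break
        draft = refine(task, draft, feedback)
    return draft

# Deterministic stubs standing in for LLM calls, to show the control flow:
result = self_refine(
    "parse_csv",
    generate=lambda t: "draft-0",
    critique=lambda t, d: "no issues" if d == "draft-2" else "off-by-one bug",
    refine=lambda t, d, f: f"draft-{int(d.split('-')[1]) + 1}",
)
```

With the stubs above, the loop revises the draft twice and then stops when the critique reports no issues.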
Applying an Iterative Improvement Framework
Product Design as an Analogy for Self-Refinement
Relationship between Self-Refinement and Self-Reflection in LLMs
Comparing Output Improvement Strategies
Divide-and-Conquer Paradigm
Example of a Classification Task for LLMs: Identifying AI Risks in a Document
Approaches to Multi-Step Reasoning in LLMs
Two-Step Problem Decomposition
Dynamic Problem Decomposition for Complex Reasoning
Compositionality in NLP
Outlining as a Method of Problem Decomposition for Generative Tasks
General Framework of Problem Decomposition
A team is using a large language model to automate complex tasks. They decide to implement a strategy where a main problem is broken down into a complete, fixed list of sub-problems before the model begins to solve any of them. The model will then solve each sub-problem in sequence. For which of the following tasks is this pre-defined decomposition approach LEAST likely to succeed?
Evaluating a Problem Decomposition Strategy for Multi-Hop QA
Illustrating the Need for Decomposition in Generative Tasks
Complex Reasoning Problems
Multi-hop Question Answering
A development team is building several applications powered by a large language model. Match each application's primary task with the most suitable strategy for breaking down the problem.
Designing a Decomposition-Driven LLM Workflow for a High-Stakes Corporate Task
Debugging a Decomposition-Based LLM Workflow Using Recursive Sub-Problems and Contextual QA Pairs
Evaluating and Redesigning a Decomposition Workflow Under Context and Cost Constraints
Designing a Decomposition-and-QA-Pair Workflow for Contract Review with Recursive Escalation
Stabilizing a Decomposition-Based LLM Workflow for a Regulated Customer-Email Triage System
Designing a Decomposition Workflow for Root-Cause Analysis of a Production Incident
Create a Recursive, Context-Carrying Decomposition Plan for LLM-Assisted KPI Narrative Generation
You are building an internal LLM assistant to answ...
You are designing an internal LLM workflow to answ...
You’re building an internal LLM workflow to answer...
Psychological Perspective on Problem Decomposition
Tool Use as Problem Decomposition in LLMs
Rationale for Using One-Shot and Few-Shot Learning
Few-Shot Learning
In-Context Learning as an Emergent Ability
Efficiency of In-Context Learning for Model Adaptation
Contribution of In-Context Learning to AI Generalization and Usability
Zero-Shot Learning with LLMs
One-Shot Learning
Factors Influencing In-Context Learning Effectiveness
Understanding the Emergence and Mechanics of In-Context Learning
Theoretical Interpretations of In-Context Learning
Providing Reference Information in Prompts
Instruction Generation in Self-Instruct
One-Shot Chain-of-Thought (CoT) Prompting
Scope of Zero-shot, One-shot, and Few-shot Learning
Few-Shot Learning in Prompting
Comparison of Zero-shot, One-shot, and Few-shot Learning
In-Context Learning as a Guiding Mechanism for LLM Predictions
Calculation Annotation
Final Answer Formatting Token
A developer needs a large language model to translate technical jargon into plain language. They construct a prompt containing several pairs of 'Jargon-to-Plain Language' examples, followed by a new piece of technical text. The model successfully provides a plain language translation for the new text. Which statement best analyzes the fundamental mechanism of this approach?
Evaluating Prompting Strategies for Task Adaptation
Using Demonstrations to Improve LLM Accuracy
In-Context Learning as Knowledge Activation
Differentiating Learning Methods
Example of In-Context Learning
Example of In-Context Learning for Translation
Augmented Input Formula in In-Context Learning
Example of a Zero-Shot CoT Prompt
Comparison of Few-Shot and Zero-Shot CoT Prompting
Alternative Phrases for Triggering Chain-of-Thought Reasoning
A user wants a large language model to solve a multi-step word problem. The model's initial attempts provide only a final, incorrect answer. The user's goal is to modify the prompt to encourage the model to generate a detailed, step-by-step thought process first, which should lead to a more accurate final answer. Crucially, the user does not want to include a complete, solved example of another problem in the prompt. Which of the following prompt modifications best achieves this specific goal?
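The modification this item points at is a zero-shot chain-of-thought trigger: instead of including a solved example, append a reasoning cue to the prompt. A minimal sketch with an illustrative problem:

```python
problem = ("A train travels 60 km in the first hour and 40 km in the "
           "second hour. What is its average speed?")

# Direct prompt: tends to elicit only a final (possibly wrong) number.
direct = f"Q: {problem}\nA:"

# Zero-shot CoT: no solved example, just a trigger phrase that elicits
# step-by-step reasoning before the answer.
zero_shot_cot = f"Q: {problem}\nA: Let's think step by step."
```

The only difference between the two prompts is the trailing trigger phrase, which satisfies the constraint of adding no complete worked example.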
To successfully prompt a language model to generate a step-by-step thought process for a new problem, one must always include a complete, solved example of a similar problem within the prompt.
Structure of a Zero-Shot CoT Prompt for an Arithmetic Task
Identifying a Zero-Shot Reasoning Prompt
Zero-Shot CoT Example with Jack's Apples