Essay

Choosing a Search-and-Ensemble Strategy for a Regulated LLM Workflow

You lead an applied AI team deploying an LLM to draft short, regulator-facing incident summaries from internal event logs. Constraints: (1) you have a fixed budget of 2,000 LLM calls per week for all optimization and production, (2) outputs must be stable across weekly model version updates, and (3) the business will only accept a solution if you can explain why it is reliable (not just that it scored well once).

Propose an end-to-end approach that combines:

(a) automated prompt design framed explicitly as a search problem (define your search space, search strategy, and performance estimation);

(b) an iterative LLM-based prompt search loop (evaluation–pruning–expansion) or an evolutionary computation approach (population, selection, variation operators); and

(c) prompt ensembling at inference time (how many prompts, how you ensure diversity, and how you aggregate outputs).
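To make the evaluation–pruning–expansion loop in (b) concrete, here is a minimal sketch. Everything in it is an illustrative assumption rather than part of the assignment: `score` stands in for a budgeted LLM-based performance estimate (faked deterministically here so the sketch runs offline), `mutate` stands in for an LLM paraphrase/edit operator, and the beam width and round count are arbitrary.

```python
import random

random.seed(0)

def score(prompt, budget):
    """Placeholder performance estimate. In practice this would spend LLM
    calls scoring the prompt on a dev set of event logs; here we fake a
    deterministic score from the prompt text."""
    budget["calls"] += 1  # each evaluation counts against the weekly budget
    return (sum(ord(c) for c in prompt) % 100) / 100

def mutate(prompt):
    """Placeholder variation operator. In practice an LLM would paraphrase
    or edit the prompt; here we just append an instruction fragment."""
    edits = [" Be concise.", " Cite the event IDs.", " Use a neutral tone."]
    return prompt + random.choice(edits)

def search(seed_prompts, beam=2, rounds=3):
    """Evaluation-pruning-expansion loop with a simple beam."""
    budget = {"calls": 0}
    pool = list(seed_prompts)
    for _ in range(rounds):
        ranked = sorted(pool, key=lambda p: score(p, budget), reverse=True)
        pool = ranked[:beam]                  # pruning: keep the top `beam`
        pool += [mutate(p) for p in pool]     # expansion: one child per survivor
    best = max(pool, key=lambda p: score(p, budget))
    return best, budget["calls"]

best, calls_used = search(["Summarize the incident.",
                           "Draft a regulator-facing summary."])
print(calls_used)  # evaluations this search consumed
```

Tracking the call counter inside the loop is the point of the sketch: whatever search strategy you choose, its spend must be accounted against the same 2,000-call budget that production draws from.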

In your answer, justify the key tradeoffs you make between: exploration vs. exploitation in the search, single-best prompt vs. ensemble reliability, and optimization spend vs. production spend under the 2,000-call budget. Conclude with a concrete stopping condition and a plan for monitoring/refreshing prompts after model updates without restarting from scratch.
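The optimization-vs-production tradeoff can be made arithmetic. The numbers below are illustrative assumptions (a 5-prompt ensemble, 300 summaries per week, one call per ensemble member per summary, no extra aggregation call), not part of the prompt:

```python
WEEKLY_BUDGET = 2000  # fixed cap on all LLM calls, optimization + production

def budget_split(ensemble_size, weekly_summaries):
    """Return (production_calls, optimization_calls) under the fixed budget,
    assuming one LLM call per ensemble member per summary."""
    production = ensemble_size * weekly_summaries
    if production > WEEKLY_BUDGET:
        raise ValueError("ensemble too large for the production load")
    return production, WEEKLY_BUDGET - production

# e.g. a 5-prompt ensemble over 300 weekly summaries:
prod, opt = budget_split(5, 300)
print(prod, opt)  # 1500 500
```

Under these assumptions, every extra ensemble member costs 300 production calls per week, which is exactly the search budget you give up: a useful framing when justifying single-best prompt vs. ensemble reliability.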

Updated 2026-02-06

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Data Science
