Essay

Designing a Cost-Constrained Automated Prompt Optimization Pipeline

You lead an applied ML team deploying an LLM to classify incoming customer-support tickets into 12 categories. The business requires higher reliability than a single prompt provides, but you have a strict inference budget: at most 2 LLM calls per ticket in production. You are allowed an offline optimization phase using a labeled validation set of 5,000 tickets, and you can run up to 50,000 total LLM calls during offline experimentation.

Write a proposal for an automated prompt design approach that:

(a) frames prompt optimization explicitly as a search problem (define the search space, search strategy, and performance estimation);

(b) uses an iterative LLM-based prompt search loop (evaluation → pruning → expansion) to discover strong candidate prompts;

(c) incorporates an evolutionary computation idea (e.g., mutation/crossover/selection) to avoid stagnation and improve exploration; and

(d) ends with a prompt ensembling strategy that respects the 2-call production limit (explain how you will choose, weight, and aggregate outputs, and why this improves reliability).
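To make the expected shape of (a)–(c) concrete, here is a minimal sketch of an evaluation → pruning → expansion loop with evolutionary selection, mutation, and crossover under a fixed offline call budget. The helpers `evaluate`, `mutate`, and `crossover` are hypothetical placeholders for LLM-backed functions, not part of the exercise; a real answer would define them (e.g., `evaluate` samples validation tickets, `mutate` asks an LLM to rewrite a prompt).

```python
import random

# Hypothetical LLM-backed helpers (assumptions, not given by the exercise):
#   evaluate(prompt, n)  -> accuracy on n sampled validation tickets (costs n calls)
#   mutate(prompt)       -> an LLM-rewritten variant of the prompt (costs 1 call)
#   crossover(a, b)      -> a prompt combining instructions from both (costs 1 call)

def search(seed_prompts, evaluate, mutate, crossover,
           budget=50_000, beam=8, eval_batch=200, generations=20):
    """Evaluation -> pruning -> expansion loop under a total LLM-call budget."""
    calls = 0
    pool = list(seed_prompts)
    scored = []
    for _ in range(generations):
        # Evaluation: score every candidate on a fresh validation sample,
        # stopping early if the next batch would exceed the offline budget.
        scored = []
        for p in pool:
            if calls + eval_batch > budget:
                return sorted(scored, reverse=True)
            scored.append((evaluate(p, eval_batch), p))
            calls += eval_batch
        # Pruning: keep only the top-`beam` candidates (selection pressure).
        scored.sort(reverse=True)
        survivors = [p for _, p in scored[:beam]]
        # Expansion: mutation and crossover inject diversity to avoid stagnation.
        children = [mutate(random.choice(survivors)) for _ in range(beam // 2)]
        children += [crossover(*random.sample(survivors, 2)) for _ in range(beam // 2)]
        calls += len(children)
        pool = survivors + children
    return sorted(scored, reverse=True)  # (score, prompt) pairs, best first
```

Evaluating on a fresh sample each generation (rather than a fixed subset) is one way to trade variance in the performance estimate for reduced overfitting to the validation set.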

In your answer, justify key tradeoffs (exploration vs. exploitation, offline cost vs. expected production gains, diversity vs. overfitting to the validation set) and specify at least two concrete stopping conditions and two failure modes you would monitor for (with mitigations).
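For part (d), the 2-call constraint forces a small ensemble at inference time. One sketch, assuming a hypothetical `classify(prompt, ticket)` call that returns one of the 12 category labels: run the two strongest complementary prompts found offline and break disagreements with validation-derived weights.

```python
# Hypothetical helper (an assumption): classify(prompt, ticket) -> category label,
# costing exactly one LLM call.
def ensemble_predict(ticket, classify, prompt_a, prompt_b, weight_a, weight_b):
    """Two-call ensemble: weighted vote between two offline-selected prompts.

    weight_a / weight_b would come from each prompt's validation accuracy;
    when the prompts disagree, the higher-weighted prompt's label wins."""
    label_a = classify(prompt_a, ticket)  # production call 1
    label_b = classify(prompt_b, ticket)  # production call 2
    if label_a == label_b:
        return label_a  # agreement: high-confidence prediction
    return label_a if weight_a >= weight_b else label_b
```

Agreement between two diverse prompts also yields a free confidence signal: disagreements can be logged and routed for review, which is one reason the ensemble improves reliability beyond raw accuracy.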


Updated 2026-02-06

Ch.3 Prompting - Foundations of Large Language Models