Essay

Choosing and Operationalizing a Sentiment Classifier Under Real Production Constraints

You lead an NLP team that must deploy a polarity classifier for customer feedback (labels: Positive, Negative, Neutral) into a regulated product where (a) outputs must be one of the three labels only (no extra text), (b) the system must be auditable and stable across weekly model updates, and (c) you have 8,000 labeled examples but also need a fast proof-of-value in 2 weeks.

Write a recommendation memo (as if to an engineering manager) that proposes an end-to-end approach and justifies it by explicitly comparing: (1) a single-text classifier built with a BERT-style encoder using the [CLS] representation plus a prediction head, versus (2) classification via prompt completion using a text-generation LLM.
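To make the comparison concrete, here is a minimal sketch (pure Python, hypothetical names) of why the encoder-plus-head approach is closed-set by construction: the head produces exactly three logits, so an argmax can only ever yield one of the three labels, with no output parsing required.

```python
# Hypothetical sketch: a 3-way classification head's decision step.
# In the real system the logits would come from a linear layer applied
# to the encoder's [CLS] representation; here they are stand-in numbers.
LABELS = ["Positive", "Negative", "Neutral"]

def classify_from_logits(logits):
    """Argmax over exactly len(LABELS) logits, so the result is always a valid label."""
    assert len(logits) == len(LABELS)
    best = max(range(len(logits)), key=lambda i: logits[i])
    return LABELS[best]

print(classify_from_logits([0.2, 1.7, -0.4]))  # -> "Negative"
```

The structural point: the label constraint is enforced by the output layer's shape, not by post-processing, which is one of the auditability arguments the memo can make for option (1).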

Your memo must explain how you would guarantee that the final output is always one of {Positive, Negative, Neutral} in the prompt-completion approach, including a concrete label-mapping strategy and how you would handle non-literal completions such as “This review is mostly satisfied but with minor issues.” Explain how that requirement differs under the BERT approach. Conclude with the key tradeoffs you are accepting (time-to-ship, accuracy, auditability, and failure modes) and the conditions under which you would switch from one approach to the other after the initial launch.
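One possible label-mapping strategy for the prompt-completion approach can be sketched as a two-stage normalizer: first look for a literal label at the start of the completion, then fall back to a cue-word vote for non-literal outputs, with a safe default when nothing matches. This is an illustrative sketch only; the function name and the cue lexicon are hypothetical and would be tuned on held-out data.

```python
import re

LABELS = ["Positive", "Negative", "Neutral"]

# Hypothetical cue lexicon for non-literal completions; in practice this
# would be built and validated against the 8,000 labeled examples.
CUES = {
    "Positive": ["positive", "satisfied", "happy", "great"],
    "Negative": ["negative", "dissatisfied", "unhappy", "complaint"],
    "Neutral":  ["neutral", "mixed", "okay"],
}

def map_to_label(completion, default="Neutral"):
    """Normalize a free-text LLM completion to one of the three allowed labels."""
    text = completion.lower()
    # Stage 1: literal label at the start of the completion.
    for label in LABELS:
        if re.match(rf"\s*{label.lower()}\b", text):
            return label
    # Stage 2: cue-word vote for non-literal outputs.
    scores = {lbl: sum(cue in text for cue in cues) for lbl, cues in CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default

print(map_to_label("This review is mostly satisfied but with minor issues."))  # -> "Positive"
```

Because stage 2 can misfire (e.g. negation: “not satisfied”), a production version would log every completion that reaches the fallback path, which also gives the audit trail the memo needs; by contrast, the BERT head needs no such normalizer at all.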

Updated 2026-02-06

Tags: Ch.3 Prompting - Foundations of Large Language Models; Foundations of Large Language Models Course; Computing Sciences
