1Cademy - Optimizing a Spam Detection Model

Learn Before

Probabilistic Model for Text Classification using an Encoder-Classifier Architecture

Case Study

Optimizing a Spam Detection Model

Based on the provided case study, which component of the model is the most likely source of the performance issues on the test data, and what specific action should be taken to address it? Justify your reasoning.

Updated 2025-10-03

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science

A text classification model is designed with two sequential components: an 'encoder' that transforms an input sentence into a numerical vector, and a 'classifier' that uses this vector to predict a category. During evaluation, it is discovered that the model performs poorly. A detailed inspection reveals that semantically opposite sentences, such as 'The movie was brilliant and captivating' and 'The movie was dull and boring', are both being transformed into nearly identical numerical vectors by
Component Roles in a Probabilistic Text Classifier
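
The symptom described in the related question above (semantically opposite sentences mapped to nearly identical vectors) can be checked directly by comparing encoder outputs. The following is a minimal sketch, not the course's method: `cosine_similarity`, `flag_collapsed_pairs`, the 0.95 threshold, and the toy `length_only_encoder` are all hypothetical names and values introduced for illustration. The toy encoder deliberately ignores word meaning so that the check has something to flag; in practice `encode` would be the model's real encoder.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def flag_collapsed_pairs(encode, pairs, threshold=0.95):
    """Return (sentence_a, sentence_b, similarity) for every pair of
    opposite-meaning sentences whose encodings are nearly identical.

    `encode` is any callable mapping a sentence to a list of floats.
    `threshold` is an assumed cutoff, not a value from the source.
    """
    flagged = []
    for a, b in pairs:
        sim = cosine_similarity(encode(a), encode(b))
        if sim >= threshold:
            flagged.append((a, b, sim))
    return flagged


def length_only_encoder(sentence):
    """Hypothetical stand-in encoder: uses only word count and character
    count, so it cannot distinguish sentences by meaning."""
    words = sentence.split()
    return [float(len(words)), float(len(sentence))]


opposite_pairs = [
    ("The movie was brilliant and captivating",
     "The movie was dull and boring"),
]

for a, b, sim in flag_collapsed_pairs(length_only_encoder, opposite_pairs):
    print(f"Encodings nearly identical (cosine {sim:.3f}): {a!r} vs {b!r}")
```

If a check like this flags many opposite-meaning pairs, the classifier is receiving almost the same input for different categories, which suggests inspecting the encoder before adjusting the classifier.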
