Learn Before
Model Parameterization by θ
A machine learning model, such as a Large Language Model, is a function whose behavior is governed by a set of tunable parameters. These parameters are collectively represented by the symbol θ. The process of training involves finding the optimal values for θ that enable the model to perform its designated task.
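The idea above can be sketched in a few lines of Python. This is a minimal, illustrative example (the function, parameter names, and values are assumptions, not part of the original text): the model is a function of both an input x and a parameter set θ, and changing θ changes the model's behavior for the same input.

```python
# Minimal sketch: a model is a function f(x; theta) whose behavior is
# governed by its tunable parameters theta. All names are illustrative.

def model(x, theta):
    """A tiny linear model; theta = (weight, bias)."""
    weight, bias = theta
    return weight * x + bias

# The same input under two different parameter settings gives two
# different outputs; training is the search for the best theta.
theta_a = (2.0, 0.0)
theta_b = (0.5, 1.0)
print(model(3.0, theta_a))  # 6.0
print(model(3.0, theta_b))  # 2.5
```

In a Large Language Model the same principle holds, only θ contains billions of values rather than two.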
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Probability Distribution Formula for an Encoder-Softmax Language Model
Auto-Regressive Generation Process
Formal Definition of LLM Inference
Model Parameterization by θ
A language model built with a deep neural network is given the input sequence 'The cat sat on the'. The model's vocabulary consists of the following tokens: {a, cat, hat, mat, on, sat, the}. What does the model produce as its immediate, direct output to predict the very next token?
Analyzing Language Model Outputs
Explaining Language Model Output Behavior
Learn After
A simple predictive model is defined by the function
output = (weight_1 * input_1) + (weight_2 * input_2) + bias. During the training process, the model adjusts its internal values to better predict the output based on the inputs. Which components of this function represent the model's tunable parameters (collectively denoted as θ)?
Effect of Training on Model Parameters
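The distinction the question is probing can be made concrete with a short sketch. This is an assumed illustration (the training loop, learning rate, and data values are not from the original text): the weights and the bias are the components of θ that a training step updates, while the inputs are data and stay fixed.

```python
# Illustrative sketch of the question's model:
#   output = (weight_1 * input_1) + (weight_2 * input_2) + bias
# The tunable parameters theta are weight_1, weight_2, and bias;
# the inputs are data, not parameters.

def predict(inputs, theta):
    w1, w2, b = theta
    x1, x2 = inputs
    return w1 * x1 + w2 * x2 + b

def train_step(inputs, target, theta, lr=0.1):
    """One gradient-descent step on squared error: only theta changes."""
    w1, w2, b = theta
    x1, x2 = inputs
    error = predict(inputs, theta) - target
    # Gradients of 0.5 * error**2 with respect to each parameter.
    return (w1 - lr * error * x1,
            w2 - lr * error * x2,
            b - lr * error)

theta = (0.0, 0.0, 0.0)          # initial parameters
inputs, target = (1.0, 2.0), 3.0  # fixed training data
for _ in range(50):
    theta = train_step(inputs, target, theta)
print(predict(inputs, theta))  # approaches the target 3.0
```

Notice that the loop rebinds only `theta`; `inputs` never changes, which is exactly why the weights and bias, and not the inputs, constitute θ.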
Definition of Student's Probability Distribution ()
Analysis of Model Specialization