Learn Before
Example of Context-Based Prediction: Kitten Chasing Ball
In context-based prediction, a model generates text conditioned on a given context. For instance, if the model is provided with the context [C] The kitten is, its objective is to predict the subsequent text to form a complete sentence, such as [C] The kitten is chasing the ball. The target prediction can also be represented as a plain sequence of words or as an indexed sequence, like chasing₁ the₂ ball₃ ., where the subscript numbers indicate the order in which each word is predicted.
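The idea above can be sketched with a toy greedy decoder. The probability table below is an assumption for illustration only, not a trained model: it simply maps the most recent word of the context to candidate next words.

```python
# Toy next-word probability table (hypothetical values, not a real model).
next_word_probs = {
    "is": {"chasing": 0.6, "sleeping": 0.4},
    "chasing": {"the": 0.9, "a": 0.1},
    "the": {"ball": 0.7, "mouse": 0.3},
}

def complete(context, steps=3):
    """Greedily extend the context one word at a time."""
    words = context.split()
    for _ in range(steps):
        candidates = next_word_probs.get(words[-1])
        if not candidates:
            break
        # Greedy decoding: always pick the highest-probability next word.
        words.append(max(candidates, key=candidates.get))
    return " ".join(words)

print(complete("The kitten is"))  # → The kitten is chasing the ball
```

A real language model conditions on the whole context rather than just the last word, but the word-by-word prediction loop is the same.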
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of Predicting Masked Words: Kitten Playing
Example of Masked Language Modeling: Kitten Chasing Ball
In a sequence-to-sequence model, an attention mechanism calculates a score for three input vectors (A, B, and C) relative to a single output vector (D). The scoring function is the simple dot product between the output vector and each input vector. You are given the following geometric relationships:
- Vector A points in a very similar direction to Vector D.
- Vector B is orthogonal (at a 90-degree angle) to Vector D.
- Vector C points in the opposite direction of Vector D.
Which input vector will receive the highest attention score, and what is the underlying reason for this?
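The geometry in the question can be checked numerically. The concrete vectors below are assumptions chosen only to match the stated relationships (A roughly parallel to D, B orthogonal to D, C opposite to D); any vectors with those properties would give the same ranking.

```python
def dot(u, v):
    """Dot-product scoring function from the question."""
    return sum(x * y for x, y in zip(u, v))

# Output vector D and three hypothetical input vectors.
D = [1.0, 0.0]
inputs = {
    "A": [0.9, 0.1],   # points in a very similar direction to D
    "B": [0.0, 1.0],   # orthogonal to D
    "C": [-1.0, 0.0],  # points opposite to D
}

scores = {name: dot(v, D) for name, v in inputs.items()}
best = max(scores, key=scores.get)
print(best, scores)  # A gets the highest score; B scores 0; C is negative
```

Because the dot product grows with directional alignment, Vector A receives the highest attention score.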
Evaluating Attention Mechanisms in Machine Translation
Calculating a Dot Attention Score
Learn After
Examples of Masked Prediction Tasks
A language model is designed to generate the most probable sequence of words based on a given text. If the model is provided with the input 'After a long day of hiking in the mountains, the tired traveler sat down by the campfire and...', which of the following continuations would the model most likely generate?
A language model is tasked with completing a sentence. Given the initial context 'After the storm, a brilliant rainbow', arrange the following words into the most logical sequence that the model would generate one word at a time to complete the thought.
Analyzing a Language Model's Prediction Error