Learn Before
A model calculates a bucket index b(d) for a large relative position offset d using the following formula, where n_b is the total number of buckets and dist_max is a maximum distance:
b(d) = (n_b/2) + floor( (log(d) - log(n_b/2)) / (log(dist_max) - log(n_b/2)) * (n_b/2) )
True or False: This formula establishes a linear relationship between the offset d and the bucket index b(d), meaning that as d increases, the bucket index b(d) increases at a constant rate.
0
1
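One way to explore the question numerically is to evaluate the formula at a few offsets. The sketch below is only an illustration: the function name bucket_index is hypothetical, and the default values n_b = 32 and dist_max = 128 are borrowed from the related exercise on this page rather than fixed by the formula itself.

```python
import math

def bucket_index(d, n_b=32, dist_max=128):
    """Bucket index for a large offset d, following the formula in the question.

    Assumes n_b/2 <= d < dist_max; the defaults are illustrative values only.
    """
    half = n_b // 2
    # Position of d between n_b/2 and dist_max, measured on a log scale
    scaled = (math.log(d) - math.log(half)) / (math.log(dist_max) - math.log(half)) * half
    return half + math.floor(scaled)

# Offsets that double at each step
for d in (16, 32, 64):
    print(d, bucket_index(d))
```

With these assumed values, going from d = 16 to d = 32 and from d = 32 to d = 64 each raises the index by the same number of buckets, even though the second step covers twice as many positions.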
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
dist_max Parameter in T5 Bias
A model uses logarithmic bucketing to handle large relative position offsets. The bucket index b(d) for a given distance d is calculated using the formula below. Given a system with 32 total buckets (n_b = 32), a maximum distance of 128 (dist_max = 128), and a specific offset d = 64, what is the resulting bucket index?
b(d) = (n_b/2) + floor( (log(d) - log(n_b/2)) / (log(dist_max) - log(n_b/2)) * (n_b/2) )
(Note: Use the natural logarithm for all log operations.)
Analyzing Parameter Impact on Logarithmic Bucketing