Essay

LLM Architecture Selection for a Legal Tech Application

A legal technology firm is developing a tool to summarize and analyze legal contracts, which often exceed 50,000 tokens in length. The firm is considering two pre-trained language models:

  • Model A: A standard model known for its state-of-the-art performance on general language tasks. Its internal mechanisms have computational and memory requirements that grow quadratically as the input length increases.
  • Model B: A newer model with a modified internal structure designed to process long inputs more efficiently. Its memory usage scales more favorably with input length, but this modification leads to a minor, measurable decrease in performance on standard benchmark tests compared to Model A.

Evaluate the trade-offs between these two models for the firm's specific application. Which model would you recommend, and why? Justify your decision by explaining the underlying architectural challenge that Model B is designed to solve.
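
To make the scaling difference concrete, the following is a rough back-of-the-envelope sketch rather than a description of either model. It assumes Model A's quadratic cost comes from an n x n attention score matrix stored at 2 bytes per value, counted for a single head in a single layer. Because the prompt does not say how Model B achieves its better scaling, the sketch assumes, purely for illustration, a fixed local window of 128 tokens per position. The function names and the window size are hypothetical. Optimized attention kernels can avoid materializing the full matrix, so the quadratic figure is an upper-bound illustration of memory, though the amount of computation still grows quadratically.

```python
def full_score_bytes(n_tokens: int, bytes_per_value: int = 2) -> int:
    """Bytes for one full n x n score matrix (one head, one layer).

    Assumption: the matrix is fully materialized at 2 bytes per value.
    """
    return n_tokens * n_tokens * bytes_per_value


def windowed_score_bytes(n_tokens: int, window: int = 128,
                         bytes_per_value: int = 2) -> int:
    """Bytes when each token scores only a fixed window of neighbors.

    Assumption: a hypothetical stand-in for Model B; the window size of
    128 is illustrative and does not come from the prompt.
    """
    return n_tokens * window * bytes_per_value


for n in (5_000, 50_000, 100_000):
    full_gb = full_score_bytes(n) / 1e9
    windowed_gb = windowed_score_bytes(n) / 1e9
    print(f"{n:>7} tokens: full {full_gb:8.3f} GB, windowed {windowed_gb:8.4f} GB")
```

Under these assumptions, doubling the input length quadruples the full-matrix figure but only doubles the windowed one, which is the gap that matters most at contract lengths of 50,000 tokens and above.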

Updated 2025-09-26

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science