Evaluating LLM Response Styles from Tool Output
A user asks a language model, 'What is the capital of Australia?' The model uses a search tool, which returns the following information:
[Search Result]: Canberra is the capital city of Australia.
The model considers two possible final answers to generate for the user:
Answer A: The answer is: Canberra.
Answer B: The capital of Australia is Canberra.
Critically evaluate these two responses. In your analysis, compare their effectiveness and discuss the potential advantages and disadvantages of each style for the end-user.
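To make the comparison concrete, the two styles can be sketched as two small formatting functions applied to the same tool output. This is only an illustrative sketch: the function names and the naive string-splitting extraction are assumptions made for this example, not how any particular model or framework produces its final answer.

```python
# Tool output from the scenario above.
SEARCH_RESULT = "Canberra is the capital city of Australia."


def extract_entity(search_result: str) -> str:
    # Hypothetical extraction step: treat the text before " is " as the answer.
    return search_result.split(" is ", 1)[0].strip()


def answer_a(entity: str) -> str:
    # Style A: a generic template that does not restate the question.
    return f"The answer is: {entity}."


def answer_b(subject: str, entity: str) -> str:
    # Style B: a full sentence that restates what was asked.
    return f"{subject} is {entity}."


entity = extract_entity(SEARCH_RESULT)
a = answer_a(entity)
b = answer_b("The capital of Australia", entity)
print(a)
print(b)
```

In this sketch, Style A needs only the extracted entity, while Style B also needs the subject of the question. That difference may be a useful starting point for weighing brevity against a response that still makes sense when read without the original question.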
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A user asks a language model: 'Who won the Best Picture award in 2023 and what was the movie about?' The model uses an external tool to retrieve information, and receives the following data: '[Search Result]: The 2023 Academy Award for Best Picture was won by "Everything Everywhere All at Once". The film is a sci-fi action-adventure about an exhausted Chinese-American woman who discovers she must connect with parallel universe versions of herself to prevent a powerful being from destroying the multiverse.' Based on this retrieved data, which of the following is the most appropriate final response for the model to generate?
Synthesizing Information for a Final Answer