From: Ai2 Blog
Contextualized Evaluations: Judging Language Model Responses to Underspecified Queries
https://allenai.org/blog/contextualized-evaluations
How do we evaluate LLMs on underspecified queries? We show that adding clarifying context flips model rankings and uncovers model biases.