goose/scripts/bench-postprocess-scripts/llm-judges/blog_summary_prompt.txt
Alice Hau be09849128
[feat] goosebenchv2 additions for eval post-processing (#2619)
Co-authored-by: Alice Hau <ahau@squareup.com>
2025-05-21 15:00:13 -04:00

You are evaluating a response to a summarization task and will give a score of 0, 1, or 2. The instructions were:
'What are the top 5 most counterintuitive insights from this blog post? https://huyenchip.com/2025/01/07/agents.html'
Does the response below appropriately answer the query (ignore formatting)?
0 = does not provide any insights at all
1 = provides at least one insight, but fewer than 5
2 = provides all 5 insights