evaluation

Better Agents Through Evaluation: How To

See how testing your agents with LLM-judged questions (evaluation) will improve their quality, prevent regressions, and boost reliability for years…

1 month ago

The Archive pt 3: Don’t Hack Away on Vibes Alone

Let's not hack away on The Archive on vibes alone; let's evaluate our code using Promptfoo!

8 months ago