Evaluating and Debugging Generative AI Models
Prerequisites: Python, experience with LLM applications
evaluation
testing
Covers evaluation metrics, debugging techniques, and systematic testing for generative AI applications using Weights & Biases. This course is the practical companion to the Evaluation & Testing learning path: the course provides hands-on practice with evaluation tools, while the path covers the full evaluation landscape across providers.