grey-haven-evaluation
Evaluate LLM outputs with multi-dimensional rubrics, handle non-determinism, and implement LLM-as-judge patterns. Essential for production LLM systems. Use when testing prompts, validating outputs, comparing models, or when user mentions 'evaluation', 'testing LLM', 'rubric', 'LLM-as-judge', 'output quality', 'prompt testing', or 'model comparison'.
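The listing doesn't include the skill's own docs, so the following is only a hypothetical sketch of the patterns the description names: a multi-dimensional rubric scored by an LLM-as-judge, with repeated trials and a median to absorb non-determinism. Every name here (RUBRIC, judge_once, the OpenAI client and model choice) is an illustrative assumption, not part of the skill itself.

```python
# Hypothetical sketch of LLM-as-judge rubric evaluation; none of these
# names come from the grey-haven-evaluation skill itself.
import json
import statistics

from openai import OpenAI  # assumption: any chat-completion client would do

client = OpenAI()

# A multi-dimensional rubric: each dimension is scored independently.
RUBRIC = {
    "accuracy": "Is the answer factually correct?",
    "completeness": "Does it address every part of the question?",
    "clarity": "Is it well organized and easy to follow?",
}

JUDGE_PROMPT = """You are grading an LLM output against a rubric.
Question: {question}
Output: {output}
For each dimension below, give an integer score from 1 to 5.
Dimensions: {dimensions}
Respond with JSON only, e.g. {{"accuracy": 4, ...}}."""


def judge_once(question: str, output: str) -> dict[str, int]:
    """One judge call; returns a score per rubric dimension."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any capable judge model
        messages=[{
            "role": "user",
            "content": JUDGE_PROMPT.format(
                question=question,
                output=output,
                dimensions=json.dumps(RUBRIC),
            ),
        }],
        temperature=0,  # keep the judge itself as deterministic as possible
    )
    return json.loads(resp.choices[0].message.content)


def judge(question: str, output: str, trials: int = 3) -> dict[str, float]:
    """Judges are non-deterministic too: run several trials and take the
    per-dimension median rather than trusting a single call."""
    runs = [judge_once(question, output) for _ in range(trials)]
    return {dim: statistics.median(r[dim] for r in runs) for dim in RUBRIC}
```

Scoring each dimension separately and aggregating with a median keeps a single outlier judge run from skewing any one dimension's score.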
Installer
git clone https://github.com/greyhaven-ai/claude-code-config /tmp/claude-code-config && cp -r /tmp/claude-code-config/grey-haven-plugins/core/skills/evaluation ~/.claude/skills/claude-code-config/
Tip: Run this command in your terminal to install the skill.
Repository: greyhaven-ai/claude-code-config/grey-haven-plugins/core/skills/evaluation
Author: greyhaven-ai
Stars: 15
Forks: 2
Updated: 1 week ago
Added: 1 week ago