advanced-evaluation

Use this skill when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", or "mitigate evaluation bias", or when the user mentions direct scoring, pairwise comparison, position bias, evaluation pipelines, or automated quality assessment.

Kind: Skill
Install: npx -y github:anubhavg-icpl/vibe add advanced-evaluation
License: CC BY-NC-SA 4.0