From judgment
Use Judgment for agent tracing, evaluations, code judges, datasets, and monitoring. Use when integrating Judgment or judgeval, adding tracing to agents/workflows, creating evaluations or scorers, debugging traces, or looking up Judgment SDK usage and docs.
Slash command: /judgment:judgment
This skill helps you use Judgment effectively across common agent development workflows: instrumenting applications, evaluating outputs, creating code judges, and looking up current Judgment docs.
Follow these principles for all Judgment work:
… JUDGMENT_API_KEY and JUDGMENT_ORG_ID locally rather than pasting secrets.
… OfflineTracer traces for each input, then evaluate the generated offline examples in one batch. If the production agent is already traced with Judgment, leave that tracing intact and swap only the test harness initialization to client.offline_tracer(...).
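As a rough illustration of the principle about keeping JUDGMENT_API_KEY and JUDGMENT_ORG_ID local, the sketch below reads both values from the environment and fails early when one is missing. It uses only the Python standard library; the helper name `load_judgment_credentials` is hypothetical and not part of the Judgment SDK, and how the values are then handed to the SDK is left out on purpose.

```python
import os
from typing import Mapping, Optional

# The two variable names come from the skill text; everything else here
# (function name, error type, return shape) is an illustrative assumption.
REQUIRED_VARS = ("JUDGMENT_API_KEY", "JUDGMENT_ORG_ID")


def load_judgment_credentials(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the Judgment credentials found in the environment.

    Raises RuntimeError naming any variable that is unset or empty,
    so secrets never need to be pasted into source files or prompts.
    """
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not source.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: source[name] for name in REQUIRED_VARS}
```

A test harness could call `load_judgment_credentials()` once at startup and pass the result to whatever client initialization it already uses, which keeps the secrets out of version control.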
npx claudepluginhub judgmentlabs/skills --plugin judgment