Agents · Advanced

Eval Audit

by hamelsmu · v1.0.0 · updated 2026-04-11
Score: 82

This skill audits an LLM evaluation pipeline. It reviews eval artifacts (traces, evaluator configs, judge prompts, labeled data, and metrics dashboards) and flags gaps in error analysis, evaluator design, judge validation, human review, and pipeline hygiene.

Tags: llm-evaluation, ai-audit, error-analysis, judge-validation, pipeline-hygiene, eval-infrastructure
Languages: Python

Skill Document

SKILL.md · eval-audit/workflow
1. Gather Eval Artifacts: Collect traces, evaluator configs, judge prompts, labeled data, and metrics dashboards.
2. Connect to Infrastructure: Access artifacts via an observability MCP server or local files.
3. Error Analysis: Check for systematic error analysis on real or synthetic traces.
4. Evaluator Design: Inspect evaluator design, focusing on binary pass/fail criteria and specific failure modes.
5. Judge Validation: Validate LLM judges against human labels using TPR/TNR.
6. Human Review Process: Evaluate the human review process, ensuring domain expertise and full trace visibility.
7. Labeled Data: Assess the quantity and quality of labeled data, suggesting sampling strategies.
8. Pipeline Hygiene: Verify that error analysis is re-run after significant changes and evaluators are maintained.
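
The judge validation step can be illustrated with a small calculation. The sketch below is not taken from the skill itself: it assumes binary verdicts where True means "pass", treats "pass" as the positive class, and uses made-up labels for eight traces. The names `JudgeAgreement` and `judge_agreement` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class JudgeAgreement:
    tpr: float  # share of human-labeled passes the judge also passed
    tnr: float  # share of human-labeled fails the judge also failed


def judge_agreement(human: list[bool], judge: list[bool]) -> JudgeAgreement:
    """Compare binary judge verdicts with human labels (True = pass)."""
    if len(human) != len(judge):
        raise ValueError("human and judge must have the same length")
    # True positives: human and judge both say pass.
    tp = sum(h and j for h, j in zip(human, judge))
    # True negatives: human and judge both say fail.
    tn = sum((not h) and (not j) for h, j in zip(human, judge))
    pos = sum(human)
    neg = len(human) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one human pass and one human fail")
    return JudgeAgreement(tpr=tp / pos, tnr=tn / neg)


# Hypothetical labels for eight traces (True = pass); not real data.
human_labels = [True, True, True, True, False, False, False, False]
judge_labels = [True, True, True, False, False, False, True, True]

result = judge_agreement(human_labels, judge_labels)
print(f"TPR={result.tpr:.2f} TNR={result.tnr:.2f}")  # TPR=0.75 TNR=0.50
```

Reporting the two rates separately, rather than a single accuracy figure, shows whether the judge is too lenient (low TNR, as in this made-up data) or too strict (low TPR).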

Agent Telemetry

Executions: 0 total
Success rate: 0% (last 30 days)
Average latency: 0.0s (p50)
Hallucination: 0.0% (detection)
Input tokens: 0 (avg 0/exec)
Output tokens: 0 (avg 0/exec)


Related Skills

Composes with Error Analysis: 70% (Hebbian Synapse, composite 0.700), score 83
Similar to Amazon Product Finder: 60% (Hebbian Synapse, composite 0.600), score 84
Similar to X Research: 60% (Hebbian Synapse, composite 0.600), score 79
Similar to BNB Chain MCP Skill: 60% (Hebbian Synapse, composite 0.600), score 80
Synapse weight: w = 0.3·α + 0.5·β + 0.2·γ

Skill Tree

Eval Audit (eval-audit)
Cognitive Phases (6)
1. SENSE
2. CONTEXTUALIZE
3. HYPOTHESIZE
4. EVALUATE
5. RECOMMEND
6. REFLECT
Triggers (7)
audit my LLM evaluation pipeline
find problems in my LLM evals
improve my LLM evaluation process
diagnose issues with my AI evaluators
check my LLM judge prompts
validate my LLM evaluation setup
review my LLM eval artifacts


Score Breakdown

Human rating: 0%
Agent success: 0%
Recency: 100%
Dependency health: 100%
Graph centrality: 0%
Security: 50%
CompositeScore = α·Human + β·Agent + γ·Recency + δ·Deps + ε·Centrality + ζ·Security
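
The CompositeScore formula above is a weighted sum of the six components. The page does not state the values of α through ζ, so the sketch below uses equal weights purely as a placeholder. With those hypothetical weights the result is not expected to match the displayed score of 82. The function name and dictionary keys are illustrative, and the requirement that weights sum to 1 is also an assumption.

```python
def composite_score(components: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of component scores (each 0-100).

    Assumes the weights sum to 1; the page does not confirm this.
    """
    if components.keys() != weights.keys():
        raise ValueError("components and weights must use the same keys")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * components[name] for name in components)


# Component values as listed in the score breakdown.
components = {
    "human": 0,
    "agent": 0,
    "recency": 100,
    "deps": 100,
    "centrality": 0,
    "security": 50,
}

# HYPOTHETICAL equal weights; the real α..ζ are not shown on the page.
hypothetical_weights = {name: 1 / 6 for name in components}

score = composite_score(components, hypothetical_weights)
print(round(score, 1))  # 41.7 under the placeholder weights
```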

Installation

$ synaptic mcp download eval-audit
$ synaptic skills detail eval-audit
$ synaptic skills live eval-audit
