Sample skill testing result

Your skill review is ready

Your automated skill testing review is ready.

Submission

customer-support-reply-draft-skill.zip

Score: 92/100

Status: Review complete

The submitted instructions provide a clear, well-scoped, and reliable framework for a custom AI skill that drafts first-pass replies to customer complaints about delayed orders. The role, task boundaries, and output expectations are explicitly defined, with strong emphasis on avoiding hallucination and preserving human review. The instructions include detailed input requirements, output structure, edge-case handling, and safety guardrails. Examples and a decision checklist further enhance reliability. Minor improvements could clarify fallback behavior for ambiguous inputs and reinforce escalation triggers, but overall the instructions are operationally complete and well-organized.

Strengths

  • Clear role definition and task boundaries focused on drafting customer support replies without fabricating facts.
  • Explicit instructions to ask for missing information rather than guessing, reducing hallucination risk.
  • Well-structured expected output with multiple sections (summary, inputs used, draft reply, risk checks, next steps).
  • Strong safety guardrails preventing unsupported promises, legal or safety conclusions, and requiring escalation when appropriate.
  • Inclusion of examples and a decision checklist to handle edge cases and ambiguous inputs.
  • Customization guidance that preserves core safety boundaries while allowing workflow tuning.
  • Setup and usage instructions that support reliable deployment in a ChatGPT Custom GPT environment.

Issues to review

  • Clarify fallback behavior for ambiguous or borderline inputs (low): The instructions specify asking one clarifying question when required inputs are missing, but they do not explicitly define how to handle inputs that are present yet ambiguous or partially conflicting. Defining this would help avoid inconsistent outputs.
  • Escalation triggers could be more explicitly emphasized (low): The instructions mention escalation for legal, fraud, or safety concerns, but a more prominent or repeated emphasis on this in the main instructions might improve model reliability in sensitive cases.

Recommendations

  • Add explicit guidance on handling ambiguous or partially conflicting inputs beyond simply missing information, e.g., how to prioritize or flag uncertainties.
  • Reinforce escalation instructions by including a dedicated section or callout in the main instructions to ensure the model consistently flags sensitive cases.
  • Consider adding a few more concrete output format examples to illustrate the expected structure and tone in varied scenarios.
  • Include a brief note on how to handle user requests that fall outside the defined scope, to ensure polite refusal and boundary clarity.
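
The first two recommendations can be read as a fixed decision order: escalate sensitive cases first, then ask about missing inputs, then flag conflicts, and only then draft. The sketch below illustrates that order. It is not taken from the submitted skill: the field names, trigger words, and action labels are all hypothetical, since the review does not list the skill's actual inputs or escalation wording.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Hypothetical required inputs; the review does not name the real ones.
REQUIRED_FIELDS = ("order_id", "customer_message")

# Hypothetical trigger words for the legal, fraud, and safety categories
# that the review says require escalation.
ESCALATION_TERMS = {
    "legal": {"lawyer", "lawsuit", "attorney"},
    "fraud": {"fraud", "chargeback", "scam"},
    "safety": {"injury", "injured", "hazard"},
}


@dataclass
class Triage:
    # One of: "escalate", "ask_clarifying_question", "draft_with_flags", "draft"
    action: str
    reasons: list[str] = field(default_factory=list)


def triage(inputs: dict[str, str], conflicts: list[str] | None = None) -> Triage:
    """Pick the next action for a delayed-order complaint."""
    # Match whole words so that "issue" does not trigger on a term like "sue".
    words = set(re.findall(r"[a-z]+", inputs.get("customer_message", "").lower()))

    # 1. Sensitive cases are escalated before anything else is attempted.
    hits = [cat for cat, terms in ESCALATION_TERMS.items() if words & terms]
    if hits:
        return Triage("escalate", [f"{cat} concern" for cat in hits])

    # 2. Missing inputs: ask a single clarifying question, never guess.
    missing = [name for name in REQUIRED_FIELDS if not inputs.get(name, "").strip()]
    if missing:
        return Triage("ask_clarifying_question", [f"missing {missing[0]}"])

    # 3. Inputs are present but conflict: draft, and surface the uncertainty
    #    for the human reviewer instead of silently choosing one version.
    if conflicts:
        return Triage("draft_with_flags", [f"conflict: {c}" for c in conflicts])

    # 4. Nothing blocks a normal first-pass draft.
    return Triage("draft")
```

Checking escalation before missing inputs is a design choice in this sketch: a message that mentions a lawyer is routed to a human even if the order number is absent. A real skill would express the same ordering in its written instructions, not in code.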