---
name: write-prompt-gpt-5
description: Guidelines for writing and reviewing prompts for GPT-5 (and 5 mini/nano). Use when creating or modifying any prompt that runs on GPT-5 — reasoning-heavy agents, research/planning turns, multi-step tool use, content generation. For prompts targeting GPT-4.1, use write-prompt-gpt-4-1 instead.
---

# Write Prompt — GPT-5 Prompt Engineering

**Use this skill for any prompt that runs on GPT-5 (including 5 mini and nano).**

GPT-5 is a reasoning model. It plans across multiple tool calls, weighs alternatives, and follows nuanced instructions. That makes it powerful, but it also means a sloppy prompt costs more here than on other models: GPT-5 will burn reasoning tokens trying to reconcile anything you leave contradictory or vague. This is a different discipline to prompting Claude or GPT-4.1.

Anchor source: OpenAI's [GPT-5 prompting guide](https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide). Verbatim OpenAI quotes are marked throughout.

---

## Core Principle: GPT-5 is surgical about prompts, but punished by contradictions

**OpenAI:** *"GPT-5 follows prompt instructions with surgical precision, which enables its flexibility to drop into all types of workflows."*

**OpenAI:** *"poorly-constructed prompts containing contradictory or vague instructions can be more damaging to GPT-5 than to other models."*

This is the single most important difference from GPT-4.1: GPT-5 will reason hard to reconcile any conflict you create. Two rules that overlap, or a rule with hidden exceptions, costs reasoning tokens AND degrades output. The audit lever for GPT-5 prompts is **contradictions**, not density.

---

## GPT-5 Prompting Principles

### 1. Resolve contradictions explicitly
**OpenAI:** *"poorly-constructed prompts containing contradictory or vague instructions can be more damaging to GPT-5 than to other models."*

If there are exceptions to a rule, state them explicitly as exceptions, not as separate competing rules.

- Bad: "Always check both sources." + later: "If the user names a source, use only that one."
- Good: "Check both sources. Exception: if the user names a source, use only that one."

The audit pass for any GPT-5 prompt: read every rule and ask "is there another rule anywhere in this prompt that contradicts this without an explicit exception?"

### 2. Core instructions first, edge cases after
GPT-5 weights the beginning of the system prompt more heavily. Put identity, behaviour, and core rules at the top. Tool-specific details and edge cases go later.

- GPT-4.1: weights the end of the prompt → repeat key rules at the end.
- GPT-5: weights the beginning of the prompt → core rules first.

### 3. Plan before each tool call, reflect after
**OpenAI:** *"You MUST plan extensively before each function call, and reflect extensively on the outcomes of the previous function calls."*

For tool-heavy turns, this `MUST` emphasis is OpenAI-endorsed (canonical phrase). Outside this published phrase, avoid emphasis markers (see Principle 9).

### 4. Define clear exploration criteria
**OpenAI:** *"Define clear criteria in your prompt for how you want the model to explore the problem space. This reduces the model's need to explore and reason about too many ideas... Parallelize discovery and stop as soon as you can act."*

For multi-step turns (research, complex queries, multi-tool searches), state when the model has enough data to act. Without this, GPT-5 keeps exploring — latency and cost both climb.

### 5. No few-shot input/output examples
**Opposite of GPT-4.1.** GPT-5 performs better with clear descriptive instructions than with input/output examples ("User says X → respond Y"). Few-shot examples reduce reasoning performance.

- Bad: "Instead of 'Task complete' say 'Done — anything else?'"
- Good: "Your default is direct and helpful. Keep replies tight."

### 6. No numbered step-by-step tool sequences
GPT-5 plans multi-step tool calls itself. Numbered instructions ("1. First call X, 2. Then call Y") are unnecessary and constrain it. List what data is needed, not the exact sequence.

- Bad: "1. Weather — call get_weather. 2. Calendar — call list_events with today's date."
- Good: "Gather: weather for the user's city, today's calendar events, today's tasks."

### 7. Structured specs with XML
**OpenAI:** *"using structured XML specs like `<[instruction]_spec>` improved instruction adherence."*

For complex instruction sets (multi-rule sections, behaviour contracts), wrap them in XML-style tags. GPT-5 parses these reliably as a unit.

```
<behaviour_spec>
- Default to direct, helpful replies
- Confirm completed tool calls by quoting from the result
- ...
</behaviour_spec>
```

Don't apply this to every section — use it where you have a tight ruleset that benefits from being parsed as a unit.

### 8. API-level controls
GPT-5 exposes two parameters that influence behaviour without changing the prompt:

- `reasoning_effort` — `"low"` reduces exploration depth and improves latency; `"medium"` (default) for typical turns; `"high"` for complex multi-step planning.
- `verbosity` — influences final answer length. Set it explicitly for prompts that need a specific output length (summaries, reports).

Set these at the API call site, not in the prompt. Prompts describe behaviour; parameters describe effort.

### 9. No emphasis markers
GPT-5 follows instructions precisely without needing CRITICAL, IMPORTANT, or **bold** emphasis. These are counter-productive — they burn reasoning tokens trying to assess relative priority.

- Bad: "**CRITICAL**: You MUST call the tools listed above"
- Good: "Call the tools listed above to gather data before composing your response."

**Exception:** the OpenAI-published *"You MUST plan extensively before each function call"* (Principle 3). Use that verbatim; avoid emphasis markers elsewhere.

### 10. Positive framing over negative
Positive framing reduces the reasoning tax.

- Bad: "NEVER generate data from memory. Do NOT fabricate information."
- Good: "Use only data from tool results. If a tool fails, say so."

### 11. Suppress tool preambles
GPT-5 naturally generates explanatory text before tool calls ("Let me check your calendar..."). When that text is noise, include: *"When using tools, call them directly without narrating what you're about to do."*

### 12. Match formatting to the output surface
State the formatting rules for wherever the output lands (chat, email, a messaging app, a document). Avoid asking for markdown the surface can't render. Keep one formatting block and reuse it across prompts rather than re-describing it each time.

---

## Prompt Structure (GPT-5)

```
1. Identity (who the assistant is, personality)
2. Behaviour (core rules: conciseness, tool usage, memory)
3. Context (current time, memory/preferences, recent state)
4. Tool-specific instructions (only when that capability is in play)
5. Special capabilities (voice, etc.)
6. Formatting (output-surface rules)
```

Key points:
- Identity and behaviour come first — GPT-5 weights the beginning.
- Include tool-specific sections only when that capability is relevant.
- No end-of-prompt anchor needed (GPT-5 doesn't need the duplication GPT-4.1 does).
- Tool-preamble suppression appears with the tool-calling encouragement, not as a separate rule.

---

## Research / Multi-Step Turn Guidelines

For research or data-gathering turns:
- List what data to gather, not step-by-step tool sequences.
- Define exploration criteria — when the model has enough data to stop (Principle 4).
- Include a format template with section headers and placeholder text.
- Use "Use only data from tool results" instead of anti-hallucination emphasis.
- Name common confusion points explicitly (e.g. "calendar events and tasks are different things").

---

## Common Failure Modes

**Reasoning loop on contradictions.** Two rules that overlap, or a rule with hidden exceptions, sends GPT-5 chasing reconciliation. Fix: Principle 1.

**Latency spikes from over-exploration.** Without clear exploration criteria, GPT-5 keeps calling tools or weighing alternatives. Fix: Principle 4.

**Tool preamble noise.** GPT-5 narrates before tool calls by default. Fix: Principle 11.

**Over-engineering simple turns.** Heavy `reasoning_effort` on a one-tool turn wastes time. Fix: Principle 8 — match effort to turn complexity.

**Few-shot examples degrading reasoning.** Examples written for a GPT-4.1 surface can hurt GPT-5 when prompts are shared across model tiers. Fix: keep examples out of GPT-5 prompts.

---

## Checklist

Before finalising any GPT-5 prompt:

- [ ] Contradictions resolved with explicit exceptions
- [ ] Core rules appear before edge cases (GPT-5 weights the beginning)
- [ ] Clear exploration criteria for multi-step turns (when to stop)
- [ ] Planning instruction included for tool-heavy turns (OpenAI canonical phrase)
- [ ] No CRITICAL/IMPORTANT/NEVER emphasis markers (except the OpenAI canonical `MUST`)
- [ ] All instructions use positive framing
- [ ] No few-shot input/output examples
- [ ] No numbered step-by-step tool sequences
- [ ] XML specs (`<...spec>`) used for tight, multi-rule sections that benefit from being parsed as a unit
- [ ] Tool-preamble suppression included for agent-facing prompts
- [ ] One reusable formatting block matched to the output surface
- [ ] `reasoning_effort` and `verbosity` set at the API call site to match turn complexity
- [ ] Prompt is concise — every sentence earns its place

---

## Sources

- [GPT-5 prompting guide (OpenAI Cookbook)](https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide)
- [A Practical Guide to Building Agents (OpenAI, PDF)](https://cdn.openai.com/business-guides-and-resources/a-practical-guide-to-building-agents.pdf)
- [Function calling guide (OpenAI)](https://platform.openai.com/docs/guides/function-calling)
- [Structured outputs guide (OpenAI)](https://platform.openai.com/docs/guides/structured-outputs)