---
name: write-prompt-gpt-4-1
description: Guidelines for writing and reviewing prompts for GPT-4.1 (and 4.1 mini/nano). Use when creating or modifying any prompt that runs on GPT-4.1 — agents, tool-callers, classifiers, extractors. For prompts targeting GPT-5, use write-prompt-gpt-5 instead.
---

# Write Prompt — GPT-4.1 Prompt Engineering

**Use this skill for any prompt that runs on GPT-4.1 (including 4.1 mini and nano).**

GPT-4.1 is not a reasoning model. It follows instructions literally and precisely, which makes it fast and steerable but means it will not infer intent the way a reasoning model does. If you want a behaviour, state it. This is a different discipline to prompting Claude or GPT-5.

Anchor source: OpenAI's [GPT-4.1 prompting guide](https://cookbook.openai.com/examples/gpt4-1_prompting_guide). Verbatim OpenAI quotes are marked throughout.

---

## Core Principle: GPT-4.1 follows instructions literally

**OpenAI:** *"GPT-4.1 is trained to follow instructions more closely and more literally than its predecessors, which tended to more liberally infer intent from user and system prompts."*

If you want a behaviour, state it explicitly. 4.1 will not infer, guess, or fill gaps.

- Reasoning model: "Handle the user's request" → the model infers it should call tools, check context, compose a reply.
- GPT-4.1: "Handle the user's request" → the model may respond from context without calling tools.

---

## GPT-4.1 Prompting Principles

### 1. Explicit tool-calling encouragement (essential)
Without this, 4.1 will answer from system-prompt context instead of calling tools — hallucinating data.

**OpenAI's canonical phrasing:** *"If you are not sure about file content or codebase structure pertaining to the user's request, use your tools to read files and gather the relevant information: do NOT guess or make up an answer."*

Adapt to your domain: *"Use your tools to look up data before responding. If a tool returns no results, say so."*

### 2. Plan before each tool call, reflect after
**OpenAI:** *"You MUST plan extensively before each function call, and reflect extensively on the outcomes of the previous function calls."*

This is the model's substitute for native reasoning. Include it for any tool-heavy turn. It is the only place in 4.1 prompting where a `MUST` emphasis marker is endorsed by OpenAI — outside this published phrase, avoid emphasis markers (see Principle 10).

### 3. Forced tool calls need an ask-don't-act escape valve
When `tool_choice: "required"` forces a call, 4.1 will hallucinate tool inputs if the user's intent is ambiguous. This is the failure mode behind "user said 'hey', agent created a reminder."

**OpenAI's mitigation, verbatim:** *"If told 'you must call a tool before responding to the user,' models may hallucinate tool inputs or call the tool with null values if they do not have enough information. Adding 'if you don't have enough information to call the tool, ask the user for the information you need' should mitigate this."*

So: any prompt that operates under tool-forcing should promote a clarifying question as the default when intent is ambiguous, not frame it as a special case. Say *"If you don't have enough information to act, ask the user for it — don't guess."*

### 4. Persistence reminder
4.1 can stop prematurely, returning partial results.

**OpenAI's canonical phrasing:** *"You are an agent — please keep going until the user's query is completely resolved, before ending your turn and yielding back to the user."*

### 5. End-of-prompt weighting
When instructions conflict, 4.1 follows the ones closer to the end of the prompt. For long prompts, repeat critical behavioural rules at both the start and the end.

- GPT-5 weights the beginning of the prompt → put core rules first.
- GPT-4.1 weights the end of the prompt → repeat key rules at the end.

### 6. Use examples to show tool selection and routing
Unlike reasoning models (where examples can reduce performance), GPT-4.1 benefits from examples. Use them to show:
- When to call which tool (especially when two tools could apply)
- What parameters to use for different inputs
- The decision path: "user says X → call tool Y, user says Z → call tool W instead"

**OpenAI on placement:** *"If your tool is particularly complicated and you'd like to provide examples of tool usage, we recommend that you create an `# Examples` section in your system prompt and place the examples there, rather than adding them into the 'description' field, which should remain thorough but relatively concise."*

When two tools could be confused, an example is more effective than a longer description.

### 7. Tool descriptions are the primary interface
4.1 relies heavily on the `name` and `description` fields in tool definitions. Make them unambiguous. State what the tool is for AND when to use it. If two tools could be confused, make the description explicit — e.g. "Add a to-do item to the user's task list. Tasks are action items and reminders — distinct from calendar events."

Keep descriptions concise. Examples belong in the system prompt's `# Examples` section (see Principle 6).

### 8. Tool overlap matters more than tool count
The metric for "too many tools" is overlap, not count.

**OpenAI (A Practical Guide to Building Agents):** *"The issue isn't solely the number of tools, but their similarity or overlap. Some implementations successfully manage more than 15 well-defined, distinct tools while others struggle with fewer than 10 overlapping tools."*

Audit for sibling tools answering the same shape with different filters — those are real consolidation candidates. Before adding routing prose to the system prompt telling the model which sibling to pick, ask whether the siblings should be one tool with parameters. Also, only give 4.1 the tools relevant to the current task — unrelated tools cause confusion.

### 9. Positive framing over negative
"Do X" beats "don't do X".

### 10. No emphasis markers (with one exception)
4.1 follows CRITICAL/MUST/NEVER too strictly, causing rigid behaviour, and over-indexes on emphasised instructions at the expense of others. Use plain language with equal weight.

**Exception:** OpenAI itself uses `MUST` in the planning instruction (Principle 2). Use it there verbatim; avoid emphasis markers elsewhere unless quoting an OpenAI-published canonical phrase.

### 11. Chain-of-thought (use selectively)
4.1 does not reason internally. For non-trivial decisions, prompting "think step by step about which tool to call" can help — but it adds output tokens and latency. Use it only when the task is genuinely ambiguous (e.g. choosing between two similar tools). Skip it for straightforward tool calls. The planning instruction (Principle 2) is the lighter, OpenAI-blessed version for tool-heavy turns.

### 12. Structured output
When you need a specific format, state it explicitly. 4.1 follows formatting instructions precisely but won't infer the format you want. For classifiers, request JSON explicitly (e.g. `response_format: { type: "json_object" }`).

### 13. Parallel tool calls
4.1 supports calling multiple tools in a single round. This works well in most cases, but rare edge cases with incorrect parallel calls have been observed. Set `parallel_tool_calls: false` if issues arise.

---

## Prompt Structure (GPT-4.1)

```
1. Identity (brief — personality is secondary for tool-calling)
2. Behaviour (concise core rules)
3. Tool-calling instructions ("use tools, respond from results"; planning instruction)
4. Tool-specific instructions (only the relevant tools, plus the ask-don't-act escape valve if tool-forced)
5. Context (memory, recent state)
6. # Examples (worked examples for ambiguous tool selection)
7. Formatting (your output format rules)
8. Repeat: tool-calling + persistence reminders (end-of-prompt anchor)
```

Key points:
- Identity is brief — 4.1 doesn't need a full personality prompt for tool-calling.
- Tool-calling instructions appear early AND late (dual anchoring).
- Only include the tools relevant to the task.
- Examples live in their own section, not in tool descriptions.

---

## Common Failure Modes

**Answering without tools.** 4.1 sees relevant data in the system prompt and responds from that instead of calling tools. Fix: Principle 1.

**Cross-tool confusion.** Given tools from multiple domains, 4.1 may pick the wrong one. Fix: Principle 8 — only expose relevant tools.

**Premature stopping.** 4.1 completes one action but stops before handling the rest of a multi-action request. Fix: Principle 4.

**Over-literal instruction following.** 4.1 follows emphasis markers (MUST, ALWAYS, NEVER) so strictly it becomes rigid. Fix: Principle 10.

**Hallucinating tool inputs under tool-forcing.** Told "you must call a tool" without a safety valve, 4.1 calls a tool with made-up arguments. Fix: Principle 3.

**Wrong tool from the same domain.** When two tools overlap, 4.1 picks on surface-level keyword matching rather than reasoning about which fits. Fix: Principle 6 (examples) + Principle 8 (consolidate overlap).

---

## Checklist

Before finalising any GPT-4.1 prompt:

- [ ] Explicit tool-calling encouragement included ("use tools to look up data, respond from results")
- [ ] Planning instruction included for tool-heavy turns ("plan before each function call, reflect after")
- [ ] Ask-don't-act escape valve present when running under `tool_choice: "required"`
- [ ] Persistence reminder included ("keep going until the request is fully handled")
- [ ] Key instructions repeated at the end of the prompt
- [ ] Tool descriptions are unambiguous
- [ ] Tool surface audited for overlap (sibling tools collapsed before adding routing prose)
- [ ] Only the relevant tools are exposed
- [ ] No emphasis markers (CRITICAL, MUST, NEVER) except OpenAI-published canonical phrases
- [ ] All instructions use positive framing ("do X" not "don't do X")
- [ ] Contradictions resolved with explicit exceptions
- [ ] Examples included for ambiguous tool selection, in a `# Examples` block (not in tool descriptions)
- [ ] Output format rules stated explicitly
- [ ] Prompt is concise — every sentence earns its place

---

## Sources

- [GPT-4.1 prompting guide (OpenAI Cookbook)](https://cookbook.openai.com/examples/gpt4-1_prompting_guide)
- [A Practical Guide to Building Agents (OpenAI, PDF)](https://cdn.openai.com/business-guides-and-resources/a-practical-guide-to-building-agents.pdf)
- [Function calling guide (OpenAI)](https://platform.openai.com/docs/guides/function-calling)
- [Structured outputs guide (OpenAI)](https://platform.openai.com/docs/guides/structured-outputs)
