Three AI labs shipped the same primitive in the last six weeks. Anthropic added /goal to Claude Code. OpenAI shipped it inside Codex CLI and the Codex desktop app. Nous Research wired it into Hermes. The naming is consistent on purpose; this is the industry settling on a shared interface for one thing: an agent that runs in a closed loop until a measurable end-state is reached, without asking you for permission at every step.
If you have been doing the manual “approve, send prompt, tell the agent to continue, repeat” dance, /goal is the slash command that ends it. You hand the agent a target, it works against the target, and it returns when the target is hit.
This guide is for developers and API builders. We will cover what /goal actually does under the hood, how to set it up in Codex and Claude Code, a prompt structure that produces real results instead of infinite loops, and how to plug the whole thing into your API workflow using Apidog.
Download Apidog for free if you want to follow along with the API examples later in the guide.
What /goal actually does
In one line: /goal lets an AI agent loop on a task until a stop condition fires, without round-tripping to you for approval.
The mechanic underneath is simple. A small, fast validator model runs after every step the main agent takes and answers one question: “Has the goal been met?” If no, the main model keeps going. If yes, the loop closes and the agent reports back. This is the same pattern the “Ralph loop” popularized in early 2026, except now it ships as a first-class command inside the official tools.
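The loop described above can be sketched in a few lines. This is a minimal illustration of the pattern, not how any vendor implements it: `run_goal_loop`, `agent_step`, `goal_met`, and `max_steps` are hypothetical names, and the "models" are plain callables standing in for the main agent and the validator.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoopResult:
    status: str        # "done" or "budget_exhausted"
    steps_taken: int
    history: List[str]


def run_goal_loop(
    goal: str,
    agent_step: Callable[[str, List[str]], str],
    goal_met: Callable[[str, List[str]], bool],
    max_steps: int = 20,
) -> LoopResult:
    """Call agent_step until goal_met returns True or the step budget runs out."""
    history: List[str] = []
    for step in range(1, max_steps + 1):
        # The main model does one unit of work and reports what it did.
        history.append(agent_step(goal, history))
        # The validator answers a single question: has the goal been met?
        if goal_met(goal, history):
            return LoopResult("done", step, history)
    return LoopResult("budget_exhausted", max_steps, history)


# Toy stand-ins: the "agent" fixes one failing test per step,
# and the "validator" checks whether any failures remain.
failing = ["test_login", "test_logout", "test_refresh"]
result = run_goal_loop(
    "fix every failing test",
    agent_step=lambda goal, history: f"fixed {failing.pop()}",
    goal_met=lambda goal, history: not failing,
)
print(result.status, result.steps_taken)  # prints: done 3
```

The step budget matters as much as the validator: without it, a stop condition that never fires means a loop that never ends.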
The contrast with normal agent use:
- Without /goal: you are the loop. You read output, decide if it is correct, prompt the next step, approve a tool call, and so on. Every iteration costs you attention.
- With /goal: the agent owns the loop. It plans, executes, self-validates, and only surfaces when it finishes, hits a constraint, or runs out of budget.
A concrete example: telling Claude Code /goal create a landing page triggers research, scaffolding, styling, debugging, and a final preview, all in one continuous run. You walk away, come back, and either ship it or iterate.
Why this is suddenly everywhere
The reason /goal is shipping across vendors right now is that long-horizon agent tasks were failing in two predictable ways:
- Drift. Without a validator checking against the original target, models would wander off and produce confident-but-wrong output.
- Babysitting. Even when the model could do the work, users had to chaperone every iteration, which defeated the point of an agent.
A second validator model fixes both. It is cheap (small model, narrow prompt), and it gives the loop a hard stop condition. That is the entire trick. Once the labs realized this pattern worked, they all shipped it under the same name within weeks of each other.
Setting up /goal in Codex
The Codex CLI gives you the most control. Here is the minimum setup:
- Enable goals in the desktop app: open Codex desktop, go to Settings → Configuration, and set goals = true. The CLI inherits this.
- Launch the CLI in full-auto mode so you stop seeing approval prompts:
codex --approval-mode full-auto
- Set a goal:
/goal [your goal here]
That is it. Codex will print a confirmation that the goal is registered, then start running.

If you are non-technical, start inside the Codex desktop app instead of the CLI. The functionality is the same, but you get a UI for pausing, clearing, and watching token usage.
Setting up /goal in Claude Code
The Claude Code CLI works almost identically. Launch the CLI, type /goal, and follow it with the task description. Official docs live at the Claude Code documentation site.

If you are running into setup or config errors when launching Claude Code, the invalid custom3p enterprise config fix covers the most common failure mode. For a deeper look at how to drive Claude Code with multi-agent workflows alongside /goal, see our breakdown of Ruflo, a multi-agent layer on top of Claude Code.
One tip that is easy to miss: /goal shows you the live token count and a progress bar for the running task inside Claude Code. Watch the token count, not just the output. A goal that is burning tokens without progress is a sign the validator is failing to converge, and you should hit /pause or /goal clear.
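One way to make "burning tokens without progress" concrete is a simple stall check over readings you note down while the goal runs. This is a hypothetical heuristic, not something either tool exposes: `is_stalled`, `window`, and `min_progress` are illustrative names, and the sketch assumes you record the token count and progress value yourself.

```python
from typing import List, Tuple

# One reading: (total tokens used so far, progress between 0.0 and 1.0).
Sample = Tuple[int, float]


def is_stalled(samples: List[Sample], window: int = 3, min_progress: float = 0.01) -> bool:
    """Return True when the last `window` intervals spent tokens but barely moved progress."""
    if len(samples) < window + 1:
        return False  # not enough readings to judge yet
    first_tokens, first_progress = samples[-(window + 1)]
    last_tokens, last_progress = samples[-1]
    spent_tokens = last_tokens > first_tokens
    moved = (last_progress - first_progress) >= min_progress
    return spent_tokens and not moved


readings = [(1000, 0.1), (2000, 0.2), (3000, 0.2), (4000, 0.2), (5000, 0.2)]
print(is_stalled(readings))  # prints: True
```

If the check trips, that is the cue to pause and tighten the success criteria rather than let the run continue.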
The prompt structure that actually works
The syntax for /goal is trivial. The hard part is writing a prompt that produces a usable result instead of an agent that runs for two hours and hands you something subtly wrong.
Every effective /goal prompt has three components:
- The work: what you want done, in one line.
- The measurable end state: what “done” looks like, in a form the validator can check.
- The constraints: rules that must hold throughout.
The skeleton:
/goal [do the work] until [measurable end state] without [constraints that must hold]
A real example for a coding task:
/goal fix every failing test until npm test exits 0 without modifying any file outside the /auth directory
The end state is verifiable (npm test exit code), and the constraint is a hard boundary the validator can enforce on every iteration. The agent cannot fake completion because the validator runs the test command.
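To see why this goal is checkable, here is a rough sketch of the two checks a validator would need for it: the test exit code and the set of changed files. The function names are hypothetical, and the sketch assumes repo-relative POSIX paths; it does not claim to mirror what Codex or Claude Code run internally.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def files_outside(changed: Iterable[str], allowed_dir: str) -> List[str]:
    """Return the changed paths that are not inside allowed_dir (repo-relative)."""
    root = PurePosixPath(allowed_dir.strip("/"))
    outside = []
    for path in changed:
        p = PurePosixPath(path.strip("/"))
        # Compare whole path components, so "authz/x.js" does not count as inside "auth".
        if p != root and root not in p.parents:
            outside.append(path)
    return outside


def goal_met(test_exit_code: int, changed: Iterable[str], allowed_dir: str = "auth") -> bool:
    """Done only if the tests pass and no file outside allowed_dir was touched."""
    return test_exit_code == 0 and not files_outside(changed, allowed_dir)


print(goal_met(0, ["auth/login.js", "auth/utils/token.js"]))  # prints: True
print(goal_met(0, ["auth/login.js", "package.json"]))         # prints: False
```

In a real setup the exit code could come from `subprocess.run(["npm", "test"]).returncode` and the changed paths from `git diff --name-only`; both inputs are objective, which is what makes the goal enforceable.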
For ambiguous tasks (“make this UI feel modern”), /goal performs badly because the end state is not measurable. Either rewrite the goal to be measurable (“until Lighthouse accessibility score is 90+”), or stick with a normal prompt.
Advanced structure for longer tasks
For larger goals, expand the skeleton into four blocks:
/goal
Objective: [one-line goal]
Success criteria:
- [measurable criterion 1]
- [measurable criterion 2]
Constraints:
- [boundary 1]
- [boundary 2]
Context:
- [files, repos, API keys the agent should know about]
This format gives the validator concrete things to check on every loop iteration. Without success criteria, the validator falls back to a fuzzy semantic match, which is where drift comes from.
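If you write these prompts often, a small helper can keep the four blocks consistent and refuse to produce a goal with no success criteria. This is only a sketch for building the text: `GoalSpec` and `render` are hypothetical names, and nothing here talks to Codex or Claude Code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GoalSpec:
    objective: str
    success_criteria: List[str]
    constraints: List[str] = field(default_factory=list)
    context: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Build the four-block /goal prompt; empty optional blocks are omitted."""
        if not self.success_criteria:
            raise ValueError("at least one measurable success criterion is required")
        lines = ["/goal", f"Objective: {self.objective}"]
        for title, items in (
            ("Success criteria", self.success_criteria),
            ("Constraints", self.constraints),
            ("Context", self.context),
        ):
            if items:
                lines.append(f"{title}:")
                lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)


spec = GoalSpec(
    objective="fix every failing test in the auth module",
    success_criteria=["npm test exits 0"],
    constraints=["no file outside /auth is modified"],
)
print(spec.render())
```

Raising on an empty criteria list is a deliberate choice: it forces the measurable end state to be written down before the loop starts.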
Examples worth stealing
/goal is not just for writing code. A few patterns that work well:
Research
/goal collect every public benchmark for Claude Opus 4.7 published since April 2026, save sources, and produce a markdown table sorted by date until the table covers at least 10 distinct benchmarks
Repo maintenance
/goal find dead code, unused dependencies, and stale files in this repo, then propose a PR description listing safe removals until every item has a justification
Documentation
/goal rewrite README.md so a new contributor can install, run, test, and understand the project until each of those four steps has a working command and an expected output
Feature work
/goal add a dark/light theme toggle, persist the choice in localStorage, update styles for both themes, and verify in the browser until the toggle works without a page reload and survives a refresh
The common pattern: every example pins down a verifiable end state. That is the line between a goal that finishes and a goal that runs in circles.
Pairing /goal with API development workflows
Most coverage of /goal so far has been about generic coding tasks. The more interesting use case for backend and platform engineers is API work, where the end state is almost always testable.
API endpoints are perfect for /goal because “done” is unambiguous: the request returns 200, the response schema matches, and the contract is documented. You can write a goal that says “make this endpoint pass its tests” and the validator has a concrete signal to read.
A workflow that holds up in practice:
- Design the contract first in Apidog. Define the endpoint, request schema, response schema, and example payloads inside Apidog. This becomes the source of truth.
- Export the spec. Apidog exports OpenAPI 3.x, which you hand to Codex or Claude Code as context.
- Run /goal. Tell the agent: “implement the endpoint until every Apidog test case passes.”
- The validator checks the test runner. Each loop iteration, the validator runs the Apidog CLI tests against the running service. The agent only finishes when every case is green.
This is materially better than letting the agent invent its own tests, because the contract is already locked. The agent cannot ship a passing test suite that misses edge cases the spec covered.
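To illustrate what "the contract is already locked" means for the validator, here is a simplified check of one response against a schema fragment in the style of an OpenAPI response schema. It is an assumption-laden sketch: `contract_violations` is a hypothetical helper that covers only status, required fields, and basic types, and it stands in for a real test runner rather than reproducing the Apidog CLI.

```python
from typing import Any, Dict, List

# Maps JSON Schema type names to Python types (a small subset).
_TYPES = {
    "string": str, "integer": int, "number": (int, float),
    "boolean": bool, "object": dict, "array": list,
}


def contract_violations(
    schema: Dict[str, Any], status: int, body: Dict[str, Any], expected_status: int = 200
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the case passes."""
    problems = []
    if status != expected_status:
        problems.append(f"status {status}, expected {expected_status}")
    for name in schema.get("required", []):
        if name not in body:
            problems.append(f"missing required field '{name}'")
    for name, rules in schema.get("properties", {}).items():
        expected = _TYPES.get(rules.get("type"))
        if name not in body or expected is None:
            continue
        value = body[name]
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        is_bool_mismatch = isinstance(value, bool) and rules["type"] != "boolean"
        if not isinstance(value, expected) or is_bool_mismatch:
            problems.append(f"field '{name}' is not of type {rules['type']}")
    return problems


user_schema = {
    "type": "object",
    "required": ["id", "email"],
    "properties": {"id": {"type": "integer"}, "email": {"type": "string"}},
}
print(contract_violations(user_schema, 200, {"id": 1, "email": "a@example.com"}))  # prints: []
```

Because the schema comes from the exported spec and not from the agent, an empty violation list is a stop condition the agent cannot redefine mid-run.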
If you have not used Apidog before, the API platform combines design, mocking, testing, and documentation in one tool, which matters here because /goal works best when the validator only has to run one command to check state. Our design-first API workflow guide covers the contract-first setup in detail, and the API testing tool overview for QA engineers shows how to structure the test cases that the agent will iterate against.
If you are working with MCP servers (the protocol most AI coding tools now use to call external tools), the same pattern applies. See MCP server testing with Apidog for the setup that lets /goal agents safely run against your local MCP server.
Pro tips from running /goal in production
A few things you only learn after putting /goal through real work:
- One goal at a time. Both Codex and Claude Code restrict you to a single active goal. Trying to stack them produces weird state. Use /goal clear between runs.
- Pair with /plan. A useful workflow is /plan first, review the plan, then /goal with the plan as context. This cuts iteration count in half because the agent does not redesign the approach mid-loop.
- Use markdown files as scratchpads. Tell the agent to maintain a progress.md file. You get a readable audit trail and the agent gets persistent context across iterations.
- Let the model write its own goal. Drop your rough idea into a normal prompt and ask the model to turn it into a /goal invocation with success criteria. The model writes better goal prompts than you do, because it knows what the validator can actually check.
- Watch the validator, not the main model. If the loop is not closing, the issue is almost always that the success criteria are unmeasurable. Tighten the criteria, do not retry the same goal.
- /goal is for long-horizon work. For a one-line refactor, a normal prompt is faster. The autonomous loop has overhead.
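For the scratchpad tip, it helps to agree on an entry format up front so the file stays readable after dozens of iterations. The sketch below shows one possible format as an append-only log; `log_progress` and the line layout are assumptions for illustration, not a convention either tool prescribes.

```python
from datetime import datetime, timezone
from pathlib import Path


def log_progress(path: Path, iteration: int, summary: str) -> str:
    """Append one timestamped entry to the scratchpad and return the line written."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%SZ")
    line = f"- [{stamp}] iteration {iteration}: {summary.strip()}\n"
    # Append-only: earlier entries are never rewritten, so the file doubles as an audit trail.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(line)
    return line


log_progress(Path("progress.md"), 1, "fixed token refresh test; 2 failures remain")
```

You can paste the same format into the goal's constraints ("append one line per iteration to progress.md") so the agent writes entries in a shape you can skim.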
When /goal will let you down
Honest limitations to keep in mind:
- Cost. A loop that runs for an hour burns more tokens than the same task done manually. Set a budget.
- Tasks with no signal. UX polish, prose tone, design taste; none of these have a clean validator. The agent will either give up or invent a fake stop condition.
- External side effects. A goal that involves sending email, making payments, or calling production APIs needs hard constraints. The agent will not infer caution on its own. If you are still building out access control around AI agents calling your APIs, the GitHub Copilot usage and billing API for teams writeup covers how the major vendors handle this.
- Stale context. Long-running goals can drift from the original spec if the codebase changes mid-loop. Pause and reset rather than letting it continue against old context.
What this means for how you build with AI
/goal is the shift from “AI as autocomplete” to “AI as a worker you brief and check on.” The interface change is small (one slash command), but the implication is large: the work you do as a developer moves toward writing better success criteria and constraints, and away from typing the actual lines of code.
The teams that get the most out of this are the ones who already have testable contracts, strong CI, and clear specs. If your API has a defined OpenAPI document and a test suite, you can hand a /goal agent an endpoint and a deadline. If your API only exists in someone’s head, the agent has nothing to validate against and the loop falls apart.
This is where API platforms become load-bearing infrastructure for AI workflows. Apidog is built around design-first API development, and that becomes a lot more useful when the agent doing the implementation can read your spec and check its own work against your test cases. Download Apidog if you want to set up the contract-first workflow described above.
FAQ
Does /goal work in the Codex web app? Yes. It works in Codex CLI, Codex desktop, the Codex app, and Claude Code CLI. Hermes also supports the same command. Feature parity across vendors is the point.
How is /goal different from a regular prompt? A regular prompt runs one turn and stops. /goal runs in a closed loop with a validator model checking for the stop condition after every step. The agent decides when to stop, not you.
Can the agent break out of the constraints I set? The validator enforces constraints on every iteration, so the agent should not violate them. In practice, the looser the constraint phrasing, the more room the agent has to interpret it. Be explicit: “without modifying any file outside /auth” is enforceable; “without breaking anything” is not.
Will /goal cost more than a normal Claude or Codex session? Yes. Expect to spend more tokens. The validator runs on a cheaper, smaller model, but the main model is still doing the work, and it does more of it autonomously. Set a budget or use /pause to control spend.
What if I want to test the agent’s output against a real API? Use a tool like Apidog to lock the API contract and run real test cases against the implementation. The agent’s validator can call the Apidog CLI, which gives you a measurable end state. See the free Claude API guide if you are bootstrapping a Claude-powered service with a limited budget.



