An “Ableton Live MCP” Show HN post climbed to 118 points and 78 comments earlier this week. The pattern is familiar by now: somebody wrote a Model Context Protocol server for an unlikely tool, the Claude Desktop crowd loved it, and a wave of “should I write one for X?” posts followed. MCP went from Anthropic-only experiment to a default agent integration layer in less than a year.
What that growth hides is a hole in tooling: nobody has shipped a clean way to test MCP servers. Hand-running JSON-RPC over stdio with a debugger is fine for hello-world; it falls apart the moment your server has 12 tools, 3 prompts, and a flaky upstream API. This guide gives you a hands-on playbook for testing MCP servers manually and automating those tests with Apidog, so you can ship MCP servers the way you would ship any other API: with a contract, a mock, and a regression suite.
If you are coming from a more general agent context, our agents.md guide pairs well with this; the conventions there make MCP server contracts easier to communicate to your team.
TL;DR
- MCP is the Model Context Protocol from Anthropic; it is JSON-RPC 2.0 over stdio or HTTP and exposes three primitives: tools, resources, and prompts.
- Testing an MCP server means verifying its initialize, tools/list, tools/call, resources/read, and prompts/get responses against a contract.
- Start manual: drive the server from the command line with stdio, confirm responses by eye, and fix shape bugs before adding clients.
- Move to automated: capture the JSON-RPC traffic in Apidog, turn each call into a saved request, build assertions on response shape and content, and run the suite in CI.
- Use Apidog’s mock server to simulate upstream APIs your MCP server calls, so tests stay deterministic.
- Download Apidog to get the request collection, mock server, and CI runner in one place.
What MCP actually is, in two minutes
The Model Context Protocol specification defines a JSON-RPC 2.0 wire format with a small surface. A client (Claude Desktop, Cursor, your own agent) starts an MCP server, performs an initialize handshake, and then issues calls.
The five calls you will spend 90 percent of your time testing:
- initialize: version negotiation and capability disclosure.
- tools/list: server returns the tools it exposes, including JSON Schema for arguments.
- tools/call: client invokes a tool by name with arguments.
- resources/list and resources/read: server exposes readable URI-addressed content.
- prompts/list and prompts/get: server exposes prompt templates the client can render.
Transport is either stdio (one JSON-RPC message per line on stdin/stdout) or streamable HTTP (the client POSTs each message to a single endpoint, and the server can answer with plain JSON or an SSE stream). Most local servers use stdio; remote servers use HTTP.
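To make the wire format concrete, here is a minimal sketch of how newline-delimited JSON-RPC 2.0 frames for the stdio transport could be built. The helper names (make_request, make_notification, encode_frame) are invented for this illustration and are not part of any MCP SDK.

```python
import json
from typing import Optional


def make_request(req_id: int, method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request; the id means a response is expected."""
    msg = {"jsonrpc": "2.0", "id": req_id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def make_notification(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 notification; no id, so no response is expected."""
    msg = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def encode_frame(msg: dict) -> bytes:
    """Serialize one message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the frame stays on one line.
    return (json.dumps(msg, separators=(",", ":")) + "\n").encode("utf-8")


# Example: the frame a client would write to the server's stdin for tools/list.
print(encode_frame(make_request(1, "tools/list")))
```

The same dictionaries can be pasted into any HTTP client as the POST body when the server speaks HTTP instead of stdio.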
Why testing matters: every Claude Desktop user, every Cursor user, every IDE that adds MCP support is going to call your server. Bugs in tools/list shape break every client at once. The cost of a regression is high.
What you should be testing
A solid MCP server test suite covers six dimensions.
Protocol conformance. Does initialize return the right protocolVersion? Does the server advertise the capabilities it actually supports?
Schema correctness. Does each tool in tools/list have a valid JSON Schema for arguments? Are required fields marked? Is the description longer than three words? Empty descriptions break tool selection on Claude.
Tool behavior. For each tool, does tools/call return content blocks of the right type (text, image, resource)? Do error cases return an isError: true result rather than throwing JSON-RPC errors?
Resource access. Do resources/list URIs resolve when called via resources/read? Does pagination work past the first page?
Prompt rendering. Do prompts return well-formed messages arrays? Do argument substitutions land in the right places?
Failure modes. What happens when an upstream API is down? When a tool argument is missing? When the client times out? These are the bugs that show up in production, not at hello-world.
The rest of this guide walks through each of these, manual first, then automated.
Manual testing with stdio
Start with the simplest possible setup: a terminal, your server binary, and the MCP inspector or raw JSON-RPC by hand.
If you have not built a server yet, scaffold one with the official MCP SDK quickstart in Python or TypeScript. The two-tool weather example is enough to test against.
Run the server in inspector mode:
npx @modelcontextprotocol/inspector node your-server.js
The inspector boots a local web UI that speaks MCP to your server and shows you every request and response. This is the fastest way to confirm the server starts, advertises capabilities, and responds to tools/list.
Once the inspector view looks right, run the same flow with raw stdio so you can capture frames for Apidog:
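echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"manual-test","version":"0.0.1"}}}' | node your-server.js
You will get a JSON-RPC response on stdout. Use a protocolVersion your server actually implements (2025-06-18 is one published revision) and include clientInfo, which the spec requires in initialize. Save the request and response. Repeat for tools/list, tools/call, resources/list, and the rest; because every echo starts a fresh server process, send initialize and the notifications/initialized notification as the first lines of the same pipe before the call you want to capture. By the end of this exercise you have 6 to 12 canonical request-response pairs that define the wire-level contract of your server.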
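If typing frames by hand gets tedious, the capture step can be scripted. The following is a rough sketch, not an official client: run_session is a hypothetical helper, the protocolVersion and clientInfo values are placeholders, and it assumes the server prints exactly one line per response on stdout and sends its logs to stderr.

```python
import json
import subprocess

# Placeholder session: handshake first, then the call you want to capture.
SESSION = [
    {"jsonrpc": "2.0", "id": 1, "method": "initialize",
     "params": {"protocolVersion": "2025-06-18", "capabilities": {},
                "clientInfo": {"name": "capture-script", "version": "0.0.1"}}},
    {"jsonrpc": "2.0", "method": "notifications/initialized"},
    {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
]


def run_session(cmd: list[str], messages: list[dict]) -> list[tuple[dict, dict]]:
    """Send messages to a stdio server in order; return (request, response) pairs."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
    pairs = []
    try:
        for msg in messages:
            proc.stdin.write(json.dumps(msg) + "\n")
            proc.stdin.flush()
            if "id" not in msg:
                continue  # notifications get no reply, so do not wait for one
            line = proc.stdout.readline()
            if not line:
                raise RuntimeError("server closed stdout before replying")
            pairs.append((msg, json.loads(line)))
    finally:
        proc.stdin.close()  # EOF on stdin lets a well-behaved server exit
        try:
            proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            proc.kill()
        proc.stdout.close()
    return pairs
```

Calling run_session(["node", "your-server.js"], SESSION) and writing each pair to a file gives you the request bodies and expected responses to paste into Apidog later.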
Two things to watch.
First, content blocks. A tool result returns content: [{ type: "text", text: "..." }] or content: [{ type: "image", data: "...", mimeType: "image/png" }]. Mixing types in one response is allowed; clients differ on how they render.
Second, errors. The MCP spec is clear: tool execution errors return a normal result with isError: true and a content block describing the failure. Do not throw JSON-RPC error responses from inside a tool; that signals a protocol-level failure, not a tool-level one. Many clients drop the connection on protocol errors.
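A small helper makes the distinction easy to check in captured responses. This is an illustrative sketch; classify_response is a made-up name, and it only looks at the two fields discussed above.

```python
def classify_response(resp: dict) -> str:
    """Classify a tools/call response as 'protocol_error', 'tool_error', or 'ok'."""
    # A top-level "error" member is a JSON-RPC (protocol-level) failure.
    if "error" in resp:
        return "protocol_error"
    # A normal result with isError: true is a tool-level failure.
    if resp.get("result", {}).get("isError", False):
        return "tool_error"
    return "ok"


tool_failure = {
    "jsonrpc": "2.0",
    "id": 3,
    "result": {"isError": True, "content": [{"type": "text", "text": "city not found"}]},
}
print(classify_response(tool_failure))  # prints "tool_error"
```

If a tool that merely failed shows up as protocol_error here, the server is raising where it should be returning.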
From manual to automated: building the test suite in Apidog
Manual testing surfaces the obvious bugs. You move to automation when you start asking, “did my last change break tool 7’s argument schema?” and don’t want to type 12 curl commands to find out.
The pattern: take every request-response pair you saved during manual testing, paste it into Apidog as a saved request, add assertions, and run the suite on every push.
1. Create an Apidog project for the MCP server
Open Apidog, create a new project, set the base URL to your MCP server’s HTTP endpoint (or stdio bridge URL; see below). Apidog projects support both REST and JSON-RPC; create a JSON-RPC environment.
For stdio servers that have no HTTP face, run them behind a thin HTTP wrapper for testing. The official inspector ships one; alternatively, a 30-line Node script that reads JSON-RPC over HTTP and forwards to stdio works fine. We use the same pattern in API testing without Postman in 2026 for non-HTTP backends.
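The text mentions a Node script; the same idea is sketched here in Python to stay consistent with the other examples in this guide. Treat it as a test-only bridge under stated assumptions: make_bridge is a hypothetical helper, the server answers each request with exactly one line, and this is not an implementation of MCP's streamable HTTP transport.

```python
import json
import subprocess
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_bridge(cmd: list[str], port: int = 0) -> ThreadingHTTPServer:
    """Start cmd as a stdio MCP server and return an HTTP server that forwards to it."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
    lock = threading.Lock()  # serialize exchanges so replies cannot interleave

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            msg = json.loads(self.rfile.read(length))
            with lock:
                proc.stdin.write(json.dumps(msg) + "\n")
                proc.stdin.flush()
                # Notifications carry no id and get no reply, so do not block on them.
                reply = proc.stdout.readline() if "id" in msg else ""
            if "id" not in msg:
                status, body = 202, b""
            elif reply:
                status, body = 200, reply.encode("utf-8")
            else:
                status, body = 502, b""  # child exited without replying
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep CI output quiet

    server = ThreadingHTTPServer(("127.0.0.1", port), Handler)
    server.mcp_process = proc  # exposed so the caller can stop the child
    return server
```

Start it with make_bridge(["node", "your-server.js"], 8808).serve_forever() (the port is arbitrary) and point the Apidog base URL at http://127.0.0.1:8808/. The lock means requests are handled one at a time, which is fine for contract tests but hides concurrency bugs.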
2. Save the canonical requests
For each of initialize, tools/list, tools/call, resources/list, resources/read, prompts/list, prompts/get, save the JSON-RPC body as a request. Apidog stores them with body, headers, and expected status.
A tools/call request looks like this in Apidog’s request body view:
{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": {
      "city": "Tokyo"
    }
  }
}
3. Add assertions
The point of automation is asserting on the response, not sending the request. Apidog supports JSONPath assertions natively. For tools/list you want at least:
- $.result.tools exists
- $.result.tools.length is greater than zero
- Every tool has name, description, and inputSchema
- Every inputSchema is a valid JSON Schema
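Outside Apidog, the same tools/list checks can be approximated in a few lines of Python, which is handy for a quick local sanity pass. This is a structural sketch only: check_tools_list is a hypothetical helper, and it does not perform full JSON Schema validation (a dedicated validator library would be needed for that).

```python
def check_tools_list(response: dict) -> list[str]:
    """Return a list of problems found; an empty list means the shape looks fine."""
    tools = response.get("result", {}).get("tools")
    if not isinstance(tools, list) or not tools:
        return ["result.tools must be a non-empty array"]

    problems = []
    for i, tool in enumerate(tools):
        label = tool.get("name", f"tools[{i}]")
        for key in ("name", "description", "inputSchema"):
            if key not in tool:
                problems.append(f"{label}: missing {key}")
        schema = tool.get("inputSchema")
        if isinstance(schema, dict):
            # MCP tool input schemas describe an object of named arguments.
            if schema.get("type") != "object":
                problems.append(f"{label}: inputSchema.type should be 'object'")
            properties = schema.get("properties", {})
            for required_name in schema.get("required", []):
                if required_name not in properties:
                    problems.append(f"{label}: required '{required_name}' not in properties")
    return problems
```

Returning a list of messages rather than raising on the first failure means one run reports every broken tool at once.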
For tools/call on a known good input you want:
- $.result.isError is false (or absent)
- $.result.content is an array
- The first content block has the expected type and content
For tools/call on a known bad input (missing required argument) you want:
- $.result.isError is true
- $.result.content[0].text matches your expected error message
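For comparison, here is roughly what the good-input and bad-input tools/call checks look like as plain Python assertions. The function names are invented for this sketch, and the error text is matched with a regular expression rather than an exact string so small wording changes do not break the test.

```python
import re


def assert_tool_ok(response: dict, expected_type: str = "text") -> None:
    """Check a tools/call response for a known good input."""
    result = response["result"]
    assert not result.get("isError", False), "expected a successful tool result"
    content = result["content"]
    assert isinstance(content, list) and content, "content must be a non-empty array"
    assert content[0]["type"] == expected_type, "unexpected first content block type"


def assert_tool_error(response: dict, message_pattern: str) -> None:
    """Check a tools/call response for a known bad input."""
    assert "error" not in response, "tool failures should not be JSON-RPC errors"
    result = response["result"]
    assert result.get("isError") is True, "expected isError: true"
    text = result["content"][0]["text"]
    assert re.search(message_pattern, text), f"error text did not match: {text!r}"
```

A call such as assert_tool_error(resp, r"(?i)missing.*city") passes for any message that mentions the missing argument, whatever the exact phrasing.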
Apidog stores these assertions per request. Failures show in the run report.
4. Mock the upstream APIs
Most MCP servers wrap an external API: weather data, GitHub, Linear, an internal database. You do not want CI runs to hit live APIs every commit. Two reasons: cost and flakiness.
Apidog’s built-in mock server fixes this. Define each upstream endpoint as a mocked route returning a realistic JSON body. Point your MCP server’s config at the mock URL during tests and at production during real runs. We cover the mock workflow in detail in contract-first API development.
The result: a test suite that runs in seconds, requires no external network, and catches schema regressions long before they ship.
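The config switch is the part that matters, whichever mock you use. The sketch below is a standard-library stand-in, not Apidog's mock server: it serves one canned fixture and shows the server code reading its upstream base URL from an environment variable. The variable name WEATHER_API_BASE_URL, the /v1/weather path, the fixture body, and the api.example.com default are all hypothetical.

```python
import json
import os
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical fixture: path -> canned JSON body the real upstream would return.
FIXTURES = {"/v1/weather": {"city": "Tokyo", "temp_c": 21.5}}


class MockUpstream(BaseHTTPRequestHandler):
    def do_GET(self):
        path = self.path.split("?", 1)[0]  # ignore the query string
        found = path in FIXTURES
        payload = FIXTURES[path] if found else {"error": "not mocked"}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(200 if found else 404)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


def start_mock(port: int = 0) -> ThreadingHTTPServer:
    """Serve the fixtures on localhost in a background thread (port 0 = any free port)."""
    server = ThreadingHTTPServer(("127.0.0.1", port), MockUpstream)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def upstream_base_url() -> str:
    """What the MCP server would call: the mock in tests, production otherwise."""
    return os.environ.get("WEATHER_API_BASE_URL", "https://api.example.com")
```

In CI you would export WEATHER_API_BASE_URL with the mock's address before starting the MCP server; leaving it unset falls back to the production URL.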
5. Run the suite in CI
Apidog projects export to CLI runners. The apidog run command takes a project ID, runs every saved request, evaluates assertions, and exits non-zero on failure. Wire it into your GitHub Actions or any CI provider you already use.
A minimal workflow:
name: MCP server tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 22 }
      - run: npm ci
      - name: Start MCP HTTP wrapper
        run: node test/wrapper.js &
      - name: Run Apidog suite
        run: npx apidog run --project-id $APIDOG_PROJECT --env ci
        env:
          APIDOG_PROJECT: ${{ secrets.APIDOG_PROJECT }}
          APIDOG_TOKEN: ${{ secrets.APIDOG_TOKEN }}
Every push runs the entire MCP contract. Tool 7’s schema regression cannot land without surfacing.
What good test coverage looks like
A complete MCP server test plan in Apidog typically includes:
- 1 initialize request with capability assertions
- 1 tools/list request with shape and JSON Schema assertions
- 2 to 4 tools/call requests per tool: happy path, missing argument, invalid type, upstream error
- 1 resources/list plus 1 resources/read per resource family
- 1 prompts/list plus 1 prompts/get per prompt template
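For a 10-tool server with 3 resources and 4 prompts, that adds up to roughly 30 to 50 requests: 2 for initialize and tools/list, 20 to 40 tools/call requests, 4 for resources, and 5 for prompts. Apidog runs that locally in under 10 seconds with the mock server warm.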
Common mistakes when testing MCP servers
These are the patterns we see most often.
Skipping the initialize round trip. Several servers crash on tools/list if initialize was never called because they lazy-build their tool registry inside the handshake. Always run initialize first.
Asserting on raw error strings. Tool failure messages will change. Assert on isError: true and on a stable error code or a regex, not on exact string matches.
Letting the mock drift from production. A mock that returns shapes the real API never returns gives you green tests on a broken integration. Re-record mock fixtures from real responses every release.
Forgetting streaming. HTTP MCP servers can stream tool results over SSE instead of returning a single JSON body. Your test runner must handle SSE; Apidog does, but you have to enable streaming on the request.
Not testing concurrency. MCP clients send concurrent tools/call requests in agent loops. If your server holds shared state without locks, single-request tests pass and production breaks. Add a parallel-run test to the suite.
Confusing protocol errors with tool errors. The MCP spec separates them on purpose; mixing them up makes Claude Desktop close the connection. We covered the same kind of contract bug in API platform contract-first development.
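Of the mistakes above, the missing concurrency test is the easiest to close with a few lines of code. This is a minimal sketch: send stands for whatever function delivers one JSON-RPC request to your server and returns the parsed response (a hypothetical callable you supply), and the check only confirms that each request gets a result carrying its own id.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_parallel(send: Callable[[dict], dict], requests: list[dict], workers: int = 8) -> list[dict]:
    """Send all requests concurrently; responses come back in request order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order even though calls overlap in time.
        return list(pool.map(send, requests))


def find_mismatches(requests: list[dict], responses: list[dict]) -> list:
    """Return the ids of requests whose response has the wrong id or no result."""
    bad = []
    for req, resp in zip(requests, responses):
        if resp.get("id") != req["id"] or "result" not in resp:
            bad.append(req["id"])
    return bad
```

A non-empty list from find_mismatches after a parallel run that passes sequentially is a strong hint that the server shares state between in-flight calls.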
Real-world use cases
A team building an internal MCP server for their company’s incident management API caught three regressions in one week using Apidog assertions on tools/list shape. Without the test, the bugs would have shipped to every engineer using Claude Desktop simultaneously.
A solo developer publishing an open-source MCP server for Notion uses Apidog mocks to run the test suite without hitting Notion’s rate limits during CI. Their suite runs on every PR, takes 8 seconds, and caches Apidog’s mock fixtures in the repo so contributors do not need API access to develop.
A platform team running 14 internal MCP servers built a shared Apidog workspace where every server’s contract lives. New servers inherit a base test suite; reviewers can compare schema diffs side-by-side before merging. The team reports two outages prevented in the first quarter, both caught by tools/list shape assertions on a PR that would have shipped a renamed argument to every Claude Desktop user inside the company.
Another team, building an MCP server for an internal observability platform, uses Apidog’s environment switcher to run the same suite against staging and production. Each environment points at a different mock fixture file, so the same 60 assertions confirm both deployments without rewriting a single request.
Conclusion
MCP went mainstream this year, but the testing story is still where REST API testing was a decade ago: ad hoc, manual, fragile. You do not have to wait for the ecosystem to catch up. Treat your MCP server as the API it is, design a contract, mock the upstreams, and run assertions in CI.
Five takeaways:
- An MCP server is a JSON-RPC API; test it with the same rigor as a REST API.
- Start manual with the official inspector, then capture canonical requests and move to automation.
- Apidog handles JSON-RPC, assertions, mocks, and CI in one project.
- Cover the six dimensions: protocol conformance, schema correctness, tool behavior, resource access, prompt rendering, and failure modes.
- Mock upstream APIs in Apidog so the test suite stays fast and deterministic.
Next step: open Apidog, create a project, paste the request bodies you captured manually, add JSONPath assertions for tools/list, and run the suite. You will know within an hour whether your server’s contract is solid enough to ship.
FAQ
What is MCP?
MCP, the Model Context Protocol, is Anthropic’s open spec for how AI clients (like Claude Desktop) call external tools, resources, and prompts. It is JSON-RPC 2.0 over stdio or streamable HTTP. The full MCP specification is published on modelcontextprotocol.io.
Can I test an MCP server without an HTTP wrapper?
Yes. The official MCP inspector speaks stdio directly and gives you a UI for manual testing. For automated testing in Apidog, wrap stdio in a thin HTTP server during CI; production traffic still goes over stdio.
How do I mock upstream APIs that my MCP server calls?
Define each upstream endpoint as a mock in your Apidog project, point the MCP server’s config at the mock URL during tests, and switch to production URLs at runtime. We walk through the same pattern in API testing tools for QA engineers.
What about streaming tool results?
HTTP MCP servers can stream tool results over Server-Sent Events. Apidog supports SSE in saved requests; turn it on in the request settings and assert on the assembled stream.
Should I test the protocol version?
Yes. Pin the protocolVersion you support in initialize and assert against it. Mismatches cause silent client incompatibility.
Can I test my MCP server against real Claude Desktop?
You can, and you should at least once before each release. But do not rely on Claude Desktop as your test loop; it is slow, manual, and non-deterministic. Use Apidog for the regression suite and Claude Desktop for the smoke test.
Where can I see real MCP server examples?
The official MCP servers repository has reference implementations for filesystem, GitHub, Slack, Postgres, and dozens more. Read the tool definitions; they are the easiest way to understand what good MCP shape looks like.



