If your API works fine for one user but falls over under traffic, you need load testing, and k6 is one of the cleanest ways to do it. This guide covers what k6 is, how to install it, how to write your first script, and how to read the results, so you can treat load testing as part of your normal API performance testing routine. We’ll also look at how k6 fits alongside functional testing in CI, drawing on the official k6 documentation where the details matter.
What is k6?
k6 is an open-source load testing tool, now maintained by Grafana Labs. You write your test as a JavaScript file, k6 runs it with a fast Go engine, and it hammers your endpoints with simulated traffic. The split is deliberate: you author tests in a language most developers already know, but the load generator itself runs as compiled Go, so a single machine can drive a lot of virtual users without choking.

k6 is built for one job and does it well: generating sustained, repeatable load and measuring how your system responds. It reports latency percentiles, request rates, and error rates, and lets you set pass/fail rules on those numbers. That focus is the point. k6 is not an API client, a documentation tool, or a functional test framework. It’s a load engine.
A few terms you’ll meet constantly:
- Virtual user (VU): a simulated user running your script in a loop. More VUs means more concurrent load.
- Iteration: one full pass through your test function. A VU runs iterations back to back.
- Stage: a step in a load profile, used to ramp VUs up or down over time.
- Threshold: a pass/fail rule on a metric, like “95th percentile latency must stay under 500ms.”
- Check: a non-fatal assertion on a response, like “status was 200.” Failed checks get counted, but the test keeps running.
Installing k6
k6 ships as a single binary, so installation is short. On macOS with Homebrew:
brew install k6
On Windows with Chocolatey:
choco install k6
On Debian or Ubuntu, add the Grafana apt repository and install:
sudo gpg -k
sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg \
  --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69
echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" \
  | sudo tee /etc/apt/sources.list.d/k6.list
sudo apt-get update
sudo apt-get install k6
Confirm it works:
k6 version
A Docker image is also available if you’d rather not install anything locally. Check the install page in the docs for the current commands, since package details shift over time.
Your first k6 script
A k6 test is a JavaScript module with a default function. k6 calls that function once per iteration, per VU. Here’s a minimal script that hits one endpoint and checks the response:
import http from 'k6/http';
import { check, sleep } from 'k6';

export default function () {
  const res = http.get('https://test-api.example.com/users');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'body is not empty': (r) => r.body.length > 0,
  });
  sleep(1);
}
Save it as script.js and run it:
k6 run script.js
By default k6 runs one VU for one iteration. The sleep(1) call pauses for one second at the end of each iteration, which mimics a real user pausing between actions. Without sleep, each VU starts its next iteration as soon as the previous response arrives, which is useful for raw throughput tests but unrealistic for user-behavior simulation.
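The pacing above also gives a rough ceiling on request rate: each VU finishes about one iteration per (response time + sleep) seconds. The sketch below is plain JavaScript for back-of-the-envelope sizing, not part of the k6 API. The function name `estimateRequestsPerSecond`, the 10 VUs, and the 200ms response time are all assumptions chosen for illustration.

```javascript
// Rough estimate, assuming one request per iteration and steady latency.
// Not a k6 API: a sizing helper you could run in any JavaScript runtime.
function estimateRequestsPerSecond(vus, avgLatencySeconds, sleepSeconds) {
  const iterationSeconds = avgLatencySeconds + sleepSeconds;
  if (iterationSeconds <= 0) {
    throw new RangeError('iteration time must be positive');
  }
  return vus / iterationSeconds;
}

// 10 VUs, an assumed 200ms response time, and sleep(1) in each iteration.
const estimate = estimateRequestsPerSecond(10, 0.2, 1);
console.log(estimate.toFixed(1)); // prints "8.3"
```

Real throughput will drift from this figure as latency changes under load, so treat it as a starting point for choosing VU counts rather than a prediction.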
The check() calls are soft assertions. A failed check shows up in the summary but doesn’t stop the run. That’s intentional. Under heavy load you expect some failures, and you want the test to keep measuring so you can see how bad it gets.
VUs, stages, and thresholds
The first script runs a single user once. Real load testing is about controlling how many users hit your API and for how long. You configure that with an exported options object.
The simplest form sets a fixed number of VUs and a duration:
export const options = {
  vus: 50,
  duration: '30s',
};
That runs 50 virtual users for 30 seconds. More useful is a ramping profile built from stages, which lets you simulate traffic climbing, holding, and dropping:
export const options = {
  stages: [
    { duration: '1m', target: 100 }, // ramp up to 100 VUs
    { duration: '3m', target: 100 }, // hold at 100 VUs
    { duration: '1m', target: 0 },   // ramp down to 0
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests under 500ms
    http_req_failed: ['rate<0.01'],   // less than 1% errors
  },
};
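To make the ramp concrete, here is a small sketch that computes the VU count a staged profile is aiming for at a given moment. It assumes straight-line ramps between stage targets and takes the starting VU count as a parameter, since the real starting point depends on your configuration. This is plain JavaScript for illustration, not k6 code: `targetVusAt` and the seconds-based stage format are made up for the example.

```javascript
// Illustrative only: approximates a staged profile with straight-line ramps.
// Stage durations are given in seconds to keep duration parsing out of the sketch.
function targetVusAt(stages, elapsedSeconds, startVus = 0) {
  let stageStart = 0;
  let previousTarget = startVus;
  for (const stage of stages) {
    const stageEnd = stageStart + stage.seconds;
    if (stage.seconds > 0 && elapsedSeconds <= stageEnd) {
      const progress = (elapsedSeconds - stageStart) / stage.seconds;
      return Math.round(previousTarget + (stage.target - previousTarget) * progress);
    }
    stageStart = stageEnd;
    previousTarget = stage.target;
  }
  return previousTarget; // past the last stage: hold the final target
}

// The same shape as the profile above: 1m up, 3m hold, 1m down.
const stages = [
  { seconds: 60, target: 100 },
  { seconds: 180, target: 100 },
  { seconds: 60, target: 0 },
];

console.log(targetVusAt(stages, 30));  // 50, halfway up the ramp
console.log(targetVusAt(stages, 120)); // 100, holding
console.log(targetVusAt(stages, 270)); // 50, halfway down
```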
Thresholds are where k6 earns its place in CI. If a threshold fails, k6 exits with a non-zero code. That means a pipeline step can fail the build when latency or error rates cross a line you set. You’re encoding a performance budget as code, the same way you’d encode a functional assertion.
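As one possible shape for a CI gate, the options below combine the short threshold form with the long form that k6 documents for aborting a run early. The numbers are placeholders rather than recommendations, and whether you want `abortOnFail` at all depends on how much of a failing run you would like to observe.

```javascript
export const options = {
  thresholds: {
    // Short form: a list of expressions evaluated against the metric.
    http_req_failed: ['rate<0.01'],
    // Turn the soft check() results into a hard gate on their pass rate.
    checks: ['rate>0.99'],
    // Long form: stop the run early once the latency budget is blown,
    // but wait 10s first so a cold start does not trip it.
    http_req_duration: [
      { threshold: 'p(95)<500', abortOnFail: true, delayAbortEval: '10s' },
    ],
  },
};
```

With this in place, a failed check still does not stop the script on its own, but a pass rate below 99% fails the run at the end.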
A quick map of the common load profiles and the question each one answers:
| Profile | Goal | What it tells you |
|---|---|---|
| Smoke | Tiny load, verify the script runs | The test itself is correct |
| Load | Expected normal traffic | Does the API hold up day to day |
| Stress | Push past expected peak | Where does it start to break |
| Spike | Sudden sharp jump in VUs | Can it survive a traffic surge |
| Soak | Moderate load over hours | Memory leaks, slow degradation |
You don’t need all five. Start with smoke and load. Add stress and spike once you know your normal numbers. The performance testing fundamentals behind these profiles, and the metrics that go with them, apply to every load testing tool, not just k6.
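One way to keep several profiles in a single script is to store them as plain objects and pick one at run time. The sketch below is an assumption about how you might organize this, not a k6 convention: the `profiles` map, the `pickProfile` helper, and every number in it are hypothetical.

```javascript
// Hypothetical profile presets; tune every number to your own traffic.
const profiles = {
  smoke: { vus: 1, duration: '30s' },
  load: {
    stages: [
      { duration: '1m', target: 100 },
      { duration: '3m', target: 100 },
      { duration: '1m', target: 0 },
    ],
  },
  spike: {
    stages: [
      { duration: '10s', target: 500 }, // sudden jump
      { duration: '1m', target: 500 },  // brief hold at the peak
      { duration: '10s', target: 0 },   // drop back down
    ],
  },
};

// Returns the named preset, defaulting to the cheapest one.
function pickProfile(name = 'smoke') {
  if (!Object.prototype.hasOwnProperty.call(profiles, name)) {
    throw new Error(`Unknown load profile: ${name}`);
  }
  return profiles[name];
}

// In a k6 script, one way to wire this up would be:
//   export const options = pickProfile(__ENV.PROFILE);
// and then select a profile with: k6 run -e PROFILE=spike script.js
console.log(pickProfile('spike').stages.length); // 3
```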
Reading k6 results
When a run finishes, k6 prints a summary to the terminal. The lines that matter most:
- http_req_duration: total request time, shown as average, min, median, max, p90, and p95 by default. High percentiles such as p95 and p99 tell you what your slowest users actually experience (p99 is not printed unless you enable it). Averages hide pain; percentiles surface it.
- http_req_failed: the share of requests that failed. Watch how this moves as VUs climb.
- http_reqs: total requests and requests per second. This is your throughput.
- iterations: how many full passes completed, and the rate.
- vus and vus_max: active and peak virtual users.
- checks: the pass rate on your check() assertions.
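The default summary stops at p(95) for trend metrics such as http_req_duration. If you want p(99) in the terminal output as well, a minimal sketch using the `summaryTrendStats` option looks like this (the exact list of columns is a matter of taste):

```javascript
export const options = {
  // Columns printed for trend metrics in the end-of-test summary.
  // The last entry adds the 99th percentile to the defaults.
  summaryTrendStats: ['avg', 'min', 'med', 'max', 'p(90)', 'p(95)', 'p(99)'],
};
```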
Read percentiles, not averages. An average response time of 200ms sounds fine until you see a p99 of 4 seconds, which means one request in a hundred takes four seconds or longer. That tail is where users churn.
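A small worked example shows how far an average can sit from the tail. The sketch below uses a made-up sample of 100 latencies and the nearest-rank percentile method, which is a simplification and may not match k6's own percentile calculation digit for digit. The helper names are illustrative.

```javascript
// Nearest-rank percentile over a list of latencies in milliseconds.
function percentile(values, p) {
  if (values.length === 0) {
    throw new RangeError('need at least one sample');
  }
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

function average(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Made-up sample: 98 fast requests and 2 very slow ones.
const latencies = [...Array(98).fill(160), 4000, 4000];

console.log(average(latencies));        // 236.8
console.log(percentile(latencies, 50)); // 160
console.log(percentile(latencies, 99)); // 4000
```

The average and the median both look healthy here, and only the p99 reveals that some requests take four seconds.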
For anything beyond eyeballing the terminal, k6 can stream results to external outputs. It writes JSON or CSV, and integrates with Grafana dashboards and Prometheus for live, visual analysis during a run. k6 is maintained by Grafana Labs and pairs naturally with Grafana dashboards, which is why you’ll often see the tool called “Grafana k6.” For a one-off test the terminal summary is enough; for ongoing monitoring, ship the metrics somewhere you can chart them.
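If you write results to a file with `k6 run --out json=results.json script.js`, you can post-process them with ordinary JavaScript. The sketch below assumes the line-delimited format described in the k6 docs, where each sample is a `Point` object carrying a `metric` name and a `data.value`. Check a real output file before relying on it. The `metricValues` helper and the inline sample are hypothetical.

```javascript
// Collects the values of one metric from line-delimited JSON output.
// Assumes each sample line looks roughly like:
//   {"type":"Point","metric":"http_req_duration","data":{"value":123.4}}
function metricValues(ndjsonText, metricName) {
  const values = [];
  for (const line of ndjsonText.split('\n')) {
    if (line.trim() === '') continue; // skip blank lines
    const entry = JSON.parse(line);
    if (entry.type === 'Point' && entry.metric === metricName) {
      values.push(entry.data.value);
    }
  }
  return values;
}

// Tiny inline sample standing in for the contents of results.json.
const sample = [
  '{"type":"Metric","metric":"http_req_duration","data":{"type":"trend"}}',
  '{"type":"Point","metric":"http_req_duration","data":{"value":120.5}}',
  '{"type":"Point","metric":"http_reqs","data":{"value":1}}',
  '{"type":"Point","metric":"http_req_duration","data":{"value":310.2}}',
].join('\n');

console.log(metricValues(sample, 'http_req_duration')); // [ 120.5, 310.2 ]
```

From there you can feed the values into whatever analysis you like, such as your own percentile calculation or a comparison against a previous run.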
Where k6 fits, and where Apidog fits
k6 is a load engine. It answers “how does my system behave under sustained traffic.” It does not check whether your API returns the right data, matches its contract, or handles auth correctly across every endpoint. Those are functional and contract testing questions, and they need a different tool.
This is the split worth keeping clear. You want both kinds of testing in your pipeline, and they don’t compete:
| Concern | Best handled by | What it answers |
|---|---|---|
| Sustained heavy load, percentiles at scale | k6 | Does it stay fast under traffic |
| Functional correctness, contract, auth | Apidog | Does it return the right thing |
| Regression in CI on every commit | Apidog (apidog run) | Did this change break an endpoint |
| Performance budgets in CI | k6 thresholds | Did latency or errors cross a line |
Apidog handles the correctness side. You design or import your API, build test scenarios with visual assertions, and run them in CI with apidog run, the same way you’d run a k6 script. The Apidog CLI guide walks through wiring those functional tests into a pipeline. Apidog also includes lighter performance-test features for quick checks, covered in the API performance testing in Apidog walkthrough, but it isn’t a k6-class load generator and isn’t trying to be.
A practical workflow looks like this. On every commit, Apidog runs your functional and contract tests to confirm the API still does what it should. On a schedule or before a release, k6 runs a load profile against a staging environment to confirm the API stays fast under traffic. Correctness gate and performance gate, each with the tool built for it.
If you’re comparing engines before committing, k6 sits next to JMeter, Gatling, and Locust. The load testing tools roundup and this Locust alternative comparison lay out the trade-offs if scripting language or scale changes your pick.
Frequently asked questions
Is k6 free?
Yes. k6 is open source under the AGPL-3.0 license, and the binary is free to run locally with no cap on virtual users beyond your own hardware. Grafana Labs also offers Grafana Cloud k6 (formerly k6 Cloud), a hosted commercial service for running large distributed tests and storing results, but you never have to use it. The core tool covers most teams. If you want to scan other free options first, the load testing tools overview lists what each one gives away.
Do I need to know JavaScript to use k6?
You need basic JavaScript, not deep expertise. Most k6 scripts are a default function, some http.get or http.post calls, a few checks, and an options object. If you can read the examples in this guide, you can write a working test. There’s no build step and no framework to learn, just the k6 API.
What’s the difference between k6 and Apidog for performance testing?
k6 is a dedicated load generator built to drive sustained heavy traffic and report percentiles at scale. Apidog is an API platform focused on design, functional testing, and contract testing, with lighter performance-test features for quick checks. Use k6 when you need real load and CI performance budgets. Use Apidog for correctness, contract validation, and running functional tests on every commit. They solve different problems and work well together.
Can I run k6 in CI/CD?
Yes, and it’s a common setup. k6 exits with a non-zero code when a threshold fails, so any CI system can fail the build on a performance regression. Run k6 run script.js as a pipeline step, point it at a staging environment, and set thresholds for p95 latency and error rate. Pair it with functional tests from apidog run so your pipeline has both a correctness gate and a performance gate.
Conclusion
k6 gives you a clean, scriptable way to put real load on your API and measure what happens. Install the binary, write a short JavaScript file, set VUs and stages, add thresholds, and read the percentiles. That’s the whole loop. Keep load testing separate from functional testing, since each answers a different question, and run both in CI so nothing slips through.
For the correctness side of that split, Apidog lets you design, test, mock, and document your API in one place, then run functional tests in CI with apidog run. Download Apidog to pair contract-level confidence with your k6 load runs.



