How to Implement Pagination in REST APIs (Step-by-Step Guide)

When building REST APIs that return a list of resources, it's crucial to consider how to handle large datasets. Returning thousands or even millions of records in a single API response is impractical and can lead to significant performance issues, high memory consumption for both the server and the client, and a poor user experience. Pagination is the standard solution to this problem. It involves breaking down a large dataset into smaller, manageable chunks called "pages," which are then served sequentially. This tutorial will guide you through the technical steps of implementing various pagination strategies in your REST APIs.

Why is Pagination Essential?

Before diving into implementation details, let's briefly touch upon why pagination is a non-negotiable feature for APIs dealing with collections of resources:

Performance: Requesting and transferring large amounts of data can be slow. Pagination reduces the payload size of each request, leading to faster response times and reduced server load.
Resource Consumption: Smaller responses consume less memory on the server generating them and on the client parsing them. This is especially critical for mobile clients or environments with limited resources.
Rate Limiting and Quotas: Many APIs enforce rate limits. Pagination helps clients stay within these limits by fetching data in smaller pieces over time, rather than trying to get everything at once.
User Experience: For UIs consuming the API, presenting data in pages is much more user-friendly than overwhelming users with an enormous list or a very long scroll.
Database Efficiency: Fetching a subset of data is generally less taxing on the database compared to retrieving an entire table, especially if proper indexing is in place.

Common Pagination Strategies

There are several common strategies for implementing pagination, each with its own set of trade-offs. We'll explore the most popular ones: offset/limit (often referred to as page-based) and cursor-based (also known as keyset or seek pagination).

1. Offset/Limit (or Page-Based) Pagination

This is arguably the most straightforward and widely adopted pagination method. It works by allowing the client to specify two main parameters:

offset: The number of records to skip from the beginning of the dataset.
limit: The maximum number of records to return in a single page.

Alternatively, clients might specify:

page: The page number they want to retrieve.
pageSize (or per_page, limit): The number of records per page.

The offset can be calculated from page and pageSize using the formula: offset = (page - 1) * pageSize.
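
As a minimal sketch of that formula, the helper below converts a 1-based page number into a row offset. The class and method names (PageMath, toOffset) are illustrative assumptions, not part of any framework.

```java
// Hypothetical helper: converts a 1-based page number into a zero-based row offset.
final class PageMath {
    private PageMath() {}

    static long toOffset(int page, int pageSize) {
        if (page < 1 || pageSize < 1) {
            throw new IllegalArgumentException("page and pageSize must be >= 1");
        }
        // Widen to long before multiplying so very large page numbers cannot overflow int.
        return (long) (page - 1) * pageSize;
    }
}
```

With page=3 and pageSize=10 this yields an offset of 20, matching the request example in this section.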

Technical Implementation Steps:

Let's assume we have an API endpoint /items that returns a list of items.

a. API Request Parameters:
The client would make a request like:
GET /items?offset=20&limit=10 (fetch 10 items, skipping the first 20)
or
GET /items?page=3&pageSize=10 (fetch the 3rd page, with 10 items per page, which is equivalent to offset=20, limit=10).

It's good practice to set default values for these parameters (e.g., limit=20, offset=0 or page=1, pageSize=20) if the client doesn't provide them. Also, enforce a maximum limit or pageSize to prevent clients from requesting an excessively large number of records, which could strain the server.
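
One way to apply defaults and a maximum is to normalize the raw parameters before they reach the query layer. The sketch below assumes a default limit of 20 and a cap of 100 (the values used elsewhere in this guide); PageRequestParams is a hypothetical class, and a null argument stands for a parameter the client omitted.

```java
// Hypothetical value object holding normalized offset/limit parameters.
final class PageRequestParams {
    static final int DEFAULT_LIMIT = 20; // assumed default
    static final int MAX_LIMIT = 100;    // assumed maximum

    final int offset;
    final int limit;

    private PageRequestParams(int offset, int limit) {
        this.offset = offset;
        this.limit = limit;
    }

    // A null argument means the client did not send that parameter.
    static PageRequestParams of(Integer offset, Integer limit) {
        int safeOffset = (offset == null) ? 0 : Math.max(0, offset);
        int requested = (limit == null) ? DEFAULT_LIMIT : limit;
        // Clamp into the range 1..MAX_LIMIT instead of rejecting the request.
        int safeLimit = Math.min(MAX_LIMIT, Math.max(1, requested));
        return new PageRequestParams(safeOffset, safeLimit);
    }
}
```

This version silently caps out-of-range values. Rejecting them with a 400 response is an equally valid choice, as long as the behavior is documented.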

b. Backend Logic (Conceptual):
When the server receives this request, it needs to translate these parameters into a database query.

// Example in Java with Spring Boot
@GetMapping("/items")
public ResponseEntity<PaginatedResponse<Item>> getItems(
    @RequestParam(defaultValue = "0") int offset,
    @RequestParam(defaultValue = "20") int limit
) {
    // Clamp the inputs: no negative offset, and a limit between 1 and 100
    offset = Math.max(0, offset);
    limit = Math.min(100, Math.max(1, limit));

    // The repository query must apply a fixed ORDER BY so pages are stable
    List<Item> items = itemRepository.findItemsWithOffsetLimit(offset, limit);
    long totalItems = itemRepository.countTotalItems(); // For metadata

    // Construct and return paginated response
    // ...
}

c. Database Query (SQL Example):
Most relational databases support offset and limit clauses directly.

For PostgreSQL or MySQL:

SELECT *
FROM items
ORDER BY created_at DESC -- Consistent ordering is crucial for stable pagination
LIMIT 10 -- This is the 'limit' parameter
OFFSET 20; -- This is the 'offset' parameter

For SQL Server 2012 and later (older versions need a ROW_NUMBER() workaround):

SELECT *
FROM items
ORDER BY created_at DESC
OFFSET 20 ROWS
FETCH NEXT 10 ROWS ONLY;

For Oracle before 12c (12c and later also accept the OFFSET ... FETCH syntax shown above):

SELECT *
FROM (
    SELECT i.*, ROWNUM rnum
    FROM (
        SELECT *
        FROM items
        ORDER BY created_at DESC
    ) i
    WHERE ROWNUM <= 20 + 10 -- offset + limit
)
WHERE rnum > 20; -- offset

Important Note on Ordering: For offset/limit pagination to be reliable, every request must use the same ORDER BY, and the sort key must be unique or made unique with a tie-breaker such as the primary ID. If rows tie on the sort key, the database is free to return them in a different order on each request, so items can appear twice or be skipped across pages. A common choice is to sort by creation timestamp with the primary ID as the tie-breaker. A stable sort does not protect against rows being inserted or deleted between requests; that limitation is discussed under the cons below.

d. API Response Structure:
A good paginated response should not only include the data for the current page but also metadata to help the client navigate.

{
  "data": [
    // array of items for the current page
    { "id": "item_21", "name": "Item 21", ... },
    { "id": "item_22", "name": "Item 22", ... },
    // ... up to 'limit' items
    { "id": "item_30", "name": "Item 30", ... }
  ],
  "pagination": {
    "offset": 20,
    "limit": 10,
    "totalItems": 5000, // Total number of items available
    "totalPages": 500, // Calculated as ceil(totalItems / limit)
    "currentPage": 3 // Calculated as (offset / limit) + 1
  },
  "links": { // HATEOAS links for navigation
    "self": "/items?offset=20&limit=10",
    "first": "/items?offset=0&limit=10",
    "prev": "/items?offset=10&limit=10", // Null if on the first page
    "next": "/items?offset=30&limit=10", // Null if on the last page
    "last": "/items?offset=4990&limit=10"
  }
}

Providing HATEOAS (Hypermedia as the Engine of Application State) links (self, first, prev, next, last) is a REST best practice. It allows clients to navigate through the pages without having to construct the URLs themselves.
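
The metadata and links in the response above can be derived from offset, limit, and totalItems alone. The following is a rough sketch under that assumption; OffsetLinks and its method names are hypothetical, and a real service would usually build absolute URLs and preserve any other query parameters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that derives pagination metadata and navigation links.
final class OffsetLinks {
    private OffsetLinks() {}

    // Integer ceiling of totalItems / limit.
    static long totalPages(long totalItems, int limit) {
        return (totalItems + limit - 1) / limit;
    }

    static long currentPage(long offset, int limit) {
        return offset / limit + 1;
    }

    static Map<String, String> build(String basePath, long offset, int limit, long totalItems) {
        // Offset of the last page: the largest multiple of limit below totalItems.
        long lastOffset = (totalItems == 0) ? 0 : ((totalItems - 1) / limit) * limit;

        Map<String, String> links = new LinkedHashMap<>(); // keeps insertion order
        links.put("self", url(basePath, offset, limit));
        links.put("first", url(basePath, 0, limit));
        // prev is null on the first page, next is null on the last page.
        links.put("prev", offset > 0 ? url(basePath, Math.max(0, offset - limit), limit) : null);
        links.put("next", offset + limit < totalItems ? url(basePath, offset + limit, limit) : null);
        links.put("last", url(basePath, lastOffset, limit));
        return links;
    }

    private static String url(String basePath, long offset, int limit) {
        return basePath + "?offset=" + offset + "&limit=" + limit;
    }
}
```

Calling OffsetLinks.build("/items", 20, 10, 5000) reproduces the five links shown in the JSON example.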

Pros of Offset/Limit Pagination:

Simplicity: Easy to understand and implement.
Random Access: Allows direct navigation to any specific page (e.g., "jump to page 50").
Widely Supported: Database support for OFFSET and LIMIT is common.

Cons of Offset/Limit Pagination:

Performance Degradation with Large Offsets: As the offset grows, queries get slower. The database typically has to read and order offset + limit rows and then discard the first offset rows, so the cost of a page grows with its depth. This makes deep pages inefficient.
Data Skew (Duplicate or Missed Items): If items are added or removed while a user is paginating, the "window" of data shifts. This is particularly problematic with frequently updated datasets. For example, suppose you have fetched page 2 (items 11-20) and a new item is then inserted at the beginning of the list. Every existing item moves down one position, so the old item 20 is now item 21 and appears again at the top of page 3 as a duplicate. Conversely, if an item on an earlier page is deleted, every later item moves up one position, so the old item 21 becomes item 20 and is skipped when you request page 3.
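
The duplicate-item effect is easy to reproduce in memory. The sketch below simulates OFFSET/LIMIT over a list of ids sorted newest first; OffsetDriftDemo is a hypothetical class written only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory simulation of OFFSET/LIMIT over a newest-first list.
final class OffsetDriftDemo {
    private OffsetDriftDemo() {}

    // Returns a copy of the requested window, like SELECT ... LIMIT limit OFFSET offset.
    static List<Integer> page(List<Integer> sortedNewestFirst, int offset, int limit) {
        int from = Math.min(offset, sortedNewestFirst.size());
        int to = Math.min(offset + limit, sortedNewestFirst.size());
        return new ArrayList<>(sortedNewestFirst.subList(from, to));
    }

    // True if an insert between two page requests makes an item show up twice.
    static boolean insertCausesDuplicate() {
        List<Integer> ids = new ArrayList<>();
        for (int id = 30; id >= 1; id--) {
            ids.add(id); // ids 30..1, newest first
        }
        List<Integer> page2 = page(ids, 10, 10); // ids 20..11
        ids.add(0, 31);                          // a new item arrives at the top
        List<Integer> page3 = page(ids, 20, 10); // now ids 11..2
        Integer lastSeen = page2.get(page2.size() - 1);
        return page3.contains(lastSeen);         // id 11 is served a second time
    }
}
```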

2. Cursor-Based (Keyset/Seek) Pagination

Cursor-based pagination addresses some of the shortcomings of offset/limit, particularly performance with large datasets and data consistency issues. Instead of relying on an absolute offset, it uses a "cursor" that points to a specific item in the dataset. The client then requests items "after" or "before" this cursor.

The cursor is typically an opaque string that encodes the value(s) of the sort key(s) of the last item retrieved on the previous page.

Technical Implementation Steps:

a. API Request Parameters:
The client would make a request like:
GET /items?limit=10 (for the first page)
And for subsequent pages:
GET /items?limit=10&after_cursor=opaquestringrepresentinglastitemid
Or, to paginate backward (less common but possible):
GET /items?limit=10&before_cursor=opaquestringrepresentingfirstitemid

The limit parameter still defines the page size.

b. What is a Cursor?
A cursor should be:

Opaque to the client: The client shouldn't need to understand its internal structure. It just receives it from one response and sends it back in the next request.
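Based on unique and sequentially ordered column(s): Typically, this is the primary ID (if it's sequential, such as a database sequence or a time-ordered identifier like UUIDv7) or a timestamp column. Note that UUIDv1 embeds a timestamp but does not sort chronologically in its standard layout. If a single column isn't unique enough (e.g., multiple items can have the same timestamp), a combination of columns is used (e.g., timestamp + id).
Encodable and Decodable: Often encoded with the URL-safe Base64 variant, since standard Base64 output can contain +, / and =, which need escaping in a query string. It could be as simple as the ID itself, or a JSON object { "last_id": 123, "last_timestamp": "2023-10-27T10:00:00Z" } which is then Base64 encoded. Encoding is not encryption: clients can decode the value, so sign or encrypt the cursor if its contents must stay hidden or tamper-proof.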
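
A minimal sketch of such a cursor codec follows, assuming the cursor carries a creation timestamp plus an id. The Cursor record, the CursorCodec class, and the "|" separator are illustrative choices, not a standard format.

```java
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.Base64;

// Hypothetical cursor payload: the sort-key values of the last item on a page.
record Cursor(Instant createdAt, String id) {}

final class CursorCodec {
    private CursorCodec() {}

    static String encode(Cursor cursor) {
        // ISO-8601 timestamps never contain '|', so it is safe as a separator here.
        String raw = cursor.createdAt().toString() + "|" + cursor.id();
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    // Throws IllegalArgumentException for any token this codec did not produce.
    static Cursor decode(String token) {
        try {
            // The decoder itself throws IllegalArgumentException on invalid Base64.
            String raw = new String(Base64.getUrlDecoder().decode(token), StandardCharsets.UTF_8);
            int sep = raw.indexOf('|');
            if (sep < 0) {
                throw new IllegalArgumentException("Malformed cursor");
            }
            return new Cursor(Instant.parse(raw.substring(0, sep)), raw.substring(sep + 1));
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("Malformed cursor", e);
        }
    }
}
```

An endpoint would typically translate the IllegalArgumentException from decode into a 400 Bad Request response.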

c. Backend Logic (Conceptual):

// Example in Java with Spring Boot
@GetMapping("/items")
public ResponseEntity<CursorPaginatedResponse<Item>> getItems(
    @RequestParam(defaultValue = "20") int limit,
    // Bind explicitly: the query parameter is after_cursor, not afterCursor
    @RequestParam(name = "after_cursor", required = false) String afterCursor
) {
    // Clamp limit to the range 1..100
    limit = Math.min(100, Math.max(1, limit));

    List<Item> items;
    if (afterCursor != null) {
        // Decode the cursor to get the last seen item's sort-key values
        DecodedCursor decoded = decodeCursor(afterCursor); // e.g., { lastId: "some_uuid", lastCreatedAt: "timestamp" }
        items = itemRepository.findItemsAfter(decoded.getLastCreatedAt(), decoded.getLastId(), limit);
    } else {
        // No cursor means the client wants the first page
        items = itemRepository.findFirstPage(limit);
    }

    String nextCursor = null;
    // A full page suggests more data may follow. If the total is an exact multiple
    // of limit, the client makes one extra request that returns an empty page.
    if (items.size() == limit) {
        // Items are sorted, so the last item on the page anchors the next cursor
        Item lastItemOnPage = items.get(items.size() - 1);
        nextCursor = encodeCursor(lastItemOnPage.getCreatedAt(), lastItemOnPage.getId());
    }

    // Construct and return cursor paginated response
    // ...
}

// Helper methods for encoding/decoding cursors
// private DecodedCursor decodeCursor(String cursor) { ... }
// private String encodeCursor(Timestamp createdAt, String id) { ... }

d. Database Query (SQL Example):
The key is to use a WHERE clause that filters records based on the sort key(s) from the cursor. The ORDER BY clause must align with the cursor's composition.

Assuming sorting by created_at (descending) and then by id (descending) as a tie-breaker for stable ordering if created_at is not unique:

For the first page:

SELECT *
FROM items
ORDER BY created_at DESC, id DESC
LIMIT 10;

For subsequent pages, if the cursor decoded to last_created_at_from_cursor and last_id_from_cursor:

SELECT *
FROM items
WHERE (created_at, id) < (CAST('last_created_at_from_cursor' AS TIMESTAMP), CAST('last_id_from_cursor' AS UUID)) -- Or appropriate types
-- For ascending order, it would be >
-- The row-value comparison (created_at, id) < (val1, val2), supported by PostgreSQL and MySQL among others, is a concise way to write:
-- WHERE created_at < 'last_created_at_from_cursor'
--    OR (created_at = 'last_created_at_from_cursor' AND id < 'last_id_from_cursor')
ORDER BY created_at DESC, id DESC
LIMIT 10;

This type of query is very efficient, especially if there's an index on (created_at, id). The database can directly "seek" to the starting point without scanning irrelevant rows.
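
To make the seek predicate concrete, here is an in-memory sketch of the same logic. Row and KeysetPager are hypothetical names, and the stream pipeline only stands in for what the database would do with an index on (created_at, id).

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical row type carrying only the two sort-key columns.
record Row(Instant createdAt, long id) {}

final class KeysetPager {
    private KeysetPager() {}

    // Mirrors ORDER BY created_at DESC, id DESC.
    static final Comparator<Row> ORDER =
            Comparator.comparing(Row::createdAt).thenComparingLong(Row::id).reversed();

    // Expanded form of (created_at, id) < (last.createdAt, last.id).
    static boolean isAfter(Row row, Row last) {
        int byTime = row.createdAt().compareTo(last.createdAt());
        return byTime < 0 || (byTime == 0 && row.id() < last.id());
    }

    // Pass last == null for the first page.
    static List<Row> pageAfter(List<Row> table, Row last, int limit) {
        return table.stream()
                .sorted(ORDER)
                .filter(row -> last == null || isAfter(row, last))
                .limit(limit)
                .collect(Collectors.toList());
    }
}
```

The id tie-breaker is what keeps rows with identical timestamps from being skipped or repeated at a page boundary.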

e. API Response Structure:

{
  "data": [
    // array of items for the current page
    { "id": "item_N", "createdAt": "2023-10-27T10:05:00Z", ... },
    // ... up to 'limit' items
    { "id": "item_M", "createdAt": "2023-10-27T10:00:00Z", ... }
  ],
  "pagination": {
    "limit": 10,
    "hasNextPage": true, // boolean indicating if there's more data
    "nextCursor": "base64encodedcursorstringforitem_M" // opaque string
    // Potentially a "prevCursor" if bi-directional cursors are supported
  },
  "links": {
    "self": "/items?limit=10&after_cursor=current_request_cursor_if_any",
    "next": "/items?limit=10&after_cursor=base64encodedcursorstringforitem_M" // Null if no next page
  }
}

Notice that cursor-based pagination typically doesn't provide totalPages or totalItems because calculating these would require a full table scan, negating some of the performance benefits. If these are strictly needed, a separate endpoint or an estimate might be provided.
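
A common way to compute hasNextPage without counting is to query for limit + 1 rows and check whether the extra row came back. The sketch below assumes the caller has already run such a query; Slice and SliceBuilder are hypothetical names.

```java
import java.util.List;

// Hypothetical result type: one page of items plus a "more data" flag.
record Slice<T>(List<T> items, boolean hasNextPage) {}

final class SliceBuilder {
    private SliceBuilder() {}

    // `fetched` is expected to be the result of a query run with LIMIT (limit + 1).
    static <T> Slice<T> fromOverfetch(List<T> fetched, int limit) {
        boolean hasNext = fetched.size() > limit;
        // Drop the probe row so the client never sees more than `limit` items.
        List<T> items = hasNext ? List.copyOf(fetched.subList(0, limit)) : List.copyOf(fetched);
        return new Slice<>(items, hasNext);
    }
}
```

The next cursor is then built from the last item in items, not from the discarded probe row.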

Pros of Cursor-Based Pagination:

Performance on Large Datasets: Generally performs better than offset/limit for deep pagination, as the database can efficiently seek to the cursor's position using indexes.
Stable in Dynamic Datasets: Less susceptible to missed or duplicate items when data is frequently added or removed, as the cursor anchors to a specific item. If an item before the cursor is deleted, it doesn't affect subsequent items.
Suitable for Infinite Scrolling: The "next page" model fits naturally with infinite scroll UIs.

Cons of Cursor-Based Pagination:

No "Jump to Page": Users cannot directly navigate to an arbitrary page number (e.g., "page 5"). Navigation is strictly sequential (next/previous).
More Complex Implementation: Defining and managing cursors, especially with multiple sort columns or complex sort orders, can be more intricate.
Sorting Limitations: The sort order must be fixed and based on the columns used for the cursor. Changing sort order on the fly with cursors is complex.

Choosing the Right Strategy

The choice between offset/limit and cursor-based pagination depends on your specific requirements:

Offset/Limit is often sufficient if:

The dataset is relatively small or doesn't change frequently.
The ability to jump to arbitrary pages is a critical feature.
Implementation simplicity is a high priority.
Performance for very deep pages is not a major concern.

Cursor-Based is generally preferred if:

You're dealing with very large, frequently changing datasets.
Performance at scale and data consistency during pagination are paramount.
Sequential navigation (like infinite scrolling) is the primary use case.
You don't need to display total page counts or allow jumping to specific pages.

Some systems use a hybrid approach, or offer different strategies for different use cases or endpoints.

Best Practices for Implementing Pagination

Regardless of the chosen strategy, adhere to these best practices:

Consistent Parameter Naming: Use clear and consistent names for your pagination parameters (e.g., limit, offset, page, pageSize, after_cursor, before_cursor). Stick to one convention (e.g., camelCase or snake_case) throughout your API.
Provide Navigation Links (HATEOAS): As shown in the response examples, include links for self, next, prev, first, and last (where applicable). This makes the API more discoverable and decouples the client from URL construction logic.
Default Values and Max Limits:

Always set sensible default values for limit (e.g., 10 or 25).
Enforce a maximum limit to prevent clients from requesting too much data and overwhelming the server (e.g., max 100 records per page). Return an error or cap the limit if an invalid value is requested.

Clear API Documentation: Document your pagination strategy thoroughly:

Explain the parameters used.
Provide example requests and responses.
Clarify default and maximum limits.
Explain how cursors are used (if applicable), without revealing their internal structure if they are meant to be opaque.

Consistent Sorting: Ensure that the underlying data is sorted consistently for every paginated request. For offset/limit, this is vital to avoid data skew. For cursor-based, the sort order dictates how cursors are constructed and interpreted. Use a unique tie-breaker column (like a primary ID) if the primary sort column can have duplicate values.
Handle Edge Cases:

Empty Results: Return an empty data array and appropriate pagination metadata (e.g., totalItems: 0 or hasNextPage: false).
Invalid Parameters: Return a 400 Bad Request error if clients provide invalid pagination parameters (e.g., negative limit, non-integer page number).
Cursor Not Found (for cursor-based): If a provided cursor is invalid or points to a deleted item, decide on a behavior: return a 404 Not Found or 400 Bad Request, or gracefully fall back to the first page.

Total Count Considerations:

For offset/limit, providing totalItems and totalPages is common. Be mindful that COUNT(*) can be slow on very large tables. Explore database-specific optimizations or estimates if this becomes a bottleneck.
For cursor-based, totalItems is often omitted for performance. If needed, consider providing an estimated count or a separate endpoint that calculates it (potentially asynchronously).

Error Handling: Return appropriate HTTP status codes for errors (e.g., 400 for bad input, 500 for server errors during data fetching).
Security: While not directly a pagination mechanism, ensure that the data being paginated respects authorization rules. A user should only be able to paginate through data they are permitted to see.
Caching: Paginated GET responses can often be cached, because the URL fully identifies the requested page. With offset-based pagination, the content behind GET /items?page=2&pageSize=10 changes whenever rows are inserted or deleted ahead of that page, so prefer short TTLs or explicit invalidation on frequently changing data. Cursor-based URLs such as GET /items?limit=10&after_cursor=XYZ tend to stay valid longer, because the cursor anchors to a specific item. Ensure your caching strategy works well with how pagination links are generated and consumed.
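
For the invalid-parameter edge case listed above, one option is to parse the raw query string value yourself and raise a dedicated exception that the web layer maps to 400 Bad Request. This is a sketch under that assumption; InvalidPaginationException and PaginationValidator are hypothetical names, and the mapping to an HTTP status is left to your framework.

```java
// Hypothetical exception that an error handler would translate to 400 Bad Request.
final class InvalidPaginationException extends RuntimeException {
    InvalidPaginationException(String message) {
        super(message);
    }
}

final class PaginationValidator {
    private PaginationValidator() {}

    // Parses the raw `limit` value: missing -> default, too large -> capped, invalid -> rejected.
    static int parseLimit(String raw, int defaultLimit, int maxLimit) {
        if (raw == null || raw.isBlank()) {
            return defaultLimit;
        }
        final int value;
        try {
            value = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new InvalidPaginationException("limit must be an integer");
        }
        if (value < 1) {
            throw new InvalidPaginationException("limit must be at least 1");
        }
        return Math.min(value, maxLimit);
    }
}
```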

Advanced Topics (Brief Mentions)

Infinite Scrolling: Cursor-based pagination is a natural fit for infinite scrolling UIs. The client fetches the first page, and as the user scrolls near the bottom, it uses the nextCursor to fetch the subsequent set of items.
Pagination with Complex Filtering and Sorting: When combining pagination with dynamic filtering and sorting parameters, ensure that:
For offset/limit: The totalItems count accurately reflects the filtered dataset.
For cursor-based: The cursor encodes the state of both sorting and filtering if those affect what "next" means. This can significantly complicate cursor design. Often, if filters or sort order change, pagination is reset to the "first page" of the new view.
GraphQL Pagination: GraphQL has its own standardized way of handling pagination, often referred to as "Connections." It typically uses cursor-based pagination and has a defined structure for returning edges (items with cursors) and page info. If you're using GraphQL, adhere to its conventions (e.g., Relay Cursor Connections specification).
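
Returning to the infinite scrolling case above, the client-side loop can be sketched as follows. CursorPage and CursorWalker are hypothetical names, and fetchPage stands in for whatever HTTP call your client makes; it receives the cursor (null for the first page) and returns that page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical client-side view of one cursor-paginated response.
record CursorPage<T>(List<T> data, String nextCursor) {}

final class CursorWalker {
    private CursorWalker() {}

    // Follows nextCursor until it is null or maxPages pages have been fetched.
    static <T> List<T> fetchAll(Function<String, CursorPage<T>> fetchPage, int maxPages) {
        List<T> all = new ArrayList<>();
        String cursor = null; // null requests the first page
        for (int i = 0; i < maxPages; i++) {
            CursorPage<T> page = fetchPage.apply(cursor);
            all.addAll(page.data());
            cursor = page.nextCursor();
            if (cursor == null) {
                break; // the server reported no further pages
            }
        }
        return all;
    }
}
```

The maxPages guard is a design choice that keeps a misbehaving server from causing an endless loop. A real infinite-scroll UI would fetch one page per scroll event instead of draining the whole collection.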

Conclusion

Implementing pagination correctly is fundamental to building scalable and user-friendly REST APIs. While offset/limit pagination is simpler to start with, cursor-based pagination offers superior performance and consistency for large, dynamic datasets. By understanding the technical details of each strategy, choosing the one that best fits your application's needs, and following best practices for implementation and API design, you can ensure that your API efficiently delivers data to your clients, no matter the scale. Remember to always prioritize clear documentation and robust error handling to provide a smooth experience for API consumers.