Skip to main content

A Guide to Calculating Tokens & Costs for ChatGPT APIs

In this section, you will learn how to calculate the tokens and cost for ChatGPT APIs.

ChatGPT APIs support the use of Server-Sent Events (SSE) for responses, which involves returning the response to a question character by character. SSE is a real-time communication technology based on the HTTP protocol, commonly used in debugging scenarios for large language models (LLM) APIs.

Developers often encounter scenarios where they need to concatenate SSE events and calculate the token count and estimated cost. In this article, we will use the example of debugging a specific AI application to demonstrate how to automatically convert the number of characters in the input and output into token values, and estimate the approximate cost in real-time during a single API debugging session.

Prerequisites

Before calculating the cost of question-answering during the debugging process, it is necessary to understand the billing conditions of the large model providers. Accroding to OpenAI's pricing page, you need to first record the number of tokens used for each question-answering process and convert it to currency.

In this article, we are using JPY(Japanese Yen) as example.

GPT 价格

Therefore, in order to calculate the cost of question-answering during API debugging, the following two steps are involved:

1. Calculate the number of input and output tokens.

2. Convert the tokens values to Japanese Yen using the real-time exchange rate.

Tokens Count Conversion Library

To accurately convert the content to token values, a third-party token conversion library is needed. The following example uses the OpenAI GPT Token Counter library to convert input/output data to token counts during the API debugging process.

Node.js sample code:

const openaiTokenCounter = require('openai-gpt-token-counter');

const text = process.argv[2]; // Get the test content from command line arguments
const model = "gpt-4"; // Replace with the OpenAI model you want to use

const tokenCount = openaiTokenCounter.text(text, model);
const characterCount = text.length; // Calculate the number of characters

console.log(`${tokenCount}`);

Rename the Node.js script to gpt-tokens-counter.js and place it in the external program directory of Apidog for calling, as described in the Calling Other Programming Languages section.

Install the Tokens Count Conversion Library Package:

Execute the following command in the directory where the script is located to initialize the script's running environment.

npm install openai-gpt-token-counter

Real-Time Exchange Rate API

After obtaining the tokens values for the input and output, it is necessary to estimate the cost in JPY by using a real-time exchange rate API. This article will call the Currencylayer API to get the real-time exchange rate. Sign up for an account and obtain an API Key.

Input Cost

Converting Input Values to Tokens

The input values can be understood as the questions and prompts filled in by the user when querying the AI application. Therefore, a custom script needs to be added in the Pre-Processors to extract the query parameter from the request body and convert it to tokens values.

Here is an example code for adding the token value conversion script in the Pre-Processors section:

try {
var jsonData = JSON.parse(pm.request.body.raw);
var content = jsonData.messages[0].content; // obtains the content of messages
var result_input_tokens_js = pm.execute('./gpt-tokens/gpt-tokens-counter.js',[content])
console.log(content);
pm.environment.set("RESULT_INPUT_TOKENS", result_input_tokens_js);
console.log("Input Tokens count: " + pm.environment.get("RESULT_INPUT_TOKENS"));
} catch (e) {
console.log(e);
}

After clicking the "Send" button, the calculated input values can be seen in the console.

Conversion to Actual Cost(JPY)

After obtaining the value of Tokens consumed from the input, it is necessary to request a real-time exchange rate API to obtain a conversion factor. This factor is then multiplied by the Tokens value to calculate the actual cost in JPY. Add the following script to the pre-operation:

pm.sendRequest("http://apilayer.net/api/live?access_key=YOUR-API-KEY&currencies=JPY&source=USD&format=1", (err, res) => {
if (err) {
console.log(err);
} else {
const quotes = res.json().quotes;
const rate = parseFloat(quotes.USDJPY).toFixed(3);
pm.environment.set("USDJPY_RATE", rate);
var USDJPY_RATE = pm.environment.get("USDJPY_RATE");
// Retrieve the RESULT_INPUT_TOKENS variable from the previous script
var RESULT_INPUT_TOKENS = pm.environment.get("RESULT_INPUT_TOKENS");

// Calculate the tokens exchange rate value
const tokensExchangeRate = 0.03; // Price of 1000 tokens in USD (with GPT-4-8k context input pricing as reference)

// Calculate the estimated price in JPY
const JPYPrice = ((RESULT_INPUT_TOKENS / 1000) * tokensExchangeRate * USDJPY_RATE).toFixed(2);

pm.environment.set("INPUT_PRICE", JPYPrice);

console.log("Estimated cost: " + "¥" + JPYPrice);
}
});

Output Cost

Concatenating the Response

When the Content-Type parameter in the response returned by the API contains text/event-stream, Apidog automatically parses the returned data as SSE events. Usually, each return in the SSE event only contains fragment characters (usually 1 character). In this case, it is necessary to concatenate all the returned content into a complete sentence.

Go to the Post-Processors in the API definition and add a custom script to extract the response content and complete the concatenation.

Sample code for concatenating responses:

// Get the response text
const text = pm.response.text()
// Split the text into lines
var lines = text.split('\n');
// Create an empty array to store the "content" parameter
var contents = [];
// Iterate through each line
for (var i = 0; i < lines.length; i++) {
const line = lines[i];
// Skip lines that do not start with "data:"
if (!line.startsWith('data:')) {
continue;
}
// Try to parse the JSON data
try {
var data = JSON.parse(line.substring(5).trim()); // Remove the leading "data: "
// Get the "content" parameter from the "choices" array and add it to the array
contents.push(data.choices[0].delta.content);
} catch (e) {
// Ignore the current line if it is not valid JSON data
}
}
// Join the "content" parameters using the join() method
var result = contents.join('');
// Display the result in the "Visualize" tab of the body
pm.visualizer.set(result);
// Print the result to the console
console.log(result);

After making the request, you can retrieve the complete response content in the console.

Convert Tokens Output Value

After obtaining the complete response content, it is necessary to convert it into Tokens value using a third-party library. Add the following custom script in the post-processing operation, so that Apidog can call the external gpt-tokens-counter.js script (please refer to Preparation: Tokens Value Conversion Library for the specific code) to obtain the Tokens value.

// Get the response text
const text = pm.response.text()
// Split the text into lines
var lines = text.split('\n');
// Create an empty array to store the "content" parameter
var contents = [];
// Iterate through each line
for (var i = 0; i < lines.length; i++) {
const line = lines[i];
// Skip lines that do not start with "data:"
if (!line.startsWith('data:')) {
continue;
}
// Try to parse the JSON data
try {
var data = JSON.parse(line.substring(5).trim()); // Remove the leading "data: "
// Get the "content" parameter from the "choices" array and add it to the array
contents.push(data.choices[0].delta.content);
} catch (e) {
// Ignore the current line if it is not valid JSON data
}
}
// Join the "content" parameters using the join() method
var result = contents.join('');
// Display the result in the "Visualize" tab of the body
pm.visualizer.set(result);
// Print the result to the console
console.log(result);

// Calculate the number of output tokens.
var RESULT_OUTPUT_TOKENS = pm.execute('./gpt-tokens/gpt-tokens-counter.js', [result])
pm.environment.set("RESULT_OUTPUT_TOKENS", RESULT_OUTPUT_TOKENS);

console.log("Output Tokens count: " + pm.environment.get("RESULT_OUTPUT_TOKENS"));

Actual Cost (JPY)

Similar to the cost calculation scheme mentioned in the previous section, the actual cost (JPY) is obtained by multiplying the Tokens value with the exchange rate.

Add the following script in the post-processing operation:

pm.sendRequest("http://apilayer.net/api/live?access_key=YOUR-API-KEY&currencies=JPY&source=USD&format=1", (err, res) => {
if (err) {
console.log(err);
} else {
const quotes = res.json().quotes;
const rate = parseFloat(quotes.USDJPY).toFixed(3);
pm.environment.set("USDJPY_RATE", rate);
var USDJPY_RATE = pm.environment.get("USDJPY_RATE");
// Get the RESULT_OUTPUT_TOKENS variable from the previous postman script
var RESULT_OUTPUT_TOKENS = pm.environment.get("RESULT_OUTPUT_TOKENS");

// Calculate tokens exchange rate
const tokensExchangeRate = 0.06; // USD price per 1000 tokens (based on GPT-4-8k context input pricing)

// Calculate estimated price in JPY
const JPYPrice = ((RESULT_OUTPUT_TOKENS / 1000) * tokensExchangeRate * USDJPY_RATE).toFixed(2);

pm.environment.set("OUTPUT_PRICE", JPYPrice);

console.log("Output cost (JPY): " + JPYPrice + "元");
}
});

Estimated Total Cost

Finally, add a custom script in the post-processing phase that can automatically calculate the total cost of inputs and outputs.

// Summing up input and output costs

const INPUTPrice = Number(pm.environment.get("INPUT_PRICE"));
// Get the input price variable and convert it to a number

const OUTPUTPrice = Number(pm.environment.get("OUTPUT_PRICE"));
// Get the output price variable and convert it to a number

console.log("Total cost: " + "¥" + (INPUTPrice + OUTPUTPrice));
// Print the total cost: the sum of the input price and output price.

Allowing to estimate the approximate cost of the current request during the process of debugging the API.