Google has recently upgraded Bard to run on its brand-new Gemini model. Potentially one of the most capable and general AI models that Google has built to date, Gemini 1.0 comes in three sizes: Gemini Ultra, Gemini Pro, and Gemini Nano.
What is Google's Gemini?
Gemini is Google's newest general-purpose AI model, built from the ground up for multimodality. It can process various types of information, including text, code, audio, images, and video.
Google has released three different Gemini models: Gemini Ultra, Gemini Pro, and Gemini Nano. The Gemini 1.0 report describes each as follows:
- Gemini Ultra: The most capable Gemini model that delivers state-of-the-art performance across a wide range of highly complex tasks, including reasoning and multimodal tasks. It is efficiently serveable at scale on TPU accelerators due to the Gemini architecture.
- Gemini Pro: A performance-optimized model in terms of cost as well as latency that delivers significant performance across a wide range of tasks. This model exhibits strong reasoning performance and broad multimodal capabilities.
- Gemini Nano: Google's most efficient model, designed to run on-device. Google has trained two versions of Nano, with 1.8B (Nano-1) and 3.25B (Nano-2) parameters, targeting low and high-memory devices respectively. Nano is trained by distilling from larger Gemini models and is 4-bit quantized for deployment, providing best-in-class performance.
According to Google, Gemini Ultra is also the first model to outperform human experts on Massive Multitask Language Understanding (MMLU).
Gemini Model Benchmarks
At the time of their release, the Gemini models' biggest competitor was OpenAI's GPT-4.
Across the 4 areas compared in Google's reported benchmarks, Gemini Ultra, the most powerful Gemini model, beats GPT-4 in almost every aspect; the exception is commonsense reasoning for everyday tasks. Gemini Ultra excels in accurate Python code generation, mathematical problem-solving, and general MMLU.
Google has also made a more comprehensive benchmark report with Gemini Pro and other relevant AI models that are available for use:
To provide more context on the Gemini models' abilities, Google has also run normalized internal tests, using Gemini Pro as the baseline.
Gemini AI Functionalities
Gemini is designed to be natively multimodal, which helps it understand and reason over all kinds of inputs. As a result, it is well suited to helping users with:
Sophisticated Reasoning
Gemini's multimodal reasoning capabilities allow the model to digest and process complex information. Gemini is therefore skilled at inferring hidden meanings from large amounts of data.
You can also use Gemini to extract specific information from thousands of documents - it can filter and understand information based on the criteria inputted, and provide you with the information you seek.
Advanced Coding
Gemini is capable of understanding, explaining, and generating code. It can generate code for widely used programming languages such as Python, Java, C++, and Go.
This means that if you have code you do not understand, you can run it through Gemini for a breakdown of what the code does. On top of that, Gemini can help provide code for functionalities that you are struggling with.
Idea Generation From Various Inputs
Gemini can respond to various types of input, including PDF files, pictures, text, and videos. With accurate and detailed identification skills, Gemini can become a source of inspiration, or a brainstorming tool if you prefer to call it that.
About Gemini API
The release of Gemini comes with the Google Gemini API, allowing developers to build a vast variety of AI-based applications. With Gemini, you are no longer bound to just text - you can also input images to generate input-relevant outputs.
Availability of Gemini API
Currently, there is a list of available languages and regions where Gemini API can operate. Check out these links below to find out if you are eligible to use Gemini API!
Available languages for Gemini API
Available regions for Gemini API
Gemini API Pricing
Google provides a free version of Gemini Pro. Although it encourages users to create apps with the Gemini Pro API, users should be aware that the prompts and responses sent through the free version of the API are recorded and used in the research and development of Gemini Pro. In other words, Google keeps a record of whatever the API receives and returns, so do not send private or sensitive data through the free version.
As Gemini is relatively new, the complete pricing for Gemini API has not been fully published. However, quoted prices for input and output are available: input costs $0.000125 per 1,000 characters and $0.0025 per image, while output costs $0.000375 per 1,000 characters. The free version is limited to 60 queries per minute, and the paid version of Gemini API will support more than that.
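To make the per-character prices concrete, here is a minimal sketch of a cost estimate for a text-only request. The two constants are the prices quoted in this article and may be out of date; the function name `estimate_cost` is just an illustrative helper, not part of any Google library.

```python
# Prices quoted in this article (USD per 1,000 characters).
# These are assumptions: check Google's pricing page for current values.
INPUT_PRICE_PER_1K_CHARS = 0.000125
OUTPUT_PRICE_PER_1K_CHARS = 0.000375


def estimate_cost(input_chars: int, output_chars: int) -> float:
    """Estimate the cost in USD of one text-only request."""
    input_cost = input_chars / 1000 * INPUT_PRICE_PER_1K_CHARS
    output_cost = output_chars / 1000 * OUTPUT_PRICE_PER_1K_CHARS
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 2,000-character prompt and a 4,000-character response.
    print(f"Estimated cost: ${estimate_cost(2000, 4000):.6f}")
```

At these rates, even long text prompts cost a fraction of a cent, so the rate limit is more likely to matter than the price for small projects.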
Alternative Manual Step-by-step Guide on How to Use Gemini API
The Gemini AI model is one of the most powerful AI models accessible for free. If you are interested in creating applications with it, continue reading this section below.
Step 1: Obtain Gemini API Key
To gain access to the Gemini API, we first have to get the Gemini API key from Google AI for Developers.
You will then enter the Google AI Studio dashboard, where you will be prompted to either start a new prompt or get an API key.
Locate the button above to create an API key.
Google AI Studio lets you choose whether to create the key in a project that already exists on Google Cloud (perhaps a team project) or in a brand-new project.
Once you have chosen an option, Google AI Studio will generate an API key for you!
Step 2: Copy the cURL Code
Firstly, go to the Google AI for Developers website, and copy the cURL code as highlighted in the image above. Do not include the last portion, 2> /dev/null, as it is not supposed to be part of the cURL code.
Next, open Apidog, and select the purple + button around the top left corner of the Apidog window. You should be able to see Import cURL. Alternatively, you can use the Ctrl + I shortcut.
Paste the cURL code into the window, and press the OK button.
Apidog allows users to import existing cURL code into new requests! Furthermore, you can modify these cURL code requests according to how you want them to function.
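If you would rather see what the imported request contains, the sketch below builds an equivalent POST request in Python using only the standard library. The endpoint URL, the gemini-pro model name, and the body layout are assumptions based on Google's quickstart at the time of writing, so compare them with the cURL code you copied. `build_gemini_request` is a hypothetical helper name.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for the Gemini Pro text model.
# Confirm it against the cURL code shown on the Google AI for Developers website.
GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "gemini-pro:generateContent"
)


def build_gemini_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a text prompt."""
    # The API key travels as the "key" query parameter.
    query = urllib.parse.urlencode({"key": api_key})
    # The prompt goes in the "text" element of the JSON body.
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{GEMINI_URL}?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. Building it separately makes it easy to inspect the URL and body before spending any quota.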
Alternative Manual Steps if cURL Code Does Not Work
Get Apidog to Create an API with Gemini API Key
This article will show how to use the Gemini API key with Apidog, a design-first API development tool.
First, create a new project on Apidog. You can name it Gemini API, or anything that you wish!
Then, press the New API button.
Now, go to the Google AI for Developers website, and copy the URL as highlighted in the image above.
Return to Apidog, and paste the Gemini API URL you copied in the highlighted zone shown in the picture. As this example is a POST request, make sure to change the method from GET to POST.
Notice that the query parameter at the end of the URL is removed. Don't worry - in Apidog, any query parameters will be automatically extracted and filled in Request Params, found under the Params section, as shown in the picture.
You will have to replace the value of this parameter with the Gemini API key generated earlier. Paste it in the highlighted section shown in the image above. Click save afterwards to save your progress.
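The query-parameter extraction described above can be illustrated with a few lines of Python. This is only a sketch of the idea, not Apidog's implementation; `split_query_params` is a hypothetical helper, and the URL in the example is the assumed Gemini Pro endpoint with a placeholder key.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def split_query_params(url: str) -> tuple[str, dict[str, str]]:
    """Split a URL into its base (no query string) and its query parameters."""
    parts = urlsplit(url)
    # Rebuild the URL without the query string or fragment.
    base_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    # parse_qs returns a list per name; keep the first value of each.
    params = {name: values[0] for name, values in parse_qs(parts.query).items()}
    return base_url, params


if __name__ == "__main__":
    example = (
        "https://generativelanguage.googleapis.com/v1beta/models/"
        "gemini-pro:generateContent?key=YOUR_API_KEY"
    )
    base, params = split_query_params(example)
    print(base)
    print(params)
```

The `key` entry in the resulting dictionary is the value you replace with your own API key.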
Return to the Google AI for Developers website to copy the body of the POST request. Copy the highlighted portion of the body.
Move back to Apidog, and under the Request section, select the Body header, and select json. Then, paste the POST request body in the Example section.
If you want to change the prompt that will be sent to Gemini API, you can edit the string found within the quotation marks of the "text" element.
Lastly, hit the Send button above to make a request. You should then receive a response from Gemini API!
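Once a response comes back, you will usually want just the generated text. The sketch below assumes the response JSON nests the text under candidates, content, and parts, which matches Google's documented response format at the time of writing; verify it against the response you actually receive. `extract_text` is a hypothetical helper, and the sample response is made up for illustration.

```python
def extract_text(response: dict) -> str:
    """Return the generated text from a parsed Gemini API response.

    Returns an empty string if the response has no candidates,
    for example when the prompt was blocked.
    """
    candidates = response.get("candidates", [])
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    # A candidate may contain several parts; join their text in order.
    return "".join(part.get("text", "") for part in parts)


if __name__ == "__main__":
    # Illustrative response only, not real API output.
    sample_response = {
        "candidates": [
            {"content": {"parts": [{"text": "Once upon a time..."}], "role": "model"}}
        ]
    }
    print(extract_text(sample_response))
```

Using `.get` with defaults keeps the helper from raising an exception when a field is missing, which is convenient while you are still exploring the API.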
Conclusions
The Google Gemini API is an extremely powerful tool that many developers today can utilize for small programs, applications, and businesses. With the ability to process both text and image input, Gemini API can provide users with insightful responses that involve intelligent, contextual inferences.
Apidog, aside from building APIs, can also provide a simple and intuitive environment for testing, mocking, and documenting APIs. With a lot of automated processes to help increase a developer's efficiency, consider Apidog to be your next API platform!