How to Run EXAONE Deep Locally Using Ollama

Discover how to run EXAONE Deep, LG’s inference AI model, locally with Ollama. This technical guide covers installation, setup, and API testing with Apidog for developers and researchers.

Ashley Innocent

Updated on March 20, 2025

Running advanced AI models locally has become a practical solution for developers and researchers who need speed, privacy, and control. EXAONE Deep, an innovative inference AI model from LG AI Research, excels at solving complex problems in math, science, and coding. By using Ollama, a platform designed to deploy large language models on local hardware, you can set up EXAONE Deep on your own machine with ease.

💡
Boost Your Workflow with Apidog
Working with AI models like EXAONE Deep often involves API integration. Apidog is a free, powerful tool that makes API testing and debugging a breeze. Download Apidog today to streamline your development and ensure smooth communication with your local AI setup.

Let’s dive into the process.

What Are EXAONE Deep and Ollama?

Before we proceed, let’s clarify what EXAONE Deep and Ollama are and why they matter.

EXAONE Deep is a cutting-edge AI model developed by LG AI Research. Unlike typical language models, it’s an inference AI, meaning it focuses on reasoning and problem-solving. It autonomously generates hypotheses, verifies them, and provides answers to complex questions in fields like mathematics, science, and programming. This makes it a valuable asset for anyone tackling technical challenges.

Meanwhile, Ollama is an open-source platform that lets you run large language models, including EXAONE Deep, on your local machine. It packages each model's weights, configuration, and dependencies in a self-contained, Docker-inspired format, which simplifies deployment. By running EXAONE Deep locally with Ollama, you gain several advantages:

  • Privacy: Your data stays on your device, avoiding cloud exposure.
  • Speed: Local processing cuts down latency from network calls.
  • Flexibility: You control the setup and can tweak it as needed.

Prerequisites for Running EXAONE Deep Locally

To run EXAONE Deep locally, your system must meet certain hardware and software standards. Since this is a resource-heavy AI model, having the right setup is critical. Here’s what you need:

Hardware Requirements

  • RAM: At least 16GB. More is better for smoother performance.
  • GPU: A dedicated NVIDIA GPU with at least 8GB of VRAM. This ensures the model runs efficiently, as EXAONE Deep relies on GPU acceleration for inference tasks.
  • Storage: Enough free space (20-50GB) to store the model and its dependencies.

Software Requirements

  • Operating System: Linux or macOS. Ollama also ships a native Windows installer, but this guide focuses on Linux/macOS.
  • Internet: A stable connection to download Ollama and the EXAONE Deep model.

With these in place, you’re ready to install Ollama and get EXAONE Deep running. Let’s transition to the installation process.

Installing Ollama on Your System

Ollama is your gateway to running EXAONE Deep locally, and its installation is straightforward. Follow these steps to set it up:

Download Ollama:

  • Visit the Ollama website and download the installer for your OS. Alternatively, on Linux or macOS, use this terminal command:
curl -fsSL https://ollama.ai/install.sh | sh

This script automates the download and setup.

Check the Installation:

  • After installing, verify that Ollama works by running:
ollama --version
  • You should see the version number (e.g., 0.1.x). If not, double-check your installation or consult the Ollama GitHub for help.

Once Ollama is installed, you’re set to download and run EXAONE Deep. Let’s move to that next.

Setting Up and Running EXAONE Deep with Ollama

Now that Ollama is ready, let’s get EXAONE Deep up and running. This involves downloading the model and launching it locally.

Step 1: Download the EXAONE Deep Model

Ollama hosts EXAONE Deep in its model library. To pull it to your machine, run:

ollama pull exaone-deep

This command fetches the model files. Depending on your internet speed and the model’s size (which can be several gigabytes), this might take a few minutes. Watch the terminal for progress updates.
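Once the pull completes, you can confirm the model is available locally. The ollama list command shows every model on your machine along with its size and tag:

ollama list

If exaone-deep appears in the output, you're ready to launch it. (Some library models ship in multiple sizes; check the model's page in the Ollama library if you want a specific tag.)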

Step 2: Launch the Model

Once downloaded, start EXAONE Deep with:

ollama run exaone-deep

This command fires up the model, and Ollama spins up a local server. You’ll see a prompt where you can type questions or commands. For example:

> Solve 2x + 3 = 7

The model processes it and returns the answer (e.g., x = 2).
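You can also pass a prompt directly as a command-line argument instead of using the interactive session, which is handy for scripting:

ollama run exaone-deep "Solve 2x + 3 = 7"

Ollama prints the model's answer to standard output and exits, so you can pipe the result into other tools.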

Step 3: Customize Settings (Optional)

Ollama lets you tweak how EXAONE Deep runs. For instance:

  • GPU Layers: Control how many model layers are offloaded to your GPU with the num_gpu runtime option.
  • Memory Limits: Adjust context size and memory usage if needed.

Check the Ollama docs for specifics, as these options depend on your hardware. A sketch of passing these options through the API follows below.
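As a minimal sketch of that approach (Ollama's REST API is covered in more detail in the next section), you can pass runtime options in the request body. The option names follow Ollama's documented options map; the values here are placeholders to tune for your hardware:

curl http://localhost:11434/api/generate -d '{
  "model": "exaone-deep",
  "prompt": "Solve 2x + 3 = 7",
  "stream": false,
  "options": {
    "num_gpu": 32,
    "num_ctx": 4096
  }
}'

Here num_gpu controls how many layers are offloaded to the GPU and num_ctx sets the context window size.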

At this point, EXAONE Deep is operational. However, typing prompts in the terminal isn’t the only way to use it. Next, we’ll explore how to interact with it programmatically using its API—and how Apidog fits in.

Using Apidog to Interact with EXAONE Deep

For developers building applications, accessing EXAONE Deep via its API is more practical than the command line. Fortunately, Ollama provides a RESTful API when you run the model. Here’s where Apidog, an API testing tool, becomes invaluable.

Understanding the Ollama API

When you launch EXAONE Deep with ollama run exaone-deep, it opens a local server, typically at http://localhost:11434. This server exposes endpoints like:

  • /api/generate: For sending prompts and getting responses.
  • /api/tags: To list available models.
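A quick way to confirm the server is up before involving any tooling is to query the tags endpoint from a terminal:

curl http://localhost:11434/api/tags

This returns a JSON list of the models installed locally; exaone-deep should appear in it if the earlier pull succeeded.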

Setting Up Apidog

Follow these steps to use Apidog with EXAONE Deep:

Install Apidog:

  • Download and install Apidog from its website. It's free and works on all major operating systems.

Create a New Request:

  • Open Apidog and click “New Request.”
  • Set the method to POST and the URL to http://localhost:11434/api/generate.

Configure the Request:

  • In the request body, add JSON like this:
{
  "model": "exaone-deep",
  "prompt": "What is the square root of 16?",
  "stream": false
}
  • This tells EXAONE Deep to process your prompt.

Send and Test:

  • Hit “Send” in Apidog. You’ll see the response (e.g., {"response": "4"}) in the tool’s interface.
  • Use Apidog to tweak the request, test edge cases, or automate repetitive calls.
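For reference, a successful non-streaming call to /api/generate returns a JSON object along these lines (abridged; the exact fields and values depend on your Ollama version and the model's answer):

{
  "model": "exaone-deep",
  "created_at": "2025-03-20T12:00:00Z",
  "response": "The square root of 16 is 4.",
  "done": true
}

The answer lives in the response field. If you set "stream": true instead, Ollama sends a sequence of such objects, each carrying a fragment of the text.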

Why Use Apidog?

Apidog simplifies API work by:

  • Visualizing Responses: See exactly what EXAONE Deep returns.
  • Saving Time: Store and reuse requests instead of retyping them.
  • Debugging: Spot errors in your API calls quickly.

With Apidog, integrating EXAONE Deep into your projects becomes seamless. But what if you hit a snag? Let’s cover troubleshooting next.

Troubleshooting Common Issues

Running a model like EXAONE Deep locally can sometimes trip you up. Here are common problems and fixes:

Problem: GPU Memory Error

  • Symptom: The model crashes with a “CUDA out of memory” message.
  • Fix: Reduce the number of GPU-offloaded layers (the num_gpu option) or the context size. Run nvidia-smi to check VRAM usage, then adjust the runtime options shown earlier.

Problem: Model Won’t Start

  • Symptom: ollama run exaone-deep fails with an error.
  • Fix: Make sure the Ollama server is running (start it with ollama serve if it isn't running as a background service). Check the server logs for clues (for example, journalctl -u ollama on Linux or ~/.ollama/logs/server.log on macOS) and verify the model downloaded fully with ollama list.

Problem: API Doesn’t Respond

  • Symptom: Apidog requests time out or return errors.
  • Fix: Confirm the server runs (curl http://localhost:11434) and the endpoint matches Ollama’s docs.
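One subtle cause: if the server is bound to a non-default address or port via the OLLAMA_HOST environment variable, your request URL must match it. For example (a bash-style sketch):

export OLLAMA_HOST=0.0.0.0:11434
ollama serve

Point Apidog at whatever host and port the server actually reports on startup.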

Optimization Tip

For better performance, upgrade your GPU or add RAM. EXAONE Deep thrives on strong hardware.

With these solutions, you’ll keep your setup humming. Let’s wrap up.

Conclusion

Running EXAONE Deep locally using Ollama unlocks a world of AI-powered reasoning without cloud dependency. This guide has shown you how to install Ollama, set up EXAONE Deep, and use Apidog to interact with its API. From solving math problems to coding assistance, this setup empowers you to tackle tough tasks efficiently.

Ready to explore? Fire up Ollama, download EXAONE Deep, and grab Apidog to streamline your workflow. The power of local AI is at your fingertips.
