How to Use Ollama App on Windows and Mac

Ollama now runs natively on both macOS and Windows, making it easier than ever to run local AI models. In this guide, you'll learn how to set up Ollama, use its new GUI app, chat with files, send images to models, and even integrate it with your development workflow using tools like Apidog.

Emmanuel Mumba

31 July 2025

Running large language models (LLMs) locally used to be the domain of hardcore CLI users and system tinkerers. But that’s changing fast. Ollama, known for its simple command-line interface for running open-source LLMs on local machines, just released native desktop apps for macOS and Windows.

And they’re not just basic wrappers. These apps bring powerful features that make chatting with models, analyzing documents, writing documentation, and even working with images drastically easier for developers.

In this article, we’ll explore how the new desktop experience improves the developer workflow, what features stand out, and where these tools actually shine in daily coding life.

💡
If you're building or testing APIs while working with local LLMs like Ollama, Apidog is a powerful tool to have in your workflow. It lets you run, test, and debug LLM APIs locally even without an internet connection, making it perfect for developers working with self-hosted models.

Why Local LLMs Still Matter

While cloud-based tools like ChatGPT, Claude, and Gemini dominate headlines, there's a growing movement toward local-first AI development. Developers want tools that are private, fast, offline-capable, and free of per-token billing.

Ollama taps directly into this trend, letting you run models like LLaMA, Mistral, Gemma, Codellama, Mixtral, and others natively on your machine - now with a much smoother experience.


Step 1: Download Ollama for Desktop

Go to ollama.com and download the latest version for your system (macOS or Windows).

Install it like a regular desktop app. No command-line setup is required to get started.
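The desktop installer also ships the ollama command-line tool. If you want to sanity-check the install from a terminal (entirely optional), something like this should work:

# Optional: confirm the bundled CLI is available after installing the app
ollama --version
ollama list    # shows any models you have already pulled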

Step 2: Launch and Pick a Model

Once installed, open the Ollama desktop app. The interface is clean and looks like a simple chat window.

You’ll be prompted to choose a model to download and run. Some options include LLaMA, Mistral, Gemma, Codellama, and Mixtral.

Choose one and the app will automatically download and load it.

A Smoother Onboarding for Developers - An Easier Way to Chat with Models

Previously, using Ollama meant firing up a terminal and issuing ollama run commands to start a model session. Now, the desktop app opens like any native application, offering a simple and clean chat interface.

You can now talk to models the same way you would in ChatGPT, but entirely offline. This is perfect for offline work, privacy-sensitive projects, and experimenting without per-token costs.

The app gives you immediate access to local models like codellama or mistral with no setup beyond a simple installation.

And for developers who love customization, the CLI still works behind the scenes, letting you toggle context length, system prompts, and model versions via the terminal if needed.
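One way to bake those preferences into a reusable model is a Modelfile. Here's a minimal sketch; the docbot name, the base model, and the parameter values are just placeholders:

# Sketch: define a custom model variant with a longer context and a system prompt
cat > Modelfile <<'EOF'
FROM codellama
PARAMETER num_ctx 8192
SYSTEM """You are a concise assistant for code review and documentation."""
EOF

ollama create docbot -f Modelfile
ollama run docbot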


Drag. Drop. Ask Questions.

Chat with Files

One of the most developer-friendly features in the new app is file ingestion. Just drag a file into the chat window, whether it's a .pdf, .md, or .txt, and the model will read its contents.

Need to understand a 60-page design doc? Want to extract TODOs from a messy README? Or summarize a client's product brief? Drop it in and ask natural language questions like "Summarize the key decisions in this document" or "List every open TODO in this file."

This feature can dramatically cut down time spent scanning documentation, reviewing specs, or onboarding into new projects.
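If you'd rather script this kind of question than use the GUI, a rough one-liner from the terminal works too. Here design_doc.md is a placeholder path, and very long files may not fit the model's context window:

# Sketch: ask a local model about a file's contents from the shell
ollama run mistral "Summarize the key decisions in this document: $(cat design_doc.md)"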


Go Beyond Text

Multimodal Support

Select models within Ollama (such as LLaVA-based ones) now support image input. That means you can upload an image, and the model will interpret and respond to it.

Typical use cases include describing screenshots, interpreting diagrams, or extracting text from images.

While this is still early-stage compared to tools like GPT-4 Vision, having multimodal support baked into a local-first app is a big step for developers building multi-input systems or testing AI interfaces.
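The same capability is also exposed over the local REST API: multimodal models accept base64-encoded images in the request body. A minimal sketch from a Unix-style shell, assuming the default port and a llava model you've already pulled (photo.png is a placeholder path):

# Sketch: send an image to a LLaVA-family model over the local API
IMG=$(base64 < photo.png | tr -d '\n')
curl http://localhost:11434/api/generate -d "{
  \"model\": \"llava\",
  \"prompt\": \"Describe what is in this screenshot.\",
  \"images\": [\"$IMG\"],
  \"stream\": false
}"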


Private, Local Docs — at Your Command

Documentation Writing

If you're maintaining a growing codebase, you know the pain of documentation drift. With Ollama, you can use local models to help generate or update documentation without ever pushing sensitive code to the cloud.

Just drag a file, say utils.py, into the app and ask something like: "Write a docstring for every function in this file."

This becomes even more powerful when paired with tools like Deepdocs that automate documentation workflows using AI. You can pre-load your project's README or schema files, then ask follow-up questions or generate change logs, migration notes, or update guides, all locally.


Performance Tuning Under the Hood

With this new release, Ollama also shipped performance improvements across the board.

These upgrades make the app flexible for everything from local agents to dev tools to personal research assistants.
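If you like tuning runtime behaviour yourself, Ollama also reads a handful of environment variables at startup. The names below are the commonly documented ones, but double-check them against the docs for your release:

# Sketch: runtime tuning via environment variables (verify names for your version)
export OLLAMA_KEEP_ALIVE=10m         # keep a loaded model in memory longer between requests
export OLLAMA_NUM_PARALLEL=2         # allow more concurrent requests to the same model
export OLLAMA_MAX_LOADED_MODELS=2    # allow more than one model resident at a time
ollama serve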


CLI and GUI: Best of Both Worlds

The best part? The new desktop app doesn’t replace the terminal — it complements it.

You can still:

ollama pull codellama
ollama run codellama

Or expose the model server:

OLLAMA_HOST=0.0.0.0 ollama serve

So if you're building a custom AI interface, agent, or plugin that relies on a local LLM, you can now build on top of Ollama’s API and use the GUI for direct interaction or testing.
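As a starting point, a minimal request against the local API looks roughly like this, assuming the default port 11434 and that you've already pulled codellama:

# Sketch: a minimal completion request against the local REST API
curl http://localhost:11434/api/generate -d '{
  "model": "codellama",
  "prompt": "Write a one-line summary of what a retry decorator does.",
  "stream": false
}'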

Test Ollama’s API Locally with Apidog


Want to integrate Ollama into your AI app or test its local API endpoints? You can spin up Ollama's REST API using:

ollama serve

Then, use Apidog to test, debug, and document your local LLM endpoints.


Why use Apidog with Ollama? It gives you a place to send requests to the local endpoints (the default base URL is http://localhost:11434), inspect responses, and turn working calls into shareable API documentation, all without needing an internet connection.

Developer Use Cases That Actually Work

Here’s where the new Ollama app shines in real developer workflows:

Use Case                   How Ollama Helps
Code Review Assistant      Run codellama locally for refactor feedback
Documentation Updates      Ask models to rewrite, summarize, or fix doc files
Local Dev Chatbot          Embed into your app as a context-aware assistant
Offline Research Tool      Load PDFs or whitepapers and ask key questions
Personal LLM Playground    Experiment with prompt engineering & fine-tuning

For teams worried about data privacy or model hallucinations, local-first LLM workflows offer an increasingly compelling alternative.


Final Thoughts

The desktop version of Ollama makes local LLMs feel less like a hacky science experiment and more like a polished developer tool.

With support for file interaction, multimodal inputs, document writing, and native performance, it’s a serious option for developers who care about speed, flexibility, and control.

No cloud API keys. No background tracking. No per-token billing. Just fast, local inference with the choice of whatever open model suits your needs.

If you’ve been curious about running LLMs on your machine, or if you're already using Ollama and want a smoother experience, now’s the time to try it again.
