Ready to unleash the power of Qwen 3 30B-A3B for some next-level agentic tasks? This beast of a model, when run locally with Ollama, is a game-changer for MCP (Model Context Protocol) and tool calling, letting you build smart agents that reason like pros. I got hooked exploring its capabilities on Reddit, where folks are raving about its speed and smarts for tasks like file operations and database queries. In this tutorial, I'll show you how to set up Qwen 3 locally, configure MCP for tool use, and create an agent that reads a PDF and answers questions about it, all with Qwen 3's reasoning magic. Whether you're coding or automating, let's make Qwen 3 your new bestie!
What is Qwen 3 and Why It Rocks for MCP?
Qwen 3 is Alibaba's latest open-weight large language model series, and the 30B-A3B (Mixture-of-Experts) variant is a standout for MCP and agentic tasks. With 30 billion total parameters but only 3 billion active per inference, it's fast and efficient, running well on a single RTX 3090 or 4090. Its MCP support lets it call tools (e.g., file systems, databases) via JSON-defined interfaces, while its hybrid thinking mode (reasoning emitted in <think>...</think> blocks) boosts reasoning for complex tasks like coding or multi-step logic. Reddit users on r/LocalLLaMA praise its tool-calling precision, with one test showing it aced writing a poem to a file by querying a directory first. Let's harness this power locally with Ollama!
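To make "JSON-defined interfaces" concrete, here's a minimal sketch of the kind of OpenAI-style function definition a tool-calling model consumes. The write_file tool below is hypothetical, purely for illustration:

```python
# Illustrative only: an OpenAI-style function definition, the kind of
# JSON interface that tool-calling models like Qwen 3 are given. The
# model never runs the tool itself; it emits a call matching this schema,
# and your code executes it.
write_file_tool = {
    "type": "function",
    "function": {
        "name": "write_file",  # hypothetical tool name
        "description": "Write text content to a file on disk.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Destination file path"},
                "content": {"type": "string", "description": "Text to write"},
            },
            "required": ["path", "content"],
        },
    },
}

print(write_file_tool["function"]["name"])
```

This is how "write a poem to a file" becomes possible: the model sees the schema, then emits a structured call with the path and content filled in.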
Setting Up Your Qwen 3 Environment
Before we dive into MCP and agentic fun, let's prep your system to run Qwen 3 30B-A3B with Ollama. This is beginner-friendly, I promise!
1. Check System Requirements:
- OS: macOS, Linux (Ubuntu 20.04+), or Windows (with WSL2).
- Hardware: 16GB+ RAM, 24GB+ VRAM GPU (e.g., RTX 3090/4090), 20GB+ storage. Smaller models (0.6B, 1.7B, 8B) need less: 4GB+ VRAM, 8GB+ RAM.
- Software:
  - Python 3.10+ (check with python3 --version)
  - Git (check with git --version)
  - Ollama
2. Install Ollama:
Visit the official website and download a version compatible with your operating system.

Alternatively run:
curl -fsSL https://ollama.com/install.sh | sh
Verify the version:
ollama --version
Expect something like 0.3.12 (May 2025). If it fails, ensure Ollama is in your PATH.
3. Pull a Qwen 3 Model:
For Qwen 3 30B (large, high-end PCs only):
ollama pull qwen3:30b
This downloads ~18GB—grab a snack! Warning: It’s resource-heavy and needs a beefy GPU.
For testing on modest hardware, try the smaller Qwen 3 models, which are still highly capable for MCP and tools:
ollama pull qwen3:0.6b # ~0.4GB
ollama pull qwen3:1.7b # ~1GB
ollama pull qwen3:8b # ~5GB

Verify installation:
ollama list
Look for qwen3:30b (or qwen3:0.6b, etc.).
4. Test the Model:
Run:
ollama run qwen3:30b
Or, for smaller models: ollama run qwen3:0.6b, qwen3:1.7b, or qwen3:8b.
- At the prompt (>>>), type: “Tell me a joke about computers.” Expect something like: “Why did the computer go to therapy? It had an identity crisis after a reboot!” Exit with /bye.

- I tested qwen3:8b on a laptop, and it cracked a solid joke in seconds—the smaller Qwen 3 models are no slouches!
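Prefer scripting over the interactive prompt? The same request can go through Ollama's OpenAI-compatible endpoint from Python. This is a minimal sketch, assuming ollama serve is running on the default port 11434 and a qwen3 model has been pulled; the helper names are my own:

```python
import json
from urllib import request

# Minimal sketch: query a local Qwen 3 model through Ollama's
# OpenAI-compatible chat endpoint (http://localhost:11434/v1).
def build_chat_payload(model: str, prompt: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one complete response instead of a token stream
    }

def ask_ollama(prompt: str, model: str = "qwen3:8b") -> str:
    payload = build_chat_payload(model, prompt)
    req = request.Request(
        "http://localhost:11434/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req, timeout=120) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    try:
        print(ask_ollama("Tell me a joke about computers."))
    except OSError:
        print("Ollama is not running; start it with `ollama serve`.")
```

Because the endpoint speaks the OpenAI wire format, the official openai Python client also works here by pointing its base_url at http://localhost:11434/v1.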

Creating a Qwen 3 Agent with MCP and Tools
Now, let's harness Qwen 3's MCP and tool powers to build an agent that reads a PDF and answers questions, using code from the Qwen-Agent GitHub repo. We'll also test MCP functions to fetch real-time data like time or weather. You can use any PDF—a research paper, a recipe, or even a user manual!
1. Set Up a New Project:
Create and navigate to a project folder:
mkdir qwen-agent-test
cd qwen-agent-test
2. Create a Virtual Environment:
Set up and activate:
python3 -m venv venv
source venv/bin/activate # macOS/Linux
venv\Scripts\activate # Windows
3. Install Qwen-Agent:
Install with MCP and tool dependencies:
pip install -U "qwen-agent[gui,rag,code_interpreter,mcp]"
4. Create the Agent Script:
- Make a file testagent.py and copy the Qwen-Agent example code from the GitHub repository, then modify llm_cfg for Ollama:
# Step 2: Configure the LLM you are using.
llm_cfg = {
    # Use the model service provided by DashScope:
    # 'model': 'qwen-max-latest',
    # 'model_type': 'qwen_dashscope',
    # 'api_key': 'YOUR_DASHSCOPE_API_KEY',
    # It will use the `DASHSCOPE_API_KEY` environment variable if 'api_key' is not set here.

    # Use a model service compatible with the OpenAI API, such as vLLM or Ollama:
    'model': 'qwen3:0.6b',
    'model_server': 'http://localhost:11434/v1',  # base_url, also known as api_base
    'api_key': 'ollama',

    # (Optional) LLM hyperparameters for generation:
    'generate_cfg': {
        'top_p': 0.8
    }
}
- Download a PDF (I tested with a paper titled "AI-driven Vision Systems for Object Recognition and Localization in Robotic Automation" from a research site) and save it as AI-paper.pdf in the project folder. Of course, you can use any PDF: a recipe, a guide, whatever sparks your interest!
5. Start Ollama’s API:
In a separate terminal, run:
ollama serve
This hosts the API at http://localhost:11434.
Keep it running.
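Before running the agent, you can sanity-check that the API is up from Python. This sketch queries Ollama's /api/tags endpoint, which lists locally pulled models; the helper name is my own:

```python
import json
from urllib import request

# Sketch: confirm the Ollama API is reachable and list which models
# have been pulled locally, via the /api/tags endpoint.
def list_local_models(base_url: str = "http://localhost:11434") -> list[str]:
    try:
        with request.urlopen(f"{base_url}/api/tags", timeout=3) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except OSError:
        return []  # server not running (connection refused or timeout)

models = list_local_models()
print(models or "Ollama is not running; start it with `ollama serve`.")
```

If qwen3:30b (or whichever model you pulled) isn't in the list, the agent's requests will fail with a model-not-found error, so this is a useful first thing to check.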
6. Run the Agent:
In the project folder, execute:
python testagent.py

- Qwen 3 will read the PDF and summarize it. For my AI vision paper, it output: “The paper discusses object recognition in robotics, focusing on CNN-based vision systems for real-time localization, achieving 95% accuracy.” Your response will vary based on your PDF—try asking about a recipe’s ingredients or a manual’s steps!
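Curious what Qwen-Agent is doing for you under the hood? At its core it runs a message-accumulation loop: append the user query, call the model, append the reply, repeat. Here's a framework-free sketch of that pattern, with a stand-in fake_llm (hypothetical, just to keep the example self-contained; the real code calls bot.run(messages=messages)):

```python
# A framework-free sketch of the agent loop Qwen-Agent runs for you:
# accumulate messages, call the model, append its reply, repeat.
def fake_llm(messages: list[dict]) -> dict:
    # Stand-in for a real call to the Ollama endpoint.
    last = messages[-1]["content"]
    return {"role": "assistant", "content": f"Summary of: {last}"}

def run_agent(queries: list[str], llm=fake_llm) -> list[dict]:
    messages: list[dict] = []
    for query in queries:
        messages.append({"role": "user", "content": query})
        reply = llm(messages)  # real code: bot.run(messages=messages)
        messages.append(reply)
    return messages

history = run_agent(["What does the paper conclude?"])
print(history[-1]["content"])
```

Keeping the full history in messages is what gives the agent memory across turns, so follow-up questions about the same PDF work naturally.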

7. Test MCP Functions:
- To test Qwen 3's MCP capabilities (beyond tools like PDF reading), modify testagent.py to use MCP servers. Update the bot initialization with:
tools = [
    'my_image_gen',
    'code_interpreter',  # `code_interpreter` is a built-in tool for executing code.
    {
        'mcpServers': {  # MCP servers are launched as subprocesses via uvx
            'time': {
                'command': 'uvx',
                'args': ['mcp-server-time', '--local-timezone=America/New_York']
            },
            'fetch': {
                'command': 'uvx',
                'args': ['mcp-server-fetch']
            }
        }
    }
]
files = ['./AI-paper.pdf']  # Give the bot a PDF file to read.
- Save and rerun:
python testagent.py
- Ask questions like: “What is the time in New York?” or “What is the weather in Sydney?” Qwen 3 intelligently selects the appropriate MCP server (time or fetch) to retrieve real-time data from the web or system. For example, I got: “It’s 3:47 PM in New York.” More details are in the Qwen-Agent GitHub repo, so do check it out.
Exploring Qwen 3’s MCP and Tool-Calling Features
Qwen 3 excels at MCP and agentic tasks. Here's how to push it further:
- Add More Tools: Extend the agent with tools for database queries or web searches via Qwen-Agent's tools module. Reddit suggests browser tasks work well with MCP.
- Toggle Thinking Mode: Use /think in prompts for complex reasoning (e.g., curl http://localhost:11434/api/chat -d '{"model": "qwen3:30b", "messages": [{"role": "user", "content": "Plan a project /think"}]}') or /no_think for quick replies.
- Test Smaller Models: If 30B is too heavy, qwen3:8b still rocks MCP tasks—great for laptops.
- Optimize Performance: Use Unsloth's Q8_XL quantization for speed, as noted on Reddit.
I tested qwen3:8b with a recipe PDF, and it listed ingredients perfectly—Qwen 3's tool-calling is versatile!
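One practical note on thinking mode: the model's reasoning arrives wrapped in <think>...</think> blocks, which you'll usually want to strip before showing the final answer to users. A small sketch:

```python
import re

# Sketch: Qwen 3's thinking mode wraps its chain of reasoning in
# <think>...</think> blocks. Strip them to keep only the final answer.
def strip_thinking(text: str) -> str:
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

raw = "<think>The user wants a plan, so I should...</think>Here is your project plan."
print(strip_thinking(raw))  # Here is your project plan.
```

Qwen-Agent and most chat UIs handle this separation for you, but it matters when you consume the raw API output yourself.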
Wrapping Up: Master Qwen 3 and MCP
Now you've unleashed Qwen 3 30B with MCP and tools to build a PDF-reading agent and tested MCP functions for real-time queries! From installing Ollama and testing Qwen 3 models to creating a Qwen-Agent that summarizes research papers or fetches the weather, you're ready for agentic awesomeness. Try new PDFs, add tools, or document your APIs with Apidog!
Want an integrated, All-in-One platform for your Developer Team to work together with maximum productivity?
Apidog delivers on all your demands, and replaces Postman at a much more affordable price!