I’ve been playing around a lot more recently with LLMs running locally on my PC, and for the most part, that’s been with Ollama.
Ollama is a fantastic tool that makes it extremely simple to download and run LLMs on your own PC, made even easier with the launch of its new GUI application. It can hook into your workflow beautifully, but it has one glaring flaw.
If you don’t have a dedicated GPU, performance takes a real hit. I’ve been using it with an RTX 5080 (with an RTX 5090 to try soon), and it flies along, but on something like the new Geekom A9 Max mini PC, which just landed for review, it’s a very different story.
Ollama does not officially support integrated graphics on Windows. Out of the box, it’ll default to using the CPU, and honestly, there are times when I’m not interested in looking for workarounds.
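You can see this for yourself: Ollama’s local API will tell you whether a loaded model actually ended up in VRAM. A quick sketch, assuming the default endpoint on localhost:11434 and that the /api/ps response includes size and size_vram fields (treat those field names as assumptions rather than gospel):

```python
# Rough sketch: ask a running Ollama instance where its loaded models live.
# Assumes the default local endpoint (localhost:11434) and that /api/ps
# reports per-model "size" and "size_vram" fields; both are assumptions.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:11434/api/ps") as resp:
    data = json.load(resp)

for model in data.get("models", []):
    total = model.get("size", 0)
    in_vram = model.get("size_vram", 0)
    location = "GPU" if total and in_vram >= total else "CPU (at least partly)"
    print(f"{model.get('name')}: {in_vram}/{total} bytes in VRAM -> {location}")
```

On a machine with no supported GPU, that VRAM figure sits at zero, which is exactly the problem.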
Rather than hunting for those workarounds, enter LM Studio, which makes it so simple to leverage an integrated GPU that even I can do it in seconds. That’s what I want, that’s what you should want, and it’s why you need to ditch Ollama if you want to use your integrated GPU.
What is LM Studio?
Without going too far into the weeds, LM Studio is another app you can use on your Windows PC to download LLMs and play about with them. It does things a little differently from Ollama, but the end result is the same.
It also has a significantly more advanced GUI than Ollama’s official app, which is another point in its favor. To get the most from Ollama outside of the terminal, you need something third-party, such as Open WebUI or the Page Assist browser extension.
It’s ultimately a one-stop shop to find and install models, and then interact with them through a familiar-feeling chatbot interface. There are many advanced features you can play with, but for now, we’ll leave it simple.
The big win is that LM Studio supports Vulkan, which means you can offload models to the integrated GPU for compute on both AMD and Intel systems. That’s a big deal, because I’ve yet to see a situation in my own testing where using the GPU wasn’t faster than the CPU.
So, how do you use an integrated GPU in LM Studio?
The other beauty of going the LM Studio route is that there’s nothing fancy or technical to do in order to use your iGPU with an LLM.
To use a model, you simply load it up with the dropdown box at the top. When you select the one you want, a bunch of settings will appear. In this instance, we’re only really interested in the GPU offload one.
It’s a sliding scale and works in layers. Layers are essentially the blocks that make up the LLM, and data passes through them one by one until the final layer produces the response. It’s up to you how many you offload, but anything below all of them means your CPU will be picking up some of the slack.
Once you’re happy, click load model, and it’ll load into memory and operate within your designated parameters.
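You don’t have to stay in the chat window, either. LM Studio can also expose an OpenAI-compatible local server (it’s an option in the app, listening on port 1234 by default), so a few lines of Python can talk to the model you just offloaded to the iGPU. A minimal sketch, assuming that server is switched on and using a placeholder model name:

```python
# Minimal sketch: send one prompt to LM Studio's OpenAI-compatible local server.
# Assumes the server is enabled in LM Studio on the default port (1234); the
# model name below is a placeholder, so use whatever identifier LM Studio
# shows for the model you actually loaded.
import json
import urllib.request

payload = {
    "model": "local-model",  # placeholder identifier
    "messages": [{"role": "user", "content": "Explain why iGPU offload helps."}],
    "temperature": 0.7,
}

req = urllib.request.Request(
    "http://localhost:1234/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    reply = json.load(resp)

print(reply["choices"][0]["message"]["content"])
```

Because it speaks the same API shape as OpenAI’s, most tools that accept a custom endpoint can point at it, all while the model runs on your integrated graphics.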
As an example, the aforementioned Geekom A9 Max has Radeon 890M integrated graphics, and I want to use basically all of it. When I’m using AI, I’m not gaming, so I want the whole GPU focused on the LLM. I set 16GB of my 32GB of total system memory as reserved for the GPU to load the model into, and get to work.
With a model like gpt-oss:20b, I can load the entire thing into that dedicated GPU memory, use the GPU for compute, leave the rest of the system memory and the CPU well alone, and get around 25 tokens per second.
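If you want to sanity-check a figure like that yourself, the same local server makes a rough measurement easy: time one response and divide the completion tokens it reports by the elapsed time. A sketch under the same assumptions as before (default port, placeholder model name, and an OpenAI-style usage block in the response):

```python
# Rough throughput check against LM Studio's local server: time one completion
# and divide reported completion tokens by elapsed time. Assumes the default
# port and that the response includes an OpenAI-style "usage" block; numbers
# will vary with prompt length and context size.
import json
import time
import urllib.request

payload = {
    "model": "local-model",  # placeholder, as before
    "messages": [{"role": "user", "content": "Write a short paragraph about mini PCs."}],
    "max_tokens": 256,
}

req = urllib.request.Request(
    "http://localhost:1234/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

start = time.perf_counter()
with urllib.request.urlopen(req) as resp:
    reply = json.load(resp)
elapsed = time.perf_counter() - start

tokens = reply.get("usage", {}).get("completion_tokens", 0)
if tokens:
    print(f"{tokens} tokens in {elapsed:.1f}s, roughly {tokens / elapsed:.1f} tok/s")
else:
    print("No usage stats in the response; check LM Studio's own speed readout instead.")
```

Bear in mind this folds prompt processing into the timing, so it’ll read a touch lower than the generation speed LM Studio shows under each response.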
Is it as fast as my desktop PC with an RTX 5080 inside? Not at all. Is it faster than using the CPU? Absolutely, and with the added benefit that it isn’t using almost all of my available CPU resources, resources that other software on the PC might want to use. The iGPU is basically dormant most of the time; why wouldn’t you want to use it?
I could probably get even more performance if I spent the time getting into the weeds, but that’s not the focus here. The focus is on LM Studio, and how, if you don’t have a dedicated GPU, it’s hands-down the tool to use for local LLMs. Just install it and everything works.