TechUpdateAlert

    How-To

    NVIDIA RTX 5090 outperforms AMD and Apple running local OpenAI language models

By techupdateadmin · October 20, 2025

Developers and creatives looking for greater control and privacy over their AI are increasingly turning to locally run models like OpenAI’s new gpt-oss family, which is both lightweight and remarkably capable on end-user hardware. Indeed, the smaller model can run on consumer GPUs with just 16GB of memory. That opens the door to a wide range of hardware – with NVIDIA GPUs emerging as the best way to run these sorts of open-weight models.

While nations and companies rush to develop their own bespoke AI solutions for a range of tasks, open-source and open-weight models like OpenAI’s new gpt-oss-20b are seeing much wider adoption. This latest release is roughly comparable to the GPT-4o mini model, which proved so successful over the past year. It also introduces chain-of-thought reasoning for working through problems step by step, adjustable reasoning levels so you can tune its thinking effort on the fly, an expanded context length, and efficiency tweaks that help it run on local hardware, like NVIDIA’s GeForce RTX 50 Series GPUs.

But you will need the right graphics card to get the best performance. NVIDIA’s GeForce RTX 5090 is the company’s flagship card, and it’s extremely fast in both gaming and a range of professional workloads. With its Blackwell architecture, tens of thousands of CUDA cores, and 32GB of memory, it’s a great fit for running local AI.

Llama.cpp is an open-source framework that lets you run LLMs (large language models) with great performance, especially on RTX GPUs, thanks to optimizations made in collaboration with NVIDIA. It also offers plenty of flexibility to adjust quantization techniques and CPU offloading.
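To give a concrete sense of how this works, here is a minimal command-line sketch. The model filename and flag values are illustrative assumptions: you would first download a quantized GGUF build of gpt-oss-20b (for example from Hugging Face) and consult the llama.cpp documentation for your build.

```shell
# Run gpt-oss-20b interactively with llama.cpp's CLI.
# "gpt-oss-20b.gguf" is a placeholder filename for a downloaded GGUF model.
./llama-cli \
  -m gpt-oss-20b.gguf \
  -ngl 99 \
  -p "Explain what a token is in one paragraph."

# Measure throughput (tok/s) rather than chatting:
./llama-bench -m gpt-oss-20b.gguf -ngl 99
```

The `-ngl` flag controls how many model layers are offloaded to the GPU; setting it high pushes as much of the model as possible into the RTX card’s VRAM.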

The llama.cpp project has published its own tests of gpt-oss-20b, in which the GeForce RTX 5090 topped the charts at an impressive 282 tok/s, compared with the Mac M3 Ultra (116 tok/s) and AMD’s Radeon RX 7900 XTX (102 tok/s). The GeForce RTX 5090 includes built-in Tensor Cores designed to accelerate AI tasks, maximizing performance when running gpt-oss-20b locally.

Note: tok/s, or tokens per second, measures how quickly a model can process tokens (the chunks of text it reads or outputs in one step).
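To put those figures in perspective, a few lines of arithmetic (using the benchmark numbers quoted above) show the relative speedups and what they mean for real-world generation time:

```python
# Published llama.cpp gpt-oss-20b throughput figures, in tokens per second.
results = {
    "GeForce RTX 5090": 282,
    "Mac M3 Ultra": 116,
    "Radeon RX 7900 XTX": 102,
}

leader = max(results, key=results.get)
for name, toks in results.items():
    speedup = results[leader] / toks   # how much faster the leader is
    secs_for_500 = 500 / toks          # time to emit a 500-token answer
    print(f"{name}: {toks} tok/s "
          f"({speedup:.2f}x, {secs_for_500:.1f}s per 500 tokens)")
```

By this measure the RTX 5090 is roughly 2.4x faster than the M3 Ultra and about 2.8x faster than the 7900 XTX, which is the difference between waiting under two seconds and waiting around five seconds for a 500-token answer.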


For AI enthusiasts who just want to use local LLMs with these NVIDIA optimizations, consider the LM Studio application, built on top of llama.cpp. LM Studio adds support for RAG (retrieval-augmented generation) and is designed to make running and experimenting with LLMs easy, without needing to wrestle with command-line tools or deep technical setup.


Another popular open-source framework for AI testing and experimentation is Ollama. It’s great for trying out different AI models, including the OpenAI gpt-oss models, and NVIDIA worked closely with the project to optimize performance, so you’ll get great results running it on an NVIDIA GeForce RTX 50 Series GPU. Ollama handles model downloads, environment setup, and GPU acceleration automatically, and its built-in model management supports multiple models simultaneously, integrating easily with applications and local workflows.

Ollama also offers an easy way for end users to test the latest gpt-oss models. And, much like llama.cpp, other applications build on Ollama to run LLMs. One such example is AnythingLLM, whose straightforward local interface makes it an excellent choice for those just getting started with LLM benchmarking.
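As a sketch of how simple that is in practice: the model tag below is the one Ollama uses for gpt-oss at the time of writing, but check the Ollama model library to confirm the current tag before running these commands.

```shell
# Download and chat with the model from the terminal.
ollama pull gpt-oss:20b
ollama run gpt-oss:20b "Summarize what tokens per second measures."

# Ollama also serves a local HTTP API (port 11434 by default) that
# applications such as AnythingLLM can connect to:
curl http://localhost:11434/api/generate \
  -d '{"model": "gpt-oss:20b", "prompt": "Hello", "stream": false}'
```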


If you have one of the latest NVIDIA GPUs (or even if you don’t, and don’t mind the performance hit), you can try out gpt-oss-20b yourself on a range of platforms. LM Studio is great if you want a slick, intuitive interface that lets you grab any model you want to try, and it works equally well on Windows, macOS, and Linux.

AnythingLLM is another easy-to-use option for running gpt-oss-20b, and it works on both Windows x64 and Windows on ARM. There’s also Ollama, which isn’t as slick to look at, but it’s great if you know what you’re doing and want to get set up quickly.

    Whichever application you use to play around with gpt-oss-20b, though, the latest NVIDIA Blackwell GPUs seem to offer the best performance.
