Raspberry Pi 5 successfully accelerates LLMs using an eGPU and Vulkan

A Raspberry Pi 5 hooked up to an AMD Radeon-powered eGPU has been demonstrated using the graphics hardware to accelerate a Large Language Model (LLM). Of course, it’s Pi wizard Jeff Geerling again, and in his video he talks us through his experience of leveraging Vulkan API support to enjoy GPU-accelerated local AI on the Raspberry Pi 5.

In our last progress report on the Raspberry Pi 5 connected to an eGPU, we highlighted the modern AAA 4K gaming possibilities of this unlikely pairing. Games like Doom Eternal, Crysis Remastered, Red Dead Redemption 2, and Forza Horizon 4 were demoed running at 4K on our favorite $50 SBC. With most of them struggling to stay above roughly 25fps, whether the titles were actually enjoyable to play is another question.

Raspberry Pi 5 accelerates LLMs using an eGPU

Geerling ended his previous fun and informative video with an update on the Pi 5’s LLM support. He noted that he hadn’t managed to GPU-accelerate any LLMs on the Pi 5, but that smaller models could run on the CPU, in the Pi’s RAM. Moreover, with AMD basically ruling out ROCm support on Arm, prospects didn’t look good.

Thankfully, in the world of enthusiast-driven tech, things can change quickly. In his latest video, Geerling reveals that the answer to GPU-accelerated LLMs on the Pi 5 is the Vulkan API (with an experimental patch). Geerling notes that Vulkan can even outperform AMD’s ROCm on systems that offer a choice between the two, so it is by no means merely a poor man’s choice.

At around two minutes into the video, Geerling walks us through his hardware setup. The most esoteric parts here are the two boards used to hook up the GPU to the Pi. He used an adaptor to convert the Pi’s PCIe FFC connector to an M.2 slot. Into the M.2 slot, he plugged an M.2 to OCuLink adaptor, with a cable running to a GPU OCuLink riser. In the video, he uses an RX 6700 XT again (you’ll need a spare PC PSU too, among several other bits and pieces).

Software setup is currently a bit more involved, requiring the user to compile their own Linux kernel and collect a handful of drivers and patches, among other steps. More guidance is available via Geerling’s blog.

Casting more light onto the benefits of his hardware and software wrangling, the Pi enthusiast and TechTuber provides some performance figures and comparisons.

It is interesting to hear Geerling propose the Pi plus eGPU as an alternative that is almost as fast and efficient as an M1 Max Mac Studio (64GB). He also highlighted that the cost of the whole caboodle is about $700 new, but a lot cheaper if you already have some of the bits and pieces (especially for those with a spare old GPU).

Adding the RTX 4090 benchmark to the mix (second slide) shows how much LLM performance a powerful modern PC can muster. That’s great if you want a 600W system generating hundreds of tokens per second (T/s), but for offline AI at home, 40-60 T/s should be plenty. Moreover, whoever pays your energy bill might be pleased with the ~12W idle power consumption of this efficient Pi-based (Pi 5 plus RX 6700 XT) system.
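To put those figures in everyday terms, here is a rough back-of-the-envelope sketch in Python. It only uses the numbers quoted above (40-60 T/s and ~12W at idle); the 600-token response length and the function names are our own illustrative assumptions, not anything taken from Geerling’s benchmarks.

```python
# Back-of-the-envelope arithmetic using the figures quoted in the article.
# Assumptions (not from the source): a 600-token response, and a system
# that sits idle around the clock for a full year.

def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Time needed to generate num_tokens at a steady generation rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def annual_idle_kwh(idle_watts: float) -> float:
    """Energy used in a year (365 days) by a system idling at idle_watts."""
    hours_per_year = 24 * 365
    return idle_watts * hours_per_year / 1000


if __name__ == "__main__":
    response_tokens = 600  # hypothetical response length
    for rate in (40, 60):  # T/s range quoted in the article
        secs = seconds_to_generate(response_tokens, rate)
        print(f"{response_tokens} tokens at {rate} T/s: {secs:.1f} s")
    print(f"Idle at 12 W for a year: {annual_idle_kwh(12):.2f} kWh")
```

On those assumptions, a fairly long answer arrives in 10 to 15 seconds, and a full year of idling works out to roughly 105 kWh. Power draw during inference will be considerably higher than the idle figure, so treat the energy number as a floor rather than a total.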

Mark Tyson

Mark Tyson is a news editor at Tom’s Hardware. He enjoys covering the full breadth of PC tech; from business and semiconductor design to products approaching the edge of reason.