I’ve had my hands on the new Orange Pi 5 Pro for over a month now, and it’s a pretty solid device. It’s a bit more expensive than a Raspberry Pi 5, but it also comes with a few more features. One of those killer features is an NPU that can run at up to 6 TOPS. That’s all well and good, but making use of it is convoluted and takes some professional programming skills to fully unlock. Although Orange Pi doesn’t have the massive community that Raspberry Pi does, there is a dedicated cadre of developers working hard to open up the NPU on the Orange Pi.

How to run an LLM on the Orange Pi 5 Pro’s NPU

Full disclosure: getting access to the NPU on the Orange Pi is kind of a pain. You need to install a bespoke build of the latest version of Ubuntu, and you need to install special software that will let you run specially converted LLMs on your NPU instead of your CPU. The whole process is definitely more complicated than running Ollama, but you can get some serious gains if you're up to the challenge. So, if you’re still here, let’s dive into it!

Installing the operating system

The first thing you need to do is install the proper operating system. I recommend a customized Ubuntu built specifically for Rockchip SoCs by GitHub user Joshua Riek. You’re going to need version 24.04 because it has the latest version of the NPU driver, which is needed to run the LLMs. Flashing the OS onto an SD card for your Orange Pi is basically the same as for a Raspberry Pi, but here’s a quick overview:

1. Download the OS image to your computer of choice.
2. Open up a program that will help flash your SD card. We'll use balenaEtcher for this guide.
3. Select Flash from file and select the OS image file you downloaded earlier.
4. Make sure your SD card is inserted and click Select target.
5. Choose your SD card from the list and click Select.
6. Select Flash!

This process can take up to 10 minutes to complete, so just be patient while the program does its thing.
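
If the download page publishes a SHA-256 checksum next to the image, it can be worth comparing it against your copy before you flash, since a corrupted download tends to show up later as a board that won't boot. Here is a minimal sketch, assuming you are on a Linux machine with sha256sum available. The function name verify_image is our own, and the file name and checksum you pass in are placeholders for whatever you actually downloaded.

```shell
#!/usr/bin/env bash
# Sketch: compare a downloaded image against a published SHA-256 checksum.
# verify_image is a hypothetical helper; both arguments are placeholders
# that you replace with your real file name and the checksum from the site.
verify_image() {
    local image="$1" expected="$2" actual
    # sha256sum prints "<hash>  <filename>"; keep only the hash.
    actual="$(sha256sum "$image" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "OK: checksum matches"
    else
        echo "MISMATCH: expected $expected, got $actual" >&2
        return 1
    fi
}
```

You would call it as verify_image followed by the path to the image and the published checksum. If the image is distributed compressed, the published checksum usually refers to the compressed file, so check that one rather than the extracted image.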

Once you’ve successfully flashed the OS, transfer the SD card into your Orange Pi and power it up. Make sure you have a keyboard and monitor handy, because we’ll need direct access to our SBC at least long enough to install SSH.

Installing SSH

After you get through the initial OS setup on your OPi, open Terminal by pressing Ctrl + Alt + T and type sudo apt install openssh-server. This will allow you to access your Orange Pi from another computer. We’re going to all this trouble because the program that runs the LLM won’t work if the desktop is running, but it will from the SSH terminal.

Before you leave your Orange Pi, take note of its IP address, and then log in via SSH. If SSH isn’t your thing, you can still follow this guide on your device, but you’ll have to press Ctrl + Alt + F5 to exit your desktop and work exclusively in your shell.
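
For reference, here is a rough sketch of the SSH steps above. The helper first_ipv4 is our own invention: it just picks the first IPv4 address out of the list that hostname -I prints on Ubuntu. The user name and address in the comments are placeholders, so swap in your own.

```shell
#!/usr/bin/env bash
# Sketch: pick the first IPv4 address from a space-separated list,
# such as the output of `hostname -I`. first_ipv4 is a hypothetical helper.
first_ipv4() {
    local addr
    for addr in $1; do
        case "$addr" in
            *:*) continue ;;                     # skip IPv6 addresses
            *.*.*.*) echo "$addr"; return 0 ;;   # looks like dotted IPv4
        esac
    done
    return 1
}

# On the Orange Pi (left as comments so nothing runs by accident):
#   sudo apt update && sudo apt install -y openssh-server
#   first_ipv4 "$(hostname -I)"
#
# Then, from another computer, using a placeholder user and address:
#   ssh youruser@192.168.1.42
```

If the board has both wired and wireless connections active, hostname -I prints more than one address, so make sure the one you pick is reachable from the computer you are connecting from.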

Installing RKNN LLM and RKNN Toolkit 2

Now we can begin installing the software that will run our LLMs. RKNN LLM is the program that will run the LLM on our machine. RKNN Toolkit 2 is the bit of software that lets other software talk to the NPU. We’re going to install both in one fell swoop with a script provided by GitHub user Pelochus. Type this into your terminal:

sudo curl https://raw.githubusercontent.com/Pelochus/ezrknpu/main/install.sh | sudo bash

This will take between 5 and 10 minutes to run, so just be patient.
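
If piping a script from the internet straight into a root shell makes you nervous, you can save it to disk and read it first. The sketch below uses the same URL as the command above. The function names and the temporary path in the usage note are our own choices, and we are assuming the installer behaves the same when run from a file as it does when piped in.

```shell
#!/usr/bin/env bash
# Sketch: save the installer so it can be inspected before it runs as root.
# The URL comes from the guide; the function names are hypothetical.
INSTALLER_URL="https://raw.githubusercontent.com/Pelochus/ezrknpu/main/install.sh"

fetch_installer() {
    local dest="$1"
    # -f: fail on HTTP errors, -sS: quiet but still report errors,
    # -L: follow redirects, -o: write to a file instead of stdout
    curl -fsSL "$INSTALLER_URL" -o "$dest"
}

check_installer() {
    local file="$1"
    # Refuse an empty or missing download.
    [ -s "$file" ] || { echo "Download is empty or missing: $file" >&2; return 1; }
    # bash -n parses the script for syntax errors without executing it.
    bash -n "$file" || return 1
    echo "Parsed cleanly: $(wc -l < "$file") lines"
}
```

A typical run would be fetch_installer /tmp/ezrknpu-install.sh, then check_installer /tmp/ezrknpu-install.sh, then a read-through with less, and finally sudo bash /tmp/ezrknpu-install.sh once you are happy with what it does. A clean parse only tells you the download is a complete shell script, not that it is safe, so the read-through is the part that matters.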

Installing an LLM

Once you’re done installing RKNN LLM and RKNN Toolkit 2, you can install a model (which will take another 5 to 10 minutes). In order for an LLM to take advantage of the NPU on the Rockchip RK3588S SoC on your Orange Pi, it needs to be converted using the RKNN Toolkit 2 (which is far beyond the scope of this guide).

Fortunately, Pelochus maintains a Hugging Face repository of LLMs converted to be used with the RK3588S. Unfortunately, not all of those models are compatible with the software we just downloaded. You need to look for a model that has been converted with the RKLLM runtime 1.0.1. We’re going to install Microsoft’s Phi-3 Mini model with 3.8B parameters. Enter this command into your terminal:

GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/Pelochus/phi-3-mini-rk3588

The first part of the command in all-caps will ensure that we only clone the smaller files first. If we attempt to clone the entire repository at once, we might get some errors. Next, navigate to the new directory we just made (cd ~/phi-3-mini-rk3588 if you cloned it into your home directory) and run the command git lfs pull. This will download the big, multi-gigabyte model file.
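
Because the clone step only fetches small placeholder files, it is easy to end up with a model file that is really just a Git LFS pointer a few lines long. Here is a small sketch for telling the two apart. It relies on the fact that LFS pointer files are plain text starting with a "version https://git-lfs.github.com/spec/v1" line. The function names are our own.

```shell
#!/usr/bin/env bash
# Sketch: detect whether a file is still a Git LFS pointer stub.
# is_lfs_pointer and report_model are hypothetical helper names.
is_lfs_pointer() {
    local file="$1"
    # Pointer stubs are tiny text files whose first line names the LFS spec.
    head -c 200 "$file" | grep -q '^version https://git-lfs.github.com/spec/v1'
}

report_model() {
    local file="$1"
    if [ ! -f "$file" ]; then
        echo "missing: $file"
        return 1
    elif is_lfs_pointer "$file"; then
        echo "pointer only: run 'git lfs pull' inside the repository"
        return 1
    else
        # du -h prints "<size><TAB><path>"; keep just the size.
        echo "downloaded: $file ($(du -h "$file" | cut -f1))"
    fi
}
```

Running report_model on the .rkllm file inside the cloned directory should report a size in the gigabytes once git lfs pull has finished. If git complains that lfs is not a command, installing the git-lfs package with apt should fix that.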

Running an LLM on your Orange Pi 5 Pro NPU

If everything went according to plan, you should be ready to launch your LLM. In Terminal, type rkllm phi-3-mini-4k-rk3588.rkllm.
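
If you want some reassurance that the model is really running on the NPU and not quietly falling back to the CPU, you can peek at what the NPU driver reports. The sketch below assumes the Rockchip driver exposes version and load files under /sys/kernel/debug/rknpu, which is where we would expect them on this kind of kernel. Treat that path as an assumption and adjust it if your system differs. The function name npu_status is our own.

```shell
#!/usr/bin/env bash
# Sketch: print what the NPU driver reports about itself.
# ASSUMPTION: the driver's debugfs files live in this directory.
RKNPU_DEBUG_DIR="${RKNPU_DEBUG_DIR:-/sys/kernel/debug/rknpu}"

# npu_status is a hypothetical helper; pass a directory to override the default.
npu_status() {
    local dir="${1:-$RKNPU_DEBUG_DIR}" name
    for name in version load; do
        if [ -r "$dir/$name" ]; then
            echo "$name: $(cat "$dir/$name")"
        else
            # debugfs is normally readable by root only.
            echo "$name: unavailable (try running as root)"
        fi
    done
}
```

Since debugfs is usually restricted to root, the simplest approach is to save this as a script, add a call to npu_status at the bottom, and run it with sudo from a second SSH session while the model is generating text. If the load figures stay at zero while the model is answering, the NPU probably isn't being used.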

So how does it run? It’s much faster than the Phi-3 that we ran through Ollama on an overclocked Raspberry Pi 5, but it also seems to suffer from multiple personality disorder where it will talk to itself. It also appears to have a hard limit to its output and will cut itself off mid-sentence if it reaches its limit (which might be a good thing given how verbose Phi-3 can be).