Llama.cpp + OpenCL (RX 6600 XT on Ubuntu 24.04)
Summary
- GPU: AMD RX 6600 XT
- Runtime: Mesa Rusticl (no ROCm needed)
- Model Format: .gguf (quantized)
- Server: llama-server HTTP + Web UI
- Client: Web browser or PowerShell Invoke-RestMethod
Step-by-step Procedure
Based on this shell command history:
  4  sudo apt install rocm-opencl-dev
  5  mkdir rocm
  6  cd rocm
  7  wget https://repo.radeon.com/amdgpu-install/6.3.3/ubuntu/noble/amdgpu-install_6.3.60303-1_all.deb
  8  sudo apt install ./amdgpu-install_6.3.60303-1_all.deb
 11  sudo reboot
 16  cd rocm/
 18  sudo apt update
 19  sudo apt full-upgrade
 20  sudo apt install rocm-opencl-dev
 22  sudo apt install clinfo
 24  clinfo
 31  sudo apt install mesa-opencl-icd clinfo
 32  sudo nano /etc/drirc
 33  sudo reboot
 35  sudo clinfo | grep 'Device Name'
 36  mkdir oclLlama
 37  cd oclLlama/
 38  git clone https://github.com/ggerganov/llama.cpp
 39  cd llama.cpp
 46  make LLAMA_OPENCL=1
 47  sudo apt install cmake build-essential libclblast-dev ocl-icd-opencl-dev
 48  mkdir build
 49  cd bould
 50  cd build/
 51  cmake .. -DLLAMA_CLBLAST=on
 52  sudo apt install curl
 53  sudo apt install libcurl4-openssl-dev
 54  cmake .. -DLLAMA_CLBLAST=on
 55  cmake --build . --config Release -j$(nproc)
 59  ls -lh /home/facundo/oclLlama/llama.cpp/models/*.gguf
 60  cd ~/oclLlama/llama.cpp/models
 68  ls *.gguf
 69  ./bin/llama-server --model /home/facundo/oclLlama/llama.cpp/models/phi-2.Q4_K_M.gguf --host 0.0.0.0 --port 11434 --n-gpu-layers 100
 70  ls ..
 71  cd ..
 72  ls
 73  cd build/        # here I uploaded the .gguf from my laptop
 74  ./bin/llama-server --model /home/facundo/oclLlama/llama.cpp/models/phi-2.Q4_K_M.gguf --host 0.0.0.0 --port 11434 --n-gpu-layers 100
Gepetto said:
# 1. Optional: ROCm was tried (not strictly needed with Rusticl)
sudo apt install rocm-opencl-dev

# 2. Mesa drivers + OpenCL ICD loader
sudo apt install mesa-opencl-icd clinfo

# 3. Confirm GPU visibility
clinfo | grep 'Device Name'   # should list the RX 6600 XT

# 4. Get the source
mkdir -p ~/oclLlama
cd ~/oclLlama
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# 5. Install build deps
sudo apt install cmake build-essential \
    libclblast-dev ocl-icd-opencl-dev \
    libcurl4-openssl-dev

# 6. Build with OpenCL
mkdir build && cd build
cmake .. -DLLAMA_CLBLAST=on -DCMAKE_BUILD_TYPE=Release
cmake --build . --config Release -j$(nproc)

# 7. Download or upload a model
mkdir -p ~/oclLlama/llama.cpp/models
# (SCP from your laptop or download a .gguf file into the models/ directory)

# 8. Run the API server
./bin/llama-server \
    --model ~/oclLlama/llama.cpp/models/phi-2.Q4_K_M.gguf \
    --host 0.0.0.0 \
    --port 11434 \
    --n-gpu-layers 100
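For step 7, the model was copied over from a laptop. Below is a minimal sketch of the two usual ways to get a .gguf file into the models/ directory; the laptop-side path and the Hugging Face repository are assumptions, so adjust them to your own setup:

# Option A: copy from the laptop (run this on the laptop; source path is a placeholder)
scp ~/Downloads/phi-2.Q4_K_M.gguf facundo@ryzen-ubuntu.facundoitest.space:~/oclLlama/llama.cpp/models/

# Option B: download directly on the server (assumed Hugging Face repo for the Q4_K_M quant)
cd ~/oclLlama/llama.cpp/models
wget https://huggingface.co/TheBloke/phi-2-GGUF/resolve/main/phi-2.Q4_K_M.gguf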
Test Access from Browser
- The built-in Web UI opens at http://ryzen-ubuntu.facundoitest.space:11434/
Test from PowerShell
Invoke-RestMethod -Uri "http://ryzen-ubuntu.facundoitest.space:11434/v1/completions" `
  -Method Post `
  -ContentType "application/json" `
  -Body '{ "model": "phi-2.Q4_K_M.gguf", "prompt": "OpenCL advantages?", "max_tokens": 64 }'
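The same request from a Linux or macOS shell, for reference; a minimal curl sketch against the OpenAI-compatible /v1/completions endpoint used above:

curl http://ryzen-ubuntu.facundoitest.space:11434/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi-2.Q4_K_M.gguf",
    "prompt": "OpenCL advantages?",
    "max_tokens": 64
  }'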
Confirm GPU Usage
radeontop                               # real-time GPU load
strings ./bin/main | grep -i clblast    # binary was built against CLBlast
ldd ./bin/main | grep -i opencl         # binary links against the OpenCL loader
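radeontop shows overall load; to confirm that the model weights actually landed in VRAM, you can also watch the amdgpu sysfs counters while the server loads the model. A rough sketch (the card index under /sys/class/drm may differ on your system):

# Print VRAM usage in MiB once per second while llama-server starts up
watch -n 1 'for d in /sys/class/drm/card*/device/mem_info_vram_used; do
  echo "$d: $(( $(cat "$d") / 1024 / 1024 )) MiB used"
done'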
Optional Cleanup
sudo apt purge rocm-opencl-dev amdgpu-install
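If you also want to drop the dependencies those packages pulled in and the leftover installer directory from the command history above (a small optional addition; only the ~/rocm directory created earlier is assumed):

sudo apt autoremove
rm -rf ~/rocm    # holds the amdgpu-install .deb downloaded earlier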
