🧠 Llama.cpp + OpenCL (RX 6600 XT on Ubuntu 24.04)
📅 Summary
- GPU: AMD RX 6600 XT
- Runtime: Mesa Rusticl (no ROCm needed)
- Model Format: .gguf (quantized)
- Server: llama-server HTTP + Web UI
- Client: Web browser or PowerShell Invoke-RestMethod
—
🧰 Step-by-step Procedure
# 1. Optional: ROCm was tried (not strictly needed with Rusticl)
sudo apt install rocm-opencl-dev

# 2. Mesa drivers + OpenCL ICD loader
sudo apt install mesa-opencl-icd clinfo

# 3. Confirm GPU visibility
clinfo | grep 'Device Name'   # → should list the RX 6600 XT

# 4. Get the source
mkdir -p ~/oclLlama
cd ~/oclLlama
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# 5. Install build deps
sudo apt install cmake build-essential \
  libclblast-dev ocl-icd-opencl-dev \
  libcurl4-openssl-dev

# 6. Build with OpenCL
mkdir build && cd build
cmake .. -DLLAMA_CLBLAST=on -DCMAKE_BUILD_TYPE=Release
cmake --build . --config Release -j$(nproc)

# 7. Download or upload a model
mkdir -p ~/oclLlama/llama.cpp/models
# (copy from your laptop or download a .gguf file; see the sketch below)

# 8. Run the API server
./bin/llama-server \
  --model ~/oclLlama/llama.cpp/models/phi-2.Q4_K_M.gguf \
  --host 0.0.0.0 \
  --port 11434 \
  --n-gpu-layers 100
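If step 3 shows no device, Rusticl may need to be enabled explicitly: on some Mesa builds the radeonsi backend is opt-in through an environment variable. A minimal check, assuming the stock Ubuntu 24.04 Mesa packages:

export RUSTICL_ENABLE=radeonsi   # expose the radeonsi Rusticl backend
clinfo | grep 'Device Name'      # the RX 6600 XT should now appear
# set the same variable in the shell that launches llama-server (step 8)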
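For step 7, one way to fetch a quantized phi-2 model from the shell; the Hugging Face repo and file name here are assumptions, and any .gguf file works:

cd ~/oclLlama/llama.cpp/models
# URL assumed; TheBloke's GGUF mirrors follow this pattern
wget https://huggingface.co/TheBloke/phi-2-GGUF/resolve/main/phi-2.Q4_K_M.gguf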
—
🌍 Test Access from Browser
- ✅ Open http://ryzen-ubuntu.facundoitest.space:11434/ in a browser; the built-in Web UI loads
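The server can also be probed from a shell on the Ubuntu box itself; llama-server answers plain GET requests on /health and /v1/models. A quick sketch, assuming the server from step 8 is running on port 11434:

curl http://localhost:11434/health      # {"status":"ok"} once the model is loaded
curl http://localhost:11434/v1/models   # lists the loaded .gguf model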
—
🧪 Test from PowerShell
Invoke-RestMethod -Uri "http://ryzen-ubuntu.facundoitest.space:11434/v1/completions" `
  -Method Post `
  -ContentType "application/json" `
  -Body '{
    "model": "phi-2.Q4_K_M.gguf",
    "prompt": "OpenCL advantages?",
    "max_tokens": 64
  }'
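llama-server also exposes the OpenAI-style chat endpoint. The equivalent request from a Linux shell, assuming the same host and model as above:

curl http://ryzen-ubuntu.facundoitest.space:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "phi-2.Q4_K_M.gguf",
        "messages": [{"role": "user", "content": "OpenCL advantages?"}],
        "max_tokens": 64
      }'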
—
✅ Confirm GPU Usage
radeontop                              # real-time GPU load while inferencing
strings ./bin/main | grep -i clblast   # confirms the binary was built with CLBlast
ldd ./bin/main | grep -i opencl        # confirms it links against the OpenCL ICD
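Besides radeontop, the amdgpu kernel driver exposes VRAM counters through sysfs. A quick check; the card0 index is an assumption and varies per machine:

# VRAM in use (bytes); should jump when llama-server loads the model
cat /sys/class/drm/card0/device/mem_info_vram_used
cat /sys/class/drm/card0/device/mem_info_vram_total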
—
🧹 Optional Cleanup
sudo apt purge rocm-opencl-dev amdgpu-install
