Re: llama.cpp vulkan

2025-02-02 Thread openbsd-ports
Seems to work with:

ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6600 (RADV NAVI23) (radv) | uma: 0 | fp16: 1 | warp size: 64 | matrix cores: none

Not a proper benchmark, but with -ngl 32, responding happens *a lot* faster, and I've heard my GPU fan for I think the first
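For context, -ngl (--n-gpu-layers) controls how many model layers llama.cpp offloads to the GPU backend, which is why responses speed up once the Vulkan device is picked up. A hedged usage sketch (the model path is illustrative, not taken from this message):

```shell
# Offload 32 transformer layers to the detected Vulkan device.
# Higher -ngl values offload more layers (up to the model's layer
# count) at the cost of more VRAM; the model file here is an example.
llama-cli -m ./mistral-7b-instruct-v0.2.Q8_0.gguf -ngl 32 -p "Hello"
```

At startup, the ggml_vulkan lines quoted above confirm which device was found and whether fp16 and matrix cores are available.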

Re: llama.cpp vulkan

2025-02-01 Thread Jonathan Armani
Hi,

Seems to work with my Radeon card.

ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6900 XT (RADV NAVI21) (radv) | uma: 0 | fp16: 1 | warp size: 64 | matrix cores: none

llama-cli -m ./mistral-7b-instruct-v0.2.Q8_0.gguf [..]
llama_perf_sampler_print: sampling time =

Re: llama.cpp vulkan

2025-02-01 Thread Stuart Henderson
On 2025/01/31 20:31, Chris Cappuccio wrote:
> This might be a way to do it. Does anyone have a card to test against?

No idea; the descriptions for the vulkan-related ports don't give away much information about what's supported or how to use them. Had a thought earlier, it would be quite interesting

llama.cpp vulkan

2025-01-31 Thread Chris Cappuccio
This might be a way to do it. Does anyone have a card to test against?

Index: Makefile
===
RCS file: /cvs/ports/misc/llama.cpp/Makefile,v
retrieving revision 1.1
diff -u -p -u -r1.1 Makefile
--- Makefile	30 Jan 2025 22:55:11 -
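The diff above is truncated, but upstream llama.cpp enables its Vulkan backend at configure time via a CMake option. A sketch of what the port would be wiring up, assuming upstream's GGML_VULKAN option name (the port's actual Makefile changes are not shown in full here):

```shell
# Build llama.cpp with the Vulkan backend enabled, using upstream's
# documented CMake flag; requires the Vulkan headers and loader.
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release
# On startup, llama-cli then prints the "ggml_vulkan: Found N Vulkan
# devices" lines seen in the replies above.
```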