Seems to work with
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6600 (RADV NAVI23) (radv) | uma: 0 | fp16: 1 | warp size: 64 | matrix cores: none
Not a proper benchmark, but with -ngl 32, responses come back *a lot*
faster, and I've heard my GPU fan spin up for, I think, the first time.
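For a less anecdotal comparison, llama.cpp also ships a llama-bench tool that can sweep parameter values in one run; a sketch like this (the model path is just the one mentioned later in the thread, adjust to whatever GGUF you have) would print tokens-per-second with and without Vulkan offload:

```shell
# Hypothetical invocation: -ngl takes a comma-separated list of values to sweep.
# -ngl 0 keeps all layers on the CPU; -ngl 32 offloads 32 layers to the GPU.
llama-bench -m ./mistral-7b-instruct-v0.2.Q8_0.gguf -ngl 0,32
```

The tool prints one result row per -ngl value, which makes the CPU-vs-GPU difference easy to read off directly.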
Hi,
Seems to work with my Radeon card.
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6900 XT (RADV NAVI21) (radv) | uma: 0 | fp16: 1 | warp size: 64 | matrix cores: none
llama-cli -m ./mistral-7b-instruct-v0.2.Q8_0.gguf
[..]
llama_perf_sampler_print:sampling time =
On 2025/01/31 20:31, Chris Cappuccio wrote:
> This might be a way to do it. Does anyone have a card to test against?
No idea; the descriptions of the Vulkan-related ports don't give away much
about what's supported or how to use them.
Had a thought earlier, it would be quite interesting
This might be a way to do it. Does anyone have a card to test against?
Index: Makefile
===================================================================
RCS file: /cvs/ports/misc/llama.cpp/Makefile,v
retrieving revision 1.1
diff -u -p -u -r1.1 Makefile
--- Makefile	30 Jan 2025 22:55:11 -