Re: NEW emulators/llama.cpp b4589

Chris Cappuccio Thu, 30 Jan 2025 10:04:07 -0800

Stuart Henderson [s...@spacehopper.org] wrote:
> 
> I'd be happy with misc. If we end up with dozens of related ports then
> maybe a new category makes sense but misc seems to fit and is not over-full.


Ok, here's a new spin for misc/llama.cpp with your patch applied.

Using this model on an AMD EPYC 7313, I am getting 10 tokens/sec:

llama-cli --model DeepSeek-R1-Distill-Qwen-7B-Q8_0.gguf -c 131072 --threads 16 \
    --temp 0.6

With enough RAM you could run the actual DeepSeek R1. The distilled Qwen 7B is 
less useful.
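
As a rough sketch of what "enough RAM" means here: Q8_0 stores each block of
32 weights as 32 int8 values plus one fp16 scale, so about 8.5 bits per weight.
The parameter counts below (roughly 7.6B for the distill, 671B for DeepSeek R1)
are assumptions that do not come from this thread, and the estimate covers the
weights only.

```python
# Rough weight-memory estimate for a Q8_0 GGUF model.
# Q8_0 packs 32 int8 weights plus one fp16 scale per block: 34 bytes per 32 weights.
Q8_0_BYTES_PER_WEIGHT = 34 / 32


def q8_0_weight_bytes(n_params: float) -> float:
    """Approximate bytes for the weights alone (no KV cache, no runtime overhead)."""
    return n_params * Q8_0_BYTES_PER_WEIGHT


# Parameter counts are assumptions for illustration, not figures from this thread.
models = {
    "DeepSeek-R1-Distill-Qwen-7B": 7.6e9,
    "DeepSeek-R1": 671e9,
}

for name, n_params in models.items():
    gb = q8_0_weight_bytes(n_params) / 1e9
    print(f"{name}: ~{gb:.0f} GB of weights at Q8_0")
```

The KV cache for a large -c value comes on top of this, so treat the numbers
as a lower bound.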

llama.cpp.tar
Description: Unix tar archive
