On 2/27/26 16:44, Koen Kooi wrote:
CAUTION: This email comes from a non Wind River email account!
Do not click links or open attachments unless you recognize the sender and know
the content is safe.
On 25 Feb 2026, at 06:22, Hongxu Jia <[email protected]> wrote:
On 1/31/26 00:56, Richard Purdie wrote:
Hi Hongxu,
On Fri, 2026-01-30 at 13:20 +0800, hongxu via lists.openembedded.org wrote:
Would you please approve adding this layer to
https://git.yoctoproject.org/? I will maintain
meta-ollama for recipe upgrades and bug fixes.
Thanks for sharing this, it looks really interesting and I like the
idea of it a lot.
Also, thanks for volunteering to maintain it, that does help alleviate
various concerns.
Approval for this rests with the Yocto Project TSC so I have asked them
about it. We should have a decision after our next meeting.
Hi Richard,
Ping, what is the status of the approval from Yocto Project TSC?
Regarding the suggestion to rename the layer to meta-ai or meta-llm and collect
other LLM applications, such as llama.cpp, in it:
I am open to it, but I strongly prefer to keep meta-ollama as a standalone
layer. Furthermore, if we support llama.cpp later, we should add it as
meta-llama-cpp.
Then we could customize and isolate each LLM application in its own layer, including:
- Adding a model recipe for the LLM application; the model format differs between
applications (ollama vs. llama.cpp)
- Booting the LLM application in a Yocto image by adding a systemd service file
- Customizing GPU support (e.g. CUDA) for the LLM application
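To make the per-layer concerns above concrete, here is a minimal BitBake recipe sketch. All names in it (the ollama.service file, the cuda PACKAGECONFIG, the dependency name) are illustrative assumptions, not taken from meta-ollama itself:

```bitbake
# Sketch only: service file name and PACKAGECONFIG entries are assumptions
# showing where each per-application concern would live in one layer.
SUMMARY = "ollama LLM server"
LICENSE = "MIT"

inherit systemd

# Boot-up management: ship and enable a systemd unit for the image
SRC_URI += "file://ollama.service"
SYSTEMD_SERVICE:${PN} = "ollama.service"
SYSTEMD_AUTO_ENABLE = "enable"

# GPU support as an application-specific knob (hypothetical dependency name)
PACKAGECONFIG ??= ""
PACKAGECONFIG[cuda] = ",,cuda-toolkit"

do_install:append() {
    install -Dm644 ${WORKDIR}/ollama.service \
        ${D}${systemd_system_unitdir}/ollama.service
}
```

A meta-llama-cpp layer would carry the equivalent pieces for llama.cpp, with its own model recipes and service files.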
If Yocto decides to use meta-ai or meta-llm, I suggest adding meta-ollama and
meta-llama-cpp as sublayers of it, like the meta-* sublayers in meta-openembedded.
What are the technical arguments for needing to put those into their own
layers? As I've stated before, I'd like to have a single layer that has the
most used AI/ML things, and I can help maintain that on company time.
Splitting that into multiple layers would increase the friction for both
consumers and developers of those layers.
As mentioned above, the model format differs for each application: the
models for ollama come from the ollama registry (https://registry.ollama.ai/),
while the models for llama.cpp come from Hugging Face (https://huggingface.co/).
If we add model recipes, we would have to prefix each recipe with the
application name if we placed them all in one layer.
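To illustrate the difference in model sources, here is a small Python sketch of how a model recipe might construct its fetch URL for each backend. The URL layouts follow the OCI-style registry API that ollama uses and the Hugging Face `resolve` endpoint; the model, repository, and file names are assumptions for illustration only:

```python
# Sketch: how a model recipe's fetch URL differs per backend.
# Model/repo/file names below are illustrative assumptions.

def ollama_manifest_url(name: str, tag: str = "latest") -> str:
    """ollama pulls via an OCI-style registry: manifest first, then blobs."""
    return f"https://registry.ollama.ai/v2/library/{name}/manifests/{tag}"

def huggingface_model_url(repo: str, filename: str) -> str:
    """llama.cpp typically loads a single GGUF file, fetched from Hugging Face."""
    return f"https://huggingface.co/{repo}/resolve/main/{filename}"

if __name__ == "__main__":
    print(ollama_manifest_url("llama3"))
    print(huggingface_model_url("TheBloke/Llama-2-7B-GGUF",
                                "llama-2-7b.Q4_K_M.gguf"))
```

Since the fetch mechanism, checksumming, and on-disk layout differ this much, keeping each application's model recipes in its own layer avoids name clashes and mixed conventions.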
Regarding boot-up management, ollama runs a single server that listens on one
port for all local models, but llama.cpp starts one server listening on one
port per model; it does not support serving multiple models on one port. There
is software, llama-swap (https://github.com/mostlygeek/llama-swap), to manage
multiple models for llama.cpp.
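The boot-up difference can be sketched as two systemd units. This is an illustration only: the unit names, model path, and chosen port are assumptions; `ollama serve` (default port 11434) and llama.cpp's `llama-server -m … --port …` are the real commands:

```ini
# ollama.service (sketch): one server, one port, all local models.
[Unit]
Description=Ollama LLM server (all models behind one endpoint)
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
Environment=OLLAMA_HOST=0.0.0.0:11434
Restart=on-failure

[Install]
WantedBy=multi-user.target

# [email protected] (sketch): llama.cpp needs one instance per model,
# and each instance needs its own port, so a templated unit like this only
# covers one model per port; llama-swap would sit in front to multiplex.
[Unit]
Description=llama.cpp server for model %i

[Service]
ExecStart=/usr/bin/llama-server -m /usr/share/models/%i.gguf --port 8081
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

This is the kind of per-application service policy that would live in each application's layer.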
Currently, ollama and llama.cpp are standalone LLM applications with no
direct relation to each other. Splitting them into separate layers helps us
focus on each LLM application and makes the maintenance scope clear,
per application.
//Hongxu
regards,
Koen