branch: externals/minuet
commit 9bfe43d55fe7f4e788572fcd754740186df30c40
Author: Milan Glacier <d...@milanglacier.com>
Commit: Milan Glacier <d...@milanglacier.com>
    doc: Add DeepInfra FIM example for use with the OpenAI-FIM-Compatible provider.
---
 recipes.md | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 78 insertions(+), 4 deletions(-)

diff --git a/recipes.md b/recipes.md
index 9b088d4c90..e21a928660 100644
--- a/recipes.md
+++ b/recipes.md
@@ -1,7 +1,7 @@
 # Launching the llama.cpp Server: Example Script
 
-This guide provides several configuration variants for the `qwen2.5-coder`
-based on local computing power, specifically the available VRAM.
+This guide provides several configuration variants for the `qwen2.5-coder` based
+on local computing power, specifically the available VRAM.
 
 ### **For Systems with More Than 16GB VRAM**
 
@@ -72,10 +72,84 @@ llama-server \
 > Symbols like `<|fim_begin|>` and `<|fim_suffix|>` are special tokens
 > that serve as prompt boundaries. Some LLMs, like Qwen2.5-Coder, have
 > been trained with specific tokens to better understand prompt
-> composition.  Different LLMs use different special tokens during
+> composition. Different LLMs use different special tokens during
 > training, so you should adjust these tokens according to your
 > preferred LLM.
 
 ## **Acknowledgment**
 
-- [llama.vim](https://github.com/ggml-org/llama.vim): A reference for CLI parameters used in launching the `llama.cpp` server.
+- [llama.vim](https://github.com/ggml-org/llama.vim): A reference for CLI
+  parameters used in launching the `llama.cpp` server.
+
+# Using Non-OpenAI-Compatible FIM APIs with DeepInfra
+
+The `openai-fim-compatible` backend supports advanced customization to integrate
+with alternative providers through two options:
+
+- **`:transform`**: A list of functions, each of which accepts a plist containing
+  the fields listed below and returns a transformed version of these attributes.
+
+  - `:end-point`: The API endpoint for the completion request.
+  - `:headers`: HTTP headers for the request.
+  - `:body`: The request body for the API.
+
+- **`:get-text-fn`**: A function that extracts the generated text from streaming
+  responses.
+
+Below is an example configuration that integrates the `openai-fim-compatible`
+backend with the DeepInfra FIM API and the Qwen2.5-Coder-32B-Instruct model.
+
+```lisp
+(use-package minuet
+  :config
+  (setq minuet-provider 'openai-fim-compatible)
+
+  (plist-put minuet-openai-fim-compatible-options :end-point "https://api.deepinfra.com/v1/inference/")
+  (plist-put minuet-openai-fim-compatible-options :api-key "DEEPINFRA_API_KEY")
+  (plist-put minuet-openai-fim-compatible-options :model "Qwen/Qwen2.5-Coder-32B-Instruct")
+  (plist-put minuet-openai-fim-compatible-options :transform '(minuet-deepinfra-fim-transform))
+
+  (minuet-set-optional-options minuet-openai-fim-compatible-options :max_tokens 56)
+  (minuet-set-optional-options minuet-openai-fim-compatible-options :stop ["\n\n"])
+
+  ;; DeepInfra FIM does not support the `suffix` option in FIM
+  ;; completion. Therefore, we must disable it and manually
+  ;; populate the special tokens required for FIM completion.
+  (minuet-set-optional-options minuet-openai-fim-compatible-options :suffix nil :template)
+
+  ;; Custom prompt formatting for the Qwen model.
+  (minuet-set-optional-options minuet-openai-fim-compatible-options
+                               :prompt
+                               (defun minuet-deepinfra-fim-qwen-prompt-function (ctx)
+                                 (format "<|fim_prefix|>%s\n%s<|fim_suffix|>%s<|fim_middle|>"
+                                         (plist-get ctx :language-and-tab)
+                                         (plist-get ctx :before-cursor)
+                                         (plist-get ctx :after-cursor)))
+                               :template)
+
+  ;; Function to transform request data according to DeepInfra's API format.
+  (defun minuet-deepinfra-fim-transform (data)
+    ;; DeepInfra requires the endpoint to be formatted as:
+    ;; https://api.deepinfra.com/v1/inference/$MODEL_NAME
+    `(:end-point ,(concat (plist-get data :end-point)
+                          (--> data
+                               (plist-get it :body)
+                               (plist-get it :model)))
+      ;; No modifications needed for headers.
+      :headers ,(plist-get data :headers)
+      ;; DeepInfra uses `input` instead of `prompt`, and does not require
+      ;; `:model` in the request body.
+      :body ,(--> data
+                  (plist-get it :body)
+                  (plist-put it :input (plist-get it :prompt))
+                  (map-delete it :model)
+                  (map-delete it :prompt))))
+
+  ;; Function to extract the generated text from DeepInfra's JSON output.
+  (plist-put minuet-openai-fim-compatible-options
+             :get-text-fn
+             (defun minuet--deepinfra-get-text-fn (json)
+               ;; DeepInfra's response format is: `json.token.text`
+               (--> json
+                    (plist-get it :token)
+                    (plist-get it :text)))))
+```
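
For readers adapting this recipe to other providers, here is a minimal sketch (not part of the commit) of the data flow through the `:transform` function. The concrete plist values below, including the `:headers` layout and the prompt string, are hypothetical and only illustrate the `:end-point`/`:headers`/`:body` shape described above:

```lisp
;; Hypothetical input resembling what a :transform function receives.
;; With minuet-deepinfra-fim-transform from the recipe loaded, evaluating
;; this form should return roughly the plist sketched in the comment below.
(minuet-deepinfra-fim-transform
 (list :end-point "https://api.deepinfra.com/v1/inference/"
       :headers '(:content-type "application/json") ; placeholder headers
       :body (list :model "Qwen/Qwen2.5-Coder-32B-Instruct"
                   :prompt "<|fim_prefix|>...<|fim_suffix|>...<|fim_middle|>"
                   :max_tokens 56)))
;; => roughly:
;; (:end-point "https://api.deepinfra.com/v1/inference/Qwen/Qwen2.5-Coder-32B-Instruct"
;;  :headers (:content-type "application/json")
;;  :body (:max_tokens 56 :input "<|fim_prefix|>...<|fim_suffix|>...<|fim_middle|>"))
```

The net effect is that the model name moves from the request body into the URL path, and the formatted FIM prompt is sent under DeepInfra's `input` key instead of `prompt`.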