branch: externals/minuet
commit 9bfe43d55fe7f4e788572fcd754740186df30c40
Author: Milan Glacier <d...@milanglacier.com>
Commit: Milan Glacier <d...@milanglacier.com>

    doc: Add DeepInfra FIM example for use with the OpenAI-FIM-Compatible provider.
---
 recipes.md | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 78 insertions(+), 4 deletions(-)

diff --git a/recipes.md b/recipes.md
index 9b088d4c90..e21a928660 100644
--- a/recipes.md
+++ b/recipes.md
@@ -1,7 +1,7 @@
 # Launching the llama.cpp Server: Example Script
 
-This guide provides several configuration variants for the `qwen2.5-coder`
-based on local computing power, specifically the available VRAM.
+This guide provides several configuration variants for the `qwen2.5-coder` based
+on local computing power, specifically the available VRAM.
 
 ### **For Systems with More Than 16GB VRAM**
 
@@ -72,10 +72,84 @@ llama-server \
 > Symbols like `<|fim_begin|>` and `<|fim_suffix|>` are special tokens
 > that serve as prompt boundaries. Some LLMs, like Qwen2.5-Coder, have
 > been trained with specific tokens to better understand prompt
-> composition. Different LLMs use different special tokens during
+> composition.  Different LLMs use different special tokens during
 > training, so you should adjust these tokens according to your
 > preferred LLM.
 
 ## **Acknowledgment**
 
-- [llama.vim](https://github.com/ggml-org/llama.vim): A reference for CLI parameters used in launching the `llama.cpp` server.
+- [llama.vim](https://github.com/ggml-org/llama.vim): A reference for CLI
+  parameters used in launching the `llama.cpp` server.
+
+# Using Non-OpenAI-Compatible FIM APIs with DeepInfra
+
+The `openai_fim_compatible` backend supports advanced customization to integrate
+with alternative providers.
+- **`:transform`**: A list of functions that accept a plist containing fields
+  listed below. Each function processes and returns a transformed version of
+  these attributes.
+
+  - `:end_point`: The API endpoint for the completion request.
+  - `:headers`: HTTP headers for the request.
+  - `:body`: The request body for the API.
+
+- **`:get_text_fn`**: Function to extract text from streaming responses.
+
+Below is an example configuration for integrating the `openai_fim_compatible`
+backend with the DeepInfra FIM API and Qwen-2.5-Coder-32B-Instruct model.
+
+```lisp
+(use-package minuet
+  :config
+  (setq minuet-provider 'openai-fim-compatible)
+
+  (plist-put minuet-openai-fim-compatible-options :end-point "https://api.deepinfra.com/v1/inference/")
+  (plist-put minuet-openai-fim-compatible-options :api-key "DEEPINFRA_API_KEY")
+  (plist-put minuet-openai-fim-compatible-options :model "Qwen/Qwen2.5-Coder-32B-Instruct")
+  (plist-put minuet-openai-fim-compatible-options :transform '(minuet-deepinfra-fim-transform))
+
+  (minuet-set-optional-options minuet-openai-fim-compatible-options :max_tokens 56)
+  (minuet-set-optional-options minuet-openai-fim-compatible-options :stop ["\n\n"])
+
+  ;; DeepInfra FIM does not support the `suffix` option in FIM
+  ;; completion.  Therefore, we must disable it and manually
+  ;; populate the special tokens required for FIM completion.
+  (minuet-set-optional-options minuet-openai-fim-compatible-options :suffix nil :template)
+
+  ;; Custom prompt formatting for Qwen model
+  (minuet-set-optional-options minuet-openai-fim-compatible-options
+                               :prompt
+                               (defun minuet-deepinfra-fim-qwen-prompt-function (ctx)
+                                 (format "<|fim_prefix|>%s\n%s<|fim_suffix|>%s<|fim_middle|>"
+                                         (plist-get ctx :language-and-tab)
+                                         (plist-get ctx :before-cursor)
+                                         (plist-get ctx :after-cursor)))
+                               :template)
+
+  ;; Function to transform request data according to DeepInfra's API format.
+  (defun minuet-deepinfra-fim-transform (data)
+    ;; DeepInfra requires the endpoint to be formatted as: https://api.deepinfra.com/v1/inference/$MODEL_NAME
+    `(:end-point ,(concat (plist-get data :end-point)
+                          (--> data
+                               (plist-get it :body)
+                               (plist-get it :model)))
+      ;; No modifications needed for headers.
+      :headers ,(plist-get data :headers)
+      ;; DeepInfra uses `input` instead of `prompt`, and does not require :model in the request body.
+      :body ,(--> data
+                  (plist-get it :body)
+                  (plist-put it :input (plist-get it :prompt))
+                  (map-delete it :model)
+                  (map-delete it :prompt))))
+
+  ;; Function to extract generated text from DeepInfra's JSON output.
+  (plist-put minuet-openai-fim-compatible-options
+             :get-text-fn
+             (defun minuet--deepinfra-get-text-fn (json)
+               ;; DeepInfra's response format is: `json.token.text`
+               (--> json
+                    (plist-get it :token)
+                    (plist-get it :text))))
+  )
+```
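
As an illustrative aside (not part of the patch itself): the effect of `minuet-deepinfra-fim-transform` above can be sketched as a before/after view of the request plist. The header and prompt values below are placeholders, not minuet's actual defaults:

```lisp
;; Input plist handed to the transform (placeholder values):
;;   (:end-point "https://api.deepinfra.com/v1/inference/"
;;    :headers (...)
;;    :body (:model "Qwen/Qwen2.5-Coder-32B-Instruct"
;;           :prompt "<|fim_prefix|>..."))
;;
;; Output plist after the transform:
;;   (:end-point "https://api.deepinfra.com/v1/inference/Qwen/Qwen2.5-Coder-32B-Instruct"
;;    :headers (...)
;;    :body (:input "<|fim_prefix|>..."))
;;
;; That is: the model name moves out of the body and into the URL,
;; the `:prompt' field is renamed to `:input', and the headers pass
;; through unchanged.
```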
