branch: externals/minuet
commit 1f31be1e30825f423d417e8ef3aece26d7dcae0a
Author: milanglacier <d...@milanglacier.com>
Commit: GitHub <nore...@github.com>
    refactor: update prompt system for FIM models. (#15)
---
 README.md | 150 +++++++++++++++++++++++++++++++-------------------------------
 minuet.el |  38 +++++++++++-----
 prompt.md |  44 ++++++++++++++----
 3 files changed, 137 insertions(+), 95 deletions(-)

diff --git a/README.md b/README.md
index d0ec4b5c09..70b6dba310 100644
--- a/README.md
+++ b/README.md
@@ -30,15 +30,14 @@ Just as dancers move during a minuet.
 # Features
 
 - AI-powered code completion with dual modes:
-  - Specialized prompts and various enhancements for chat-based LLMs
-    on code completion tasks.
-  - Fill-in-the-middle (FIM) completion for compatible models
-    (DeepSeek, Codestral, and some Ollama models).
-- Support for multiple AI providers (OpenAI, Claude, Gemini,
-  Codestral, Ollama, and OpenAI-compatible providers)
+  - Specialized prompts and various enhancements for chat-based LLMs on code
+    completion tasks.
+  - Fill-in-the-middle (FIM) completion for compatible models (DeepSeek,
+    Codestral, and some Ollama models).
+- Support for multiple AI providers (OpenAI, Claude, Gemini, Codestral, Ollama,
+  and OpenAI-compatible providers)
 - Customizable configuration options
-- Streaming support to enable completion delivery even with slower
-  LLMs
+- Streaming support to enable completion delivery even with slower LLMs
 
 **With minibuffer frontend**:
@@ -57,8 +56,8 @@ Just as dancers move during a minuet.
 
 # Installation
 
-Currently you need to install from github via `package-vc` or
-`straight`, or manually install this package.
+Currently you need to install from GitHub via `package-vc` or `straight`, or
+manually install this package.
 
 ```elisp
@@ -135,24 +134,25 @@ Example for Fireworks with `llama-3.3-70b` model:
 
 # API Keys
 
-Minuet AI requires API keys to function. Set the following environment variables:
+Minuet AI requires API keys to function. Set the following environment
+variables:
 
 - `OPENAI_API_KEY` for OpenAI
 - `GEMINI_API_KEY` for Gemini
 - `ANTHROPIC_API_KEY` for Claude
 - `CODESTRAL_API_KEY` for Codestral
-- Custom environment variable for OpenAI-compatible services (as specified in your configuration)
+- Custom environment variable for OpenAI-compatible services (as specified in
+  your configuration)
 
-**Note:** Provide the name of the environment variable to Minuet
-inside the provider options, not the actual value. For instance, pass
-`OPENAI_API_KEY` to Minuet, not the value itself (e.g., `sk-xxxx`).
+**Note:** Provide the name of the environment variable to Minuet inside the
+provider options, not the actual value. For instance, pass `OPENAI_API_KEY` to
+Minuet, not the value itself (e.g., `sk-xxxx`).
 
-If using Ollama, you need to assign an arbitrary, non-null environment
-variable as a placeholder for it to function.
+If using Ollama, you need to assign an arbitrary, non-null environment variable
+as a placeholder for it to function.
 
-Alternatively, you can provide a function that returns the API
-key. This function should be fast as it will be called with each
-completion request.
+Alternatively, you can provide a function that returns the API key. This
+function should be fast, as it will be called with each completion request.
 
 ```lisp
 ;; Good
@@ -164,88 +164,85 @@ completion request.
 
 # Selecting a Provider or Model
 
-The `gemini-flash` and `codestral` models offer high-quality output
-with free and fast processing. For optimal quality, consider using the
-`deepseek-chat` model, which is compatible with both
-`openai-fim-compatible` and `openai-compatible` providers. For local
-LLM inference, you can deploy either `qwen-coder` or `deepseek-coder`
-through Ollama using the `openai-fim-compatible` provider.
+The `gemini-flash` and `codestral` models offer high-quality output with free
+and fast processing. For optimal quality, consider using the `deepseek-chat`
+model, which is compatible with both `openai-fim-compatible` and
+`openai-compatible` providers. For local LLM inference, you can deploy either
+`qwen-coder` or `deepseek-coder` through Ollama using the
+`openai-fim-compatible` provider.
 
-# System Prompt
+# Prompt
 
 See [prompt](./prompt.md) for the default prompt used by `minuet` and
 instructions on customization.
 
-Please note that the System Prompt only applies to chat-based LLMs (OpenAI,
-OpenAI-Compatible, Claude, and Gemini). It does not apply to Codestral and
-OpenAI-FIM-compatible models.
+Note that `minuet` employs two distinct prompt systems:
+
+1. A system designed for chat-based LLMs (OpenAI, OpenAI-Compatible, Claude, and
+   Gemini)
+2. A separate system designed for Codestral and OpenAI-FIM-compatible models
 
 # Configuration
 
-Below are commonly used configuration options. To view the complete
-list of available settings, search for `minuet` through the
-`customize` interface.
+Below are commonly used configuration options. To view the complete list of
+available settings, search for `minuet` through the `customize` interface.
 
 ## minuet-provider
 
-Set the provider you want to use for completion with minuet, available
-options: `openai`, `openai-compatible`, `claude`, `gemini`,
-`openai-fim-compatible`, and `codestral`.
+Set the provider you want to use for completion with minuet. Available options:
+`openai`, `openai-compatible`, `claude`, `gemini`, `openai-fim-compatible`, and
+`codestral`.
 
 The default is `openai-fim-compatible` using the deepseek endpoint.
 
-You can use `ollama` with either `openai-compatible` or
-`openai-fim-compatible` provider, depending on your model is a chat
-model or code completion (FIM) model.
+You can use `ollama` with either the `openai-compatible` or
+`openai-fim-compatible` provider, depending on whether your model is a chat
+model or a code completion (FIM) model.
 
 ## minuet-context-window
 
-The maximum total characters of the context before and after cursor.
-This limits how much surrounding code is sent to the LLM for context.
+The maximum total characters of the context before and after the cursor. This
+limits how much surrounding code is sent to the LLM for context.
 
-The default is 16000, which roughly equates to 4000 tokens after
-tokenization.
+The default is 16000, which roughly equates to 4000 tokens after tokenization.
 
 ## minuet-context-ratio
 
-Ratio of context before cursor vs after cursor. When the total
-characters exceed the context window, this ratio determines how much
-context to keep before vs after the cursor. A larger ratio means more
-context before the cursor will be used. The ratio should between 0 and
-`1`, and default is `0.75`.
+Ratio of context before the cursor vs after the cursor. When the total
+characters exceed the context window, this ratio determines how much context to
+keep before vs after the cursor. A larger ratio means more context before the
+cursor will be used. The ratio should be between 0 and `1`, and the default is
+`0.75`.
 
 ## minuet-request-timeout
 
-Maximum timeout in seconds for sending completion requests. In case of
-the timeout, the incomplete completion items will be delivered. The
-default is `3`.
+Maximum timeout in seconds for sending completion requests. In case of a
+timeout, the incomplete completion items will be delivered. The default is `3`.
 
 ## minuet-add-single-line-entry
 
-For `minuet-complete-with-minibuffer` function, Whether to create
-additional single-line completion items. When non-nil and a
-completion item has multiple lines, create another completion item
-containing only its first line. This option has no impact for
-overlay-based suggesion.
+For the `minuet-complete-with-minibuffer` function, whether to create additional
+single-line completion items. When non-nil and a completion item has multiple
+lines, create another completion item containing only its first line. This
+option has no impact on overlay-based suggestion.
 
 ## minuet-n-completions
 
-Number of completion items to request from the language model. This
-number is encoded as part of the prompt for the chat LLM. Note that
-when `minuet-add-single-line-entry` is true, the actual number of
-returned items may exceed this value. Additionally, the LLM cannot
-guarantee the exact number of completion items specified, as this
-parameter serves only as a prompt guideline. The default is `3`.
+Number of completion items to request from the language model. This number is
+encoded as part of the prompt for the chat LLM. Note that when
+`minuet-add-single-line-entry` is true, the actual number of returned items may
+exceed this value. Additionally, the LLM cannot guarantee the exact number of
+completion items specified, as this parameter serves only as a prompt guideline.
+The default is `3`.
 
 ## minuet-auto-suggestion-debounce-delay
 
-The delay in seconds before sending a completion request after typing
-stops. The default is `0.2` seconds.
+The delay in seconds before sending a completion request after typing stops. The
+default is `0.2` seconds.
 
 ## minuet-auto-suggestion-throttle-delay
 
-The minimum time in seconds between 2 completion requests. The
-default is `1.0` seconds.
+The minimum time in seconds between two completion requests. The default is
+`1.0` seconds.
 
 # Provider Options
@@ -264,9 +261,8 @@ You can customize the provider options using `plist-put`, for example:
 )
 ```
 
-To pass optional parameters (like `max_tokens` and `top_p`) to send to
-the REST request, you can use function
-`minuet-set-optional-options`:
+To pass optional parameters (like `max_tokens` and `top_p`) with the REST
+request, you can use the function `minuet-set-optional-options`:
 
 ```lisp
 (minuet-set-optional-options minuet-openai-options :max_tokens 256)
@@ -323,8 +319,8 @@ Below is the default value:
 
 <details>
 
-Codestral is a text completion model, not a chat model, so the system prompt
-and few shot examples does not apply. Note that you should use the
+Codestral is a text completion model, not a chat model, so the system prompt and
+few-shot examples do not apply. Note that you should use the
 `CODESTRAL_API_KEY`, not the `MISTRAL_API_KEY`, as they are using different
 endpoints. To use the Mistral endpoint, simply modify the `end_point` and
 `api_key` parameters in the configuration.
@@ -336,6 +332,8 @@ Below is the default value:
 '(:model "codestral-latest"
   :end-point "https://codestral.mistral.ai/v1/fim/completions"
   :api-key "CODESTRAL_API_KEY"
+  :template (:prompt minuet--default-fim-prompt-function
+             :suffix minuet--default-fim-suffix-function)
   :optional nil)
 "config options for Minuet Codestral provider")
 ```
@@ -352,9 +350,9 @@ request timeout from outputing too many tokens.
 
 ## Gemini
 
-You should use the end point from Google AI Studio instead of Google
-Cloud. You can get an API key via their [Google API
-page](https://makersuite.google.com/app/apikey).
+You should use the endpoint from Google AI Studio instead of Google Cloud. You
+can get an API key via their
+[Google API page](https://makersuite.google.com/app/apikey).
 
 <details>
@@ -452,6 +450,8 @@ The following config is the default.
   :end-point "https://api.deepseek.com/beta/completions"
   :api-key "DEEPSEEK_API_KEY"
   :name "Deepseek"
+  :template (:prompt minuet--default-fim-prompt-function
+             :suffix minuet--default-fim-suffix-function)
   :optional nil)
 "config options for Minuet OpenAI FIM compatible provider")
 ```
diff --git a/minuet.el b/minuet.el
index 897d3f3a46..8492ff89fa 100644
--- a/minuet.el
+++ b/minuet.el
@@ -250,6 +250,8 @@ def fibonacci(n):
   '(:model "codestral-latest"
     :end-point "https://codestral.mistral.ai/v1/fim/completions"
    :api-key "CODESTRAL_API_KEY"
+    :template (:prompt minuet--default-fim-prompt-function
+               :suffix minuet--default-fim-suffix-function)
     :optional nil)
   "Config options for Minuet Codestral provider.")
@@ -271,6 +273,8 @@ def fibonacci(n):
     :end-point "https://api.deepseek.com/beta/completions"
     :api-key "DEEPSEEK_API_KEY"
     :name "Deepseek"
+    :template (:prompt minuet--default-fim-prompt-function
+               :suffix minuet--default-fim-suffix-function)
     :optional nil)
   "Config options for Minuet OpenAI FIM compatible provider.")
@@ -284,10 +288,6 @@ def fibonacci(n):
      :n-completions-template minuet-default-n-completion-template)
     :fewshots minuet-default-fewshots
     :optional nil)
-  ;; (:generationConfig
-  ;;  (:stopSequences nil
-  ;;   :maxOutputTokens 256
-  ;;   :topP 0.8))
   "Config options for Minuet Gemini provider.")
@@ -768,13 +768,19 @@ arrive."
             ("Accept" . "application/json")
             ("Authorization" . ,(concat "Bearer " (minuet--get-api-key (plist-get options :api-key)))))
          :timeout minuet-request-timeout
-         :body (json-serialize `(,@(plist-get options :optional)
-                                 :stream t
-                                 :model ,(plist-get options :model)
-                                 :prompt ,(format "%s\n%s"
-                                                  (plist-get context :additional)
-                                                  (plist-get context :before-cursor))
-                                 :suffix ,(plist-get context :after-cursor)))
+         :body
+         (json-serialize
+          `(,@(plist-get options :optional)
+            :stream t
+            :model ,(plist-get options :model)
+            :prompt ,(funcall (--> options
+                                   (plist-get it :template)
+                                   (plist-get it :prompt))
+                              context)
+            ,@(when-let* ((suffix-fn (--> options
+                                          (plist-get it :template)
+                                          (plist-get it :suffix))))
+                (list :suffix (funcall suffix-fn context)))))
          :as 'string
          :filter (minuet--make-process-stream-filter --response--)
         :then
@@ -1027,6 +1033,16 @@ to be called when completion items arrive."
                   minuet--auto-last-point (point))
             (minuet-show-suggestion))))))))
 
+(defun minuet--default-fim-prompt-function (ctx)
+  "Default function to generate prompt for FIM completions from CTX."
+  (format "%s\n%s"
+          (plist-get ctx :additional)
+          (plist-get ctx :before-cursor)))
+
+(defun minuet--default-fim-suffix-function (ctx)
+  "Default function to generate suffix for FIM completions from CTX."
+  (plist-get ctx :after-cursor))
+
 (defun minuet--cleanup-auto-suggestion ()
   "Clean up auto-suggestion timers and hooks."
   (remove-hook 'post-command-hook #'minuet--maybe-show-suggestion t)
diff --git a/prompt.md b/prompt.md
index cf88c044b2..28c4a86e94 100644
--- a/prompt.md
+++ b/prompt.md
@@ -1,3 +1,28 @@
+# FIM LLM Prompt Structure
+
+The prompt sent to the FIM LLM follows this structure:
+
+```lisp
+'(:template (:prompt minuet--default-fim-prompt-function
+             :suffix minuet--default-fim-suffix-function))
+```
+
+The template contains two main functions:
+
+- `:prompt`: returns the language and the indentation style, followed by the
+  `context_before_cursor` verbatim.
+- `:suffix`: returns `context_after_cursor` verbatim.
+
+Both functions can be customized to supply additional context to the LLM. The
+`suffix` function can be disabled by setting `:suffix` to `nil` via `plist-put`,
+resulting in a request containing only the prompt.
+
+Note for Ollama users: Do not include special tokens (e.g., `<|fim_begin|>`)
+within the prompt or suffix functions, as these will be automatically populated
+by Ollama. If your use case requires special tokens not covered by Ollama's
+default template, disable the `:suffix` function by setting it to `nil` and
+incorporate the necessary special tokens within the prompt function.
+
 # Default Template
 
 `{{{:prompt}}}\n{{{:guidelines}}}\n{{{:n_completion_template}}}`
@@ -25,8 +50,10 @@ Guidelines:
 3. Provide multiple completion options when possible.
 4. Return completions separated by the marker `<endCompletion>`.
 5. The returned message will be further parsed and processed. DO NOT include
-   additional comments or markdown code block fences. Return the result directly.
-6. Keep each completion option concise, limiting it to a single line or a few lines.
+   additional comments or markdown code block fences. Return the result
+   directly.
+6. Keep each completion option concise, limiting it to a single line or a few
+   lines.
 7. Create entirely new code completion that DO NOT REPEAT OR COPY any user's
    existing code around `<cursorPosition>`.
@@ -66,11 +93,10 @@ def fibonacci(n):
 
 # Customization
 
-You can customize the `:template` by encoding placeholders within
-triple braces. These placeholders will be interpolated using the
-corresponding key-value pairs from the table. The value can be a
-function that takes no argument and returns a string, or a symbol
-whose value is a string.
+You can customize the `:template` by encoding placeholders within triple braces.
+These placeholders will be interpolated using the corresponding key-value pairs
+from the table. The value can be a function that takes no argument and returns a
+string, or a symbol whose value is a string.
 
 Here's a simplified example for illustrative purposes (not intended for actual
 configuration):
@@ -89,8 +115,8 @@ configuration):
 ```
 
 Note that `:n_completion_template` is a special placeholder as it contains one
-`%d` which will be encoded with `minuet-n-completions`, if you want to
-customize this template, make sure your prompt also contains only one `%d`.
+`%d`, which will be encoded with `minuet-n-completions`. If you want to
+customize this template, make sure your prompt also contains only one `%d`.
 
 Similarly, `:fewshots` can be a plist in the following form or a function that
 takes no argument and returns a plist in the following form:
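
To make the new `:template` system concrete, below is a minimal usage sketch, not part of the commit itself: `my-minuet-fim-prompt` is a hypothetical helper name, and the `ctx` keys (`:additional`, `:before-cursor`, `:after-cursor`) are the ones read by `minuet--default-fim-prompt-function` in the diff above.

```lisp
;; Minimal sketch (not part of this commit): customize the FIM prompt
;; via the new :template plist. `my-minuet-fim-prompt' is a hypothetical
;; helper; the ctx keys below are those consumed by
;; `minuet--default-fim-prompt-function' in this diff.
(defun my-minuet-fim-prompt (ctx)
  "Build a FIM prompt from CTX, prepending an extra line of context."
  (format ";; Complete idiomatically.\n%s\n%s"
          (plist-get ctx :additional)
          (plist-get ctx :before-cursor)))

;; Install the custom prompt function, keeping the default suffix.
(plist-put minuet-codestral-options
           :template (list :prompt #'my-minuet-fim-prompt
                           :suffix #'minuet--default-fim-suffix-function))

;; Per prompt.md: for setups that need inline special tokens (e.g., some
;; Ollama templates), disable :suffix and embed everything in :prompt.
;; (plist-put (plist-get minuet-codestral-options :template) :suffix nil)
```

As described in prompt.md, setting `:suffix` to `nil` results in a request containing only the prompt.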