docs/tech/llm-integration.md (new file, +139 −0)

# INJECT LLM integration

!!! warning

    - **Experimental Features:** These features are currently in an experimental state. Enabling them may negatively influence the performance of the IXP.
    - **LLM Inaccuracy:** The LLM can make mistakes. Large Language Models may produce inaccurate or misleading information ("hallucinations"). Instructors should always review generated suggestions and assessments for accuracy before accepting them.

The Inject Exercise Platform (IXP) now leverages Large Language Models (LLMs) to streamline the execution and evaluation of digital tabletop exercises.

## Prerequisites

To enable LLM features, you must have access to an LLM provider and configure the necessary environment variables during deployment.

**Important:** The IXP currently supports only the OpenAI API standard. Regardless of your LLM provider (self-hosted or cloud), the API endpoint must accept requests formatted according to the OpenAI Chat Completions API specification.

## Provider Options

You can choose between self-hosting your own model and using a cloud provider.

### 1. Self-hosted LLM (Recommended)

We strongly recommend setting up your own LLM server if you have access to a server with a dedicated GPU. This ensures data privacy and often faster response times.

| Solution                                 | Pros                                                                   | Cons                                                                   |
| ---------------------------------------- | ---------------------------------------------------------------------- | ---------------------------------------------------------------------- |
| [vLLM](https://docs.vllm.ai/en/latest/)  | Recommended. Supports parallel request processing and high throughput. | Slightly more complex setup.                                           |
| [ollama](https://docs.ollama.com/)       | Simple to set up and manage.                                           | Generally slower than vLLM; strictly sequential processing by default. |

### 2. Cloud-provided LLM

Alternatively, you can use commercial cloud services that provide API keys.
- [Gemini API](https://ai.google.dev/api)
- [OpenAI GPT API](https://openai.com/api/)

**Configuration Note for Non-OpenAI Providers:** If you use a provider other than OpenAI (e.g., Google Gemini), you must locate their "OpenAI Compatibility" endpoint. Do not use their native proprietary URLs.

Example: Gemini OpenAI-compatible endpoint

```http
https://generativelanguage.googleapis.com/v1beta/openai/chat/completions
```

## Features

The following features utilize the LLM integration:

### 1. Email Template Selection & Generation

When an instructor replies to an email thread, the IXP uses the LLM to suggest the most appropriate response based on the exercise definition.

**How it works:**

The LLM analyzes the email thread content, the information about the sender in the definition, and the related learning activities and objective (if present). It then behaves as follows:

- **If no email templates exist in the exercise definition:** The LLM generates a completely new, context-aware response.
- **If email templates exist:**
    - _Scenario A:_ The LLM suggests a template that fits the context.
    - _Scenario B:_ The LLM determines that no existing template fits the current context and generates a new custom response.

**Tip:** For best results, ensure that the `description` field in your _email.yml_ and the `context` field of the _templates_ contain clear, relevant details about the email entity and the existing templates. Since the learning activities (LAs) and learning objective (LO) related to the email address are sent as well, clear descriptions in the LAs and LO can also improve the quality of the LLM response.

### 2. Automated Assessment Based on a Rubric

To reduce instructor workload, exercise designers can use the `llm_assessment` object to define evaluation criteria for trainee inputs. This is particularly effective for evaluating long-form text inputs.
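As a purely illustrative sketch, such a rubric definition might look like the following. Note that the field names below (`criteria`, `max_points`) are assumptions made for illustration and are not confirmed by this page; consult the exercise definition schema for the actual structure.

```yaml
# Hypothetical sketch only -- verify field names against the IXP definition schema.
llm_assessment:
  criteria: |
    Full marks: the report names the affected hosts, identifies the
    phishing indicators, and proposes at least one containment step.
  max_points: 10
```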
LLM assessment is currently supported for:

- Emails
- Free-form questions (non-auto)

By defining the `llm_assessment` object in the definition of the email participant or a free-form question, you can provide a prompt/criteria set. The LLM will evaluate the trainee's response against these criteria, saving the instructor from reading lengthy reports manually.

## Performance Optimization: LLM Pre-load

By default, LLM features are triggered manually (e.g., by clicking "Assess" or opening the email to reply), which requires the user to wait for the generation (seconds to minutes). Platform _ADMINs_ can globally enable LLM Pre-load in the platform configuration to reduce these wait times. This feature proactively contacts the LLM in the background.

### How Pre-load Works

1. **Email Suggestions:** When a team sends a new email, the IXP immediately asks the LLM to generate/select a suggestion for the anticipated instructor reply. When the instructor eventually opens the email to reply, the suggestion is likely already cached and appears instantly.
2. **Free-form Assessments:** When a questionnaire with an `llm_assessment` object is submitted, the IXP immediately sends the answer to the LLM for evaluation. When the instructor clicks `Assess`, the result is likely already available.

### Limitations & Risks

Since this implementation is experimental, please observe the following limitations:

- **Race Conditions:** If an instructor attempts to use an LLM feature while the pre-load is still processing it, a new request is triggered, and the instructor still has to wait.
- **Server Overload:** Pre-loading scales with the volume of trainee activity.
    - If many teams submit emails or questionnaires simultaneously, the IXP will flood the LLM provider with requests.
    - This can lead to long response queues, connection timeouts, or rate limiting by the provider.
**Recommendation:** Enable LLM Pre-load only for testing, small-scale exercises, exercises with a low email volume or few questionnaires, or environments backed by a high-performance, robust LLM server.

mkdocs.yml (+1 −0)

```diff
@@ -47,6 +47,7 @@ nav:
     - Exercise log format: tech/log-format.md
     - Platform backend logs: tech/platform-logs.md
     - Version compatibility: tech/version-compatibility.md
+    - LLM integration: tech/llm-integration.md
   - How to prepare an exercise?:
     - INJECT Process: INJECT_process/intro/overview.md
     - 01 Understand:
```
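As noted under Prerequisites, whichever provider you configure must accept requests in the OpenAI Chat Completions format. For reference, a minimal request of that shape looks like this (host, key, and model name are placeholders, not values used by the IXP):

```http
POST /v1/chat/completions HTTP/1.1
Host: your-llm-server.example
Authorization: Bearer <API_KEY>
Content-Type: application/json

{
  "model": "<model-name>",
  "messages": [
    {"role": "user", "content": "Hello"}
  ]
}
```

If your provider rejects such a request, you are most likely pointing at its native API rather than its OpenAI-compatible endpoint.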