docs/tech/llm-integration.md (new file, +139 −0)

# INJECT LLM integration

!!! warning

    - **Experimental Features:** These features are currently in an experimental state. Enabling them may negatively influence the performance of the IXP.
    - **LLM Inaccuracy:** The LLM can make mistakes. Large Language Models may produce inaccurate or misleading information ("hallucinations"). Instructors should always review generated suggestions and assessments for accuracy before accepting them.

The Inject Exercise Platform (IXP) now leverages Large Language Models (LLMs) to streamline the execution and evaluation of digital tabletop exercises.

## Prerequisites

To enable LLM features, you must have access to an LLM provider and configure the necessary environment variables during deployment.

**Important:** The IXP currently supports only the OpenAI API standard. Regardless of your LLM provider (self-hosted or cloud), the API endpoint must accept requests formatted according to the OpenAI Chat Completions API specification.

## Provider Options

You can choose between self-hosting your own model and using a cloud provider.

### 1. Self-hosted LLM (Recommended)

We strongly recommend setting up your own LLM server if you have access to a server with a dedicated GPU. This ensures data privacy and often faster response times.

| Solution                                 | Pros                                                                   | Cons                                                                   |
| ---------------------------------------- | ---------------------------------------------------------------------- | ---------------------------------------------------------------------- |
| [vLLM](https://docs.vllm.ai/en/latest/)  | Recommended. Supports parallel request processing and high throughput. | Slightly more complex setup.                                           |
| [ollama](https://docs.ollama.com/)       | Simple to set up and manage.                                           | Generally slower than vLLM; strictly sequential processing by default. |

### 2. Cloud-provided LLM

Alternatively, you can use commercial cloud services that provide API keys.
- [Gemini API](https://ai.google.dev/api)
- [OpenAI GPT API](https://openai.com/api/)

**Configuration Note for Non-OpenAI Providers:** If you use a provider other than OpenAI (e.g., Google Gemini), you must locate their "OpenAI Compatibility" endpoint. Do not use their native proprietary URLs.

Example: Gemini OpenAI-compatible endpoint

```http
https://generativelanguage.googleapis.com/v1beta/openai/chat/completions
```

## Features

The following features utilize the LLM integration:

### 1. Email Template Selection & Generation

When an instructor replies to an email thread, the IXP uses the LLM to suggest the most appropriate response based on the exercise definition.

**How it works:**

The LLM analyzes the email thread content, the information about the sender in the definition, and the related learning activities and objective (if present). It then behaves as follows:

- **If no email templates exist in the exercise definition:** The LLM generates a completely new, context-aware response.
- **If email templates exist:**
    - _Scenario A:_ The LLM suggests a template that fits the context.
    - _Scenario B:_ The LLM determines that no existing template fits the current context and generates a new custom response.

**Tip:** For best results, ensure that the `description` field in your _email.yml_ and the `context` field of the _templates_ contain clear, relevant details about the email entity and the existing templates. Since the learning activities (LAs) and learning objective (LO) related to the email address are sent as well, clear descriptions in the LAs and LO can also improve the quality of the LLM response.

### 2. Automated Assessment Based on a Rubric

To reduce instructor workload, exercise designers can use the `llm_assessment` object to define evaluation criteria for trainee inputs. This is particularly effective for evaluating long-form text inputs.
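As a purely illustrative sketch, such a rubric definition might look like the following. Note that the field names below (`criteria`, `max_points`) are assumptions made for illustration and are not confirmed by this page; consult the exercise definition schema for the actual structure.

```yaml
# Hypothetical sketch only -- verify field names against the IXP definition schema.
llm_assessment:
  criteria: |
    Full marks: the report names the affected hosts, identifies the
    phishing indicators, and proposes at least one containment step.
  max_points: 10
```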
LLM assessment is currently supported for:

- Emails
- Free-form questions (non-auto)

By defining the `llm_assessment` object in the definition of the email participant or a free-form question, you can provide a prompt/criteria set. The LLM will evaluate the trainee's response against these criteria, saving the instructor from reading lengthy reports manually.

## Performance Optimization: LLM Pre-load

By default, LLM features are triggered manually (e.g., by clicking "Assess" or opening the email to reply), which requires the user to wait for the generation (seconds to minutes). Platform _ADMINs_ can globally enable LLM Pre-load in the platform configuration to reduce these wait times. This feature proactively contacts the LLM in the background.

### How Pre-load Works

1. **Email Suggestions:** When a team sends a new email, the IXP immediately asks the LLM to generate/select a suggestion for the anticipated instructor reply. When the instructor eventually opens the email to reply, the suggestion is likely already cached and appears instantly.
2. **Free-form Assessments:** When a questionnaire with an `llm_assessment` object is submitted, the IXP immediately sends the answer to the LLM for evaluation. When the instructor clicks `Assess`, the result is likely already available.

### Limitations & Risks

Since this implementation is experimental, please observe the following limitations:

- **Race Conditions:** If an instructor attempts to use an LLM feature while the pre-load is still processing it, a new request is triggered, and the instructor still has to wait.
- **Server Overload:** Pre-loading scales with the volume of trainee activity.
    - If many teams submit emails or questionnaires simultaneously, the IXP will flood the LLM provider with requests.
    - This can lead to long response queues, connection timeouts, or rate limiting by the provider.
**Recommendation:** Enable LLM Pre-load only for testing, small-scale exercises, exercises with a low email volume or few questionnaires, or environments backed by a high-performance, robust LLM server.

mkdocs.yml (+1 −0)

```diff
@@ -47,6 +47,7 @@ nav:
     - Exercise log format: tech/log-format.md
     - Platform backend logs: tech/platform-logs.md
     - Version compatibility: tech/version-compatibility.md
+    - LLM integration: tech/llm-integration.md
   - How to prepare an exercise?:
     - INJECT Process: INJECT_process/intro/overview.md
     - 01 Understand:
```
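As noted under Prerequisites, whichever provider you configure must accept requests in the OpenAI Chat Completions format. For reference, a minimal request of that shape looks like this (host, key, and model name are placeholders, not values used by the IXP):

```http
POST /v1/chat/completions HTTP/1.1
Host: your-llm-server.example
Authorization: Bearer <API_KEY>
Content-Type: application/json

{
  "model": "<model-name>",
  "messages": [
    {"role": "user", "content": "Hello"}
  ]
}
```

If your provider rejects such a request, you are most likely pointing at its native API rather than its OpenAI-compatible endpoint.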