F #-: New version of Ray appliance
This version includes the following improvements: * Support for multi-gpu inference * Support for OpenAI API to interact with the model * Support for vLLMs and quantization * Includes an optional embedded web interface to interact with the deployed LLM * Updated base appliance to Ubuntu24.04 * Ray and vllm frameworks now run in a Python virtual environment
Loading
Please sign in to comment