Inference Server on Multiple Platforms

Existing lab resources

  1. RH Inference server on multiple platforms
    https://github.com/redhat-ai-services/inference-service-on-multiple-platforms

  2. RH Inference server tutorial
    https://docs.google.com/document/d/11-Oiomiih78dBjIfClISSQBKqb0Ij4UJg31g0dO5XIc/edit?usp=sharing

Potential Topics to Cover in the Lab

  • RHEL
  • OpenShift
  • OpenShift AI
  • Ubuntu
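
Whichever platform hosts it, the vLLM-based inference server exposes the same OpenAI-compatible HTTP API, so a single client script can smoke-test a deployment on RHEL, OpenShift, OpenShift AI, or Ubuntu. Below is a minimal Python sketch; it assumes the server is reachable at http://localhost:8000 (for OpenShift, substitute the route or a port-forwarded address) and uses an example model id, which should be replaced with whatever the server reports from /v1/models.

    import requests

    # Assumed endpoint of the running inference server; adjust for the deployment
    # (for example, an OpenShift route or a port-forwarded service).
    BASE_URL = "http://localhost:8000"

    # Example model id only; list the served models first and use one of those ids.
    MODEL = "ibm-granite/granite-3.1-8b-instruct"

    # List the models the server exposes.
    models = requests.get(f"{BASE_URL}/v1/models", timeout=30).json()
    print("Served models:", [m["id"] for m in models["data"]])

    # Send one chat completion request to the OpenAI-compatible endpoint.
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": "In one sentence, what is vLLM?"}],
        "max_tokens": 128,
        "temperature": 0.2,
    }
    resp = requests.post(f"{BASE_URL}/v1/chat/completions", json=payload, timeout=60)
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])

The same script works unchanged against any of the platforms above; only BASE_URL (and, if the endpoint is secured, an Authorization header) changes between deployments.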
