Serving at Scale

    • Advanced GPU Configuration
      • GPU Sharing
        • Timeslicing
        • MIG
        • MPS
      • GPU Aggregation
        • Tensor Parallelism
        • Pipeline Parallelism
        • Data Parallelism
        • Expert Parallelism
    • Model Serving with vLLM
      • RHAIIS vs RHOAI Capabilities
      • Multi-Node vs Multi-GPU Overview
      • LLM GPU Requirements
      • Multi-GPU Lab
      • Multi-Node Lab
      • GitOps with KServe
      • Advanced vLLM Configuration
      • Accelerated Networking Considerations
      • Observability
    • Model as a Service (MaaS)
      • MaaS Logical Architecture
      • API Gateway Capabilities and Requirements
      • IAM Capabilities and Requirements
      • Security Considerations
      • MaaS Hands-on Lab

Advanced Serving

Existing lab resources

  1. RH AI BU MultiNode MultiGPU - GitHub
    https://github.com/rh-aiservices-bu/multi-node-multi-gpu-poc

  2. RH AI BU MultiGPU LLMs
    https://github.com/rh-aiservices-bu/multi-gpu-llms?tab=readme-ov-file#72-multi-node---multiple-gpu-demos

  3. AI on OpenShift Article
    https://ai-on-openshift.io/odh-rhoai/nvidia-gpus/#aggregating-gpus-multi-gpu

Potential Topics to Cover in the Lab

Multi-GPU

Multi-Node

Expert Parallelism

RHAIIS vs RHOAI Capabilities
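
As a starting point for the Multi-GPU topic, the sketch below shows a KServe InferenceService that runs vLLM with tensor parallelism across two GPUs on a single node. This is a minimal illustration, not material from the existing labs: the model name, ServingRuntime name, and resource values are placeholder assumptions to be replaced with the lab's actual settings.

```yaml
# Hypothetical example: vLLM behind KServe with tensor parallelism.
# Names and sizes are placeholders, not lab-provided values.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llm-multi-gpu            # placeholder service name
spec:
  predictor:
    model:
      modelFormat:
        name: vLLM
      runtime: vllm-runtime      # assumed ServingRuntime name
      args:
        - --tensor-parallel-size=2   # shard model layers across 2 GPUs
      resources:
        requests:
          nvidia.com/gpu: "2"    # must match --tensor-parallel-size
        limits:
          nvidia.com/gpu: "2"
```

Note the coupling between `--tensor-parallel-size` and the `nvidia.com/gpu` request: vLLM expects one visible GPU per tensor-parallel rank, so the two values should agree. Multi-node deployments add pipeline parallelism and accelerated networking on top of this single-node layout.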
