Module 1: LLM Deployment Strategies
Introduction
This module covers deploying Red Hat Inference Server (vLLM) across three key enterprise platforms. Each deployment target addresses a different set of architectural and operational requirements.
Deployment Targets
RHEL (1.1): Bare metal and edge deployments requiring direct hardware control and minimal overhead. Ideal for single-node inference, air-gapped environments, and scenarios that demand maximum performance. A minimal verification sketch follows.
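Once the server is running on the host, its OpenAI-compatible endpoint can be smoke-tested from Python. The following is a minimal sketch, assuming the server listens on localhost:8000 and was started without --api-key; the exact launch command and port depend on your installation:

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible API; the api_key value is ignored
# unless the server was launched with --api-key.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Discover which model the server is hosting.
model_id = client.models.list().data[0].id
print("serving:", model_id)

# One short completion to confirm end-to-end inference works.
response = client.completions.create(model=model_id, prompt="Hello", max_tokens=16)
print(response.choices[0].text)
```

If both calls succeed, the node is serving inference end to end.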
OpenShift (1.2): Container orchestration for scalable, multi-node deployments. Best for cloud-native architectures, horizontal scaling, and integration with DevOps workflows. See the sketch below.
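To make the scaling model concrete, the sketch below creates a GPU-backed Deployment for the inference server using the kubernetes Python client. The namespace, image reference, and model ID are placeholders, not product defaults; in practice most teams apply an equivalent YAML manifest through a GitOps pipeline instead.

```python
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running in-cluster

# Placeholder values -- substitute your namespace, image, and model ID.
NAMESPACE = "llm-serving"
IMAGE = "image-registry.example.com/rhaiis/vllm:latest"  # assumed image reference

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="vllm-server"),
    spec=client.V1DeploymentSpec(
        replicas=2,  # horizontal scaling: raise replicas to add serving pods
        selector=client.V1LabelSelector(match_labels={"app": "vllm-server"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "vllm-server"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="vllm",
                        image=IMAGE,
                        args=["--model", "my-org/my-model"],  # placeholder model
                        ports=[client.V1ContainerPort(container_port=8000)],
                        resources=client.V1ResourceRequirements(
                            limits={"nvidia.com/gpu": "1"}  # one GPU per pod
                        ),
                    )
                ]
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace=NAMESPACE, body=deployment)
```

Scaling out is then a matter of patching replicas or attaching a HorizontalPodAutoscaler, with a Service and Route fronting the pods.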
OpenShift AI (1.3): Managed AI platform providing MLOps capabilities, data science workflows, and enterprise governance. Optimal for integrated AI pipelines and managed model serving.
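As one illustration of managed serving, the sketch below requests a model through a KServe InferenceService, the resource that OpenShift AI's single-model serving builds on. The runtime name, storage URI, and namespace here are assumptions for illustration; check the ServingRuntimes configured on your cluster.

```python
from kubernetes import client, config

config.load_kube_config()

# KServe InferenceService definition. The runtime, storageUri, and
# namespace are placeholders -- adjust to your cluster's configuration.
inference_service = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {"name": "demo-llm", "namespace": "demo-project"},
    "spec": {
        "predictor": {
            "model": {
                "modelFormat": {"name": "vLLM"},
                "runtime": "vllm-runtime",              # assumed runtime name
                "storageUri": "s3://models/demo-llm",   # placeholder model store
                "resources": {"limits": {"nvidia.com/gpu": "1"}},
            }
        }
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="serving.kserve.io",
    version="v1beta1",
    namespace="demo-project",
    plural="inferenceservices",
    body=inference_service,
)
```

The platform then provisions the runtime pod and exposes an inference endpoint, so the declaration above is the whole deployment step from the user's side.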