Lab Setup and Prerequisites
Prerequisites
Before starting this lab, ensure you have the following:
Required Tools
- GitHub Account: You’ll need access to fork repositories and collaborate
- Git CLI: Installed and configured on your local machine
- OpenShift CLI (oc): Download from OpenShift client downloads
- HashiCorp Vault CLI: Download from the Vault installation guide
- Terminal/Command Line: Bash shell with basic utilities (openssl, curl)
- Ansible Vault: Part of the Ansible toolkit, used for secret management
- Web Browser: For accessing the OpenShift console, MaaS portal, and documentation
Required Skills
- Basic Git Knowledge: Cloning repositories, basic version control concepts
- Command Line Basics: Navigating directories, running commands
- Container Concepts: Understanding of containers and Kubernetes/OpenShift (helpful but not required)
Provided by Instructors
- OpenShift Cluster Access: GPU-enabled cluster with admin credentials
- Red Hat MaaS Access: Model-as-a-Service credentials for LLaMA models
- Workshop Materials: All necessary configuration files and scripts
- Environment Variables: Cluster-specific configuration values
- Support: Technical assistance throughout the lab
Getting Started
Before diving into the agentic AI lab, we need to set up our development environment. This involves two key steps:
- Setting up GitOps: Configure automated deployment pipelines that manage our infrastructure and applications
- Configuring Secret Management: Set up secure handling of API keys and credentials using HashiCorp Vault

Why this matters: This approach keeps sensitive information (like API keys) separate from our code, following security best practices while enabling automated deployments.
For this lab: We’ll use the automated bootstrap scripts to handle setup quickly so we can focus on building AI agents.
For later exploration (after the lab):
- Manual setup: The step-by-step instructions below show how each component works - perfect for understanding GitOps and secret management in detail
Team Setup
- You’ll be working in teams of 2 people per cluster

Why teams of two?
- Resource optimization: GPU-enabled OpenShift clusters are expensive - sharing clusters allows us to provide everyone with powerful hardware
- Better learning: Pair programming increases knowledge sharing and helps troubleshoot issues faster
- Real-world practice: Most production AI/ML teams work collaboratively on shared infrastructure and have a mixture of roles and expertise

This setup mirrors how teams work with shared cloud resources in enterprise environments.
- Receive your cluster credentials 🔐
Your instructor will provide OpenShift login credentials for your team’s shared cluster.
- Set up your shared repository (choose one team member to do this):
  - Fork the etx-agentic-ai repository to your personal GitHub account
Figure 1. GitHub Repo Fork
  - Add your teammate as a collaborator with write access
Figure 2. GitHub Repo Collaborators
  - Ensure that you Enable Issues for your fork under General > Features > Issues, as they are disabled for forked repos by default
Figure 3. GitHub Repo Enable Issues
- Both team members: Clone the forked repository locally
git clone git@github.com:your-gh-user/etx-agentic-ai.git
cd etx-agentic-ai
Figure 4. GitHub Repo Clone
Replace your-gh-user with the actual GitHub username of whoever forked the repository.
- Verify your setup ✅
You should now have:
  - Access to your team’s OpenShift cluster
  - A shared fork of the repository with both teammates as collaborators
  - Local copies of the code on both laptops
Cluster Environment
Your team has access to a fully-featured OpenShift cluster designed for AI workloads. This cluster mimics many customer production environments. Here’s how the platform is architected:
Bootstrap Components
These foundational components are deployed first to establish the platform’s operational baseline:
- Red Hat OpenShift: Enterprise Kubernetes platform providing container orchestration
- Advanced Cluster Management (ACM): Multi-cluster governance and GitOps orchestration
- ArgoCD: Declarative, Git-driven application deployments
- HashiCorp Vault: Secure credential storage and automated secret injection
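If you want to confirm these bootstrap components are running before you start, a quick CLI check might look like the following. The namespaces are the defaults used in this lab and are an assumption - adjust if your cluster differs:
# Optional sanity check of the bootstrap components (assumed default namespaces)
oc get pods -n vault                                  # HashiCorp Vault
oc get pods -n openshift-gitops                       # OpenShift GitOps / ArgoCD
oc get multiclusterhub -n open-cluster-management     # ACM hub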
Security & Governance
Built on the bootstrap foundation, these components enforce enterprise policies:
Policy as Code
Everything is managed through automated policy enforcement:
How this differs from standard GitOps: While traditional GitOps deploys applications, Policy as Code deploys and enforces the rules that govern how applications can behave, what resources they can access, and how they must be configured. The policies themselves are GitOps-managed, creating a "governance layer" above your applications.
Green from GO ✅: We start compliant from day one. Rather than building systems and retrofitting security and compliance later, our development environment mirrors production with all policies active from the beginning. This means teams learn to work within enterprise guardrails naturally.
This approach ensures software quality, security, and consistency at enterprise scale. You can read more about Configuration Policies here.
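To make the idea concrete, a ConfigurationPolicy is the ACM resource that declares which Kubernetes objects must (or must not) exist and how drift is handled. The sketch below is illustrative only - the names, namespace, and limits are made up for this example and are not taken from this repository:
apiVersion: policy.open-cluster-management.io/v1
kind: ConfigurationPolicy
metadata:
  name: example-require-limitrange     # illustrative name
spec:
  remediationAction: enforce           # ACM recreates/repairs the object if it drifts
  severity: medium
  namespaceSelector:
    include: ["example-tenant"]        # hypothetical tenant namespace
  object-templates:
    - complianceType: musthave         # the object below must exist as declared
      objectDefinition:
        apiVersion: v1
        kind: LimitRange
        metadata:
          name: example-container-limits
        spec:
          limits:
            - type: Container
              default:
                memory: 512Mi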

- Policy Enforcement: ACM automatically applies and monitors compliance across all workloads in all clusters (particularly useful for large-scale multi-cluster environments)
- Observability Stack: Comprehensive monitoring, logging, and tracing for security insights
- GPU Resource Management: Node Feature Discovery (NFD) for specialized compute allocation
Developer Platform Services
Self-service capabilities that enable development teams:
- CI/CD Pipelines: Tekton for automated container builds, testing, and deployments
- Source Control Integration: Git-based workflows with automated quality gates
- Container Registry: Secure image storage with vulnerability scanning and promotion workflows
Tenant & Workload Services
Multi-tenant capabilities providing isolated, secure environments:
- Namespace Management: Multi-tenant isolation with RBAC and resource quotas
- Development Workbenches: Self-service Jupyter environments for data science teams
- Service Mesh: Secure service-to-service communication and traffic management
AI/ML Platform Services
Specialized services for AI/ML workloads and agentic applications:
- Red Hat OpenShift AI (RHOAI): Managed AI/ML platform with GPU acceleration
- Model Serving Infrastructure: Scalable inference endpoints with model lifecycle management
- Agentic AI Runtime: Environment for deploying AI agents with external service integrations
LLaMA Stack Integration: Our agentic AI workloads leverage LLaMA Stack, a composable framework that provides standardized APIs for model inference, safety guardrails, and tool integration. This allows our AI agents to seamlessly interact with large language models while maintaining consistent interfaces for memory management, tool calling, and safety controls across different model providers.
The Benefits:
- ZERO configuration drift - what’s in git is real
- Integrates into the Governance Dashboard in ACM for SRE
- We start as we mean to go on - we are Green from GO so that our dev environment looks like prod, only smaller
- All our clusters and environments are Kubernetes Native once bootstrapped
Required Applications
As a team, you need to complete each of these prerequisites.
- Choose a client to bootstrap from. It could be:
  - Your laptop, a Toolbx container, a Fedora-like jumphost, or a Workbench Terminal that can access your cluster and the internet
- Your bootstrap client must have a bash shell with openssl and ansible-vault installed (a quick way to verify your tooling is shown after this list)
- Download and install the HashiCorp Vault client binary
- Log in to your OpenShift cluster using the OpenShift client as the cluster-admin user
- Setup env vars and login to OpenShift
export ADMIN_PASSWORD=password # replace with yours
export CLUSTER_NAME=ocp.4ldrd # replace with yours
export BASE_DOMAIN=sandbox2518.opentlc.com # replace with yours

oc login --server=https://api.${CLUSTER_NAME}.${BASE_DOMAIN}:6443 -u admin -p ${ADMIN_PASSWORD}
- Done ✅
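As an optional check (a minimal sketch, assuming your bootstrap client runs bash), confirm the required tools are on the PATH before continuing:
# Optional tooling check for the bootstrap client
for cmd in oc vault ansible-vault openssl curl git; do
  command -v "$cmd" >/dev/null 2>&1 && echo "ok: $cmd" || echo "MISSING: $cmd"
done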
MaaS credentials
Gather your Model as a Service Credentials.
- Click on the See your Applications & their credentials button.
- Create 3 Applications, one for each of these three models:
  - Llama-3.2-3B
  - Llama-4-Scout-17B-16E-W4A16
  - Nomic-Embed-Text-v1.5
e.g. llama-4-scout-17b-16e-w4a16
Figure 6. MaaS LLama4 Scout
- Setup env vars (an optional endpoint check is shown after this list)
export MODEL_LLAMA3_API_KEY=e3...
export MODEL_LLAMA3_ENDPOINT_URL=https://llama-3-2-3b-maas-apicast-production.apps.prod.rhoai.rh-aiservices-bu.com:443
export MODEL_LLAMA3_NAME=llama-3-2-3b
export MODEL_LLAMA4_API_KEY=ce...
export MODEL_LLAMA4_ENDPOINT_URL=https://llama-4-scout-17b-16e-w4a16-maas-apicast-production.apps.prod.rhoai.rh-aiservices-bu.com:443
export MODEL_LLAMA4_NAME=llama-4-scout-17b-16e-w4a16
export MODEL_EMBED_API_KEY=95...
export MODEL_EMBED_URL=https://nomic-embed-text-v1-5-maas-apicast-production.apps.prod.rhoai.rh-aiservices-bu.com:443
export MODEL_EMBED_NAME=/mnt/models
- Done ✅
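As an optional sanity check, the MaaS endpoints are fronted by an OpenAI-compatible gateway, so listing the available models should succeed with your key. This assumes the standard /v1/models path is exposed - adjust if your endpoint differs:
# Optional: verify the Llama 3 credentials work (assumes an OpenAI-compatible /v1/models path)
curl -sk -H "Authorization: Bearer ${MODEL_LLAMA3_API_KEY}" ${MODEL_LLAMA3_ENDPOINT_URL}/v1/models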
Vault Setup for GitOps
We need to set up Vault for your environment.
- Initialize the vault. Make sure you record the UNSEAL_KEY and ROOT_TOKEN somewhere safe and export them as env vars.
oc -n vault exec -ti vault-0 -- vault operator init -key-threshold=1 -key-shares=1 -tls-skip-verify

export UNSEAL_KEY=EGbx...
export ROOT_TOKEN=hvs.wnz...
After running the vault initialization command, you’ll see output containing the unseal key and root token. Copy these values and export them as environment variables as shown.

- Unseal the Vault.
oc -n vault exec -ti vault-0 -- vault operator unseal -tls-skip-verify $UNSEAL_KEY
- Setup secrets for gitops.
(Optional Reading) You can see more details of this sort of setup here if you need more background.
- Setup env vars
export VAULT_ROUTE=vault-vault.apps.${CLUSTER_NAME}.${BASE_DOMAIN}
export VAULT_ADDR=https://${VAULT_ROUTE}
export VAULT_SKIP_VERIFY=true
- Login to Vault.
vault login token=${ROOT_TOKEN}
You should see the following output:
Figure 5. Vault Login
- Setup env vars
export APP_NAME=vault
export PROJECT_NAME=openshift-policy
export CLUSTER_DOMAIN=apps.${CLUSTER_NAME}.${BASE_DOMAIN}
- Create the Vault Auth using Kubernetes auth
vault auth enable -path=${CLUSTER_DOMAIN}-${PROJECT_NAME} kubernetes
export MOUNT_ACCESSOR=$(vault auth list -format=json | jq -r ".\"$CLUSTER_DOMAIN-$PROJECT_NAME/\".accessor")
- Create an ACL Policy - ArgoCD will only be allowed to READ secret values for hydration into the cluster
vault policy write $CLUSTER_DOMAIN-$PROJECT_NAME-kv-read -<< EOF
path "kv/data/*" {
  capabilities=["read","list"]
}
EOF
- Enable kv2 to store our secrets
vault secrets enable -path=kv/ -version=2 kv
- Bind the ACL policy to the Kubernetes Auth role
vault write auth/$CLUSTER_DOMAIN-$PROJECT_NAME/role/$APP_NAME \
  bound_service_account_names=$APP_NAME \
  bound_service_account_namespaces=$PROJECT_NAME \
  policies=$CLUSTER_DOMAIN-$PROJECT_NAME-kv-read \
  period=120s
- Grab the cluster CA certificate from the API endpoint
CA_CRT=$(echo "Q" | openssl s_client -showcerts -connect api.${CLUSTER_NAME}.${BASE_DOMAIN}:6443 2>&1 | awk '/BEGIN CERTIFICATE/,/END CERTIFICATE/ {print $0}')
- Add the Kubernetes API host and CA cert to the Vault Auth Config (an optional check of the role and config is shown after this list).
vault write auth/${CLUSTER_DOMAIN}-${PROJECT_NAME}/config \
  kubernetes_host="$(oc whoami --show-server)" \
  kubernetes_ca_cert="$CA_CRT"
- Done ✅
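If you want to double-check the auth configuration before moving on, both the role and the config can be read back with the Vault CLI:
# Optional: confirm the Kubernetes auth role and config were written as expected
vault read auth/${CLUSTER_DOMAIN}-${PROJECT_NAME}/role/${APP_NAME}
vault read auth/${CLUSTER_DOMAIN}-${PROJECT_NAME}/config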
Tavily search token
Gather your Tavily web search API Key.
- Setup a Tavily API key for web search. Login using the GitHub account of one of your team members.
Figure 7. Tavily API Key
- Done ✅
GitHub Token
Create a fine-grained GitHub Personal Access Token (PAT).
- Login to GitHub in a browser, then click on your user icon > Settings
- Select Developer Settings > Personal Access Tokens > Fine-grained personal access tokens
- Select the Generate new token button and give it a token name, e.g. etx-ai
- Set Repository access
All repositories: allow access to your repositories including read-only public repos.
- Give it the following permissions:
  - Commit statuses: Read-Only
  - Contents: Read-Only
  - Issues: Read and Write
  - Metadata: Read-Only (this gets added automatically)
  - Pull requests: Read-Only
Figure 8. GitHub Repo Perms
- Generate the token (an optional way to verify the token is shown after this list).
Figure 9. GitHub Repo Token
- Done ✅
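Optionally, you can verify the new token against the GitHub API before storing it. GITHUB_PAT is just a hypothetical variable name used for this check - paste your token into it first:
# Optional token check - GITHUB_PAT is a hypothetical variable name for this snippet
export GITHUB_PAT=github_pat_...   # your new fine-grained token
curl -s -H "Authorization: Bearer ${GITHUB_PAT}" https://api.github.com/user | jq .login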
GitHub Webhook
Create a webhook that fires from your GitHub repo fork to ArgoCD on the OpenShift Cluster. This ensures the applications are synced whenever you push a change into git, rather than waiting for the default 3-minute sync interval.
- Login to GitHub in a browser, go to your etx-agentic-ai fork > Settings
- Select Webhooks
- Select Add Webhook and add the following details:
  - Payload URL: https://global-policy-server-openshift-policy.apps.${CLUSTER_NAME}.${BASE_DOMAIN}/api/webhook - you can print the correct URL by echoing it out on the command line:
echo https://global-policy-server-openshift-policy.apps.${CLUSTER_NAME}.${BASE_DOMAIN}/api/webhook
  - Content Type: application/json
  - SSL Verification: Enable SSL Verification
  - Which events: Send me everything
- Click Add Webhook
Figure 10. GitHub Webhook
- Done ✅
The Secrets File
Why Do This
We need to be able to hydrate the vault from a single source of truth, which makes secret management very efficient. In the case of a disaster, we need to recover the vault environment quickly. We can check this file into git as an AES256-encrypted file (until quantum cracks it ❈).
The secrets file is just a bash shell script that uses the vault cli.
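For orientation, the file is essentially a series of vault kv put commands that write each credential to a path under the kv/ mount. The snippet below is only an illustration - the real paths, key names, and variable names come from infra/secrets/vault-sno-example, not from this sketch:
# Illustrative only - actual secret paths and key names are defined in infra/secrets/vault-sno-example
vault kv put kv/example-team/llama3 api_key="${MODEL_LLAMA3_API_KEY}" url="${MODEL_LLAMA3_ENDPOINT_URL}"
vault kv put kv/example-team/tavily api_key="${TAVILY_API_KEY}"   # TAVILY_API_KEY is a hypothetical variable name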
- Copy the example secrets file provided
cp infra/secrets/vault-sno-example infra/secrets/vault-sno
If the secrets file was encrypted we could decrypt it as follows (the instructor will provide the key):
ansible-vault decrypt infra/secrets/vault-sno
- Add the gathered API tokens as env vars to the secrets file and save it.
Figure 11. Add API Tokens
- Setup env vars
export VAULT_ROUTE=vault-vault.apps.${CLUSTER_NAME}.${BASE_DOMAIN}
export VAULT_ADDR=https://${VAULT_ROUTE}
export VAULT_SKIP_VERIFY=true
- Login to Vault.
vault login token=${ROOT_TOKEN}
- Hydrate the vault by running the secrets file as a script. When prompted to enter the root token, use the $ROOT_TOKEN you exported earlier.
sh infra/secrets/vault-sno
- Encrypt the secrets file and check it back into your git fork. Generate a large secret key to use to encrypt the file and keep it safe (you can put the key in vault 🔑).
openssl rand -hex 32
- Ansible vault encrypt will prompt you for the key twice
ansible-vault encrypt infra/secrets/vault-sno
- Add to git
# It's not real unless it's in git
git add infra/secrets/vault-sno; git commit -m "hydrated vault with apikeys"; git push
Optional: You can add a client-side pre-commit git hook so that you do not check in an unencrypted secrets file (a sketch of what such a hook does is shown at the end of this section). Run this after cloning the forked repo to configure git hooks:
chmod 755 infra/bootstrap/pre-commit
cd .git/hooks
ln -s ../../infra/bootstrap/pre-commit pre-commit
cd ../../
- Lastly, create the secret used by ArgoCD to connect to Vault in our OpenShift cluster. Since the OpenShift TokenAPI is used, we only really reference the service account details.
cat <<EOF | oc apply -f-
kind: Secret
apiVersion: v1
metadata:
  name: team-avp-credentials
  namespace: openshift-policy
stringData:
  AVP_AUTH_TYPE: "k8s"
  AVP_K8S_MOUNT_PATH: "auth/${CLUSTER_DOMAIN}-${PROJECT_NAME}"
  AVP_K8S_ROLE: "vault"
  AVP_TYPE: "vault"
  VAULT_ADDR: "https://vault.vault.svc:8200"
  VAULT_SKIP_VERIFY: "true"
type: Opaque
EOF
- Your Agentic ArgoCD is now set up to read secrets from Vault and should be in a healthy state.
- You can also login to Vault using the Vault UI and $ROOT_TOKEN from the OpenShift web console to check out the configuration if it is unfamiliar.
Figure 12. Login to Vault
- Done ✅
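For reference, the optional pre-commit hook mentioned above only needs to refuse a commit when the secrets file is staged without the ansible-vault header. The real hook lives at infra/bootstrap/pre-commit; this is just an illustrative sketch of the idea:
#!/bin/bash
# Illustrative sketch only - the actual hook is infra/bootstrap/pre-commit
FILE=infra/secrets/vault-sno
if git diff --cached --name-only | grep -qx "$FILE"; then
  # ansible-vault encrypted files start with the $ANSIBLE_VAULT; header
  if ! head -n1 "$FILE" | grep -q '^\$ANSIBLE_VAULT;'; then
    echo "Refusing commit: $FILE is not ansible-vault encrypted" >&2
    exit 1
  fi
fi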
💥 Expert Mode 💥
Experts Only ⛷️
Only run this script if you are familiar with the Hashi Vault setup we just ran through and you skipped to here. Run the all-in-one vault setup script.
Done ✅
Complete the Bootstrap
- The following OpenShift ConsoleLinks should already exist in your cluster:
  - Red Hat Applications - these are cloud services provided by Red Hat for your cluster.
  - GenAI - these are the GenAI applications that we will be using in the exercises. The Agentic ArgoCD should be running but is empty (no apps deployed yet) and is our GitOps application. The LLamaStack Playground is not deployed yet, but will be the link for the LlamaStack UI for integrating Tools and Agents. Vault is running but not yet initialized or unsealed and is the app that stores our secrets.
  - OpenShift GitOps - this is the cluster bootstrap ArgoCD GitOps. It has all of the setup to get started for our cluster. It does not include the Agentic applications that we cover in the exercises.
  - RHOAI - the UI for Red Hat OpenShift AI. Login here to access your Data Science workbenches, models, pipelines and experiments.
- Bootstrap App-of-Apps
# We need to update our ArgoCD Apps to point to your team fork
export YOUR_GITHUB_USER=your-gh-user # the team member who forked the GitHub Repo
cd etx-agentic-ai # navigate to the root directory of the code base if not already there
- Replace redhat-ai-services throughout the etx-app-of-apps.yaml file with your GitHub username.
sed -i "s/redhat-ai-services/${YOUR_GITHUB_USER}/g" infra/app-of-apps/etx-app-of-apps.yaml
- Update redhat-ai-services to your GitHub username in each file under infra/app-of-apps/sno.
for x in $(ls infra/app-of-apps/sno); do
  sed -i "s/redhat-ai-services/${YOUR_GITHUB_USER}/g" infra/app-of-apps/sno/$x
done
- Now we can save, commit, and push the changes to your GitHub fork.
# It's not real unless it's in git
git add .; git commit -m "using my github fork"; git push
- Finally, we can bootstrap the apps into our cluster.
# Bootstrap all our apps
oc apply -f infra/app-of-apps/etx-app-of-apps.yaml
This will install the tenant pipeline app and observability stack into our cluster. All the other GenAI apps are undeployed for now. You can check this in the app-of-apps/<cluster-name> folder of your GitHub fork.
- Check the install progress of the app-of-apps in ArgoCD (a CLI alternative is shown after this list).
- You will need to wait for the individual apps to be installed. This may take a few minutes, after which you should see output showing that the apps have been installed.
Also, notice that the tenant-ai-agent-local-cluster app is constantly in a progressing state. This is something we will address later in this course.
- Done ✅
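If you prefer the CLI to the ArgoCD UI, the same progress can be watched by listing the Argo CD Application resources across the cluster:
# Optional: watch the app-of-apps children sync from the command line
oc get applications.argoproj.io -A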
Our Data Science Team Have A Request
It seems there are only limited GPUs in the cluster - in this example, a single GPU. We already have an LLM model, deployed at bootstrap time, using this GPU.
The Data Science team 🤓 have requested to use GPUs for their Data Science Workbenches, e.g. when they use a PyTorch, CUDA or other stack image that can directly access an accelerator.
Given the cluster already has access to one GPU node let’s quickly set up this access for them. Note that your cluster may be configured with more GPU nodes.
In our case we have a single NVIDIA accelerator attached to our instance type.
- Check what EC2 GPU-enabled instance types we have running in our cluster
oc get machines.machine.openshift.io -A
NAMESPACE               NAME                                    PHASE     TYPE          REGION      ZONE         AGE
openshift-machine-api   ocp-kt5tz-master-0                      Running   c6a.2xlarge   us-east-2   us-east-2a   24h
openshift-machine-api   ocp-kt5tz-master-1                      Running   c6a.2xlarge   us-east-2   us-east-2b   24h
openshift-machine-api   ocp-kt5tz-master-2                      Running   c6a.2xlarge   us-east-2   us-east-2c   24h
openshift-machine-api   ocp-kt5tz-worker-gpu-us-east-2a-9vxzv   Running   g6e.2xlarge   us-east-2   us-east-2a   24h
openshift-machine-api   ocp-kt5tz-worker-us-east-2a-fcbcg       Running   m6a.4xlarge   us-east-2   us-east-2a   24h
openshift-machine-api   ocp-kt5tz-worker-us-east-2b-5zx84       Running   m6a.4xlarge   us-east-2   us-east-2b   24h
openshift-machine-api   ocp-kt5tz-worker-us-east-2c-z9xzs       Running   m6a.4xlarge   us-east-2   us-east-2c   24h
- We can see in this case that we have a g6e.2xlarge instance. We can check how many GPUs we are able to allocate:
oc get $(oc get node -o name -l beta.kubernetes.io/instance-type=g6e.2xlarge) -o=jsonpath={.status.allocatable} | python -m json.tool
In this case we have an output of 1 allocatable GPU:
{
    "cpu": "7500m",
    "ephemeral-storage": "114345831029",
    "hugepages-1Gi": "0",
    "hugepages-2Mi": "0",
    "memory": "63801456Ki",
    "nvidia.com/gpu": "1",
    "pods": "250"
}
- Label the node with the device-plugin.config that matches the GPU instance product, e.g. NVIDIA-L40S for this instance type.
oc label --overwrite node \
  --selector=nvidia.com/gpu.product=NVIDIA-L40S \
  nvidia.com/device-plugin.config=NVIDIA-L40S
If your instance type has different accelerators, you will need to adjust the label used here and the ConfigMap in the next step.
Now apply the GPU Cluster Policy and ConfigMap objects that setup Time Slicing - a method to share nvidia gpus.
oc apply -k infra/applcations/gpu
- After approximately 30 seconds, check the number of allocatable GPUs again
oc get $(oc get node -o name -l beta.kubernetes.io/instance-type=g6e.2xlarge) -o=jsonpath={.status.allocatable} | python -m json.tool
This should now give an output with 8 allocatable GPUs. Great - now our data science team can see and use eight GPUs even though we only have one physical GPU.
{
    "cpu": "7500m",
    "ephemeral-storage": "114345831029",
    "hugepages-1Gi": "0",
    "hugepages-2Mi": "0",
    "memory": "63801456Ki",
    "nvidia.com/gpu": "8",
    "pods": "250"
}
- Done ✅
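For context, the ConfigMap applied above uses the NVIDIA GPU Operator's device-plugin sharing format to advertise one physical GPU as several allocatable GPUs. A trimmed, illustrative sketch is shown below - the real object lives in the GPU kustomization in the repo, and the ConfigMap name and namespace here are assumptions:
apiVersion: v1
kind: ConfigMap
metadata:
  name: device-plugin-config          # assumed name - the GitOps repo defines the real one
  namespace: nvidia-gpu-operator      # default GPU Operator namespace on OpenShift
data:
  NVIDIA-L40S: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
          - name: nvidia.com/gpu
            replicas: 8               # one physical GPU advertised as 8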
Technical Knowledge
- Good understanding of OpenShift/Kubernetes concepts
- Basic familiarity with Python programming
- Good knowledge of containerization concepts
- Basic understanding of CI/CD pipelines
- Good grasp of GitOps and Everything as Code practices
☕ Buckle Up, Here we go …