Bonus Lab: Advanced Pipeline Customization and Artifact Management

Congratulations on reaching this point! You’ve learned the fundamentals of working with pipelines and managing artifacts. To take your skills to the next level, this Bonus Lab explores more advanced topics in pipeline customization and artifact management.

This section is optional and not required to complete the bootcamp, but it will give you valuable hands-on experience and additional tools for real-world AI pipeline workflows.

Prerequisites

Make sure you have completed the main pipeline module and are familiar with:

  1. The basics of pipeline creation and artifact management.

  2. How to configure and store artifacts using default tools.

  3. Basic Python scripting and command-line interaction.

Build and Deploy an Airflow Pipeline with Elyra

To build and deploy an Airflow pipeline with Elyra, work through the following steps. Steps 1-4 cover the prerequisites (custom notebook image, workbench, data connection, and repository fork); steps 5-7 configure Elyra and run the pipeline. Optional command-line sketches for several of the steps follow the list.

  1. Add the Elyra custom notebook image - quay.io/eformat/elyra-base:0.2.1

    Review the lesson on adding a custom notebook image to RHOAI (an oc-based sketch is included after this list).

  2. Create a workbench with the custom-elyra image

  3. Add a Data Connection

    You can use the deployed MinIO instance (see the data connection sketch after this list).

  4. Clone your fork of the git repository - https://github.com/<YOUR-ID>/telecom-customer-churn-airflow (forked from red-hat-data-services/telecom-customer-churn-airflow)

  5. Configure Elyra to work with Airflow. Add the custom runtime image - quay.io/eformat/airflow-runner:2.5.1

    Solution
    Add quay.io/eformat/airflow-runner:2.5.1 as a new runtime image in Elyra's Runtime Images panel.
  6. Create a runtime configuration (a CLI alternative is sketched after this list)

    Solution
    1. Display Name: airflow

    2. Airflow settings:

      1. Apache Airflow UI Endpoint: run oc get route -n airflow and use the route's host URL

      2. Apache Airflow User Namespace: airflow

    3. GitHub/GitLab settings:

      1. Git type: GITHUB

      2. GitHub server API Endpoint: https://api.github.com

      3. GitHub DAG Repository: https://github.com/<YOUR-ID>/telecom-customer-churn-airflow (your fork)

      4. GitHub DAG Repository Branch: The branch in your fork that Elyra should push generated DAGs to

      5. Personal Access Token: A personal access token for pushing to the repository

    4. Cloud Object Storage settings: Your MinIO storage details (endpoint, credentials, and bucket)

  7. Test and run the DAG from your fork - https://github.com/<YOUR-ID>/telecom-customer-churn-airflow (see the sketch after this list)
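
Optional command-line sketches

The sketches below are illustrative only; namespaces, resource names, and credentials are placeholders that you will need to adapt to your cluster.

Step 1 (custom notebook image): the supported path is the RHOAI dashboard import described in the lesson, but the same result can be approximated from the CLI. The redhat-ods-applications namespace and the notebook-image label are assumptions based on a default RHOAI install.

    # Import the Elyra base image as an ImageStream and label it so the
    # RHOAI dashboard lists it as a notebook image (assumed label/namespace).
    oc import-image custom-elyra:0.2.1 \
      --from=quay.io/eformat/elyra-base:0.2.1 \
      --confirm -n redhat-ods-applications
    oc label imagestream custom-elyra \
      opendatahub.io/notebook-image=true -n redhat-ods-applications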
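
Step 3 (data connection): the RHOAI dashboard stores an S3 data connection as an opaque Secret in your data science project. A rough CLI equivalent is sketched below; the project name (churn-project), MinIO endpoint, credentials, and bucket are placeholders for your deployed MinIO instance, and the labels and annotations reflect what the dashboard is expected to set.

    # Create the data connection Secret (placeholder values throughout).
    oc create secret generic aws-connection-minio -n churn-project \
      --from-literal=AWS_ACCESS_KEY_ID=minio \
      --from-literal=AWS_SECRET_ACCESS_KEY=minio123 \
      --from-literal=AWS_S3_ENDPOINT=http://minio.minio.svc:9000 \
      --from-literal=AWS_DEFAULT_REGION=us-east-1 \
      --from-literal=AWS_S3_BUCKET=airflow
    # Label and annotate it so the dashboard shows it as a data connection.
    oc label secret aws-connection-minio -n churn-project \
      opendatahub.io/dashboard=true opendatahub.io/managed=true
    oc annotate secret aws-connection-minio -n churn-project \
      opendatahub.io/connection-type=s3 openshift.io/display-name=minio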
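
Step 4 (repository fork): a minimal sketch, assuming you have already forked red-hat-data-services/telecom-customer-churn-airflow under your GitHub account.

    # Clone your fork and create the branch Elyra will push generated DAGs to.
    git clone https://github.com/<YOUR-ID>/telecom-customer-churn-airflow.git
    cd telecom-customer-churn-airflow
    git checkout -b my-airflow-dags   # any branch name; reuse it in the runtime configuration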
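
Steps 5-6 (runtime configuration): the runtime configuration can also be created from the workbench terminal with the elyra-metadata CLI instead of the Elyra UI. The option names mirror the fields of Elyra's Apache Airflow runtime schema and may differ between Elyra releases, so confirm them with elyra-metadata create runtimes --help; all values below are placeholders.

    # Airflow UI route (assumes a single route in the airflow namespace).
    AIRFLOW_UI="https://$(oc get route -n airflow -o jsonpath='{.items[0].spec.host}')"

    # Sketch of the runtime configuration described in the Solution above.
    elyra-metadata create runtimes \
      --schema_name=airflow \
      --display_name=airflow \
      --api_endpoint="${AIRFLOW_UI}" \
      --user_namespace=airflow \
      --git_type=GITHUB \
      --github_api_endpoint=https://api.github.com \
      --github_repo=<YOUR-ID>/telecom-customer-churn-airflow \
      --github_branch=my-airflow-dags \
      --github_repo_token=<YOUR-PERSONAL-ACCESS-TOKEN> \
      --cos_auth_type=USER_CREDENTIALS \
      --cos_endpoint=http://minio.minio.svc:9000 \
      --cos_username=minio \
      --cos_password=minio123 \
      --cos_bucket=airflow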
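
Step 7 (test and run): after you submit the pipeline from Elyra, the generated DAG is pushed to your repository branch and picked up by Airflow, and you can monitor and trigger it from the Airflow UI (the route returned by oc get route -n airflow). As a sketch, the same can be done with the Airflow CLI from the scheduler pod; the deployment name and DAG id below are assumptions to adjust for your install.

    # List DAGs and trigger the one generated from your Elyra pipeline.
    oc exec -n airflow deploy/airflow-scheduler -- airflow dags list
    oc exec -n airflow deploy/airflow-scheduler -- airflow dags trigger <your-dag-id>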