Kubeflow – A Clear-Cut Introduction

03. 02. 2023

Overview

A new blog where we cover MLOps tools. Featuring Kubeflow, a toolkit that makes deployment of ML frameworks easy.

In one of our previous blogs, we talked about the importance of using and contributing to Free and Open source projects and introduced MLflow (if you somehow missed that blog, go and check it out). In this blog, we’ll continue with “flow” projects and introduce Kubeflow, a toolkit for the deployment of Machine learning (ML) workflows on Kubernetes, and do a quick overview of services that Kubeflow offers. If you’re interested in a more hands-on introduction to Kubeflow, we’ll be releasing another blog that will go through the process of setting up and running a Kubeflow Pipeline

Growing demand for AI solutions introduces the need by more and more companies to have a platform on which they can build ML use cases. Implementing a solution to one ML problem can be easy, but maintaining various solutions in a team with multiple Data Scientists can be challenging. Generally, in those types of situations, we look for a platform that can ease the development process and simplify maintaining phase, hence Kubeflow.

Kubeflow offers components for simplifying or automating steps in a typical ML workflow:

model research (Kubeflow Notebooks)
training (Kubeflow Pipelines)
hyperparameter tuning (Katib)
model serving (KServe and similar)

Kubeflow - Experimental and Production Phase

Typical ML workflow depicted on Kubeflow.

Central Dashboard

When you set up Kubeflow on your Kubernetes clusters, you can access the Central Dashboard or Kubeflow UI.

Kubeflow - Central Dashboard

The menu on the left lets you access major components (alternatively, you can use quick access boxes in the middle):

Notebooks and Tensorboards → managing Jupyter notebook and Tensorboard servers
Models → managing deployed KServing models
Volumes → managing cluster volumes
Experiments (AutoML) → managing Katib experiments
Experiments (KFP), Pipelines, Runs, and Recurring Runs → managing Kubeflow Pipelines (KFP) features
Artifacts and Executions → ML Metadata (MLMD) features
Manage Contributors → configuring user access across namespace

The top left, next to the menu, is a drop-down that lets you change a namespace. Namespaces (sometimes called profiles or workgroups) allow for multi-user isolation, i.e., only users with access rights can see components of a given namespace.

Components

Notebooks

Notebooks provide a way to run development environments inside your Kubernetes cluster. It supports JupyterLab, RStudio, and Visual Studio Code. It is a way for users to create notebook containers directly in the cluster using provided notebook images with packages pre-installed.

When you open the Notebooks dashboard page click on “New Server” and choose your configuration. Some of the things you can select are:

Name → can include letters and numbers, but no spaces
Image → Docker image for notebook server. You can choose already configured images or use a custom one
Amount of CPU and RAM for your server
Volumes that will be mounted as Persistent Volume Claim (PVC) Volume

After you select “Launch” your new notebook server will be created, and you’ll get an option to connect to it. When you connect, you’ll be redirected to another view where you can create new notebooks or import your own.

Kubeflow - Work Page, creating Clusters and workflows

Kubeflow Pipelines

Pipelines is a platform for end-to-end orchestration of ML workflows based on Docker containers. Workflows are described using a pipeline (hence the name) which is defined by one or more steps called tasks. A task is a set of instructions for execution inside a single container and includes input and output parameters. It’s possible to link tasks to one another and for a computed acyclic graph (DAG) of tasks. Pipelines are written in Python for simplicity, compiled to YAML for portability, and executed on Kubernetes for scalability.

Opening the Pipelines dashboard gives you an overview of your pipelines and lets you upload new ones by specifying a YAML file, which you got by compiling your Python source code.

Kubeflow Pipelines

Once you’ve uploaded your pipeline, you can run it. Every pipeline run has an experiment associated with it, a space that contains the history of runs and lets you run different kinds of configurations of the same pipeline to compare the results.

Katib

Katib is a Kubernetes-native project for automated machine learning (AutoML). It natively supports many ML frameworks, such as Tensorflow, Pytorch, XGBoost, and others.

It is currently in beta status, which means there is a timeline defined for when it will become stable.

Tools for serving

Kubeflow offers two multi-framework model serving systems out-of-the-box, KServing and Seldon Core. Alternatively, it offers support for the standalone model serving system or external multi-framework systems like BentoML.

Summary

With Kubernetes being one of the most popular container orchestration tools it’s not hard to find a team that uses it, and with Kubeflow integrating into it without much extra configuration makes it a desirable service for any ML platform engineer out there. Features Kubeflow offers are extensive but simple to use and with a large community helping in maintaining and upgrading it, it will only become better.