A deep dive into JuiceFS CSI Driver sidecar mode

2023-02-22
weiweizhu

JuiceFS CSI Driver v0.18 introduces a new way to access the file system: the JuiceFS client runs as a sidecar container in the application Pod and shares the Pod's lifecycle. This makes JuiceFS usable in serverless Kubernetes environments, and it also makes troubleshooting and client management easier than in the default Mount Pod mode. This article introduces the architecture and application scenarios of sidecar mode.

What is a sidecar in the cloud?

Sidecar is a common design pattern that is especially popular in the container and microservice world. Taken literally, as shown in the figure below, a sidecar is the cart attached to the side of a motorcycle to increase its carrying capacity. The image captures the relationship between a sidecar container and an application container: the two run side by side in the same Kubernetes Pod, share the Pod's environment, and, like all containers in a Pod, share its lifecycle.

[Figure: a motorcycle with a sidecar]

01. Introduction

1) Terminology

Pod: the smallest deployable unit of computing that you can create and manage in Kubernetes.

Deployment / DaemonSet / StatefulSet / Job: declarative workload resources, each a different way of managing Pods.

PV (PersistentVolume): a piece of storage in the cluster.

PVC (PersistentVolumeClaim): a user's request for storage.

StorageClass: a way for administrators to describe the "classes" of storage they offer.

CSI (Container Storage Interface): a standard interface that lets storage systems expose volumes to containerized workloads in Kubernetes and other container orchestrators.

2) How to use

Storage is managed separately from compute resources in Kubernetes. A PersistentVolume (PV) is a piece of storage in the cluster that has either been provisioned by an administrator or provisioned dynamically through a StorageClass. These two ways of creating PVs, static provisioning and dynamic provisioning, are described below.

Static provisioning

The system administrator creates one or more PVs, in which the JuiceFS Volume credentials are referenced. Then, the user only needs to create a PVC, specify the desired PV, and use this PVC in the application Pod.

The relationship between the resources:

[Figure: resource relationships in static provisioning]

Static provisioning requires the system administrator to create a PV for each application. It is mostly used in simple test scenarios or when applications need to share data.
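As a rough illustration of the static flow, the sketch below builds the PV and PVC manifests as plain Python dictionaries and prints them as JSON, which `kubectl apply -f -` accepts. The driver name `csi.juicefs.com` and the `fsType` value are given as I understand the JuiceFS CSI Driver to use them; every other name (Secret, namespace, PV and PVC names, the placeholder size) is an assumption made for the example.

```python
import json

# Driver name as I understand the JuiceFS CSI Driver registers it (treat as an assumption).
DRIVER_NAME = "csi.juicefs.com"


def static_pv(name: str, secret_name: str, secret_namespace: str) -> dict:
    """PV created by the administrator; it references the volume credentials (a Secret)."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": "10Gi"},  # placeholder size for the example
            "volumeMode": "Filesystem",
            "accessModes": ["ReadWriteMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            "csi": {
                "driver": DRIVER_NAME,
                "volumeHandle": name,  # hypothetical: reuse the PV name as the handle
                "fsType": "juicefs",
                "nodePublishSecretRef": {
                    "name": secret_name,
                    "namespace": secret_namespace,
                },
            },
        },
    }


def static_pvc(name: str, namespace: str, pv_name: str) -> dict:
    """PVC created by the user; it names the desired PV explicitly."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": "",  # empty string: do not provision dynamically
            "volumeName": pv_name,
            "resources": {"requests": {"storage": "10Gi"}},
        },
    }


if __name__ == "__main__":
    pv = static_pv("juicefs-pv", "juicefs-secret", "default")
    pvc = static_pvc("juicefs-pvc", "default", pv["metadata"]["name"])
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [pv, pvc]}, indent=2))
```

The application Pod then only needs a `persistentVolumeClaim` volume whose `claimName` matches the PVC.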

Dynamic provisioning

The system administrator creates a StorageClass that contains the JuiceFS volume credentials and mount options. The user then creates a PVC that references this StorageClass. When the user creates an application Pod that uses the PVC, Kubernetes automatically creates a PV according to the StorageClass and binds it to the PVC.

The relationship between the resources:

[Figure: resource relationships in dynamic provisioning]

With dynamic provisioning, a single StorageClass can serve multiple applications, so system administrators no longer need to create PVs by hand. For JuiceFS, dynamic provisioning also isolates data between applications, because each dynamically provisioned PV gets its own subdirectory in the file system.
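The dynamic flow can be sketched the same way: one StorageClass created by the administrator, and one PVC per application that simply refers to it. The `csi.storage.k8s.io/...` parameter keys are the standard ones CSI sidecars use to locate Secrets; the Secret name, namespaces, mount option, and sizes are assumptions chosen for illustration.

```python
import json
from typing import List


def storage_class(name: str, secret_name: str, secret_namespace: str,
                  mount_options: List[str]) -> dict:
    """StorageClass holding the volume credentials (via a Secret) and mount options."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "csi.juicefs.com",  # driver name, as I understand it
        "parameters": {
            "csi.storage.k8s.io/provisioner-secret-name": secret_name,
            "csi.storage.k8s.io/provisioner-secret-namespace": secret_namespace,
            "csi.storage.k8s.io/node-publish-secret-name": secret_name,
            "csi.storage.k8s.io/node-publish-secret-namespace": secret_namespace,
        },
        "mountOptions": list(mount_options),
    }


def dynamic_pvc(name: str, namespace: str, storage_class_name: str, size: str) -> dict:
    """PVC created by the user; Kubernetes provisions a matching PV from the StorageClass."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class_name,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    sc = storage_class("juicefs-sc", "juicefs-secret", "kube-system", ["cache-size=2048"])
    # Two applications share one StorageClass; each gets its own PV (and subdirectory).
    claims = [
        dynamic_pvc("app-a-data", "team-a", sc["metadata"]["name"], "10Gi"),
        dynamic_pvc("app-b-data", "team-b", sc["metadata"]["name"], "10Gi"),
    ]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [sc] + claims}, indent=2))
```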

02. How JuiceFS CSI Driver works

Users can use JuiceFS in a cluster by specifying the JuiceFS Volume configurations in PV / StorageClass. JuiceFS CSI Driver provides two different ways to mount JuiceFS: Mount Pod mode and sidecar mode.

1) Mount Pod Mode

Components

Mount Pod mode requires both the CSI Controller Service and the CSI Node Service, with the following responsibilities:

  • Controller Service: creates a subdirectory in JuiceFS named after the PV ID; used only in dynamic provisioning.
  • Node Service: creates a Mount Pod (with the JuiceFS client running inside) and mounts the volume into the application Pod.

This mode also requires that:

  • Containers can use FUSE devices, which means they run in privileged mode.
  • The cluster supports DaemonSets, since the CSI Node Service is deployed as a DaemonSet.

Architecture

The CSI Node Service architecture:

[Figure: CSI Node Service with Mount Pod]

  1. The user creates an application Pod that references a previously created PVC.
  2. The CSI Node Service creates a Mount Pod on the node where the application Pod is scheduled.
  3. The JuiceFS client inside the Mount Pod mounts the JuiceFS volume on the host at /var/lib/juicefs/volume/[pv-name].
  4. The CSI Node Service waits until the Mount Pod is up and running, then binds the PV to the application container: the PV's subdirectory is mounted into the Pod at the path defined by volumeMounts.
  5. Kubelet starts the application Pod.
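The steps above can be restated as a small, purely illustrative planning function. It performs no real mounts; it only derives the host path from the PV name and lists the actions in order. The function names and the example PV, node, and container path are hypothetical.

```python
from pathlib import PurePosixPath
from typing import List

# Host directory under which the Mount Pod mounts volumes, as stated in step 3.
HOST_MOUNT_ROOT = PurePosixPath("/var/lib/juicefs/volume")


def host_mount_path(pv_name: str) -> str:
    """Host path where the volume for a given PV is mounted."""
    return str(HOST_MOUNT_ROOT / pv_name)


def mount_pod_plan(pv_name: str, node: str, container_path: str) -> List[str]:
    """Ordered description of what happens after the application Pod is created."""
    host_path = host_mount_path(pv_name)
    return [
        f"CSI Node Service creates a Mount Pod for {pv_name} on node {node}",
        f"JuiceFS client in the Mount Pod mounts the volume at {host_path}",
        "CSI Node Service waits until the Mount Pod is Running",
        f"PV subdirectory under {host_path} is mounted into the Pod at {container_path}",
        "kubelet starts the application Pod",
    ]


if __name__ == "__main__":
    for number, step in enumerate(mount_pod_plan("pvc-1234", "node-1", "/data"), start=1):
        print(f"{number}. {step}")
```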

PVC in relation to Mount Pod

The relationship between relevant Kubernetes resources:

[Figure: relationship between application Pod, PVC, PV, and Mount Pod]

2) Sidecar Mode

Components

The JuiceFS Client runs in the same Pod as the application, inside a sidecar container.

In sidecar mode, only the CSI Controller Service (a StatefulSet component) is deployed. It has two responsibilities:

  1. Creating subdirectories in JuiceFS, named after the PV ID.
  2. Registering a webhook with the API Server; the webhook handles the actual sidecar injection.

Architecture

The CSI Controller Service architecture:

[Figure: JuiceFS CSI Driver with sidecar]

  1. When the CSI Controller starts, it registers a webhook with the API Server.
  2. An application Pod references an existing JuiceFS PVC.
  3. Before the Pod is actually created, the API Server calls the webhook.
  4. The CSI Controller injects the sidecar container (with the JuiceFS client running inside) into the application Pod.
  5. The API Server creates the application Pod with the JuiceFS client running in its sidecar container; the application container can access JuiceFS as soon as it starts.
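To make the injection step more concrete, here is a minimal sketch of what a mutating admission webhook handler could look like. It is not the driver's actual code: the container name and image are hypothetical, and the set of JuiceFS PVC names is passed in directly, whereas a real webhook would have to resolve each PVC to its PV and driver. The response shape (an AdmissionReview whose patch is a base64-encoded JSON Patch) follows the Kubernetes admission API.

```python
import base64
import json
from typing import Set

SIDECAR_NAME = "juicefs-sidecar"                # hypothetical container name
SIDECAR_IMAGE = "example.com/juicefs-mount:v1"  # hypothetical image reference


def uses_juicefs_pvc(pod: dict, juicefs_pvcs: Set[str]) -> bool:
    """True if any Pod volume refers to a PVC known to be backed by JuiceFS."""
    for volume in pod.get("spec", {}).get("volumes", []):
        claim = volume.get("persistentVolumeClaim")
        if claim and claim.get("claimName") in juicefs_pvcs:
            return True
    return False


def mutate(review: dict, juicefs_pvcs: Set[str]) -> dict:
    """Build the AdmissionReview response, adding a sidecar patch when needed."""
    request = review["request"]
    response = {"uid": request["uid"], "allowed": True}
    if uses_juicefs_pvc(request["object"], juicefs_pvcs):
        sidecar = {
            "name": SIDECAR_NAME,
            "image": SIDECAR_IMAGE,
            "securityContext": {"privileged": True},  # FUSE access, per the article
        }
        # JSON Patch "add" at index 0 inserts the sidecar before the existing containers.
        patch = [{"op": "add", "path": "/spec/containers/0", "value": sidecar}]
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()
    return {"apiVersion": "admission.k8s.io/v1", "kind": "AdmissionReview", "response": response}


if __name__ == "__main__":
    pod = {"spec": {"containers": [{"name": "app", "image": "nginx"}],
                    "volumes": [{"name": "data",
                                 "persistentVolumeClaim": {"claimName": "juicefs-pvc"}}]}}
    review = {"request": {"uid": "example-uid", "object": pod}}
    print(json.dumps(mutate(review, {"juicefs-pvc"}), indent=2))
```

Pods that do not reference a JuiceFS PVC are admitted unchanged, so the webhook stays invisible to other workloads.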

PVC's relationship with sidecar

The following diagram shows the relationship between PVC, PV, and Pod. Each application Pod has its own dedicated JuiceFS client, and the mount points of different applications are isolated from each other.

[Figure: relationship between application Pod, PVC, PV, and sidecar]

In this mode, FUSE access is required, so containers have to run in privileged mode.

By default, the sidecar container requests 1 CPU and 1 GiB of memory, with limits of 2 CPUs and 5 GiB of memory. When cluster resources are tight, these values can be adjusted in the PV or StorageClass.
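Expressed as a Kubernetes `resources` block, those defaults look like the dictionary below. The helper shows one way overrides could be layered on top. How overrides are actually spelled in the PV or StorageClass is not covered in this article, so the `overrides` argument is a hypothetical stand-in.

```python
from typing import Dict, Optional

Resources = Dict[str, Dict[str, str]]

# Defaults stated in the article, in Kubernetes quantity notation.
DEFAULT_SIDECAR_RESOURCES: Resources = {
    "requests": {"cpu": "1", "memory": "1Gi"},
    "limits": {"cpu": "2", "memory": "5Gi"},
}


def sidecar_resources(overrides: Optional[Resources] = None) -> Resources:
    """Return the defaults with any overrides applied, without mutating the defaults."""
    merged = {section: dict(values) for section, values in DEFAULT_SIDECAR_RESOURCES.items()}
    for section, values in (overrides or {}).items():
        merged.setdefault(section, {}).update(values)
    return merged


if __name__ == "__main__":
    # Example: shrink the requests for a small cluster, keep the default limits.
    print(sidecar_resources({"requests": {"cpu": "100m", "memory": "256Mi"}}))
```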

In addition, the mount point is shared between the sidecar container and the application container through HostPath. As a result, if the sidecar container crashes, the mount point is lost and the Pod has to be re-created in order to recover.

03. Pros and cons of the two modes and their application scenarios

Sidecar mode:

  • Pros: easier to troubleshoot; works in serverless Kubernetes environments as well as standard ones.
  • Cons: higher resource overhead, because each sidecar container belongs to a single Pod and cannot be shared.
  • Application scenarios: serverless Kubernetes environments (FUSE support required); standard Kubernetes environments with a small number of application workloads.

Mount Pod mode:

  • Pros: lower resource overhead due to mount pod sharing.
  • Cons: slightly more difficult to troubleshoot; not available in serverless environments.
  • Application scenarios: standard Kubernetes environments with a large number of application workloads.
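A back-of-the-envelope way to compare the overhead is to count how many JuiceFS clients each mode runs for the same set of application Pods. The sketch assumes that in Mount Pod mode, Pods on the same node that use the same PV share one Mount Pod; that sharing rule is my reading of "mount pod sharing" above and is an assumption, as are the Pod, node, and PV names.

```python
from collections import namedtuple
from typing import Iterable

# Hypothetical description of an application Pod: its name, node, and the PV it uses.
AppPod = namedtuple("AppPod", ["name", "node", "pv"])


def client_count(pods: Iterable[AppPod], mode: str) -> int:
    """Number of JuiceFS clients running for the given Pods in the given mode."""
    pods = list(pods)
    if mode == "sidecar":
        # One dedicated client per application Pod.
        return len(pods)
    if mode == "mount-pod":
        # Assumed sharing rule: one Mount Pod per (node, PV) pair.
        return len({(pod.node, pod.pv) for pod in pods})
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    pods = [AppPod(f"app-{i}", f"node-{i % 2}", "pv-shared") for i in range(6)]
    print("sidecar clients:  ", client_count(pods, "sidecar"))
    print("mount pod clients:", client_count(pods, "mount-pod"))
```

With six Pods spread over two nodes and one shared PV, sidecar mode runs six clients and Mount Pod mode runs two under this assumption, which is why Mount Pod mode tends to suit clusters with many application workloads.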

Q & A

Q: How is JuiceFS Client upgraded?

A: The version of the client depends on the mount image used by the sidecar container, which can be configured in PV/StorageClass. You can customize this image according to the documentation.

Q: Will read performance be affected when the sidecar reaches its resource limits?

A: Yes. The client is throttled when CPU usage reaches the limit, and the client process is OOM-killed by the system when memory usage reaches the limit.

Q: Is there any control over the initialization order of sidecar and application containers?

A: The sidecar container starts first, and the application container starts after.

Q: Is mount point auto recovery available in sidecar mode?

A: No. In sidecar mode, if the FUSE client crashes, the mount point is lost, and the Pod has to be re-created for the mount point to recover.

About Author:

Weiwei Zhu is an engineer at Juicedata, responsible for developing and maintaining JuiceFS CSI Driver and for the development of JuiceFS in the cloud-native area.
