Cache

JuiceFS comes with a powerful cache design; read more in the JuiceFS Community Edition and JuiceFS Cloud Service documentation. This chapter introduces cache-related settings and best practices in CSI Driver.

Cache settings

Kubernetes nodes often use a dedicated disk for data and cache storage. Be sure to configure the cache directory properly; otherwise JuiceFS cache is written to /var/jfsCache by default, which can easily eat up system storage space.

Once the cache directory is set, it is accessible in the mount pod via hostPath. You might also need to configure other cache-related options (like --cache-size) according to "Adjust mount options".

note
  • In CSI Driver, the cache-dir parameter does not support wildcard characters. If you need to use multiple disks as storage devices, specify multiple directories joined by the : character. See JuiceFS Community Edition and JuiceFS Cloud Service.
  • For scenarios with intensive small writes, we usually recommend temporarily enabling client write cache. Due to its inherent risks, however, this is advised against when using CSI Driver: the pod lifecycle is significantly less stable, and data can be lost if the pod exits unexpectedly.
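
As an illustration of the first point, a mountOptions fragment that uses two cache directories might look like the sketch below. The paths /data1/jfsCache and /data2/jfsCache are hypothetical placeholders for directories on two separate disks, and the cache-size value is simply carried over from the examples in this chapter.

```yaml
mountOptions:
  # Hypothetical paths: one directory per disk, joined by ":" (wildcards are not supported)
  - cache-dir=/data1/jfsCache:/data2/jfsCache
  # Same value as the other examples in this chapter; adjust to your disks
  - cache-size=204800
```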

Cache-related settings are configured in mount options; you can also refer to the straightforward examples below. After the PV is created and mounted, check the mount pod command to make sure the options contain the newly set cache directory.

  • Static provisioning

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: juicefs-pv
      labels:
        juicefs-name: ten-pb-fs
    spec:
      capacity:
        storage: 10Pi
      volumeMode: Filesystem
      accessModes:
        - ReadWriteMany
      persistentVolumeReclaimPolicy: Retain
      mountOptions:
        - cache-dir=/dev/vdb1
        - cache-size=204800
      csi:
        driver: csi.juicefs.com
        volumeHandle: juicefs-pv
        fsType: juicefs
        nodePublishSecretRef:
          name: juicefs-secret
          namespace: default
  • Dynamic provisioning

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: juicefs-sc
    provisioner: csi.juicefs.com
    parameters:
      csi.storage.k8s.io/provisioner-secret-name: juicefs-secret
      csi.storage.k8s.io/provisioner-secret-namespace: default
      csi.storage.k8s.io/node-publish-secret-name: juicefs-secret
      csi.storage.k8s.io/node-publish-secret-namespace: default
    mountOptions:
      - cache-dir=/dev/vdb1
      - cache-size=204800

Use PVC as cache path

Starting from version 0.15.1, JuiceFS CSI Driver supports using a PVC as the cache directory. This is often used in hosted Kubernetes clusters provided by cloud services, where it allows you to use a dedicated cloud disk as cache storage for CSI Driver.

First, create a PVC according to your cloud service provider's manual.
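
As a rough illustration only, such a PVC might look like the sketch below. The StorageClass name ebs-sc, the access mode, and the requested size are assumptions made for this example and are not taken from any provider's manual; replace them with the values your provider documents. The PVC name and namespace match the ones assumed in the rest of this section.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Name referenced later via juicefs/mount-cache-pvc
  name: ebs-pvc
  # Must be the namespace the mount pod runs in (kube-system by default)
  namespace: kube-system
spec:
  accessModes:
    # Assumption: a block-storage cloud disk attached to one node at a time
    - ReadWriteOnce
  # Hypothetical StorageClass name; use one offered by your cloud provider
  storageClassName: ebs-sc
  resources:
    requests:
      # Assumed size; pick one that fits your cache needs
      storage: 100Gi
```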

Assuming a PVC named ebs-pvc has already been created in the same namespace as the mount pod (kube-system by default), use the examples below to use this PVC as the cache directory for JuiceFS CSI Driver.

Static provisioning

Use this PVC in a JuiceFS PV:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: juicefs-pv
  labels:
    juicefs-name: ten-pb-fs
spec:
  capacity:
    storage: 10Pi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: csi.juicefs.com
    volumeHandle: juicefs-pv
    fsType: juicefs
    nodePublishSecretRef:
      name: juicefs-secret
      namespace: default
    volumeAttributes:
      juicefs/mount-cache-pvc: "ebs-pvc"

Dynamic provisioning

To use ebs-pvc in StorageClass:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: juicefs-sc
provisioner: csi.juicefs.com
parameters:
  csi.storage.k8s.io/provisioner-secret-name: juicefs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/node-publish-secret-name: juicefs-secret
  csi.storage.k8s.io/node-publish-secret-namespace: default
  juicefs/mount-cache-pvc: "ebs-pvc"

Cache warm-up

JuiceFS Client runs inside the mount pod, so cache warm-up has to happen inside the mount pod as well. Use the commands below to enter the mount pod and carry out the warm-up:

# Application pod information will be used in below commands, save them as environment variables.
APP_NS=default # application pod namespace
APP_POD_NAME=example-app-xxx-xxx

# Enter the mount pod using a single command
kubectl -n kube-system exec -it $(kubectl -n kube-system get po --field-selector spec.nodeName=$(kubectl -n $APP_NS get po $APP_POD_NAME -o jsonpath='{.spec.nodeName}') -l app.kubernetes.io/name=juicefs-mount -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' | grep $(kubectl get pv $(kubectl -n $APP_NS get pvc $(kubectl -n $APP_NS get po $APP_POD_NAME -o jsonpath='{..persistentVolumeClaim.claimName}' | awk '{print $1}') -o jsonpath='{.spec.volumeName}') -o jsonpath='{.spec.csi.volumeHandle}')) -- bash

# Locate the JuiceFS mount point inside pod
df -h | grep JuiceFS

The JuiceFS client path inside the mount pod differs between Community Edition and Cloud Service, so adjust the path in the command below to match your edition:

/usr/local/bin/juicefs warmup /jfs/pvc-48a083ec-eec9-45fb-a4fe-0f43e946f4aa/data

On the other hand, if you've already installed JuiceFS Client in the application pod, it's OK to run the warm-up command directly inside the application pod.

Clean cache when mount pod exits

Local cache can be a precious resource, especially when dealing with large-scale data. JuiceFS CSI Driver does not delete cache by default when the mount pod exits. If this behavior doesn't suit you, adjust the settings as shown below so that local cache is cleaned when the mount pod exits.

note

This feature requires JuiceFS CSI Driver 0.14.1 or above.

Static provisioning

Modify volumeAttributes in PV definition, add juicefs/clean-cache: "true":

apiVersion: v1
kind: PersistentVolume
metadata:
  name: juicefs-pv
  labels:
    juicefs-name: ten-pb-fs
spec:
  capacity:
    storage: 10Pi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: csi.juicefs.com
    volumeHandle: juicefs-pv
    fsType: juicefs
    nodePublishSecretRef:
      name: juicefs-secret
      namespace: default
    volumeAttributes:
      juicefs/clean-cache: "true"

Dynamic provisioning

Configure parameters in StorageClass definition, add juicefs/clean-cache: "true":

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: juicefs-sc
provisioner: csi.juicefs.com
parameters:
  csi.storage.k8s.io/provisioner-secret-name: juicefs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/node-publish-secret-name: juicefs-secret
  csi.storage.k8s.io/node-publish-secret-namespace: default
  juicefs/clean-cache: "true"

Dedicated cache cluster

note

A dedicated cache cluster is only supported in JuiceFS Cloud Service and Enterprise Edition; Community Edition does not support it.

Kubernetes containers are usually ephemeral, so a distributed cache cluster built on top of ever-changing containers is unstable, which hinders cache utilization. In this situation, you can deploy a dedicated cache cluster to achieve a stable cache service.

Use the example below to deploy a StatefulSet of JuiceFS clients; together they form a stable JuiceFS cache group.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  # name and namespace are customizable
  name: juicefs-cache-group
  namespace: kube-system
spec:
  # Number of cache group peers
  replicas: 1
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      app: juicefs-cache-group
      juicefs-role: cache
  serviceName: juicefs-cache-group
  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: juicefs-cache-group
        juicefs-role: cache
    spec:
      # Run a single cache group peer on each node
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: juicefs-role
                    operator: In
                    values:
                      - cache
              topologyKey: kubernetes.io/hostname
      # Using hostNetwork allows the pod to run with a static IP; when the pod is recreated, the IP will not change, so cache data persists
      hostNetwork: true
      # Run juicefs auth command inside the initContainers
      # ref: https://juicefs.com/docs/cloud/reference/commands_reference#auth
      initContainers:
        - name: jfs-format
          command:
            - sh
            - -c
            # Change $VOL_NAME to the actual JuiceFS Volume name
            # ref: https://juicefs.com/docs/cloud/getting_started#create-file-system
            - /usr/bin/juicefs auth --token=${TOKEN} --access-key=${ACCESS_KEY} --secret-key=${SECRET_KEY} $VOL_NAME
          env:
            # The Secret that contains volume credentials, must reside in the same namespace as this StatefulSet
            # ref: https://juicefs.com/docs/csi/guide/pv#cloud-service
            - name: ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  key: access-key
                  name: jfs-secret-ee
            - name: SECRET_KEY
              valueFrom:
                secretKeyRef:
                  key: secret-key
                  name: jfs-secret-ee
            - name: TOKEN
              valueFrom:
                secretKeyRef:
                  key: token
                  name: jfs-secret-ee
          image: juicedata/mount:v1.0.2-4.8.2
          volumeMounts:
            - mountPath: /root/.juicefs
              name: jfs-root-dir
      # Containers running the JuiceFS Client and forming the cache group
      # ref: https://juicefs.com/docs/cloud/guide/cache#dedicated-cache-cluster
      containers:
        - name: juicefs-cache
          command:
            - sh
            - -c
            # Change $VOL_NAME to the actual JuiceFS Volume name
            # Must use --foreground to make the JuiceFS Client process run in the foreground; adjust other mount options to your needs (especially --cache-group)
            # ref: https://juicefs.com/docs/cloud/reference/commands_reference#mount
            - /usr/bin/juicefs mount $VOL_NAME /mnt/jfs --foreground --cache-dir=/data/jfsCache --cache-size=512000 --cache-group=jfscache
          # Use the mount pod container image
          # ref: https://juicefs.com/docs/csi/guide/custom-image
          image: juicedata/mount:v1.0.2-4.8.2
          lifecycle:
            # Unmount file system when exiting
            preStop:
              exec:
                command:
                  - sh
                  - -c
                  - umount /mnt/jfs
          # Adjust resources accordingly
          # ref: https://juicefs.com/docs/csi/guide/resource-optimization#mount-pod-resources
          resources:
            requests:
              memory: 500Mi
          # Mounting a file system requires system privilege
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /dev/shm
              name: cache-dir
            - mountPath: /root/.juicefs
              name: jfs-root-dir
      volumes:
        # Adjust cache directory; define multiple volumes if you need to use multiple cache directories
        # ref: https://juicefs.com/docs/cloud/guide/cache#client-read-cache
        - name: cache-dir
          hostPath:
            path: /dev/shm
            type: DirectoryOrCreate
        - name: jfs-root-dir
          emptyDir: {}

A JuiceFS cache cluster is now deployed with the cache group name jfscache. To use this cache cluster from application JuiceFS clients, join them to the same cache group and additionally add the --no-sharing option, so that these application clients do not take part in building the cache data; this is what prevents an unstable cache group.

Under dynamic provisioning, modify mount options according to the example below; see the full description in mount options.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: juicefs-sc
provisioner: csi.juicefs.com
parameters:
  csi.storage.k8s.io/provisioner-secret-name: juicefs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/node-publish-secret-name: juicefs-secret
  csi.storage.k8s.io/node-publish-secret-namespace: default
mountOptions:
  ...
  - cache-group=jfscache
  - no-sharing
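
The example above covers dynamic provisioning only. For static provisioning, a minimal sketch follows, assuming the same two options can be placed in the PV's mountOptions field, as in the cache settings examples earlier in this chapter. The PV name, label, and Secret are reused from those examples and are placeholders.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: juicefs-pv
  labels:
    juicefs-name: ten-pb-fs
spec:
  capacity:
    storage: 10Pi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    # Join the cache group formed by the StatefulSet
    - cache-group=jfscache
    # Use the group's cache without taking part in building it
    - no-sharing
  csi:
    driver: csi.juicefs.com
    volumeHandle: juicefs-pv
    fsType: juicefs
    nodePublishSecretRef:
      name: juicefs-secret
      namespace: default
```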