How to implement a distributed /etc directory using etcd and JuiceFS

2023-02-02
Weiwei Zhu

Background

etcd is a key-value database with consistency and high availability. It is simple, safe, fast, and reliable and the primary data store for Kubernetes. Let's start with an official description of etcd's name.

The name “etcd” originated from two ideas, the unix “/etc” folder and “d"istributed systems. The “/etc” folder is a place to store configuration data for a single system whereas etcd stores configuration information for large scale distributed systems. Hence, a “d"istributed “/etc” is “etcd”.

The above quote comes from etcd officialwebsite. etcd imaginatively combines the concepts of etc (where Linux systems typically store their configuration files) and distributed. However, since etcd is served via HTTP API, so "unfortunately" a truly distributed /etc directory was not implemented. Below we will introduce how etcd can be used to implement a truly distributed /etc directory via JuiceFS.

We use JuiceFS, an open source distributed file system, to provide access to the POSIX file interface for `/etc`, and JuiceFS can use etcd as a Metadata engine to store metadata such as directory trees and filenames in the file system. With JuiceFS CSI Driver, it can be used as Persistent Volume to enable multiple application instances to share configuration files in Kubernetes platform, which is the out-and-out distributed `/etc`.

The following will introduce what is JuiceFS, why JuiceFS can and how to implement distributed /etc, and how etcd can share configuration files across multiple application instances with JuiceFS.

What is JuiceFS

JuiceFS Arch

JuiceFS is an open source distributed file system designed for the cloud, providing full POSIX, HDFS and S3 API compatibility. JuiceFS separates "data" and "metadata" storage, files are split into chunks and stored in object storage and the corresponding metadata can be stored in various databases such as Redis, MySQL, TiKV, SQLite, etc. v1.0-beta3 version officially supports etcd as the metadata engine. The data storage engine docks to any object storage storage , and etcd is also supported as data storage engine in v1.0-rc1, which is suitable for storing configuration files with small capacity.

Why JuiceFS can achieve distributed /etc

According to the hierarchical architecture described above, we can see that JuiceFS stores metadata in the database and file data in object storage, thus allowing users to access the same tree file system structure on different nodes. With JuiceFS, we can put configuration files in the file system and then mount JuiceFS into its /etc directory in each application, thus realizing a truly distributed "/etc". The whole process is shown in the figure below.

unnamed

How to implement distributed /etc

The next section describes how etcd makes it possible for multiple nginx instances to share the same configuration and implement distributed /etc using JuiceFS.

Deploy etcd

In a Kubernetes cluster, it is recommended to build an independent for JuiceFS, instead of using the default etcd in the cluster, to avoid affecting the stability of the Kubernetes cluster with high access pressure of file system.

To install etcd, you can refer to the official documentation and build a multi-node etcd cluster; you can also use the chart installer provided by Bitnami for etcd .

In case of data sensitivity, you can enable the encrypted communication function of etcd for encrypted data transmission. Refer to the sample script provided by the etcd project.

Preparing configuration files in JuiceFS

After the etcd cluster is installed, we can initialize a JuiceFS file system with two lines of commands.

$ juicefs format etcd://$IP1:2379,$IP2:2379,$IP3:2379/jfs --storage etcd --bucket etcd://$IP1:2379,$IP2:2379,$IP3:2379/data pics
$ juicefs mount etcd://$IP1:2379,$IP2:2379,$IP3:2379/jfs /mnt/jfs

After mounting the JuiceFS volume to `/mnt/jfs`, you can directly place the `nginx.conf` file in that directory.

Using JuiceFS in Kubernetes

First we should create a Secret and fill it with etcd connection information:

apiVersion: v1
kind: Secret
metadata:
  name: juicefs-secret
  namespace: kube-system
type: Opaque
stringData:
  name: test
  metaurl: etcd://$IP1:2379,$IP2:2379,$IP3:2379/jfs
  storage: etcd
  bucket: etcd://$IP1:2379,$IP2:2379,$IP3:2379/data

Both the metadata engine and the object store use etcd, where $IP1, $IP2, and $IP3 are the client IPs of etcd. Then we create the PV and PVC (see this document. You can then mount the PVC to `/etc` in Nginx application, as follows.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-app
spec:
    …
    volumes:
      - name: config
        persistentVolumeClaim:
          claimName: etcd
    containers:
      - image: nginx
        volumeMounts:
        - mountPath: /etc/nginx
          name: config
         …

When Nginx application starts, it will read the `nginx.conf` configuration file under `/etc/nginx`, that is, read the `nginx.conf` configuration file we placed in JuiceFS which is shared through JuiceFS volume.

Summary

etcd has become the de facto standard as a cloud-native key-value database for storing service configuration information in the Kubernetes platform. However, the configuration file management of many upper-level applications is still inconvenient. This article shares how JuiceFS turned etcd into a distributed "/etc", helping etcd fulfill its original dream ✌🏻.

About Author:

Weiwei Zhu, Juicedata engineer, who is responsible for the developing and maintaining JuiceFS CSI Driver, and the development of JuiceFS in the area of cloud-native.

Related Posts

From Hadoop to Cloud: Why and How to Decouple Storage and Compute in Big Data Platforms

2023-11-01
Learn the importance and feasibility of storage-compute decoupling and explore available market sol…

Xiaomi: Building a Cloud-Native File Storage Platform to Host 5B+ Files in AI Training & More

2023-10-12
Xiaomi built a cloud-native file storage platform with JuiceFS to support apps in AI training, larg…