
Mirror file system

If you wish to use the same JuiceFS file system across two or more regions while maintaining a similarly high-performance experience, consider JuiceFS's "Mirror file system" feature: create one or more complete "read-only mirrors" of a file system, each with the same content but a different volume name and access credentials.

Prerequisites

To implement file system mirroring, the JuiceFS metadata service needs to be deployed privately in each region. You can build a mirror cluster in any region, whether it is on a public cloud or in an IDC. This feature is currently only available to on-premises users; contact us for more information.

How it works

[Figure: mirror architecture]

To build a mirror file system, metadata synchronization must first be established in the mirror region: a separate metadata cluster is deployed there and configured as a mirror of the source region. The basic mirroring unit is the whole metadata service, meaning you cannot choose to mirror just a single file system. This is usually not a problem, since the metadata mirroring overhead is relatively small. Once the mirror metadata cluster has been configured and deployed, it automatically syncs metadata from the source region. Under normal cross-region network conditions, synchronization latency is usually around a second. There is also a monitoring task running in the Console that periodically checks for data differences between the source and mirror regions.

As for object storage, users can choose from different setups depending on their performance requirements and budget:

  1. To simplify the setup and lower object storage costs, consider sharing the object storage service across both regions. To improve performance in the mirror region, use a distributed cache cluster, as shown below (see also the mount example after this list).

    [Figure: mirror with shared object storage]

  2. If your mirror region needs all the data from the source file system with the best possible performance, use a dedicated object storage service in the mirror region, and enable replication to keep the two synchronized.

    [Figure: mirror with async replication]
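For setup 1, a distributed cache cluster can be formed by mounting the mirror volume on several nodes in the mirror region with the same cache group name. The sketch below is only illustrative: the volume name, mount point, cache directory, and sizes are placeholders, and the exact mount options should be verified against your client version.

```shell
# Mount the mirror volume (placeholder name "myjfs-mirror") on multiple nodes,
# all joining the same cache group so that cached objects are shared within
# the mirror region instead of being fetched repeatedly from object storage.
juicefs mount myjfs-mirror /jfs \
  --cache-group mirror-region-cache \
  --cache-dir /data/jfsCache \
  --cache-size 102400   # local cache capacity per node, in MiB
```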

With setup 2, a dedicated object storage is used in the mirror region, so credentials for both object storage services must be provided via the --access-key2 and --secret-key2 options. The mirror metadata service watches for all data modifications in the Raft changelog and dispatches data synchronization tasks to the clients (in the form of background tasks); the clients then pull data from the source object storage and upload it to the target object storage. If synchronization speed isn't ideal, simply mount more clients to increase concurrency.
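These options are typically supplied when authenticating the client against the mirror volume. A minimal sketch, assuming a volume named myjfs-mirror and credentials stored in environment variables (both placeholders); check the client reference for the exact command and flags supported by your version:

```shell
# --access-key / --secret-key:   credentials for the source object storage
# --access-key2 / --secret-key2: credentials for the dedicated object storage
#                                in the mirror region
juicefs auth myjfs-mirror \
  --token "$VOL_TOKEN" \
  --access-key "$SRC_ACCESS_KEY" --secret-key "$SRC_SECRET_KEY" \
  --access-key2 "$DST_ACCESS_KEY" --secret-key2 "$DST_SECRET_KEY"
```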

Clients in the mirror region preferentially read data from the object storage in their own region. If data synchronization has not yet completed, they will instead try to read from the object storage in the source region, as shown in the following figure:

[Figure: mirror read object storage preference]

Caveats

  • With replication enabled, when the mirror file system is mounted for the first time, it will automatically start synchronizing all existing data. However, we recommend manually performing a full synchronization in advance (for example, with juicefs sync; see the example after this list), so that the mirror file system can be put into production sooner. While clients in the mirror region are running, the data in the object storage is also fully synchronized (in both directions) periodically, once a week by default.
  • The mirror file system supports only read-only access; data cannot be written or modified. For a FUSE mount point, any write operation will fail with an error. For the CSI Driver, pay special attention not to reference non-existent directories in PVs (e.g. using dynamic provisioning, or mounting a sub-directory that does not exist), otherwise CSI Controller provisioning will fail, resulting in a bad mount.
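A minimal sketch of such a pre-synchronization with juicefs sync, copying existing objects from the source bucket to the mirror-region bucket (bucket endpoints and credential variables are placeholders):

```shell
# Copy all existing objects from the source bucket to the mirror bucket;
# --threads raises copy concurrency beyond the default.
juicefs sync \
  "s3://$SRC_ACCESS_KEY:$SRC_SECRET_KEY@source-bucket.s3.us-east-1.amazonaws.com/" \
  "s3://$DST_ACCESS_KEY:$DST_SECRET_KEY@mirror-bucket.s3.ap-northeast-1.amazonaws.com/" \
  --threads 50
```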

Billing Notes

You can create multiple mirrors in multiple regions; each mirror is billed separately.