
Mirror file system

If you need to access the same JuiceFS file system from two or more regions while keeping a similarly high-performance experience, consider JuiceFS's "mirror file system" feature: create one or more complete read-only mirrors of a file system, each with the same content but a different volume name and access credentials.

Prerequisites

To implement file system mirroring, the JuiceFS metadata service needs to be deployed privately in each region. You can build a mirror cluster in any region of your choice, whether on a public cloud or in an IDC. This feature is currently only available to on-premise users; contact us for more information.

How it works

[Figure: mirror file system architecture]

For a mirror file system, both file metadata and object storage data need to be synchronized. For metadata, a separate metadata cluster is deployed in the mirror region and configured as a mirror of the source region. The basic mirror unit is the whole metadata service, meaning you cannot mirror a single file system on its own; this is usually not a problem, since metadata mirroring overhead is relatively small. Once the mirror metadata region has been configured and deployed, it automatically syncs metadata from the source region. Under normal cross-region network conditions, synchronization latency is usually around a second. There is also a monitoring task running in the Console which periodically checks for data differences between the source and mirror regions.

As for object storage, using a dedicated object storage bucket in the mirror region and enabling replication to synchronize data is recommended for better performance. However, this is optional; you are free to use the same object storage service in both regions if network conditions are acceptable.

If you decide to use dedicated object storage in the mirror region, you'll have to provide credentials for both object storage services via the --access-key2 and --secret-key2 options. The mirror metadata service watches for all data modifications in the Raft changelog and dispatches data synchronization tasks to the clients as background tasks; clients then pull data from the source object storage and upload it to the target object storage. If synchronization speed isn't ideal, simply mount more clients to increase concurrency.
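Exactly where these options are specified depends on your deployment; the sketch below assumes they are supplied when authenticating the mirror volume with juicefs auth, and the volume name, token, credentials and mount point are all placeholders rather than values confirmed by this page:

    # Assumed command placement and placeholder credentials; only the
    # --access-key2/--secret-key2 option names come from the text above.
    juicefs auth mirror-vol \
        --token YOUR_TOKEN \
        --access-key  SOURCE_BUCKET_AK  --secret-key  SOURCE_BUCKET_SK \
        --access-key2 MIRROR_BUCKET_AK  --secret-key2 MIRROR_BUCKET_SK

    # If background synchronization is too slow, mount additional clients
    # in the mirror region to raise concurrency.
    juicefs mount mirror-vol /jfs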

Clients in the mirror region preferentially read data from the object storage in their own region. If data synchronization has not yet completed and the read fails, the client falls back to reading from the object storage in the source region, as shown in the following figure:

[Figure: read path of a mirror file system client]

With replication enabled, when the mirror file system is mounted for the first time, it automatically starts synchronizing all existing data. We nevertheless recommend performing a full synchronization manually in advance (for example with juicefs sync), so that the mirror file system can be put into production sooner. While clients in the mirror region are running, the data in the object storage is also fully synchronized (in both directions) periodically, once a week by default.
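A one-off full copy of the existing objects could look like the sketch below; the bucket names, endpoints and credentials are placeholders, and --threads simply raises copy concurrency:

    # Placeholder buckets/credentials: copy all existing objects from the
    # source-region bucket to the dedicated bucket in the mirror region.
    juicefs sync --threads 50 \
        s3://SRC_AK:SRC_SK@juicefs-data.s3.source-region.amazonaws.com/ \
        s3://DST_AK:DST_SK@juicefs-data.s3.mirror-region.amazonaws.com/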

Caveats

  • If the mirror region can access the source object storage via the public internet, you can also choose to use the same object storage for both regions. If read speed isn't ideal, consider using a dedicated cache cluster so that you can warm up data in advance and greatly accelerate file reads (see the sketch after this list).
  • The mirror file system only supports read-only access; data cannot be written or modified. For a FUSE mount point, any write operation fails with an error. With the CSI Driver, pay special attention not to reference non-existent directories in PVs (e.g. using dynamic provisioning, or mounting a sub-directory that does not exist), or CSI Controller provisioning will fail, resulting in a bad mount.
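As an illustration of the first point above, a cache cluster can be pre-populated with juicefs warmup after the mirror is mounted; the mount point and directory below are placeholders:

    # Placeholder paths: prefetch the hot directories of the read-only mirror
    # into the cache so subsequent reads avoid the remote object storage.
    juicefs warmup /jfs/train-data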

Billing Notes

You can create multiple mirrors in different regions; each of them is billed separately.