Skip to main content

How to Set Up Object Storage

As you can learn from JuiceFS Technical Architecture, JuiceFS is a distributed file system with data and metadata stored separately. JuiceFS uses object storage as the main data storage and uses databases such as Redis, PostgreSQL and MySQL as metadata storage.

Storage options

When creating a JuiceFS file system, there are following options to set up the storage:

  • --storage: Specify the type of storage to be used by the file system, e.g. --storage s3
  • --bucket: Specify the storage access address, e.g. --bucket https://myjuicefs.s3.us-east-2.amazonaws.com
  • --access-key and --secret-key: Specify the authentication information when accessing the storage

For example, the following command uses Amazon S3 object storage to create a file system:

juicefs format --storage s3 \
--bucket https://myjuicefs.s3.us-east-2.amazonaws.com \
--access-key abcdefghijklmn \
--secret-key nmlkjihgfedAcBdEfg \
redis://192.168.1.6/1 \
myjfs

Other options

When executing the juicefs format or juicefs mount command, you can set some special options in the form of URL parameters in the --bucket option, such as tls-insecure-skip-verify=true in https://myjuicefs.s3.us-east-2.amazonaws.com?tls-insecure-skip-verify=true is to skip the certificate verification of HTTPS requests.

Client certificates are also supported as they are commonly used for mTLS connections, for example: https://myjuicefs.s3.us-east-2.amazonaws.com?ca-certs=./path/to/ca&ssl-cert=./path/to/cert&ssl-key=./path/to/privatekey

Enable data sharding

When creating a file system, multiple buckets can be defined as the underlying storage of the file system through the --shards option. In this way, the system will distribute the files to multiple buckets based on the hashed value of the file name. Data sharding technology can distribute the load of concurrent writing of large-scale data to multiple buckets, thereby improving the writing performance.

The following are points to note when using the data sharding function:

  • The --shards option accepts an integer between 0 and 256, indicating how many Buckets the files will be scattered into. The default value is 0, indicating that the data sharding function is not enabled.
  • Only multiple buckets under the same object storage can be used.
  • The integer wildcard %d needs to be used to specify the buckets, for example, "http://192.168.1.18:9000/myjfs-%d". Buckets can be created in advance in this format, or automatically created by the JuiceFS client when creating a file system.
  • The data sharding is set at the time of creation and cannot be modified after creation. You cannot increase or decrease the number of buckets, nor cancel the shards function.

For example, the following command creates a file system with 4 shards.

juicefs format --storage s3 \
--shards 4 \
--bucket "https://myjfs-%d.s3.us-east-2.amazonaws.com" \
...

After executing the above command, the JuiceFS client will create 4 buckets named myjfs-0, myjfs-1, myjfs-2, and myjfs-3.

Access Key and Secret Key

In general, object storages are authenticated with Access Key ID and Access Key Secret. For JuiceFS file system, they are provided by options --access-key and --secret-key (or AK, SK for short).

It is more secure to pass credentials via environment variables ACCESS_KEY and SECRET_KEY instead of explicitly specifying the options --access-key and --secret-key in the command line when creating a filesystem, e.g.,

export ACCESS_KEY=abcdefghijklmn
export SECRET_KEY=nmlkjihgfedAcBdEfg
juicefs format --storage s3 \
--bucket https://myjuicefs.s3.us-east-2.amazonaws.com \
redis://192.168.1.6/1 \
myjfs

Public clouds typically allow users to create IAM (Identity and Access Management) roles, such as AWS IAM role or Alibaba Cloud RAM role, which can be assigned to VM instances. If the cloud server instance already has read and write access to the object storage, there is no need to specify --access-key and --secret-key.

Use temporary access credentials

Permanent access credentials generally have two parts, Access Key, Secret Key, while temporary access credentials generally include three parts, Access Key, Secret Key and token, and temporary access credentials have an expiration time, usually between a few minutes and a few hours.

How to get temporary credentials

Different cloud vendors have different acquisition methods. Generally, the Access Key, Secret Key and ARN representing the permission boundary of the temporary access credential are required as parameters to request access to the STS server of the cloud service vendor to obtain the temporary access credential. This process can generally be simplified by the SDK provided by the cloud vendor. For example, Amazon S3 can refer to this link to obtain temporary credentials, and Alibaba Cloud OSS can refer to this link.

How to set up object storage with temporary access credentials

The way of using temporary credentials is not much different from using permanent credentials. When formatting the file system, pass the Access Key, Secret Key, and token of the temporary credentials through --access-key, --secret-key, --session-token can set the value. E.g:

juicefs format \
--storage oss \
--access-key xxxx \
--secret-key xxxx \
--session-token xxxx \
--bucket https://bucketName.oss-cn-hangzhou.aliyuncs.com \
redis://localhost:6379/1 \
test1

Since temporary credentials expire quickly, the key is how to update the temporary credentials that JuiceFS uses after format the file system before the temporary credentials expire. The credential update process is divided into two steps:

  1. Before the temporary certificate expires, apply for a new temporary certificate;
  2. Without stopping the running JuiceFS, use the juicefs config Meta-URL --access-key xxxx --secret-key xxxx --session-token xxxx command to hot update the access credentials.

Newly mounted clients will use the new credentials directly, and all clients already running will also update their credentials within a minute. The entire update process will not affect the running business. Due to the short expiration time of the temporary credentials, the above steps need to be executed in a long-term loop to ensure that the JuiceFS service can access the object storage normally.

Internal and public endpoint

Typically, object storage services provide a unified URL for access, but the cloud platform usually provides both internal and external endpoints. For example, the platform cloud services that meet the criteria will automatically resolve requests to the internal endpoint of the object storage. This offers you a lower latency, and internal network traffic is free.

Some cloud computing platforms also distinguish between internal and public networks, but instead of providing a unified access URL, they provide separate internal Endpoint and public Endpoint addresses.

JuiceFS also provides flexible support for this object storage service that distinguishes between internal and public addresses. For scenarios where the same file system is shared, the object storage is accessed through internal Endpoint on the servers that meet the criteria, and other computers are accessed through public Endpoint, which can be used as follows:

  • When creating a file system: It is recommended to use internal Endpoint address for --bucket
  • When mounting a file system: For clients that do not satisfy the internal line, you can specify a public Endpoint address to --bucket.

Creating a file system using an internal Endpoint ensures better performance and lower latency, and for clients that cannot be accessed through an internal address, you can specify a public Endpoint to mount with the option --bucket.

Storage class Added in v1.1

Object storage usually supports multiple storage classes, such as standard storage, infrequent access storage, and archive storage. Different storage classes will have different prices and availability, you can set the default storage class with the --storage-class option when creating the JuiceFS file system, or set a new storage class with the --storage-class option when mounting the JuiceFS file system. Please refer to the user manual of the object storage you are using to see how to set the value of the --storage-class option (such as Amazon S3).

note

When using certain storage classes (such as archive and deep archive), the data cannot be accessed immediately, and the data needs to be restored in advance and accessed after a period of time.

note

When using certain storage classes (such as infrequent access), there are minimum bill units, and additional charges may be incurred for reading data. Please refer to the user manual of the object storage you are using for details.

Using proxy

If the network environment where the client is located is affected by firewall policies or other factors that require access to external object storage services through a proxy, the corresponding proxy settings are different for different operating systems. Please refer to the corresponding user manual for settings.

On Linux, for example, the proxy can be set by creating http_proxy and https_proxy environment variables.

export http_proxy=http://localhost:8035/
export https_proxy=http://localhost:8035/
juicefs format \
--storage s3 \
... \
myjfs

Supported object storage

If you wish to use a storage system that is not listed, feel free to submit a requirement issue.

NameValue
Amazon S3s3
Google Cloud Storagegs
Azure Blob Storagewasb
Backblaze B2b2
IBM Cloud Object Storageibmcos
Oracle Cloud Object Storages3
Scaleway Object Storagescw
DigitalOcean Spacesspace
Wasabiwasabi
Telnyx Cloud Storages3
Storj DCSs3
Vultr Object Storages3
Cloudflare R2s3
Bunny Storagebunny
Alibaba Cloud OSSoss
Tencent Cloud COScos
Huawei Cloud OBSobs
Baidu Object Storagebos
Volcano Engine TOStos
Kingsoft Cloud KS3ks3
QingStorqingstor
Qiniuqiniu
Sina Cloud Storagescs
CTYun OOSoos
ECloud Object Storageeos
JD Cloud OSSs3
UCloud US3ufile
Ceph RADOSceph
Ceph RGWs3
Glustergluster
Swiftswift
MinIOminio
WebDAVwebdav
HDFShdfs
Apache Ozones3
Redisredis
TiKVtikv
etcdetcd
SQLitesqlite3
MySQLmysql
PostgreSQLpostgres
Local diskfile
SFTP/SSHsftp

Amazon S3

S3 supports two styles of endpoint URI: virtual hosted-style and path-style. The difference is:

  • Virtual-hosted-style: https://<bucket>.s3.<region>.amazonaws.com
  • Path-style: https://s3.<region>.amazonaws.com/<bucket>

The <region> should be replaced with specific region code, e.g. the region code of US East (N. Virginia) is us-east-1. All the available region codes can be found here.

note

For AWS users in China, you need add .cn to the host, i.e. amazonaws.com.cn, and check this document for region code.

note

If the S3 bucket has public access (anonymous access is supported), please set --access-key to anonymous.

In JuiceFS both the two styles are supported to specify the bucket address, for example:

juicefs format \
--storage s3 \
--bucket https://<bucket>.s3.<region>.amazonaws.com \
... \
myjfs

You can also set --storage to s3 to connect to S3-compatible object storage, e.g.:

juicefs format \
--storage s3 \
--bucket https://<bucket>.<endpoint> \
... \
myjfs
tip

The format of the option --bucket for all S3 compatible object storage services is https://<bucket>.<endpoint> or https://<endpoint>/<bucket>. The default region is us-east-1. When a different region is required, it can be set manually via the environment variable AWS_REGION or AWS_DEFAULT_REGION.

Google Cloud Storage

Google Cloud uses IAM to manage permissions for accessing resources. Through authorizing service accounts, you can have a fine-grained control of the access rights of cloud servers and object storage.

For cloud servers and object storage that belong to the same service account, as long as the account grants access to the relevant resources, there is no need to provide authentication information when creating a JuiceFS file system, and the cloud platform will automatically complete authentication.

For cases where you want to access the object storage from outside the Google Cloud Platform, for example, to create a JuiceFS file system on your local computer using Google Cloud Storage, you need to configure authentication information. Since Google Cloud Storage does not use Access Key ID and Access Key Secret, but rather the JSON key file of the service account to authenticate the identity.

Please refer to "Authentication as a service account" to create JSON key file for the service account and download it to the local computer, and define the path to the key file via the environment variable GOOGLE_APPLICATION_ CREDENTIALS, e.g.:

export GOOGLE_APPLICATION_CREDENTIALS="$HOME/service-account-file.json"

You can write the command to create environment variables to ~/.bashrc or ~/.profile and have the shell set it automatically every time you start.

Once you have configured the environment variables for passing key information, the commands to create a file system locally and on Google Cloud Server are identical. For example,

juicefs format \
--storage gs \
--bucket <bucket>[.region] \
... \
myjfs

As you can see, there is no need to include authentication information in the command, and the client will authenticate the access to the object storage through the JSON key file set in the previous environment variable. Also, since the bucket name is globally unique, when creating a file system, you only need to specify the bucket name in the option --bucket.

Azure Blob Storage

To use Azure Blob Storage as data storage of JuiceFS, please check the documentation to learn how to view the storage account name and access key, which correspond to the values ​​of the --access-key and --secret-key options, respectively.

The --bucket option is set in the format https://<container>.<endpoint>, please replace <container> with the name of the actual blob container and <endpoint> with core.windows.net (Azure Global) or core.chinacloudapi.cn (Azure China). For example:

juicefs format \
--storage wasb \
--bucket https://<container>.<endpoint> \
--access-key <storage-account-name> \
--secret-key <storage-account-access-key> \
... \
myjfs

In addition to providing authorization information through the options --access-key and --secret-key, you could also create a connection string and set the environment variable AZURE_STORAGE_CONNECTION_STRING. For example:

# Use connection string
export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;AccountName=XXX;AccountKey=XXX;EndpointSuffix=core.windows.net"
juicefs format \
--storage wasb \
--bucket https://<container> \
... \
myjfs
note

For Azure users in China, the value of EndpointSuffix is core.chinacloudapi.cn.

Backblaze B2

To use Backblaze B2 as a data storage for JuiceFS, you need to create application key first. Application Key ID and Application Key corresponds to Access Key and Secret Key, respectively.

Backblaze B2 supports two access interfaces: the B2 native API and the S3-compatible API.

B2 native API

The storage type should be set to b2, and only the bucket name needs to be set in the option --bucket. For example:

juicefs format \
--storage b2 \
--bucket <bucket> \
--access-key <application-key-ID> \
--secret-key <application-key> \
... \
myjfs

S3-compatible API

The storage type should be set to s3, and the full bucket address in the option bucket needs to be specified. For example:

juicefs format \
--storage s3 \
--bucket https://s3.eu-central-003.backblazeb2.com/<bucket> \
--access-key <application-key-ID> \
--secret-key <application-key> \
... \
myjfs

IBM Cloud Object Storage

When creating JuiceFS file system using IBM Cloud Object Storage, you first need to create an API key and an instance ID. The "API key" and "instance ID" are the equivalent of access key and secret key, respectively.

IBM Cloud Object Storage provides multiple endpoints for each region, depending on your network (e.g. public or private). Thus, please choose an appropriate endpoint. For example:

juicefs format \
--storage ibmcos \
--bucket https://<bucket>.<endpoint> \
--access-key <API-key> \
--secret-key <instance-ID> \
... \
myjfs

Oracle Cloud Object Storage

Oracle Cloud Object Storage supports S3 compatible access. Please refer to official documentation for more information.

The endpoint format for this object storage is: ${namespace}.compat.objectstorage.${region}.oraclecloud.com, for example:

juicefs format \
--storage s3 \
--bucket https://<bucket>.<endpoint> \
--access-key <your-access-key> \
--secret-key <your-sceret-key> \
... \
myjfs

Scaleway Object Storage

Please follow this document to learn how to get access key and secret key.

The --bucket option format is https://<bucket>.s3.<region>.scw.cloud. Remember to replace <region> with specific region code, e.g. the region code of "Amsterdam, The Netherlands" is nl-ams. All available region codes can be found here. For example:

juicefs format \
--storage scw \
--bucket https://<bucket>.s3.<region>.scw.cloud \
... \
myjfs

DigitalOcean Spaces

Please follow this document to learn how to get access key and secret key.

The --bucket option format is https://<space-name>.<region>.digitaloceanspaces.com. Please replace <region> with specific region code, e.g. nyc3. All available region codes can be found here. For example:

juicefs format \
--storage space \
--bucket https://<space-name>.<region>.digitaloceanspaces.com \
... \
myjfs

Wasabi

Please follow this document to learn how to get access key and secret key.

The --bucket option format is https://<bucket>.s3.<region>.wasabisys.com, replace <region> with specific region code, e.g. the region code of US East 1 (N. Virginia) is us-east-1. All available region codes can be found here. For example:

juicefs format \
--storage wasabi \
--bucket https://<bucket>.s3.<region>.wasabisys.com \
... \
myjfs
note

For users in Tokyo (ap-northeast-1) region, please refer to this document to learn how to get appropriate endpoint URI.***

Telnyx

Prerequisites

Set up JuiceFS:

juicefs format \
--storage s3 \
--bucket https://<regional-endpoint>.telnyxstorage.com/<bucket> \
--access-key <api-key> \
--secret-key <api-key> \
... \
myjfs

Available regional endpoints are here.

Storj DCS

Please refer to this document to learn how to create access key and secret key.

Storj DCS is an S3-compatible storage, using s3 for option --storage. The setting format of the option --bucket is https://gateway.<region>.storjshare.io/<bucket>, and please replace <region> with the corresponding region code you need. There are currently three available regions: us1, ap1 and eu1. For example:

juicefs format \
--storage s3 \
--bucket https://gateway.<region>.storjshare.io/<bucket> \
--access-key <your-access-key> \
--secret-key <your-sceret-key> \
... \
myjfs
caution

Storj DCS ListObjects API is not fully S3 compatible (result list is not sorted), so some features of JuiceFS do not work. For example, juicefs gc, juicefs fsck, juicefs sync, juicefs destroy. And when using juicefs mount, you need to disable automatic-backup function by adding --backup-meta 0.

Vultr Object Storage

Vultr Object Storage is an S3-compatible storage, using s3 for --storage option. The format of the option --bucket is https://<bucket>.<region>.vultrobjects.com/. For example:

juicefs format \
--storage s3 \
--bucket https://<bucket>.ewr1.vultrobjects.com/ \
--access-key <your-access-key> \
--secret-key <your-sceret-key> \
... \
myjfs

Please find the access and secret keys for object storage in the customer portal.

Cloudflare R2

R2 is Cloudflare's object storage service and provides an S3-compatible API, so usage is the same as Amazon S3. Please refer to Documentation to learn how to create Access Key and Secret Key.

juicefs format \
--storage s3 \
--bucket https://<ACCOUNT_ID>.r2.cloudflarestorage.com/myjfs \
--access-key <your-access-key> \
--secret-key <your-sceret-key> \
... \
myjfs

For production, it is recommended to pass key information via the ACCESS_KEY and SECRET_KEY environment variables, e.g.

export ACCESS_KEY=<your-access-key>
export SECRET_KEY=<your-sceret-key>
juicefs format \
--storage s3 \
--bucket https://<ACCOUNT_ID>.r2.cloudflarestorage.com/myjfs \
... \
myjfs
caution

Cloudflare R2 ListObjects API is not fully S3 compatible (result list is not sorted), so some features of JuiceFS do not work. For example, juicefs gc, juicefs fsck, juicefs sync, juicefs destroy. And when using juicefs mount, you need to disable automatic-backup function by adding --backup-meta 0.

Bunny Storage

Bunny Storage offers a non-S3 compatible object storage with multiple performance tiers and many storage regions. It uses it uses a custom API.

This is not included by default, please build it with tag bunny

Usage

Create a Storage Zone and use the Zone Name with the Hostname of the Location seperated by a dot as Bucket name and the Write Password as Secret Key.

juicefs format \
--storage bunny \
--secret-key "write-password" \
--bucket "https://uk.storage.bunnycdn.com/myzone" \ # https://<Endpoint>/<Zonename>
myjfs

Alibaba Cloud OSS

Please follow this document to learn how to get access key and secret key. If you have already created RAM role and assigned it to a VM instance, you could omit the options --access-key and --secret-key.

Alibaba Cloud also supports using Security Token Service (STS) to authorize temporary access to OSS. If you wanna use STS, you should omit the options --access-key and --secret-key and set environment variables ALICLOUD_ACCESS_KEY_ID, ALICLOUD_ACCESS_KEY_SECRET and SECURITY_TOKENinstead, for example:

# Use Security Token Service (STS)
export ALICLOUD_ACCESS_KEY_ID=XXX
export ALICLOUD_ACCESS_KEY_SECRET=XXX
export SECURITY_TOKEN=XXX
juicefs format \
--storage oss \
--bucket https://<bucket>.<endpoint> \
... \
myjfs

OSS provides multiple endpoints for each region, depending on your network (e.g. public or internal network). Please choose an appropriate endpoint.

If you are creating a file system on AliCloud's server, you can specify the bucket name directly in the option --bucket. For example.

# Running within Alibaba Cloud
juicefs format \
--storage oss \
--bucket <bucket> \
... \
myjfs

Tencent Cloud COS

The naming rule of bucket in Tencent Cloud is <bucket>-<APPID>, so you must append APPID to the bucket name. Please follow this document to learn how to get APPID.

The full format of --bucket option is https://<bucket>-<APPID>.cos.<region>.myqcloud.com, and please replace <region> with specific region code. E.g. the region code of Shanghai is ap-shanghai. You could find all available region codes here. For example:

juicefs format \
--storage cos \
--bucket https://<bucket>-<APPID>.cos.<region>.myqcloud.com \
... \
myjfs

If you are creating a file system on Tencent Cloud's server, you can specify the bucket name directly in the option --bucket. For example.

# Running within Tencent Cloud
juicefs format \
--storage cos \
--bucket <bucket>-<APPID> \
... \
myjfs

Huawei Cloud OBS

Please follow this document to learn how to get access key and secret key.

The --bucket option format is https://<bucket>.obs.<region>.myhuaweicloud.com, and please replace <region> with specific region code. E.g. the region code of Beijing 1 is cn-north-1. You could find all available region codes here. For example:

juicefs format \
--storage obs \
--bucket https://<bucket>.obs.<region>.myhuaweicloud.com \
... \
myjfs

If you are creating a file system on Huawei Cloud's server, you can specify the bucket name directly in the option --bucket. For example,

# Running within Huawei Cloud
juicefs format \
--storage obs \
--bucket <bucket> \
... \
myjfs

Baidu Object Storage

Please follow this document to learn how to get access key and secret key.

The --bucket option format is https://<bucket>.<region>.bcebos.com, and please replace <region> with specific region code. E.g. the region code of Beijing is bj. You could find all available region codes here. For example:

juicefs format \
--storage bos \
--bucket https://<bucket>.<region>.bcebos.com \
... \
myjfs

If you are creating a file system on Baidu Cloud's server, you can specify the bucket name directly in the option --bucket. For example,

# Running within Baidu Cloud
juicefs format \
--storage bos \
--bucket <bucket> \
... \
myjfs

Volcano Engine TOS Added in v1.0.3

Please follow this document to learn how to get access key and secret key.

The TOS provides multiple endpoints for each region, depending on your network (e.g. public or internal). Please choose an appropriate endpoint. For example:

juicefs format \
--storage tos \
--bucket https://<bucket>.<endpoint> \
... \
myjfs

Kingsoft Cloud KS3

Please follow this document to learn how to get access key and secret key.

KS3 provides multiple endpoints for each region, depending on your network (e.g. public or internal). Please choose an appropriate endpoint. For example:

juicefs format \
--storage ks3 \
--bucket https://<bucket>.<endpoint> \
... \
myjfs

QingStor

Please follow this document to learn how to get access key and secret key.

The --bucket option format is https://<bucket>.<region>.qingstor.com, replace <region> with specific region code. E.g. the region code of Beijing 3-A is pek3a. You could find all available region codes here. For example:

juicefs format \
--storage qingstor \
--bucket https://<bucket>.<region>.qingstor.com \
... \
myjfs
note

The format of --bucket option for all QingStor compatible object storage services is http://<bucket>.<endpoint>.

Qiniu

Please follow this document to learn how to get access key and secret key.

The --bucket option format is https://<bucket>.s3-<region>.qiniucs.com, replace <region> with specific region code. E.g. the region code of China East is cn-east-1. You could find all available region codes here. For example:

juicefs format \
--storage qiniu \
--bucket https://<bucket>.s3-<region>.qiniucs.com \
... \
myjfs

Sina Cloud Storage

Please follow this document to learn how to get access key and secret key.

The --bucket option format is https://<bucket>.stor.sinaapp.com. For example:

juicefs format \
--storage scs \
--bucket https://<bucket>.stor.sinaapp.com \
... \
myjfs

CTYun OOS

Please follow this document to learn how to get access key and secret key.

The --bucket option format is https://<bucket>.<endpoint>, For example:

juicefs format \
--storage oos \
--bucket https://<bucket>.<endpoint> \
... \
myjfs

ECloud Object Storage

Please follow this document to learn how to get access key and secret key.

ECloud Object Storage provides multiple endpoints for each region, depending on your network (e.g. public or internal). Please choose an appropriate endpoint. For example:

juicefs format \
--storage eos \
--bucket https://<bucket>.<endpoint> \
... \
myjfs

JD Cloud OSS

Please follow this document to learn how to get access key and secret key.

The --bucket option format is https://<bucket>.<region>.jdcloud-oss.com,and please replace <region> with specific region code. You could find all available region codes here. For example:

juicefs format \
--storage s3 \
--bucket https://<bucket>.<region>.jdcloud-oss.com \
... \
myjfs

UCloud US3

Please follow this document to learn how to get access key and secret key.

US3 (formerly UFile) provides multiple endpoints for each region, depending on your network (e.g. public or internal). Please choose an appropriate endpoint. For example:

juicefs format \
--storage ufile \
--bucket https://<bucket>.<endpoint> \
... \
myjfs

Ceph RADOS

note

JuiceFS v1.0 uses go-ceph v0.4.0, which supports Ceph Luminous (v12.2.x) and above. JuiceFS v1.1 uses go-ceph v0.18.0, which supports Ceph Octopus (v15.2.x) and above. Make sure that JuiceFS matches your Ceph and librados version, see go-ceph.

The Ceph Storage Cluster has a messaging layer protocol that enables clients to interact with a Ceph Monitor and a Ceph OSD Daemon. The librados API enables you to interact with the two types of daemons:

JuiceFS supports the use of native Ceph APIs based on librados. You need to install librados library and build juicefs binary separately.

First, install a librados that matches the version of your Ceph installation, For example, if Ceph version is Octopus (v15.2.x), then it is recommended to use librados v15.2.x.

sudo apt-get install librados-dev

Then compile JuiceFS for Ceph (make sure you have Go 1.20+ and GCC 5.4+ installed):

make juicefs.ceph

When using with Ceph, the JuiceFS Client object storage related options are interpreted differently:

  • --bucket stands for the Ceph storage pool, the format is ceph://<pool-name>. A pool is a logical partition for storing objects. Create a pool before use.
  • --access-key stands for the Ceph cluster name, the default value is ceph.
  • --secret-key option is Ceph client user name, the default user name is client.admin.

In order to reach Ceph Monitor, librados reads Ceph configuration file by searching default locations and the first found will be used. The locations are:

  • CEPH_CONF environment variable
  • /etc/ceph/ceph.conf
  • ~/.ceph/config
  • ceph.conf in the current working directory

Since these additional Ceph configuration files are needed during the mount, CSI Driver users need to upload them to Kubernetes, and map to the mount pod.

To format a volume, run:

juicefs.ceph format \
--storage ceph \
--bucket ceph://<pool-name> \
--access-key <cluster-name> \
--secret-key <user-name> \
... \
myjfs

Ceph RGW

Ceph Object Gateway is an object storage interface built on top of librados to provide applications with a RESTful gateway to Ceph Storage Clusters. Ceph Object Gateway supports S3-compatible interface, so we could set --storage to s3 directly.

The --bucket option format is http://<bucket>.<endpoint> (virtual hosted-style). For example:

juicefs format \
--storage s3 \
--bucket http://<bucket>.<endpoint> \
... \
myjfs

Gluster

Gluster is a software defined distributed storage that can scale to several petabytes. JuiceFS communicates with Gluster via the libgfapi library, so it needs to be built separately before used.

First, install libgfapi (version 6.0 - 10.1, 10.4+ is not supported yet)

sudo apt-get install uuid-dev libglusterfs-dev glusterfs-common

Then compile JuiceFS supporting Gluster:

make juicefs.gluster

Now we can create a JuiceFS volume on Gluster:

juicefs format \
--storage gluster \
--bucket host1,host2,host3/gv0 \
... \
myjfs

The format of --bucket option is <host[,host...]>/<volume_name>. Please note the volume_name here is the name of Gluster volume, and has nothing to do with the name of JuiceFS volume.

Swift

OpenStack Swift is a distributed object storage system designed to scale from a single machine to thousands of servers. Swift is optimized for multi-tenancy and high concurrency. Swift is ideal for backups, web and mobile content, and any other unstructured data that can grow without bound.

The --bucket option format is http://<container>.<endpoint>. A container defines a namespace for objects.

Currently, JuiceFS only supports Swift V1 authentication.

The value of --access-key option is username. The value of --secret-key option is password. For example:

juicefs format \
--storage swift \
--bucket http://<container>.<endpoint> \
--access-key <username> \
--secret-key <password> \
... \
myjfs

MinIO

MinIO is an open source lightweight object storage, compatible with Amazon S3 API.

It is easy to run a MinIO instance locally using Docker. For example, the following command sets and maps port 9900 for the console with --console-address ":9900" and also maps the data path for the MinIO to the minio-data folder in the current directory, which can be modified if needed.

sudo docker run -d --name minio \
-p 9000:9000 \
-p 9900:9900 \
-e "MINIO_ROOT_USER=minioadmin" \
-e "MINIO_ROOT_PASSWORD=minioadmin" \
-v $PWD/minio-data:/data \
--restart unless-stopped \
minio/minio server /data --console-address ":9900"

After container is up and running, you can access:

The initial Access Key and Secret Key of the object storage are both minioadmin.

When using MinIO as data storage for JuiceFS, set the option --storage to minio.

juicefs format \
--storage minio \
--bucket http://127.0.0.1:9000/<bucket> \
--access-key minioadmin \
--secret-key minioadmin \
... \
myjfs
note
  1. Currently, JuiceFS only supports path-style MinIO URI addresses, e.g., http://127.0.0.1:9000/myjfs.
  2. The MINIO_REGION environment variable can be used to set the region of MinIO, if not set, the default is us-east-1.
  3. When using Multi-Node MinIO deployment, consider setting using a DNS address in the service endpoint, resolving to all MinIO Node IPs, as a simple load-balancer, e.g. http://minio.example.com:9000/myjfs

WebDAV

WebDAV is an extension of the Hypertext Transfer Protocol (HTTP) that facilitates collaborative editing and management of documents stored on the WWW server among users. From JuiceFS v0.15+, JuiceFS can use a storage that speaks WebDAV as a data storage.

You need to set --storage to webdav, and --bucket to the endpoint of WebDAV. If basic authorization is enabled, username and password should be provided as --access-key and --secret-key, for example:

juicefs format \
--storage webdav \
--bucket http://<endpoint>/ \
--access-key <username> \
--secret-key <password> \
... \
myjfs

HDFS

HDFS is the file system for Hadoop, which can be used as the object storage for JuiceFS.

When HDFS is used, --access-key can be used to specify the username, and hdfs is usually the default superuser. For example:

juicefs format \
--storage hdfs \
--bucket namenode1:8020 \
--access-key hdfs \
... \
myjfs

When --access-key is not specified on formatting, JuiceFS will use the current user of juicefs mount or Hadoop SDK to access HDFS. It will hang and fail with IO error eventually, if the current user don't have enough permission to read/write the blocks in HDFS.

JuiceFS will try to load configurations for HDFS client based on $HADOOP_CONF_DIR or $HADOOP_HOME. If an empty value is provided to --bucket, the default HDFS found in Hadoop configurations will be used.

bucket format:

  • [hdfs://]namenode:port[/path]

for HA cluster:

  • [hdfs://]namenode1:port,namenode2:port[/path]
  • [hdfs://]nameservice[/path]

For HDFS which enable Kerberos, KRB5KEYTAB and KRB5PRINCIPAL environment var can be used to set keytab and principal.

Apache Ozone

Apache Ozone is a scalable, redundant, and distributed object storage for Hadoop. It supports S3-compatible interface, so we could set --storage to s3 directly.

juicefs format \
--storage s3 \
--bucket http://<endpoint>/<bucket>\
--access-key <your-access-key> \
--secret-key <your-sceret-key> \
... \
myjfs

Redis

Redis can be used as both metadata storage for JuiceFS and as data storage, but when using Redis as a data storage, it is recommended not to store large-scale data.

Standalone

The --bucket option format is redis://<host>:<port>/<db>. The value of --access-key option is username. The value of --secret-key option is password. For example:

juicefs format \
--storage redis \
--bucket redis://<host>:<port>/<db> \
--access-key <username> \
--secret-key <password> \
... \
myjfs

Redis Sentinel

In Redis Sentinel mode, the format of the --bucket option is redis[s]://MASTER_NAME,SENTINEL_ADDR[,SENTINEL_ADDR]:SENTINEL_PORT[/DB]. Sentinel's password needs to be declared through the SENTINEL_PASSWORD_FOR_OBJ environment variable. For example:

export SENTINEL_PASSWORD_FOR_OBJ=sentinel_password
juicefs format \
--storage redis \
--bucket redis://masterName,1.2.3.4,1.2.5.6:26379/2 \
--access-key <username> \
--secret-key <password> \
... \
myjfs

Redis Cluster

In Redis Cluster mode, the format of --bucket option is redis[s]://ADDR:PORT,[ADDR:PORT],[ADDR:PORT]. For example:

juicefs format \
--storage redis \
--bucket redis://127.0.0.1:7000,127.0.0.1:7001,127.0.0.1:7002 \
--access-key <username> \
--secret-key <password> \
... \
myjfs

TiKV

TiKV is a highly scalable, low latency, and easy to use key-value database. It provides both raw and ACID-compliant transactional key-value API.

TiKV can be used as both metadata storage and data storage for JuiceFS.

note

It's recommended to use dedicated TiKV 5.0+ cluster as the data storage for JuiceFS.

The --bucket option format is <host>:<port>,<host>:<port>,<host>:<port>, and <host> is the address of Placement Driver (PD). The options --access-key and --secret-key have no effect and can be omitted. For example:

juicefs format \
--storage tikv \
--bucket "<host>:<port>,<host>:<port>,<host>:<port>" \
... \
myjfs
note

Don't use the same TiKV cluster for both metadata and data, because JuiceFS uses non-transactional protocol (RawKV) for objects and transactional protocol (TnxKV) for metadata. The TxnKV protocol has special encoding for keys, so they may overlap with keys even they has different prefixes. BTW, it's recommmended to enable Titan in TiKV for data cluster.

Set up TLS

If you need to enable TLS, you can set the TLS configuration item by adding the query parameter after the bucket URL. Currently supported configuration items:

NameValue
caCA root certificate, used to connect TiKV/PD with TLS
certcertificate file path, used to connect TiKV/PD with TLS
keyprivate key file path, used to connect TiKV/PD with TLS
verify-cnverify component caller's identity, reference link

For example:

juicefs format \
--storage tikv \
--bucket "<host>:<port>,<host>:<port>,<host>:<port>?ca=/path/to/ca.pem&cert=/path/to/tikv-server.pem&key=/path/to/tikv-server-key.pem&verify-cn=CN1,CN2" \
... \
myjfs

etcd

etcd is a small-scale key-value database with high availability and reliability, which can be used as both the metadata storage of JuiceFS and the data storage of JuiceFS.

etcd will limit a single request to no more than 1.5MB by default, you need to change the block size (--block-size option) of JuiceFS to 1MB or even lower.

The --bucket option needs to fill in the etcd address, the format is similar to <host1>:<port>,<host2>:<port>,<host3>:<port>. The --access-key and --secret-key options are filled with username and password, which can be omitted when etcd does not enable user authentication. E.g:

juicefs format \
--storage etcd \
--block-size 1024 \ # This option is very important
--bucket "<host1>:<port>,<host2>:<port>,<host3>:<port>/prefix" \
--access-key myname \
--secret-key mypass \
... \
myjfs

Set up TLS

If you need to enable TLS, you can set the TLS configuration item by adding the query parameter after the bucket URL. Currently supported configuration items:

NameValue
cacertCA root certificate
certcertificate file path
keyprivate key file path
server-namename of server
insecure-skip-verify1

For example:

juicefs format \
--storage etcd \
--bucket "<host>:<port>,<host>:<port>,<host>:<port>?cacert=/path/to/ca.pem&cert=/path/to/server.pem&key=/path/to/key.pem&server-name=etcd" \
... \
myjfs
note

The path to the certificate needs to be an absolute path, and make sure that all machines that need to mount can use this path to access them.

SQLite

SQLite is a small, fast, single-file, reliable, full-featured single-file SQL database engine widely used around the world.

When using SQLite as a data store, you only need to specify its absolute path.

juicefs format \
--storage sqlite3 \
--bucket /path/to/sqlite3.db \
... \
myjfs
note

Since SQLite is an embedded database, only the host where the database is located can access it, and cannot be used in multi-machine sharing scenarios. If a relative path is used when formatting, it will cause problems when mounting, please use an absolute path.

MySQL

MySQL is one of the popular open source relational databases, often used as the database of choice for web applications, both as a metadata engine for JuiceFS and for storing files data. MySQL-compatible MariaDB, TiDB, etc. can be used as data storage.

When using MySQL as a data storage, you need to create a database in advance and add the desired permissions, specify the access address through the --bucket option, specify the user name through the --access-key option, and specify the password through the --secret-key option. An example is as follows:

juicefs format \
--storage mysql \
--bucket (<host>:3306)/<database-name> \
--access-key <username> \
--secret-key <password> \
... \
myjfs

After the file system is created, JuiceFS creates a table named jfs_blob in the database to store the data.

note

Don't miss the parentheses () in the --bucket parameter.

PostgreSQL

PostgreSQL is a powerful open source relational database with a complete ecology and rich application scenarios. It can be used as both the metadata engine of JuiceFS and the data storage. Other databases compatible with the PostgreSQL protocol (such as CockroachDB, etc.) can also be used as data storage.

When creating a file system, you need to create a database and add the corresponding read and write permissions. Use the --bucket option to specify the address of the data, use the --access-key option to specify the username, and use the --secret-key option to specify the password. An example is as follows:

juicefs format \
--storage postgres \
--bucket <host>:<port>/<db>[?parameters] \
--access-key <username> \
--secret-key <password> \
... \
myjfs

After the file system is created, JuiceFS creates a table named jfs_blob in the database to store the data.

Troubleshooting

The JuiceFS client uses SSL encryption to connect to PostgreSQL by default. If the connection error pq: SSL is not enabled on the server indicates that the database does not have SSL enabled. You can enable SSL encryption for PostgreSQL according to your business scenario, or you can add the parameter sslmode=disable to the bucket URL to disable encryption verification.

Local disk

When creating JuiceFS storage, if no storage type is specified, the local disk will be used to store data by default. The default storage path for root user is /var/jfs, and ~/.juicefs/local is for ordinary users.

For example, using the local Redis database and local disk to create a JuiceFS storage named test:

juicefs format redis://localhost:6379/1 test

Local storage is usually only used to help users understand how JuiceFS works and to give users an experience on the basic features of JuiceFS. The created JuiceFS storage cannot be mounted by other clients within the network and can only be used on a single machine.

SFTP/SSH

SFTP - Secure File Transfer Protocol, It is not a type of storage. To be precise, JuiceFS reads and writes to disks on remote hosts via SFTP/SSH, thus allowing any SSH-enabled operating system to be used as a data storage for JuiceFS.

For example, the following command uses the SFTP protocol to connect to the remote server 192.168.1.11 and creates the myjfs/ folder in the $HOME directory of user tom as the data storage of JuiceFS.

juicefs format  \
--storage sftp \
--bucket 192.168.1.11:myjfs/ \
--access-key tom \
--secret-key 123456 \
...
redis://localhost:6379/1 myjfs

Notes

  • --bucket is used to set the server address and storage path in the format [sftp://]<IP/Domain>:[port]:<Path>. Note that the directory name should end with /, and the port number is optionally defaulted to 22, e.g. 192.168.1.11:22:myjfs/.
  • --access-key set the username of the remote server
  • --secret-key set the password of the remote server

NFS

NFS - Network File System, is a commonly used file-sharing service in Unix-like operating systems. It allows computers within a network to access remote files as if they were local files.

JuiceFS supports using NFS as the underlying storage to build a file system, offering two usage methods: local mount and direct mode.

Local Mount

JuiceFS v1.1 and earlier versions only support using NFS as underlying storage via local mount. This method requires mounting the directory on the NFS server locally first, and then using it as a local disk to create the JuiceFS file system.

For example, first mount the /srv/data directory from the remote NFS server 192.168.1.11 to the local /mnt/data directory, and then access it in file mode.

$ sudo mount -t nfs 192.168.1.11:/srv/data /mnt/data
$ sudo juicefs format \
--storage file \
--bucket /mnt/data \
... \
redis://localhost:6379/1 myjfs

From JuiceFS's perspective, the locally mounted NFS is still a local disk, so the --storage option is set to file.

Similarly, because the underlying storage can only be accessed on the mounted device, to share access across multiple devices, you need to mount the NFS share on each device separately, or provide external access through network-based methods such as WebDAV or S3 Gateway.

Direct Mode

JuiceFS v1.2 and later versions support using NFS as the underlying storage in direct mode. This method does not require pre-mounting the NFS directory locally but accesses the shared directory directly through the built-in NFS protocol in the JuiceFS client.

For example, the remote server's /etc/exports configuration file exports the following NFS share:

/srv/data    192.168.1.0/24(rw,sync,no_subtree_check)

You can directly use the JuiceFS client to connect to the /srv/data directory on the NFS server to create the file system:

$ sudo juicefs format  \
--storage nfs \
--bucket 192.168.1.11:/srv/data \
... \
redis://localhost:6379/1 myjfs

In direct mode, the --storage option is set to nfs, and the --bucket option is set to the NFS server address and shared directory. The JuiceFS client will directly connect to the directory on the NFS server to read and write data.

A few considerations:

  1. JuiceFS direct mode currently only supports the NFSv3 protocol.
  2. The JuiceFS client needs permission to access the NFS shared directory.
  3. NFS by default enables the root_squash feature, which maps root access to the NFS share to the nobody user by default. To avoid permission issues with NFS shares, you can set the owner of the shared directory to nobody:nogroup or configure the NFS share with the no_root_squash option to disable permission squashing.