6 Essential Tips for JuiceFS Users

2023-11-23
Herald

As big data and artificial intelligence (AI) technologies continue to evolve, more enterprises, teams, and individuals are adopting JuiceFS, an open-source high-performance distributed file system designed for the cloud. This article compiles six practical tips to help you manage JuiceFS more efficiently:

  • Viewing mounted file systems
  • Streamlining management using bash scripts
  • Checking how many clients are mounted concurrently
  • Enabling/disabling the trash feature
  • Completely destroying a file system
  • Metadata backup and restoration

Viewing mounted file systems

Sometimes, you may have multiple JuiceFS file systems mounted on a single machine, or the same file system mounted with different options across multiple machines. In either case, it is easy to lose track of which machine mounts which file system and which tuning options are set. Here are a few convenient ways to find out, illustrated on a Linux system:

Method 1: Using the ps command

ps aux | grep juicefs

The output lists the file systems mounted in the background, along with the grep process itself:

herald     36290  0.2  0.1 800108 78848 ?        Sl   11:07   0:24 juicefs mount -d sqlite3:///home/herald/jfs/my.db /home/herald/jfs/mnt
herald     37190  1.3  0.1 3163100 106160 ?      Sl   11:11   2:12 juicefs mount -d badger:///home/herald/jfs/mydb /home/herald/jfs/mnt2
herald     68886  0.0  0.0 221812  2400 pts/0    S+   13:54   0:00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn --exclude-dir=.idea --exclude-dir=.tox juicefs
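The last line of the listing above is the grep process itself. A minimal sketch of a filter that leaves it out, assuming each input line starts with the PID followed by the full command line; the function names filter_juicefs_mounts and list_juicefs_mounts are illustrative and not part of JuiceFS:

```shell
# Keep only lines whose command is "juicefs mount ...".
# Input on stdin: one process per line, "PID COMMAND ARGS...".
filter_juicefs_mounts() {
  # $2 is the executable (bare name or full path), $3 the subcommand.
  awk '$2 ~ /(^|\/)juicefs$/ && $3 == "mount"'
}

# List the juicefs mount processes on this machine.
# "-o pid= -o args=" prints the PID and command line without a header row.
list_juicefs_mounts() {
  ps -e -o pid= -o args= | filter_juicefs_mounts
}
```

Matching on the command name and the mount subcommand, rather than on any line that contains juicefs, keeps grep and unrelated commands out of the result.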

Method 2: Using pgrep and cat commands

In Linux systems, you can find process information in the /proc file system and access it using the process identifier (PID) as the directory name. Use pgrep to find the PIDs of the juicefs mount processes:

pgrep juicefs

This will output the PIDs of juicefs mount processes, for example:

36290
37190

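Use cat /proc/PID/cmdline to print the command line of each process. The arguments in this file are separated by NUL bytes, so pipe the output through tr to make it readable, for example:

cat /proc/36290/cmdline | tr '\0' ' '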

It will output something similar to the following:

juicefs mount -d sqlite3:///home/herald/jfs/my.db /home/herald/jfs/mnt

Method 3: Using a bash script

I've integrated Method 2 into a bash script available on GitHub Gist:

# Download the bash script.
curl -LO https://gist.githubusercontent.com/yuhr123/4e7a09653e833a083dae87ba76b7d642/raw/d8de5350955aa33a3bfafc7cf3756c5f8f3fa04d/proc

# Grant script execution permissions.
chmod +x proc

# Run the script.
./proc juicefs

It will output something similar to the following:

PID: 36290, Command Line: juicefs mount -d sqlite3:///home/herald/jfs/my.db /home/herald/jfs/mnt
PID: 37190, Command Line: juicefs mount -d badger:///home/herald/jfs/mydb /home/herald/jfs/mnt2
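If you prefer not to download a script, the same idea can be sketched in a few lines. This is an illustrative equivalent under the assumption of a Linux /proc file system, not the content of the gist; the function names format_cmdline and show_procs are made up for this example:

```shell
# Print a /proc/PID/cmdline-style file with its NUL separators
# turned into spaces and the trailing space removed.
format_cmdline() {
  tr '\0' ' ' < "$1" | sed 's/ $//'
}

# Print "PID: <pid>, Command Line: <args>" for every process
# whose name matches the pattern given as the first argument.
show_procs() {
  local pid
  for pid in $(pgrep "$1"); do
    # A process may exit between pgrep and the read, so skip missing entries.
    [ -r "/proc/$pid/cmdline" ] || continue
    printf 'PID: %s, Command Line: %s\n' "$pid" "$(format_cmdline "/proc/$pid/cmdline")"
  done
}

# Example: show_procs juicefs
```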

Streamlining management using bash scripts

The JuiceFS client is operated from the command line. While it's not difficult to use, typing commands by hand can be cumbersome, especially for new users or for anyone who repeatedly adjusts mount options while tuning performance. Bash scripts can help you keep these commands organized.

Creating a file system using a script

For example, create a script named format-myjfs.sh that holds the command for creating a file system:

#!/bin/bash

juicefs format --storage s3 \
--bucket xxx \
--access-key xxx \
--secret-key xxx \
redis://xxx.xxx.xxx/1 \
myjfs

Run the script:

bash format-myjfs.sh

With this script, you can check at any time which bucket and database make up the file system. The disadvantage is that the script may contain the access keys of the object storage or the database in plain text, so if you manage your file system this way, you must store the script securely. You can pass sensitive information through environment variables instead, or use gpg to symmetrically encrypt the script after use.
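One way to keep secrets out of the script file is to read them from the environment at run time. The sketch below assumes the variable names JFS_BUCKET, JFS_ACCESS_KEY, JFS_SECRET_KEY, and JFS_META_URL, which are hypothetical names chosen for this example rather than anything JuiceFS defines; the juicefs format options are the same ones used in the script above:

```shell
# Return non-zero with a clear message if the named variable is unset or empty.
require_var() {
  local name=$1 value
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "missing environment variable: $name" >&2
    return 1
  fi
}

# Create the file system, taking every sensitive value from the environment.
format_myjfs() {
  require_var JFS_BUCKET && require_var JFS_ACCESS_KEY &&
    require_var JFS_SECRET_KEY && require_var JFS_META_URL || return 1
  juicefs format --storage s3 \
    --bucket "$JFS_BUCKET" \
    --access-key "$JFS_ACCESS_KEY" \
    --secret-key "$JFS_SECRET_KEY" \
    "$JFS_META_URL" \
    myjfs
}

# Example: export the four variables in your shell, then run format_myjfs.
```

The script itself no longer contains any credentials, so it can be shared or kept in version control. Note that values passed as command-line arguments are still visible to ps while the command runs.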

Managing file system mounting with a script

Mounting a file system is a frequent, day-to-day management task. For example, create a script named mount-myjfs.sh:

#!/bin/bash

juicefs mount \
--cache-dir /mnt/juicefs-cache \
--buffer-size 2048 \
--writeback \
--free-space-ratio 0.5 \
redis://xxx.xxx.xxx/1 \
/mnt/myjfs

Run the script:

bash mount-myjfs.sh

This script provides a more intuitive way to adjust mounting options.

Checking how many clients are mounted concurrently

A key feature of a cloud file system is that it can be mounted by multiple clients located on different networks at the same time. For example, if the same file system is mounted in a data center in Chicago and another data center in New York simultaneously, the servers in both places can read and write at the same time. JuiceFS' transaction mechanism ensures the consistency of written data. To view the currently mounted clients, use the status command:

juicefs status redis://192.168.1.80/1

The output, in JSON format, includes information about active sessions, such as software version, hostname, IP address, mount point, and process ID. For example:

{
  "Setting": {
    "Name": "myjfs",
    "UUID": "520ae432-f355-43d2-a445-020787f325f4",
    "Storage": "minio",
    "Bucket": "http://192.168.1.80:9123/myjfs",
    "AccessKey": "admin",
    "SecretKey": "removed",
    "BlockSize": 4096,
    "Compression": "none",
    "EncryptAlgo": "aes256gcm-rsa",
    "KeyEncrypted": true,
    "TrashDays": 1,
    "MetaVersion": 1,
    "MinClientVersion": "1.1.0-A",
    "DirStats": true
  },
  "Sessions": [
    {
      "Sid": 2,
      "Expire": "2023-10-27T09:08:09+08:00",
      "Version": "1.1.0+2023-09-04.08c4ae6",
      "HostName": "homelab",
      "IPAddrs": [
        "192.168.1.80"
      ],
      "MountPoint": "/home/herald/jfs/mnt3",
      "ProcessID": 173507
    },
    {
      "Sid": 4,
      "Expire": "2023-10-27T09:08:11+08:00",
      "Version": "1.1.0+2023-09-04.08c4ae6",
      "HostName": "HeralddeMacBook-Air.local",
      "IPAddrs": [
        "192.168.3.102"
      ],
      "MountPoint": "webdav",
      "ProcessID": 20746
    }
  ],
  "Statistic": {
    "UsedSpace": 4347064320,
    "AvailableSpace": 1125895559778304,
    "UsedInodes": 11,
    "AvailableInodes": 10485760
  }
}
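To answer the "how many clients" question directly, the session list can be summarized with standard text tools. This is a rough sketch that assumes the status output is pretty-printed with one field per line, as in the example above; count_sessions and list_sessions are illustrative names:

```shell
# Count active sessions in `juicefs status` output read from stdin.
# Each session object carries exactly one "Sid" field.
count_sessions() {
  grep -c '"Sid":'
}

# Print "hostname<TAB>mount point" for each session read from stdin.
list_sessions() {
  # Splitting on double quotes puts the key in $2 and a string value in $4.
  awk -F'"' '
    $2 == "HostName"   { host = $4 }
    $2 == "MountPoint" { print host "\t" $4 }
  '
}

# Example: juicefs status redis://192.168.1.80/1 | count_sessions
```

For the output shown above, this reports two sessions: one mounted at /home/herald/jfs/mnt3 on homelab and one WebDAV session on the MacBook. If jq is available on your machine, a proper JSON parser is the more robust choice.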

Enabling/disabling the trash feature

JuiceFS supports a trash feature as a safety mechanism against accidental deletions. By default, the trash feature is enabled, and deleted files are kept in the .trash directory for one day before they are permanently removed. When you run optimization tests that frequently create and delete temporary files, it's essential to disable the trash feature so that storage space is released promptly.

Use the config command to adjust the --trash-days option, which controls how many days the trash retains deleted files. Setting it to 0 disables the trash feature. For example:

# Set the trash to retain files for 7 days. 
juicefs config META-URL --trash-days=7  

# Disable the trash feature.  
juicefs config META-URL --trash-days=0
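To confirm the current setting, you can read the TrashDays field from the status output shown earlier. A small sketch, assuming the pretty-printed one-field-per-line format; trash_days is an illustrative helper name:

```shell
# Print the TrashDays value from `juicefs status` output read from stdin.
trash_days() {
  # Split on ":" and ","; adding 0 strips the surrounding whitespace.
  awk -F'[:,]' '/"TrashDays"/ { print $2 + 0 }'
}

# Example: juicefs status META-URL | trash_days
```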

Completely destroying a file system

For those new to a technology, understanding how to clean up and delete a file system is crucial. Destroying a JuiceFS file system, like creating one, involves a few confirmation steps:

  1. Use the status command to find the UUID of the file system to be deleted:

juicefs status redis://192.168.1.80/1

{
  "Setting": {
    "Name": "myjfs",
    "UUID": "520ae432-f355-43d2-a445-020787f325f4",
    "Storage": "minio",
    "Bucket": "http://192.168.1.80:9123/myjfs",
    ...

  2. Confirm that all clients have stopped using the file system, as active mounts prevent destruction.
  3. Execute the destroy command to destroy the file system:

juicefs destroy redis://192.168.1.80/1 520ae432-f355-43d2-a445-020787f325f4
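These steps can be chained in a small wrapper so that the UUID is never copied by hand. The following is a sketch only, assuming the status output format shown earlier; fs_uuid and safe_destroy are hypothetical helper names, and the script relies on juicefs destroy for the final confirmation:

```shell
# Extract the file system UUID from `juicefs status` output read from stdin.
fs_uuid() {
  awk -F'"' '$2 == "UUID" { print $4; exit }'
}

# Destroy the file system at the given metadata URL,
# but refuse to continue while any session is still listed.
safe_destroy() {
  local meta_url=$1 status
  status=$(juicefs status "$meta_url") || return 1
  if printf '%s\n' "$status" | grep -q '"Sid":'; then
    echo "active sessions found; unmount all clients first" >&2
    return 1
  fi
  juicefs destroy "$meta_url" "$(printf '%s\n' "$status" | fs_uuid)"
}

# Example: safe_destroy redis://192.168.1.80/1
```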

Metadata backup and restoration

JuiceFS stores data and metadata separately:

  • Data is stored in object stores in blocks.
  • Metadata, containing crucial information like file names, sizes, locations, and permissions, is stored in a separate database.

When you access files, you must first retrieve the metadata before you get the actual data. Metadata is crucial to any file system.

To ensure metadata safety, JuiceFS enables automatic hourly backups to the object storage bucket's meta directory. In case of metadata engine failure, you can download the latest backup and restore metadata using the load command. When you restore metadata, note that:

  • You can only restore the metadata to a new database.
  • You must reset the secret key of the object storage.

For example, suppose your file system was created on Redis database 1, which is now damaged, and you need to rebuild the metadata on database 2. Download the latest backup from the meta directory of the object storage, and then restore it as follows:

# Import metadata backup into a new database.
juicefs load redis://192.168.1.80/2 dump-2023-10-27-025129.json.gz

# Update object storage secret key.
juicefs config --secret-key xxx redis://192.168.1.80/2

Note: There is inevitably a time lag between automatic backup and the occurrence of a failure. It's impossible to recover new data created between the last backup and the occurrence of a failure.
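When the meta directory holds many backups, picking the newest one by eye is error-prone. A minimal sketch, assuming the backup file names follow the dump-YYYY-MM-DD-HHMMSS.json.gz pattern of the example above, so that alphabetical order matches chronological order; latest_backup is an illustrative name:

```shell
# Read file names from stdin (one per line) and print the newest backup.
latest_backup() {
  # Ignore anything that is not a dump file, then take the last name
  # in byte order, which is the most recent timestamp.
  grep '^dump-.*\.json\.gz$' | LC_ALL=C sort | tail -n 1
}

# Example: list the downloaded meta directory and pick the newest file.
# ls meta/ | latest_backup
```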

Such failures are rare, however. A more common requirement is to migrate metadata between different databases. This operation is also simple:

  1. Stop the reading and writing applications of the file system.
  2. Use the dump command to export the metadata.
  3. Use the load command to import it into the target database.

# Export metadata to the meta-dump.json file.
juicefs dump redis://192.168.1.80/1 meta-dump.json

# Import metadata into a new SQLite database.
juicefs load sqlite3://myjfs.db meta-dump.json

# Update the secret key of the object storage.
juicefs config --secret-key xxx sqlite3://myjfs.db

If you have any questions or would like to learn more details, feel free to join discussions about JuiceFS on GitHub and the JuiceFS community on Slack.
