Metadata Engines Benchmark

Conclusion first:

  • For pure metadata operations, MySQL costs about 2~4 times as much as Redis; TiKV performs similarly to MySQL, and in most cases costs a bit less; etcd costs about 1.5 times as much as TiKV.
  • For small I/O (~100 KiB) workloads, total time costs with MySQL are about 1~3 times those with Redis; TiKV and etcd perform similarly to MySQL.
  • For large I/O (~4 MiB) workloads, total time costs with different metadata engines show no significant difference (object storage becomes the bottleneck).
Note:

  1. By changing appendfsync from always to everysec, Redis gains a performance boost at the cost of a bit of data reliability; see the Redis persistence documentation for details.
  2. Both Redis and MySQL store only one replica locally, while TiKV and etcd store three replicas on three different hosts using the Raft protocol.

Details are provided below. Please note that all tests were run with the same object storage (for storing data), the same client hosts, and the same metadata hosts; only the metadata engines differ.
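
For reference, a JuiceFS volume is bound to its metadata engine by the URL given at format time, so switching engines only changes that URL. A minimal sketch (the bucket name and meta-host below are placeholders, not the actual test setup):

# Format a volume with Redis as the metadata engine
juicefs format --storage s3 --bucket https://<bucket>.s3.amazonaws.com redis://meta-host:6379/1 benchmark

# Other engines differ only in the metadata URL, e.g.
#   mysql://user:password@(meta-host:3306)/juicefs
#   tikv://meta-host:2379
#   etcd://meta-host:2379
juicefs mount redis://meta-host:6379/1 /mnt/jfs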

Environment

JuiceFS Version

1.0.0-dev+2022-04-07.50fc234e

Object Storage

Amazon S3

Client Hosts

  • Amazon c5.xlarge: 4 vCPUs, 8 GiB Memory, Up to 10 Gigabit Network
  • Ubuntu 18.04.4 LTS

Metadata Hosts

  • Amazon c5d.xlarge: 4 vCPUs, 8 GiB Memory, Up to 10 Gigabit Network, 100 GB SSD (local storage for metadata engines)
  • Ubuntu 20.04.1 LTS
  • The SSD is formatted as ext4 and mounted at /data

Metadata Engines

Redis

  • Version: 6.2.6
  • Configuration:
    • appendonly: yes
    • appendfsync: always or everysec
    • dir: /data/redis
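
The same settings can also be passed on the redis-server command line when launching (a sketch; networking and daemonization options omitted):

redis-server --appendonly yes --appendfsync always --dir /data/redis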

MySQL

  • Version: 8.0.25
  • /var/lib/mysql is bind mounted on /data/mysql
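
A bind mount like the following achieves this layout (a sketch, to be run before initializing MySQL):

mkdir -p /data/mysql
mount --bind /data/mysql /var/lib/mysql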

TiKV

  • Version: 5.4.0
  • Configuration:
    • deploy_dir: /data/tikv-deploy
    • data_dir: /data/tikv-data
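
deploy_dir and data_dir correspond to fields in a TiUP cluster topology file; a deployment sketch (the cluster name and topology file are placeholders):

tiup cluster deploy tikv-bench v5.4.0 topology.yaml --user root
tiup cluster start tikv-bench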

etcd

  • Version: 3.5.2
  • Configuration:
    • data-dir: /data/etcd
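
A single-node launch sketch with this data directory (the benchmark itself used a three-node cluster; meta-host is a placeholder):

etcd --data-dir /data/etcd --listen-client-urls http://0.0.0.0:2379 --advertise-client-urls http://meta-host:2379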

Tools

All the following tests are run for each metadata engine.

Golang Benchmark

Simple benchmarks within the source code: pkg/meta/benchmarks_test.go
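
These can be run with the standard Go benchmark tooling; a typical invocation (exact flags may vary) is:

go test -run='^$' -bench=. ./pkg/meta/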

JuiceFS Bench

JuiceFS provides a basic benchmark command (-p sets the number of concurrent threads):

./juicefs bench /mnt/jfs -p 4

mdtest

  • Version: mdtest-3.4.0+dev

Run parallel tests on 3 client nodes:

$ cat myhost
client1 slots=4
client2 slots=4
client3 slots=4

Test commands:

# metadata only
$ mpirun --use-hwthread-cpus --allow-run-as-root -np 12 --hostfile myhost --map-by slot /root/mdtest -b 3 -z 1 -I 100 -u -d /mnt/jfs

# 12000 * 100KiB files
$ mpirun --use-hwthread-cpus --allow-run-as-root -np 12 --hostfile myhost --map-by slot /root/mdtest -F -w 102400 -I 1000 -z 0 -u -d /mnt/jfs

fio

  • Version: fio-3.1

Test command:

fio --name=big-write --directory=/mnt/jfs --rw=write --refill_buffers --bs=4M --size=4G --numjobs=4 --end_fsync=1 --group_reporting

Results

Golang Benchmark

  • Shows time cost (µs/op). Smaller is better.
  • Numbers in parentheses are multiples of the Redis-Always cost (always and everysec are candidate values for the Redis appendfsync configuration).
  • Because metadata caching is enabled, all read results are below 1 µs and are not comparable for now.

|              | Redis-Always | Redis-Everysec | MySQL        | TiKV       | etcd         |
|--------------|--------------|----------------|--------------|------------|--------------|
| mkdir        | 600          | 471 (0.8)      | 2121 (3.5)   | 1614 (2.7) | 2203 (3.7)   |
| mvdir        | 878          | 756 (0.9)      | 3372 (3.8)   | 1854 (2.1) | 3000 (3.4)   |
| rmdir        | 785          | 673 (0.9)      | 3065 (3.9)   | 2097 (2.7) | 3634 (4.6)   |
| readdir_10   | 302          | 303 (1.0)      | 1011 (3.3)   | 1232 (4.1) | 2171 (7.2)   |
| readdir_1k   | 1668         | 1838 (1.1)     | 16824 (10.1) | 6682 (4.0) | 17470 (10.5) |
| mknod        | 584          | 498 (0.9)      | 2117 (3.6)   | 1561 (2.7) | 2232 (3.8)   |
| create       | 591          | 468 (0.8)      | 2120 (3.6)   | 1565 (2.6) | 2206 (3.7)   |
| rename       | 860          | 736 (0.9)      | 3391 (3.9)   | 1799 (2.1) | 2941 (3.4)   |
| unlink       | 709          | 580 (0.8)      | 3052 (4.3)   | 1881 (2.7) | 3080 (4.3)   |
| lookup       | 99           | 97 (1.0)       | 423 (4.3)    | 731 (7.4)  | 1286 (13.0)  |
| getattr      | 91           | 89 (1.0)       | 343 (3.8)    | 371 (4.1)  | 661 (7.3)    |
| setattr      | 501          | 357 (0.7)      | 1258 (2.5)   | 1358 (2.7) | 1480 (3.0)   |
| access       | 90           | 89 (1.0)       | 348 (3.9)    | 370 (4.1)  | 646 (7.2)    |
| setxattr     | 404          | 270 (0.7)      | 1152 (2.9)   | 1116 (2.8) | 757 (1.9)    |
| getxattr     | 91           | 89 (1.0)       | 298 (3.3)    | 365 (4.0)  | 655 (7.2)    |
| removexattr  | 219          | 95 (0.4)       | 882 (4.0)    | 1554 (7.1) | 1461 (6.7)   |
| listxattr_1  | 88           | 88 (1.0)       | 312 (3.5)    | 374 (4.2)  | 658 (7.5)    |
| listxattr_10 | 94           | 91 (1.0)       | 397 (4.2)    | 390 (4.1)  | 694 (7.4)    |
| link         | 605          | 461 (0.8)      | 2436 (4.0)   | 1627 (2.7) | 2237 (3.7)   |
| symlink      | 602          | 465 (0.8)      | 2394 (4.0)   | 1633 (2.7) | 2244 (3.7)   |
| write        | 613          | 371 (0.6)      | 2565 (4.2)   | 1905 (3.1) | 2350 (3.8)   |
| read_1       | 0            | 0 (0.0)        | 0 (0.0)      | 0 (0.0)    | 0 (0.0)      |
| read_100     | 0            | 0 (0.0)        | 0 (0.0)      | 0 (0.0)    | 0 (0.0)      |

JuiceFS Bench

|                  | Redis-Always     | Redis-Everysec   | MySQL           | TiKV            | etcd            |
|------------------|------------------|------------------|-----------------|-----------------|-----------------|
| Write big file   | 565.07 MiB/s     | 556.92 MiB/s     | 557.93 MiB/s    | 553.58 MiB/s    | 542.93 MiB/s    |
| Read big file    | 664.82 MiB/s     | 652.18 MiB/s     | 673.55 MiB/s    | 679.07 MiB/s    | 672.91 MiB/s    |
| Write small file | 102.30 files/s   | 105.80 files/s   | 87.20 files/s   | 95.00 files/s   | 95.75 files/s   |
| Read small file  | 2200.30 files/s  | 1894.45 files/s  | 1360.85 files/s | 1394.90 files/s | 1017.30 files/s |
| Stat file        | 11607.40 files/s | 15032.90 files/s | 5470.05 files/s | 3283.20 files/s | 2827.80 files/s |
| FUSE operation   | 0.41 ms/op       | 0.42 ms/op       | 0.46 ms/op      | 0.45 ms/op      | 0.42 ms/op      |
| Update meta      | 3.63 ms/op       | 3.19 ms/op       | 8.91 ms/op      | 7.04 ms/op      | 4.46 ms/op      |

mdtest

  • Shows rate (ops/sec). Bigger is better.
|                    | Redis-Always | Redis-Everysec | MySQL     | TiKV      | etcd     |
|--------------------|--------------|----------------|-----------|-----------|----------|
| EMPTY FILES        |              |                |           |           |          |
| Directory creation | 5322.061     | 10182.743      | 1622.571  | 3134.935  | 2316.622 |
| Directory stat     | 302016.015   | 261650.268     | 14359.378 | 22584.101 | 9186.274 |
| Directory removal  | 5268.779     | 10663.498      | 1299.126  | 2511.035  | 1668.792 |
| File creation      | 5277.414     | 10043.012      | 1647.383  | 3062.820  | 2305.468 |
| File stat          | 300142.547   | 349101.889     | 16166.343 | 22464.020 | 9334.466 |
| File read          | 45753.419    | 47342.346      | 13502.136 | 15163.344 | 9378.590 |
| File removal       | 4172.033     | 11076.660      | 1148.675  | 2316.635  | 1457.711 |
| Tree creation      | 80.353       | 84.677         | 43.656    | 41.390    | 59.275   |
| Tree removal       | 110.291      | 118.374        | 48.283    | 72.240    | 60.040   |
| SMALL FILES        |              |                |           |           |          |
| File creation      | 314.787      | 320.041        | 307.489   | 293.323   | 293.029  |
| File stat          | 57502.060    | 56546.511      | 14096.102 | 11517.863 | 7432.247 |
| File read          | 46251.763    | 47537.783      | 17030.913 | 14345.960 | 4890.890 |
| File removal       | 3615.253     | 7808.427       | 898.631   | 1884.315  | 1228.742 |
| Tree creation      | 53.523       | 51.871         | 25.276    | 36.511    | 24.960   |
| Tree removal       | 62.676       | 53.384         | 25.782    | 22.074    | 13.652   |

fio

|                 | Redis-Always | Redis-Everysec | MySQL     | TiKV      | etcd      |
|-----------------|--------------|----------------|-----------|-----------|-----------|
| Write bandwidth | 555 MiB/s    | 532 MiB/s      | 553 MiB/s | 537 MiB/s | 555 MiB/s |