Metadata Engines Benchmark

Conclusion first:

  • For pure metadata operations, MySQL costs about 2–4 times as much as Redis; TiKV performs similarly to MySQL, and in most cases costs a bit less; etcd costs about 1.5 times as much as TiKV.
  • For small I/O (~100 KiB) workloads, the total time with MySQL is about 1–3 times that with Redis; TiKV and etcd perform similarly to MySQL.
  • For large I/O (~4 MiB) workloads, total time shows no significant difference across metadata engines, because the object storage becomes the bottleneck.
note
  1. Changing appendfsync from always to everysec gives Redis a performance boost at the cost of some data reliability; see the Redis documentation on persistence for details.
  2. Both Redis and MySQL store only one replica locally, while TiKV and etcd store three replicas on three different hosts using the Raft protocol.

Details are provided below. Note that all tests were run with the same object storage (for file data), the same client hosts, and the same metadata hosts; only the metadata engine differs.

Environment

JuiceFS Version

1.1.0-beta1+2023-06-08.5ef17ba0

Object Storage

Amazon S3

Client Hosts

  • Amazon c5.xlarge: 4 vCPUs, 8 GiB Memory, Up to 10 Gigabit Network
  • Ubuntu 20.04.1 LTS

Metadata Hosts

  • Amazon c5d.xlarge: 4 vCPUs, 8 GiB Memory, Up to 10 Gigabit Network, 100 GB SSD (local storage for metadata engines)
  • Ubuntu 20.04.1 LTS
  • SSD is formatted as ext4 and mounted on /data

Metadata Engines

Redis

  • Version: 7.0.9
  • Configuration:
    • appendonly: yes
    • appendfsync: always or everysec
    • dir: /data/redis
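The settings above correspond to the following redis.conf lines (a sketch; everything else keeps its defaults):

```conf
# Persist with an append-only file stored on the SSD
appendonly yes
# Tested with both values: always (fsync on every write) and everysec (fsync once per second)
appendfsync always
dir /data/redis
```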

MySQL

  • Version: 8.0.25
  • /var/lib/mysql is bind mounted on /data/mysql
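One way to set up such a bind mount so that MySQL's data lands on the SSD is an /etc/fstab entry (a sketch; the MySQL service must be stopped and any existing files copied to /data/mysql first):

```conf
# /etc/fstab — bind /data/mysql (on the SSD) over MySQL's default data directory
/data/mysql  /var/lib/mysql  none  bind  0  0
```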

PostgreSQL

  • Version: 15.3
  • The data directory was changed to /data/pgdata
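This can be done either by initializing the cluster directly on the SSD (`initdb -D /data/pgdata`) or by pointing an existing installation at it in postgresql.conf (a sketch):

```conf
# postgresql.conf
data_directory = '/data/pgdata'
```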

TiKV

  • Version: 6.5.3
  • Configuration:
    • deploy_dir: /data/tikv-deploy
    • data_dir: /data/tikv-data
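A TiKV cluster of this shape is typically deployed with TiUP; the two directories above map to the `deploy_dir` and `data_dir` fields of the topology file (a sketch; the host IPs are placeholders, and a PD node is required alongside the TiKV nodes):

```yaml
# topology.yaml for `tiup cluster deploy`
global:
  user: "tidb"
  deploy_dir: "/data/tikv-deploy"
  data_dir: "/data/tikv-data"
pd_servers:
  - host: 10.0.0.1
tikv_servers:
  - host: 10.0.0.1
  - host: 10.0.0.2
  - host: 10.0.0.3
```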

etcd

  • Version: 3.3.25
  • Configuration:
    • data-dir: /data/etcd
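The data directory can be set either with the `--data-dir` flag or in an etcd configuration file (a sketch; the member name is a placeholder):

```yaml
# etcd.conf.yml
name: meta-1
data-dir: /data/etcd
```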

FoundationDB

  • Version: 6.3.23
  • Configuration:
    • data-dir: /data/fdb

Tools

All the following tests are run for each metadata engine.

Golang Benchmark

Simple benchmarks within the source code: pkg/meta/benchmarks_test.go

JuiceFS Bench

JuiceFS provides a basic benchmark command:

./juicefs bench /mnt/jfs -p 4

mdtest

  • Version: mdtest-3.3.0

Run parallel tests on 3 client nodes:

$ cat myhost
client1 slots=4
client2 slots=4
client3 slots=4

Test commands:

# metadata only
mpirun --use-hwthread-cpus --allow-run-as-root -np 12 --hostfile myhost --map-by slot /root/mdtest -b 3 -z 1 -I 100 -u -d /mnt/jfs

# 12000 * 100KiB files
mpirun --use-hwthread-cpus --allow-run-as-root -np 12 --hostfile myhost --map-by slot /root/mdtest -F -w 102400 -I 1000 -z 0 -u -d /mnt/jfs

fio

  • Version: fio-3.28

Test command:

fio --name=big-write --directory=/mnt/jfs --rw=write --refill_buffers --bs=4M --size=4G --numjobs=4 --end_fsync=1 --group_reporting

Results

Golang Benchmark

  • Shows time cost (µs/op). Smaller is better.
  • The number in parentheses is the ratio to the Redis-Always cost (always and everysec are the two tested values of the Redis appendfsync configuration).
  • Because the metadata cache is enabled, all read results are below 1 µs and are not meaningfully comparable for now.
|              | Redis-Always | Redis-Everysec | MySQL | PostgreSQL | TiKV | etcd | FoundationDB |
| ------------ | ------------ | -------------- | ----- | ---------- | ---- | ---- | ------------ |
| mkdir        | 558 | 468 (0.8) | 2042 (3.7) | 1076 (1.9) | 1237 (2.2) | 1916 (3.4) | 1842 (3.3) |
| mvdir        | 693 | 621 (0.9) | 2693 (3.9) | 1459 (2.1) | 1414 (2.0) | 2486 (3.6) | 1895 (2.7) |
| rmdir        | 717 | 648 (0.9) | 3050 (4.3) | 1697 (2.4) | 1641 (2.3) | 2980 (4.2) | 2088 (2.9) |
| readdir_10   | 280 | 288 (1.0) | 1350 (4.8) | 1098 (3.9) | 995 (3.6) | 1757 (6.3) | 1744 (6.2) |
| readdir_1k   | 1490 | 1547 (1.0) | 18779 (12.6) | 18414 (12.4) | 5834 (3.9) | 15809 (10.6) | 15276 (10.3) |
| mknod        | 562 | 464 (0.8) | 1547 (2.8) | 849 (1.5) | 1211 (2.2) | 1838 (3.3) | 1763 (3.1) |
| create       | 570 | 455 (0.8) | 1570 (2.8) | 844 (1.5) | 1209 (2.1) | 1849 (3.2) | 1761 (3.1) |
| rename       | 728 | 627 (0.9) | 2735 (3.8) | 1478 (2.0) | 1419 (1.9) | 2445 (3.4) | 1911 (2.6) |
| unlink       | 658 | 567 (0.9) | 2365 (3.6) | 1280 (1.9) | 1443 (2.2) | 2461 (3.7) | 1940 (2.9) |
| lookup       | 173 | 178 (1.0) | 557 (3.2) | 375 (2.2) | 608 (3.5) | 1054 (6.1) | 1029 (5.9) |
| getattr      | 87 | 86 (1.0) | 530 (6.1) | 350 (4.0) | 306 (3.5) | 536 (6.2) | 504 (5.8) |
| setattr      | 471 | 345 (0.7) | 1029 (2.2) | 571 (1.2) | 1001 (2.1) | 1279 (2.7) | 1596 (3.4) |
| access       | 87 | 89 (1.0) | 518 (6.0) | 356 (4.1) | 307 (3.5) | 534 (6.1) | 526 (6.0) |
| setxattr     | 393 | 262 (0.7) | 992 (2.5) | 534 (1.4) | 800 (2.0) | 717 (1.8) | 1300 (3.3) |
| getxattr     | 84 | 87 (1.0) | 494 (5.9) | 333 (4.0) | 303 (3.6) | 529 (6.3) | 511 (6.1) |
| removexattr  | 215 | 96 (0.4) | 697 (3.2) | 385 (1.8) | 1007 (4.7) | 1336 (6.2) | 1597 (7.4) |
| listxattr_1  | 85 | 87 (1.0) | 516 (6.1) | 342 (4.0) | 303 (3.6) | 531 (6.2) | 515 (6.1) |
| listxattr_10 | 87 | 91 (1.0) | 561 (6.4) | 383 (4.4) | 322 (3.7) | 565 (6.5) | 529 (6.1) |
| link         | 680 | 545 (0.8) | 2435 (3.6) | 1375 (2.0) | 1732 (2.5) | 3058 (4.5) | 2402 (3.5) |
| symlink      | 580 | 448 (0.8) | 1785 (3.1) | 954 (1.6) | 1224 (2.1) | 1897 (3.3) | 1764 (3.0) |
| newchunk     | 0 | 0 (0.0) | 1 (0.0) | 1 (0.0) | 1 (0.0) | 1 (0.0) | 2 (0.0) |
| write        | 553 | 369 (0.7) | 2352 (4.3) | 1183 (2.1) | 1573 (2.8) | 1788 (3.2) | 1747 (3.2) |
| read_1       | 0 | 0 (0.0) | 0 (0.0) | 0 (0.0) | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| read_100     | 0 | 0 (0.0) | 0 (0.0) | 0 (0.0) | 0 (0.0) | 0 (0.0) | 0 (0.0) |

JuiceFS Bench

|                  | Redis-Always | Redis-Everysec | MySQL | PostgreSQL | TiKV | etcd | FoundationDB |
| ---------------- | ------------ | -------------- | ----- | ---------- | ---- | ---- | ------------ |
| Write big file   | 730.84 MiB/s | 731.93 MiB/s | 729.00 MiB/s | 744.47 MiB/s | 730.01 MiB/s | 746.07 MiB/s | 744.70 MiB/s |
| Read big file    | 923.98 MiB/s | 892.99 MiB/s | 905.93 MiB/s | 895.88 MiB/s | 918.19 MiB/s | 939.63 MiB/s | 948.81 MiB/s |
| Write small file | 95.20 files/s | 109.10 files/s | 82.30 files/s | 86.40 files/s | 101.20 files/s | 95.80 files/s | 94.60 files/s |
| Read small file  | 1242.80 files/s | 937.30 files/s | 752.40 files/s | 1857.90 files/s | 681.50 files/s | 1229.10 files/s | 1301.40 files/s |
| Stat file        | 12313.80 files/s | 11989.50 files/s | 3583.10 files/s | 7845.80 files/s | 4211.20 files/s | 2836.60 files/s | 3400.00 files/s |
| FUSE operation   | 0.41 ms/op | 0.40 ms/op | 0.46 ms/op | 0.44 ms/op | 0.41 ms/op | 0.41 ms/op | 0.44 ms/op |
| Update meta      | 2.45 ms/op | 1.76 ms/op | 2.46 ms/op | 1.78 ms/op | 3.76 ms/op | 3.40 ms/op | 2.87 ms/op |

mdtest

  • Shows rate (ops/sec). Bigger is better.
|                    | Redis-Always | Redis-Everysec | MySQL | PostgreSQL | TiKV | etcd | FoundationDB |
| ------------------ | ------------ | -------------- | ----- | ---------- | ---- | ---- | ------------ |
| **EMPTY FILES**    |              |                |       |            |      |      |              |
| Directory creation | 4901.342 | 9990.029 | 1252.421 | 4091.934 | 4041.304 | 1910.768 | 3065.578 |
| Directory stat     | 289992.466 | 379692.576 | 9359.278 | 69384.097 | 49465.223 | 6500.178 | 17746.670 |
| Directory removal  | 5131.614 | 10356.293 | 902.077 | 1254.890 | 3210.518 | 1450.842 | 2460.604 |
| File creation      | 5472.628 | 9984.824 | 1326.613 | 4726.582 | 4053.610 | 1801.956 | 2908.526 |
| File stat          | 288951.216 | 253218.558 | 9135.571 | 233148.252 | 50432.658 | 6276.787 | 14939.411 |
| File read          | 64560.148 | 60861.397 | 8445.953 | 20013.027 | 18411.280 | 9094.627 | 11087.931 |
| File removal       | 6084.791 | 12221.083 | 1073.063 | 3961.855 | 3742.269 | 1648.734 | 2214.311 |
| Tree creation      | 80.121 | 83.546 | 34.420 | 61.937 | 77.875 | 56.299 | 74.982 |
| Tree removal       | 218.535 | 95.599 | 42.330 | 44.696 | 114.414 | 76.002 | 64.036 |
| **SMALL FILES**    |              |                |       |            |      |      |              |
| File creation      | 295.067 | 312.182 | 275.588 | 289.627 | 307.121 | 275.578 | 263.487 |
| File stat          | 54069.827 | 52800.108 | 8760.709 | 19841.728 | 14076.214 | 8214.318 | 10009.670 |
| File read          | 62341.568 | 57998.398 | 4639.571 | 19244.678 | 23376.733 | 5477.754 | 6533.787 |
| File removal       | 5615.018 | 11573.415 | 1061.600 | 3907.740 | 3411.663 | 1024.421 | 1750.613 |
| Tree creation      | 57.860 | 57.080 | 23.723 | 52.621 | 44.590 | 19.998 | 11.243 |
| Tree removal       | 96.756 | 65.279 | 23.227 | 19.511 | 27.616 | 17.868 | 10.571 |

fio

|                 | Redis-Always | Redis-Everysec | MySQL | PostgreSQL | TiKV | etcd | FoundationDB |
| --------------- | ------------ | -------------- | ----- | ---------- | ---- | ---- | ------------ |
| Write bandwidth | 729 MiB/s | 737 MiB/s | 736 MiB/s | 768 MiB/s | 731 MiB/s | 738 MiB/s | 745 MiB/s |