JuiceFS Evaluation with AWS EFS and FSx for Lustre

2024-08-07
Brent Bai

In this article, we evaluate the features, performance, and cost of three popular file systems: JuiceFS, Amazon Elastic File System (EFS), and Amazon FSx for Lustre. The comparison is intended to help you make an informed decision for your storage needs.

Features

We cover POSIX compatibility in a separate blog post, POSIX Compatibility Comparison Among Four File Systems on the Cloud.

JuiceFS passed all 8,832 tests, FSx for Lustre failed 16 tests (about 99.8% passed), and EFS failed 1,895 tests (about 78.5% passed).

These results show that JuiceFS has the best POSIX compatibility of the three.

Moreover, JuiceFS supports POSIX ACLs, trash, subdirectory mounts, subdirectory quotas, transparent data compression, data encryption in transit and at rest, cross-region replication, and more.

Test environment

The tests were conducted on a c5.4xlarge instance (16 cores, 32 GB memory, 10 Gbps network) against three file systems: AWS EFS, JuiceFS Cloud Service in us-west-2, and AWS FSx for Lustre (1.2 TB, 1,200 MB/s). We ran a single basic assessment to establish a baseline benchmark. If you’re interested, we can run more rigorous tests.

JuiceFS client tests: Configured with 4 threads, covering both big files (1,024 MiB per file, read and written in 1 MiB blocks) and small files (100 files of 128 KiB each, read and written).
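To make the measurement concrete, here is a minimal Python sketch of a sequential write/read throughput test with the same shape as the configuration above. It is not the tool used for the published numbers. The function names are hypothetical and it runs in a single thread rather than 4. Reads issued right after writes may also be served from the local page cache, so treat its output as illustrative only.

```python
import os
import tempfile
import time

MIB = 1024 * 1024


def write_file(path, size_bytes, block_bytes):
    """Write size_bytes sequentially in block_bytes chunks, fsync, return seconds."""
    block = os.urandom(block_bytes)
    start = time.perf_counter()
    with open(path, "wb") as f:
        remaining = size_bytes
        while remaining > 0:
            chunk = block[:min(block_bytes, remaining)]
            f.write(chunk)
            remaining -= len(chunk)
        f.flush()
        os.fsync(f.fileno())
    return time.perf_counter() - start


def read_file(path, block_bytes):
    """Read the file sequentially in block_bytes chunks; return (bytes, seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            data = f.read(block_bytes)
            if not data:
                break
            total += len(data)
    return total, time.perf_counter() - start


def bench(directory, file_size, block_size, file_count):
    """Return (write MiB/s, read MiB/s) for file_count files of file_size bytes."""
    paths = [os.path.join(directory, f"bench_{i}.bin") for i in range(file_count)]
    write_secs = sum(write_file(p, file_size, block_size) for p in paths)
    read_bytes, read_secs = 0, 0.0
    for p in paths:
        n, s = read_file(p, block_size)
        read_bytes += n
        read_secs += s
    for p in paths:
        os.remove(p)  # clean up the test files
    written_mib = file_size * file_count / MIB
    # Guard against a zero elapsed time on very small inputs.
    return (written_mib / max(write_secs, 1e-9),
            (read_bytes / MIB) / max(read_secs, 1e-9))


if __name__ == "__main__":
    # Scaled-down run in a temporary directory so the sketch finishes quickly.
    with tempfile.TemporaryDirectory() as d:
        w, r = bench(d, 4 * MIB, MIB, 1)
        print(f"write {w:.1f} MiB/s, read {r:.1f} MiB/s")
```

To approximate the article's workloads, point `directory` at the mounted file system and call `bench(path, 1024 * MIB, MIB, 1)` for the big-file case or `bench(path, 128 * 1024, 128 * 1024, 100)` for the small-file case.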

Performance tests and use cases

The performance tests were designed to evaluate both large and small file operations in sequential read/write:

  • Big file sequential read/write: These tests are crucial for applications requiring consistent throughput, such as LLM data processing and training, gene sequencing, big data analytics, video streaming, and data backup.

  • Small file sequential read/write: These tests simulate workloads seen in computer vision data processing and model training, media processing, scientific computing, etc.

The table below compares the measured throughput of the three file systems:

Throughput (MiB/s)     EFS    JuiceFS    FSx for Lustre
Writing big files      475    1,116      594
Reading big files      568    1,016      590
Writing small files    29     8          297
Reading small files    104    160        274

  • Large file operations: JuiceFS is the fastest for both reading and writing large files.
  • Small file operations: FSx for Lustre performs best in reading and writing small files.
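The two observations above can be checked directly against the table. The short Python sketch below copies the published figures into a dictionary, picks the fastest system per workload, and expresses each result as a fraction of the best one. The helper names are hypothetical.

```python
# Throughput in MiB/s, copied from the table above.
results = {
    "Writing big files": {"EFS": 475, "JuiceFS": 1116, "FSx for Lustre": 594},
    "Reading big files": {"EFS": 568, "JuiceFS": 1016, "FSx for Lustre": 590},
    "Writing small files": {"EFS": 29, "JuiceFS": 8, "FSx for Lustre": 297},
    "Reading small files": {"EFS": 104, "JuiceFS": 160, "FSx for Lustre": 274},
}


def fastest(row):
    """Return the name of the file system with the highest throughput."""
    return max(row, key=row.get)


def relative_to_best(row):
    """Express each throughput as a fraction of the best result in the row."""
    best = max(row.values())
    return {name: round(value / best, 2) for name, value in row.items()}


if __name__ == "__main__":
    for workload, row in results.items():
        print(workload, "->", fastest(row), relative_to_best(row))
```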

Parameters to mount JuiceFS:

  • buffer-size=1024
  • max-upload=200
  • max-download=200

Cost comparison

Cost ($)        EFS                          JuiceFS                                             FSx for Lustre
Storage         0.30/GB-month                0.02/GB-month + S3 0.023/GB-month                   0.60/GB-month
Access write    0.06/GB; 0.10/GB cross AZ    Free same region + S3 API $0.005/1,000 requests     Free same AZ; 0.10/GB cross AZ
Access read     0.03/GB; 0.10/GB cross AZ    Free same region + S3 API $0.0004/1,000 requests    Free same AZ; 0.10/GB cross AZ

Pricing model and hidden cross-AZ fees

The pricing models for these file systems vary significantly. AWS EFS charges based on storage and access, making it relatively costly for extensive use. JuiceFS offers a lower storage cost and includes S3 charges, providing a cost-effective solution especially when data is accessed within the same region. FSx for Lustre, while delivering high performance, incurs the highest storage costs.

A crucial consideration is the hidden cross-AZ data transfer fees. For AWS EFS and FSx for Lustre, data transfer between availability zones can lead to unexpected costs. JuiceFS stores data in S3, which can be accessed free of transfer charges from within the same region, so it avoids this overhead.
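As a rough worked example, the Python sketch below turns the price table into a monthly estimate. It rests on several assumptions that are not stated in the article: the cross-AZ fee is added on top of the EFS per-GB access charge, JuiceFS issues one S3 request per 4 MiB object (the `object_mib` parameter), and 1 GB is treated as 1,024 MiB. The function name and parameters are hypothetical, and actual bills will differ.

```python
def monthly_cost(stored_gb, written_gb, read_gb,
                 cross_az_fraction=0.0, object_mib=4):
    """Estimate monthly cost in USD for each file system from the table prices.

    cross_az_fraction: share of read/write traffic that crosses AZs (assumption).
    object_mib: assumed size of each S3 object JuiceFS reads or writes.
    """
    cross_gb = (written_gb + read_gb) * cross_az_fraction

    # EFS: storage + per-GB access, plus cross-AZ transfer (assumed additive).
    efs = stored_gb * 0.30 + written_gb * 0.06 + read_gb * 0.03 + cross_gb * 0.10

    # FSx for Lustre: storage, free within one AZ, cross-AZ transfer otherwise.
    fsx = stored_gb * 0.60 + cross_gb * 0.10

    # JuiceFS: service fee + S3 storage, plus S3 API requests; no transfer fee
    # within the same region.
    put_requests = written_gb * 1024 / object_mib
    get_requests = read_gb * 1024 / object_mib
    juicefs = (stored_gb * (0.02 + 0.023)
               + put_requests / 1000 * 0.005
               + get_requests / 1000 * 0.0004)

    return {"EFS": efs, "JuiceFS": juicefs, "FSx for Lustre": fsx}


if __name__ == "__main__":
    # 1,000 GB stored, 1,000 GB written and 1,000 GB read, half of it cross-AZ.
    for name, cost in monthly_cost(1000, 1000, 1000, cross_az_fraction=0.5).items():
        print(f"{name}: ${cost:,.2f}")
```

Under these assumptions, storage dominates the JuiceFS estimate and the S3 request charges stay small, while any cross-AZ traffic adds directly to the EFS and FSx for Lustre totals.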

Conclusion

While JuiceFS is not always the fastest file system (FSx for Lustre led in the small-file tests), it’s exceptionally cost-effective for certain use cases. In the large-file sequential read/write tests, JuiceFS outperformed AWS EFS and matched or exceeded AWS FSx for Lustre at a fraction of the storage cost. Random read/write operations are generally inefficient on any network file system. For optimal performance in such cases, it’s advisable to utilize high-level clustering with leader election to localize operations and batch remote processes.

In summary, JuiceFS offers a compelling blend of high performance and cost efficiency, making it an excellent choice for applications with heavy sequential read/write workloads. With its lower costs and effective pricing model, JuiceFS stands out as a superior option for many use cases compared to AWS EFS and FSx for Lustre.

If you have any questions or would like to learn more, feel free to join JuiceFS discussions on GitHub and its community on Slack.

Author

Brent Bai
A Software Architect with 20 years of experience in designing and developing high-performance, large-scale systems
