Today we are pleased to release JuiceFS v1.0, the first Long Term Support (LTS) release, which follows 18 months of continuous iteration and extensive validation in a large number of production environments. JuiceFS v1.0 is compatible with all previous releases, and existing deployments can be upgraded to it directly.
JuiceFS is a distributed file system designed for the cloud environment. It is compatible with the POSIX, HDFS, and S3 protocols. JuiceFS can be used as a PersistentVolume (PV) in Kubernetes through its CSI driver, and it is also widely applicable to big data, machine learning, and other scenarios that require shared file storage.
Production ready
Stable and reliable software cannot be achieved without a comprehensive testing system. The JuiceFS testing system covers daily basic testing, compatibility testing, third-party tool testing, and testing against practical application scenarios. Additional fault and stress testing is carried out before each version is released. Before this official release of v1.0, we also simulated switching the metadata engine from Redis to TiKV and then went on to write 10 billion small files to verify the scalability of the system.
Over the past year and a half, a large number of users have applied JuiceFS to their own scenarios, including artificial intelligence, big data, cloud native, data sharing, and backup and archiving. JuiceFS has improved steadily along the way, and it now runs in users' production environments, where it has withstood continuous tests of stability and performance.
We have learned that thousands of clusters are in continuous use, with the largest exceeding 10 PB of data and billions of files. These users come from a variety of industries, including internet, high-tech, telecom operators, life science, aerospace, meteorology, and remote sensing.
Quick tour of features
JuiceFS is the first fully pluggable distributed file system. Both metadata and data in JuiceFS can be stored in existing components, so it can adapt to the varied and changing environments and storage requirements of enterprises. Currently, it supports more than 10 metadata engines and more than 30 data storage engines. At the same time, JuiceFS is compatible with POSIX, HDFS, S3, WebDAV, and other protocols, which allows data to move freely between applications.
In addition to stability, JuiceFS has also improved data security in several respects:
- Data storage encryption, allowing file content to be encrypted before it is stored in object storage to prevent accidental data leakage
- Trash, protecting against accidental deletion
- Metadata import & export tools, which are convenient for backup and also allow migration between metadata engines
- Automatic backup
- Delayed data deletion, which together with metadata backups allows data to be rolled back after accidental updates
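As a rough illustration of the metadata import and export tools mentioned above, the sketch below builds the two commands that a migration between metadata engines would typically involve: juicefs dump against the source engine, then juicefs load against the destination. The helper name, the file name, and both addresses are placeholders chosen for this example, and the code only prints the commands rather than running them.

```python
import shlex


def metadata_migration_commands(src_meta_url, dst_meta_url, dump_file="meta-dump.json"):
    """Build the dump and load commands for moving metadata between engines.

    Nothing is executed here; the caller decides how and when to run them.
    """
    return [
        # Export the metadata held by the source engine to a file.
        ["juicefs", "dump", src_meta_url, dump_file],
        # Import that file into the destination engine.
        ["juicefs", "load", dst_meta_url, dump_file],
    ]


# Placeholder addresses for a Redis-to-TiKV migration (assumptions, not real hosts).
for cmd in metadata_migration_commands("redis://localhost:6379/1", "tikv://localhost:2379/jfs"):
    print(shlex.join(cmd))
```

In practice, writes to the file system would normally be stopped before the dump so that the exported file is consistent; that step is outside the scope of this sketch.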
Moreover, system observability has been continuously improved. JuiceFS:
- Provides abundant metrics for monitoring the running status of the system, which Prometheus can collect directly through an API and which can be visualized with the preset Grafana templates.
- Exposes system details through the client log and the access log.
- Provides a metadata index analysis tool, juicefs info.
- Provides a real-time profiling tool, juicefs profile.
- Provides a statistics monitoring tool, juicefs stats.
- Supports the Graphite protocol for collecting monitoring data from the Hadoop SDK.
- Provides built-in Pyroscope support for profiling.
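To give a feel for what consuming these metrics might look like outside Prometheus, here is a minimal sketch that parses a few lines in the Prometheus text exposition format. The sample text, the metric names, and the vol_name label are illustrative assumptions embedded in the code, not output captured from a real client, and the parser deliberately ignores optional timestamps and other parts of the format.

```python
# Illustrative sample only; real metric names and labels may differ.
SAMPLE_METRICS = """\
# HELP juicefs_used_space Total used space in bytes.
# TYPE juicefs_used_space gauge
juicefs_used_space{vol_name="demo"} 1048576
juicefs_used_inodes{vol_name="demo"} 42
"""


def parse_metrics(text):
    """Map each series (name plus labels) to its value as a float.

    Assumes one sample per line with no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE comments.
        if not line or line.startswith("#"):
            continue
        # The value is whatever follows the last space on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


for series, value in parse_metrics(SAMPLE_METRICS).items():
    print(f"{series} = {value}")
```

For anything beyond a quick check, an existing Prometheus client library is a safer choice than a hand-written parser.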
JuiceFS also provides an abundance of management tools:
- juicefs sync copies data between two storage systems; it is comparable to a high-performance rsync/DistCp and supports various access protocols.
- juicefs warmup warms up data for the specified path, improving read performance.
- juicefs rmr rapidly removes the specified directory.
- juicefs config modifies the file system configuration online.
- juicefs fsck checks the integrity of the file system to find potentially damaged files.
- juicefs gc collects leaked objects.
- juicefs bench runs basic benchmarks.
- juicefs objbench tests access to the object storage and runs basic benchmarks against it.
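Two of the tools above are often used together: juicefs sync to bring data into a mounted file system, followed by juicefs warmup to preload it for reads. The sketch below only assembles those two commands for inspection. The helper name and both paths are hypothetical, and it assumes the file system is already mounted at the destination path.

```python
import shlex


def ingest_plan(source_dir, jfs_dir):
    """Return the commands for copying a directory into JuiceFS and warming it up.

    The commands are returned as argument lists and are not executed.
    """
    return [
        # Copy the source directory into the mounted file system.
        ["juicefs", "sync", source_dir, jfs_dir],
        # Warm up the data under the destination path to speed up later reads.
        ["juicefs", "warmup", jfs_dir],
    ]


# Hypothetical paths: a local dataset and a directory under a JuiceFS mount point.
for cmd in ingest_plan("/data/dataset/", "/mnt/jfs/dataset/"):
    print(shlex.join(cmd))
```

Keeping the commands as argument lists avoids shell quoting problems if they are later passed to a process runner.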
Embrace open source
JuiceFS was first released as a cloud service in 2017. After three years of continuous polishing and stable operation, we released JuiceFS Community Edition, built on a more pluggable architecture, on January 11, 2021, so that more developers could experience the convenience of the product. Since then it has been iterated continuously at a rate of one beta release per month.
In response to the concerns of some community users about AGPLv3, we changed the license from AGPLv3 to Apache 2.0 in January 2022. This enables users to apply JuiceFS in various commercial environments with more confidence and allows them to modify it according to their own needs. It also makes further integration with upstream and downstream applications easier; for example, Fluid and PaddlePaddle Operator have integrated JuiceFS into their applications.
Open source software cannot be developed without the joint efforts of its community, including everyone who has submitted issues, contributed PRs, shared articles, or answered questions. We would like to express our sincere gratitude to all of them!
Future plan
As JuiceFS v1.0 is the first Long Term Support (LTS) version, we will maintain it continuously for 24 months, after which new LTS versions will be available to upgrade to.
The following features will be implemented in future versions (feedback is welcome):
- Support FoundationDB as the metadata engine
- Directory quota
- User and group quotas
- POSIX ACLs
- Snapshot
- WORM (Write Once Read Many)
Backward compatibility has always been maintained throughout the continuous iteration of JuiceFS. We hope that users can adopt new improvements quickly, so future versions will also be compatible with v1.0 and will provide a smooth upgrade path. At the same time, JuiceFS v1.0 has made some forward compatibility preparations for future versions.