Metrics
JuiceFS provides a Prometheus API for each file system. This page lists only the most commonly used metrics. For the full list, visit the JuiceFS console, navigate to the "Monitor" page, and click "Prometheus API".
Some metrics come with multiple sub-metrics, which are omitted in this section. For example, juicefs_fuse_ops_lookup and juicefs_fuse_ops_open are both covered by juicefs_fuse_ops_<name>.
Metrics introduced in this chapter all start with the juicefs_ prefix. The names differ in the on-prem Grafana data source, for example:
- juicefs_trash_size is called jfs_stat_trash_size
- juicefs_operationDuration is called mount_operationDuration
So if you have trouble finding a metric, strip the prefix or suffix and search again.
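The search tip above can be sketched in a few lines of Python. This is only an illustration: the helper names `search_term` and `find_candidates` are hypothetical, and the list of available names simply reuses the two on-prem examples given above.

```python
def search_term(metric: str, prefix: str = "juicefs_") -> str:
    """Return the metric name without the leading prefix, for use as a search keyword."""
    return metric[len(prefix):] if metric.startswith(prefix) else metric


def find_candidates(metric: str, available: list[str]) -> list[str]:
    """List the names in `available` that contain the stripped keyword."""
    keyword = search_term(metric)
    return [name for name in available if keyword in name]


# Names as they appear in the on-prem Grafana data source (taken from the examples above).
available = ["jfs_stat_trash_size", "mount_operationDuration"]

print(find_candidates("juicefs_trash_size", available))         # ['jfs_stat_trash_size']
print(find_candidates("juicefs_operationDuration", available))  # ['mount_operationDuration']
```

A substring match is used because the replacement prefix varies (jfs_stat_ in one case, mount_ in the other), so only the part after juicefs_ is a reliable keyword.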
File system
Labels
Name | Description |
---|---|
name | file system name |
Metrics
Name | Description | Unit |
---|---|---|
juicefs_size | size of files | byte |
juicefs_inodes | number of inodes | |
juicefs_trash_size | size of files in Trash | byte |
juicefs_trash_files | number of files in Trash | |
Client
Labels
Name | Description |
---|---|
name | file system name |
host | client hostname |
ip | client IP address |
mountpoint | path of mount point |
Metrics
Operating system
Name | Description | Unit |
---|---|---|
juicefs_uptime | uptime | second |
juicefs_cpuusage | overall CPU usage time | microsecond |
juicefs_memusage | current RSS memory usage | byte |
juicefs_heapSys | Total memory allocated to the client process, same as Sys | byte |
juicefs_heapInuse | Same as HeapInuse | byte |
juicefs_handles | Number of file handles (descriptors) | |
juicefs_threads | Number of threads in the client process, see ThreadCreateProfile | |
juicefs_goroutines | Number of goroutines in the client process | |
juicefs_gcPause | Same as PauseTotalNs | nanosecond |
Metadata service
Name | Description | Unit |
---|---|---|
juicefs_metaDuration | Duration of Metadata requests | microsecond |
juicefs_metaRequest | Number of Metadata requests | |
juicefs_meta | Sum of all meta metrics, not very useful | |
juicefs_meta_operations | Number of metadata operations | |
juicefs_meta_bytes_sent | Bytes sent for metadata requests | byte |
juicefs_meta_bytes_received | Bytes received for metadata requests | byte |
juicefs_meta_packets_sent | Packets sent for metadata requests | |
juicefs_meta_packets_received | Packets received for metadata requests | |
juicefs_meta_reconnects | Number of metadata service reconnects | |
juicefs_meta_usec_ping | Ping latency between client and metadata service | microsecond |
juicefs_meta_dircache | Number of operations on the client in-memory metadata cache | |
juicefs_meta_dircache_<name> | Number of operations on the client in-memory metadata cache, by operation type | |
juicefs_meta_dircache0_dirs | Number of directories in the client in-memory metadata cache | |
juicefs_meta_dircache0_inodes | Number of inodes in the client in-memory metadata cache | |
File access
Name | Description | Unit |
---|---|---|
juicefs_read_bytes | Total bytes read. Unlike juicefs_get_bytes, this is counted at the file system level | byte |
juicefs_write_bytes | Total bytes written. Unlike juicefs_put_bytes, this is counted at the file system level | byte |
juicefs_operations | Number of file operations | |
juicefs_operationDuration | File operation latency | |
juicefs_fuse_ops | Number of file operations, roughly the same as juicefs_operations, but counted by summing up all operation types | |
juicefs_fuse_ops_<name> | Number of file operations of a single type, e.g. getattr, lookup, open | |
juicefs_openfiles | Number of opened files | |
juicefs_bgjobs | Number of currently running background jobs | |
juicefs_bgjobs_compact | Number of currently running compaction in background jobs | |
juicefs_bgjobs_delete | Number of currently running Trash deletion in background jobs | |
juicefs_compacts | Number of successful compactions | |
juicefs_compact_bytes | Compaction traffic | byte |
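The relationship between juicefs_fuse_ops and its juicefs_fuse_ops_<name> sub-metrics, as described in the table above, can be sketched as follows. The sample values are made up for illustration, and `sum_submetrics` is a hypothetical helper, not part of any JuiceFS tooling.

```python
def sum_submetrics(samples: dict[str, float], family: str) -> float:
    """Sum every sample named `<family>_<name>`, excluding the family total itself."""
    prefix = family + "_"
    return sum(value for name, value in samples.items() if name.startswith(prefix))


# Made-up sample values, for illustration only.
samples = {
    "juicefs_fuse_ops": 30.0,
    "juicefs_fuse_ops_getattr": 12.0,
    "juicefs_fuse_ops_lookup": 10.0,
    "juicefs_fuse_ops_open": 8.0,
    "juicefs_operations": 30.0,
}

total = sum_submetrics(samples, "juicefs_fuse_ops")
print(total)  # 30.0
```

The prefix match here is deliberately naive: it treats every name starting with `<family>_` as a sub-metric, which is enough for a flat family such as juicefs_fuse_ops but may over-count for families with nested names.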
Buffer
Name | Description | Unit |
---|---|---|
juicefs_totalBufferUsed | Used buffer size | byte |
juicefs_readBufferUsed | Used read buffer size | byte |
Local cache
Name | Description | Unit |
---|---|---|
juicefs_blockcache_blocks | Number of cache blocks | |
juicefs_blockcache_bytes | Size of cache blocks | byte |
juicefs_blockcache_hits | Number of cache block hits | |
juicefs_blockcache_hitBytes | Bytes of cache block hits | byte |
juicefs_blockcache_miss | Number of cache block misses | |
juicefs_blockcache_missBytes | Size of cache block misses | byte |
juicefs_blockcache_evict | Number of cache block evictions | |
juicefs_blockcache_evictBytes | Size of cache block evictions | byte |
juicefs_blockcache_evictDur | Duration of cache block evictions | microsecond |
juicefs_blockcache_readDuration | Duration of cache block reads | microsecond |
juicefs_blockcache_write | Number of cache block writes | |
juicefs_blockcache_writeBytes | Size of cache block writes | byte |
juicefs_blockcache_writeDuration | Duration of cache block writes | microsecond |
juicefs_symlink_cache | Sum of symbolic link cache related metrics, not very useful | |
juicefs_symlink_cache_<name> | Metrics of different types of symbolic link cache operations | |
Distributed cache
Architecture and related practices are introduced in Distributed Cache.
In a cache group, both reads and writes have corresponding "sender" and "receiver" metrics. For example, when reading files through a cache group, member nodes not only receive data from other members, but also send their own cached data to other members.
Distributed cache metrics come in two contrasting pairs, corresponding to different scenarios:
- remotecache_get and remotecache_send: correspond to the scenario where the client reads data from the distributed cache
- remotecache_receive and remotecache_put: correspond to the scenario where the client writes data to the distributed cache (the --fill-group-cache option needs to be enabled), or the scenario of balancing data in the distributed cache.
Note that metrics in different pairs are not correlated. For example, remotecache_get and remotecache_put are not related in any way (don't be misled by the names).
No matter what the scenario is, each source and destination has its own metrics. Using a simple 2-node cache group as an example:
- The remotecache_sendBytes metric in node A represents the data sent by A for file read requests. Similarly, the remotecache_sendBytes metric in node B represents the data sent by B.
- The remotecache_getBytes metric in node B represents the data received by B for file read requests. Similarly, the remotecache_getBytes metric in node A represents the data received by A.
Therefore, under normal circumstances, if you sum the metrics of all cache group members (including --no-sharing clients), remotecache_sendBytes should be roughly equal to remotecache_getBytes. If remotecache_sendBytes is much larger than remotecache_getBytes, there are errors during data transfer. Check the remotecache_errors metric and the client logs for the cause.
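The sanity check described above can be sketched as a small Python function. Everything here is an assumption made for illustration: the per-node values are made up, `check_transfer_balance` is a hypothetical helper, and the 5% tolerance is an arbitrary stand-in for "roughly equal".

```python
def check_transfer_balance(send_bytes: dict[str, float],
                           get_bytes: dict[str, float],
                           tolerance: float = 0.05) -> bool:
    """Return True when total sent bytes exceed total received bytes by at most `tolerance`."""
    total_send = sum(send_bytes.values())  # remotecache_sendBytes summed over all members
    total_get = sum(get_bytes.values())    # remotecache_getBytes summed over all members
    if total_send == 0:
        return True  # nothing was sent, so nothing can be missing
    return (total_send - total_get) / total_send <= tolerance


# Made-up values for a 2-node cache group (bytes per node).
send = {"A": 100.0, "B": 50.0}
healthy_get = {"A": 49.0, "B": 99.0}
lossy_get = {"A": 30.0, "B": 60.0}

print(check_transfer_balance(send, healthy_get))  # True
print(check_transfer_balance(send, lossy_get))    # False
```

When the check returns False, the next step is the one given above: inspect remotecache_errors and the client logs.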
The metrics related to file reads in a cache group are mostly the same as those for writes.
Name | Description | Unit |
---|---|---|
juicefs_remotecache_errors | Number of all failed requests | |
juicefs_remotecache_get | Number of requests for cache group client to read data | |
juicefs_remotecache_getBytes | Size of cache group client read request | byte |
juicefs_remotecache_getDuration | Latency of cache group client read request | microsecond |
juicefs_remotecache_send | Number of requests for data sent by cache group client | |
juicefs_remotecache_sendBytes | Size of the data sent by the cache group client | byte |
juicefs_remotecache_sendDuration | Latency for cache group client to send data, including the latency of reading the local cache (juicefs_blockcache_readDuration). If the local cache misses, it also includes the object storage request time (juicefs_objectDuration_get). If the object storage responds slowly or is rate limited, this metric will increase. | microsecond |
juicefs_remotecache_put | Number of requests for cache group client to write data | |
juicefs_remotecache_putBytes | Size of cache group client write request | byte |
juicefs_remotecache_putDuration | Latency of cache group client write request | microsecond |
juicefs_remotecache_receive | Number of requests for cache group client to receive data | |
juicefs_remotecache_receiveBytes | Size of the data received by the cache group client | byte |
juicefs_remotecache_recvDuration | Latency for cache group client to receive data | microsecond |
Object storage
Name | Description | Unit |
---|---|---|
juicefs_object | Number of total object storage requests | |
juicefs_object_get | Number of GetObject requests | |
juicefs_object_put | Number of PutObject requests | |
juicefs_object_delete | Number of DeleteObject requests | |
juicefs_object_copy | Number of CopyObject requests | |
juicefs_object_head | Number of HeadObject requests | |
juicefs_object_list | Number of ListObjects requests | |
juicefs_objectDuration | Duration of all object storage requests | microsecond |
juicefs_objectDuration_get | Duration of GetObject requests | microsecond |
juicefs_objectDuration_getDelay | Wait time of GetObject requests, caused by client token speed control or by congestion from a low --max-downloads value | microsecond |
juicefs_objectDuration_put | Duration of PutObject requests | microsecond |
juicefs_objectDuration_putDelay | Wait time of PutObject requests, caused by client token speed control or by congestion from a low --max-uploads value | microsecond |
juicefs_objectDuration_delete | Duration of DeleteObject requests | microsecond |
juicefs_object_error | Number of failed object storage requests | |
juicefs_get_bytes | Size of GET requests | byte |
juicefs_put_bytes | Size of PUT requests | byte |