Background job
In JuiceFS Enterprise Edition, we use the phrase "background job" (also abbreviated as "bgjob") to refer to a series of tasks dispatched by the Metadata Service and executed in clients, including compaction, trash cleaning, and data replication. Keep in mind that "background job" should be distinguished from "asynchronous task": the former has developed its own special meaning in JuiceFS, while "asynchronous task" generally refers to any asynchronous execution process. For example, if client write cache is enabled, data is uploaded asynchronously by clients; this process is not a background job, but it is still an asynchronous task nonetheless.
All types of background jobs introduced in this chapter deal with object storage. To allow finer control over this process (e.g. disable bgjob for some clients, or limit compaction speed), you can modify client token settings from the Web Console to dynamically adjust bgjob settings for clients.
If background jobs are not explicitly disabled, clients will run all types of background jobs for the mounted file system (with strict data isolation between different file systems). This means:
- When the `--subdir` option is used to mount a sub-directory, the scope of background jobs is not affected: jobs from the entire file system are still dispatched to these clients;
- If clients are mounted with read-only tokens, background jobs will not run. Make sure there are clients mounted with read/write privileges, otherwise critical file system features will not work properly;
- Similarly, other mount options do not affect the scope of background jobs. This doesn't necessarily mean that mount options do not affect background jobs at all: clients with their options tuned for better performance will run jobs faster, hence the Metadata Service will dispatch more tasks to them accordingly.
Compaction
Compaction is the process of merging multiple slices into one, in order to avoid file fragmentation. Learn more in How JuiceFS Stores Files.
Every time a file is written, the Metadata Service checks its fragmentation status to see if compaction is needed, and dispatches background jobs according to pre-defined rules. After compaction completes in clients, multiple slices are merged into one, improving read efficiency. So if you notice unexplained network traffic when there is no explicit read or write in the JuiceFS Client, it is usually just compaction traffic and nothing to worry about.
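To make the merge step concrete, here is a simplified conceptual sketch in Go (the types and function below are purely illustrative, not the actual JuiceFS implementation): a chunk's current content is the result of overlaying its slices in write order, and compaction materializes that result as a single consolidated slice, which would then be uploaded to object storage and committed to the Metadata Service in place of the original slice list.

```go
package main

import "fmt"

// slice is an illustrative stand-in for one write to a chunk: later slices
// overwrite earlier ones where they overlap.
type slice struct {
	off  int    // offset of this slice within the chunk
	data []byte // data written by this slice
}

// compact overlays the slices in write order into one buffer, i.e. the single
// consolidated slice that replaces them after compaction.
func compact(chunkSize int, slices []slice) []byte {
	merged := make([]byte, chunkSize)
	for _, s := range slices { // later writes win over earlier ones
		copy(merged[s.off:], s.data)
	}
	return merged
}

func main() {
	// Three small overlapping writes to the same chunk...
	slices := []slice{
		{off: 0, data: []byte("aaaaaaaa")},
		{off: 4, data: []byte("bbbb")},
		{off: 8, data: []byte("cccc")},
	}
	// ...become one consolidated slice after compaction: "aaaabbbbcccc".
	fmt.Printf("%q\n", compact(12, slices))
}
```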
Compaction comes with the following design:
- Job dispatch favors the client that initiated the write. This is also considered a form of data locality: that client likely already has all the slices needed for compaction in its local cache, so this strategy helps ease read overhead;
- For every chunk, compaction is carried out by a single client at any given time; scheduling is controlled globally by the Metadata Service.
You can observe compaction in:
- The monitoring tab of the file system page in the Web Console: compaction traffic is indicated by the "compact" line under the object storage panel;
- On-prem The Meta info dashboard from Grafana: read the `slices:used` line under the Memory distribution panel. This value should be close to 0 under ideal circumstances. Note that a high `slices:used` value (from 100M to several GBs) does not necessarily indicate disaster: if you do not notice apparent performance issues on the client side, no fix is required and you can simply continue to use the file system normally (such situations may come with a high number of slices, but the set of files actually in use can still maintain an acceptable level of fragmentation, hence no impact on overall performance).
In different scenarios, compaction may run into different problems; continue reading for more.
Compaction and client write cache
Client write cache is itself a feature that you should use with caution. If write cache is enabled, make sure clients maintain an acceptable level of performance, so that staging data can be uploaded to object storage in time. If this isn't the case, a slow client with write cache enabled can cause serious trouble like read errors when running compaction: with write cache, metadata is committed before data finishes uploading, so the slices are merged into one and the file's metadata state is changed, but if the merged slice is uploaded too slowly, reads from other clients will hang indefinitely or even time out.
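A minimal simulation of this ordering hazard is sketched below (illustrative Go only, none of it is JuiceFS code): the "metadata commit" happens immediately, the object upload is artificially delayed as it would be on a slow write-cache client, and a reader that follows the new metadata cannot find the merged slice until the upload finally completes.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// objectStore stands in for the object storage bucket.
type objectStore struct {
	mu      sync.Mutex
	objects map[string][]byte
}

func (s *objectStore) put(key string, data []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.objects[key] = data
}

func (s *objectStore) get(key string) ([]byte, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	data, ok := s.objects[key]
	return data, ok
}

func main() {
	store := &objectStore{objects: map[string][]byte{}}
	mergedSlice := "chunk0_merged_slice"

	// Step 1: with write cache, the merged slice is committed to metadata
	// right away, while the actual upload is staged locally...
	fmt.Println("metadata committed, file now points to", mergedSlice)
	go func() {
		time.Sleep(3 * time.Second) // ...and uploaded much later by a slow client
		store.put(mergedSlice, []byte("merged data"))
		fmt.Println("upload finished")
	}()

	// Step 2: another client reads the file immediately. It follows the new
	// metadata but the object is not in storage yet, so in a real deployment
	// the read hangs or errors out until the upload completes.
	for {
		if _, ok := store.get(mergedSlice); ok {
			fmt.Println("read succeeded")
			return
		}
		fmt.Println("read failed: merged slice not uploaded yet, retrying...")
		time.Sleep(500 * time.Millisecond)
	}
}
```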
Slow compaction speed
If slices grow uncontrolled, file system performance quickly deteriorates and clients can even hang. If your troubleshooting concludes that this is precisely the issue, refer to the methods below for a fix:
- Background jobs are executed by clients; if the number of clients is simply not enough, or bgjob is disabled for their tokens by mistake, compaction will not run normally. Make sure there is a sufficient number of clients and that bgjob is turned on for them (any form of JuiceFS Client counts, including the Hadoop SDK and the S3 Gateway);
- As described above, compaction is carried out by first downloading the relevant slices to the local cache, consolidating them, and then uploading the result to object storage before the new state is committed to the Metadata Service. If your object storage service does not meet the required level of performance, or comes with bandwidth limitations, compaction may not run at the proper speed, leading to severe fragmentation;
- On-prem The Metadata Service also controls compaction scheduling and client task queues; consult our engineers;
- Certain write patterns intrinsically cause more slices, like a continuous flow of small appends (each followed by a `flush` call); see the sketch after this list. JuiceFS is constantly studying different application scenarios and improving its handling of special write patterns. If your application produces an abnormal level of fragmentation, contact our engineers to look into the problem.
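To illustrate the fragmenting pattern mentioned in the last item, here is a minimal sketch in Go (the path `/jfs/appends.log` is just a placeholder for a file inside a JuiceFS mount; nothing here is JuiceFS-specific API). Each tiny append is followed by a sync, which prevents the client from batching writes, so every append tends to be persisted as its own slice:

```go
package main

import (
	"log"
	"os"
)

func main() {
	// Placeholder path: point this at a file inside a JuiceFS mount.
	f, err := os.OpenFile("/jfs/appends.log", os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	record := []byte("one small record\n")
	for i := 0; i < 10000; i++ {
		if _, err := f.Write(record); err != nil {
			log.Fatal(err)
		}
		// Syncing after every tiny append forces each small write to be
		// persisted immediately, which tends to produce one slice per append.
		if err := f.Sync(); err != nil {
			log.Fatal(err)
		}
	}
}
```

Batching appends in the application, or syncing less frequently, generally produces fewer slices and thus less compaction work later.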