JuiceFS S3 Gateway

JuiceFS splits files and uploads them to the underlying object storage, and applications usually access them through the exposed POSIX API. If you need to access JuiceFS files through an S3-compatible API instead, JuiceFS S3 Gateway provides a solution. Here is its architecture:

JuiceFS S3 Gateway architecture

S3 Gateway is usually used to:

  • Expose the S3 API for a JuiceFS file system, enabling applications to access JuiceFS via the S3 SDK.
  • Use tools like s3cmd, AWS CLI, and MinIO Client to access and modify files stored in JuiceFS.
  • When transferring data across regions, use S3 Gateway as a unified data export endpoint. This eliminates metadata latency and improves performance. See Sync across regions using S3 Gateway.

Deploy S3 Gateway

Similar to the mount command, gateway reads the local JuiceFS Client configuration file (~/.juicefs/$VOL_NAME.conf by default). If this configuration does not exist, you need to authenticate via the Web Console using the auth command to generate it:

juicefs auth $VOL_NAME --token=xxx --access-key=xxx --secret-key=xxx

S3 Gateway is implemented using the open source MinIO code, so you have to provide some MinIO related variables:

export MINIO_ROOT_USER="admin"
export MINIO_ROOT_PASSWORD="12345678"

Without these variables, you will encounter errors like MINIO_ROOT_USER should be specified as an environment variable with at least 3 characters.

With the credential variables in place, run the gateway command to launch the gateway:

juicefs gateway myjfs 127.0.0.1:8888

Expected output:

2023/03/21 20:15:49.945403 juicefs[97188] <INFO>: connected to 47.103.20.252:9308 [client.go:874]
2023/03/21 20:15:49.965411 juicefs[97188] <INFO>: Cache: /Users/herald/.juicefs/cache/jfs8 capacity: 102400 MB [disk_cache.go:747]
Endpoint: http://127.0.0.1:8888

Browser Access:
http://127.0.0.1:8888

Object API (Amazon S3 compatible):
Go: https://docs.min.io/docs/golang-client-quickstart-guide
Java: https://docs.min.io/docs/java-client-quickstart-guide
Python: https://docs.min.io/docs/python-client-quickstart-guide
JavaScript: https://docs.min.io/docs/javascript-client-quickstart-guide
.NET: https://docs.min.io/docs/dotnet-client-quickstart-guide

Assuming your server's public IP is 111.2.3.4 and you want to make the gateway accessible over the internet on port 9000, adjust the startup command like this:

juicefs gateway myjfs 111.2.3.4:9000

Access JuiceFS S3 Gateway

You can access JuiceFS S3 Gateway using various S3 API-supported clients, desktop applications, and web applications. Ensure the correct address and port are used.

Note

The following examples assume that JuiceFS S3 Gateway is running on the local host and being accessed by third-party clients. Adjust the gateway's address according to your specific scenario.

Use the AWS CLI

Download and install the AWS Command Line Interface (AWS CLI) from the AWS website.

Configure it:

$ aws configure
AWS Access Key ID [None]: admin
AWS Secret Access Key [None]: 12345678
Default region name [None]:
Default output format [None]:

The program guides you interactively to add new configurations. Use the same values for Access Key ID as MINIO_ROOT_USER and Secret Access Key as MINIO_ROOT_PASSWORD. Leave the region name and output format blank.

Now you can use the aws s3 command to access JuiceFS storage, for example:

# List buckets
$ aws --endpoint-url http://localhost:9000 s3 ls

# List objects in bucket
$ aws --endpoint-url http://localhost:9000 s3 ls s3://<bucket>

Use the MinIO Client

To avoid compatibility issues, we recommend using the RELEASE.2021-04-22T17-40-00Z version of the MinIO Client (mc). You can find historical versions with different architectures of mc at this address. For example, for the amd64 architecture, you can download the RELEASE.2021-04-22T17-40-00Z version of mc from this link.

After installing mc, add a new alias:

mc alias set juicefs http://localhost:9000 admin 12345678

Then, you can freely copy, move, add, and delete files and folders between the local disk, JuiceFS storage, and other cloud storage services using the mc client.

$ mc ls juicefs/jfs
[2021-10-20 11:59:00 CST] 130KiB avatar-2191932_1920.png
[2021-10-20 11:59:00 CST] 4.9KiB box-1297327.svg
[2021-10-20 11:59:00 CST] 21KiB cloud-4273197.svg
[2021-10-20 11:59:05 CST] 17KiB hero.svg
[2021-10-20 11:59:06 CST] 1.7MiB hugo-rocha-qFpnvZ_j9HU-unsplash.jpg
[2021-10-20 11:59:06 CST] 16KiB man-1352025.svg
[2021-10-20 11:59:06 CST] 1.3MiB man-1459246.ai
[2021-10-20 11:59:08 CST] 19KiB sign-up-accent-left.07ab168.svg
[2021-10-20 11:59:10 CST] 11MiB work-4997565.svg

Common features

Multi-bucket support

By default, JuiceFS S3 Gateway only allows one bucket. The bucket name is the file system name. If you need multiple buckets, you can add --multi-buckets at startup to enable multi-bucket support. This parameter exports each subdirectory under the top-level directory of the JuiceFS file system as a separate bucket. Creating a bucket means creating a subdirectory with the same name at the top level of the file system.

juicefs gateway myjfs localhost:9000 --multi-buckets
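To make the bucket mapping concrete, here is a minimal Python sketch of how a bucket and object key could translate into a path inside the file system in both modes. The function name object_to_path is hypothetical, and the idea that object keys map directly to paths relative to the file system root is an assumption made for illustration; this is not the gateway's actual code.

```python
from pathlib import PurePosixPath


def object_to_path(fs_name: str, bucket: str, key: str, multi_buckets: bool = False) -> str:
    """Return the path (relative to the file system root) an object would map to.

    Illustration only: assumes object keys map directly to file paths.
    """
    if multi_buckets:
        # With --multi-buckets, each top-level directory is exported as a bucket.
        return str(PurePosixPath(bucket) / key)
    # Default mode: the only bucket is named after the file system itself.
    if bucket != fs_name:
        raise ValueError(f"unknown bucket {bucket!r}; only {fs_name!r} is exported")
    return str(PurePosixPath(key))


# Default mode: bucket "myjfs" is the file system root.
print(object_to_path("myjfs", "myjfs", "photos/cat.jpg"))
# Multi-bucket mode: bucket "photos" is the top-level directory "photos".
print(object_to_path("myjfs", "photos", "cat.jpg", multi_buckets=True))
```

Under these assumptions, both calls refer to the same file, photos/cat.jpg, which is why creating a bucket in multi-bucket mode amounts to creating a top-level directory.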

Retain ETags

By default, JuiceFS S3 Gateway does not save or return object ETag information. You can enable this with --keep-etag.

Enable object tags

Object tags are not supported by default, but you can use --object-tag to enable them.

Enable virtual host-style requests

By default, JuiceFS S3 Gateway supports path-style requests in the format of http://mydomain.com/bucket/object. The MINIO_DOMAIN environment variable is used to enable virtual host-style requests. If the request's Host header information matches (.+).mydomain.com, the matched pattern $1 is used as the bucket, and the path is used as the object.

For example:

export MINIO_DOMAIN=mydomain.com
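The matching rule can be sketched in Python as follows. This is an illustration of the rule described above, not the gateway's implementation: the function name split_request is hypothetical, and stripping an optional port from the Host header is an assumption.

```python
import re


def split_request(host: str, path: str, domain: str) -> tuple[str, str]:
    """Return (bucket, object_key) for a request, given the MINIO_DOMAIN value."""
    # Assumption: an optional ":port" suffix in the Host header is ignored.
    host = host.split(":", 1)[0]
    match = re.fullmatch(r"(.+)\." + re.escape(domain), host)
    if match:
        # Virtual host-style: the bucket comes from the Host header,
        # and the whole path is the object key.
        return match.group(1), path.lstrip("/")
    # Path-style: the first path segment is the bucket.
    bucket, _, key = path.lstrip("/").partition("/")
    return bucket, key


print(split_request("images.mydomain.com", "/cats/1.jpg", "mydomain.com"))
print(split_request("mydomain.com", "/images/cats/1.jpg", "mydomain.com"))
```

Both requests resolve to the bucket images and the object cats/1.jpg; only the place where the bucket name is carried differs.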

Adjust the IAM refresh interval

The default refresh interval for Identity and Access Management (IAM) caching is 5 minutes. You can adjust this using --refresh-iam-interval. The value of this parameter is a time string with a unit, such as "300ms", "-1.5h", or "2h45m." Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", and "h".

For example, to set a refresh interval of 1 minute:

juicefs gateway xxxx xxxx --refresh-iam-interval 1m
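If you generate this value from a script, it can help to check the string before passing it to the gateway. The following Python sketch parses the format described above (a sequence of numbers, each followed by a unit) into nanoseconds. The function parse_interval is hypothetical and only mirrors the documented format; the gateway does its own parsing.

```python
import re
from fractions import Fraction

# Nanoseconds per unit, for the units listed above.
UNIT_NS = {
    "ns": 1,
    "us": 1_000,
    "µs": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "m": 60 * 1_000_000_000,
    "h": 3600 * 1_000_000_000,
}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")


def parse_interval(text: str) -> int:
    """Convert a string such as "2h45m" into a whole number of nanoseconds."""
    sign = -1 if text.startswith("-") else 1
    body = text.lstrip("+-")
    total = Fraction(0)
    pos = 0
    while pos < len(body):
        match = _PART.match(body, pos)
        if match is None:
            raise ValueError(f"invalid interval: {text!r}")
        # Fraction keeps values like "1.5" exact before scaling by the unit.
        total += Fraction(match.group(1)) * UNIT_NS[match.group(2)]
        pos = match.end()
    if pos == 0:
        raise ValueError(f"invalid interval: {text!r}")
    return sign * int(total)


print(parse_interval("1m") // 1_000_000_000, "seconds")
print(parse_interval("2h45m") // 1_000_000_000, "seconds")
```

A negative value such as "-1.5h" is syntactically valid in this format, but a positive value is the sensible choice for a refresh interval.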

Multiple gateway instances

JuiceFS' distributed architecture allows you to run multiple JuiceFS S3 Gateway instances across different nodes simultaneously. This enhances both the availability and performance of the S3 Gateways. Each S3 Gateway instance independently handles requests but accesses the same JuiceFS file system. Keep the following points in mind:

  • Ensure that all instances are started with the same user at initialization; use the same UID and GID for all instances.
  • IAM refresh times may vary between nodes, but avoid setting refresh intervals too short to prevent excessive pressure on JuiceFS.
  • The address and port that each instance listens on can be freely configured. If multiple instances are started on the same machine, ensure that their port numbers do not conflict.

Enable virtual-hosted-style

Because S3 Gateway is built on open source MinIO code, the MINIO_DOMAIN variable is supported. You can use it to enable virtual-hosted-style requests:

export MINIO_DOMAIN=mydomain.com
juicefs gateway myjfs 111.2.3.4:9000

If you deploy S3 Gateway with the Helm chart (see the next section), specify this variable in envs for the same effect.

Deploy S3 Gateway in Kubernetes

Installation requires Helm 3.1.0 or later. Refer to the Helm Installation Guide.

helm repo add juicefs https://juicedata.github.io/charts/
helm repo update

Our Helm chart supports both the JuiceFS Community and Enterprise editions, distinguished by which fields are populated in the values file. Taking the Enterprise edition as an example, you can create a separate values file to override these important fields:

values-mycluster.yaml
# Our default values.yaml uses community edition, change to enterprise edition
image:
  repository: juicedata/mount
  tag: "ee-5.0.21-f900c6e"

secret:
  name: "myjfs"
  # If the token field is populated, installation will be treated as enterprise edition
  token: "xxx"
  accessKey: "xxx"
  secretKey: "xxx"

tip

Don't forget to add values-mycluster.yaml to your Git project (or another source code management system), so that all changes to the values file can be traced and rolled back.

Once credentials are configured, run the following command to deploy:

# Below command can be used both to carry out the initial installation, and future upgrades
helm upgrade --install -f values-mycluster.yaml s3-gateway juicefs/juicefs-s3-gateway

After installation, follow the output instructions to get the Kubernetes Service address, and verify if it's working.

There's no such thing as a symbolic link in object storage; nevertheless, JuiceFS supports symbolic links. If your file system contains any, pay attention to the following when using S3 Gateway:

  • All symbolic links (relative or absolute) should only target files within JuiceFS (files, not directories). If a link resolves to a local file system target or to a directory, it's not accessible in S3 Gateway.

  • Relative symbolic links can be used normally under S3 Gateway.

  • If you need to access an absolute symbolic link inside JuiceFS S3 Gateway, add --mountpoint to the start command, and specify the mount point.

    Assume that the mount point is /jfs and that the following symlink has been created inside JuiceFS:

    $ ls -alh /jfs
    file.txt -> /jfs/dir/file.txt

    Use the following command to ensure proper symlink resolution:

    juicefs gateway myjfs 127.0.0.1:8888 --mountpoint=/jfs

Advanced features

The core feature of JuiceFS S3 Gateway is to provide the S3 API. Now, the support for the S3 protocol is comprehensive. Version 5.0.18 supports IAM and bucket event notifications.

These advanced features require the RELEASE.2021-04-22T17-40-00Z version of the mc client. For the usage of these advanced features, see the MinIO documentation or the mc command-line help information.

If you are unsure about the available features or how to use a specific feature, you can append -h to a subcommand to view the help information.

Identity and access control

Regular users

Before version 5.1, juicefs gateway only created a superuser when starting, and this superuser belonged only to that process. Even if multiple gateway processes shared the same file system, their users were isolated between processes. You could set different superusers for each gateway process, and they were independent and unaffected by each other.

Starting from version 5.1, juicefs gateway still requires setting a superuser at startup, and this superuser remains isolated per process. However, it allows adding new users using mc admin user add. Newly added users are shared across the same file system. You can manage new users using mc admin user. This supports adding, disabling, enabling, and deleting users, as well as viewing all users and displaying user information and policies.

$ mc admin user -h
NAME:
  mc admin user - manage users

USAGE:
  mc admin user COMMAND [COMMAND FLAGS | -h] [ARGUMENTS...]

COMMANDS:
  add      Add a new user
  disable  Disable a user
  enable   Enable a user
  remove   Remove a user
  list     List all users
  info     Display information of a user
  policy   Export user policies in JSON format
  svcacct  Manage service accounts

An example of adding a user:

# Add a new user.
$ mc admin user add myjfs user1 admin123

# List current users.
$ mc admin user list myjfs
enabled user1

# List current users in JSON format.
$ mc admin user list myjfs --json
{
  "status": "success",
  "accessKey": "user1",
  "userStatus": "enabled"
}

Service accounts

Service accounts are used to create a copy of an existing user with the same permissions, allowing different applications to use separate access keys. The privileges for service accounts inherit from their parent users. They can be managed using the command:

$ mc admin user svcacct -h
NAME:
  mc admin user svcacct - manage service accounts

USAGE:
  mc admin user svcacct COMMAND [COMMAND FLAGS | -h] [ARGUMENTS...]

COMMANDS:
  add      Add a new service account
  ls       List services accounts
  rm       Remove a service account
  info     Get a service account information
  set      Edit an existing service account
  enable   Enable a service account
  disable  Disable a services account

tip

Service accounts inherit privileges from their parent users and cannot have policies attached directly.

For example, let's say there is an existing user named user1. You can create a service account called svcacct1 for it as follows:

mc admin user svcacct add myjfs user1 --access-key svcacct1 --secret-key 123456abc

If the parent user, user1, has read-only permissions, svcacct1 will also inherit these permissions. To grant different permissions to svcacct1, you must adjust the privileges of the parent user.

AssumeRole security token service

The S3 Gateway Security Token Service (STS) is a service that allows clients to request temporary credentials to access MinIO resources. The working principle of temporary credentials is almost the same as default administrator credentials but with some differences:

  • Temporary credentials are short-lived. They can be configured to last from minutes to hours. After expiration, the gateway no longer recognizes them and does not allow any form of API request access.
  • Temporary credentials do not need to be stored with the application. They are dynamically generated and provided to the application when requested. When temporary credentials expire, applications can request new credentials.

The AssumeRole operation returns a set of temporary security credentials. You can use them to access gateway resources. AssumeRole requires authorization credentials for an existing gateway user and returns temporary security credentials, including an access key, secret key, and security token. Applications can use these temporary security credentials to sign requests for gateway API operations. The policies applied to these temporary credentials inherit from gateway user credentials.

By default, AssumeRole creates temporary security credentials with a validity period of one hour. However, you can specify the duration of the credentials using the optional parameter DurationSeconds, which can range from 900 (15 minutes) to 604,800 (7 days).

API request parameters

  • Version

    Indicates the STS API version information. The only supported value is '2011-06-15', borrowed from the AWS STS API documentation for compatibility.

    Parameter    Value
    Type         String
    Required     Yes

  • AUTHPARAMS

    Indicates the STS API authorization information. If you are familiar with AWS Signature V4 authorization headers, this STS API supports the signature V4 authorization as described here.

  • DurationSeconds

    Duration in seconds. This value can range from 900 seconds (15 minutes) to 7 days. If the value is higher than this setting, the operation fails. By default, this value is set to 3,600 seconds.

    Parameter    Value
    Type         Integer
    Valid range  From 900 to 604,800
    Required     No

  • Policy

    A JSON-format IAM policy that you want to use as an inline session policy. This parameter is optional. Passing a policy to this operation returns new temporary credentials. The permissions of the generated session are the intersection of preset policy names and the policy set here. You cannot use this policy to grant more permissions than allowed by the assumed preset policy names.

    Parameter    Value
    Type         String
    Valid range  From 1 to 2,048
    Required     No

Response elements

The XML response of this API is similar to AWS STS AssumeRole.

Errors

The XML error response of this API is similar to AWS STS AssumeRole.

A POST request example

http://minio:9000/?Action=AssumeRole&DurationSeconds=3600&Version=2011-06-15&Policy={"Version":"2012-10-17","Statement":[{"Sid":"Stmt1","Effect":"Allow","Action":"s3:*","Resource":"arn:aws:s3:::*"}]}&AUTHPARAMS

A response example

<?xml version="1.0" encoding="UTF-8"?>
<AssumeRoleResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
  <AssumeRoleResult>
    <AssumedRoleUser>
      <Arn/>
      <AssumeRoleId/>
    </AssumedRoleUser>
    <Credentials>
      <AccessKeyId>Y4RJU1RNFGK48LGO9I2S</AccessKeyId>
      <SecretAccessKey>sYLRKS1Z7hSjluf6gEbb9066hnx315wHTiACPAjg</SecretAccessKey>
      <Expiration>2019-08-08T20:26:12Z</Expiration>
      <SessionToken>eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJhY2Nlc3NLZXkiOiJZNFJKVTFSTkZHSzQ4TEdPOUkyUyIsImF1ZCI6IlBvRWdYUDZ1Vk80NUlzRU5SbmdEWGo1QXU1WWEiLCJhenAiOiJQb0VnWFA2dVZPNDVJc0VOUm5nRFhqNUF1NVlhIiwiZXhwIjoxNTQxODExMDcxLCJpYXQiOjE1NDE4MDc0NzEsImlzcyI6Imh0dHBzOi8vbG9jYWxob3N0Ojk0NDMvb2F1dGgyL3Rva2VuIiwianRpIjoiYTBiMjc2MjktZWUxYS00M2JmLTg3MzktZjMzNzRhNGNkYmMwIn0.ewHqKVFTaP-j_kgZrcOEKroNUjk10GEp8bqQjxBbYVovV0nHO985VnRESFbcT6XMDDKHZiWqN2vi_ETX_u3Q-w</SessionToken>
    </Credentials>
  </AssumeRoleResult>
  <ResponseMetadata>
    <RequestId>c6104cbe-af31-11e0-8154-cbc7ccf896c7</RequestId>
  </ResponseMetadata>
</AssumeRoleResponse>
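If you call the API without an SDK, you need to extract the credentials from the XML yourself. The following Python sketch does this with the standard library; the function name parse_credentials is hypothetical, and the sample is a shortened version of the response above with a placeholder session token.

```python
import xml.etree.ElementTree as ET

# XML namespace used by the response document.
NS = {"sts": "https://sts.amazonaws.com/doc/2011-06-15/"}

SAMPLE = """<AssumeRoleResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
  <AssumeRoleResult>
    <Credentials>
      <AccessKeyId>Y4RJU1RNFGK48LGO9I2S</AccessKeyId>
      <SecretAccessKey>sYLRKS1Z7hSjluf6gEbb9066hnx315wHTiACPAjg</SecretAccessKey>
      <Expiration>2019-08-08T20:26:12Z</Expiration>
      <SessionToken>placeholder-token</SessionToken>
    </Credentials>
  </AssumeRoleResult>
</AssumeRoleResponse>"""


def parse_credentials(xml_text: str) -> dict[str, str]:
    """Return the children of <Credentials> as a plain dictionary."""
    root = ET.fromstring(xml_text)
    creds = root.find("sts:AssumeRoleResult/sts:Credentials", NS)
    if creds is None:
        raise ValueError("no Credentials element in response")
    # Element tags look like "{namespace}AccessKeyId"; keep only the local name.
    return {child.tag.split("}", 1)[-1]: (child.text or "") for child in creds}


credentials = parse_credentials(SAMPLE)
print(credentials["AccessKeyId"], credentials["Expiration"])
```

Note that every element lives in the STS namespace, so lookups without the namespace prefix would find nothing.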
Use the AWS CLI with the AssumeRole API

  1. Start the gateway and create a user named foobar.

  2. Configure the AWS CLI:

    [foobar]
    region = us-east-1
    aws_access_key_id = foobar
    aws_secret_access_key = foo12345

  3. Use the AWS CLI to request the AssumeRole API.

    note

    In the command below, --role-arn and --role-session-name have no significance for the gateway. You can set them to any value that meets the command line requirements.

    $ aws --profile foobar --endpoint-url http://localhost:9000 sts assume-role --policy '{"Version":"2012-10-17","Statement":[{"Sid":"Stmt1","Effect":"Allow","Action":"s3:*","Resource":"arn:aws:s3:::*"}]}' --role-arn arn:xxx:xxx:xxx:xxxx --role-session-name anything
    {
      "AssumedRoleUser": {
        "Arn": ""
      },
      "Credentials": {
        "SecretAccessKey": "xbnWUoNKgFxi+uv3RI9UgqP3tULQMdI+Hj+4psd4",
        "SessionToken": "eyJhbGciOiJIUzUxMiIsInR5cCI6IkpXVCJ9.eyJhY2Nlc3NLZXkiOiJLOURUSU1VVlpYRVhKTDNBVFVPWSIsImV4cCI6MzYwMDAwMDAwMDAwMCwicG9saWN5IjoidGVzdCJ9.PetK5wWUcnCJkMYv6TEs7HqlA4x_vViykQ8b2T_6hapFGJTO34sfTwqBnHF6lAiWxRoZXco11B0R7y58WAsrQw",
        "Expiration": "2019-02-20T19:56:59-08:00",
        "AccessKeyId": "K9DTIMUVZXEXJL3ATUOY"
      }
    }

Access the AssumeRole API in Go applications

See the MinIO official example program.

note

Superusers set by environment variables cannot use the AssumeRole APIs. Only users added by mc admin user add can use these APIs.

Permission management

By default, newly created users have no permissions and need to be granted permissions using mc admin policy before they can be used. This command supports adding, deleting, updating, and listing policies, as well as adding, deleting, and updating permissions for users.

$ mc admin policy -h
NAME:
  mc admin policy - manage policies defined in the MinIO server

USAGE:
  mc admin policy COMMAND [COMMAND FLAGS | -h] [ARGUMENTS...]

COMMANDS:
  add     Add a new policy
  remove  Remove a policy
  list    List all policies
  info    Show information on a policy
  set     Set an IAM policy on a user or group
  unset   Unset an IAM policy for a user or group
  update  Attach a new IAM policy to a user or group

The gateway includes the following common policies:

  • readonly: Read-only users.
  • readwrite: Read-write users.
  • writeonly: Write-only users.
  • consoleAdmin: Read-write-admin users, where "admin" means the ability to use management APIs such as creating users.

For example, to set a user as a read-only user:

# Set user1 as a read-only user.
$ mc admin policy set myjfs readonly user=user1

# Check user policy.
$ mc admin user list myjfs
enabled user1 readonly

For custom policies, use mc admin policy add:

$ mc admin policy add -h
NAME:
  mc admin policy add - add new policy

USAGE:
  mc admin policy add TARGET POLICYNAME POLICYFILE

POLICYNAME:
  Name of the canned policy on MinIO server.

POLICYFILE:
  Name of the policy file associated with the policy name.

EXAMPLES:
  1. Add a new canned policy 'writeonly'.
     $ mc admin policy add myjfs writeonly /tmp/writeonly.json

The policy file to be added here must be in JSON format with IAM-compatible syntax, and no more than 2,048 characters. This syntax allows for more fine-grained access control. If you are unfamiliar with this, you can first use the following command to see the simple policies and then modify them accordingly.

$ mc admin policy info myjfs readonly
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetObject"
      ],
      "Resource": [
        "arn:aws:s3:::*"
      ]
    }
  ]
}
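Starting from the readonly policy above, you could generate a narrower custom policy with a short script. The following Python sketch builds a read-only policy limited to one bucket and prefix and checks the 2,048-character limit. The function name read_only_policy and the bucket and prefix values are hypothetical examples.

```python
import json

MAX_POLICY_CHARS = 2048  # limit for policy files mentioned above


def read_only_policy(bucket: str, prefix: str = "") -> str:
    """Return a JSON policy that allows reading objects under bucket/prefix only."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:GetObject"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",  # bucket-level action
                    f"arn:aws:s3:::{bucket}/{prefix}*",  # objects under the prefix
                ],
            }
        ],
    }
    text = json.dumps(policy, indent=2)
    if len(text) > MAX_POLICY_CHARS:
        raise ValueError("policy exceeds 2,048 characters")
    return text


print(read_only_policy("images", "public/"))
```

You would then save the output to a file and register it with mc admin policy add, as shown in the help text above.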

User group management

JuiceFS S3 Gateway supports creating user groups, similar to Linux user groups, and uses mc admin group for management. You can set one or more users to a group and grant permissions uniformly to the group. This usage is similar to user management.

$ mc admin group -h
NAME:
  mc admin group - manage groups

USAGE:
  mc admin group COMMAND [COMMAND FLAGS | -h] [ARGUMENTS...]

COMMANDS:
  add      Add users to a new or existing group
  remove   Remove a group or its members
  info     Display group information
  list     List all groups
  enable   Enable a group
  disable  Disable a group

Anonymous access management

In addition to user-specific permissions, anonymous access management is also possible. This allows specific objects or buckets to be accessible to anyone. You can use the mc policy command to manage this functionality.

Name:
  mc policy - manage anonymous access to buckets and objects

USAGE:
  mc policy [FLAGS] set PERMISSION TARGET
  mc policy [FLAGS] set-json FILE TARGET
  mc policy [FLAGS] get TARGET
  mc policy [FLAGS] get-json TARGET
  mc policy [FLAGS] list TARGET

PERMISSION:
  Allowed policies are: [none, download, upload, public].

FILE:
  A valid S3 policy JSON filepath.

EXAMPLES:
  1. Set bucket to "download" on Amazon S3 cloud storage.
     $ mc policy set download s3/burningman2011

  2. Set bucket to "public" on Amazon S3 cloud storage.
     $ mc policy set public s3/shared

  3. Set bucket to "upload" on Amazon S3 cloud storage.
     $ mc policy set upload s3/incoming

  4. Set policy to "public" for bucket with prefix on Amazon S3 cloud storage.
     $ mc policy set public s3/public-commons/images

  5. Set a custom prefix based bucket policy on Amazon S3 cloud storage using a JSON file.
     $ mc policy set-json /path/to/policy.json s3/public-commons/images

  6. Get bucket permissions.
     $ mc policy get s3/shared

  7. Get bucket permissions in JSON format.
     $ mc policy get-json s3/shared

  8. List policies set to a specified bucket.
     $ mc policy list s3/shared

  9. List public object URLs recursively.
     $ mc policy --recursive links s3/shared/

The gateway has built-in support for four types of anonymous permissions by default:

  • none: Disallows anonymous access (typically used to clear existing permissions).
  • download: Allows anyone to read.
  • upload: Allows anyone to write.
  • public: Allows anyone to read and write.

The following example shows how to set an object to allow anonymous downloads:

# Set testbucket1/afile for anonymous access.
mc policy set download useradmin/testbucket1/afile

# View specific permissions.
mc policy get-json useradmin/testbucket1/afile

$ mc policy --recursive links useradmin/testbucket1/
http://127.0.0.1:9001/testbucket1/afile

# Directly download the object.
wget http://127.0.0.1:9001/testbucket1/afile

# Clear download permission for a file.
mc policy set none useradmin/testbucket1/afile

Configuration effective time

All management API updates for JuiceFS S3 Gateway take effect immediately and are persisted to the JuiceFS file system. Clients that accept these API requests also immediately reflect these changes.

However, in a multi-server gateway setup, the situation is slightly different. This is because when the gateway handles request authentication, it uses in-memory cached information as the validation baseline. Otherwise, reading configuration file content for every request would pose unacceptable performance issues. However, caching also introduces potential inconsistencies between cached data and the configuration file.

Currently, JuiceFS S3 Gateway's cache refresh strategy involves forcibly updating the in-memory cache every 5 minutes (certain operations also trigger cache update operations). This ensures that configuration changes take effect within a maximum of 5 minutes in a multi-server setup. You can adjust this time by using the --refresh-iam-interval parameter. If immediate effect on a specific gateway is required, you can manually restart it.

Bucket event notifications

You can use bucket event notifications to monitor events happening on objects within a storage bucket and trigger certain actions in response.

Currently supported object event types include:

  • s3:ObjectCreated:Put
  • s3:ObjectCreated:CompleteMultipartUpload
  • s3:ObjectAccessed:Head
  • s3:ObjectCreated:Post
  • s3:ObjectRemoved:Delete
  • s3:ObjectCreated:Copy
  • s3:ObjectAccessed:Get

Supported global events include:

  • s3:BucketCreated
  • s3:BucketRemoved

You can use the mc client tool with the event subcommand to set up and monitor event notifications. Notifications sent by MinIO for publishing events are in JSON format. See the JSON structure.

To reduce dependencies, JuiceFS S3 Gateway has removed support for certain event destination types. Currently, storage bucket events can be published to the following destinations:

  • Redis
  • MySQL
  • PostgreSQL
  • Webhooks

$ mc admin config get myjfs | grep notify
notify_webhook   publish bucket notifications to webhook endpoints
notify_mysql     publish bucket notifications to MySQL databases
notify_postgres  publish bucket notifications to Postgres databases
notify_redis     publish bucket notifications to Redis datastores

note

The following examples assume that the JuiceFS file system is named images, that the S3 Gateway service is running, and that its alias in mc is myjfs. For the S3 Gateway, the JuiceFS file system name images serves as the bucket name.

Use Redis to publish events

Redis event destination supports two formats: namespace and access.

In the namespace format, the gateway synchronizes objects in the bucket to entries in a Redis hash. Each entry corresponds to an object in the storage bucket, with the key set to "bucket name/object name" and the value as JSON-formatted event data specific to that gateway object. Any updates or deletions of objects also update or delete corresponding entries in the hash.

In the access format, the gateway uses RPUSH to add events to a list. Each element in this list is a JSON-formatted list with two elements:

  • A timestamp string
  • A JSON object containing event data related to operations on the bucket

In this format, elements in the list are not updated or deleted.
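The difference between the two formats can be modeled with plain Python data structures. This is a simplified sketch that uses an in-memory dict and list in place of a Redis hash and list; the class names and the reduced event fields (name, bucket, key, time) are hypothetical and much smaller than the real event payload.

```python
import json


class NamespaceTarget:
    """Models the namespace format: one hash entry per object."""

    def __init__(self):
        self.entries = {}  # stands in for the Redis hash

    def publish(self, event):
        field = f'{event["bucket"]}/{event["key"]}'
        if event["name"].startswith("s3:ObjectRemoved:"):
            # Deleting an object removes its entry.
            self.entries.pop(field, None)
        else:
            # Creating or updating an object overwrites its entry.
            self.entries[field] = json.dumps(event)


class AccessTarget:
    """Models the access format: an append-only list of events."""

    def __init__(self):
        self.entries = []  # stands in for the Redis list filled with RPUSH

    def publish(self, event):
        # Each element pairs a timestamp string with the event data.
        self.entries.append(json.dumps([event["time"], event]))


put = {"name": "s3:ObjectCreated:Put", "bucket": "images", "key": "myphoto.jpg", "time": "2024-04-08T07:48:36Z"}
delete = {"name": "s3:ObjectRemoved:Delete", "bucket": "images", "key": "myphoto.jpg", "time": "2024-04-08T07:50:00Z"}

namespace, access = NamespaceTarget(), AccessTarget()
for event in (put, delete):
    namespace.publish(event)
    access.publish(event)

print(len(namespace.entries), len(access.entries))
```

After an upload followed by a delete, the namespace model is empty again while the access model has kept both events. In short, namespace reflects the current state of the bucket, and access is a history of operations.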

To use notification destinations in namespace and access formats:

  1. Configure Redis with the gateway.

    Use the mc admin config set command to configure Redis as the event notification destination:

    # Command-line parameters
    # mc admin config set myjfs notify_redis[:name] address="xxx" format="namespace|access" key="xxxx" password="xxxx" queue_dir="" queue_limit="0"
    # An example
    $ mc admin config set myjfs notify_redis:1 address="127.0.0.1:6379/1" format="namespace" key="bucketevents" password="yoursecret" queue_dir="" queue_limit="0"

    You can use mc admin config get myjfs notify_redis to view the configuration options. Different types of destinations have different configuration options. For Redis type, it has the following configuration options:

    $ mc admin config get myjfs notify_redis
    notify_redis enable=off format=namespace address= key= password= queue_dir= queue_limit=0

    Here are the meanings of each configuration option:

    notify_redis[:name]  Supports setting multiple Redis instances with different names.
    address*             (address) Address of the Redis server. For example: localhost:6379.
    key*                 (string) Redis key to store/update events. The key is created automatically.
    format*              (namespace*|access) Whether it is namespace or access. Default is 'namespace'.
    password             (string) Password for the Redis server.
    queue_dir            (path) Directory to store unsent messages, for example, '/home/events'.
    queue_limit          (number) Maximum limit of unsent messages. Default is '100000'.
    comment              (sentence) Optional comment description.

    The gateway supports persistent event storage. Persistent storage backs up events when the Redis broker is offline and replays events when the broker comes back online. You can set the directory for event storage using the queue_dir field and the maximum limit for storage using queue_limit. For example, you can set queue_dir to /home/events, and you can set queue_limit to 1,000. By default, queue_limit is 100,000. Before updating the configuration, you can use the mc admin config get command to get the current configuration.

    $ mc admin config get myjfs notify_redis
    notify_redis:1 address="127.0.0.1:6379/1" format="namespace" key="bucketevents" password="yoursecret" queue_dir="" queue_limit="0"

    # Effective after restart
    $ mc admin config set myjfs notify_redis:1 queue_limit="1000"
    Successfully applied new settings.
    Please restart your server 'mc admin service restart myjfs'.
    # Note that you cannot use `mc admin service restart myjfs` to restart. JuiceFS S3 Gateway does not currently support this functionality. You need to manually restart JuiceFS S3 Gateway when prompted after configuring with `mc`.

    After using the mc admin config set command to update the configuration, restart JuiceFS S3 Gateway to apply the changes. JuiceFS S3 Gateway will output a line similar to SQS ARNs: arn:minio:sqs::1:redis.

    Based on your needs, you can add multiple Redis destinations by providing the identifier for each Redis instance (like the "1" in the example "notify_redis:1") along with the configuration parameters for each instance.

  2. Enable bucket notifications.

    Now you can enable event notifications on a bucket named "images". When a JPEG file is created or overwritten, a new entry is created or an existing entry is updated in the previously configured Redis hash. If an existing object is deleted, the corresponding entry is also removed from the hash. Therefore, the entries in the Redis hash map to .jpg objects in the "images" bucket.

    To configure bucket notifications, you need to use the Amazon Resource Name (ARN) information outputted by the gateway in the previous steps. See more information about ARNs.

    You can use the mc tool to add these configuration details. Assuming the gateway service alias is myjfs, you can execute the following script:

    mc event add myjfs/images arn:minio:sqs::1:redis --suffix .jpg
    mc event list myjfs/images
    arn:minio:sqs::1:redis s3:ObjectCreated:*,s3:ObjectRemoved:*,s3:ObjectAccessed:* Filter: suffix=".jpg"

  3. Verify Redis.

    Start the redis-cli Redis client program to check the content in Redis. Running the monitor Redis command will output every command executed on Redis.

    redis-cli -a yoursecret
    127.0.0.1:6379> monitor
    OK

    Upload a file named myphoto.jpg to the images bucket.

    mc cp myphoto.jpg myjfs/images

    In the previous terminal, you can see the operations performed by the gateway on Redis:

    127.0.0.1:6379> monitor
    OK
    1712562516.867831 [1 192.168.65.1:59280] "hset" "bucketevents" "images/myphoto.jpg" "{\"Records\":[{\"eventVersion\":\"2.0\",\"eventSource\":\"minio:s3\",\"awsRegion\":\"\",\"eventTime\":\"2024-04-08T07:48:36.865Z\",\"eventName\":\"s3:ObjectCreated:Put\",\"userIdentity\":{\"principalId\":\"admin\"},\"requestParameters\":{\"principalId\":\"admin\",\"region\":\"\",\"sourceIPAddress\":\"127.0.0.1\"},\"responseElements\":{\"content-length\":\"0\",\"x-amz-request-id\":\"17C43E891887BA48\",\"x-minio-origin-endpoint\":\"http://127.0.0.1:9001\"},\"s3\":{\"s3SchemaVersion\":\"1.0\",\"configurationId\":\"Config\",\"bucket\":{\"name\":\"images\",\"ownerIdentity\":{\"principalId\":\"admin\"},\"arn\":\"arn:aws:s3:::images\"},\"object\":{\"key\":\"myphoto.jpg\",\"size\":4,\"eTag\":\"40b134ab8a3dee5dd9760a7805fd495c\",\"userMetadata\":{\"content-type\":\"image/jpeg\"},\"sequencer\":\"17C43E89196AE2A0\"}},\"source\":{\"host\":\"127.0.0.1\",\"port\":\"\",\"userAgent\":\"MinIO (darwin; arm64) minio-go/v7.0.11 mc/RELEASE.2021-04-22T17-40-00Z\"}}]}"

    Here, you can see that the gateway executed the HSET command on the bucketevents key, using the object path images/myphoto.jpg as the hash field.

    In the access format, the key is a list, and the gateway calls RPUSH to append each event to it. In the monitor output, you can see:

    127.0.0.1:6379> monitor
    OK
    1712562751.922469 [1 192.168.65.1:61102] "rpush" "aceesseventskey" "[{\"Event\":[{\"eventVersion\":\"2.0\",\"eventSource\":\"minio:s3\",\"awsRegion\":\"\",\"eventTime\":\"2024-04-08T07:52:31.921Z\",\"eventName\":\"s3:ObjectCreated:Put\",\"userIdentity\":{\"principalId\":\"admin\"},\"requestParameters\":{\"principalId\":\"admin\",\"region\":\"\",\"sourceIPAddress\":\"127.0.0.1\"},\"responseElements\":{\"content-length\":\"0\",\"x-amz-request-id\":\"17C43EBFD35A53B8\",\"x-minio-origin-endpoint\":\"http://127.0.0.1:9001\"},\"s3\":{\"s3SchemaVersion\":\"1.0\",\"configurationId\":\"Config\",\"bucket\":{\"name\":\"images\",\"ownerIdentity\":{\"principalId\":\"admin\"},\"arn\":\"arn:aws:s3:::images\"},\"object\":{\"key\":\"myphoto.jpg\",\"size\":4,\"eTag\":\"40b134ab8a3dee5dd9760a7805fd495c\",\"userMetadata\":{\"content-type\":\"image/jpeg\"},\"sequencer\":\"17C43EBFD3DACA70\"}},\"source\":{\"host\":\"127.0.0.1\",\"port\":\"\",\"userAgent\":\"MinIO (darwin; arm64) minio-go/v7.0.11 mc/RELEASE.2021-04-22T17-40-00Z\"}}],\"EventTime\":\"2024-04-08T07:52:31.921Z\"}]"
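The two payload shapes shown in the monitor output can be read with a few lines of Python. This is a minimal sketch, not part of the gateway: the sample payloads are trimmed copies of the output above, and the function names (summarize, parse_namespace_value, parse_access_entry) are hypothetical helpers introduced only for illustration.

```python
import json

# Trimmed copy of the hash value written by HSET in the namespace format.
NAMESPACE_VALUE = json.dumps({
    "Records": [{
        "eventName": "s3:ObjectCreated:Put",
        "eventTime": "2024-04-08T07:48:36.865Z",
        "s3": {
            "bucket": {"name": "images"},
            "object": {"key": "myphoto.jpg", "size": 4},
        },
    }]
})

# Trimmed copy of the list element appended by RPUSH in the access format.
ACCESS_ENTRY = json.dumps([{
    "Event": [{
        "eventName": "s3:ObjectCreated:Put",
        "eventTime": "2024-04-08T07:52:31.921Z",
        "s3": {
            "bucket": {"name": "images"},
            "object": {"key": "myphoto.jpg", "size": 4},
        },
    }],
    "EventTime": "2024-04-08T07:52:31.921Z",
}])


def summarize(record):
    """Return (event name, bucket, object key, size) for one event record."""
    s3 = record["s3"]
    return (
        record["eventName"],
        s3["bucket"]["name"],
        s3["object"]["key"],
        s3["object"]["size"],
    )


def parse_namespace_value(raw):
    """Namespace format: a JSON object with a top-level "Records" array."""
    return [summarize(r) for r in json.loads(raw)["Records"]]


def parse_access_entry(raw):
    """Access format: a JSON array whose items wrap the records in "Event"."""
    return [summarize(r) for item in json.loads(raw) for r in item["Event"]]


print(parse_namespace_value(NAMESPACE_VALUE))
print(parse_access_entry(ACCESS_ENTRY))
```

The only structural difference handled here is the wrapper: "Records" at the top level in the namespace format, and an outer array with "Event" and "EventTime" in the access format.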

Use MySQL to publish events

The MySQL notification destination supports two formats: namespace and access.

If you use the namespace format, the gateway synchronizes objects in the bucket to rows in the database table. Each row has two columns:

  • key_name. It is the bucket name plus the object name.
  • value. It is the JSON-formatted event data about that gateway object.

If objects are updated or deleted, the corresponding rows in the table are also updated or deleted.

If you use the access format, the gateway adds events to the table. Rows have two columns:

  • event_time. It is the time the event occurred on the gateway server.
  • event_data. It is the JSON-formatted event data about that gateway object.

In this format, rows are not deleted or modified.
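The difference between the two formats can be illustrated with an in-memory SQLite database standing in for MySQL. This is a sketch under stated assumptions, not the SQL the gateway actually runs: the table names, the simplified event dictionaries, and the apply_event helper are all hypothetical, and key_name is assumed to be the bucket and object name joined by a slash, as in the Redis output shown earlier.

```python
import json
import sqlite3

# SQLite is only a stand-in for MySQL; table names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ns_events (key_name TEXT PRIMARY KEY, value TEXT)")
conn.execute("CREATE TABLE access_events (event_time TEXT, event_data TEXT)")


def apply_event(conn, event):
    """Apply one simplified event to both tables to contrast the formats."""
    key_name = f"{event['bucket']}/{event['object']}"
    data = json.dumps(event)

    # access format: every event is appended; rows are never changed.
    conn.execute(
        "INSERT INTO access_events (event_time, event_data) VALUES (?, ?)",
        (event["time"], data),
    )

    # namespace format: one row per object, kept in sync with the bucket.
    if event["name"].startswith("s3:ObjectRemoved:"):
        conn.execute("DELETE FROM ns_events WHERE key_name = ?", (key_name,))
    else:
        conn.execute(
            "INSERT OR REPLACE INTO ns_events (key_name, value) VALUES (?, ?)",
            (key_name, data),
        )


events = [
    {"name": "s3:ObjectCreated:Put", "bucket": "images", "object": "a.jpg",
     "time": "2024-04-08T07:48:36Z"},
    {"name": "s3:ObjectCreated:Put", "bucket": "images", "object": "a.jpg",
     "time": "2024-04-08T07:50:00Z"},
    {"name": "s3:ObjectRemoved:Delete", "bucket": "images", "object": "a.jpg",
     "time": "2024-04-08T07:51:00Z"},
    {"name": "s3:ObjectCreated:Put", "bucket": "images", "object": "b.jpg",
     "time": "2024-04-08T07:52:31Z"},
]
for event in events:
    apply_event(conn, event)

ns_rows = conn.execute(
    "SELECT key_name FROM ns_events ORDER BY key_name"
).fetchall()
access_count = conn.execute("SELECT COUNT(*) FROM access_events").fetchone()[0]
print(ns_rows, access_count)
```

After the four events, the namespace table holds a single row for the object that still exists, while the access table holds one row per event.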

The following steps show how to use the notification destination in namespace format. The access format works similarly and is not described further here.

  1. Ensure the MySQL version meets the minimum requirements.

    JuiceFS S3 Gateway requires MySQL version 5.7.8 or above, because it uses the JSON data type introduced in MySQL 5.7.8.

  2. Configure MySQL as a notification destination for the gateway.

    Use the mc admin config set command to configure MySQL as the event notification destination.

    mc admin config set myjfs notify_mysql:myinstance table="minio_images" dsn_string="root:123456@tcp(172.17.0.1:3306)/miniodb"

    You can use mc admin config get myjfs notify_mysql to view the configuration options. Different destination types have different configuration options. For MySQL type, the following configuration options are available:

    $ mc admin config get myjfs notify_mysql
    format=namespace dsn_string= table= queue_dir= queue_limit=0 max_open_connections=2

    Here is what each configuration item means:

    KEY:
    notify_mysql[:name] Publish bucket notifications to the MySQL database. When multiple MySQL server endpoints are required, you can add a user-specified "name" to each configuration, for example, "notify_mysql:myinstance".

    ARGS:
    dsn_string* (string) MySQL data source name connection string, for example, "<user>:<password>@tcp(<host>:<port>)/<database>".
    table* (string) Name of the database table to store/update events. The table is automatically created.
    format* (namespace*|access) 'namespace' or 'access'. The default is 'namespace'.
    queue_dir (path) The directory for storing unsent messages, for example, '/home/events'.
    queue_limit (number) The maximum limit of unsent messages. The default is '100000'.
    comment (sentence) Optional comment description.

    dsn_string is required and must be in the format <user>:<password>@tcp(<host>:<port>)/<database>.

    The gateway supports persistent event storage. It backs up events while the MySQL connection is offline and replays them when the connection is restored. Set the directory for stored events with queue_dir and the maximum number of stored events with queue_limit. For example, you can set queue_dir to /home/events and queue_limit to 1000. By default, queue_limit is 100000.

    Before updating the configuration, you can use the mc admin config get command to get the current configuration.

    $ mc admin config get myjfs/ notify_mysql
    notify_mysql:myinstance enable=off format=namespace host= port= username= password= database= dsn_string= table= queue_dir= queue_limit=0

    Update the MySQL notification configuration using the mc admin config set command with the dsn_string parameter:

    mc admin config set myjfs notify_mysql:myinstance table="minio_images" dsn_string="root:xxxx@tcp(127.0.0.1:3306)/miniodb"

    You can add multiple MySQL server endpoints as needed, by providing the identifier of the MySQL instance (for example, "myinstance") and the configuration parameter information for each instance.

    After updating the configuration with the mc admin config set command, restart the gateway to apply the configuration changes. The gateway server will output a line during startup similar to SQS ARNs: arn:minio:sqs::myinstance:mysql.

  3. Enable bucket notifications.

    Now you can enable event notifications on a bucket named "images". When a file is uploaded to the bucket, a new record is inserted into MySQL, or an existing record is updated. If an existing object is deleted, the corresponding record is also deleted from the MySQL table. Therefore, each row in the MySQL table corresponds to an object in the bucket.

    To configure bucket notifications, you need the ARN that the gateway output in the previous step. See more information about ARNs.

    Assuming the gateway service alias is myjfs, you can run the following commands:

    # Add notification configuration to the 'images' bucket using the MySQL ARN. The --suffix parameter is used to filter events.
    mc event add myjfs/images arn:minio:sqs::myinstance:mysql --suffix .jpg
    # Print the notification configuration on the 'images' bucket.
    mc event list myjfs/images
    arn:minio:sqs::myinstance:mysql s3:ObjectCreated:*,s3:ObjectRemoved:*,s3:ObjectAccessed:* Filter: suffix=".jpg"
  4. Verify MySQL.

    Open a new terminal and upload a JPEG image to the images bucket:

    mc cp myphoto.jpg myjfs/images

    Open a MySQL terminal and list all records in the minio_images table. You will find a newly inserted record.
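If no record appears, a malformed dsn_string from step 2 is one possible cause. The following sketch builds and checks a string in the <user>:<password>@tcp(<host>:<port>)/<database> form. The helper names and the regular expression are assumptions made for illustration. The pattern is deliberately strict: it rejects passwords containing "@" and any extra connection parameters, which a real MySQL driver may accept.

```python
import re

# Strict pattern for <user>:<password>@tcp(<host>:<port>)/<database>.
DSN_PATTERN = re.compile(
    r"^(?P<user>[^:@]+):(?P<password>[^@]*)"
    r"@tcp\((?P<host>[^:()]+):(?P<port>\d+)\)"
    r"/(?P<database>[^/?]+)$"
)


def build_dsn(user, password, host, port, database):
    """Assemble a dsn_string in the documented form."""
    return f"{user}:{password}@tcp({host}:{port})/{database}"


def parse_dsn(dsn):
    """Split a dsn_string into its parts, or raise ValueError if malformed."""
    match = DSN_PATTERN.match(dsn)
    if match is None:
        raise ValueError(f"unexpected dsn_string format: {dsn!r}")
    parts = match.groupdict()
    parts["port"] = int(parts["port"])
    return parts


dsn = build_dsn("root", "123456", "172.17.0.1", 3306, "miniodb")
print(dsn)
print(parse_dsn(dsn))
```

A string that parses cleanly here can then be passed as dsn_string to mc admin config set, as shown in step 2.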

Use PostgreSQL to publish events

Publishing events using PostgreSQL is similar to publishing events using MySQL. PostgreSQL 9.5 or above is required, because the gateway uses the INSERT ON CONFLICT (aka UPSERT) feature introduced in PostgreSQL 9.5 and the jsonb data type introduced in 9.4.

Use a webhook to publish events

Webhooks deliver data using a push model, so the receiver does not need to poll continually.

  1. Configure a webhook for the gateway.

    The gateway supports persistent event storage. It backs up events while the webhook is offline and replays them when the webhook comes back online. Set the directory for stored events with queue_dir and the maximum number of stored events with queue_limit. For example, you can set queue_dir to /home/events and queue_limit to 1000. By default, queue_limit is 100000.

    KEY:
    notify_webhook[:name] Publish bucket notifications to webhook endpoints.

    ARGS:
    endpoint* (url) Webhook server endpoint, for example, http://localhost:8080/minio/events.
    auth_token (string) Opaque token or JWT authorization token.
    queue_dir (path) The directory for storing unsent messages, for example, '/home/events'.
    queue_limit (number) The maximum limit of unsent messages. The default is '100000'.
    client_cert (string) The client certificate for mTLS authentication of the webhook.
    client_key (string) The client certificate key for mTLS authentication of the webhook.
    comment (sentence) Optional comment description.

    Use the mc admin config set command to update the configuration. The endpoint here is the service that listens for webhook notifications. After updating the configuration, restart the gateway to apply the changes. Note that the endpoint must be up and reachable when the gateway restarts.

    mc admin config set myjfs notify_webhook:1 queue_limit="0" endpoint="http://localhost:3000" queue_dir=""
  2. Enable bucket notifications.

    Now you can enable event notifications. When a file is uploaded to the bucket, an event is triggered. Here, the ARN value is arn:minio:sqs::1:webhook. See more information about ARNs.

    mc mb myjfs/images-thumbnail
    mc event add myjfs/images arn:minio:sqs::1:webhook --event put --suffix .jpg

    If the command reports that it cannot create a bucket, check whether multi-bucket support is enabled on the S3 Gateway.

  3. Use Thumbnailer to verify.

    Thumbnailer is a project that generates thumbnails using MinIO's listenBucketNotification API. This example uses Thumbnailer to listen to gateway notifications. When a file is uploaded to the gateway service, Thumbnailer receives the notification, generates a thumbnail, and uploads it to the gateway service.

    To install Thumbnailer:

    git clone https://github.com/minio/thumbnailer/
    cd thumbnailer
    npm install

    Open Thumbnailer's config/webhook.json configuration file, add the connection settings for the gateway, and start Thumbnailer using:

    NODE_ENV=webhook node thumbnail-webhook.js

    Thumbnailer runs on http://localhost:3000/.

    Next, configure the gateway to send messages to this URL (as described in step 1) and set up bucket notifications using mc (as described in step 2). Then upload an image to the gateway:

    mc cp ~/images.jpg myjfs/images
    .../images.jpg: 8.31 KB / 8.31 KB ┃▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓┃ 100.00% 59.42 KB/s 0s

    After a moment, use mc ls to check the content of the bucket. You will see a thumbnail.

    mc ls myjfs/images-thumbnail
    [2017-02-08 11:39:40 IST] 992B images-thumbnail.jpg
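If you only want to confirm that notifications arrive, without generating thumbnails, a small listener can stand in for Thumbnailer. The following is a minimal sketch using Python's standard library. It assumes the webhook request body is JSON containing a "Records" array shaped like the Redis output shown earlier. The names EventHandler, start_server, and received are hypothetical, and the sketch does not check auth_token or use TLS.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# (eventName, bucket, object key) tuples collected by the handler.
received = []


class EventHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = None
        if not isinstance(payload, dict):
            # Reject bodies that are not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        for record in payload.get("Records", []):
            s3 = record.get("s3", {})
            received.append((
                record.get("eventName"),
                s3.get("bucket", {}).get("name"),
                s3.get("object", {}).get("key"),
            ))
        # Acknowledge the notification with an empty 200 response.
        self.send_response(200)
        self.end_headers()

    def log_message(self, fmt, *args):
        # Silence the default per-request logging.
        pass


def start_server(host="127.0.0.1", port=0):
    """Start the listener in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), EventHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    srv = start_server(port=3000)
    print("listening on http://%s:%d" % srv.server_address)
    threading.Event().wait()  # keep the main thread alive
```

Run directly, the listener binds to port 3000, which matches the endpoint="http://localhost:3000" value used in step 1, so it cannot run at the same time as Thumbnailer on the same host.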