In this recipe we will learn how to set up MinIO as a persistent storage layer for Alluxio. The Alluxio documentation also covers using Alluxio with MinIO.
Alluxio provides memory-speed virtual distributed storage for applications. By using Alluxio with MinIO, applications can access MinIO data at memory speed and with file system APIs commonly used for big data workloads.
This section describes how to set up Alluxio, assuming MinIO is already running. See the MinIO quick start guide for how to set up MinIO.
Extract the downloaded Alluxio binary, using the appropriate version and distribution. If MinIO is the only storage you are using with Alluxio, the distribution is irrelevant.
tar xvfz alluxio-<VERSION>-<DISTRIBUTION>-bin.tar.gz
cd alluxio-<VERSION>-<DISTRIBUTION>
Create a configuration file for Alluxio by copying the template.
cp conf/alluxio-site.properties.template conf/alluxio-site.properties
Modify the Alluxio configuration file appropriately for your deployment. This is an example configuration of running Alluxio locally with MinIO.
Assume the MinIO server is running at <MINIO_HOST:PORT>. Assume the MinIO bucket you wish to mount in Alluxio is <MINIO_BUCKET>. Assume the MinIO access key is <ACCESS_KEY> and secret key is <SECRET_KEY>.
alluxio.master.hostname=localhost
alluxio.underfs.address=s3a://<MINIO_BUCKET>/
alluxio.underfs.s3.endpoint=http://<MINIO_HOST:PORT>/
alluxio.underfs.s3.disable.dns.buckets=true
alluxio.underfs.s3a.inherit_acl=false
aws.accessKeyId=<ACCESS_KEY>
aws.secretKey=<SECRET_KEY>
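As a minimal sketch, the properties above can be generated from the four placeholder values rather than edited by hand. The function name `render_alluxio_site` and the example values (`localhost:9000`, `mybucket`, and the two example keys) are hypothetical and not part of Alluxio or MinIO. The property names are exactly those listed above.

```shell
#!/bin/sh
# Sketch only: print the Alluxio properties for a MinIO under-store.
# Arguments (hypothetical helper, not an Alluxio tool):
#   $1 = MinIO host:port, $2 = bucket, $3 = access key, $4 = secret key
render_alluxio_site() {
  minio_endpoint="$1"
  minio_bucket="$2"
  access_key="$3"
  secret_key="$4"
  cat <<EOF
alluxio.master.hostname=localhost
alluxio.underfs.address=s3a://${minio_bucket}/
alluxio.underfs.s3.endpoint=http://${minio_endpoint}/
alluxio.underfs.s3.disable.dns.buckets=true
alluxio.underfs.s3a.inherit_acl=false
aws.accessKeyId=${access_key}
aws.secretKey=${secret_key}
EOF
}

# Example with made-up values: print the rendered properties for review.
render_alluxio_site "localhost:9000" "mybucket" "EXAMPLE_ACCESS_KEY" "EXAMPLE_SECRET_KEY"
```

Once the output looks right, it can be redirected into conf/alluxio-site.properties from the Alluxio directory. Printing first avoids overwriting an existing configuration by accident.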
Start the Alluxio server locally.
bin/alluxio-start.sh local -f
Files that already reside in the MinIO bucket will be available through Alluxio. One way to view them is the Alluxio UI. Applications can access and write data in MinIO through the Alluxio namespace.
You can run Alluxio's built-in I/O tests to see this in action.
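Besides the UI, the Alluxio command-line shell can list the mounted namespace. The sketch below wraps `bin/alluxio fs ls /` in a small guard so it fails with a clear message when run outside an Alluxio directory. The function name `list_alluxio_root` and the `ALLUXIO_HOME` variable are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: list the root of the Alluxio namespace from the command line.
# $1 = Alluxio installation directory (defaults to the current directory).
list_alluxio_root() {
  alluxio_home="${1:-.}"
  if [ ! -x "${alluxio_home}/bin/alluxio" ]; then
    echo "bin/alluxio not found under ${alluxio_home}" >&2
    return 1
  fi
  # Files already in the mounted MinIO bucket should appear in this listing.
  "${alluxio_home}/bin/alluxio" fs ls /
}

# ALLUXIO_HOME is a hypothetical variable pointing at the extracted directory.
list_alluxio_root "${ALLUXIO_HOME:-.}" || echo "listing skipped: Alluxio not available here"
```

The listing only succeeds while the Alluxio server started in the previous step is running.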
bin/alluxio runTests
After the tests have run, the data written in the THROUGH modes will be visible in MinIO.