Generates previews of user files (videos, images, and PDFs), triggered by S3 uploads. You can find a list of supported extensions in `./previews/file_formats.yml`. This repository contains instructions and configurations to create AWS Lambda functions deployed using Docker images.
To enable the use of the IBM Aspera File Preview, you must build a container image with FFmpeg packaged inside. FFmpeg is a free, open-source utility that is not provided or managed by IBM and is subject to third-party terms and conditions. FFmpeg contains codecs for encoding and decoding various video coding formats. Certain codecs contained within FFmpeg may be covered by patents and require a license to use. The Dockerfile provided by IBM will enable all codecs in FFmpeg by default; however, you must assess your licensing needs in connection with your use of FFmpeg and adjust which codecs to enable accordingly. IBM makes no warranties or conditions, express or implied, and IBM will have no liability to Client, regarding the FFmpeg utility when used with the Cloud Service.
- Docker
- AWS CLI installed and configured with credentials.
  - In later steps, you will need to define `aws_access_key`, `aws_secret_key`, and `security_token` in the `variables.tf` file under the directory `./terraform-aws/previews`, which will be used by the Terraform script.
- An AWS account with permission to create an ECR private repository and push Docker images into it.
- Terraform
Clone the repository:

```shell
$ git clone https://github.com/IBM/aspera-on-cloud-file-previews.git
```
The following settings can be adjusted before the installation depending on your use case:

- Change the value of `preview_duration` inside `./terraform-aws/previews/variables.tf` to increase the duration of a preview for video files.
  - Default value is 15 seconds.
  - The value can be changed anytime in the AWS Lambda page, under Configuration -> Environment variables.
- To add audio to preview videos, set the value of `preview_audio` in the `./terraform-aws/previews/variables.tf` file to `true`.
  - Default value is `false`.
  - The value can also be changed anytime in the AWS Lambda page.
- There will be 2 instances of File Preview in AWS Lambda: one with high resources, used for `video` preview processing, and another with low resources, for `image` thumbnail processing. Depending on the file extension, one or the other is invoked, to reduce the costs of the running AWS Lambda instances.
  - The environment variables `high_resource_lambda_name` and `low_resource_lambda_name` in the AWS page do not need to be changed, unless the names of the Lambda functions are manually changed outside of Terraform.
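Both settings can also be updated from the command line instead of the AWS console. A minimal sketch, assuming the AWS CLI is configured; the function name below is a placeholder, so substitute the name Terraform created in your account:

```shell
# Update preview settings on a deployed Lambda without re-running
# Terraform. "my-previews-high-resource" is a placeholder name.
# Note: --environment replaces the entire variable map, so include
# every environment variable the function needs, not just these two.
aws lambda update-function-configuration \
  --function-name my-previews-high-resource \
  --environment "Variables={preview_duration=30,preview_audio=true}"
```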
Create an AWS ECR private repository (if one does not exist already):

```shell
$ aws ecr create-repository \
    --repository-name {repo_name} \
    --image-scanning-configuration scanOnPush=true \
    --region {region}
```

Note: `{repo_name}` can be anything, for example "previews-terraform". In later Terraform steps, you will also need to set the `{account_id}`, `{repo_name}`, and `{region}` values in the `./terraform-aws/previews/variables.tf` file.
Give Docker access to push images to AWS ECR private repositories:

```shell
$ aws ecr get-login-password --region {region} | docker login --username AWS --password-stdin {account_id}.dkr.ecr.{region}.amazonaws.com
```

The `{account_id}` can be fetched with:

```shell
$ aws sts get-caller-identity | grep Account
```
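Alternatively, the AWS CLI's `--query` option can extract the account ID directly, without grep:

```shell
# Print only the 12-digit account ID, with no JSON wrapping.
aws sts get-caller-identity --query Account --output text
```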
The core function that generates previews for the uploaded files.

There are 4 different build arguments for the container that define the codec used to generate previews for uploaded videos. These four options are available because the user must choose between them, depending on whether they are willing to pay license fees (x264), comply with certain conditions to avoid paying royalty fees (openh264), or simply use open-source codecs such as vp9 and av1 within File Preview.

The `encoder` argument can be set to one of the values in this list: `['vp9', 'av1', 'x264', 'openh264']`. An example command to build the Docker image with vp9 installed:

```shell
$ cd previews && docker build --build-arg encoder=vp9 -t {account_id}.dkr.ecr.{region}.amazonaws.com/{repo_name}:previews .
```
Note: `{repo_name}` has to match the AWS ECR private repository you created previously.
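Before pushing, you can optionally sanity-check that the chosen encoder was compiled into the image. This is a sketch that assumes the `ffmpeg` binary is on the PATH inside the container, which may differ depending on how the Dockerfile lays out the image:

```shell
# Override the Lambda entrypoint and list the available encoders;
# grep for the codec passed as the build argument (vp9 here).
docker run --rm --entrypoint ffmpeg \
  {account_id}.dkr.ecr.{region}.amazonaws.com/{repo_name}:previews \
  -hide_banner -encoders | grep -i vp9
```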
Checks existing items in the bucket on a particular path and calls File Preview on each file that does not have a preview.

```shell
$ cd ../previews-checker && docker build -t {account_id}.dkr.ecr.{region}.amazonaws.com/{repo_name}:checker .
```
This function is only used to filter S3 upload triggers by file extension, using a low-resource Lambda, to reduce costs in the long run.

```shell
$ cd ../previews-filter && docker build -t {account_id}.dkr.ecr.{region}.amazonaws.com/{repo_name}:filter .
```
Push all the Docker images to your ECR repository:

```shell
$ docker push {account_id}.dkr.ecr.{region}.amazonaws.com/{repo_name}:previews
$ docker push {account_id}.dkr.ecr.{region}.amazonaws.com/{repo_name}:checker
$ docker push {account_id}.dkr.ecr.{region}.amazonaws.com/{repo_name}:filter
```
The next step is to use the Terraform script to install these AWS Lambda functions automatically.
- Update both `aws_secret_key` and `aws_access_key` in `./terraform-aws/previews/variables.tf`. Make sure to use a `security_token` when using short-term credentials. Your account must be able to install the following resources:
  - Bucket
  - IAM roles and policies
  - Lambda functions

- Change variable values as needed in `terraform-aws/previews/variables.tf`.
  - Note: `image_uri_previews`, `image_uri_checker`, and `image_uri_filter` should match the Docker images pushed to your ECR repository in the previous step.

- Initialize a working directory containing the Terraform configuration files:

  ```shell
  $ cd terraform-aws/previews
  $ terraform init
  ```

- The Terraform script will create a new bucket. To import an existing bucket instead, make sure the `bucket_names` value in `variables.tf` is properly set, then run (optional):

  ```shell
  $ ./import.sh
  ```

  To find the `{bucket_names}` for your existing AWS S3 buckets:
  - Sign in to your AWS account and, in the search bar, search for the keyword `S3`.
  - In the Buckets list, choose the names of the buckets that you want to select, and use those names as the list for `{bucket_names}`.
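As an alternative to the console, the bucket names can also be listed with the AWS CLI:

```shell
# List every bucket name in the account; pick the ones to use for
# the bucket_names variable in variables.tf.
aws s3api list-buckets --query "Buckets[].Name" --output text
```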
- Create an execution plan that lets you preview the changes that Terraform plans to make to your infrastructure:

  ```shell
  $ terraform plan
  ```

- Validate the configuration files:

  ```shell
  $ terraform validate
  ```

- Execute the actions proposed in the Terraform plan:

  ```shell
  $ terraform apply
  ```
After running `terraform apply`, the following resources will be created:

- S3 bucket with private access (if no existing one was provided).
- New IAM role.
- 4 different Lambda functions:
  - 2 of them use the same container for File Preview; the difference between them is the resources allocated, to reduce costs.
  - A filtering function with minimum resources to invoke File Preview on S3 triggers; again, used to reduce costs.
  - A previews checker that navigates through a specific path (which could be the root) in an S3 bucket to generate previews for existing files.
- IAM policies for the new role:
  - `logs:CreateLogGroup`, `logs:CreateLogStream`, `logs:PutLogEvents` to generate logs.
  - `s3:GetObject`, `s3:PutObject`, `s3:ListBucket`, `s3:GetObjectTagging`, `s3:PutObjectTagging`, necessary to upload, fetch, and list files, and to tag them to mark that they have already been processed. Only assigned to the new S3 bucket.
  - `lambda:InvokeAsync`, `lambda:InvokeFunction`, `lambda:GetFunctionConfiguration`, needed to invoke a Lambda function. Assigned to both File Preview Lambdas.
- AWS S3 trigger for each file uploaded, assigned to the filtering function.
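To confirm the functions were created, you can list them with the AWS CLI. The name filter below is illustrative; adjust it to match the function names set in your `variables.tf`:

```shell
# Show deployed Lambda functions whose names contain "preview".
aws lambda list-functions \
  --query "Functions[?contains(FunctionName, 'preview')].FunctionName" \
  --output table
```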
Once the installation is complete, please wait a couple of minutes before uploading or transferring new files.

To generate previews for files that already exist in your S3 bucket, use `previews-checker`. You need to define the following parameters to invoke the AWS Lambda function for `previews-checker`:
- `path`: Location in the bucket where you want to check for previews. It can be left empty to invoke the Lambda function from the bucket root.
- `bucket`: Bucket name. The associated role should have access to it.
Run the following command to invoke the `previews-checker` Lambda function with the previously defined parameters (note that the payload values must be quoted JSON strings):

```shell
$ aws lambda invoke \
    --function-name {checker_function_name} \
    --invocation-type Event \
    --payload '{ "path": "{my_path}", "bucket": "{bucket_name}" }' \
    --region {region} \
    response.json
```
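Because `--invocation-type Event` is asynchronous, the command returns immediately (HTTP 202) and `response.json` contains no result. To watch the checker's progress, you can tail its CloudWatch logs. This requires AWS CLI v2, and the log group name below assumes the default `/aws/lambda/<function name>` convention:

```shell
# Stream the checker function's logs as it processes existing files.
aws logs tail /aws/lambda/{checker_function_name} --follow --region {region}
```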
If for whatever reason the whole previews folder structure or one of its subfolders is deleted, `previews-checker` can be used to regenerate previews for those files.

Optional: You can delete the `previews-checker` Lambda, since it is only needed once to generate previews for existing files. Here is the command to delete the Lambda function and the log group of `previews-checker`:

```shell
$ terraform destroy --target aws_cloudwatch_log_group.terraform_log_checker --target aws_lambda_function.terraform_lambda_checker
```
Delete the Docker image from your ECR private repository:

```shell
$ aws ecr batch-delete-image \
    --repository-name {repo_name} \
    --region {region} \
    --image-ids imageTag=checker
```
Delete every resource created by Terraform:

```shell
$ terraform destroy
```

Note: In some cases `terraform destroy` will not be able to remove the S3 bucket because it still contains files. If you want to delete your bucket, you will need to empty it first. To empty the bucket without deleting it, use the `--recursive` flag on the `rm` command:

```shell
$ aws s3 rm s3://my_bucket --recursive
```

Then run `terraform destroy` again to delete the bucket.
Remove the Docker images from the ECR repository (pass one `imageTag` per image you want to delete):

```shell
$ aws ecr batch-delete-image \
    --repository-name {repo_name} \
    --region {region} \
    --image-ids imageTag=previews imageTag=checker imageTag=filter
```

Delete the ECR repository (optional):

```shell
$ aws ecr delete-repository \
    --repository-name {repo_name} \
    --force
```
- Extensions `.mxf` and `.divx` will only have their preview displayed as an image.
- There is something else besides the moov atom when piping file bytes.
- Both `.flv` and `.mpeg` extensions don't have a moov atom, yet only `.flv` can be piped into FFmpeg.
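The piping behavior can be reproduced locally, assuming `ffprobe` is installed; `sample.flv` is a placeholder input file:

```shell
# Feed the file to ffprobe over stdin, the same way the Lambda pipes
# bytes into FFmpeg. Containers whose index data (e.g. the moov atom
# in MP4/MOV) sits at the end of the file may fail here even though
# they probe fine when read directly from disk.
cat sample.flv | ffprobe -v error -show_entries format=format_name -
```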