This project demonstrates how to build an ETL (Extract, Transform, Load) pipeline with the Serverless Framework on AWS. The pipeline extracts data from a source S3 bucket, applies a simple transformation, and loads the transformed data into a destination S3 bucket. This README provides an overview of the project along with instructions for deployment and usage.
Before you begin, ensure you have met the following requirements:
- Node.js and npm installed (project dependencies are installed with npm).
- The Serverless Framework installed globally.
- An AWS account with appropriate permissions, and AWS credentials configured locally.
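A quick way to confirm the toolchain is in place is to check each tool's version (these are the standard version flags; adjust if your setup differs):

```bash
# Verify that the required tooling is installed
node --version        # Node.js runtime
npm --version         # npm package manager
serverless --version  # Serverless Framework CLI
```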
Clone this repository to your local machine and navigate to the project folder:
```bash
git clone https://github.com/your-username/serverless-etl-pipeline.git
cd serverless-etl-pipeline
```
Install project dependencies using npm:
```bash
npm install
```
In the serverless.yml file, you can configure the AWS region, event triggers, and other settings for your ETL pipeline. Make sure to customize this file according to your specific requirements.
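For illustration only, a minimal serverless.yml for this kind of pipeline might look like the sketch below; the function name, handler path, bucket names, and DEST_BUCKET environment variable are all placeholder assumptions rather than values taken from this project:

```yaml
service: serverless-etl-pipeline

provider:
  name: aws
  runtime: nodejs18.x
  region: us-east-1                          # set your preferred AWS region
  environment:
    DEST_BUCKET: my-etl-destination-bucket   # hypothetical destination bucket

functions:
  etl:                                       # hypothetical function name
    handler: handler.etl                     # hypothetical handler file/export
    events:
      - s3:
          bucket: my-etl-source-bucket       # hypothetical source bucket
          event: s3:ObjectCreated:*          # fire on object creation
```

A real configuration would also need IAM permissions allowing the function to read from the source bucket and write to the destination bucket.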
Deploy the ETL pipeline using the Serverless Framework:
```bash
serverless deploy
```
This command will package and deploy the service to your AWS account.
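The Serverless CLI also accepts standard flags to target a specific stage or region, and serverless remove deletes the deployed stack when you no longer need it:

```bash
# Deploy to an explicit stage and region
serverless deploy --stage prod --region us-east-1

# Tear the stack down again
serverless remove
```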
The ETL pipeline is triggered automatically when an object is created in the source S3 bucket. The steps are as follows:

1. Extract: data is read from the source S3 bucket.
2. Transform: the Lambda function applies a simple transformation to the data.
3. Load: the transformed data is written to the destination S3 bucket.

You can customize the transformation logic in the Lambda function to suit your ETL requirements; a sketch of what such a handler might look like appears below.
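As a minimal sketch, assuming a Node.js handler using AWS SDK v3 and the hypothetical DEST_BUCKET variable from the configuration sketch above (neither of which is confirmed by this project), the handler might look roughly like this:

```javascript
// handler.js -- hypothetical ETL handler (names assumed, not from this project)
const { S3Client, GetObjectCommand, PutObjectCommand } = require('@aws-sdk/client-s3');

const s3 = new S3Client({});

module.exports.etl = async (event) => {
  for (const record of event.Records) {
    const sourceBucket = record.s3.bucket.name;
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));

    // Extract: read the object that triggered the event
    const { Body } = await s3.send(
      new GetObjectCommand({ Bucket: sourceBucket, Key: key })
    );
    const input = await Body.transformToString('utf-8');

    // Transform: placeholder logic -- here, simply upper-casing the content
    const output = input.toUpperCase();

    // Load: write the result to the destination bucket
    await s3.send(
      new PutObjectCommand({
        Bucket: process.env.DEST_BUCKET, // assumed to be set in serverless.yml
        Key: key,
        Body: output,
      })
    );
  }
};
```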
You can monitor the execution of your ETL pipeline through the CloudWatch Logs emitted by each Lambda function (and in the AWS Step Functions console, if you orchestrate the pipeline with Step Functions). Configure CloudWatch Alarms and other monitoring solutions as needed for real-time insights.
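For command-line access to the same logs, the Serverless Framework can tail a function's CloudWatch output (etl is the hypothetical function name used in the sketches above):

```bash
# Tail CloudWatch Logs for the (hypothetical) etl function
serverless logs --function etl --tail
```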
Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or create a pull request.
This project is licensed under the MIT License - see the LICENSE file for details.