cubems-data-pipeline

Data Pipeline for Data Analysis built separately from CUBEMS environment

Running Dataflow Pipeline

1. Go into the `dataflow` folder.

2. Set up a virtual environment:

       python3 -m venv venv

3. Activate the virtual environment:
   • Windows: `venv\Scripts\activate.bat`
   • Linux/Mac: `source venv/bin/activate`

4. Install the required packages:

       pip3 install -r requirements.txt
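
The contents of `requirements.txt` are not listed in this README. Assuming the pipeline is built on the Apache Beam Python SDK with the Google Cloud extras (an assumption, not confirmed by the repo), it would contain something like:

```text
# Hypothetical requirements.txt — package names are assumptions
apache-beam[gcp]
python-dotenv
```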

Deploy Dataflow Pipeline

1. Set the GOOGLE_APPLICATION_CREDENTIALS environment variable:

       export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json

   or create a `.env` file with the following content:

       GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json

2. Run the pipeline. See `direct-runner.py` for the local pipeline template and `dataflow-runner.py` for the Dataflow pipeline template:

       python3 pipeline-file.py

3. Deploy a Dataflow template. The filename must be the same as the template name, and the file must follow the guideline at https://cloud.google.com/dataflow/docs/guides/templates/creating-templates:

       python3 deploy.py --template [template name]

Deploying Firebase Functions

1. Go into the `functions` folder.

2. Install dependencies:

       yarn

3. Deploy to Firebase:

       yarn deploy
