ETL Pipeline v1

Introduction

This repository contains an ETL pipeline that carries out the following tasks:

  • Extracts 400k transactions from Redshift
  • Identifies and removes duplicates
  • Loads the transformed data to an S3 bucket
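
As a rough illustration of the transform step, the sketch below removes duplicate rows and serializes the result to CSV text. It is not taken from main.py: the function names, the sample rows, and the column names (transaction_id, amount) are hypothetical, and the Redshift query and S3 upload are left out. It uses only the standard library.

```python
import csv
import io


def remove_duplicates(rows, key_fields=None):
    """Return rows with duplicates dropped, keeping the first occurrence.

    key_fields lists the columns that identify a transaction; when it is
    None, two rows count as duplicates only if every field matches.
    """
    seen = set()
    unique = []
    for row in rows:
        fields = key_fields if key_fields is not None else sorted(row)
        key = tuple((field, row[field]) for field in fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def to_csv(rows):
    """Serialize rows to CSV text, ready to be uploaded as a single object."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample standing in for rows extracted from Redshift.
transactions = [
    {"transaction_id": 1, "amount": "9.99"},
    {"transaction_id": 2, "amount": "4.50"},
    {"transaction_id": 1, "amount": "9.99"},  # exact duplicate of the first row
]

deduplicated = remove_duplicates(transactions)
csv_body = to_csv(deduplicated)
```

Keeping the first occurrence preserves the original row order, which makes the output easy to compare against the source data.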

Requirements

The minimum requirements:

  • Python 3+

Instructions on how to execute the code

  1. Clone the repository, and go to the week19 folder

  2. Install the libraries needed to run main.py
pip3 install -r requirements.txt
  3. Copy the .env.copy file to .env and fill out the environment variables.

  4. Run the main.py script

Mac users:
python3 main.py
If python3 is not available on your system, use:
python main.py
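
A script that depends on environment variables can check for them up front so that a missing value fails with a clear message. The sketch below shows one way to do that. It is an assumption, not the code in main.py: the variable names are hypothetical, and it reads from the process environment, so the .env file would still need to be loaded first (for example with a library such as python-dotenv, if the project uses one).

```python
import os


def load_settings(required, environ=None):
    """Return the required variables as a dict, or raise if any are unset or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in required}


# Hypothetical variable names; use the ones listed in .env.copy instead.
example_env = {"REDSHIFT_HOST": "example-host", "S3_BUCKET": "example-bucket"}
settings = load_settings(["REDSHIFT_HOST", "S3_BUCKET"], environ=example_env)
```

Passing the environment in as an argument keeps the check easy to test without touching the real process environment.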