Start the PySpark framework with Jupyter Notebook and Snowflake Data Warehouse using JSON and CSV data files

Learn PySpark with Jupyter Notebook and Snowflake Data Warehouse by loading and unloading JSON/CSV files
  1. Sign up for a trial Snowflake Data Warehouse account at https://signup.snowflake.com/ so you can load and unload data with SnowSQL and PySpark.
  2. Python 3.7 or later is required on your system. Check the installed version:
    python --version
  3. Upgrade pip: python -m pip install --upgrade pip
  4. Install PySpark: pip install pyspark
  5. Install Jupyter Notebook: pip install notebook
  6. Run Jupyter Notebook: python -m notebook
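
Once the steps above are done, a quick check from Python can confirm the prerequisites are in place. This is a minimal sketch using only the standard library; the helper name `check_environment` is hypothetical, and it only tests that the packages can be found, not that they work.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 7)  # minimum version named in step 2


def check_environment(packages=("pyspark", "notebook")):
    """Return a dict mapping each prerequisite to True (satisfied) or False."""
    report = {"python": sys.version_info[:2] >= MIN_PYTHON}
    for name in packages:
        # find_spec returns None when a top-level package is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report


for item, ok in check_environment().items():
    print(f"{item}: {'ok' if ok else 'not satisfied'}")
```

Any entry reported as not satisfied points back to the matching install step above.
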
Start PySpark in Jupyter Notebook using Python
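
A notebook cell along these lines could start a local Spark session. This is a minimal sketch: the helper name `start_spark`, the application name, and the `local[*]` master are assumptions for illustration, not settings taken from the original walkthrough.

```python
def start_spark(app_name="pyspark-snowflake-demo"):
    """Create or reuse a local SparkSession (app_name is a hypothetical label)."""
    # Imported inside the function so the definition loads even before pyspark is installed
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master("local[*]")   # assumption: run Spark locally on all available cores
        .appName(app_name)
        .getOrCreate()        # reuses an existing session if the notebook already has one
    )
```

In a notebook cell, `spark = start_spark()` followed by `spark.version` should show the installed Spark version. Note that PySpark also needs a Java runtime on the machine.
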
Extract data from a Snowflake shared database
Unload the extracted data into the Snowflake user stage
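
Unloading a table to the user stage (`@~`) is done with a `COPY INTO <location>` statement. The sketch below only builds the SQL text, to be pasted into SnowSQL or a worksheet. It assumes the shared database is Snowflake's sample data (`SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER`); the helper name `unload_sql` and the stage paths are hypothetical.

```python
def unload_sql(table, stage_path, file_type="CSV"):
    """Build a COPY INTO <location> statement that unloads a table to the user stage (@~)."""
    file_type = file_type.upper()
    if file_type == "CSV":
        source = table
        # Quote fields so commas inside text columns do not break the CSV
        options = (
            "FILE_FORMAT = (TYPE = CSV COMPRESSION = NONE "
            "FIELD_OPTIONALLY_ENCLOSED_BY = '\"') HEADER = TRUE"
        )
    elif file_type == "JSON":
        # JSON unloads need a single VARIANT column, so each row is wrapped in an object
        source = f"(SELECT OBJECT_CONSTRUCT(*) FROM {table})"
        options = "FILE_FORMAT = (TYPE = JSON COMPRESSION = NONE)"
    else:
        raise ValueError(f"unsupported file type: {file_type}")
    return f"COPY INTO @~/{stage_path}/ FROM {source} {options} OVERWRITE = TRUE;"


TABLE = "SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER"  # assumption: sample shared database
print(unload_sql(TABLE, "unload/customer_csv", "csv"))
print(unload_sql(TABLE, "unload/customer_json", "json"))
```

`COMPRESSION = NONE` is chosen here so the downloaded files are plain text; dropping it would produce gzip files instead.
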
Download the CSV/JSON data files using the SnowSQL GET command
SnowSQL GET command to download the JSON/CSV files
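
The download step can be sketched as a small Python helper that assembles the `snowsql` call around a `GET` statement. The account, user, stage path, and local directory below are placeholders, and `snowsql_get_command` is a hypothetical helper; the local directory is assumed to exist already.

```python
import shlex


def snowsql_get_command(account, user, stage_path, local_dir):
    """Build the argument list for a snowsql call that downloads staged files."""
    # GET copies files from the user stage (@~) to a local directory
    query = f"GET @~/{stage_path}/ file://{local_dir}/;"
    return ["snowsql", "-a", account, "-u", user, "-q", query]


cmd = snowsql_get_command(
    "<account_identifier>",   # placeholder: your Snowflake account
    "<user_name>",            # placeholder: your Snowflake user
    "unload/customer_csv",    # hypothetical stage path
    "/tmp/snowflake_data",    # hypothetical local directory
)
# Print a shell-safe version of the command for copy and paste
print(" ".join(shlex.quote(part) for part in cmd))
```

The list can also be passed to `subprocess.run(cmd, check=True)`. SnowSQL prompts for the password unless it is supplied another way, for example through the `SNOWSQL_PWD` environment variable.
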
PySpark DataFrame operations, filters, etc.
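
Before any DataFrame operation, the downloaded files have to be read into Spark. A minimal sketch, assuming an active `spark` session, a CSV file unloaded with a header row, and a JSON file with one object per line; the function name and paths are hypothetical.

```python
def load_downloaded_files(spark, csv_path, json_path):
    """Read the downloaded CSV and JSON files into two DataFrames."""
    # header=True uses the first row as column names; inferSchema=True guesses column types
    csv_df = spark.read.csv(csv_path, header=True, inferSchema=True)
    # spark.read.json expects one JSON object per line by default
    json_df = spark.read.json(json_path)
    return csv_df, json_df
```

After calling it, `csv_df.printSchema()` and `csv_df.show(5)` are a quick way to check that the columns and types came through as expected.
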
PySpark data filters and methods
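
Filtering can be sketched as follows. The column names (`C_CUSTKEY`, `C_NAME`, `C_ACCTBAL`, `C_MKTSEGMENT`) assume the data came from the TPC-H `CUSTOMER` sample table, and the threshold and segment values are arbitrary illustrations.

```python
def high_balance_customers(df, min_balance=5000, segment="BUILDING"):
    """Keep customers above a balance threshold within one market segment."""
    return (
        df.filter(f"C_ACCTBAL > {min_balance}")        # SQL-style string condition
          .filter(df["C_MKTSEGMENT"] == segment)       # column-expression condition
          .select("C_CUSTKEY", "C_NAME", "C_ACCTBAL", "C_MKTSEGMENT")
          .orderBy("C_ACCTBAL", ascending=False)       # highest balances first
    )
```

Both filter styles are shown on purpose: string conditions are compact, while column expressions are easier to build up programmatically. Calling `.show(10)` on the result displays the first rows.
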
PySpark aggregate functions
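
A grouped aggregation might look like the sketch below. It again assumes TPC-H `CUSTOMER` column names; the function name and output aliases are hypothetical.

```python
def balance_by_segment(df):
    """Summarise account balances per market segment."""
    # Lazy import: pyspark is only needed when the function is called
    from pyspark.sql import functions as F

    return (
        df.groupBy("C_MKTSEGMENT")
          .agg(
              F.count("*").alias("customers"),           # rows per segment
              F.avg("C_ACCTBAL").alias("avg_balance"),   # mean balance per segment
              F.max("C_ACCTBAL").alias("max_balance"),   # largest balance per segment
          )
          .orderBy("C_MKTSEGMENT")
    )
```

`balance_by_segment(csv_df).show()` would then print one row per market segment.
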