This captures details on how to use Key Pairs to connect to Snowflake using PySpark.

dipaktec/snowflake-rsa-auth

snowflake-rsa-auth

Use Key Pairs to connect to Snowflake using PySpark.

Background and Scope

Snowflake data can be accessed from several Snowflake clients (e.g. SnowSQL CLI, JDBC Driver, Snowflake Connector for Spark). For more details on Snowflake connectors and drivers, see the Snowflake documentation.

While Snowflake allows basic (username and password) authentication, it also supports key pair (RSA) authentication for enhanced security.

This guide shows how quickly we can build a PySpark application that connects this way.

Prerequisites

  • A Snowflake account (you can use the free trial for 30 days)
  • A completed Spark installation
  • The Snowflake Connector for Spark installed
  • Any IDE or text editor to write the PySpark code

Steps

1. Generate Private Key

Snowflake accepts both encrypted and unencrypted private keys, but some clients (e.g. SnowSQL CLI) require an encrypted key. Encrypted keys are also the recommended option. Here we use OpenSSL to create the keys.

  • Create Unencrypted key
$ openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out sf_rsa_key.p8 -nocrypt
  • Create Encrypted key
$ openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out sf_rsa_key.p8
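
If you prefer to drive the same two OpenSSL commands from Python, a minimal sketch could look like the following. The function names `pkcs8_command` and `generate_private_key` are hypothetical helpers introduced for illustration; the sketch assumes `openssl` is on your PATH and simply reproduces the pipe shown above.

```python
import subprocess


def pkcs8_command(out_path, encrypted=True):
    """Build the `openssl pkcs8` argument list used in the commands above."""
    cmd = ["openssl", "pkcs8", "-topk8", "-inform", "PEM", "-out", out_path]
    if not encrypted:
        # -nocrypt writes the key without passphrase protection
        cmd.append("-nocrypt")
    return cmd


def generate_private_key(out_path, encrypted=True, bits=2048):
    """Run `openssl genrsa <bits> | openssl pkcs8 ...` (hypothetical helper).

    For an encrypted key, OpenSSL itself prompts for the passphrase.
    """
    genrsa = subprocess.run(
        ["openssl", "genrsa", str(bits)],
        check=True,
        stdout=subprocess.PIPE,
    )
    # Feed the generated key into pkcs8 on stdin, like the shell pipe does
    subprocess.run(pkcs8_command(out_path, encrypted), input=genrsa.stdout, check=True)
```

Calling `generate_private_key("sf_rsa_key.p8", encrypted=False)` would be the equivalent of the unencrypted command above.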

2. Generate Public Key

This step creates the public key from the private key.

openssl rsa -in sf_rsa_key.p8 -pubout -out sf_rsa_key.pub

3. Assign public key to Snowflake user

Open the public key file in a text editor (I used VS Code) and copy the key, leaving out the -----BEGIN PUBLIC KEY----- and -----END PUBLIC KEY----- lines. Then execute the statement below from the Snowflake UI or CLI.

alter user <username> set RSA_PUBLIC_KEY = '<key-value>';
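
Copying the key by hand is easy to get wrong, so here is a small sketch that builds the statement from the contents of the public key file. `pem_body` and `alter_user_statement` are hypothetical helper names, not part of any Snowflake library; the sketch only strips the PEM marker lines and line breaks and does no escaping or validation of the username.

```python
def pem_body(pem_text):
    """Return the base64 body of a PEM block without marker lines or newlines."""
    lines = [line.strip() for line in pem_text.strip().splitlines()]
    return "".join(line for line in lines if line and not line.startswith("-----"))


def alter_user_statement(username, public_key_pem):
    """Build the ALTER USER statement shown above (hypothetical helper)."""
    return f"alter user {username} set RSA_PUBLIC_KEY = '{pem_body(public_key_pem)}';"
```

For example, `alter_user_statement("<username>", open("sf_rsa_key.pub").read())` returns a statement you can paste into the Snowflake UI or CLI.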

4. Verify

Use the command below to verify that the public key has been added.

desc user <username>
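
The output of desc user includes an RSA_PUBLIC_KEY_FP property holding a SHA-256 fingerprint of the key. As a rough check, you can compute the same fingerprint locally and compare the two values. The sketch below assumes the public key file produced in step 2; `public_key_fingerprint` is a hypothetical helper name.

```python
import base64
import hashlib


def public_key_fingerprint(public_key_pem):
    """Compute 'SHA256:<base64 digest>' over the DER bytes of a PEM public key."""
    lines = [line.strip() for line in public_key_pem.strip().splitlines()]
    body = "".join(line for line in lines if line and not line.startswith("-----"))
    der = base64.b64decode(body)  # the PEM body is the base64-encoded DER key
    digest = hashlib.sha256(der).digest()
    return "SHA256:" + base64.b64encode(digest).decode("ascii")
```

If the value returned for sf_rsa_key.pub matches RSA_PUBLIC_KEY_FP, the key stored on the user is the one you generated.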

5. Configure Snowflake Client (in this case a PySpark script) to use RSA authentication

The PySpark code is added here. The code reads the private key, creates a Spark session, builds the Snowflake connection context, and finally connects to Snowflake to read data.

To execute the code, use the command below:

$ spark-submit .../TestSnowflakeRSA.py

Tip: Add these jars to the Spark classpath: snowflake-jdbc-3.13.3.jar and spark-snowflake_2.12-2.8.5-spark_3.0.jar.
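
For orientation, the flow described in this step (read the private key, create a Spark session, build the Snowflake options, read data) can be sketched as below. This is not the repository's TestSnowflakeRSA.py; it is a minimal illustration under stated assumptions. The helper names are hypothetical, every value in angle brackets is a placeholder, and decrypting an encrypted key assumes the third-party `cryptography` package is installed. The connector is given the key body through its `pem_private_key` option.

```python
SNOWFLAKE_SOURCE = "net.snowflake.spark.snowflake"


def strip_pem_markers(pem_text):
    """Return the base64 body of a PEM block without marker lines or newlines."""
    lines = [line.strip() for line in pem_text.strip().splitlines()]
    return "".join(line for line in lines if line and not line.startswith("-----"))


def load_private_key_body(path, passphrase=None):
    """Read a PKCS#8 PEM key and return its unencrypted base64 body."""
    with open(path, "rb") as fh:
        pem_bytes = fh.read()
    if passphrase is not None:
        # Assumption: `cryptography` is available to decrypt the key
        from cryptography.hazmat.backends import default_backend
        from cryptography.hazmat.primitives import serialization

        key = serialization.load_pem_private_key(
            pem_bytes, password=passphrase.encode(), backend=default_backend()
        )
        pem_bytes = key.private_bytes(
            encoding=serialization.Encoding.PEM,
            format=serialization.PrivateFormat.PKCS8,
            encryption_algorithm=serialization.NoEncryption(),
        )
    return strip_pem_markers(pem_bytes.decode("utf-8"))


def snowflake_options(url, user, key_body, database, schema, warehouse):
    """Build the connector options, using key pair auth instead of a password."""
    return {
        "sfURL": url,
        "sfUser": user,
        "pem_private_key": key_body,
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }


def read_table(spark, options, table):
    """Load a Snowflake table as a DataFrame through the Spark connector."""
    return (
        spark.read.format(SNOWFLAKE_SOURCE)
        .options(**options)
        .option("dbtable", table)
        .load()
    )


def main():
    from pyspark.sql import SparkSession  # imported here so the helpers load without Spark

    spark = SparkSession.builder.appName("TestSnowflakeRSA").getOrCreate()
    key_body = load_private_key_body("sf_rsa_key.p8")  # pass passphrase=... if encrypted
    options = snowflake_options(
        "<account>.snowflakecomputing.com", "<username>", key_body,
        "<database>", "<schema>", "<warehouse>",
    )
    read_table(spark, options, "<table>").show()
```

Add a call to `main()` at the bottom of the script before running it with spark-submit. Keeping the option building separate from the Spark calls makes it possible to check the key handling without a cluster.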

Key points to note

  • This authentication method requires, as a minimum, a 2048-bit RSA key pair.
  • Snowflake supports uninterrupted rotation of public keys. It uses two public key properties on the user, RSA_PUBLIC_KEY and RSA_PUBLIC_KEY_2, to do so.
  • Creating an encrypted private key requires a passphrase. Snowflake recommends a passphrase that complies with the PCI DSS standard.
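
To illustrate the rotation point above: the new key goes into whichever of the two properties is currently unused, clients are switched to the new private key, and only then is the old property unset. A minimal sketch that generates the two statements follows. `rotation_statements` is a hypothetical helper, and the key body is assumed to have its PEM marker lines already removed.

```python
KEY_SLOTS = ("RSA_PUBLIC_KEY", "RSA_PUBLIC_KEY_2")


def rotation_statements(username, new_key_body, active_slot="RSA_PUBLIC_KEY"):
    """Return [set-new-key, unset-old-key] statements for a key rotation."""
    if active_slot not in KEY_SLOTS:
        raise ValueError(f"active_slot must be one of {KEY_SLOTS}")
    # The new key is stored in the property that is not currently in use
    spare_slot = KEY_SLOTS[1] if active_slot == KEY_SLOTS[0] else KEY_SLOTS[0]
    return [
        f"alter user {username} set {spare_slot} = '{new_key_body}';",
        f"alter user {username} unset {active_slot};",
    ]
```

Run the first statement, update the clients to use the new private key, and run the second statement once nothing depends on the old key.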