When we at Daltix decided to make Snowflake a central part of our technical strategy for our Data Warehousing and Data Mart needs, there was one very not small problem: a lack of a decent high-level and simple python connector.
While some very good fundamentally sound and API-complete options existed, such as a Snowflake SQLAlchemy connector as well as the more generic Snowflake Connector for python, we had a few simple needs:
- To be able to quickly and easily make ad-hoc connections to the database
- To be able to execute SQL strings against our Snowflake instance
- Make connections using credentials that are safely stored in a managed secrets vault and can be rotated
An additional requirement to #1 was that we, in no way ever, wanted the passwords to be passed to the connector via plaintext in the code itself. While we at Daltix take security very seriously and have several processes in place to ensure that plaintext passwords do not make it into our source code, we wanted our main method of connection to Snowflake to not even allow this as an option.
Requirement #2 come from the fact that mostly what we do with Snowflake are run pure-SQL ETLs, rather than do stuff like power web applications. Most of our pure-SQL ETLs are exactly that and make heavy use of uncommon and, in some cases, Snowflake-specific functionality on semi-structured fields like lists and objects. The implication here is that traditional ORMs are not necessary and even the relatively simple functionality of the python-snowflake-connector with it’s cursor management and whatnot is overkill.
Requirement #3 seems like a nice-to-have but in reality it is absolutely necessary. We want our analysts and Data Scientists to be able to make connections without wasting time reading docs about some AWS managed vault service while at the same time following best practices in terms of security.
So in order to meet these requirements, Dalton gave me the green light to build SnowConn. It is a very simple 300 lines of code (including docstrings) that is primarily a simple wrapper around the only required dependency: snowflake-sqlalchemy. It provides two different ways to connect:
- SnowConn.connect() leverages the same configuration file that SnowSQL CLI uses
- SnowConn.credsman_connect() leverages the AWS secretsmanager via the optional dependency of boto3
Although the first option does not remove the option of having plaintext passwords stored on the machine you are on, it does remove the option completely of passing them on as part of the API and thus as plaintext in our version controlled source code.
The rest is happy times! You just pip install snowconn, follow the instructions to setup your authentication, and you are a few lines of code away from executing queries against your Snowflake instance.
I would like to thank Daltix very much for allowing me the freedom to develop this solution as part of my contract with them. Supporting open source initiatives in this way is definitely something they should be commended for. If you are looking for an interesting job in a fast-paced and challenging but fair environment, I have no problem recommending it as a place to work.
Written by Sam Hopkins, who worked for Daltix as a Data Science and Data Engineering contractor through his company DareData Engineering.