
This post might be related to this one. I would like to encrypt a .csv file with a password or token. I would then like to write a script that decrypts the file using the password, reads in the .csv file as a data frame, and continues doing data analysis with the content. How would one achieve this?

Example:

import pandas as pd
import csv

# 1.) Create the .csv file
super_secret_dict = {'super_secret_information':'foobar'}

with open('super_secret_csv.csv', 'w', newline='') as f:
    w = csv.DictWriter(f,super_secret_dict.keys())
    w.writeheader()
    w.writerow(super_secret_dict)

# 2.) Now encrypt the .csv file with a very safe encryption method and generate
# a password/token that can be shared with people that should have access to the
# encrypted .csv file
# ...
# ...
# 3.) Every time a user wants to read in the .csv file (e.g. using pd.read_csv())
# the script should ask the user to type in the password, then read in
# the .csv file and then continue running the rest of the script

super_secret_df = pd.read_csv('./super_secret_csv.csv')
Johannes Wiesner

1 Answer

1

You can use the cryptography library, whose Fernet recipe provides symmetric, authenticated encryption.

Create a Key:

from cryptography.fernet import Fernet
key = Fernet.generate_key()
f = Fernet(key)

Save that key somewhere safe! Without it, the encrypted file cannot be decrypted.
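One way to save the key is to write its raw bytes to a file. This is a minimal sketch: `save_key` and its `key_path` argument are hypothetical names that are not part of the original answer, and where the key file should live is up to you.

```python
from pathlib import Path

from cryptography.fernet import Fernet


def save_key(key: bytes, key_path: str) -> None:
    """Write the raw Fernet key bytes to key_path (hypothetical helper)."""
    Path(key_path).write_bytes(key)


# Example usage (the file name is an assumption):
# key = Fernet.generate_key()
# save_key(key, "secret.key")
```

Anyone who can read this file can decrypt the data, so it should be kept out of version control and shared only with people who are supposed to have access.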

Load your key whenever you want to encrypt or decrypt:

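def load_key(key_path="PATH TO SECRET KEY"):  # placeholder: replace with the path to your key file
    with open(key_path, "rb") as key_file:
        return key_file.read()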
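Since the question asks for a password rather than a key file, an alternative is to derive the Fernet key from a typed password. The sketch below assumes PBKDF2 with SHA-256 from the same cryptography library. `key_from_password` is a hypothetical helper, and the iteration count is an assumption you may want to tune.

```python
import base64
import os

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC


def key_from_password(password: str, salt: bytes, iterations: int = 480_000) -> bytes:
    """Derive a Fernet-compatible key from a password (hypothetical helper)."""
    kdf = PBKDF2HMAC(
        algorithm=hashes.SHA256(),
        length=32,  # Fernet expects 32 bytes before base64 encoding
        salt=salt,
        iterations=iterations,
    )
    # Fernet keys are the url-safe base64 encoding of 32 raw bytes
    return base64.urlsafe_b64encode(kdf.derive(password.encode("utf-8")))


# Example usage: generate the salt once with os.urandom(16) and store it next to
# the encrypted file (it is not secret), then prompt for the password:
# import getpass
# key = key_from_password(getpass.getpass("Password: "), salt)
# f = Fernet(key)
```

The same password and salt always produce the same key, so only the salt needs to be stored alongside the encrypted file.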

Encrypt your file

def encrypt_it(path_csv):
    """Encrypt the file at path_csv and write the result to 'encrypted_file.csv'."""
    # Create a Fernet instance from the secret key
    f = Fernet(load_key())

    # Read the plaintext bytes and encrypt them
    with open(path_csv, 'rb') as unencrypted:
        encrypted = f.encrypt(unencrypted.read())

    # Write the encrypted token to disk
    with open('encrypted_file.csv', 'wb') as encrypted_file:
        encrypted_file.write(encrypted)

Read it back later:

def decrypt_it(path_encrypted):
    """Return the decrypted contents of path_encrypted as bytes."""
    f = Fernet(load_key())
    with open(path_encrypted, 'rb') as encrypted_file:
        return f.decrypt(encrypted_file.read())
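Decryption returns bytes, not a data frame. To continue the analysis in pandas without writing the plaintext back to disk, the bytes can be wrapped in an in-memory buffer. This is a sketch: `read_encrypted_csv` is a hypothetical helper, and it takes the key as an argument rather than calling `load_key()` so that it is self-contained.

```python
import io

import pandas as pd
from cryptography.fernet import Fernet


def read_encrypted_csv(path_encrypted: str, key: bytes, **read_csv_kwargs) -> pd.DataFrame:
    """Decrypt a Fernet-encrypted CSV file and parse it into a DataFrame (hypothetical helper)."""
    with open(path_encrypted, "rb") as encrypted_file:
        decrypted = Fernet(key).decrypt(encrypted_file.read())
    # pd.read_csv accepts file-like objects, so the plaintext never touches disk
    return pd.read_csv(io.BytesIO(decrypted), **read_csv_kwargs)


# Example usage, assuming the key was loaded or derived beforehand:
# super_secret_df = read_encrypted_csv("encrypted_file.csv", key)
```

If the wrong key is supplied, `decrypt` raises `cryptography.fernet.InvalidToken`, which a script can catch to ask for the password again.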