I'm able to write pytest functions in a Palantir Foundry code repository by manually supplying column names and values to create a DataFrame, then passing it to the production code to check all the transformed field values.
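
For context, a minimal sketch of that manual approach, assuming a spark_session pytest fixture is available in the repository's test setup and a hypothetical transform_logic function under test:

from myproject.datasets.logic import transform_logic  # hypothetical module path

def test_transform_manually(spark_session):
    # Column names and values are hard-coded in the test
    df = spark_session.createDataFrame(
        [(1, "alice"), (2, "bob")],
        ["id", "name"],
    )

    # Pass the hand-built DataFrame to the production code
    result = transform_logic(df)

    # Check a transformed field value
    assert result.filter(result.id == 1).count() == 1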

Instead of manually passing column names and their respective values, I want to store all the required data in a dataset, import that dataset into the pytest function, fetch all the required values, and pass them to the production code to check all the transformed field values.

Is there any way to accept a dataset as an input to a test function in a Palantir Foundry code repository?

Gavisha BN

1 Answer


You can probably do something like this:

Let's say you have your CSV inside a fixtures/ folder next to your test:

test_yourtest.py
fixtures/yourfilename.csv

You can just read it directly and use it to create a new DataFrame. I didn't test this code, but it should be something similar to this:

import os
from pathlib import Path

def load_file():
    # Resolve the fixture path relative to this test file
    filename = "yourfilename.csv"
    file_path = os.path.join(Path(__file__).parent, "fixtures", filename)

    # Return the raw CSV contents as a string
    with open(file_path) as f:
        return f.read()

Now you can load your CSV; from there it's just a matter of loading it into a DataFrame and passing it into the PySpark logic that you want to test. See: Get CSV to Spark dataframe
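
For completeness, a hedged sketch of that last step, reusing the hypothetical transform_logic function from the question and assuming the same spark_session pytest fixture:

import os
from pathlib import Path

from myproject.datasets.logic import transform_logic  # hypothetical module path

def test_transform_from_fixture(spark_session):
    # Build a DataFrame straight from the fixture CSV, taking column
    # names from the header row and letting Spark infer the types
    csv_path = os.path.join(Path(__file__).parent, "fixtures", "yourfilename.csv")
    df = spark_session.read.csv(csv_path, header=True, inferSchema=True)

    # Pass the fixture-backed DataFrame to the production code
    result = transform_logic(df)

    # Check the transformed field values as before
    assert result.count() > 0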

fmsf