
I have a Python script that opens a CSV file (3 GB, 18.9 million rows), reads it row by row using the csv library, and inserts each row into a Postgres DB. In an earlier version of this script I was using a MySQL DB instead, and I was also appending the rows to a list; once that list reached 20,000 rows I would loop through it and insert the entries one by one.
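Here is a simplified sketch of the current loop (psycopg2 is shown here, and the file, table, and column names are placeholders; the real script differs in those details):

```python
import csv
import psycopg2

conn = psycopg2.connect("dbname=mydb user=myuser")
cur = conn.cursor()

with open("data.csv", newline="") as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    for row in reader:
        # One INSERT statement per row -- this is the part running at ~100 rows/second
        cur.execute(
            "INSERT INTO records (col_a, col_b, col_c) VALUES (%s, %s, %s)",
            row,
        )

conn.commit()
cur.close()
conn.close()
```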

I haven't noticed any major difference in speed between inserting row by row into Postgres and the earlier MySQL method; both move at nearly the same pace, currently about 100 rows/second.

However, I am pretty new to handling data loads of this size, and we are definitely going to be using the Postgres DB. Does anyone know of a more efficient and faster way to load large CSV files into a Postgres DB, or of a library that speeds up this process?
