1

I'm working on a Python script which needs to use the gdal python module, and not the subprocess call method.

I've been using this, but I can't find any example or good documentation online about how to load a local file to PostgreSQL. Most examples I've seen are the other direction: about exporting from PG to local file. I've tried:

from osgeo import gdal

conn_string = r'PG:"host=localhost dbname=postgres port=5432 user=admin password=xxxx' in_file = 'polygon.shp'

gdal.VectorTranslate( out_table, # how do I set a PostgreSQL table name as the destination? How do I pass the PG connection string in? in_file, format = 'PostgreSQL', geometryType = 'POLYGON', accessMode = 'overwrite', dstSRS=27700, reproject=True, layerCreationOptions=['SCHEMA=schema_name', 'PRECISION=NO', 'OVERWRITE=YES'] )

But get errors like ERROR 1: You must specify at least a db name or a service name or ERROR 1: PostgreSQL driver doesn't currently support database creation. (I just want to create/overwrite a table). Any tips?

Theo F
  • 1,817
  • 12
  • 34
  • what is different to using psycopg2 https://gis.stackexchange.com/a/90110/276 – Mapperz Jul 06 '23 at 00:45
  • Plenty of examples at https://gdal.org/drivers/vector/pg.html Either don't specify a table and the new layer (table) will named be the same as the input layer (in your example, "polygon"), or specify it using the -nln new_layer_name option. – user2856 Jul 06 '23 at 06:21
  • @user2856 that link shows command line examples (I already know how to use gdal that way), I need pythonic examples which use the ‘osgeo’ library and ‘VectorTranslate’ function. – Theo F Jul 06 '23 at 06:57
  • Pretty much the same. Don't specify a tsble or specify a new layer name with layerName=something option e.g. VectorTranslate(etc..., layerName="something") – user2856 Jul 06 '23 at 07:20
  • e.g. opt = gdal.VectorTranslateOptions(options=[r'PG:"host=localhost, dbname=postgres, port=5432, user=admin, password=xxxx]) and then then pass opt to gdal.VectorTranslate(). This example is untested and the expected string format in the list might be slightly different. – PyMapr Jul 06 '23 at 08:32
  • @user2856 as I understand it, VectorTranslate requires 2 arguments. 1: output file/name, and 2: source dataset. So I have to specify the output name. – Theo F Jul 06 '23 at 09:06
  • 2
    The connection string is the output name. As I said - the new layer (table) will named be the same as the input layer (in your example, "polygon") unless you specify a different name via layerName="something". – user2856 Jul 06 '23 at 09:13
  • @user2856 ok thanks, that's what confused me- the connection string being passed as the output name. I've managed to get a working script, will post here along with a fix for PROJ: proj_identify errors which occur from running osgeo module which has it's own proj.db, on a system which has a core GDAL installation elsewhere. – Theo F Jul 06 '23 at 10:18

1 Answers1

4

With some steer from @user2856 and others in the comments, I've managed to form a working script, with some very important information on proj.db.

import os
from osgeo import gdal

""" if you have a separate system installation of GDAL, you must make sure you only use the proj.db of the imported osgeo python package. We can use os.environ[] to do this. Otherwise python will try run your code using a different proj.db, which can produce errors like: ERROR 1: PROJ: proj_identify: C:\Program Files\GDAL\projlib\proj.db lacks DATABASE.LAYOUT.VERSION.MAJOR / DATABASE.LAYOUT.VERSION.MINOR metadata. It comes from another PROJ installation. """ os.environ['PROJ_LIB'] = r'C:<user>\AppData\Roaming\Python\Python39\site-packages\osgeo\data\proj'

conn_string = r'dbname=database host=localhost port=5432 user=username password=xxxx'

gdal.VectorTranslate( destNameOrDestDS=f'PG:{conn_string}', srcDS=r'folder\polygon.shp', layerName='polygon_table', # destination table name accessMode='overwrite', geometryType='POLYGON', dstSRS=f'EPSG:27700', layerCreationOptions=['SCHEMA=schema_name', # the destination schema 'PRECISION=NO', # prevents field overflow 'OVERWRITE=YES', 'GEOMETRY_NAME=geom'] # name for geometry column (default is wkb_geometry) )

The proj.db issue is very important if you're running these kind of scripts on a cron machine/scheduler. If you are, make sure the PROJ_LIB environment is set to somewhere your scheduler can reach (ie. C:\Program Files instead of C:<user docs>)

Theo F
  • 1,817
  • 12
  • 34