1

I am desperately trying to vectorize raster files in a loop for which I came across GDAL's polygonize function (https://gdal.org/programs/gdal_polygonize.html).

However, I don't get the hang of it. To me, the input parameters are cryptic and I failed to figure out the exact commands required for application in Python.

** Edit**

With help of Ben's comment I got to this point:

from osgeo import gdal, ogr
import sys

this allows GDAL to throw Python Exceptions

gdal.UseExceptions()

get raster datasource

src_ds = gdal.Open(inputFilename) srcband = src_ds.GetRasterBand(1)

create output datasource

dst_layername = "POLYGONIZED_STUFF" drv = ogr.GetDriverByName("ESRI Shapefile") dst_ds = drv.CreateDataSource( outputPath+outputFile ) dst_layer = dst_ds.CreateLayer(dst_layername, srs = None )

gdal.Polygonize( srcband, None, dst_layer, -1, [], callback=None )

GDAL now creates three files:

  • filename.dbf (6,451 KB)
  • filename.shp (75,929 KB)
  • filename.shx (1 KB)

However, when I import the file with GeoPandas (test = gpd.read_file(outputPath+outputFile)), the GeoDataFrame is empty.

enter image description here

Bzw. I am working with this data set: http://www.earthstat.org/harvested-area-yield-175-crops/

Stücke
  • 529
  • 4
  • 21
  • Perhaps this example here will help: http://pcjericks.github.io/py-gdalogr-cookbook/raster_layers.html#polygonize-a-raster-band – Ben W Nov 26 '21 at 12:00
  • First, that doc is for the commandline application, not the gdal.Polygonize function. Second, you need to use GDAL to open the raster, GDAL doesn't understand rioxarray or rasterio objects. If you're using rioxarray/rasterio, try rasterio.features.dataset_features or rasterio.features.shapes https://rasterio.readthedocs.io/en/latest/api/rasterio.features.html – user2856 Nov 26 '21 at 12:02
  • Thank you both for your comments. I think I got a step further. I actually manage to create a shape file but it's still not fully working. I've adjusted my question. – Stücke Nov 26 '21 at 13:05
  • If you get a shapefile after you run your gdal.Polygonize code, then isn't your problem now an issue not with gdal.Polygonize but with opening that file. Does the shapefile open in a GIS client like QGIS? – nmtoken Nov 26 '21 at 14:04
  • I can open it but the raster band has not been transferred correctly. I guess this is a new issue. Therefore, I will close this question and open a new issue. – Stücke Nov 26 '21 at 14:12

2 Answers2

7

I downloaded the dataset you linked to and tested with the following script- it works ok for me.

In the following script, I left my own paths in for reference but you should of course change them to match your own. I also created a spatial reference system for the output file, as well as a field to write the raster values to (I guess that would suit your goal).

from osgeo import gdal, ogr, osr

in_path = 'C:\Users\Ben\Desktop\GIS Files\Zipped data\HarvestedAreaYield175Crops_Geotiff\HarvestedAreaYield175Crops_Geotiff\GeoTiff\oilpalm\oilpalm_HarvestedAreaHectares.tif'

out_path = 'C:\Users\Ben\Desktop\GIS Files\Zipped data\HarvestedAreaYield175Crops_Geotiff\HarvestedAreaYield175Crops_Geotiff\GeoTiff\oilpalm\oilpalm_HarvestedAreaHectares.shp'

get raster datasource

src_ds = gdal.Open( in_path )

srcband = src_ds.GetRasterBand(1) dst_layername = 'oilpalm_HarvestedAreaHectares' drv = ogr.GetDriverByName("ESRI Shapefile") dst_ds = drv.CreateDataSource( out_path )

sp_ref = osr.SpatialReference() sp_ref.SetFromUserInput('EPSG:4326')

dst_layer = dst_ds.CreateLayer(dst_layername, srs = sp_ref )

fld = ogr.FieldDefn("HA", ogr.OFTInteger) dst_layer.CreateField(fld) dst_field = dst_layer.GetLayerDefn().GetFieldIndex("HA")

gdal.Polygonize( srcband, None, dst_layer, dst_field, [], callback=None )

del src_ds del dst_ds

Afterwards, I loaded the created shapefile in QGIS and deleted all features with 0 value.

The loaded layer looks like this:

enter image description here

Here, you can see the polygonized layer with transparent fill & labelled with "HA" field value. The selected feature corresponds to the raster pixel with highest value (shown by the Identify tool).

enter image description here

References:

http://pcjericks.github.io/py-gdalogr-cookbook/raster_layers.html#polygonize-a-raster-band

Output field name when using gdal.Polygonize() in python

GDAL polygonize in python creating blank polygon?

Ben W
  • 21,426
  • 3
  • 15
  • 39
  • Thank you Ben! This has been a great help! Strangely my kernel crashes each time upon finishing the script but the conversion still seems to work. Thanks! – Stücke Nov 27 '21 at 08:00
  • 1
    @Stücke, no problem. I admit I just ran this in the QGIS Python console (not as a standalone Python script). I don't think it should make a difference. I don't have time right now but will test later. – Ben W Nov 27 '21 at 08:32
  • 1
    Also worked fine for me run as a Python file from command line in QGIS Python environment. – Ben W Nov 28 '21 at 06:19
1

Alternatively it is also possible to use the function gdal_polygonize from osgeo_utils.gdal_polygonize instead of polygonize from osgeo.gdal (this is also the function that's documented here), for example:

from osgeo_utils.gdal_polygonize import gdal_polygonize

input_fp = "path/to/some/raster.tif"

status_code = gdal_polygonize( src_filename=input_fp, band_number=1, dst_filename=input_fp.replace(".tif", ".geojson"), driver_name="GeoJSON" )

All possible options are:

src_filename: Optional[str] = None,
band_number: Union[int, str] = 1,
dst_filename: Optional[str] = None,
driver_name: Optional[str] = None,
dst_layername: Optional[str] = None,
dst_fieldname: Optional[str] = None,
quiet: bool = False,
mask: str = "default",
options: Optional[list] = None,
layer_creation_options: Optional[list] = None,
connectedness8: bool = False,
Bert Coerver
  • 1,945
  • 10
  • 22