33

I am seeking an open-source python solution to convert raster to polygon (no ArcPy).

I did know the GDAL function to convert raster to polygon, and here is the manual: http://pcjericks.github.io/py-gdalogr-cookbook/raster_layers.html#polygonize-a-raster-band

Nevertheless, I expect that the output can be shapely polygons or any object, temporarily in memory, not saved as a file. Is there any package or code to handle this issue?

If the raster has been processed in a numpy array, the approach is listed below.

PolyGeo
  • 65,136
  • 29
  • 109
  • 338
Vicky Liau
  • 727
  • 1
  • 9
  • 18
  • I am not sure if one mix those two packages. Are you aware of mutual incompatibilites between rasterio and gdal? > Mutual Incompatibilities Rasterio and GDAL’s bindings can contend for global GDAL objects. Unless you have deep knowledge about both packages, choose exactly one of import osgeo.gdal or import rasterio see here – Nerolf05 Mar 15 '21 at 13:53

4 Answers4

54

Use rasterio of Sean Gillies. It can be easily combined with Fiona (read and write shapefiles) and shapely of the same author.

In the script rasterio_polygonize.py the beginning is

import rasterio
from rasterio.features import shapes
mask = None
with rasterio.Env():
    with rasterio.open('a_raster') as src:
        image = src.read(1) # first band
        results = (
        {'properties': {'raster_val': v}, 'geometry': s}
        for i, (s, v) 
        in enumerate(
            shapes(image, mask=mask, transform=src.transform)))

The result is a generator of GeoJSON features

 geoms = list(results)
 # first feature
 print geoms[0]
 {'geometry': {'type': 'Polygon', 'coordinates': [[(202086.577, 90534.3504440678), (202086.577, 90498.96207), (202121.96537406777, 90498.96207), (202121.96537406777, 90534.3504440678), (202086.577, 90534.3504440678)]]}, 'properties': {'raster_val': 170.52000427246094}}

That you can transform into shapely geometries

from shapely.geometry import shape
print shape(geoms[0]['geometry'])
POLYGON ((202086.577 90534.35044406779, 202086.577 90498.96206999999, 202121.9653740678 90498.96206999999, 202121.9653740678 90534.35044406779, 202086.577 90534.35044406779))

Create geopandas Dataframe and enable easy to use functionalities of spatial join, plotting, save as geojson, ESRI shapefile etc.

geoms = list(results)
import geopandas as gp
gpd_polygonized_raster  = gp.GeoDataFrame.from_features(geoms)
Victor D.
  • 3
  • 2
gene
  • 54,868
  • 3
  • 110
  • 187
  • 1
    if the raster has been processed as a numpy array, is there any way to convert a numpy array as polygons? Thanks! – Vicky Liau Apr 03 '16 at 18:45
  • In theory, yes- – gene Apr 03 '16 at 19:17
  • 1
    The mask variable and parameter in your example seems unnecessary. I would however recommend to add if value > src.nodata to the list comprehension to make use of the source's nodata value and discard any shapes that correspond to it. Not sure what would happen if there is no nodata vaue though. :o) – bugmenot123 Apr 28 '17 at 18:58
  • 7
    meanwhile they changed rasterio.drivers to rasterio.Env and src.affine to src.transform – Leo Dec 03 '18 at 09:31
  • When I tried this I get a polygon with a crs of None. So I tried setting the crs to the original image's but then it shows it in the completely wrong place. Any ideas? – Emtomp Apr 01 '20 at 14:50
8

Here is my implementation.

from osgeo import ogr, gdal, osr
from osgeo.gdalnumeric import *  
from osgeo.gdalconst import * 
import fiona
from shapely.geometry import shape
import rasterio.features

#segimg=glob.glob('Poly.tif')[0]
#src_ds = gdal.Open(segimg, GA_ReadOnly )
#srcband=src_ds.GetRasterBand(1)
#myarray=srcband.ReadAsArray() 
#these lines use gdal to import an image. 'myarray' can be any numpy array

mypoly=[]
for vec in rasterio.features.shapes(myarray):
    mypoly.append(shape(vec))

The way to install rasterio is by 'conda install -c https://conda.anaconda.org/ioos rasterio', if there is an installation issue.

PolyGeo
  • 65,136
  • 29
  • 109
  • 338
Vicky Liau
  • 727
  • 1
  • 9
  • 18
  • 1
    The result of rasterio is directly a numpy array, therefore you don't need myarray=srcband.ReadAsArray() #or any array – gene Apr 04 '16 at 06:16
  • @gene I revised the note. This line (myarray=srcband.ReadAsArray()) uses gdal to import the image. – Vicky Liau Apr 04 '16 at 18:21
  • import the image as a numpy array and rasterio import directly the image as a numpy array – gene Apr 04 '16 at 18:51
  • 2
    This worked for me although I had to index vec because it was being returned as a tuple. shape(vec[0]) – narmaps Oct 09 '19 at 17:00
3
# PYTHON 3

import rasterio.features
from shapely.geometry import shape

# Mask is a numpy array binary mask loaded however needed
maskShape = rasterio.features.shapes(mask.astype('uint8'))
mypoly=[]
for vec in maskShape:
    mypoly.append(vec[0])
print(mypoly)

This is what worked for me.

Vince
  • 20,017
  • 15
  • 45
  • 64
Zexelon
  • 131
  • 3
  • 1
    Is it also possible to change the input data from an numpy array to an GeoTIF (dataset = rasterio.open('raster_data.tif'). How would you change the parameters from rasterio.features.shapes()? – soph Apr 24 '22 at 14:32
  • You can always just get the numpy array directly from the GeoTIF: n_array = yourgeotif.read(1), although you would miss the geospatial info which you may have to provide by yourself transform = yourgeotif.transform – JEquihua Apr 25 '22 at 10:43
  • I am trying soph's task too, and my already uint8 binary raster is producing a maskShape that seems not to be iterable in the line for vec in maskShape: Error at that line is 'DatasetReader' object has no attribute 'dtype.' type(maskShape) is generator. – J Kelly Apr 25 '22 at 18:26
  • the resulting vectors in mypoly are in local coordinates (row, col); to transform them to geographic coordinates, you can use the transform of the raster and the function rasterio.transform.xy() – Jan Pisl Mar 15 '23 at 13:41
0

Incase of pixels with high variability, convert the arrays to integer type. Helps to save on computation time.

with rio.Env():
with rio.open(input_file_path) as depth_image:
    depth_array=depth_image.read(1).astype(int)
CalebJ
  • 21
  • 5