1

I would like to polygonize every single cell of a raster dataset using rasterio. I have been experimenting with the following function based on this question:

def polygonize_raster(dataset):

    # Read the dataset's valid data mask as a ndarray. Dataset is a rasterio read object open for reading
    mask = dataset.dataset_mask()

    array = dataset.read(1)
    generator = rasterio.features.shapes(source=array, mask=mask, transform=dataset.transform)
    # Extract feature shapes and values from the array
    geom_list = []
    for geom, value in generator:
        # Print GeoJSON shapes to stdout
        geom = shapely.geometry.shape(geom)
        geom_list.append(geom)

    return geom_list

The part which vectorizes everything is:

generator = rasterio.features.shapes(source=array, mask=mask, transform=dataset.transform)

It seems from the docs that rasterio.features.shapes has built-into it the fact that regions with identical values get their own polygon. I don't want this. I want every single cell to get its own polygon even if an adjacent cell has the same value.

One solution I thought of is to give each cell a unique value, but this would require writing another function which could be computationally expensive. I would much rather if a more built-in method within rasterio or shapely could do this?

lambertj
  • 3,037
  • 3
  • 18
  • 37
user32882
  • 3,514
  • 2
  • 31
  • 67
  • 1
    Just a thought: Maybe it would be easier to build a list of shapely Polygons from scratch via the raster pixel size? E.g. using rasterio.transform.xy() for the coordinates, then creating a Polygon from shapely.geometry.Point(x,y).buffer(pixelsize_in_projection).bounds? Not sure how efficient this would be though. – Christoph Rieke Oct 04 '18 at 12:29
  • You can create an array with unique values in a single line: https://stackoverflow.com/a/25369467/8195528 Would not be computationally expensive. Can also use numpy's arange and reshape to desired array shape. – Jon Oct 04 '18 at 17:53

1 Answers1

0

I haven't used Python in a while, so this is definitely rough around the edges. For smallish data, it should work though. Thanks @Christoph Rieke for the transform.xy() suggestion; I wanted to avoid the buffer call though, since AFAIK that's a bit more costly.

def rio_image_to_gdf(image, transform, width, height):
    rows = np.arange(0, height)
    cols = np.arange(0, width)
lons_ul, _ = rio.transform.xy(transform, rows=0, cols=cols, offset="ul")
_, lats_ul = rio.transform.xy(transform, rows=rows, cols=0, offset="ul")
lons_lr, _ = rio.transform.xy(transform, rows=0, cols=cols, offset="lr")
_, lats_lr = rio.transform.xy(transform, rows=rows, cols=0, offset="lr")

polygons = [
    poly
    for lonul, lonlr in zip(lons_ul, lons_lr)
    for poly in [
        Polygon(
            [
                [lonul, latul],
                [lonlr, latul],
                [lonlr, latlr],
                [lonul, latlr],
                [lonul, latul],
            ]
        )
        for latul, latlr in zip(lats_ul, lats_lr)
    ]
]
polygons = gpd.GeoSeries(polygons, crs=4326)

vals = pd.Series(image.reshape(polygons.size, order="F"))
return gpd.GeoDataFrame({"value": vals}, geometry=polygons)

ps: Well aware that the nested for loop logic in the list comprehension is atrocious, but I don't have more time.

jhat
  • 21
  • 1