0

I read in a raster and turned it into a binary image (land/water mask) of 0's (water) and 255's (land). I've named this array src_landCoverArray.

I want to output this array as a shapefile with water as polygons and land as empty space.

I've tried using rasterio.features.shapes() described here (https://stackoverflow.com/questions/37898113/python-boolean-array-to-polygon) but it produced a shapefile with the wrong extent. My code:

import shapely.geometry
myShapes = rasterio.features.shapes(src_landCoverArray)
polygons = [shapely.geometry.Polygon(shape[0]["coordinates"][0]) for shape in myShapes if shape[1] == 0]
crs = {'init': 'epsg:4326'}
my_gdf = gpd.GeoDataFrame(crs=crs, geometry=polygons)
my_gdf.to_file("Water_Poly2.shp", driver='ESRI Shapefile')

I've tried using a similar method with rasterio described here (How to polygonize raster to shapely polygons) but failed again with the location/extent being incorrect. My code:

mypoly = []

for vec in rasterio.features.shapes(src_landCoverArray): mypoly.append(shape(vec[0])) crs = {'init': 'epsg:4326'} my_gdf = gpd.GeoDataFrame(crs=crs, geometry=mypoly) my_gdf.to_file("Water_Poly_mypoly.shp", driver='ESRI Shapefile')

I feel that I'm close to success if I could figure out the issue of extent/location being wrong. Has anyone tackled a problem similar to this?

Kadir Şahbaz
  • 76,800
  • 56
  • 247
  • 389
  • can you say more about what goes wrong in each case and what you expected it to produce instead – Ian Turton Dec 03 '20 at 08:30
  • @IanTurton Yes. Ideally I'd like to produce a Polygon shapefile that has water (0's) as polygons and land (255's) as empty space. The first attempt I listed produces a .shp file, but when I load it in QGIS it's nowhere near where it should be; the CRS is correct but the extent is wrong. The same thing happens with my second attempt. Clearly I'm not defining the extent/pixel size but I'm lost on how to incorporate that into this script. – user123456789 Dec 03 '20 at 14:47

1 Answers1

1

I think I have experienced this issue before. The key detail is that rasterio.features.shapes needs to know about the "transform" of the raster dataset. (The "transform" is how it maps the pixels in the array to the actual coordinates.)

The dataset has a .transform property that you can pass to rasterio.features.shapes, as in this example from the rasterio documentation (where src is the dataset):

shapes = features.shapes(blue, mask=mask, transform=src.transform)

I don't think I see the name of your dataset variable in your post, so I can't suggest the exact code you need.

Max
  • 11
  • 2
  • I've seen that .transform function before, that makes perfect logical sense. I tried what you suggested, src_Transform = src_rasterio.transform src_Shapes = rasterio.features.shapes(src_landCoverArray, transform=src_Transform) and got a shapely generator object. But when I go to turn that into a geodataframe and then save it as a shapefile it failed with "TypeError: unhashable type: 'dict'". Maybe I don't need to go that extra step of using a geodataframe? – user123456789 Dec 03 '20 at 19:56
  • Does the error occur when you create the GeoDataFrame, or afterwards when you try to save it as a shapefile? And are you creating the GeoDataFrame in the same way as in your original post (the code snippet starting with mypoly = [])? – Max Dec 04 '20 at 16:19
  • So I actually was able to output a shapefile that is in the correct location of the globe, has the correct CRS, and has the correct extent; but it's slightly offset from "correct". I believe this is due to me assigning an incorrect origin to the shapefile. My question now is: can I edit a rasterio.Affine? Printing out_trans gives me 6 values: the resolution in the x and y, the xy origin, and two 0's. Can I edit those values individually? – user123456789 Dec 07 '20 at 20:46