2

Fairly new to geoprocessing in general and within Python.

I am using Python geoprocessing packages to extract meteorological information from a raster file within a watershed shapefile (the blue outline in the below image).

ArcGIS Extract of watershed area (blue), Elevation Band Area (brown), and raster

I followed along with the example here to determine the average raster value within the watershed successfully.

Now I want to adapt that script to extract the average raster value within the elevation band area (EBA) (the brown outlined area within the image). The format of the EBA is a multipolygon. In adapting the same script, I'm only getting a few points within the EBA. I think this might be because the resolution of the raster is too low, and so only a few of the raster squares (or their centroids?) actually lie within the EBA. Maybe the rest are getting cut out because the raster resolution is so much larger than the multipolygons.

I tried adding some lines of code to re-sample the raster at up to 100x greater resolution (which was extremely slow). This unfortunately did not improve the result.

My script is shown below.

import numpy as np
import rasterio
from rasterio import Affine # or from affine import Affine
from rasterio.mask import mask
from rasterio.enums import Resampling
import geopandas as gpd
import shapely; shapely.speedups.disable()
from shapely.geometry import Point
import os, os.path

rasterpath='C:/.../' ebapath='C:/.../'

shapefile = gpd.read_file(str(ebapath+"EBA_ID_2.shp"), crs="EPSG:3005")

#Transform to WGS84 to match coordinate system of raster data shapefile=shapefile.to_crs(epsg=4326)

extract the geometry in GeoJSON format

geoms = shapefile.geometry.values # list of shapely geometries geometry = geoms[0] # shapely geometry

transform to GeJSON format

from shapely.geometry import mapping geoms = [mapping(geoms[0])]

upscale_factor=100 filename='metfile.asc'

extract the raster values values within the polygon

Coordinate system for the raster data is WGS 84 is EPSG:4326

with rasterio.open(str(rasterpath+filename), crs="EPSG:4326") as src: data = src.read( out_shape=( src.count, int(src.height * upscale_factor), int(src.width * upscale_factor) ), resampling=Resampling.bilinear)

scale image transform

transform = src.transform * src.transform.scale(
    (src.width / data.shape[-1]),
    (src.height / data.shape[-2])
)

out_image, out_transform = mask(src, geoms, crop=True)

no data values of the original raster

no_data=src.nodata

extract the values of the masked array

data = np.array(out_image.data)[0,:,:]

extract the row, columns of the valid values

row, col = np.where(data != no_data) ppt = np.extract(data != no_data, data)

T1 = out_transform * Affine.translation(0.5, 0.5) #0.5, 0.5 # reference the pixel centre rc2xy = lambda r, c: (c, r) * T1

d = gpd.GeoDataFrame({'col':col,'row':row,'ppt':ppt})

coordinate transformation

d['x'] = d.apply(lambda row: rc2xy(row.row,row.col)[0], axis=1) d['y'] = d.apply(lambda row: rc2xy(row.row,row.col)[1], axis=1)

geometry

d['geometry'] =d.apply(lambda row: Point(row['x'], row['y']), axis=1)

d.plot(column='ppt')

d.plot should return a plot with several points located within the EBA area. Instead, I get the following:

enter image description here

Any suggestions??? I will be doing this for 40 elevation band areas so need to automate in Python...

Vince
  • 20,017
  • 15
  • 45
  • 64
Kingle
  • 141
  • 5

1 Answers1

2

The solution was to resample the raster to a higher resolution so that the centroids of the raster grids were located within the polygon. My initial code wasn't performing the resampling properly for some reason.

To solve the problem, I replaced this script:

with rasterio.open(str(rasterpath+filename), crs="EPSG:4326") as src:
    data = src.read(
        out_shape=(
            src.count,
            int(src.height * upscale_factor),
            int(src.width * upscale_factor)
        ),
        resampling=Resampling.bilinear)

scale image transform

transform = src.transform * src.transform.scale(
    (src.width / data.shape[-1]),
    (src.height / data.shape[-2])
)

out_image, out_transform = mask(src, geoms, crop=True)

With this:

import rasterio
from rasterio import Affine, MemoryFile
from rasterio.enums import Resampling

use context manager so DatasetReader and MemoryFile get cleaned up automatically

@contextmanager def resample_raster(raster, scale=10): t = raster.transform

# rescale the metadata
transform = Affine(t.a / scale, t.b, t.c, t.d, t.e / scale, t.f)
height = raster.height * scale
width = raster.width * scale

profile = raster.profile
profile.update(transform=transform, driver='GTiff', height=height, width=width)

data = raster.read( # Note changed order of indexes, arrays are band, row, col order not row, col, band
        out_shape=(raster.count, height, width),
        #resampling=Resampling.bilinear,
    )

with MemoryFile() as memfile:
    with memfile.open(**profile) as dataset: # Open as DatasetWriter
        dataset.write(data)
        del data

    with memfile.open() as dataset:  # Reopen as DatasetReader
        yield dataset  # Note yield not return     


with rasterio.open(str(rasterpath+filename), crs="EPSG:4326") as src: with resample_raster(src) as resampled: print('Orig dims: {}, New dims: {}'.format(src.shape, resampled.shape)) print(repr(resampled))

    out_image, out_transform = mask(resampled, geoms3, crop=True)

Plotting the elevation band area and data together yields this:

enter image description here

I also experimented with all_touched=True, a parameter of rasterio mask function. But decided resampling the raster would reduce any introduced bias.

Kingle
  • 141
  • 5
  • Could you please tell me what did you assign for geoms3 , i would guess this is now multipolygon? This you also changed i assume. – Lusia Dec 14 '22 at 11:09