17

I'm a beginner with python. I would like to extract raster (.tif) values from a set of specific points that I have in a shapefile. I would like to have the same result as the QGIS Point Sampling Tool plugin. I can't install on Python libraries like qgis.core arcpy pygrass, but I have available GeoPandas GDAL Fiona Shapely rasterio and others.

How can I do this?

ndawson
  • 27,620
  • 3
  • 61
  • 85
kino
  • 179
  • 1
  • 1
  • 3

6 Answers6

33

I use rasterio and geopandas. My example uses UTM coordinates. Obviously these fields will depend on your particular shapefile. In my experience this produces indentical results to the QGIS Point Sampling Tool. I like this method because the resulting DataFrame of point and corresponding raster values is easy to analyze (e.g. compute the difference between the point and raster values) and then plot or export to some other tabular format (e.g. CSV).

import rasterio
import geopandas as gpd

# Read points from shapefile
pts = gpd.read_file('your_point_shapefile.shp')
pts = pts[['UTM_E', 'UTM_N', 'Value', 'geometry']]
pts.index = range(len(pts))
coords = [(x,y) for x, y in zip(pts.UTM_E, pts.UTM_N)]

# Open the raster and store metadata
src = rasterio.open('your_raster.tif')

# Sample the raster at every point location and store values in DataFrame
pts['Raster Value'] = [x for x in src.sample(coords)]
pts['Raster Value'] = probes.apply(lambda x: x['Raster Value'][0], axis=1)
Charlie Parr
  • 1,946
  • 15
  • 25
  • 12
    You can avoid the last line if you do pts['Raster Value'] = [x[0] for x in src.sample(coords)] in the line above. – martinfleis Dec 16 '19 at 11:20
  • 2
    If you don't have specific coordinates stored in your shapefile's fields, or just want more generic code, you can instead use: zip(pts.geometry.x, pts.geometry.y) (apologies for the lack of formatting, none of the instructions on formatting actually work when I use them) – Anders Mar 06 '20 at 15:28
  • 3
    thats awesome!! But what exactly is this probes? – Lenn Oct 17 '20 at 10:09
  • @Lenn Probes is the variable name for the point locations where I same the raster. In my case they were snow depth probes. – Charlie Parr Oct 19 '20 at 17:46
  • What interpolation is used? Or is it a nearest neighbor sampling? – user3496060 Mar 09 '22 at 01:37
  • Yes this is just reading the array value corresponding to the passed indices - no interpolation. see the docs https://github.com/rasterio/rasterio/blob/master/rasterio/sample.py – Charlie Parr Mar 09 '22 at 16:34
  • shapely can also be used instead of geopandas which can be difficult to install. If your shape has atributes x and y you could do: shp_file = shapefile.Reader(shp_filename) and then for i in range(len(shp_file)): record = shp_file.record(i) x[i] = record.x y[i] = record.y – Javgs Jun 01 '22 at 01:58
1

You can use the "Sample raster values" tool from the Processing toolbox. Like any of the Processing tools, if you run the tool from the GUI, and then look at the Processing history window you can then click the corresponding entry in the log, you can get an equivalent PyQGIS command which does the same operation:

enter image description here

ndawson
  • 27,620
  • 3
  • 61
  • 85
  • 1
    Like I said before I have problem to install Qgis's library, so I can't use this type of algorithm – kino Apr 02 '19 at 07:55
1

I am sure that it can be done in a more pythonic way, but I use a simple python script, which calls gdallocationinfo process and exports results to CSV (you must have GDAL installed). If you want to try, just copy the script below, fill in your sites list and path to your raster file.

import os
from subprocess import Popen,PIPE
import csv

sites= [ ['Point1',-15.39495,28.33711], ['Point2',-15.548307,28.248216] ] ## 'Name',Lat,Lon

rast = '/path/to/raster/my_raster.tif' param = 'Raster_VALUE' csvoutfl = 'sites.csv'

##==

scr = open(csvoutfl, 'w') header = 'site,lat,lon,{}'.format(",".join(param))

for i in sites: csvline = '{},{},{}'.format(i[0],i[1],i[2]) result = os.popen("gdallocationinfo -wgs84 -valonly {0} {1} {2}".format(rast, i[2], i[1])).read() try: result = float(result) except ValueError: result = 'Err' csvline += ',{}'.format(result) scr.write('{}\n'.format(csvline)) scr.close() print "\n\nCREATED: {}\n\n=== Finished ===".format(csvoutfl)

jurajb
  • 1,204
  • 11
  • 17
1

I found that the geopandas solution was quite slow as I needed to sample a lot of rasters. This numpy solution may be useful to others as it is much quicker when dealing with large numbers of points. It filters out points outside the raster extent and returns the coordinates within the extent and associated values.

import numpy as np
import rasterio

def sampleRaster(r,pts,band=1): """ Providing a rasterio raster and numpy 2-D array of coordinates, returns the values at that point. Assumes CRS of raster and coordinates are the same. Args: r: Rasterio raster pts: Numpy array of dimensions (n,2) band: The raster band number to be evaluated, defaults to 1 Example: r = rasterio.open("raster.tif") pts = np.genfromtxt("points.csv",delimiter=",",skip_header=1,usecols=(1,2)) rasterVals = sampleRaster(r,pts) """ ras = r.read(band) inPts = pts[np.where((pts[:, 0] >= r.bounds[0]) & (pts[:, 0] < r.bounds[2]) & (pts[:, 1] >= r.bounds[1]) & (pts[:, 1] < r.bounds[3]))] originX = r.bounds[0] originY = r.bounds[3] cellSize = r.transform[0] col = ((inPts[:, 0] - originX) / cellSize).astype(int) row = ((originY - inPts[:, 1]) / cellSize).astype(int) res = ras[row,col] return(inPts,res)

Liam G
  • 2,156
  • 11
  • 11
0

A bit late to the party, but based on the accepted answer I wrote a simple function for extracting raster values at point locations. Contrary to the accepted answer, it automatically closes the raster file by using a context manager (with ..):

def raster_values_at_points(raster_path: Path, points_gdf: gpd.GeoDataFrame, column_name: str) -> gpd.GeoDataFrame:
new_gdf = points_gdf.copy()  # do not change original gdf
coords = points_gdf.get_coordinates().values

with rasterio.open(raster_path) as src:
    new_gdf[column_name] = [x[0] for x in src.sample(coords)]

return new_gdf


raster_path = Path(r'your_raster_path.tif') points_gdf = gpd.read_file('your_point_shapefile.shp') raster_values_gdf = raster_values_at_points(raster_path, points_gdf, "rastervalue")

JohanB
  • 21
  • 3
0

I tried all above code but it not worked for me. I alway get the error such as: "AttributeError: 'numpy.ndarray' object has no attribute 'index'" or "AttributeError: 'numpy.ndarray' object has no attribute 'sample'" maybe I don't install enough package. and i find a code work for me, and i share it here.

    import rasterio
    import geopandas as gpd
# Load raster dataset
raster_path = 'path/to/your/raster.tif'
raster = rasterio.open(raster_path)

Load point shapefile

points_path = 'path/to/your/points.shp' points_gdf = gpd.read_file(points_path) def extract_raster_values_at_points(raster, points_gdf): values = []

for index, point in points_gdf.iterrows():
    x, y = point.geometry.x, point.geometry.y
    row, col = raster.index(x, y)  # Get the row and column index in the raster
    value = raster.read(1, window=((row, row+1), (col, col+1)))  # Read the value at the given location
    values.append(value[0, 0])  # Append the value to the list

return values

raster_values = extract_raster_values_at_points(raster, points_gdf)

Add the extracted values to the GeoDataFrame

points_gdf['raster_values'] = raster_values

Print the GeoDataFrame with the extracted values

print(points_gdf)

output_path = 'path/to/output/points_with_values.shp' points_gdf.to_file(output_path)

MrXsquared
  • 34,292
  • 21
  • 67
  • 117