Removing polygons from GeoDataFrame which do not intersect with any polygons of other GeoDataFrame using Python

Question

I am looking for an efficient way to remove any polygons from a GeoDataFrame which do not intersect with any polygons of another GeoDataFrame. I tried it the following way which seems really inefficient.

import geopandas as gpd
gdf1 = gdp.read_file('./my_geodataframe1.gpkg')
gdf2 = gdp.read_file('./my_geodataframe2.gpkg')
gdf2_dissolved = gdf2.dissolve()
indices_to_remove = []
for index, row in gdf1.iterrows():
    polygon = row['geometry']
    if not gdf2_dissolved.intersects(polygon)[0]:
        indices_to_remove.append(index)

Is there any quicker way to do so?

You can try this one liner, but I think internally it's just a for loop, so may not be any quicker.
gdf1 = gdf1[gdf1.geometry.intersects(gdf2_dissolved)] — Shawn, Nov 03 '23 at 19:54

BERA · Accepted Answer · 2023-11-04T14:40:37.143

Spatial join is very fast. Try this:

import geopandas as gpd
import os
folder = r"/home/bera/Drives/data1/data/LMV/Topografi100/Topografi_100_Sverige_230721"
df1 = gpd.read_file(os.path.join(folder, "mark_sverige.gpkg"), layer="markframkomlighet")
df2 = gpd.read_file(os.path.join(folder, "naturvard_sverige.gpkg"), layer="skyddadnatur")
print(df1.shape)
#(3320, 5)
#Create an id column to filter by after the spatial join
df1["tempid"] = range(df1.shape[0])
#Spatial join with predicate "inner" to only keep features intersecting
matches = gpd.sjoin(left_df=df1, right_df=df2, how="inner").tempid #A Series of tempids in df1 intersecting df2
#Select, and drop tempid column
df1 = df1.loc[df1.tempid.isin(matches)].drop(columns="tempid")
print(df1.shape)
#(1623, 5)

gene · Answer 2 · 2023-11-04T14:14:26.973

There are many solutions, see Intersecting two shape problem using geopandas for example

Generally, iterrows should only be used in specific cases (How to iterate over rows in a DataFrame in Pandas)

In addition to Bera's solution, you can use the apply function and the simple unary_union property

gdf1 = gpd.read_file('stac1.shp')
gdf2 = gpd.read_file('stac2.shp')
uu = gdf1.unary_union
gdf2['inter'] = gdf2['geometry'].apply(lambda row: row.intersects(uu))
# polygons from gdf2 which do not intersect with any polygons of gdf1
print(gdf2.loc[gdf2.inter == 0]
   id  ori              geometry                                inter
2   3  gdf2  POLYGON ((-1.095 0.843, -0.376 0.760, -1.022 0...  False
3   2  gdf2  POLYGON ((-0.417 -0.616, 0.173 -0.912, -0.398 ...  False

Another solution is to use a simple for loop (without iterating over the GeoDataFrame):

 indices_to_remove = []
 for i,geom in enumerate(list(gdf2.geometry)):
      if not geom.intersects(uu):
           print(i)
           indices_to_remove.append(i)
 2
 3
 print(gdf2.iloc[indices_to_remove])
    id   ori           geometry
 2   3  gdf2  POLYGON ((-1.095 0.843, -0.376 0.760, -1.022 0...  
 3   2  gdf2  POLYGON ((-0.417 -0.616, 0.173 -0.912, -0.398 ...

Removing polygons from GeoDataFrame which do not intersect with any polygons of other GeoDataFrame using Python

2 Answers2