
I have a pandas dataframe that contains information to construct (poly)lines, and I want to use shapely & geopandas tools to make a SHP.

In the example below, I have 3 lines differentiated by "myid" and the order of the vertices is in "myorder."

"Making shapefile from Pandas dataframe?" is a great explanation for making a point shapefile, but I am looking for a polyline SHP. "Creating Shapely LineString from two Points" lets me know I need to use from shapely.geometry import LineString to make the polylines, but I don't understand from the answer there (or from the shapely documentation) how to group the points by "myid" and sort them by "myorder".

How would I do this?

Using Windows 10, Python 3.7.6, Conda 4.6.14.

import pandas as pd

myid = [1, 1, 1, 2, 2, 3, 3]
myorder = [1, 2, 3, 1, 2, 1, 2]
lat = [36.42, 36.4, 36.4, 36.49, 36.48, 36.39, 36.39]
long = [-118.11, -118.12, -118.11, -118.09, -118.09, -118.10, -118.11]
df = pd.DataFrame(list(zip(myid, myorder, lat, long)), columns=['myid', 'myorder', 'lat', 'long'])
display(df)


a11

1 Answer


You can do this with geopandas by building a GeoDataFrame of points, then sorting by "myorder", grouping by "myid", and applying a lambda that builds a LineString from each group's points.

import pandas as pd
import geopandas as gpd
from shapely.geometry import LineString

myid = [1, 1, 1, 2, 2, 3, 3]
myorder = [1, 2, 3, 1, 2, 1, 2]
lat = [36.42, 36.4, 36.4, 36.49, 36.48, 36.39, 36.39]
long = [-118.11, -118.12, -118.11, -118.09, -118.09, -118.10, -118.11]
df = pd.DataFrame(list(zip(myid, myorder, lat, long)), columns=['myid', 'myorder', 'lat', 'long'])

Convert to GeoDataFrame

gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df['long'], df['lat']))

display(gdf)

Sort and group points to make lines

line_gdf = gdf.sort_values(by=['myorder']).groupby(['myid'])['geometry'].apply(lambda x: LineString(x.tolist()))
line_gdf = gpd.GeoDataFrame(line_gdf, geometry='geometry')

display(line_gdf)

Write out

line_gdf.to_file("lines.shp")
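If it helps to see what the sort-and-group step is doing, here is a minimal plain-Python sketch of the same logic, without pandas or geopandas. The helper name build_line_coords is hypothetical and only for illustration; it is not part of any library. It collects the vertices for each "myid", orders them by "myorder", and returns the (x, y) coordinate list for each line, with longitude as x and latitude as y to match points_from_xy above.

```python
from collections import defaultdict

# Same sample data as in the question.
myid = [1, 1, 1, 2, 2, 3, 3]
myorder = [1, 2, 3, 1, 2, 1, 2]
lat = [36.42, 36.4, 36.4, 36.49, 36.48, 36.39, 36.39]
long = [-118.11, -118.12, -118.11, -118.09, -118.09, -118.10, -118.11]


def build_line_coords(ids, orders, xs, ys):
    """Hypothetical helper: return {line id: [(x, y), ...]} in vertex order."""
    grouped = defaultdict(list)
    for line_id, order, x, y in zip(ids, orders, xs, ys):
        grouped[line_id].append((order, x, y))
    # Sort each group's vertices by the order value, then keep only (x, y).
    return {
        line_id: [(x, y) for _, x, y in sorted(vertices, key=lambda v: v[0])]
        for line_id, vertices in grouped.items()
    }


# Longitude is x and latitude is y.
line_coords = build_line_coords(myid, myorder, long, lat)
for line_id, coords in line_coords.items():
    print(line_id, coords)
```

Each coordinate list here is the kind of sequence that shapely's LineString accepts, which is what the lambda in the groupby step builds for each "myid".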


user2856
  • This is perfect, thank you. The only thing I had to add was line_gdf.crs = "EPSG:4326" before exporting to SHP – a11 Jun 26 '20 at 20:11