
I am trying to understand how to use Python's csv module to open a CSV file in the same folder as the Python script, and then create a shapefile from it using the pyshp shapefile module.

The csv file looks like this, but can have a couple of thousand rows of records:

id_nr;date;target;start_lat;start_lon
1;2012-05-21;navpoint 25x;55.123654;13.456954
1;2012-05-23;navpoint 11f;55.143654;12.456954
PolyGeo
kogia

3 Answers


The pyshp module is a bit tricky to get the hang of, but really useful once you get it going. I've written a script that reads in a CSV of the example data and writes out a shapefile with the data stored as attributes of the correct datatypes. The pyshp/xbase datatyping was always tricky for me until I found this user guide for the xbase format. As a result of this question I have written a small note on my blog about the relevant pyshp datatypes, part of which I have pasted below:

  • C is ASCII characters
  • N is a double precision integer limited to around 18 characters in length
  • D is for dates in the YYYYMMDD format, with no spaces or hyphens between the sections.
  • F is for floating point numbers with the same length limits as N
  • L is for logical data, which is stored in the shapefile's attribute table as a short integer: 1 (true) or 0 (false). The values it can receive are 1, 0, y, n, Y, N, T, F or the Python builtins True and False
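
To make the datatype notes above concrete, here is a minimal sketch that converts one row of the question's CSV into values suited to each field type. It uses only the standard library; the helper names `to_dbf_date` and `to_dbf_logical` and the dictionary keys are hypothetical and not part of pyshp.

```python
from datetime import date


def to_dbf_date(text):
    """Convert an ISO 'YYYY-MM-DD' string to the 'YYYYMMDD' form used by D fields."""
    return date.fromisoformat(text).strftime("%Y%m%d")


def to_dbf_logical(value):
    """Map the spellings listed for L fields to a Python bool (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    text = str(value).strip()
    if text in ("1", "y", "Y", "T"):
        return True
    if text in ("0", "n", "N", "F"):
        return False
    raise ValueError("not a logical value: %r" % (value,))


# One data row from the question, already split on ';'
row = ["1", "2012-05-21", "navpoint 25x", "55.123654", "13.456954"]

converted = {
    "ID": int(row[0]),            # N: integer
    "Date": to_dbf_date(row[1]),  # D: YYYYMMDD, no hyphens
    "Target": row[2],             # C: text
    "Lat": float(row[3]),         # F: floating point
    "Lon": float(row[4]),         # F: floating point
}
print(converted)
```

Converting up front like this keeps the datatype handling in one place, so the loop that writes records only has to pass values through.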

The full listing is as follows:

import csv
import shapefile as shp  # pyshp

out_file = 'GPS_Pts.shp'

# Set up blank lists for data
x, y, id_no, date, target = [], [], [], [], []

# Read data from the csv file and store it in lists
# (Python 3: open in text mode with newline='' for the csv module)
with open('input.csv', 'rt', newline='') as csvfile:
    r = csv.reader(csvfile, delimiter=';')
    for i, row in enumerate(r):
        if i > 0:  # skip header
            x.append(float(row[4]))  # start_lon is the x coordinate
            y.append(float(row[3]))  # start_lat is the y coordinate
            id_no.append(row[0])
            date.append(''.join(row[1].split('-')))  # YYYY-MM-DD -> YYYYMMDD
            target.append(row[2])

# Set up shapefile writer and create empty fields
# (pyshp 2.x takes the output path here; 1.x used shp.Writer(shp.POINT))
w = shp.Writer(out_file, shapeType=shp.POINT)
w.autoBalance = 1  # ensures geometry and attributes match
w.field('X', 'F', 10, 8)
w.field('Y', 'F', 10, 8)
w.field('Date', 'D')
w.field('Target', 'C', 50)
w.field('ID', 'N')

# Loop through the data and write the shapefile
for j, k in enumerate(x):
    w.point(k, y[j])  # write the geometry
    w.record(k, y[j], date[j], target[j], id_no[j])  # write the attributes

# Save shapefile (pyshp 2.x; 1.x used w.save(out_file))
w.close()

I hope this helps.

sgrieve
  • Very nice script. I got an error as it didn't read it as text so I changed this line: with open('input.csv', 'rt') as csvfile: – againstflow Aug 29 '14 at 12:45
  • I think you can improve the performance by using next(r) before the for loop to skip the header instead of checking using an if statement. – rovyko Nov 15 '17 at 18:18
  • @sgrieve - this script converts a csv with specific pre-determined fields. I'd like a generic script to convert any csv into a feature class. Perhaps there are useful arcpy functions to achieve this? – Waterman May 03 '18 at 00:23
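
The two suggestions in the comments above can be combined in a rough sketch: consume the header once with `next()`, then guess a field type code per column so the script is not tied to fixed column names. This is standard library only and does not use arcpy or pyshp. The `guess_type` function and its rules (ISO dates become D, integers N, other numbers F, everything else C) are assumptions for illustration.

```python
import csv
import io
import re

# Sample data from the question, inlined so the sketch runs on its own
SAMPLE = (
    "id_nr;date;target;start_lat;start_lon\n"
    "1;2012-05-21;navpoint 25x;55.123654;13.456954\n"
    "1;2012-05-23;navpoint 11f;55.143654;12.456954\n"
)

ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def guess_type(values):
    """Return a best-guess type code (D, N, F or C) for one column of strings."""
    if all(ISO_DATE.match(v) for v in values):
        return "D"
    try:
        [int(v) for v in values]
        return "N"
    except ValueError:
        pass
    try:
        [float(v) for v in values]
        return "F"
    except ValueError:
        return "C"


reader = csv.reader(io.StringIO(SAMPLE), delimiter=";")
header = next(reader)       # consume the header row once, before the loop
rows = list(reader)
columns = list(zip(*rows))  # transpose rows into columns
field_types = {name: guess_type(col) for name, col in zip(header, columns)}
print(field_types)
```

The resulting mapping could then drive the field definitions in a loop, though deciding which columns hold the coordinates would still need to be specified by the user.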

As an alternative, you can write each feature as you read its row, so you do not need to hold the data in lists.

# import libraries
import csv
import shapefile  # pyshp

# create a point shapefile (pyshp 2.x takes the output path here)
output_shp = shapefile.Writer('output.shp', shapeType=shapefile.POINT)
# for every record there must be a corresponding geometry.
output_shp.autoBalance = 1
# create the field names and data type for each.
# you can insert or omit lat-long here
output_shp.field('Date', 'D')
output_shp.field('Target', 'C', 50)
output_shp.field('ID', 'N')
# count the features
counter = 1
# access the CSV file (semicolon-delimited, as in the question)
with open('input.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter=';')
    # skip the header
    next(reader, None)
    # loop through each of the rows and assign the attributes to variables
    for row in reader:
        id_nr = row[0]
        date = row[1].replace('-', '')  # 'D' fields expect YYYYMMDD
        target = row[2]
        latitude = row[3]
        longitude = row[4]
        # create the point geometry
        output_shp.point(float(longitude), float(latitude))
        # add attribute data in the same order as the fields
        output_shp.record(date, target, id_nr)
        print("Feature " + str(counter) + " added to Shapefile.")
        counter = counter + 1
# save the Shapefile (pyshp 2.x; 1.x used output_shp.save("output.shp"))
output_shp.close()

You can find a working example of this implementation here.

Clubdebambos

I did not have success with any of the solutions here, but I was able to come up with a solution that worked using Python's shapely and fiona modules. It uses a tab-delimited .ascii file (my preference as opposed to .csv) but can easily be adapted to use a .csv as in the question posed. Hopefully this is helpful to someone else trying to automate this same task.

# ------------------------------------------------------
# IMPORTS
# ------------------------------------------------------

import os
import pandas as pd
from shapely.geometry import Point, mapping
from fiona import collection

# ------------------------------------------------------
# INPUTS
# ------------------------------------------------------

# Define path
path = os.path.abspath(os.path.dirname(__file__))

# Set working directory
os.chdir(path)

# Define file to convert
file = 'points.ascii'

# Define shp file schema
schema = {
    'geometry': 'Point',
    'properties': {'LocationID': 'str', 'Latitude': 'float', 'Longitude': 'float'}
}

# Read in data
data = pd.read_csv(file, sep='\t')

# Define shp file to write to
shpOut = 'points.shp'

# Create shp file
with collection(shpOut, "w", "ESRI Shapefile", schema) as output:
    # Loop through dataframe and populate shp file
    for index, row in data.iterrows():
        # Define point
        point = Point(row['Longitude'], row['Latitude'])
        # Write output
        output.write({
            'properties': {'LocationID': row['LocationID'], 'Latitude': row['Latitude'], 'Longitude': row['Longitude']},
            'geometry': mapping(point)
        })
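
As a rough illustration of adapting this to the question's semicolon-delimited CSV, the sketch below builds the same kind of feature dictionaries that are passed to `output.write()`, using only the standard library. The geometry dictionary is written by hand in the form that shapely's `mapping()` returns for a point. The function name `rows_to_features` is hypothetical, and the property names are simply taken from the question's header, so the schema would need matching keys.

```python
import csv
import io

# Sample data from the question, inlined so the sketch runs on its own
SAMPLE = (
    "id_nr;date;target;start_lat;start_lon\n"
    "1;2012-05-21;navpoint 25x;55.123654;13.456954\n"
    "1;2012-05-23;navpoint 11f;55.143654;12.456954\n"
)


def rows_to_features(lines, delimiter=";"):
    """Yield one GeoJSON-like feature dict per CSV row (hypothetical helper)."""
    for row in csv.DictReader(lines, delimiter=delimiter):
        lon = float(row["start_lon"])
        lat = float(row["start_lat"])
        yield {
            # x is longitude, y is latitude
            "geometry": {"type": "Point", "coordinates": (lon, lat)},
            "properties": {
                "id_nr": row["id_nr"],
                "date": row["date"],
                "target": row["target"],
            },
        }


features = list(rows_to_features(io.StringIO(SAMPLE)))
print(features[0]["geometry"])
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by a file opened with `newline=''`, and each yielded dictionary could be written inside the `with collection(...)` block.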

Casivio