At my internship, I have been asked to learn Python on the job. It has been a rough road, as I had never coded before, so please forgive me if my problem is elementary.

The Lightning CSV files I am trying to convert have over 400,000 rows.
The headers go as follows: mon, day, year, hr, min, sec, lat, lon, ht, type, Ip, Id

I found the answers to this question to be potential solutions, but I am struggling to modify the code to convert the CSVs into shapefiles.

The path for the CSVs is C:\Users\zherran\Desktop\shp\enwest2015*.csv

The path where I want the shapefiles is C:\Users\zherran\Desktop\shp\shapefiles

import shapefile as shp
import csv

out_file = 'enwest3days.shp'

#Set up blank lists for data
mon,day,year,hr,min,sec,lat,lon,ht,type,Ip,Id=[],[],[],[],[],[],[],[],[],[],[],[]

# sample row mon,day,year,hr,min,sec,lat,lon,ht,type,Ip,Id
#            6,2,2015,7,27,16.6,41.5,-101.4,13789.5,1,-7900,26abb8a


#read data from csv file and store in lists
with open('enwest*.csv', 'rb') as csvfile:
    r = csv.reader(csvfile, delimiter=';')
    for i,row in enumerate(r):
        if i > 0: #skip header
            mon.append(int(row[0]))
            day.append(int(row[1]))
            year.append(int(row[2]))
            hr.append(int(row[3]))
            min.append(int(row[4]))
            sec.append(float(row[5]))
            lat.append(float(row[6]))
            lon.append(float(row[7]))
            ht.append(float(row[8]))
            type.append(int(row[9]))
            Ip.append(float(row[10]))
            Id.append(complex(row[11]))

#Set up shapefile writer and create empty fields
w = shp.Writer(shp.POINT)
w.autoBalance = 1 #ensures geometry and attributes match
w.field('X','F',10,8)
w.field('Y','F',10,8)
w.field('Date','D')
w.field('Target','C',50)
w.field('ID','N')
w.field('ID','N')
w.field('ID','N')
w.field('ID','N')
w.field('ID','N')
w.field('ID','N')
w.field('ID','N')

#loop through the data and write the shapefile
for j,k in enumerate(x):
    w.point(k,y[j]) #write the geometry
    w.record(k,y[j],date[j], target[j], id_no[j]) #write the attributes

#Save shapefile
w.save(out_file)
nmtoken
Z. Herran
  • You should add the code that you have written. – alphabetasoup Jun 09 '16 at 20:55
  • The reason I ask to see your code is that it sounds as if you are having difficulty with Python syntax and programming concepts generally: with just the posted code from someone else, we have no idea what you don't understand. – alphabetasoup Jun 09 '16 at 21:19
  • Do you have to use Python? If this is just a task you need to get done, you could try using the GDAL/OGR command-line tools. Specifically the ogr2ogr tool; this post is extremely useful for what you've got: http://gis.stackexchange.com/questions/127518/convert-csv-to-kml-or-geojson-etc-using-ogr2ogr-or-equivalent-command-line-tool. If you are set on using Python, I would also recommend using the GDAL/OGR Python modules. – Dave-Evans Jun 09 '16 at 21:42
  • As @Dave-Evans said, do you have your heart set on 'shapefile'? OGR is available for Python and has a driver for CSV which would simplify all this code down to a few lines. – Michael Stimson Jun 09 '16 at 23:01
  • can you explain what is actually not working? – Ian Turton Jun 10 '16 at 08:41
  • @iant thank you for the edits! I just got back to the office this morning and I am wrapping up the process of changing the variables in order to give the specifics. – Z. Herran Jun 10 '16 at 13:26
  • @Dave-Evans thank you for the suggestion! I will look into using the GDAL/OGR modules. I'll post an update when I get the chance to work with them! – Z. Herran Jun 10 '16 at 13:30
  • @RichardLaw so after modifying the code (as seen above), what I am having difficulty understanding is what the pyshp datatype letters and the corresponding numbers signify in the w.field portion of the code. – Z. Herran Jun 10 '16 at 14:08
  • FYI, I'm using Spyder (Python 3.5) – Z. Herran Jun 10 '16 at 14:11

1 Answer

a) With the solution you are using (PyShp, the shapefile module), you need

1) to extract the fields from the CSV file
2) to define the fields of the shapefile (w.field('Target','C',50))
3) to construct the geometry from the lon and lat fields (w.point(float(i['x']),float(i['y']))) (see CSV to Shapefile)

b) With Fiona it is easier, but you still have the same problem of field definition.
c) osgeo/ogr complicates things...
d) Therefore, be modern: to convert CSV files directly to shapefiles, the best solution is now Pandas, GeoPandas (which uses Fiona) and Shapely. You do not have to worry about the fields and their definitions (pandas.read_csv takes them from the header row; the older pandas.DataFrame.from_csv was deprecated and later removed from pandas).

import pandas as pd
from geopandas import GeoDataFrame
from shapely.geometry import Point
# read the csv file into a DataFrame (read_csv replaces the removed DataFrame.from_csv)
data = pd.read_csv('enwest2015_1.csv')
# build the point geometry from the lon/lat columns (x = lon, y = lat)
points = [Point(lon, lat) for lon, lat in zip(data['lon'], data['lat'])]
# convert the DataFrame to a GeoDataFrame
geo_df = GeoDataFrame(data, geometry=points)
# save the resulting shapefile
geo_df.to_file('enwest2015_1.shp', driver='ESRI Shapefile')
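If you prefer to stay with option (a), steps 1 and 2 might be sketched as below using only the standard csv module. In a w.field(name, type, size, decimal) call, the letter is the dBase field type ('C' for text, 'N' for number, 'F' for float), size is the total field width in characters, and decimal is the number of digits after the decimal point. The widths in FIELD_SPECS and the helper parse_rows are assumptions made up for illustration, not values taken from your data, and the sample row in the question is comma-delimited, so the reader uses the default delimiter rather than ';'.

```python
import csv
import io

# Hypothetical field specs, in the order pyshp's w.field() expects them:
# (name, fieldType, size, decimal). The widths are illustrative guesses.
FIELD_SPECS = [
    ("mon", "N", 2, 0),
    ("day", "N", 2, 0),
    ("year", "N", 4, 0),
    ("hr", "N", 2, 0),
    ("min", "N", 2, 0),
    ("sec", "F", 6, 2),
    ("lat", "F", 10, 5),
    ("lon", "F", 11, 5),
    ("ht", "F", 10, 1),
    ("type", "N", 2, 0),
    ("Ip", "F", 10, 1),
    ("Id", "C", 16, 0),  # values such as 26abb8a are text, not numbers
]

# Python converter used for each field type code.
CONVERTERS = {"N": int, "F": float, "C": str}


def parse_rows(csvfile):
    """Yield one dict per CSV row, with values converted per FIELD_SPECS."""
    reader = csv.DictReader(csvfile)  # reads the header row, comma-delimited
    for row in reader:
        yield {name: CONVERTERS[ftype](row[name].strip())
               for name, ftype, _size, _decimal in FIELD_SPECS}


# The header and sample row quoted in the question.
sample = io.StringIO(
    "mon,day,year,hr,min,sec,lat,lon,ht,type,Ip,Id\n"
    "6,2,2015,7,27,16.6,41.5,-101.4,13789.5,1,-7900,26abb8a\n"
)
rows = list(parse_rows(sample))
print(rows[0]["lon"], rows[0]["lat"], rows[0]["Id"])
```

Each tuple in FIELD_SPECS could then be passed to w.field(), and each parsed row would supply the x (lon) and y (lat) for w.point() along with the attribute values for w.record().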

Handling the paths is a pure Python problem.
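One way to handle the paths, as a minimal sketch: open() does not expand wildcards such as enwest*.csv, so the standard glob module can list the matching files first, and os.path can build the output name for each one. The helper name plan_outputs is made up for illustration; the two paths are the ones given in the question.

```python
import glob
import os


def plan_outputs(csv_pattern, out_dir):
    """Map each CSV matching csv_pattern to a .shp path inside out_dir."""
    pairs = []
    for csv_path in sorted(glob.glob(csv_pattern)):
        # enwest2015_1.csv -> enwest2015_1
        stem = os.path.splitext(os.path.basename(csv_path))[0]
        pairs.append((csv_path, os.path.join(out_dir, stem + ".shp")))
    return pairs


# Raw strings keep the backslashes in Windows paths from being read as escapes.
csv_pattern = r"C:\Users\zherran\Desktop\shp\enwest2015*.csv"
out_dir = r"C:\Users\zherran\Desktop\shp\shapefiles"

for csv_path, shp_path in plan_outputs(csv_pattern, out_dir):
    print(csv_path, "->", shp_path)
```

Each (csv_path, shp_path) pair can then be fed to whichever conversion you choose, one file at a time. The output folder has to exist before writing; os.makedirs(out_dir, exist_ok=True) would create it.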

gene