
I have read several other posts about how to import CSV files with read.csv while skipping specific columns. However, all the examples I found had very few columns, so it was easy to do something like:

 columnHeaders <- c("column1", "column2", "column_to_skip")
 columnClasses <- c("numeric", "numeric", "NULL")
 data <- read.csv(fileCSV, header = FALSE, sep = ",",
                  col.names = columnHeaders, colClasses = columnClasses)

I have 201 columns, without column labels, and I would like to skip the last column. How can I do this without naming all the other columns to keep? Many thanks.

R. Schifini
dede

2 Answers

It's a bit hacky, but I usually read in a small number of rows from the dataset, use sapply(data, class) to find the column types, and then set the last one to "NULL".

data <- read.table("test.csv", sep = ",", nrows = 100)
colClasses <- sapply(data, class)
colClasses[length(colClasses)] <- "NULL"

Then you can pass this colClasses vector to read.csv() through its colClasses argument; any column marked "NULL" is skipped on import.
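Putting the two steps together might look like the sketch below. It writes a small temporary file so that it runs on its own; the file contents and the names `tmp`, `peek`, and `full` are made up for illustration and are not part of the original answer.

```r
# Hypothetical demo file: 3 columns, no header; the last column is the one to skip.
tmp <- tempfile(fileext = ".csv")
writeLines(c("1,2.5,x", "3,4.5,y", "5,6.5,z"), tmp)

# Step 1: read only a few rows so R can guess the column types.
peek <- read.table(tmp, sep = ",", nrows = 2)
colClasses <- sapply(peek, class)
colClasses[length(colClasses)] <- "NULL"  # "NULL" marks a column to be skipped

# Step 2: read the whole file, skipping the last column.
full <- read.csv(tmp, header = FALSE, colClasses = colClasses)
```

One thing to watch: the types guessed from the first rows are applied to the whole file, so a column that looks like integers early on but contains decimals further down would likely fail to parse. Reading more rows in step 1 reduces that risk.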

Andrew Haynes

You can just read in all the data and then eliminate the offenders afterwards.

data <- read.csv("../CAASPP_clustering/ca2016_1_csv_v3.zip")
data_trimmed <- data[, -ncol(data), drop = FALSE]  # drop the last column

If you prefer to screen the columns by class programmatically, you could do something like this (substitute whichever class you want to exclude for "NULL"):

class_list <- lapply(data, class)
chosen_cols <- names(class_list[class_list != "NULL"])
data_trimmed <- data[chosen_cols]
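As a concrete illustration of screening by class, the sketch below drops every character column from a small data frame. The data frame and the choice of "character" as the class to exclude are assumptions made for the example, not something taken from the answer.

```r
# Hypothetical data frame standing in for the imported file.
data <- data.frame(V1 = 1:3,
                   V2 = c(2.5, 4.5, 6.5),
                   V3 = c("x", "y", "z"),
                   stringsAsFactors = FALSE)

# Take the first class of each column so the result is always a character vector.
col_classes <- vapply(data, function(x) class(x)[1], character(1))

# Keep the columns whose class is not the one being excluded.
chosen_cols <- names(col_classes)[col_classes != "character"]
data_trimmed <- data[chosen_cols]
```

Here `data_trimmed` keeps `V1` and `V2` only.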
leerssej