1

I am storing the result of a PostgreSQL query as a data.frame in R. One of the "columns" is an integer[] array type. In R, this is imported as a character string by the RPostgreSQL package.

How do I convert the character string type into a numeric list type column (or as separate numeric columns) in my data.frame?

connect and get data

require(RPostgreSQL)
drv = dbDriver("PostgreSQL")
con = dbConnect(drv, host = ..., post =..., dbname =..., user=..., password=...)
df = dbGetQuery(con, query_string)
dbDisconnect(con)

minimum working example

df = data.frame(id = c(1:100), arrcol = c(rep(paste0("{{",paste0(1:99,collapse=","),"}}"),10)))

remove brackets

df$arrcol = gsub(fixed=T, "{", "", df$arrcol)
df$arrcol = gsub(fixed=T, "}", "", df$arrcol)

convert to numeric list

# Attempt 1: 
df$arrcol = as.numeric(df$arrcol)
# Error: (list) object cannot be coerced to type 'double'

# Attempt 2:
df$arrcol = lapply(df$arrcol, function(x) strsplit(x, ",", fixed=T))
# no error, but now the data appears to be stored as a list of character lists: 
# arrcol[1]: list(c("1", "2", "3", "4", "5",...

# Attempt 3:
df$arrcol = lapply(df$arrcol, function(x) as.numeric(unlist(strsplit(x, ",", fixed=T))))
# this one seems to work
Brian D
  • 2,388
  • 24
  • 41
  • there is an issue filed with the RPostgreSQL package repository but I know outside linking is discouraged: https://github.com/codeinthehole/rpostgresql/issues/5 – Brian D Dec 05 '16 at 20:51

1 Answers1

0

My own best answer:

df$arrcol = lapply(df$arrcol, function(x) as.numeric(unlist(strsplit(x, ",", fixed=T))))

Alternatively, (as long as each of the arrays has the same length) you could use this trick (Splitting a dataframe string column into multiple different columns) to parse the string into separate columns. Note that read.table is clever enough to recognize each of the new variables as integers.

newdf = read.table(text = df$arrcol, header = F, sep = ",")

Additionally, you can easily append those as their own columns onto the original data.frame:

df = cbind(df, newdf)
Community
  • 1
  • 1
Brian D
  • 2,388
  • 24
  • 41