1

I am a beginner in R and have recently gone through a few packages. For practice, I have created a CSV data set with 3 columns: Part, Claimid and Cost. The data set looks like this:

Part  Claimid Cost
Part1 ID1 12
Part1 ID20 29
Part2 ID21 21
Part2 ID40 13
Part3 ID41 11
Part3 ID60 10

The Cost column contains random numbers. I am trying to loop over every Part (here 3 parts) and use the dplyr package to create three distinct data frames:

library(dplyr)
claimid <- read.csv(file.choose(), header = TRUE)
plist <- unique(claimid$Part)  # one loop iteration per Part (here 3)
for (i in plist) {
  plist <- claimid %>% select(Part, Claimid) %>% filter(Part %in% i)
}

When I print plist I get only the last 20 observations, because each iteration overwrites plist and R keeps only the result of the final one. Any help to take me forward would be great.

akrun
ARIMITRA MAITI

2 Answers

1

If we use a for loop, we need to create a list to store the output of each iteration. It is better to keep the data.frames in a list than as three separate data.frame objects.

plist <- unique(claimid$Part)
lst <- setNames(vector("list", length(plist)), plist)
for (i in seq_along(plist)) {
  lst[[i]] <- claimid %>%
    select(Part, Claimid) %>%
    filter(Part %in% plist[i])
}
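
To see what the loop produces, here is a minimal self-contained sketch. It assumes a small inline data frame built from the six rows shown in the question (the real data would come from read.csv()), and it shows how the elements of the named list can be reached by Part label.

```r
library(dplyr)

# Assumption: sample rows copied from the question, standing in for the CSV.
claimid <- data.frame(
  Part    = c("Part1", "Part1", "Part2", "Part2", "Part3", "Part3"),
  Claimid = c("ID1", "ID20", "ID21", "ID40", "ID41", "ID60"),
  Cost    = c(12, 29, 21, 13, 11, 10),
  stringsAsFactors = FALSE
)

plist <- unique(claimid$Part)
lst <- setNames(vector("list", length(plist)), plist)
for (i in seq_along(plist)) {
  lst[[i]] <- claimid %>%
    select(Part, Claimid) %>%
    filter(Part %in% plist[i])
}

names(lst)         # the Part labels, one per list element
lst[["Part2"]]     # only the rows for Part2
sapply(lst, nrow)  # number of rows stored for each Part
```

Because every piece lives in one list, later steps can run over all parts with lapply or sapply instead of naming each data.frame by hand.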

However, this can be done more directly with lapply:

lst1 <- lapply(plist, function(nm) {
  claimid %>%
    select(Part, Claimid) %>%
    filter(Part %in% nm)
})
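
As a side note, base R's split() can probably give a similar named list in one call. This is a different technique from the lapply version above, not a dplyr function, and the sketch below assumes the same six sample rows from the question.

```r
# Assumption: sample rows copied from the question, standing in for the CSV.
claimid <- data.frame(
  Part    = c("Part1", "Part1", "Part2", "Part2", "Part3", "Part3"),
  Claimid = c("ID1", "ID20", "ID21", "ID40", "ID41", "ID60"),
  Cost    = c(12, 29, 21, 13, 11, 10),
  stringsAsFactors = FALSE
)

# Keep the two columns of interest, then split the rows by Part.
# split() names each list element after the Part value it holds.
lst2 <- split(claimid[c("Part", "Claimid")], claimid$Part)

names(lst2)
lst2$Part1
```

Unlike the lapply call, split() names the list elements itself, so no setNames step is needed.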

However, if we need to create three different data.frame objects, assign is an option (but not recommended):

for (i in plist) {
  assign(i, claimid %>% select(Part, Claimid) %>% filter(Part %in% i))
}


Part1
#   Part Claimid
#1 Part1     ID1
#2 Part1    ID20

Part2
#   Part Claimid
#1 Part2    ID21
#2 Part2    ID40

Part3
#   Part Claimid
#1 Part3    ID41
#2 Part3    ID60
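
If assign is used anyway, one way to limit the clutter is to write the objects into a dedicated environment and collect them back into a list with mget when needed. The sketch below is only an illustration: parts_env is a hypothetical name, the data are the six sample rows from the question, and base subsetting replaces the dplyr pipeline to keep the example short.

```r
# Assumption: sample rows copied from the question, standing in for the CSV.
claimid <- data.frame(
  Part    = c("Part1", "Part1", "Part2", "Part2", "Part3", "Part3"),
  Claimid = c("ID1", "ID20", "ID21", "ID40", "ID41", "ID60"),
  Cost    = c(12, 29, 21, 13, 11, 10),
  stringsAsFactors = FALSE
)

plist <- unique(claimid$Part)

# Hypothetical environment that holds the per-Part objects,
# so the global workspace is left untouched.
parts_env <- new.env()
for (i in plist) {
  assign(i, claimid[claimid$Part == i, c("Part", "Claimid")], envir = parts_env)
}

ls(parts_env)                                # names of the objects created
collected <- mget(plist, envir = parts_env)  # back to a named list
collected$Part2
```

Needing mget to work with the pieces together again is one reason the list-based versions are usually easier to handle.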
akrun
  • Many thanks, akrun. I got to learn the seq_along and assign functions. Thank you for helping beginners like me. God bless. – ARIMITRA MAITI Aug 20 '16 at 08:36
  • Hi akrun, if I use this code: for (i in plist) { assign(i, claimid %>% filter(Part %in% i) %>% group_by(Part) %>% summarise(totcost = sum(Cost), claim = n_distinct(Claimid))); rbind(df, i) } would I be able to append the datasets Part1, Part2 and Part3 together in a single dataset df? – ARIMITRA MAITI Aug 20 '16 at 09:11
  • @ARIMITRAMAITI If you started with a single dataset, I am not sure why you are splitting it into different datasets. – akrun Aug 20 '16 at 11:46
0

If I use this code:

claimid <- read.csv(file.choose(), header = TRUE)
df <- data.frame(Part = character(), totcost = integer(), claim = integer(), stringsAsFactors = FALSE)
plist <- unique(claimid$Part)
for (i in plist) {
  assign(i, claimid %>% filter(Part %in% i) %>% group_by(Part) %>%
           summarise(totcost = sum(Cost), claim = n_distinct(Claimid)))
  rbind(df, i)
}

Would I be able to append the data sets Part1, Part2 and Part3 together in a single data set df?
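
As written, the loop would probably not fill df: inside the loop i is the character string "Part1", "Part2" or "Part3" rather than the data frame of that name, and the result of rbind is never assigned back to df. The sketch below shows two possible alternatives, assuming the six sample rows from the question; pieces and df_loop are hypothetical names. Option A summarises the whole data set by Part without splitting it first, and Option B keeps the loop but stores each summary in a list and binds the rows once with dplyr's bind_rows.

```r
library(dplyr)

# Assumption: sample rows copied from the question, standing in for the CSV.
claimid <- data.frame(
  Part    = c("Part1", "Part1", "Part2", "Part2", "Part3", "Part3"),
  Claimid = c("ID1", "ID20", "ID21", "ID40", "ID41", "ID60"),
  Cost    = c(12, 29, 21, 13, 11, 10),
  stringsAsFactors = FALSE
)

# Option A: one grouped summary over the whole data set, no loop.
df <- claimid %>%
  group_by(Part) %>%
  summarise(totcost = sum(Cost), claim = n_distinct(Claimid))

# Option B: keep the loop, collect each one-row summary in a list,
# then bind all rows together in a single step.
plist <- unique(claimid$Part)
pieces <- vector("list", length(plist))
for (k in seq_along(plist)) {
  pieces[[k]] <- claimid %>%
    filter(Part %in% plist[k]) %>%
    group_by(Part) %>%
    summarise(totcost = sum(Cost), claim = n_distinct(Claimid))
}
df_loop <- bind_rows(pieces)

df
df_loop
```

Both options should give one row per Part with the total cost and the number of distinct claims; Option A is the shorter of the two when the data start out as a single data set.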

jogo
ARIMITRA MAITI