I would like to create a new column in my data frame esg.ordered, esg.ordered$flowpct <- esg.ordered$flow[i]/lag(esg.ordered$size[i]), but only for rows where the value (name) in column fundid is the same as in the previous row. In all other rows, column flowpct should contain "NA". Here is my code:

for (i in esg.ordered) {
  if(esg.ordered$fundid[i]==lag(esg.ordered$fundid[i],n=1)){
    esg.ordered$flowpct <- esg.ordered$flow[i]/lag(esg.ordered$size[i])
  }else{
    esg.ordered$flowpct <- "NA"
  }
}

Unfortunately, I get an error and a warning:

  1. Error in if (esg.ordered$fundid[i] == lag(esg.ordered$fundid[i], n = 1)) { : missing value where TRUE/FALSE needed

  2. Warning: In if (esg.ordered$fundid[i] == lag(esg.ordered$fundid[i], n = 1)) { : the condition has length > 1 and only the first element will be used

Can you help me fix these?

Here is the data

fundid size flow
FS00008KNP 78236537 7038075.43
FS00008KNP 73048868 -5691940.56
FS00008KNP 74688822 -193188.79
FS00008KNP 95330799 11991514.11
FS00008L0W 44170465 -15706588.66
FS00008L0W 33278560 -12749545.90
FS00008L0W 26084262 -6879079.19
FS00008L0W 23857701 -3227825.03
CodingGirl
  • Can you provide us with a [Minimal Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example)? – DeBARtha Sep 15 '21 at 10:10
  • Welcome to SO. You also appear to be attempting to store both character and numeric values in `flowpct`. That's likely to cause problems later on. Some sample data would help us to help you. [This post](https://stackoverflow.com/help/minimal-reproducible-example) may help you create a minimum working example. Finally, a good rule of thumb when using R is "if I'm using a loop, there's probably a better way to do it". This, I suspect, is a case in point. – Limey Sep 15 '21 at 10:12

1 Answer

  1. "NA" (a character string) is not the same as NA (the missing value, which is probably what you want there).

  2. for (i in esg.ordered) is wrong: it is iterating over each column in your frame named esg.ordered, so i is a full vector. I think you mean for (i in seq_len(nrow(esg.ordered))).

  3. The error missing value where TRUE/FALSE needed is easily searched and should return (among other links) Error in if/while (condition) {: missing Value where TRUE/FALSE needed. It is because the conditional in if is returning NA.

  4. You appear to be operating on a whole vector at a time. This is a literal translation of what you are trying to do, but without the for loop:

    esg.ordered$flowpct <- ifelse(
      c(TRUE, esg.ordered$fundid[-1] == esg.ordered$fundid[-nrow(esg.ordered)]),
      esg.ordered$flow / c(NA, esg.ordered$size[-nrow(esg.ordered)]),
      NA)
    esg.ordered
    #       fundid     size        flow      flowpct
    # 1 FS00008KNP 78236537   7038075.4           NA
    # 2 FS00008KNP 73048868  -5691940.6 -0.072752972
    # 3 FS00008KNP 74688822   -193188.8 -0.002644651
    # 4 FS00008KNP 95330799  11991514.1  0.160552996
    # 5 FS00008L0W 44170465 -15706588.7           NA
    # 6 FS00008L0W 33278560 -12749545.9 -0.288644140
    # 7 FS00008L0W 26084262  -6879079.2 -0.206712045
    # 8 FS00008L0W 23857701  -3227825.0 -0.123746075
    

    However, this relies wholly on fundid being ordered correctly. I think a safer way to go is this, using ave to get the previous size within the current fundid, and then dividing:

    esg.ordered$flowpct <- with(esg.ordered,
      flow / ave(size, fundid, FUN = function(z) c(NA, z[-length(z)])))
    

    Same results as above, but much safer.
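
If you do want to keep a loop, points 1 to 3 can be sketched roughly as below. This is only an illustration, not a recommendation over the vectorized versions above: the data frame is rebuilt from the sample data in the question, and starting at row 2, the `isTRUE()` guard and the `same_fund` helper name are my own assumptions about how to avoid the NA condition, not something from the original code.

```r
# Sample data copied from the question
esg.ordered <- data.frame(
  fundid = rep(c("FS00008KNP", "FS00008L0W"), each = 4),
  size   = c(78236537, 73048868, 74688822, 95330799,
             44170465, 33278560, 26084262, 23857701),
  flow   = c(7038075.43, -5691940.56, -193188.79, 11991514.11,
             -15706588.66, -12749545.90, -6879079.19, -3227825.03)
)

# Pre-fill with a numeric NA so the column stays numeric
# (assigning the string "NA" would turn it into a character column)
esg.ordered$flowpct <- NA_real_

# Iterate over row indices, not over the data frame itself.
# Start at row 2: row 1 has no previous row to compare against.
for (i in seq_len(nrow(esg.ordered))[-1]) {
  same_fund <- esg.ordered$fundid[i] == esg.ordered$fundid[i - 1]
  # isTRUE() guards against an NA comparison, which would make if() fail
  if (isTRUE(same_fund)) {
    # Assign to element i only, not to the whole column
    esg.ordered$flowpct[i] <- esg.ordered$flow[i] / esg.ordered$size[i - 1]
  }
}
```

Like the ifelse version, this loop assumes that rows of the same fundid are contiguous and in the right order; the ave version does not need that assumption.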

r2evans
  • Thank you for helping me :) trying my best to understand your suggestions and I will give you my data in a sec – CodingGirl Sep 15 '21 at 12:08
  • See the edit. This is still a dupe of the two links provided (the error and the warning); I suggest you click on the dupe-links and read several of the excellent answers/comments about the problems. As for the calculation itself, I offer two suggestions; I urge you to use the second (`ave`), as it is much more robust to the ordering of the data (i.e., `fundid` does not need to be contiguous/ordered). – r2evans Sep 15 '21 at 13:33
  • Thank you so much again, I have tried your suggestion and it works! You really saved the day!! :) – CodingGirl Sep 15 '21 at 13:39
  • Since you're new ... even though the question was closed as a duplicate of other questions, you may still [accept](https://stackoverflow.com/help/someone-answers) the answer; the impetus for closing it is to allow the well-detailed answers (in the links) to stay at the top of the search-list instead of being diluted by identical questions. There is no rush to answer, but when you are certain it addresses your question, please remember to come back and accept it. Thanks! – r2evans Sep 15 '21 at 13:51