2

I have the following column:

Growth

0.1
0.5
0.23
.
0.5
0.67
0.23
0.42
.
.
.
0.12
0.20

Where dots indicate that there was no data available at that time.

Now I want to compute the difference in growth between the current case and the last available case. Thus, in the second last case this should become: 0.12 - 0.42 = -0.30.

Lag(growth) takes the last case, and returns nothing if it is not defined. So how do I get the last available case?

Response:

This seems to work:

define !lastAvailable(target !Tokens(1) /level !Tokens(1) /Varys !CMDEND)
   compute id= $casenum.
   dataset copy tmpset2.
   select if not miss(!target).
   compute diff= !target - lag(!target, !level).
   match files /file= * /file= tmpset2 /by id.
   exec.
   dataset close tmpset2.
!enddefine.

!lastAvailable target=growth level=3
Tom
  • 509
  • What are you really trying to do? Perhaps RMV (Transform>Replace Missing Values) would be useful here. – JKP Jan 08 '12 at 17:01
  • I'm afraid not, these values cannot be computed because they should not exist. In real life, there are no values at these intervals either. – Tom Jan 08 '12 at 17:02

1 Answers1

4
if not miss(lag(growth)) #x= lag(growth).
comp diff= growth-#x.
exec.

 .10
 .40
-.27
 .
 .27
 .17
-.44
 .19
 .
 .
 .
-.30
 .08

Addition. The "general" solution that you ask for in the comment and that should allow for arbitrary lag seems hard to achieve in the context of presence of missing values not following a specific pattern. So, the straightforward way would be to get rid of missing values temporarily; without them, the task becomes trivial.

comp id= $casenum. /*Create case identificator
dataset copy set2. /*Take a copy of the dataset
select if not miss(growth). /*Delete missing cases from original dataset
comp diff= growth-lag(growth,3). /*And compute difference with the lagged case (in this example, lag 3)
match files /file= * /file= set2 /by id. /*Insert back missing cases from set2
exec.
dataset close set2. /*Delete set2
ttnphns
  • 57,480
  • 49
  • 284
  • 501
  • What language is this? It does not seem to run in SPSS. Also, if lag is missing, how will it take the last available value? Did you understand my question? – Tom Jan 07 '12 at 14:34
  • This is clean SPSS syntax. I checked it on SPSS 20 and the result is what I perceive you wanted in your question. Now I show the resultunt column in my answer. – ttnphns Jan 07 '12 at 15:05
  • You are correct. I should have mentioned that I also need to be able to get the second last item, third last item, etc. I figured there was a general method for this. I understand why your logic works for the last available item, but can it be applied for the second last available item? – Tom Jan 07 '12 at 15:41
  • So, you want to pick, say, the 2nd (not the 1st) value above the current value and, if that value is missing, then pick the closest above it. Right? Then change both lag(growth) in the first command to lag(growth,2). Will that suit you now? – ttnphns Jan 07 '12 at 16:21
  • No, I want to get the second last available value. So, in my example it should return 0.12 - 0.23 = -0.11 for the second last row (the first last available value is 0.42, the second last available value 0.23 and the sixth last available value is 0.5). – Tom Jan 07 '12 at 16:26
  • I added another solution to the answer – ttnphns Jan 08 '12 at 08:40
  • That works just fine. Two remarks though: 1) Can you not simply temporarily copy the column? That seems more efficient than copying the entire dataset. 2) The current version also copies an id column into the original dataset. Would it be possible to put this logic in a custom function for future reference? I need to do this a lot. Something like comp diff = growth - getLastAvail(growth, 3). – Tom Jan 08 '12 at 11:26
  • SPSS copies the entire dataset to the new dataset, no other way round. Does that really bother you? 2) Dont't quite understand what you seek. ID variable is necessary both in the original dataset and in its copy, set2, to be able to insert back missing cases from set2 into the original set. As for creating custom function out of that syntax for future, you could create a macro (read DEFINE chapter in SPSS Command Reference Guide) or write command extensibility function in Python language which is an add-on to SPSS.
  • – ttnphns Jan 08 '12 at 12:57
  • P.S. You can always delete ID column afterwards if you don't want to keep it. DELETE VARIABLES id. – ttnphns Jan 08 '12 at 14:00
  • Thanks, I accepted your answer though I'm having problems with the macro. I cannot figure out how to make a macro return a value. – Tom Jan 08 '12 at 17:21
  • I added something I tried to the original post, which gives errors. I still don't know how to make a macro return a value though. – Tom Jan 08 '12 at 17:42
  • FYI I moved my macro questions to SO http://stackoverflow.com/questions/8780034/spss-macro-to-open-csv-from-variable – Tom Jan 08 '12 at 18:30