300 likes | 421 Views
This article discusses a practical SAS solution for transforming datasets using look-ahead and look-back techniques. Given an input dataset with ID and Measure, it demonstrates how to create a new table containing Next_Measure and Last_Measure fields. The process is broken down step-by-step, emphasizing the logic behind the data manipulation. The method leverages data steps to handle the transformation by using first and last observations to assign values. This tutorial is valuable for SAS users seeking to enhance their data management skills.
E N D
Look Both Ways Dr. Arthur TabachneckDirector, Data Management Note: program stolen from a SASOPEDIA article by Howard Schreierhttp://www.sascommunity.org/wiki/Look-Ahead_and_Look-Back
suppose you had the following data: data have; input ID $ Measure; cards; A 11 A 12 A 13 A 14 B 21 B 22 B 23 ;
and you needed to have the following table: data need; input ID $ MeasureNext_Measure Last_Measure; cards; A 11 12 . A 12 13 11 A 13 14 12 A 14 . 13 B 21 22 . B 22 23 21 B 23 . 22 ;
that is, with the following assignments made: ID MeasureNext_Measure Last_Measure -- ------------ --------------------- -------------------- A 11 12 . A 12 13 11 A 13 14 12 A 14 . 13 B 21 22 . B 22 23 21 B 23 . 22
that is, with the following assignments made: ID MeasureNext_Measure Last_Measure -- ------------ --------------------- -------------------- A 11 12 . A 1213 11 A 1314 12 A 14 13 B 21 22 . B 2223 21 B 23 22
that is, with the following assignments made: ID MeasureNext_Measure Last_Measure -- ------------ --------------------- -------------------- A 11 12 A 12 13 11 A 13 14 12 A 14 . 13 B 21 22 B 22 23 21 B 23 . 22
a data step solution data need; set have; by ID; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); run;
Why it works: Under the hood-Iteration #1: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ A11 10 . . 0 1 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #1: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ A 11 1 0 12 . 0 1 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #1: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ A 11 1 0 12 . 0 1 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #2: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ A1200 12 . 0 2 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #2: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ A 12 0 0 13 . 0 2 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #2: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ A 12 0 0 1311 0 2 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #3: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ A13 00 13 11 0 3 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #3: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ A 13 0 0 14 11 0 3 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #3: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ A 13 0 0 14 12 0 3 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #4: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ A14 01 14 12 0 4 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #4: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ A 14 0 1 21 12 0 4 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #4: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ A 14 0 1 . 13 0 4 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #5: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ B21 10 . 13 0 5 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #5: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ B 21 1 0 22 . 0 5 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #5: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ B 21 1 0 22 . 0 5 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #6: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ B22 00 22 . 0 6 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #6: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ B 22 0 0 23 21 0 6 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #6: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ B 22 0 0 23 21 0 6 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #7: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ B23 01 23 21 0 7 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #7: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ B 23 0 1 11 22 0 7 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
Why it works: Under the hood-Iteration #7: ID Measure FIRST.ID LAST.ID Next_Measure Last_Measure _ERROR_ _N_ B 23 0 1 . 22 0 7 PDV ID Measure -- ------------ A 11 A 12 A 13 A 14 B 21 B 22 B 23 data need; set have; by ID; PutLog _all_ ; set have ( firstobs = 2 keep = Measure rename = (Measure = Next_Measure) ) have ( obs = 1 drop = _all_); PutLog _all_ ; Last_Measure = ifn( first.ID, (.), lag(Measure) ); Next_Measure = ifn( last.ID, (.), Next_Measure ); PutLog _all_ ; run;
ending up with the following table ID MeasureNext_Measure Last_Measure -- ------------ --------------------- -------------------- A 11 12 . A 12 13 11 A 13 14 12 A 14 . 13 B 21 22 . B 22 23 21 B 23 . 22
Questions? Your comments and questions are valued and encouraged. Contact the author: Dr. Arthur Tabachneck Director, Data Management Insurance Bureau of Canada Toronto, Ontario L3T 5K9 Email: atabachneck@ibc.ca