1 / 24

PROC SQL: Tips and Translations for Data Step Users

PROC SQL: Tips and Translations for Data Step Users. By: Gail Jorgensen Susan Marcella. AGENDA. SQL Syntax Review Joins Translated SQL Strengths & Uses. PROC SQL: Tips and Translations for Data Step Users Syntax. Proc SQL; create table/view newdsname as

topaz
Download Presentation

PROC SQL: Tips and Translations for Data Step Users

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. PROC SQL: Tips and Translations for Data Step Users By: Gail Jorgensen Susan Marcella

  2. AGENDA • SQL Syntax Review • Joins Translated • SQL Strengths & Uses

  3. PROC SQL: Tips and Translations for Data Step UsersSyntax Proc SQL; create table/view newdsnameas select var1, var2, … varN from dsname where condition ; Quit;

  4. PROC SQL: Tips and Translations for Data Step Users JOIN vs MERGE Types of JOINs • Inner Join – selects only matching records (same as: if ina and inb) • Outer Join – selects some non-matching records • Left Join – selects all records from first table, only matching records from second (same as: if ina) • Right join – selects all records from second table, only matching records from first (same as: if inb) -- Full join – selects all records from both tables (same as having no if statement)

  5. PROC SQL: Tips and Translations for Data Step UsersInner Join Dataset DS_A1 Dataset DS_BB data c; merge ds_a1(in=ina) ds_b(in=inb rename=(idno=idfld)); by idfld; if ina and inb; run; proc sql; create table ds_c as select ds_a1.*, ds_b.* from ds_a1, ds_b where ds_a1.idfld = ds_b.idno; quit;

  6. PROC SQL: Tips and Translations for Data Step UsersLeft Join Dataset inf_a Dataset inf_b proc sql; create table sql_left as select a.*, b.* from inf_a as a left join inf_b as b on a.idfld = b.idfld; quit; data ds_left; merge inf_a(in=ina) inf_b(in=inb); by idfld; if ina ; run; Dataset sql_left Dataset ds_left

  7. PROC SQL: Tips and Translations for Data Step UsersRight Join proc sql; create table sql_right as select a.*, b.* from inf_a as a right join inf_b as b on a.idfld = b.idfld; quit; data ds_right; merge inf_a(in=ina) inf_b(in=inb); by idfld; if inb; run;

  8. PROC SQL: Tips and Translations for Data Step Users Full Join CList07 CList08 procsql; create table sql_clist as select c7.name, c7.recd as recd07, c8.recd as recd08, c7.sent as sent07, c8.sent as sent08 from clist07 as c7 full join clist08 as c8 on c7.name=c8.name; quit; procsort data=clist07; by name; run; procsort data=clist08; by name; run; data data_clist; merge clist07 clist08 (rename=(recd=recd08 sent=sent08)); by name; run;

  9. PROC SQL: Tips and Translations for Data Step Users Full Join (Con’t) Sql_CList Data_CList

  10. PROC SQL: Tips and Translations for Data Step UsersHandling Duplicate Variable Names To always select the variable from one dataset: • Drop unwanted version of variable (PROC SQL permits all SAS dataset options) • Select variable from specific table To keep variable from both tables: • Rename the variable in one dataset To select variable based on value: • Use CASE statement

  11. PROC SQL: Tips and Translations for Data Step UsersCASE Statement procsql; create table NewCList as select case when missing(c7.name) then c8.name else c7.name end as name, c7.recd as recd07, c8.recd as recd08, c7.sent as sent07, c8.sent as sent08 from clist07 as c7 full join clist08 as c8 on c7.name=c8.name; quit; procsort data=clist07; by name; run; procsort data=clist08; by name; run; data data_clist; merge clist07 clist08 (rename=(recd=recd08 sent=sent08)); by name; run;

  12. PROC SQL: Tips and Translations for Data Step UsersCASE Statement - Results

  13. PROC SQL: Tips and Translations for Data Step UsersDown Calculations PROC SORT data=shs.exposure; by subject_id; run; DATA counters(KEEP=TableName MAXOBS TOTOBS); SET shs.exposure END=LAST; BY subject_id; length TableName $ 50; RETAIN MAXOBS OBSCNTR TOTOBS 0; TableName=“exposure"; TOTOBS+1; OBSCNTR+1; IF LAST.subject_id THEN DO; IF MAXOBS < OBSCNTR THEN MAXOBS=OBSCNTR; OBSCNTR=0; END; IF LAST THEN OUTPUT chemcnts; label maxobs='Maximum number of obs per person' totobs='Total Number obs in table'; run; procprint data=counters; run;

  14. PROC SQL: Tips and Translations for Data Step UsersDown Calculations procsql; create table sqlcounter as selectdistinct subject_id, count(*) as subjcnt from fshs.exposure group by subject_id; select “exposure" as TableName, max(subjcnt) as MaxObs, sum(subjcnt) as TotObs from sqlcounter; quit; sqlcounter

  15. PROC SQL: Tips and Translations for Data Step UsersCounts and Nesting Queries procsql; selectdistinct genre, count(*) from itunes groupby genre; quit; procsqloutobs=1; select (select count(*) from itunes) as TotalSongs, (select count(distinct genre) from itunes) as GenreCnt, (select count(distinct artist) from itunes) as ArtistCnt, (select count(distinct album) from itunes) as AlbumCnt from itunes; quit;

  16. PROC SQL: Tips and Translations for Data Step UserDictionaries proc sql; create view detail as select * from dictionary.columns ; create view extern as select * from dictionary.members ; create view tbl as select * from dictionary.tables ; create view gotem as select trim(libname) as LibName, trim(memname) as TableName, trim(name) as ColName, label as ColLabel from sashelp.vcolumn ; quit; SAS

  17. PROC SQL: Tips and Translations for Data Step UserDictionaries – Getting variable names proc sql; /* get names of all variables you want */ select name into:drinkvars separated by ', ' from dictionary.columns where libname=‘AUG' and memname='DEMOG' and lowcase(name) contains ‘ndrk'; /* use your newly created macro variable in your select statement */ create table drinks as select &drinkvars from aug.demog; quit;

  18. PROC SQL: Tips and Translations for Data Step UserDictionaries – Getting variable names procsql; /* add the table alias to the front of each variable name as you create your macro variable */ select'd.'||name into :aliasvars separatedby', ' from dictionary.columns where libname='AUG'and memname='DEMOG'and lowcase(name) contains ('ndrk'); /* do your merge or whatever using the macro variable you just created */ createtable newtable as select &aliasvars, c.expcategory from aug.demog as d leftjoin aug.exposure as c on d.jcml_id=c.jcml_id; quit;

  19. PROC SQL: Tips and Translations for Data Step UserViews • Views are ‘virtual tables’ • Created with CREATE VIEW statement • Can be used as if they are normal physical tables • Enhance security • can construct a view of only fields and rows that user is allowed to view • Enhance ease-of-use • Can combine rows and columns from multiple tables into a single view • Facilitate data integrity • Can have several views on the same table, but only have to update the base table • Users always see up-to-date data procsql; createview aug.testview as select d.subject_id, d.case_id, d.age, e.job_num, e.exposure_element from aug.demog as d, aug.exposure as e where d.subject_id=e.subject_id; quit;

  20. PROC SQL: Tips and Translations for Data Step Users Creating Data Source Indicators dads faminc proc sort data=dads out=sorted_dads; by famid; run; proc sort data=faminc out=sorted_faminc; by famid; run; data ds_fj; merge sorted_dads(in=in1) sorted_faminc(in=in2); by famid; if in1 and in2 then indic=1; else indic=0; dadind=in1; famind=in2; fid=famid; run; proc sql; create table sql_fj as select *, (dads.famid=faminc.famid) as indic, (dads.famid ~=.) as dadind, (faminc.famid ~=.) as famind, coalesce(dads.famid, faminc.famid) as fid from dads full join faminc on dads.famid=faminc.famid; quit;

  21. PROC SQL: Tips and Translations for Data Step UsersFull Join - cont Sql_fj Ds_fj

  22. PROC SQL: Tips and Translations for Data Step UsersAdditional Uses procsql; title'Bad Control Matches'; select c.subject_id, c.casenum, c.gender, age as CntlAge label='CntlAge', (select age from cases where subject_id=c.casenum) as CaseAge, abs(cntlage-calculated caseage) as AgeDiff from controls as c leftjoin demog as d on c.subject_id=d.subject_id where (not within5 and not within10); quit; Scenario: For a case/control study, verify that all controls have age within 5 to 10 years of the related case age.

  23. PROC SQL: Tips and Translations for Data Step UserMerging Multiple Tables proc sql; create table sql_c3 as select a.name, b.class, case when missing(c.grade1) then "missing 1“ when missing(c.grade2) then "missing 2“ when missing(c.grade3) then "missing 3“ when missing(c.grade4) then "missing 4" else "none missing“ end as miss_grade from indat_a as a, indat_b as b, indat_c as c where c.classid=b.classid and c.perid=a.perid; quit;

  24. SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

More Related