
CCPR Computing Services Workshop: Introduction to Stata June, 2006


Presentation Transcript


  1. CCPR Computing Services Workshop: Introduction to Stata, June 2006

  2. Outline • Stata • Command Syntax • Basic Commands • Abbreviations • Missing Values • Combining Data • Using do-files • Basic programming • Special Topics • Getting Help • Updating Stata

  3. Stata Syntax • Basic command syntax: [by varlist:] command [varlist] [= exp] [if exp] [in range] [weighttype=weight] [, options] • Brackets = optional portions • Italics = user specified

  4. Stata Syntax, cont.
     • Complete syntax: [by varlist:] command [varlist] [= exp] [if exp] [in range] [weighttype=weight] [, options]
     • Example 1 (webuse union)
     • Stata command:
         . bysort black: summarize age if year >= 80, detail
     • Result: summarizes age separately for each value of black, using only observations for which year >= 80, and reports extra detail.

  5. Stata Syntax, cont.
     • Complete syntax: [by varlist:] command [varlist] [= exp] [if exp] [in range] [weighttype=weight] [, options]
     • Example 2 (webuse union)
     • Stata commands:
         . generate agelt30 = age
         . replace agelt30 = 1 if age < 30
         . replace agelt30 = 0 if age >= 30 & age < .
     • Result: variable agelt30 is set equal to 1, 0, or missing
     • Generally, [= exp] is used with the commands generate and replace

  6. Basic Commands – Load “auto” data and look at some vars
     • Load data from Stata’s website
         webuse auto.dta
     • Look at dataset
         describe
     • Summarize some variables
         codebook make headroom, header
         inspect weight length

  7. Basic Commands – Load “auto” data and look at some vars
     • Look at first and last observation
         list make price mpg rep78 if _n==1
         list make price mpg rep78 if _n==_N
     • Summarize a variable in a table
         table foreign
         table foreign, c(mean mpg sd mpg)

  8. Keep/Save a Subset of the Data
     • “Keep” a subset of the variables in memory
         keep make headroom trunk weight length
     • List variables in current dataset
         ds
     • List string variables in current dataset
         ds, has(type string)
     • Save current dataset
         save tempdata/myauto

  9. Generating New Variables
     • Create new variable = headroom squared
         generate headroom2 = headroom^2
     • Generate numeric from string variable
         encode make, generate(makeNum)
         list make makeNum in 1/5
     • Can’t tell from the listing that it’s numeric, but look at “storage type” in describe:
         describe make makeNum

  10. Generating New Variables, cont.
     • Create categorical variable from continuous variable
     • “price” is integer-valued with minimum 3291 and maximum 15906
     • Generate categorical version, Method 1:
         generate priceCat = 0
         replace priceCat = 1 if price < 5000
         replace priceCat = 2 if price >= 5000 & price < 10000
         replace priceCat = 3 if price >= 10000 & price < .

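  11. Generating New Variables, cont.
     • Generate categorical version of numerical variable, Method 2:
         generate priceCat2 = price
         recode priceCat2 (min/5000 = 1) (5000/10000 = 2) (10000/max = 3)
     • Note: the recode ranges overlap at 5000 and 10000, and recode applies the first rule that matches, so a price of exactly 5000 is coded 1 and a price of exactly 10000 is coded 2, whereas Method 1 codes them 2 and 3
     • Compare price, priceCat, and priceCat2
         table price priceCat
         table priceCat priceCat2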

  12. Variable Labels and Value Labels
     • Create a description for a variable:
         label variable priceCat "Categorical price"
     • Create labels to represent variable values:
         label define priceCatLabels 1 "cheap" 2 "mid-range" 3 "expensive"
         label values priceCat priceCatLabels
     • View results:
         describe
         list price priceCat in 1/10

  13. Reshape
     • Wide -> Long:
         reshape long uniqueschool author, i(year session order) j(count)
     • Long -> Wide:
         reshape wide author, i(year session order) j(count)
     • [Figures: example data in wide format and in long format]
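
The two directions of reshape can be tried on a tiny dataset. This is only a sketch: the variables id, inc2004, and inc2005 and their values are made up for illustration and are not part of the workshop data.

```stata
* Hypothetical wide data: one row per id, one income column per year
clear
input id inc2004 inc2005
1 100 110
2 200 220
end

* Wide -> long: the stub "inc" becomes one variable, the suffix goes into year
reshape long inc, i(id) j(year)
list

* Long -> wide: back to one row per id
reshape wide inc, i(id) j(year)
list
```

The i() variables identify a row in wide format, and j() names the variable that holds the suffix taken from the wide variable names.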

  14. A few other commands • compress - saves data more efficiently • sort/ gsort • order • rename • more

  15. Abbreviations in Stata
     • Abbreviating command, option, and variable names: the shortest uniquely identifying name is sufficient
     • Example: assume three variables are in use: make, price, mpg
     • Unabbreviated Stata command:
         . summarize make price
     • Abbreviated Stata command:
         . su ma p
     • Exceptions
       • describe (d), list (l), and some others
       • Commands that change/delete
       • Functions implemented by ado-files

  16. Missing Values in Stata 8 and 9 • Stata 8 and later versions • 27 representations of numerical “missing”: ., .a, .b, … , .z • Relational comparisons: biggest number < . < .a < .b < … < .z • Mathematical functions: missing + nonmissing = missing • String missing: the empty string, ""

  17. Missing Values in Stata - Pitfalls
     • Pitfall #1: missing values changed after Stata 7
     • Pitfall #2
       • Do NOT:
           . replace weightlt200 = 0 if weight >= 200
       • INSTEAD:
           . replace weightlt200 = 0 if weight >= 200 & weight < .
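
A minimal sketch of why the second pitfall matters, using four made-up weights (the values are hypothetical). Because missing sorts above every number, an unguarded ">=" comparison also selects the missing observation.

```stata
* Hypothetical data: three observed weights and one missing value
clear
input weight
150
200
250
.
end

* Naive comparison: the missing weight also satisfies ">= 200"
count if weight >= 200               // counts 3 observations

* Guarded comparison: exclude missing values explicitly
count if weight >= 200 & weight < .  // counts 2 observations

* Build the indicator in one step; it stays missing where weight is missing
generate weightlt200 = weight < 200 if weight < .
list
```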

  18. Combining Data • Append vs. Merge • Append – two datasets with same variables, different observations • Merge – two datasets with same or related observations, different variables • Appending data in Stata • Example: append.do
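
The workshop's append.do is not reproduced in this transcript. As a stand-in, here is a small sketch with made-up data (the variables id and score are hypothetical) showing two datasets with the same variables being stacked. It is meant to be run as a do-file so that the tempfile macro stays defined.

```stata
* First dataset: two observations, saved to a temporary file
clear
input id score
1 10
2 20
end
tempfile first
save `first'

* Second dataset: same variables, a different observation
clear
input id score
3 30
end

* Stack the saved observations under the ones in memory
append using `first'
list
```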

  19. Combining Data- merge and joinby • Demonstrate with two sample datasets: • Neighborhood and County samples • One-to-one merge • onetoone.do • One-to-many merge – use match merge • onetomany.do • Many-to-many merge – use joinby • manytomany.do

  20. Combining Data • Variable _merge (generated by merge and joinby) • Pitfalls • pitfall_merge1.do: Merging unsorted data • pitfall_merge2.do : many-to-many using merge instead of joinby
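
The example do-files named above are not included in this transcript. The sketch below shows a merge of several neighborhoods onto one county record and then inspects _merge. Two assumptions: the data are made up (the variables county, pop, and nbhd are hypothetical), and it uses the merge m:1 syntax of Stata 11 and later rather than the older syntax in use when this workshop was given, which expected both datasets to be sorted by the key. Run it as a do-file.

```stata
* County-level data: one row per county (hypothetical values)
clear
input county pop
1 500
2 800
end
tempfile counties
save `counties'

* Neighborhood-level data: several rows per county; county 3 has no county record
clear
input nbhd county
11 1
12 1
21 2
31 3
end

* Attach the single county record to each matching neighborhood
merge m:1 county using `counties'

* _merge == 3: matched; _merge == 1: only in the data in memory;
* _merge == 2: only in the using file
tabulate _merge
list
```

Tabulating _merge right after every merge is a cheap check that the match rate is what you expect.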

  21. Do-files • What is a do-file? • Stata commands can be executed interactively or via a do-file • A do-file is a text file containing commands that can be read by Stata • Running a do-file within Stata: . do dofilename.do

  22. Do-files • Why use a do-file? • Documentation • Communication • Reproduce interactive session? • Interactive vs. do-files • Record EVERYTHING to recreate results in your do-file!

  23. Do-files > Header, Version Control • Header • Include in do-files – name, project, project location, date, purpose, inputs, outputs, special instructions • Version Control • Include version at top of do-file • Why? • Example: under version 7, . == .a == .b == … == .z

  24. Do-files > Comments • Comments • Lines beginning with * will be ignored • Words between // and end of line will be ignored • Spanning commands over two lines: • Words between /* and */ will be ignored, including end of line character • Words between /// and beginning of next line will be ignored

  25. Do-file > End of Line Character • Commands requiring multiple lines • #delimit ; • This command tells Stata to read semicolons as the end-of-line character instead of the carriage return • Comment out the carriage return with /* at the end of the line and */ at the beginning of the next • Comment out the carriage return with ///

  26. Do-files > Examples
         webuse auto, clear
         *this is a comment
         #delimit ;
         summarize price mpg rep78
             headroom trunk weight;
         #delimit cr
         summarize price mpg rep78 headroom trunk weight //this is a comment
         summarize price mpg rep78 ///
             headroom trunk weight
         summarize price mpg rep78 /*
             */ headroom trunk weight

  27. Saving output • Work in do-files and log your sessions! • log using filename • Options: replace, append • log close • Output choices: • *.log file - ASCII file • *.smcl file - nicer format for viewing and printing in Stata
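
Putting the header, version, and logging advice together, a do-file might start out like the sketch below. The file name, project name, and header entries are hypothetical, and sysuse is used instead of webuse only so that the example runs without an internet connection.

```stata
* ---------------------------------------------------------------
* Name:     summarizeAuto.do   (hypothetical file name)
* Project:  Stata workshop examples (hypothetical)
* Purpose:  summarize price and mpg in the auto data
* Inputs:   auto.dta (shipped with Stata)
* Outputs:  summarizeAuto.log
* ---------------------------------------------------------------
version 9                 // interpret the commands below as Stata 9 would

capture log close         // close a log left open by an earlier run, if any
log using summarizeAuto.log, replace text   // plain-text log, overwritten on each run

sysuse auto, clear
summarize price mpg

log close
```

The capture prefix keeps the do-file from stopping with an error when no log is open.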

  28. Saving Output, cont. • Graphs are not saved in log files • Use “saving” option of graph commands • saving(graph.ext) • Export current graph: • graph export graph.ext • Ex: graph export graph.eps • Supported formats: • .ps, .eps, .wmf, .emf, .pict
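
A short sketch of both routes, using the auto data that ships with Stata. The file name mpgWeight is made up for illustration.

```stata
sysuse auto, clear

* Draw a scatterplot and save it in Stata's own graph format (mpgWeight.gph)
scatter mpg weight, saving(mpgWeight, replace)

* Export the graph currently displayed; the format follows the file extension
graph export mpgWeight.eps, replace
```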

  29. Example using local macro
         . local mypath "C:\Documents and Settings\MyStata"
         . display `mypath'
         C:\Documents invalid name
         r(198);
         . display C:\Documents and Settings\MyStata
         C:\Documents invalid name
         r(198);
         . display "`mypath'"
         C:\Documents and Settings\MyStata

  30. Example – foreach, return, display
         *see samplePrograms.do, runLoop
         foreach var of varlist tenure-ln_wage {
             quietly summarize `var'
             local varmean = r(mean)
             display "Variable `var' has mean `varmean'"
         }
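
The r(mean) used in the loop is one of several results that summarize leaves behind. A minimal sketch of how to see what is available, using the auto data shipped with Stata rather than the dataset from the slide:

```stata
sysuse auto, clear
quietly summarize mpg

* Show every result the last r-class command stored
return list

* Stored results can be used in expressions until another command replaces them
display "mean = " r(mean) ", N = " r(N)
```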

  31. Example using forvalues, display
         *see samplePrograms.do, runCount
         forvalues counter = 1/10 {
             display `counter'
         }
         forvalues counter = 0(2)10 {
             display `counter'
         }

  32. Example: forvalues, generating random variables
         *see samplePrograms.do, runRandomGen
         forvalues j = 1/3 {
             generate x`j' = uniform()
             generate y`j' = invnormal(uniform())
         }
         foreach x of varlist x1-x3 y1-y3 {
             summarize `x'
         }

  33. Example – if/else
         *see samplePrograms.do, runIfElse
         foreach var of varlist tenure-ln_wage {
             quietly summarize `var'
             local varmean = r(mean)
             if `varmean' > 10 {
                 display "`var' has mean greater than 10"
             }
             else {
                 display "`var' has mean of 10 or less"
             }
         }

  34. Special Topic: regular expressions
     • webuse auto
     • List all values of make starting with a capital and containing an additional capital:
         list make if regexm(make, "^[A-Z].+[A-Z].+")
     • AND ending in a number:
         list make if regexm(make, "^[A-Z].+[A-Z].+[0-9]+$")

  35. Special Topic: accessing data in another database
     • odbc list
     • odbc query testStata
     • odbc desc "Summary2006$"
     • odbc load year type session order author1 author2, table("Summary2006$") dsn("testStata")

  36. Special Topic: Exporting results using outreg • User-written program called outreg • From within Stata, type findit outreg • Very simple!! • Basically add one line of code after each regression to export results • For an example of code, see http://www.ats.ucla.edu/stat/stata/faq/outreg.htm

  37. Getting Help in Stata • help command_name • abbreviated version of manual • search • search keywords, local • search keywords, net • search keywords, all • findit keywords • same as search keywords, all • Search Stata Listserver and Stata FAQ

  38. Stata Resources • www.stata.com > Resources and Support • Search Stata Listserver • Search Stata (FAQ) • Stata Journal (SJ) • articles for subscribers • programs free • Stata Technical Bulletin (STB) • replaced with the Stata Journal • Articles available for purchase, programs free • Courses (for fee)

  39. Updating Stata • help update • update all

  40. Questions/feedback
