Introduction to R Tara Jensen National Center for Atmospheric Research Boulder, Colorado USA jensen@ucar.edu
R Exercises • Find sample data and R scripts at: • ftp://ftp.ncmrwf.gov.in/pub/outgoing/raghu/6WVMW/Tutorial/Day1/R-tutorial • Download to directory on your computer • Start R • Open intro2R.2014wmo.R
What is R? • A statistical programming and graphics language • In part, developed from the S Programming Language from Bell Labs (John Chambers) • Created to: • Allow rapid development of methods for use in different types of data. • Require small amounts of system resources
Why R? • R is the dominant language in the statistical research community. • R is Open Source and free. • Runs on most operating systems. • Nearly 2,400 contributed packages. • Packages and applications in nearly every field of science, business and economics. • See R News, R Journal and Journal of Statistical Software (www.jstatsoft.org). • More than 100 books with accompanying code. • Very large, active user base. • Many default parameters are chosen, but users retain complete control.
Why not R? • NCL, IDL, Matlab, SAS, … are all viable alternatives to R. If you are part of an active community of researchers using another language, do likewise. • R may be limited by memory. For verification of large gridded datasets, consider using Model Evaluation Tools (MET). • R does not produce a compiled executable, so it may not be desirable to some operational centers.
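As a rough illustration of the memory point above, base R can estimate how much memory a gridded field will need before it is read in. This is only a sketch: the grid dimensions are made-up values, not taken from the tutorial data.

```r
# Hypothetical grid size (an assumption, not from the tutorial data)
nx <- 360   # number of longitudes
ny <- 181   # number of latitudes
nt <- 24    # number of forecast times

# R stores numeric values as 8-byte doubles
est_mb <- nx * ny * nt * 8 / 1024^2

# Compare with the reported size of an actual array of the same shape
grid <- array(0, dim = c(nx, ny, nt))
actual_mb <- as.numeric(object.size(grid)) / 1024^2

cat(sprintf("Estimated: %.2f MB, actual: %.2f MB\n", est_mb, actual_mb))
```

Scaling the same arithmetic up to the real grid size and number of fields gives a quick check on whether a dataset is likely to fit in memory.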
The R Community • Developers • R Core Group (20 members), only 2 have left since 1997 • Major update in April/October (freeze dates, beta versions, bug tracking, ...) • Mailing lists • Help list ~ 150 messages/day, archived, searchable. • http://www.r-project.org/mail.html • 5 International Conferences, 2 US, 1 China
Everything about R is at www.r-project.org • Source code • Binary compilations (Windows, Mac OS, Linux) • Documentation (main documents, plus numerous contributed; some in foreign languages) • Newsletter (replaced by R Journal) • Mailing list (several search engines) • Packages on every topic imaginable • Wiki with examples • Reference list of books using R (more than 100) • Task Manager
Use R with scripts • In Linux - Emacs Speaks Statistics • Provides syntax-based • Object name completion • Keystroke shortcuts • Command history • Alt-x R to invoke R with Xemacs. • In Windows, use the built-in editor • Added GUI features • Ctrl-R sends a line or highlighted section into R. • Install packages with GUIs • Save graphics by point and click. • Mac OS • Similar to Windows with advantages of system calls.
R Coding principles • Make verification code transparent and easy to read • Comment and document liberally • Archive your code • Share your code • Label and save your data • Share your data
Packages in R • Contributed by people worldwide. • Allow scientists or statisticians to push their ideas. • Apply and extend R capabilities to meet the needs of specific communities. • Accompany many statistical textbooks • Accompany applied articles (Adrian Raftery, Doug Nychka, Tilmann Gneiting, Barbara Casati, Matt Briggs)
R Packages (Windows or Mac: menu commands; Linux: typed commands) • Mirror must be selected • Packages -> Set CRAN mirror • chooseCRANmirror() • Packages must be installed before they can be called • Packages -> Install Package(s) • install.packages(c("package1", "package2", "package3")) • Packages must be loaded (aka called into use) • Packages -> Load Package(s) • library("package1") • library("package2") etc. • Base packages are installed by default • To see what packages are installed • Packages -> Load Package(s) • installed.packages(.Library) • To remove packages • remove.packages(c("package1", "package2"), lib = file.path("path to library"))
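The install-then-load steps above can be combined in a short script. This is a minimal sketch, not part of the tutorial scripts: `load_or_install` is a hypothetical helper name, and the mirror URL is an assumed choice of CRAN mirror.

```r
# Hypothetical helper (not part of R): load a package, installing it first
# only if it is not already available.
load_or_install <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)   # needs a network connection
  }
  library(pkg, character.only = TRUE)      # attach the package
  invisible(pkg %in% loadedNamespaces())
}

# "stats" ships with R, so nothing is downloaded here
ok <- load_or_install("stats")

# Names of the base packages that came with this R installation
base_pkgs <- rownames(installed.packages(priority = "base"))
print(base_pkgs)
```

Passing `character.only = TRUE` lets `library()` accept the package name as a string held in a variable, which is what makes the helper reusable.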
A sample of useful packages • verification • fields (spatial stats) • RadioSonde • extRemes • BMA (Bayesian Model Averaging) • ensembleBMA • circular • RSQLite • SpatialVx • Rgis, spatstat (GIS) • ncdf (support for netCDF files) • rgdal (support for grib1 files) • rNOMADS (support for grib2 files archived by NCEP) • RColorBrewer • randomForest
Very useful functions in R • q( ) – allows you to exit R – you will then be asked if you would like to save your workspace • ls( ) – shows you the objects in your workspace • rm( ) – allows you to remove an object • system( ) – allows you to call a system command from R • help(package or function) – brings up a help page • ?(package or function) – brings up a help page • read.fwf – read fixed-width format data • read.table – read text file with delimiters
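A few of the functions listed above can be tried without any tutorial data. The sketch below writes two small temporary files (made-up contents, for illustration only), reads them back, and then lists and removes workspace objects.

```r
# Write a small whitespace-delimited file and a fixed-width file
# to temporary locations (contents are made up for illustration)
txt_file <- tempfile(fileext = ".txt")
writeLines(c("station temp", "A 21.5", "B 19.0"), txt_file)

fwf_file <- tempfile(fileext = ".txt")
writeLines(c("A021.5", "B019.0"), fwf_file)

# read.table: delimited text with a header row
obs <- read.table(txt_file, header = TRUE)

# read.fwf: column 1 is 1 character wide, column 2 is 5 characters wide
obs_fwf <- read.fwf(fwf_file, widths = c(1, 5),
                    col.names = c("station", "temp"))

# ls and rm: list the workspace, then remove a scratch object
scratch <- 1:10
print(ls())
rm(scratch)

# help(read.table) or ?read.table would bring up the help page
```

`q()` is left out on purpose, since it would end the session.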
More useful functions • aggregate - applies a function to groups of data subset by categories. • apply - incredibly efficient in avoiding loops. Applies functions across dimensions of arrays. • %in% - returns a logical vector showing which elements of A are in B (e.g., A %in% B). • table – creates contingency table counts. • boot – applies a bootstrap function correctly • par – controls everything in a graph • pairs – the most underutilized plot – plots a matrix of 4 columns in a 4x4 plot layout • xyplot (in the lattice package) – slightly more advanced graphics techniques
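As a quick illustration of several functions from the list above, the following sketch uses a small made-up data frame of forecast errors; the station names, seasons, and values are assumptions chosen only to keep the output easy to check.

```r
# Made-up forecast errors by station and season (illustrative values only)
d <- data.frame(
  station = c("A", "A", "B", "B", "C", "C"),
  season  = c("DJF", "JJA", "DJF", "JJA", "DJF", "JJA"),
  error   = c(1.0, 3.0, 2.0, 4.0, 0.5, 1.5)
)

# aggregate: mean error for each station
station_mean <- aggregate(error ~ station, data = d, FUN = mean)

# apply: row and column means of a matrix without writing a loop
m <- matrix(1:6, nrow = 2)        # 2 rows, 3 columns
row_means <- apply(m, 1, mean)    # one value per row
col_means <- apply(m, 2, mean)    # one value per column

# %in%: which rows belong to a set of stations of interest
in_set <- d$station %in% c("A", "C")

# table: counts of each station/season combination
counts <- table(d$station, d$season)

print(station_mean)
print(counts)
```

For simple row or column means, `rowMeans()` and `colMeans()` give the same result as `apply()`; `apply()` is the more general tool when the function is something else.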
R Exercises (Windows or Mac / Linux) • Find sample data and R scripts at: • ftp://ftp.ncmrwf.gov.in/pub/outgoing/raghu/6WVMW/Tutorial/Day1/R-tutorial • Download to a directory on your computer • Start R • Windows or Mac: click on the R icon on your desktop • Linux: type R at the command line • Open intro2R.2014wmo.R • Windows or Mac: select File -> Open Script -> select intro2R.2014wmo.R • Linux: open in another window using your favorite editor