
Introduction to Exploratory Descriptive Data Analysis in S-Plus



Presentation Transcript


  1. Introduction to Exploratory Descriptive Data Analysis in S-Plus
     Jagdish S. Gangolly, State University of New York at Albany

  2. S-Plus in Unix & MS-Windows
     • To start S-Plus in Solaris/CDE:
        • Create a directory, say, s:  mkdir s
        • Go to that directory:  cd s
        • Initialise it as a new S-Plus chapter:  splus CHAPTER
        • Start S-Plus:  splus

  3. S-Plus in Unix & MS-Windows
     • To invoke a graphics window:  motif()
     • To invoke the help system (Java based):  help.start()
     • To quit the S-Plus shell:  q() or Ctrl-D
     • The S-Plus prompt is >

  4. Simple Structures I: Arithmetic Operators
     • The arithmetic operators are *, /, +, and -.
     • Multiplication and division are evaluated before addition and subtraction. Raising to a power (^ or **) is evaluated before all four.
     • Avoid ambiguity by using parentheses, e.g., (7+2)*3, since 7+2*3 = 13 and not 27.
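The precedence rules on this slide can be checked directly at the prompt. This is a minimal sketch: the variable names are illustrative only, and the code is written so that it also runs in R, on the assumption that S-Plus evaluates these expressions the same way.

```r
# Multiplication is evaluated before addition
no.parens <- 7 + 2 * 3          # 13
# Parentheses force the addition to happen first
with.parens <- (7 + 2) * 3      # 27
# Raising to a power is evaluated before multiplication
power.first <- 2 * 3^2          # 18
product.first <- (2 * 3)^2      # 36
print(c(no.parens, with.parens, power.first, product.first))
```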

  5. Simple Structures II: Assignments
     • Assignments:  x <- 3  or  3 -> x  or  x_3  or  x=3
     • Using the underscore or the equals sign for assignment is not a good idea.
     • To see the value of a variable x:  x  or  print(x)
     • To remove a variable x:  rm(x)

  6. Simple Structures III: Concatenation
     • Concatenation, c(), is used to create vectors of any length:
         > x <- c(1.5, 2, 2.5)
         > x
         1.5 2.0 2.5
         > x^2
         2.25 4.00 6.25
     • c can be used with any type of data.
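To illustrate the remark that c works with any type of data, here is a small sketch. The vector names are hypothetical, and the code is written so that it also runs in R.

```r
# Character and logical vectors are built the same way as numeric ones
words <- c("gdp", "pop", "inflation")
flags <- c(T, F, T)

# c() also joins existing vectors end to end
x <- c(1.5, 2, 2.5)
longer <- c(x, 3, 3.5)          # 1.5 2.0 2.5 3.0 3.5

# Mixing numbers and strings coerces every element to character
mixed <- c(0, "ab")             # "0" "ab"
print(length(longer))           # 5
```

The coercion in the last assignment is the same one that makes rep(c(0,"ab"),2) return character strings.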

  7. Simple Structures IV: Sequence
     • Sequence command:  seq(lower, upper, increment)
     • Some examples:
         seq(1,35,5):    1 6 11 16 21 26 31
         seq(5,15,1.5):  5.0 6.5 8.0 9.5 11.0 12.5 14.0
         seq(50,25,-5):  50 45 40 35 30 25

  8. Simple Structures V: Replicate
     • Replicate command: generates data that follow a regular pattern.
     • Some examples:
         rep(8,5):                8 8 8 8 8
         rep("8",5):              "8" "8" "8" "8" "8"
         rep(c(0,"ab"),2):        "0" "ab" "0" "ab"
         rep(1:4,1:4):            1 2 2 3 3 3 4 4 4 4
         rep(1:3,rep(2,3)):       1 1 2 2 3 3
         rep(c(1,8,7),length=5):  1 8 7 1 8

  9. Simple Structures VI: Expressions
         > x <- seq(2,10,2)
         > y <- 1:5
         > z <- ((3*x^2+2*y)/((x+y)*(x-y)))^(0.5)
         > x
         2 4 6 8 10
         > y
         1 2 3 4 5
         > z
         2.160247 2.081666 2.054805 2.041241 2.033060

  10. Simple Structures VI: Logical Operators
     • <   Less than
     • >   Greater than
     • <=  Less than or equal to
     • >=  Greater than or equal to
     • ==  Equal to
     • !=  Not equal to
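Applied to vectors, these operators compare element by element and return a logical vector. A minimal sketch reusing x and y from the Expressions slide; the result names are illustrative, and the code is written so that it also runs in R.

```r
x <- seq(2, 10, 2)        # 2 4 6 8 10
y <- 1:5                  # 1 2 3 4 5

big <- x > 4              # F F T T T
is.six <- x == 6          # F F T F F
not.six <- x != 6         # T T F T T
# Two vectors of the same length are compared position by position
above.y <- x >= 2 * y     # all T, because x equals 2*y everywhere

# Summing a logical vector counts its true elements
n.big <- sum(x > 4)       # 3
print(big)
```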

  11. Simple Structures VII: Index Brackets
     • Square brackets are used to index vectors and matrices:
         > x <- seq(0,20,10)
         > x[2]
         10
         > x[5]
         NA
         > x[c(1,3)]
         0 20
         > x[-1]
         10 20
     • An index past the end of the vector returns NA; a negative index drops that element.
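Index brackets also accept a logical vector, which combines naturally with the comparison operators from the previous slide, and they can appear on the left of an assignment. A short sketch with illustrative names, written so that it also runs in R:

```r
x <- seq(0, 20, 10)         # 0 10 20

# A logical index keeps the elements where the condition is true
positive <- x[x > 0]        # 10 20

# Several negative indices drop several elements at once
middle <- x[-c(1, 3)]       # 10

# Indexing on the left-hand side replaces an element in place
x[2] <- 15                  # x is now 0 15 20
last <- x[length(x)]        # 20
print(x)
```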

  12. Data Manipulation I: Frames & matrices I
     • Matrices: two-dimensional vectors (they have row and column indices)
     • Arrays: the general data structure in S-Plus
        • Zero-dimensional: scalar
        • One-dimensional: vector
        • Two-dimensional: matrix
        • Three- to eight-dimensional: arrays
     • The data in a matrix must all be of the same datatype (usually a numeric datatype).
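The progression from vector to matrix to array can be sketched by reshaping the same 24 values. The object names are hypothetical, and the code is written so that it also runs in R.

```r
v <- 1:24                              # one-dimensional: a vector
m <- matrix(v, nrow = 4)               # two-dimensional: 4 rows, 6 columns
a <- array(v, dim = c(2, 3, 4))        # three-dimensional: 2 x 3 x 4

print(dim(m))                          # 4 6
print(dim(a))                          # 2 3 4

# Values fill the first index fastest, so the last cell holds the last value
first.of.second.slab <- a[1, 1, 2]     # 7
last.cell <- a[2, 3, 4]                # 24
```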

  13. Data Manipulation I: Frames & matrices II
     • Data frames: unlike a matrix, the columns of a data frame can be of different datatypes.
     • Lists: the most general datatype in S-Plus
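A minimal sketch of both structures, borrowing the country names and figures from the matrix example later in the deck. The object and component names (country.frame, bundle, label, codes, table) are hypothetical, and the code is written so that it also runs in R.

```r
# A data frame may hold a text column next to numeric columns
country.frame <- data.frame(
    name = c("austria", "france", "germany"),
    gdp = c(227, 1534, 2365),
    pop = c(8, 58, 82))

# Columns are extracted by name with $
gdp.is.numeric <- is.numeric(country.frame$gdp)    # TRUE
# The name column is stored as a factor or as character strings,
# depending on the S-Plus or R version in use

# A list can hold components of different types and lengths,
# including a whole data frame
bundle <- list(label = "country data", codes = 1:3, table = country.frame)
second.code <- bundle$codes[2]                     # 2
first.component <- bundle[[1]]                     # "country data"
```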

  14. Data Manipulation I: Matrices I
     • Reading data
        • S-Plus is very finicky about the format of input data.
     • To read a table:  read.table("filename")
        • The first column must be row names.
        • The first row must be column names.
        • The top left cell must be empty.
        • Space and tab are the default column delimiters.
     • See the example in /db4/teach/acc522/fasb103.txt and play around with it.
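The layout rules above are easier to see with a small file. The sketch below writes a three-row table to a temporary file and reads it back; the file contents reuse the country figures from later in the deck, and tempfile() and unlink() are used only to keep the example self-contained. It is written against R, where read.table treats the first line as column names when it has one field fewer than the other lines; S-Plus is assumed to behave the same way, as the slide describes.

```r
# The header line has no entry above the row names (the "empty top left cell")
tmp <- tempfile()
cat("gdp pop inflation",
    "austria 227 8 1.3",
    "france 1534 58 1.2",
    "germany 2365 82 1.8",
    "",
    sep = "\n", file = tmp)

country <- read.table(tmp)
unlink(tmp)                          # remove the temporary file

print(dim(country))                  # 3 3
print(names(country))                # "gdp" "pop" "inflation"
france.pop <- country["france", "pop"]    # 58
```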

  15. Data Manipulation I: Matrices II
     • read.table and as.matrix():
         x <- read.table("filename")
         as.matrix(x)
     • Enter data directly:  matrix(data, nrow, ncol, byrow=F)
         Example: x <- matrix(1:6, nrow=2, byrow=T)
     • dim(x):  2 3  (two rows, three columns)
     • dimnames(x):  NULL  (no row or column names have been set)
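The effect of the byrow argument is easiest to see by building the same matrix both ways. A minimal sketch with illustrative object names, written so that it also runs in R:

```r
# byrow=T fills the matrix one row at a time
by.row <- matrix(1:6, nrow = 2, byrow = T)
#      1 2 3
#      4 5 6

# byrow=F (the default) fills it one column at a time
by.col <- matrix(1:6, nrow = 2, byrow = F)
#      1 3 5
#      2 4 6

first.row.a <- by.row[1, ]      # 1 2 3
first.row.b <- by.col[1, ]      # 1 3 5
print(dim(by.row))              # 2 3
```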

  16. Data Manipulation I: Matrices III
     • Elements of matrices are accessed by specifying the row and column indices.
     • Example:
         data <- c(227,8,1.3,1534,58,1.2,2365,82,1.8)
         countries <- c("austria", "france", "germany")
         variables <- c("gdp", "pop", "inflation")
         country.data <- matrix(data,nrow=3,byrow=T)
         dimnames(country.data) <- list(countries,variables)
     • country.data[1:2,2:3] gives the pop and inflation of austria and france.
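Once dimnames are set, rows and columns can be selected by name as well as by position. The sketch below rebuilds the slide's matrix (the data vector is called values here, and the result names are illustrative) and is written so that it also runs in R.

```r
values <- c(227, 8, 1.3, 1534, 58, 1.2, 2365, 82, 1.8)
countries <- c("austria", "france", "germany")
variables <- c("gdp", "pop", "inflation")
country.data <- matrix(values, nrow = 3, byrow = T)
dimnames(country.data) <- list(countries, variables)

# Positional indexing: rows 1-2, columns 2-3
part <- country.data[1:2, 2:3]

# The same kind of selection by name
germany.gdp <- country.data["germany", "gdp"]    # 2365
all.pop <- country.data[, "pop"]                 # 8 58 82
france.row <- country.data["france", ]           # 1534 58 1.2
print(part)
```

Leaving an index empty, as in country.data[, "pop"], selects every row (or column) along that dimension.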

  17. S-Plus Graphics I
     • To open a graphics window:  motif()
     • You can adjust the color scheme and print options through the drop-down menu on the motif window.
     • To plot two variables x and y:  plot(x,y)
         Example (sine curve): plot(1:100, sin(1:100/10))
