Dr Yasser Fouad STRUCTURE OF PROGRAMMING LANGUAGES
Quote of the Day “A language that doesn't affect the way you think about programming, is not worth knowing.” - Alan Perlis
You work in a little web search company. Your boss says: “We will conquer the world only if our search box answers all questions the user may ask.” So you build gcalc. CS212 helps you answer questions your boss really cares about.
Then you decide to get a PhD. You get tired of PowerPoint and its animations. You embed a domain-specific language (DSL) into Ruby. …
Reasons for Studying Concepts of Programming Languages • Increased ability to express ideas • Improved background for choosing appropriate languages • Increased ability to learn new languages • Better understanding of significance of implementation • Overall advancement of computing
How is this class different? It’s about: • foundations of programming languages • but also how to design your own languages • how to implement them • and about PL tools, such as analyzers • And you will also learn about some classical C.S. algorithms.
Why a developer needs PL New languages will keep coming • Understand them, choose the right one. Write code that writes code • Be the wizard, not the typist. Develop your own language. • Are you kidding? No. Learn about compilers and interpreters. • Programmer’s main tools.
Overview • how many languages does one need? • how many languages did you use? Let’s list them here:
Develop your own language Are you kidding? No. Guess who developed: • PHP • Ruby • JavaScript • Perl Done by smart hackers like you • in a garage • not in an academic ivory tower Our goal: learn good academic lessons • so that your future languages avoid known mistakes
Programming Domains • Scientific applications • Large number of floating point computations • Fortran • Business applications • Produce reports, use decimal numbers and characters • COBOL • Artificial intelligence • Symbols rather than numbers manipulated • LISP • Systems programming • Need efficiency because of continuous use • C • Web Software • Eclectic collection of languages: markup (e.g., XHTML), scripting (e.g., PHP), general-purpose (e.g., Java)
Programming Methodologies Influences • 1950s and early 1960s: Simple applications; worry about machine efficiency • Late 1960s: People efficiency became important; readability, better control structures • structured programming • top-down design and step-wise refinement • Late 1970s: Process-oriented to data-oriented • data abstraction • Middle 1980s: Object-oriented programming • Data abstraction + inheritance + polymorphism
Language Categories • Imperative • Central features are variables, assignment statements, and iteration • Examples: C, Pascal • Functional • Main means of making computations is by applying functions to given parameters • Examples: LISP, Scheme • Logic • Rule-based (rules are specified in no particular order) • Example: Prolog • Object-oriented • Data abstraction, inheritance, late binding • Examples: Java, C++ • Markup • New; not a programming language per se, but used to specify the layout of information in Web documents • Examples: XHTML, XML
Program • A program is a machine-compatible representation of an algorithm • If no algorithm exists for performing a task, then the task cannot be performed by a machine • Programs and the algorithms they represent are collectively referred to as software
Favorite programming language June 2012 • Python (3,054) • Ruby (1,723) • JavaScript (1,415) • C (970) • C# (829) • PHP (666) • Java (551) • C++ (529) • Haskell (519) • Clojure (459) • CoffeeScript (362) • Objective C (326) • Lisp (322) • Perl (311) • Scala (233) • Scheme (190) • Other (188) • Erlang (162) • Lua (145) • SQL (101)
job listings collected from Dice.com • Python 3,456 (+32.87%) • Ruby 2,141 (+39.03%) • HTML5 (+276.85%) • Flash 1,261 (+95.2%) • Silverlight 865 (-11.91%) • COBOL 656 (-10.75%) • Assembler 209 (-1.42%) • PowerBuilder (-18.71%) • FORTRAN 45 (-33.82%) • Java 17,599 (+8.96%) • XML 10,780 (+11.70%) • JavaScript (+11.64%) • HTML 9,587 (-1.53%) • C# 9,293 (+17.04%) • C++ 6,439 (+7.55%) • AJAX 5,142 (+15.81%) • Perl 5,107 (+3.21%) • PHP 3,717 (+23%)
ENIAC (1946, University of Pennsylvania) ENIAC program for external ballistic equations:
ENIAC (1946, University of Pennsylvania) • programming done by • rewiring the interconnections • to set up desired formulas, etc • Problem (what’s the tedious part?) • programming = rewiring • slow, error-prone • solution: • store the program in memory! • birth of the von Neumann paradigm
Assembly – the language (UNIVAC 1, 1950) Idea: mnemonic (assembly) code • Then translate it to machine code by hand (no compiler yet) • write programs with mnemonic codes (add, sub), with symbolic labels, • then assign addresses by hand
Example of symbolic assembler:
    clear-and-add a
    add b
    store c
translate it by hand to something like this (understood by CPU):
    B100 A200 C300
Assembly Language ADDI R4 R2 21 ADDI R4,R2,21 10101100100000100000000000010101 • Use symbols instead of binary digits to describe fields of instructions. • Every aspect of machine visible in program: • One statement per machine instruction. • Register allocation, call stack, etc. must be managed explicitly. • No structure: everything looks the same.
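The correspondence between the symbolic fields and the bit string on this slide can be sketched in a few lines of Python. The field layout (a 6-bit opcode, two 5-bit register fields, a 16-bit immediate) and the opcode value are read off the slide's bit pattern; they are assumptions for illustration, not taken from any instruction set manual, and `encode` is a hypothetical helper.

```python
# Pack the fields of the slide's ADDI example into one 32-bit word.
# Assumed layout (read off the slide's bit string, not a real ISA spec):
#   6-bit opcode | 5-bit register | 5-bit register | 16-bit immediate

def encode(opcode, reg_a, reg_b, immediate):
    """Return the 32-bit instruction word for the given field values."""
    return (opcode << 26) | (reg_a << 21) | (reg_b << 16) | (immediate & 0xFFFF)

word = encode(0b101011, 4, 2, 21)   # opcode bits copied from the slide
bits = format(word, "032b")
print(bits)  # 10101100100000100000000000010101
```

An assembler does this packing for every instruction, which is why each statement maps to exactly one machine instruction.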
Assembler – the compiler (Manchester, 1952) • a loop example, in MIPS, a modern-day assembly code:
loop: addi $t3, $t0, -8
      addi $t4, $t0, -4
      lw   $t1, theArray($t3)   # Gets the last
      lw   $t2, theArray($t4)   # two elements
      add  $t5, $t1, $t2        # Adds them together...
      sw   $t5, theArray($t0)   # ...and stores the result
      addi $t0, $t0, 4          # Moves to next "element" of theArray
      blt  $t0, 160, loop       # If not past the end of theArray, repeat
      jr   $ra
High-level Language • Provides notation to describe problem solving strategies rather than organize data and instructions at machine-level. • Improves programmer productivity by supporting features to abstract/reuse code, and to improve reliability/robustness of programs. • Requires a compiler.
FORTRAN I (1954-57) Language, and the first compiler • Produced code almost as good as hand-written • Huge impact on computer science • Modern compilers preserve its outlines By 1958, >50% of all software was in FORTRAN
FORTRAN I Example: nested loops in FORTRAN • a big improvement over assembler, • but annoying artifacts of assembly remain: • labels and rather explicit jumps (CONTINUE) • lexical columns: the statement must start in column 7 • The MIPS loop from previous slide, in FORTRAN:
      DO 10 I = 2, 40
        A(I) = A(I-1) + A(I-2)
   10 CONTINUE
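For contrast, here is a minimal sketch of the same recurrence in a present-day language (Python), where the label, the CONTINUE and the column rule all disappear. The array length and the two seed values are assumptions chosen for the demo; the slide does not give them.

```python
# The same recurrence as the DO loop: each element is the sum of the two before it.
# Array length (41) and the two seed values are assumptions for illustration.
a = [0] * 41
a[0], a[1] = 1, 1

for i in range(2, 41):      # I = 2, 40 in FORTRAN (upper bound inclusive)
    a[i] = a[i - 1] + a[i - 2]

print(a[:8])  # [1, 1, 2, 3, 5, 8, 13, 21]
```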
Side note: designing a good language is hard. A good language protects against bugs, but the lessons take a while to learn. An example often reported to have caused the failure of a NASA planetary probe:
buggy line: DO 15 I = 1.100
what was intended (a dot had replaced the comma): DO 15 I = 1,100
Because Fortran ignores spaces, the compiler read the buggy line as: DO15I = 1.100
which is an assignment to a variable DO15I, not a loop. This mistake is harder to make (if possible at all) with modern lexical rules (white space is not ignored) and loop syntax: for (i = 1; i <= 100; i++) { … }
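The effect of ignoring blanks can be reproduced with a toy statement classifier. This is only a sketch: the two regular expressions and the `classify` helper are simplifications invented here, not how any real Fortran compiler is written.

```python
import re

# Toy classifier for one FORTRAN statement. Like early FORTRAN, it deletes
# all blanks before deciding what kind of statement it is looking at.
# Both patterns are simplifications made up for this sketch.
DO_LOOP = re.compile(r"DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),(.+)")
ASSIGNMENT = re.compile(r"([A-Z][A-Z0-9]*)=(.+)")

def classify(statement):
    text = statement.replace(" ", "")
    match = DO_LOOP.fullmatch(text)
    if match:
        label, var, start, stop = match.groups()
        return f"loop to label {label}: {var} from {start} to {stop}"
    match = ASSIGNMENT.fullmatch(text)
    if match:
        var, value = match.groups()
        return f"assignment: {var} = {value}"
    return "unrecognized"

print(classify("DO 15 I = 1,100"))  # loop to label 15: I from 1 to 100
print(classify("DO 15 I = 1.100"))  # assignment: DO15I = 1.100
```

With the comma missing, the only reading left is an assignment, and nothing in the statement signals that a loop was intended.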
Goto considered harmful
L1: statement
    if expression goto L1
    statement
Dijkstra says: gotos are harmful • use structured programming • lose some performance, gain a lot of readability • how do you rewrite the above code into structured form?
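One possible answer, sketched in Python: the backward jump is a loop whose body runs at least once (a do-while in C-like languages). Python has neither goto nor do-while, so the sketch uses `while True` with a `break` at the bottom. The loop body, the condition `count < limit` and the `run` function are placeholders invented for the demo.

```python
# Structured rewrite of:  L1: statement; if expression goto L1; statement
# "Loop forever, break at the bottom" means: run the body at least once,
# then repeat while the condition holds. Body and condition are placeholders.

def run(limit):
    trace = []
    count = 0
    while True:
        trace.append(f"body {count}")   # first "statement" (the loop body)
        count += 1
        if not (count < limit):         # "expression" is count < limit
            break
    trace.append("after loop")          # second "statement"
    return trace

print(run(3))  # ['body 0', 'body 1', 'body 2', 'after loop']
```

Note that `run(0)` still executes the body once, exactly as the goto version does.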
Evolution of Programming Languages • ALGOL - 60 (ALGOrithmic Language) Goals : Communicating Algorithms Features : Block Structure (Top-down design) Recursion (Problem-solving strategy) BNF - Specification • LISP (LISt Processing) • Goals : Manipulating symbolic information • Features : List Primitives • Interpreters / Environment
Evolution of Programming Languages • Pascal • Goal: structured programming, type checking, compiler writing • Features: • Rich set of data types for efficient algorithm design (e.g., records, sets, ...) • Variety of “readable” single-entry single-exit control structures (e.g., for-loop, while-loop, ...) • Efficient implementation • Recursive descent parsing
Other Languages • Functional • LISP, Scheme • ML, Haskell • Logic • Prolog • Object-oriented • Smalltalk, SIMULA, Modula-3, Oberon • C++, Java, C#, Eiffel, Ada-95 • Hybrid • Python, Ruby, Scala • Application specific languages and tools
Programming Languages 3/4 • C • Bell labs Dennis Ritchie, 1973 • C++ • Bjarne Stroustrup, 1980 • Hybrid OOP • Java • Sun Microsystems (formally announced in May 1995) • Pure OOP • Web programming
Scripting vs Systems Programming Languages • Scripting languages: designed for gluing applications (flexibility) • Interpreted • Dynamic typing and variable creation • Data and code integrated: meta-programming supported • Examples: PERL, Tcl, Python, Ruby, PHP, Scheme, Visual Basic, Scala, etc. • Systems programming languages: designed for building applications (efficiency) • Compiled • Static typing and variable declaration • Data and code separated: cannot create/run code on the fly • Examples: PL/1, Ada, Java, C, C++, C#, Scala, etc.
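A small sketch of what "data and code integrated" can look like in a scripting language: the program builds the text of a function at run time and then executes it. `make_power` is a hypothetical helper invented for this illustration.

```python
# Build the source of a function at run time, then execute it.
# make_power is a hypothetical helper invented for this sketch.

def make_power(exponent):
    source = f"def power(x):\n    return x ** {exponent}\n"
    namespace = {}
    exec(source, namespace)      # compile and run the generated definition
    return namespace["power"]

cube = make_power(3)
print(cube(4))  # 64
```

In a statically compiled setting the same effect usually needs a separate code-generation step before the build.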
Current Trend • Multiparadigm languages • Functional constructs for programming in the small • Focus on conciseness and correctness • Object-Oriented constructs for programming in the large • Focus on programmer productivity and code evolution • Example languages • Older: Python, Ruby, • Recent: Scala, F#, etc
Scheme (dialect of LISP) • Recursive definitions • Symbolic computation : List Processing • Higher-order functions • Dynamic type checking • Functional + Imperative features • Automatic storage management • Provides a uniform executable platform for studying, specifying, and comparing languages.
Java vs Scala
// Java - what we're used to seeing
public String buildEpochKey(String... keys) {
    StringBuilder s = new StringBuilder("elem");
    for (String key : keys) {
        if (key != null) {
            s.append(".");
            s.append(key);
        }
    }
    return s.toString().toLowerCase();
}
Java vs Scala
// Scala
def buildEpochKey(keys: String*): String =
  ("elem" +: keys).filter(_ != null).mkString(".").toLowerCase
Implementation Methods • Compilation • Programs are translated into machine language • Pure Interpretation • Programs are interpreted by another program known as an interpreter • Hybrid Implementation Systems • A compromise between compilers and pure interpreters
Compilation • Translate high-level program (source language) into machine code (machine language) • Slow translation, fast execution • Compilation process has several phases: • lexical analysis: converts characters in the source program into lexical units • syntax analysis: transforms lexical units into parse trees which represent the syntactic structure of the program • semantic analysis: generates intermediate code • code generation: machine code is generated
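The phases can be sketched end to end for plain arithmetic expressions. This is a toy under stated assumptions: the grammar, the tuple-shaped parse tree and the PUSH/ADD/MUL stack-machine instructions are all invented for the illustration, and semantic analysis and code generation are collapsed into one step.

```python
import re

# Phase 1, lexical analysis: characters -> tokens (lexical units).
def tokenize(source):
    return re.findall(r"\d+|[-+*/()]", source)

# Phase 2, syntax analysis: tokens -> parse tree, by recursive descent.
# expr   := term (('+'|'-') term)*
# term   := factor (('*'|'/') factor)*
# factor := NUMBER | '(' expr ')'
def parse(tokens):
    pos = 0

    def factor():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            tree = expr()
            pos += 1          # skip the closing ')'
            return tree
        return int(token)

    def term():
        nonlocal pos
        tree = factor()
        while pos < len(tokens) and tokens[pos] in "*/":
            op = tokens[pos]
            pos += 1
            tree = (op, tree, factor())
        return tree

    def expr():
        nonlocal pos
        tree = term()
        while pos < len(tokens) and tokens[pos] in "+-":
            op = tokens[pos]
            pos += 1
            tree = (op, tree, term())
        return tree

    return expr()

# Later phases, collapsed: walk the tree and emit code for a made-up stack machine.
def generate(tree):
    if isinstance(tree, int):
        return [f"PUSH {tree}"]
    op, left, right = tree
    names = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}
    return generate(left) + generate(right) + [names[op]]

tokens = tokenize("2 + 3 * 4")
tree = parse(tokens)
code = generate(tree)
print(tokens)  # ['2', '+', '3', '*', '4']
print(tree)    # ('+', 2, ('*', 3, 4))
print(code)    # ['PUSH 2', 'PUSH 3', 'PUSH 4', 'MUL', 'ADD']
```

The parse tree already encodes that multiplication binds tighter than addition, so the generator never has to look at precedence again.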
Additional Compilation Terminologies • Load module (executable image): the user and system code together • Linking and loading: the process of collecting system program units and linking them to the user program
Pure Interpretation • No translation • Easier implementation of programs (run-time errors can be easily and immediately displayed) • Slower execution (10 to 100 times slower than compiled programs) • Often requires more space • Now rare for traditional high-level languages • Significant comeback with some Web scripting languages (e.g., JavaScript)
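A minimal sketch of the idea: the interpreter below walks a program's tree and performs each operation on the spot, producing no translated code. The tuple-based tree format `(op, left, right)` and the `evaluate` function are assumptions made for this illustration.

```python
import operator

# A tiny interpreter: it performs each operation immediately while walking
# the tree and never emits translated code. The (op, left, right) tuple
# format is an assumption made for this sketch.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(tree):
    if isinstance(tree, (int, float)):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left), evaluate(right))

program = ("+", 2, ("*", 3, 4))     # represents 2 + 3 * 4
print(evaluate(program))            # 14

# A run-time error surfaces at the exact node being interpreted.
try:
    evaluate(("/", 1, ("-", 5, 5)))
except ZeroDivisionError as error:
    print("run-time error:", error)
```

Every evaluation repeats the dispatch on `op`, which is one source of the slowdown the slide mentions.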
Hybrid Implementation Systems • A compromise between compilers and pure interpreters • A high-level language program is translated to an intermediate language that allows easy interpretation • Faster than pure interpretation • Examples • Perl programs are partially compiled to detect errors before interpretation • Initial implementations of Java were hybrid; the intermediate form, byte code, provides portability to any machine that has a byte code interpreter and a run-time system (together, these are called Java Virtual Machine)
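CPython, the standard Python implementation, is itself organized this way, which makes the two steps easy to observe. The sketch below compiles an expression to byte code once and then interprets the compiled object twice; the expression and the variable names `price` and `quantity` are made up for the demo.

```python
import dis

# Step 1: translate source text to an intermediate form (a code object
# holding byte code). The expression itself is a made-up example.
source = "price * quantity"
code_object = compile(source, "<example>", "eval")

print(type(code_object.co_code))    # <class 'bytes'>
dis.dis(code_object)                # lists the byte code instructions

# Step 2: the virtual machine interprets the compiled object; it can be
# run many times without parsing the source again.
print(eval(code_object, {"price": 3, "quantity": 4}))   # 12
print(eval(code_object, {"price": 5, "quantity": 2}))   # 10
```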
Just-in-Time Implementation Systems • Initially translate programs to an intermediate language • Then compile intermediate language into machine code • Machine code version is kept for subsequent calls • JIT systems are widely used for Java programs • .NET languages are implemented with a JIT system
Programming Environments • The collection of tools used in software development • UNIX • An older operating system and tool collection • Nowadays often used through a GUI (e.g., CDE, KDE, or GNOME) that runs on top of UNIX • Borland JBuilder • An integrated development environment for Java • Microsoft Visual Studio.NET • A large, complex visual environment • Used to program in C#, Visual BASIC.NET, JScript, J#, or C++
What Does This C Statement Mean? *p++ = *q++ • modifies *p • increments p • increments q Does this mean…
    *p = *q; ++p; ++q;
… or
    tp = p; ++p; tq = q; ++q; *tp = *tq;
… or
    *p = *q; ++q; ++p;
Languages • Simula • Smalltalk • Algol • Cobol • F# • Prolog • Pascal • Modula-2 • ADA • PL/I • CORBA • PERL • BASIC • JAVASCRIPT • LISP • MIRANDA • ML • SCHEMA • SNOBOL • APL • DELPHI • MAYA