
Compiler Construction

Compiler Construction: Run-Time Environments (Chapter 7), continued. Access to non-local names: assume we have stack allocation of activation records. The SCOPE RULES of the source language determine how we handle non-local references.

mrinal

Presentation Transcript


  1. Compiler Construction: Run-Time Environments

  2. Run-Time Environments (Chapter 7), Continued: Access to Non-local Names

  3. Non-locals • Assume we have stack allocation of activation records. • SCOPE RULES of the source language determine how we handle non-local references. • Most languages use LEXICAL (also called STATIC) scoping. • Lexical scoping means it is possible to determine the declaration corresponding to a reference just by examining the program. • Pascal, C, Ada, etc. use static scoping. • Languages with DYNAMIC scoping require examination of the stack, at runtime, to find the right declaration.

  4. Block structure • C and many other languages have BLOCKs: stmt -> block | … and block -> { decls stmts } • The scope of a declaration in a block follows the MOST-CLOSELY-NESTED rule: • The scope of a declaration in block B includes B. • If “x” is referred to but not declared in B, then “x” is in the scope of a declaration in an enclosing block B’ such that • B’ has a declaration of “x”, and • B’ is more closely nested around B than any other block with a declaration of “x”.

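  5. C program with blocks • [Figure: a C program with nested blocks, shown next to a table with columns Decl and Scope.] • Declarations listed: int a = 0 (scope B0-B2); int b = 0; int b = 1; int a = 2; int b = 3. • What is the output?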
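The figure on this slide did not survive extraction, so the following is only a sketch of a program with the same five declarations. The block labels B1 to B3, the nesting, and the four observation points are assumptions; the function name block_demo is hypothetical. It records which a and b are visible at each point under the most-closely-nested rule.

```c
/* Records the (a, b) pair visible at four observation points.
   Block layout and observation points are assumed, not taken from the slide. */
void block_demo(int seen[4][2]) {
    int a = 0;                                /* block B0 */
    int b = 0;
    {
        int b = 1;                            /* block B1: hides B0's b */
        {
            int a = 2;                        /* block B2: hides B0's a */
            seen[0][0] = a; seen[0][1] = b;   /* sees a from B2, b from B1 */
        }
        {
            int b = 3;                        /* block B3: hides B1's b */
            seen[1][0] = a; seen[1][1] = b;   /* sees a from B0, b from B3 */
        }
        seen[2][0] = a; seen[2][1] = b;       /* sees a from B0, b from B1 */
    }
    seen[3][0] = a; seen[3][1] = b;           /* sees a and b from B0 */
}
```

Under these assumptions the recorded pairs are (2, 1), (0, 3), (0, 1) and (0, 0), in that order.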

  6. Stack allocation of declarations in blocks • Declarations in each block can be allocated on the stack. • It is similar to a procedure call (with no parameters). • Space is allocated on the stack when we enter the block. • Space is deallocated on the stack when we exit the block.

  7. Lexical scope without nested procedures • C and related languages do NOT allow nested procedures. • A program is a series of declarations and functions. • All non-local references inside functions must refer to declarations at file (global) scope.

  8. Example: lexical scope • Consider the C code:
      int a[11];
      void readarray( void ) { … a … }
      int partition( int y, int z ) { … a … }
      void quicksort( int m, int n ) { … }
      int main( void ) { … a … }
     • The references to a are always to the array declared on the first line.

  9. Lexical scope • Without nested procedures: • Locals use stack dynamic allocation. • All non-local data is allocated in the static data area. • At compile time, if a reference is not found in the current procedure’s AR, we look in the static data area and use the resulting static address. • Otherwise, the reference is local and accessible relative to the top of stack pointer. • Passing procedures as parameters is also simple if there is no nesting (all non-locals have static addresses).

  10. Lexical scope with nested procedures
      program sort( input, output );
        var a : array[0..10] of integer;
            x : integer;
        procedure readarray;
          var i : integer;
          begin … a … end { readarray };
        procedure exchange( i, j : integer );
          begin
            x := a[i]; a[i] := a[j]; a[j] := x
          end { exchange };
        procedure quicksort( m, n: integer );
          var k, v : integer;
          function partition( y, z: integer ): integer;
            var i, j : integer;
            begin … a …
                  … v …
                  … exchange( i, j ); …
            end { partition };
          begin … end { quicksort };
      begin … end { sort };

  11. Nesting depth • The reference to a on line 15: • The ref is inside partition(), which is inside quicksort(). • The most closely nested declaration is line 2, at program (global) scope. • The reference to exchange on line 17: • The ref is in partition(), which is nested in quicksort(). • The most closely nested declaration is line 7. • The compiler needs to keep track of the NESTING DEPTH of each declaration: • sort() is at depth 1 • quicksort() is at depth 2 • partition() is at depth 3 • the local i of partition(): depth 3, the depth of the procedure that declares it (so a local always satisfies na = np)

  12. Access Links • We need some way to traverse from one AR to another when searching for the declaration corresponding to a reference. • A new pointer, the ACCESS LINK, is added to the AR. • If procedure P is nested immediately inside procedure Q in the program, then the access link in P’s AR points to the most recent AR for Q (by convention, to the access link field of that AR).

  13. Access links • How do we find a non-local reference using access links? • Suppose procedure P at nesting depth np refers to a non-local “a” with nesting depth na <= np. We find the storage for variable a as follows: • When control is in P, there must be an AR for P on top of the stack. We follow np - na access links, starting from that AR. • After following np - na access links, we have the correct AR. The storage for a is at some fixed offset relative to the beginning of that AR.
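The lookup described on this slide can be sketched in C. The AR layout (one access link plus an array of integer slots), the constant MAX_SLOTS and the function name nonlocal_address are all assumptions made for illustration; a real compiler would emit this pointer chase inline.

```c
#include <stddef.h>

enum { MAX_SLOTS = 8 };          /* assumed size of the local area */

/* Hypothetical activation record: an access link plus local slots. */
typedef struct AR {
    struct AR *access_link;      /* AR of the lexically enclosing procedure */
    int slots[MAX_SLOTS];        /* locals, addressed by a fixed offset */
} AR;

/* Resolve the pair (hops, offset), where hops = np - na.
   Returns NULL if the chain ends before hops links were followed. */
int *nonlocal_address(AR *current, int hops, int offset) {
    AR *ar = current;
    while (hops-- > 0 && ar != NULL) {
        ar = ar->access_link;    /* step out one lexical level */
    }
    return ar != NULL ? &ar->slots[offset] : NULL;
}
```

For a reference inside partition() (depth 3) to a name declared in sort() (depth 1), hops would be 2; for a local, hops is 0 and the current AR is used directly.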

  14. Setting up access links • At compile time, non-local references are represented by the pair (np-na, offset). • We need to set up the access links at procedure call time. • Suppose procedure P at depth np calls procedure X at depth nx. The resulting code depends on whether the called procedure is nested within the caller or not. • Case np < nx: X is nested more deeply than P, so X’s access link just needs to point to P’s AR. • Case np >= nx: X is at the same level as P or in an outer scope. Following np-nx+1 access links from P’s AR reaches the AR of the procedure that most closely encloses both P and X; X’s access link points to that AR.
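The two cases can be written as one small routine. This is a sketch only: the AR type is reduced to its access link, and the names follow_links and callee_access_link are hypothetical.

```c
#include <stddef.h>

/* Hypothetical activation record, reduced to the access link. */
typedef struct AR {
    struct AR *access_link;
} AR;

/* Follow `hops` access links starting from `ar`. */
static AR *follow_links(AR *ar, int hops) {
    while (hops-- > 0 && ar != NULL) {
        ar = ar->access_link;
    }
    return ar;
}

/* Caller P (depth np, record caller_ar) calls X (depth nx).
   Returns the value to store in the access link of X's new AR. */
AR *callee_access_link(AR *caller_ar, int np, int nx) {
    if (np < nx) {
        return caller_ar;        /* X is declared directly inside P */
    }
    /* np >= nx: walk out to the procedure enclosing both P and X */
    return follow_links(caller_ar, np - nx + 1);
}
```

In the sort program, partition() (depth 3) calling exchange() (depth 2) follows 3 - 2 + 1 = 2 links from partition's AR and arrives at the AR of sort(), which is where exchange is declared.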

  15. Parameter Passing

  16. Parameter Passing • Parameters are the most common way for a calling procedure to communicate with the callee. • Different languages have different parameter semantics. • Mostly, the differences lie in whether the l-value, the r-value, or the text of the actual parameter is passed. • We consider four protocols: • Call by value • Call by reference • Copy-restore • Call by name

  17. Call by value • This is the simplest parameter passing method. • The caller computes r-values for the actuals. • The caller places the resulting values on the stack, in the AR of the callee. • The callee may change the parameters, but this has no effect on the caller. • This is the default protocol in Pascal, and the ONLY protocol in C.

  18. Parameter passing example (the “var” keyword in the parameter list of swap specifies call-by-reference)
      program reference( input, output );
      var a, b: integer;
      procedure swap( var x, y: integer );
        var temp : integer;
      begin
        temp := x;
        x := y;
        y := temp;
      end;
      begin
        a := 1; b := 2;
        swap( a, b );
        writeln( 'a = ', a ); writeln( 'b = ', b )
      end.

  19. Call by reference • The caller passes the called procedure a POINTER to the storage address of the actual parameter. • If the actual has an l-value, it is used. • If the actual is an expression, we place the result of the expression in a temporary and pass a pointer to the temporary. • Pascal uses call by reference if the “var” keyword is used. • C++ uses call by reference if the formal parameter is declared as a reference with “&”.
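Since C only has call by value, the difference between the two protocols can be shown by passing pointers explicitly, which is what a call-by-reference compiler does behind the scenes. The function names below are hypothetical.

```c
/* Call by value: x and y are copies, so the caller's variables are untouched. */
void swap_by_value(int x, int y) {
    int temp = x;
    x = y;
    y = temp;
}

/* Call by reference, simulated in C by passing the l-values (addresses). */
void swap_by_reference(int *x, int *y) {
    int temp = *x;
    *x = *y;
    *y = temp;
}
```

With a = 1 and b = 2, swap_by_value(a, b) leaves both unchanged, while swap_by_reference(&a, &b) leaves a = 2 and b = 1, matching the Pascal swap with var parameters.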

  20. Copy restore • This is a hybrid between call-by-value and call-by-reference. • Before the callee is activated, we evaluate the actuals and put their r-values in the AR for the callee. • But we also compute and save the l-values of the actuals. • In the return sequence, we copy the updated r-values from the callee’s AR back to the locations given by the saved l-values. • Some FORTRAN implementations use this approach.
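Copy-restore and call by reference only differ when the actuals alias each other. The sketch below simulates both in C for a hypothetical procedure bump(x, y) that increments each of its parameters; all names are made up for illustration.

```c
/* Body of a hypothetical procedure bump(x, y): x := x + 1; y := y + 1. */
static void bump_body(int *x, int *y) {
    *x += 1;
    *y += 1;
}

/* Call by reference: the body works directly on the caller's storage. */
void bump_by_reference(int *x, int *y) {
    bump_body(x, y);
}

/* Copy-restore: copy the r-values in, run the body on the copies,
   then copy the results back through the saved l-values. */
void bump_copy_restore(int *x, int *y) {
    int x_copy = *x;
    int y_copy = *y;
    bump_body(&x_copy, &y_copy);
    *x = x_copy;
    *y = y_copy;
}
```

Calling either routine with the same variable for both arguments exposes the difference: starting from 5, call by reference yields 7, while copy-restore yields 6 because each copy is incremented once and both are written back to the same location.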

  21. Call by name (macro expansion) • In this method, we just substitute the body of the procedure for the procedure call. • In the copied body, the formal parameters are replaced by the text of the actuals. • #define macros in C/C++ use this technique.
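A C macro makes the textual substitution visible: an actual with a side effect is evaluated once per use of the formal, not once per call. The macro SQUARE and the helper names are assumptions for this sketch.

```c
/* The text of the actual replaces every occurrence of x. */
#define SQUARE(x) ((x) * (x))

static int calls = 0;

/* An "actual parameter" with a side effect: returns 1, 2, 3, ... */
static int next_value(void) {
    return ++calls;
}

/* Macro expansion: next_value() appears twice in the expanded text. */
int square_by_name(void) {
    calls = 0;
    return SQUARE(next_value());
}

/* Call by value for comparison: the actual is evaluated exactly once. */
int square_by_value(void) {
    calls = 0;
    int v = next_value();
    return v * v;
}

/* Number of times next_value() ran in the most recent call above. */
int evaluation_count(void) {
    return calls;
}
```

Here square_by_name() returns 1 * 2 = 2 after two evaluations of the actual, whereas square_by_value() returns 1 after a single evaluation.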

  22. Symbol Tables

  23. Symbol table implementation • The symbol table stores many kinds of information about names: • The NAME itself • STORAGE information • SCOPE information • So a symbol table entry is typically a record data type. • The table itself could be a simple linear array, or a more complex data structure (hash table, etc.).

  24. The NAME entry • Most languages put some bound on the length of ID names. • If the limit is small, we can place the name in the ST entry itself:
      typedef struct {
          char name[MAX_LENGTH+1];
          …
      } tSymbolTableEntry;
     • Otherwise, we should store the names on the heap and simply point to them:
      typedef struct {
          char *name;
          …
      } tSymbolTableEntry;

  25. Storage information • The code generator needs to know about the storage required for declared names. • Statically allocated variables just have an offset relative to the beginning of the static data area. • Each definition needs to reserve space in the static data area and advance a pointer to the next available location. • For stack dynamic variables, we need to store the offset of the variable relative to the activation record for the procedure. • Heap dynamic variable storage requirements are not known until runtime.
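The "reserve space and advance a pointer" step is the same for the static data area and for the locals of one activation record. A minimal sketch follows, with a hypothetical StorageArea type and reserve_storage function; it ignores alignment, which a real code generator would have to handle.

```c
/* Hypothetical allocator state for one storage area
   (the static data area, or the locals of one procedure). */
typedef struct {
    int next_offset;             /* next free offset, in bytes */
} StorageArea;

/* Reserve size_in_bytes for a declaration and return its offset.
   Alignment is deliberately ignored in this sketch. */
int reserve_storage(StorageArea *area, int size_in_bytes) {
    int offset = area->next_offset;
    area->next_offset += size_in_bytes;
    return offset;
}
```

Reserving 44 bytes and then 4 bytes from an empty area returns offsets 0 and 44; the returned offset is what would be recorded in the symbol table entry.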

  26. Linear list representations • We add new ST entries to the end of an array. • The array has to be reallocated if it gets too big. • Search for an item begins at the end and goes backwards, to ensure we get the most recent declaration of a name. • Checking for existence takes n/2 checks on average. • For n insertions and e lookups, we have O(n(n+e)) time. • Usually e >> n, so we can write O(ne). • This running time is generally too large for big programs.
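A linear-list table along these lines might look as follows. The Entry and LinearTable types, the doubling growth policy, and the lt_ function names are assumptions; the single info field stands in for whatever storage and scope data a real entry carries.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *name;            /* not copied; the caller keeps it alive */
    int info;                    /* placeholder for storage/scope data */
} Entry;

typedef struct {
    Entry *items;
    size_t count;
    size_t capacity;
} LinearTable;

/* Append a new entry, growing the array when it is full.
   Returns 0 on success, -1 if memory could not be allocated. */
int lt_insert(LinearTable *t, const char *name, int info) {
    if (t->count == t->capacity) {
        size_t new_capacity = t->capacity ? t->capacity * 2 : 8;
        Entry *grown = realloc(t->items, new_capacity * sizeof *grown);
        if (grown == NULL) {
            return -1;
        }
        t->items = grown;
        t->capacity = new_capacity;
    }
    t->items[t->count].name = name;
    t->items[t->count].info = info;
    t->count++;
    return 0;
}

/* Search backwards so the most recent declaration of a name wins. */
Entry *lt_lookup(LinearTable *t, const char *name) {
    for (size_t i = t->count; i-- > 0; ) {
        if (strcmp(t->items[i].name, name) == 0) {
            return &t->items[i];
        }
    }
    return NULL;
}
```

Inserting "x" twice and then looking it up returns the second entry, which is the behaviour needed when an inner declaration hides an outer one.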

  27. Hash table representations of the ST • A hash table reduces the time needed to insert into and search the ST. • OPEN HASHING gives us a run time of O(n(n+e)/m) for any m we desire. • The table is an array of m BUCKETS, each holding a linked list of entries. • To determine if s is in the table, we apply a HASH FUNCTION h() to s, such that 0 <= h(s) < m. • Then we search the linked list in bucket h(s).

  28. Hash table representations of the ST • Complexity: the average list length is n/m, so as long as m is within a constant factor of n, the search takes nearly constant time. • For h(s), the simplest method is to add up the ASCII values of the characters in s, divide by m, and take the remainder. • There are MANY other techniques. • Most modern languages have library support for hash tables (see hcreate()/hsearch()/hdestroy() if you are a C lover).
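Open hashing with the additive hash function described above can be sketched as follows. The bucket count of 211, the Node and HashTable types, and the function names are assumptions; the additive hash is used only because it is the one the slide describes, and it collides on any two names with the same letters.

```c
#include <stdlib.h>
#include <string.h>

enum { NUM_BUCKETS = 211 };      /* m, chosen arbitrarily for this sketch */

typedef struct Node {
    const char *name;            /* not copied; the caller keeps it alive */
    int info;                    /* placeholder for storage/scope data */
    struct Node *next;           /* next entry in the same bucket */
} Node;

typedef struct {
    Node *buckets[NUM_BUCKETS];
} HashTable;

/* Sum of the character codes, reduced modulo the number of buckets. */
unsigned hash_name(const char *s) {
    unsigned sum = 0;
    while (*s != '\0') {
        sum += (unsigned char)*s++;
    }
    return sum % NUM_BUCKETS;
}

/* Insert at the head of the bucket's list, so newer entries are found first.
   Returns 0 on success, -1 if memory could not be allocated. */
int ht_insert(HashTable *t, const char *name, int info) {
    Node *node = malloc(sizeof *node);
    if (node == NULL) {
        return -1;
    }
    unsigned h = hash_name(name);
    node->name = name;
    node->info = info;
    node->next = t->buckets[h];
    t->buckets[h] = node;
    return 0;
}

/* Search only the list in bucket h(name). */
Node *ht_lookup(const HashTable *t, const char *name) {
    for (Node *n = t->buckets[hash_name(name)]; n != NULL; n = n->next) {
        if (strcmp(n->name, name) == 0) {
            return n;
        }
    }
    return NULL;
}
```

The names "ab" and "ba" both hash to bucket 195 in ASCII, so they share a list but are still distinguished by the string comparison in ht_lookup.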

  29. Scope and the ST • Each entry in a ST corresponds to a declaration of a name. • When we look up a name in the ST, we want the entry for the declaration at the correct scope to be returned. • The simplest approach is to have a separate hash table for every scope. • Another way is to give each procedure a unique number, and append the number to each name, guaranteeing uniqueness.

  30. Dynamic Storage Allocation

  31. Explicit vs. implicit alloc/dealloc • Most languages support dynamic allocation of memory. • Pascal supports new(p) and dispose(p) for pointer types. • C provides malloc() and free() in the standard library. • C++ provides the new and delete operators. • These are all examples of EXPLICIT allocation. • Other languages like Python and Lisp have IMPLICIT allocation.

  32. Garbage • In languages with explicit deallocation, the programmer must be careful to free every dynamically allocated variable, or GARBAGE will accumulate. • Garbage is dynamically allocated memory that is no longer accessible because no pointers are pointing to it. • In some languages with implicit deallocation, GARBAGE COLLECTION is occasionally necessary. • Other languages with implicit deallocation carefully track references to allocated memory and automatically free memory when nobody refers to it any longer.

  33. Dynamic storage allocation • We assume the heap is an initially empty block of memory. • As memory is allocated and deallocated, fragmentation occurs. • For allocation, we must find a HOLE large enough to hold the requested memory. • For deallocation, we must merge adjacent holes to prevent further fragmentation.
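Hole management can be sketched with a sorted list of free holes over a heap addressed by offsets. The first-fit policy, the fixed-size hole array, and all type and function names are assumptions for this sketch; the slide does not prescribe a placement strategy.

```c
#include <stddef.h>

enum { MAX_HOLES = 32 };         /* assumed upper bound on tracked holes */

typedef struct { size_t start, size; } Hole;

/* Free holes, kept sorted by start offset. */
typedef struct { Hole holes[MAX_HOLES]; size_t count; } FreeList;

/* The heap starts as one hole covering everything. */
void fl_init(FreeList *fl, size_t heap_size) {
    fl->holes[0].start = 0;
    fl->holes[0].size = heap_size;
    fl->count = 1;
}

static void remove_hole(FreeList *fl, size_t i) {
    for (size_t j = i + 1; j < fl->count; j++) {
        fl->holes[j - 1] = fl->holes[j];
    }
    fl->count--;
}

/* First fit: carve the request from the first hole that is large enough.
   Returns the start offset, or (size_t)-1 if no hole fits. */
size_t fl_alloc(FreeList *fl, size_t size) {
    for (size_t i = 0; i < fl->count; i++) {
        Hole *h = &fl->holes[i];
        if (h->size >= size) {
            size_t start = h->start;
            h->start += size;
            h->size -= size;
            if (h->size == 0) {
                remove_hole(fl, i);
            }
            return start;
        }
    }
    return (size_t)-1;
}

/* Return [start, start + size) to the free list, merging with any
   adjacent hole. Returns 0, or -1 if the hole array is full. */
int fl_free(FreeList *fl, size_t start, size_t size) {
    size_t i = 0;
    while (i < fl->count && fl->holes[i].start < start) {
        i++;
    }
    int joins_prev = i > 0 &&
        fl->holes[i - 1].start + fl->holes[i - 1].size == start;
    int joins_next = i < fl->count && start + size == fl->holes[i].start;
    if (joins_prev && joins_next) {
        fl->holes[i - 1].size += size + fl->holes[i].size;
        remove_hole(fl, i);
    } else if (joins_prev) {
        fl->holes[i - 1].size += size;
    } else if (joins_next) {
        fl->holes[i].start = start;
        fl->holes[i].size += size;
    } else {
        if (fl->count == MAX_HOLES) {
            return -1;
        }
        for (size_t j = fl->count; j > i; j--) {
            fl->holes[j] = fl->holes[j - 1];
        }
        fl->holes[i].start = start;
        fl->holes[i].size = size;
        fl->count++;
    }
    return 0;
}
```

Allocating three 30-unit blocks from a 100-unit heap and freeing the first and third leaves two separate holes; freeing the middle block then merges everything back into a single 100-unit hole.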
