Context-sensitive Languages

Definition 7.1.1: for a given set of non-terminals V and set of terminals Σ, a production is context-sensitive if it is of the form αAβ → αγβ, where A ∈ V, α, β ∈ (V ∪ Σ)*, and γ ∈ (V ∪ Σ)⁺; a grammar is context-sensitive if all its productions are.

Definition 7.1.2: L ⊆ Σ* is an ε-free context-sensitive language if there exists a context-sensitive grammar G so that L = L(G); L is a context-sensitive language if there exists a context-sensitive grammar G so that either L = L(G) or L = L(G) ∪ {ε} [in this latter case we may write L = Lε(G)].

Since we have previously proven that erasing rules can be eliminated in context-free grammars, each context-free language is context-sensitive.
Theorem 7.1.2: each non-erasing general phrase structure grammar is equivalent to a context-sensitive grammar.

Example 7.1.1. A non-erasing grammar for {a^(2^k) | k ≥ 0} (start = S):
S → DS | A,  DA → aA,  A → a,  Da → aaD.
As a sample derivation we have:
S ⇒ DS ⇒ DDS ⇒ DDA ⇒ DaA ⇒ aaDA ⇒ aaaA ⇒ a^4
In general, S ⇒* D^k A ⇒* a^(2^k).
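To see the grammar in action, here is a small Python sketch (ours, not the text's) that enumerates every terminal string this grammar can derive up to a length bound; because every production is non-erasing, sentential forms never shrink, so forms longer than the bound can be pruned and the search terminates.

from collections import deque

# Productions of Example 7.1.1; the only terminal is 'a'.
RULES = [("S", "DS"), ("S", "A"), ("DA", "aA"), ("A", "a"), ("Da", "aaD")]

def generated(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    seen, terminal, queue = {"S"}, set(), deque(["S"])
    while queue:
        form = queue.popleft()
        for lhs, rhs in RULES:
            start = 0
            while (i := form.find(lhs, start)) != -1:
                start = i + 1
                nxt = form[:i] + rhs + form[i + len(lhs):]
                if len(nxt) > max_len or nxt in seen:
                    continue
                seen.add(nxt)
                if set(nxt) <= {"a"}:      # terminal string reached
                    terminal.add(nxt)
                else:
                    queue.append(nxt)
    return terminal

# exactly the strings a^(2^k) with 2^k <= 16:
assert generated(16) == {"a" * 2**k for k in range(5)}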
Example 7.1.2. The construction in the proof of Theorem 7.1.1 applied to Example 7.1.1 results in the following equivalent context-sensitive grammar:
S → DS | A,  DA → XₐA,  A → Xₐ,  Xₐ → a,
DXₐ → YXₐ,  YXₐ → YZ,  YZ → XₐZ,  XₐZ → XₐXₐD.

Example 7.1.3. A non-erasing grammar for {a^k b^k c^k | k ≥ 1} (start = A):
A → aABC,  aB → ab,  C → c,  A → aBC,  bB → bb,  CB → BC.
As a sample derivation we have:
A ⇒ aABC ⇒ aaBCBC ⇒ aabCBC ⇒ aabBCC ⇒ aabbCC ⇒ aabbcC ⇒ aabbcc
Theorem 7.1.3: each phrase structure language is generated by a context-sensitive grammar augmented with context-free erasing.

Theorem 7.1.4: each phrase structure language is the homomorphic image of an ε-free context-sensitive language.

Definition 7.1.3: a homomorphism h: Σ* → Δ* is ε-free (or non-erasing) if h(σ) ≠ ε for each σ ∈ Σ.

Theorem 7.1.5: context-sensitive languages are closed under ε-free homomorphisms.
Definition 7.1.4: if G = (V, Σ, P, S) is a context-sensitive grammar, for w₁, w₂ ∈ (V ∪ Σ)* a rewriting is leftmost, w₁ ⇒ₗ w₂, if w₁ = xyAβζ, w₂ = xyγβζ, yAβ → yγβ ∈ P, and x, y ∈ Σ* (so the occurrence of A being rewritten is the leftmost non-terminal); a derivation is leftmost if each of its steps is; the leftmost language of G is LL(G) = {w ∈ Σ* | S ⇒ₗ* w}.

Theorem 7.1.6: if G is a context-sensitive grammar, then LL(G) is context-free.
Linear Bounded Automata

Definition 7.2.1: a linear bounded automaton (LBA) A is a 6-tuple A = (S, Γ, Σ, δ, s₀, R), where S (states) is a finite non-empty set, Γ (tape alphabet) is a finite non-empty set, Σ ⊆ Γ (input alphabet) is a non-empty set, s₀ ∈ S (start state), R ⊆ S (recognizing or accepting states), and δ: S × Γ → ℘(S × Γ × {l, r}) is the next-move function.

Definition 7.2.2: an instantaneous description (ID) for an LBA A = (S, Γ, Σ, δ, s₀, R) is a sequence α₁sα₂ where s ∈ S and α₁α₂ ∈ Γ*.
Definition 7.2.3: a move of an LBA A = (S, Γ, Σ, δ, s₀, R) is denoted by a pair of instantaneous descriptions; for IDs I₁ and I₂ we say I₁ leads to I₂ in A, I₁ ⊢ I₂, if I₁ = γ₁γ₂…γₖ₋₁ s γₖ…γₙ (1 ≤ k ≤ n) and either
(a) ⟨t, Y, r⟩ ∈ δ(s, γₖ) and I₂ = γ₁γ₂…γₖ₋₁ Y t γₖ₊₁…γₙ, or
(b) k > 1, ⟨t, Y, l⟩ ∈ δ(s, γₖ), and I₂ = γ₁γ₂… t γₖ₋₁ Y γₖ₊₁…γₙ.

Definition 7.2.4: a run of an LBA is just a sequence of (zero or more) moves of the machine; that is, there is a run from ID I₁ to ID I₂, I₁ ⊢* I₂, provided there is a sequence of IDs J₀, J₁, …, Jₖ (k ≥ 0) so that I₁ = J₀, I₂ = Jₖ, and Jᵢ ⊢ Jᵢ₊₁ for 0 ≤ i < k.

Definition 7.2.5: for an LBA A the language recognized by A is L(A) = {w ∈ Σ* | s₀w ⊢* αsβ for some s ∈ R and αβ ∈ Γ*}. Language L is an LBA language if an LBA exists that accepts it.
Example 7.2.1. This LBA recognizes the non-context-free language {ww | w ∈ Σ⁺} over Σ = {0,1} using tape alphabet Γ = Σ ∪ {X, Y}:

s₀0 → s₁Xr    s₀1 → s₂Xr
s₁0 → s₁0r    s₁0 → s₃Yl    s₁1 → s₁1r
s₂0 → s₂0r    s₂1 → s₂1r    s₂1 → s₃Yl
s₃0 → s₃0l    s₃1 → s₃1l    s₃Y → s₃Yl    s₃X → s₄Xr
s₄0 → s₅Xr    s₄1 → s₇Xr    s₄Y → s₉Yr
s₅0 → s₅0r    s₅1 → s₅1r    s₅Y → s₆Yr
s₆Y → s₆Yr    s₆0 → s₃Yl
s₇0 → s₇0r    s₇1 → s₇1r    s₇Y → s₈Yr
s₈Y → s₈Yr    s₈1 → s₃Yl
s₉Y → s₉Yr    s₉0 → s₁₀0r    s₉1 → s₁₀1r

where, e.g., ⟨s₁, X, r⟩ ∈ δ(s₀, 0) is written as s₀0 → s₁Xr; s₉ is the accept state; the non-deterministic choices (s₁0 → s₁0r vs. s₁0 → s₃Yl, and s₂1 → s₂1r vs. s₂1 → s₃Yl) are highlighted in red on the original slide.
Example 7.2.2. (repeats the LBA of Example 7.2.1)

0: if {0} print X, right_to 1; if {1} print X, right_to 2
1: while {0,1} right; if {0} print Y, left_to 3
2: while {0,1} right; if {1} print Y, left_to 3
3: while {0,1,Y} left; if {X} right_to 4
4: if {0} print X, right_to 5; if {1} print X, right_to 7; if {Y} right_to 9
5: while {0,1} right; if {Y} right_to 6
6: while {Y} right; if {0} print Y, left_to 3
7: while {0,1} right; if {Y} right_to 8
8: while {Y} right; if {1} print Y, left_to 3
9: while {Y} right; if {0,1} halt; accept
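The nondeterministic table of Example 7.2.1 can be checked mechanically. The Python sketch below is our illustration (with s₁₀ as the dead state): it explores all reachable IDs breadth-first, and since an LBA never leaves its input cells the ID space is finite, so the search always terminates. Following Definition 7.2.3, a right move from the last cell is permitted and leaves the head at the right end, where no further move is possible.

from collections import deque

# (state, read) -> list of (new state, write, move) for Example 7.2.1.
DELTA = {}
for s, a, t, w, d in [
    (0,"0",1,"X","r"), (0,"1",2,"X","r"),
    (1,"0",1,"0","r"), (1,"0",3,"Y","l"), (1,"1",1,"1","r"),
    (2,"0",2,"0","r"), (2,"1",2,"1","r"), (2,"1",3,"Y","l"),
    (3,"0",3,"0","l"), (3,"1",3,"1","l"), (3,"Y",3,"Y","l"), (3,"X",4,"X","r"),
    (4,"0",5,"X","r"), (4,"1",7,"X","r"), (4,"Y",9,"Y","r"),
    (5,"0",5,"0","r"), (5,"1",5,"1","r"), (5,"Y",6,"Y","r"),
    (6,"Y",6,"Y","r"), (6,"0",3,"Y","l"),
    (7,"0",7,"0","r"), (7,"1",7,"1","r"), (7,"Y",8,"Y","r"),
    (8,"Y",8,"Y","r"), (8,"1",3,"Y","l"),
    (9,"Y",9,"Y","r"), (9,"0",10,"0","r"), (9,"1",10,"1","r"),
]:
    DELTA.setdefault((s, a), []).append((t, w, d))

def accepts(w, accept=9):
    """Breadth-first search over the finite space of IDs (state, head, tape)."""
    if not w:
        return False
    start = (0, 0, w)
    seen, queue = {start}, deque([start])
    while queue:
        s, h, tape = queue.popleft()
        if s == accept:
            return True
        if h == len(tape):          # head at the right end: no move applies
            continue
        for t, out, d in DELTA.get((s, tape[h]), []):
            h2 = h + 1 if d == "r" else h - 1
            if h2 < 0:              # Definition 7.2.3(b) forbids this move
                continue
            nid = (t, h2, tape[:h] + out + tape[h + 1:])
            if nid not in seen:
                seen.add(nid)
                queue.append(nid)
    return False

assert accepts("0101") and accepts("11") and not accepts("0110") and not accepts("0")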
Using Multiple “Tracks”

By taking Γ = Σ ∪ (Σ × {A, B, C, …}), positions on an LBA tape can be “marked” without obliterating the input: the tape contents a b c … can become a b c … with, say, the pair ⟨a, A⟩ in place of a, marking that position with A. See Example 7.2.3 in the text for a detailed illustration.
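In code, one convenient rendering of a two-track cell is simply a pair (a sketch of ours, not the text's construction):

# A two-"track" tape cell as a pair (input symbol, mark); marking a cell
# touches only the second component, so the input is never obliterated.
tape = [(c, None) for c in "abcab"]
tape[3] = (tape[3][0], "A")                    # mark position 3 with A
assert "".join(c for c, _ in tape) == "abcab"  # the input track is intact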
Definition 7.2.6: if G = (V, Σ, P, S) is a phrase structure grammar, and π = α → β is a production with β ∈ (V ∪ Σ)*·V·(V ∪ Σ)*, then π is called invertible, and π⁻¹ = β → α is its inverse.

Lemma 7.2.1: if G = (V, Σ, P, S) is a phrase structure grammar and π ∈ P is invertible with inverse π⁻¹, then for all x, y ∈ (V ∪ Σ)*, x ⇒ y via π if and only if y ⇒ x via π⁻¹.

Theorem 7.2.2: each LBA language is context-sensitive.
Context-sensitive Membership Test

STEPS := <S>;                /* initialize STEPS to start (derived in 0 steps) */
I := 1;                      /* initialize I to first step in STEPS */
while I ≤ size(STEPS) do     /* try each step */
begin
  for PDN := 1 to p do       /* with each production */
    for POS := 1 to len(STEPS[I]) do   /* at each position */
    begin
      NEXT := rewrite(STEPS[I], P[PDN], POS);
      if NEXT = INPUT then return(true)
      else if len(NEXT) ≤ len(INPUT) then STEPS := append(STEPS, NEXT)
    end;
  I := I + 1
end;
return(false)

rewrite(x, π, p) – returns string x rewritten with production π at position p, or x if π doesn't apply at p
append(l, x) – returns list l with x added at end, or just l if x already occurs in l
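The same bounded search is easy to make concrete. The Python sketch below (ours) is a direct transcription of the algorithm above: because a context-sensitive grammar is non-erasing, sentential forms never shrink, so only forms no longer than the input ever need to be recorded, and the loop runs over a finite list.

def cs_member(input_str, rules, start):
    """Decide whether input_str is derivable from start via the given
    non-erasing rules, recording only forms of length <= len(input_str)."""
    steps, seen, i = [start], {start}, 0
    while i < len(steps):                            # try each recorded step
        form = steps[i]
        for lhs, rhs in rules:                       # with each production
            pos = 0
            while (j := form.find(lhs, pos)) != -1:  # at each position
                pos = j + 1
                nxt = form[:j] + rhs + form[j + len(lhs):]
                if nxt == input_str:
                    return True
                if len(nxt) <= len(input_str) and nxt not in seen:
                    seen.add(nxt)
                    steps.append(nxt)
        i += 1
    return False

# The grammar of Example 7.1.3:
G = [("A", "aABC"), ("aB", "ab"), ("C", "c"), ("A", "aBC"),
     ("bB", "bb"), ("CB", "BC")]
assert cs_member("aabbcc", G, "A") and not cs_member("aabbc", G, "A")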
Theorem 7.2.3: each context-sensitive language is an LBA language.

Theorem 7.3.1: the context-sensitive languages are closed under the regular language operations: union, concatenation, and Kleene closure.

Theorem 7.3.2: the context-sensitive languages are closed under intersection.

Definition 7.3.1: a DGSM M = (S, Σ, Δ, δ, s₀) is ε-free if δ(s, σ) produces non-empty output for all s ∈ S and σ ∈ Σ.

Theorem 7.3.3: the context-sensitive languages are closed under ε-free DGSM functions.
Lemma 7.3.4: for each LBA A = (S_A, Σ, Γ_A, δ_A, s_A, R_A), there is an LBA B = (S_B, Σ, Γ_B, δ_B, s_B, R_B) so that when B is started on input x ∈ Σ⁺, it terminates its run with its tape containing a representation of the number of IDs α of A such that s_A x ⊢* α in A.

Theorem 7.3.5: the context-sensitive languages are closed under complement.
Turing Machines

Definition 8.1.1: a Turing acceptor T is a 7-tuple T = (S, Γ, Σ, δ, s₀, b, R), where S (states) is a finite non-empty set, Γ (tape alphabet) is a finite non-empty set, Σ ⊆ Γ (input alphabet) is a non-empty set, s₀ ∈ S (start state), b ∈ Γ (blank symbol), R ⊆ S (recognizing or accepting states), and δ: S × Γ → S × Γ × {l, r} is the (partial) next-move function.

Definition 8.1.2: an instantaneous description (ID) for Turing machine T = (S, Γ, Σ, δ, s₀, b, R) is a sequence α₁sα₂ where s ∈ S and α₁α₂ ∈ Γ*; note that blank symbols at the right of α₂ may be omitted.
Definition 8.1.3: a move of a Turing machine T = (S, Γ, Σ, δ, s₀, b, R) is denoted by a pair of instantaneous descriptions; for IDs I₁ and I₂ we say I₁ leads to I₂ in T, I₁ ⊢ I₂, if either
(1) I₁ = γ₁γ₂…γₖ₋₁ s γₖ…γₙ (1 ≤ k ≤ n), and either
  (a) δ(s, γₖ) = ⟨s′, Y, r⟩ and I₂ = γ₁γ₂…γₖ₋₁ Y s′ γₖ₊₁…γₙ, or
  (b) k > 1, δ(s, γₖ) = ⟨s′, Y, l⟩ and I₂ = γ₁γ₂… s′ γₖ₋₁ Y γₖ₊₁…γₙ, or
(2) I₁ = γ₁γ₂…γₙ s (the head rests on a blank), and either
  (a) δ(s, b) = ⟨s′, Y, r⟩ and I₂ = γ₁γ₂…γₙ Y s′, or
  (b) n ≥ 1, δ(s, b) = ⟨s′, Y, l⟩ and I₂ = γ₁γ₂…γₙ₋₁ s′ γₙ Y.
Definition 8.1.4: a run of Turing machine T = (S, Γ, Σ, δ, s₀, b, R) is just a sequence of moves of the machine; that is, there is a run from ID I₁ to ID I₂, I₁ ⊢* I₂, provided there is a sequence of IDs J₀, J₁, …, Jₖ (k ≥ 0) so that I₁ = J₀, I₂ = Jₖ, and Jᵢ ⊢ Jᵢ₊₁ for 0 ≤ i < k.

Definition 8.1.5: for a Turing acceptor T, the language accepted (or recognized) by T is L(T) = {w ∈ Σ* | s₀w ⊢* α₁sα₂ for some s ∈ R and α₁α₂ ∈ Γ*}.

Definition 8.1.6: a language L ⊆ Σ* is (partial) Turing-recognizable if there exists a Turing machine T so that L = L(T); if there exists such a Turing machine T that halts for each x ∈ Σ*, then L is called total Turing-recognizable.
Example 8.1.1: a Turing recognizer for {0^n 1^n | n ≥ 1}.

transition      comment
s₀0 → s₁Xr      mark a 0 with X and go search for the corresponding 1
s₀Y → s₃Yr      no more 0s remain; go check that all 1s have been matched
s₁0 → s₁0r      move right over 0s and Ys to reach the corresponding 1
s₁Y → s₁Yr
s₁1 → s₂Yl      mark the corresponding 1 with Y and go back for the next 0
s₂0 → s₂0l      move left over 0s and Ys to reach the last (rightmost) marked 0
s₂Y → s₂Yl
s₂X → s₀Xr      go repeat for the next 0
s₃Y → s₃Yr      check that no unmarked symbols remain before the end of the input
s₃b → s₄bl      if so, accept
Example 8.1.1. (continued)

0: if {0} print X, right_to 1;    /* mark a 0 with X */
   if {Y} right_to 3;             /* marking 0s complete */
   otherwise reject               /* reject if no leading 0 */
1: while {0, Y} right;            /* search right for 1 */
   if {1} print Y, left_to 2;     /* mark corresponding 1 with Y */
   otherwise reject               /* reject if it's missing */
2: while {0, Y} left;             /* rewind to last X */
   if {X} right_to 0;             /* loop for more 0s & 1s */
   otherwise reject               /* other options impossible */
3: while {Y} right;               /* skip over marked 1s */
   if {b} accept;                 /* if no extra 0s or 1s remain, accept */
   otherwise reject
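Since this machine is deterministic, simulating it takes only a few lines of Python (our sketch; state 4 accepts, and the tape grows blanks on demand as in Definition 8.1.3(2)):

# delta: (state, symbol) -> (new state, write, move); "b" is the blank.
DELTA = {
    (0, "0"): (1, "X", "r"), (0, "Y"): (3, "Y", "r"),
    (1, "0"): (1, "0", "r"), (1, "Y"): (1, "Y", "r"), (1, "1"): (2, "Y", "l"),
    (2, "0"): (2, "0", "l"), (2, "Y"): (2, "Y", "l"), (2, "X"): (0, "X", "r"),
    (3, "Y"): (3, "Y", "r"), (3, "b"): (4, "b", "l"),
}

def accepts(w, accept=4):
    tape, head, state = list(w) or ["b"], 0, 0
    while state != accept:
        move = DELTA.get((state, tape[head]))
        if move is None:
            return False                    # jammed: reject
        state, tape[head], d = move
        head += 1 if d == "r" else -1
        if head == len(tape):
            tape.append("b")                # extend with blanks on demand
        if head < 0:
            return False                    # fell off the left end
    return True

assert accepts("01") and accepts("000111")
assert not accepts("001") and not accepts("10") and not accepts("")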
Turing Transducers

By a Turing transducer we simply mean an ordinary Turing machine where we take notice of the correspondence between the initial input sequence from Σ* and the content of the tape from Γ* (normally ignoring rightmost blanks) when the machine halts (if ever).

Definition 8.1.7: a partial function is called (partial) Turing-computable if there exists a Turing transducer that computes exactly that (partial) function; for a total function, if there exists a Turing transducer that computes that function and halts for every element of the domain, the function is said to be total Turing-computable.
Unary Representation of Nat

By the literal interpretation of Definition 8.1.7, Turing machines can compute only string-to-string functions. For functions on non-string domains, Turing-computability is understood to be with respect to a “simple” representation of the domain in terms of strings. For the natural numbers Nat = {0, 1, 2, …} we use the unary representation, namely, n ∈ Nat ↦ 0^n 1 ∈ {0,1}*. Since we will consider functions of several arguments, we will also need to encode tuples from Nat by concatenating the representations of the constituents, e.g., ⟨n, m⟩ ↦ 0^n 1 0^m 1 ∈ {0,1}*.
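These conventions are one-liners (our helpers, useful in later sketches):

def unary(n: int) -> str:
    """n in Nat  ->  0^n 1."""
    return "0" * n + "1"

def encode_tuple(*ns: int) -> str:
    """<n, m, ...>  ->  0^n 1 0^m 1 ... by concatenation."""
    return "".join(unary(n) for n in ns)

assert unary(3) == "0001" and encode_tuple(2, 0, 1) == "001101"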
Example: a Turing transducer to map 0^n 1 ↦ 0^(2n) 1, for n ≥ 0.
Σ = {0,1} and Γ = {0, 1, X, Y, b (blank)}; start state is 0.

0: if {0} print X, right_to 1;    /* mark next '0' with 'X' */
   if {Y} print 0, right_to 3;    /* all '0's copied – finish */
   if {1} halt                    /* n=0 – done */
1: while {0,Y} right;             /* copy '0' as 'Y' at far right */
   if {1,b} print Y, left_to 2
2: while {0,Y} left;              /* “rewind” to 'X' */
   if {X} print 0, right_to 0     /* restore 'X' to '0' and loop */
3: while {Y} print 0, right;      /* replace 'Y' copies with '0' */
   if {b} print 1, halt           /* then halt */

With this machine, for m ≥ 0 and n > m (on the first pass, m = 0, the tape still ends with the original 1, which is recopied as the first Y; thereafter the cell scanned at the far right is blank):

0^m s₀ 0^(n−m) Y^m ⊢* 0^m X 0^(n−m−1) Y^m s₁ b ⊢* 0^m s₂ X 0^(n−m−1) Y^(m+1) b ⊢ 0^(m+1) s₀ 0^(n−(m+1)) Y^(m+1) b ⊢* … ⊢* 0^n s₀ Y^n b ⊢ 0^(n+1) s₃ Y^(n−1) b ⊢* 0^(2n) s₃ b ⊢ 0^(2n) 1
Recursive Functions

In this model we are concerned with functions on the natural numbers, Nat = {0, 1, 2, 3, …}. The recursive functions on Nat are developed entirely in terms of function concepts without any explicit computational model.

Definition 8.2.1: an initial function is one of the following:
(1) the constant function, zero, defined for all x ∈ Nat by zero(x) = 0,
(2) the successor function, succ, defined for all x ∈ Nat by succ(x) = x+1, and
(3) for each n ≥ 1 and 1 ≤ k ≤ n, a projection function taking n arguments, p^n_k, defined for all xᵢ ∈ Nat (1 ≤ i ≤ n) by p^n_k(x₁, x₂, …, xₙ) = xₖ.
Three ways are provided to produce new functions from given functions.

Definition 8.2.2: for an m-ary (partial) function f, and m n-ary (partial) functions g₁, g₂, …, gₘ, the composition of f and the gᵢ (1 ≤ i ≤ m), written f ∘ ⟨g₁, g₂, …, gₘ⟩, is the n-ary function h defined by h(x₁, x₂, …, xₙ) = f(g₁(x₁, …, xₙ), g₂(x₁, …, xₙ), …, gₘ(x₁, …, xₙ)); as a partial function, the domain of h is dom(h) = {⟨x₁, x₂, …, xₙ⟩ | ⟨x₁, x₂, …, xₙ⟩ ∈ dom(gᵢ) for all 1 ≤ i ≤ m and ⟨g₁(x₁, x₂, …, xₙ), g₂(x₁, x₂, …, xₙ), …, gₘ(x₁, x₂, …, xₙ)⟩ ∈ dom(f)}. Of course, if all the functions are total, then so is the composition.
Definition 8.2.3: for an n-ary total function f and an (n+2)-ary total function g, the primitive recursion of f and g is the (n+1)-ary function h, written pr(f, g), defined by h(x₁, x₂, …, xₙ, 0) = f(x₁, x₂, …, xₙ), and for each y ∈ Nat, h(x₁, …, xₙ, y+1) = g(y, h(x₁, …, xₙ, y), x₁, …, xₙ).

Definition 8.2.4: for the (n+1)-ary total function f, the minimalization of f is the n-ary function g defined by
g(x₁, x₂, …, xₙ) = the smallest y ∈ Nat so that f(x₁, x₂, …, xₙ, y) = 0, and undefined if no such y exists.
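Both constructions translate directly into Python (a sketch of ours: pr realizes Definition 8.2.3 by iteration, and mu realizes Definition 8.2.4 by unbounded search, which is exactly where partiality enters):

def pr(f, g):
    """Primitive recursion: h(xs, 0) = f(xs); h(xs, y+1) = g(y, h(xs, y), xs)."""
    def h(*args):
        *xs, y = args
        acc = f(*xs)
        for i in range(y):
            acc = g(i, acc, *xs)
        return acc
    return h

def mu(f):
    """Minimalization: the least y with f(xs, y) = 0; loops forever if none."""
    def g(*xs):
        y = 0
        while f(*xs, y) != 0:
            y += 1
        return y
    return g

# least y with f(2, y) = max(0, 3 - 2y) = 0, namely y = 2:
assert mu(lambda x, y: max(0, 5 - x - 2 * y))(2) == 2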
Definition 8.2.5: a function is primitive recursive if: (1) it is an initial function, or (2) it is obtained from primitive recursive functions by either composition or primitive recursion.

Definition 8.2.6: a function is (partial) recursive if: (1) it is an initial function, or (2) it is obtained from recursive functions by either composition, primitive recursion, or minimalization.
Example 8.2.1. Using informal induction we can define the 'sum' (+) function using only initial functions: for all x, y ∈ Nat,
sum(x, 0) = x,
sum(x, y+1) = succ(sum(x, y)).
Formally, this is a definition via primitive recursion, namely, sum = pr(p^1_1, succ ∘ p^3_2).

Once the sum function is defined we can use it to form other definitions. Informally the 'product' (*) function can be expressed as
prod(x, 0) = 0,
prod(x, y+1) = sum(prod(x, y), x).
Formally this is again defined via primitive recursion, prod = pr(zero, sum ∘ ⟨p^3_2, p^3_3⟩).
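Example 8.2.1 can be checked mechanically (our sketch; proj(n, k) plays p^n_k and compose plays ∘):

def succ(x): return x + 1
def zero(x): return 0

def proj(n, k):                      # p^n_k, 1-based as in Definition 8.2.1
    return lambda *xs: xs[k - 1]

def compose(f, *gs):                 # f o <g1, ..., gm> of Definition 8.2.2
    return lambda *xs: f(*(g(*xs) for g in gs))

def pr(f, g):                        # primitive recursion, Definition 8.2.3
    def h(*args):
        *xs, y = args
        acc = f(*xs)
        for i in range(y):
            acc = g(i, acc, *xs)
        return acc
    return h

sum_  = pr(proj(1, 1), compose(succ, proj(3, 2)))        # pr(p^1_1, succ o p^3_2)
prod_ = pr(zero, compose(sum_, proj(3, 2), proj(3, 3)))  # pr(zero, sum o <p^3_2, p^3_3>)
assert sum_(4, 3) == 7 and prod_(4, 3) == 12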
Theorem 8.2.1: the composition of partial (total) Turing-computable functions yields a partial (total) Turing-computable function.

Theorem 8.2.2: primitive recursion applied to total Turing-computable functions yields a total Turing-computable function.

Theorem 8.2.3: the minimalization of a total Turing-computable function is a (partial) Turing-computable function.

Theorem 8.2.4: each (partial) recursive function on Nat is (partial) Turing-computable.
RASP Model

The Random-Access Stored Program (RASP) model is an idealized version of the familiar von Neumann style computer. This model will be shown equivalent to the Turing machine model. We will be concerned here only with arithmetic computations involving Nat = {0, 1, 2, …}. Signed arithmetic and non-integers can be treated by the same means we employ here, but add technical complications that cloud the conceptual clarity of the relationship between models.
The RASP model comprises:
• internal storage (RAM) – a finite collection of “addressable registers” capable of storing either a machine instruction or a data value (Nat)
• arithmetic unit (acc) – the accumulator where all numerical operations are performed
• instruction counter (ic) – retains the address of the next instruction to be executed

Some versions of the RASP model permit program self-modification — allowing machine instructions to access and change other instructions — but we do not. In our version there is separate data and instruction memory.
Definition 8.3.1: a random-access stored program (RASP) is a finite sequence of instructions i₁, i₂, …, iₘ that may refer to three different types of stored values (natural numbers):
• the accumulator, acc
• the instruction counter, ic
• memory locations m₁, m₂, m₃, …

Each instruction is one of the following nine types:
LDC n   LDM mₖ   STM mₖ   ADD mₖ   SUB mₖ   TRF n   TRB n   BZF n   BZB n
Definition 8.3.2: an instantaneous description (ID) of a RASP is a tuple of natural numbers ⟨i, x, m₁, m₂, …, m_N⟩, where i represents a value of the ic, x a value of the acc, and mₖ, 1 ≤ k ≤ N, represent the values of each of the memory locations mₖ used by the RASP. The input values for a RASP are placed in the leading memory registers m₁, m₂, m₃, … prior to the start of execution, and the output value, if any, is the content of the accumulator (acc) when the program halts. Additional memory registers may be used for working storage without limit, and we assume that all these registers initially contain 0. We will adopt the convention that the last instruction of a RASP is always a HALT, and to halt at any other point in the program, a branch to the last instruction is executed.
Definition 8.3.3: a run of a RASP i₁, i₂, …, iₘ with input ⟨x₁, x₂, …, xₚ⟩ is a sequence (finite or infinite) of IDs Δ₁, Δ₂, … where Δ₁ = ⟨1, 0, x₁, x₂, …, xₚ, 0, 0, …⟩ and for i ≥ 1, if Δᵢ = ⟨k, …⟩, either
• 1 ≤ k ≤ m, iₖ = HALT (i.e., k = m) and Δᵢ is the last ID in the run, or
• 1 ≤ k < m and Δᵢ₊₁ is obtained by applying the effect of iₖ to Δᵢ, or
• k < 1 or k > m, and Δⱼ = Δᵢ for all j ≥ i — this is called an abort (representing it as an infinite but unchanging run is intended to signify there is no result).
RASP execution is understood as the familiar “fetch and execute” cycle used in electronic computers. The ic designates an instruction whose execution causes changes to the registers (including the ic) and memory, and instruction execution continues unless a halt instruction or “jam” is encountered.

Definition 8.3.4: an n-ary partial function f on the natural numbers is RASP-computable if there is a RASP p = i₁, i₂, …, iₘ so that a run of p with input ⟨x₁, x₂, …, xₙ⟩ is finite if and only if ⟨x₁, x₂, …, xₙ⟩ is in the domain of f, and in these cases p's run ends with an ID of the form ⟨m, f(x₁, x₂, …, xₙ), …⟩.
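The slides name the nine instruction types without spelling out their semantics here, so the interpreter below (our sketch) assumes the natural readings: LDC loads a constant into the acc, LDM/STM load from and store to a memory register, ADD/SUB combine a register into the acc (SUB as truncated subtraction on Nat), TRF/TRB transfer control forward/backward n instructions unconditionally, and BZF/BZB do the same only when the acc is 0; HALT follows the convention of Definition 8.3.2.

from collections import defaultdict

def run_rasp(program, inputs, max_steps=100_000):
    """program: 1-indexed list of (opcode, operand); returns the acc on HALT,
    or None on abort (ic out of range) or when the step budget is exhausted."""
    mem = defaultdict(int)                     # all registers initially 0
    for k, x in enumerate(inputs, start=1):    # inputs go to m1, m2, ...
        mem[k] = x
    acc, ic = 0, 1
    for _ in range(max_steps):
        if not 1 <= ic <= len(program):
            return None                        # abort
        op, n = program[ic - 1]
        if op == "HALT":
            return acc
        if op == "LDC":   acc = n
        elif op == "LDM": acc = mem[n]
        elif op == "STM": mem[n] = acc
        elif op == "ADD": acc = acc + mem[n]
        elif op == "SUB": acc = max(0, acc - mem[n])   # truncated minus on Nat
        elif op == "TRF": ic += n; continue
        elif op == "TRB": ic -= n; continue
        elif op == "BZF" and acc == 0: ic += n; continue
        elif op == "BZB" and acc == 0: ic -= n; continue
        ic += 1
    return None

# succ(x): stash the constant 1 in m2, then add it to x (held in m1)
SUCC = [("LDC", 1), ("STM", 2), ("LDM", 1), ("ADD", 2), ("HALT", 0)]
assert run_rasp(SUCC, [41]) == 42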
Example 8.3.1. RASPs for the initial recursive functions zero(x), succ(x), and p^n_k(x₁, x₂, …, xₙ) are trivial. [The slide displays the three short programs.]

Theorem 8.3.1: each recursive function is RASP-computable.
[The slide diagrams a RASP program for f ∘ ⟨g₁, g₂, …, gₘ⟩, assembled from a program for f together with a program for each gᵢ.]
Theorem 8.3.2: each RASP-computable function is Turing-computable. [The slide pictures the simulating Turing machine's tape, on which the acc, pgm, ic, and memory registers m₁, …, mₙ are laid out as unary-coded fields of the form 0^x 1 0^y 1 … 0^z 1.]
Turing Recognizers & Grammars

Theorem 8.4.1: for each Turing acceptor T there exists a phrase structure grammar G so that L(T) = L(G).

Theorem 8.4.2: for each unrestricted phrase structure grammar G there exists a Turing machine T so that L(G) = L(T).

Corollary 8.4.3: each context-sensitive language is total Turing-recognizable.
Theorem 8.4.4: each phrase structure language is the homomorphic image of the intersection of (deterministic) context-free languages.

Theorem 8.4.6: the unrestricted phrase structure languages are closed under union, concatenation, Kleene closure, positive closure, substitution, and intersection.

Theorem 8.4.7: if language L is total Turing-recognizable, then ¬L is also total Turing-recognizable.

Theorem 8.4.8: if both language L and its complement ¬L are (partial) Turing-recognizable languages, then L (and hence ¬L) is total Turing-recognizable.
Definition 8.4.1: let Σ = {0,1} and consider a Turing machine T = (S, Γ, Σ, δ, s₀, b, R). We can assume that R contains a single state. We define a sequence encode(T) ∈ Σ* that provides a complete description of T as a binary sequence as follows:
• choose a numbering of the states S as ⟨s₁, s₂, …, sₘ⟩ where s₀ = s₁ and {s₂} = R, and encode each state as its number in unary,
• choose a numbering of the tape symbols as ⟨γ₁, γ₂, …, γₙ⟩ where 0 = γ₁, 1 = γ₂, b = γ₃, and encode each tape symbol as its unary number,
• number the directions l as D₁ and r as D₂,
• encode a transition δ(sᵢ, γⱼ) = (sₚ, γ_q, D_r) as 0^i 1 0^j 1 0^p 1 0^q 1 0^r,
• encode T as 111 code₁ 11 code₂ 11 … 11 code_k 111 where codeᵢ (1 ≤ i ≤ k) are the encoded transitions of T.
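As a quick sanity check, here is a sketch (ours) of the encoder:

def encode_tm(transitions):
    """transitions: dict (i, j) -> (p, q, r) for delta(s_i, gamma_j) =
    (s_p, gamma_q, D_r), all indices already numbered from 1."""
    def code(i, j, p, q, r):
        return "0"*i + "1" + "0"*j + "1" + "0"*p + "1" + "0"*q + "1" + "0"*r
    body = "11".join(code(i, j, *pqr) for (i, j), pqr in sorted(transitions.items()))
    return "111" + body + "111"

# a one-transition machine, delta(s1, gamma1) = (s2, gamma2, D2):
assert encode_tm({(1, 1): (2, 2, 2)}) == "111" + "010100100100" + "111"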
Definition 8.4.2: for Σ = {0,1} the diagonal language Ld ⊆ Σ* is defined as follows:
• let ⟨w₁, w₂, w₃, …⟩ = ⟨ε, 0, 1, 00, 01, 10, 11, 000, …⟩ be an enumeration of Σ* (i.e., a 1-1, onto mapping {1, 2, 3, …} → Σ*) in lexicographical order,
• regard each positive integer i as a description of Turing recognizer Tᵢ if encode(Tᵢ) = the binary expansion of i; if there is no such Turing recognizer, regard i as a description of the Turing machine with no transitions,
• then Ld = {wᵢ | wᵢ ∉ L(Tᵢ)}.

Theorem 8.4.9: the diagonal language Ld is not a partial Turing-recognizable language.
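The enumeration ⟨w₁, w₂, w₃, …⟩ runs by length, and lexicographically within each length; a generator (ours):

from itertools import count, islice, product

def words():
    """Yields {0,1}* in the canonical order <epsilon, 0, 1, 00, 01, ...>."""
    yield ""
    for n in count(1):
        for w in product("01", repeat=n):
            yield "".join(w)

assert list(islice(words(), 8)) == ["", "0", "1", "00", "01", "10", "11", "000"]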
A Universal Turing Machine

The universal machine U works over a tape alphabet that extends Σ = {0,1} with the marker symbols {L, R, X, Y, Z, A, B}. States of T are represented as uniform-length binary state numbers. [The slide pictures the starting configuration of U.]
An outline of the operation of the Universal Turing machine is:
• locate a match for the current state/symbol pair within the description of T,
• copy the portion of the description immediately following the match into the current state/symbol segment to update it,
• move the current (overprint) symbol to the marked position of T's tape, then access and mark the next position of T's tape and move that symbol to the current position,
• repeat these steps until no match is found.
Decision Problems

A decision problem is a family of yes/no questions p together with a “decision mapping” d: p → {yes, no}.

Membership problem for Turing machines: given a Turing machine T and an input w ∈ Σ* for T, is w ∈ L(T)?

The membership problem is a “2 parameter” decision problem, since to identify an instance of the problem, there are two independent arguments to be supplied.
Definition 9.2.1: a decision problem (p, d) is decidable (or solvable) if there is a “straightforward” encoding e(p) of the problem instances into sequences in some alphabet Σ so that d: e(p) → {yes, no} is a total Turing-computable function; otherwise (p, d) is undecidable (or unsolvable).

Since the outcomes are binary, a decision problem (p, d) can also be regarded as a language recognition problem: L(p) = {w ∈ e(p) | w = e(π) and d(π) = yes} ⊆ Σ*.
Theorem 9.2.1: the membership problem for Turing machines is undecidable; alternatively, the language L(U) = {encode(T)·w | T is a Turing machine and w ∈ L(T)} is not total Turing-recognizable.

Corollary 9.2.2: ¬Ld is (partial) Turing-recognizable.
The Halting Problem for Turing machines: given a Turing machine T and an input w ∈ Σ* for T, does T halt when started on w?

Theorem 9.2.3: the halting problem is undecidable for Turing machines.

The algorithm test for Turing machines: given a Turing machine T, does T halt for every input w ∈ Σ*?

Theorem 9.2.4: the algorithm test is undecidable.
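The standard proof of Theorem 9.2.3 is a diagonalization, sketched below in Python-flavored form (ours): if a total decider halts(p, w) existed, the program paradox would behave inconsistently on its own source text.

def halts(prog_source: str, arg: str) -> bool:
    """Hypothetical total decider for the halting problem; Theorem 9.2.3
    shows it cannot exist."""
    raise NotImplementedError

def paradox(src: str) -> None:
    if halts(src, src):      # if the program halts on its own description ...
        while True:          # ... diverge;
            pass
    # ... otherwise halt immediately.

# If p is the source text of paradox: paradox(p) halts iff halts(p, p) is
# False, i.e., iff paradox(p) does not halt -- a contradiction either way.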
The emptiness problem for Turing machines (also raised for grammars, etc.): given a Turing machine T, is L(T) = ∅? The language version of this problem is Le = {encode(T) | L(T) = ∅}, where Turing machine T is encoded by previously discussed means.

The non-emptiness problem for Turing machines (also raised for grammars, etc.): given a Turing machine T, is L(T) ≠ ∅? The language version of this problem is Lne = {encode(T) | L(T) ≠ ∅}.
Theorem 9.2.5: Lne is (partial) Turing-recognizable.

Theorem 9.2.6: Le is not total Turing-recognizable (i.e., the emptiness problem is undecidable).

Corollary 9.2.7: Lne is not total Turing-recognizable (i.e., the non-emptiness problem is undecidable), and Le is not (partial) Turing-recognizable.

Theorem 9.2.9: it is undecidable for a Turing machine T whether or not L(T) is regular.
Post's Correspondence Problem

Definition 9.3.1: Post's Correspondence Problem (PCP) is the decision problem: “given a finite alphabet Σ and two k-tuples (k ≥ 1) of sequences A = ⟨a₁, a₂, …, aₖ⟩ and B = ⟨b₁, b₂, …, bₖ⟩ where aᵢ, bᵢ ∈ Σ*, do there exist positive integers i₁, i₂, …, iₘ (m ≥ 1, 1 ≤ iⱼ ≤ k), called a solution, so that aᵢ₁ aᵢ₂ … aᵢₘ = bᵢ₁ bᵢ₂ … bᵢₘ?”

Example 9.3.1. a₁ a₂ a₃ … = 10101101… and b₁ b₂ b₃ … = 101011011…
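Candidate solutions can at least be enumerated; the brute-force sketch below (ours, run on a toy instance of our own rather than the partially elided Example 9.3.1) is capped at a maximum sequence length, since no bound on m can be promised in general.

from itertools import product

def pcp_solution(A, B, max_len=8):
    """Try every index sequence i1..im (reported 1-based) up to length
    max_len; returns a solution or None if none exists within the cap."""
    k = len(A)
    for m in range(1, max_len + 1):
        for idx in product(range(k), repeat=m):
            if "".join(A[i] for i in idx) == "".join(B[i] for i in idx):
                return [i + 1 for i in idx]
    return None

# toy instance: A = <10, 011>, B = <100, 11> has the solution 1, 2
assert pcp_solution(["10", "011"], ["100", "11"]) == [1, 2]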