430 likes | 670 Views
National University of Computer and Emerging Sciences . Relational Algebra. Lecture # 6 June 30 ,2012. Chapter Outline. Relational Algebra Unary Relational Operations Relational Algebra Operations From Set Theory Binary Relational Operations Additional Relational Operations
E N D
National University of Computer and Emerging Sciences Relational Algebra Lecture # 6 June 30 ,2012
Chapter Outline • Relational Algebra • Unary Relational Operations • Relational Algebra Operations From Set Theory • Binary Relational Operations • Additional Relational Operations • Examples of Queries in Relational Algebra
RENAME OPEARATION • Rename operator is denoted by (rho) • Rename operation can be expressed as: • S(R) rename the relation to S • (B1, B2, …, Bn )(R) rename the attributes to B1, B2, …..Bn • S (B1, B2, …, Bn )(R) rename both relation to S, and attributes to B1, B1, …..Bn • Example: • RESULT(F, M, L, S, B, A, S, SL, SS, DNO) RESULT (F,M,L,S,B,A,S,SL,SS, DNO)(DEP5_EMPS)
Union (Binary Operation) • The result of R S, is a relation that includes all tuples that are either in R or in S or in both R and S • Duplicate tuples are eliminated • The two relations R and S must be “type compatible” (or Union compatible) • R and S must have same number of attributes • Each pair of corresponding attributes must have same or compatible domains
UNION Example • To retrieve the social security numbers of all employees who either work in department 5 or directly supervise an employee who works in department 5 DEP5_EMPS DNO=5 (EMPLOYEE) RESULT1 SSN(DEP5_EMPS) RESULT2(SSN) SUPERSSN(DEP5_EMPS) RESULT RESULT1 RESULT2
INTERSECTION And SET DIFFERENCE (Binary Operations) • INTERSECTION operation: the result of R S, is a relation that includes all tuples that are in both R and S • SET DIFFERENCE operation: the result of R – S, is a relation that includes all tuples that are in R but not in S • Two relations R and S must be “type compatible”
Relational Algebra Operations from Set Theory • Both and are commutative operations • R S = S R, and R S = S R • Both and can be treated as n-ary operations • R (S T) = (R S) T • (R S) T = R (S T) • Minus operation is not commutative • R – S ≠ S – R
Example to illustrate the result of UNION, INTERSECT, and DIFFERENCE
CARTESIAN PRODUCT • The result of Cartesian product of two relations R(A1, A2, . . ., An) x S(B1, B2, . . ., Bm) is given as: Result(A1, A2, . . ., An, B1, B2, . . ., Bm) • Let |R| = nR and |S| = nS , then |R x S|= nR * nS • R and S may NOT be "type compatible” • Cross Product is a meaningful operation only if it is followed by other operations
Example (not meaningful): • F SEX=’F’(EMPLOYEE) • EN FNAME, LNAME, SSN (F) • E_DP EN x DEPENDENT F EN Problem: Retrieve a list of each female employee’s dependents E_DP Example (meaningful): F SEX=’F’(EMPLOYEE) EN FNAME, LNAME, SSN (F) E_DP EN x DEPENDENT A_DP SSN=ESSN(E_DP) R FNAME, LNAME, DEPENDENT_NAME(A_DP) A_DP R
Relational Algebra Operators • Relational Algebra consists of several groups of operations • Unary Relational Operations • SELECT (symbol: (sigma)) • PROJECT (symbol: (pi)) • RENAME (symbol: (rho)) • Relational Algebra Operations From Set Theory • UNION ( ), INTERSECTION ( ), DIFFERENCE (–) • CARTESIAN PRODUCT ( x ) • Binary Relational Operations • JOIN (several variations of JOIN exist) • DIVISION • Additional Relational Operations • OUTER JOINS, OUTER UNION • AGGREGATE FUNCTIONS (These compute summary of information: for example, SUM, COUNT, AVG, MIN, MAX)
JOIN(Binary Operation) • JOIN Operation is denoted by • JOIN combines the sequence of CARTESIAN PRODECT and SELECT into a single operation • It allows us to combine related tuples from various relations • The general form of a join operation on two relations R(A1, A2, . . ., An) and S(B1, B2, . . ., Bm) is: R <join condition>S
Example of JOIN operation • Retrieve the name of the manager of each department. DEPT_MGR DEPARTMENT MGRSSN=SSN EMPLOYEE • The join condition can also be specified as DEPARTMENT.MGRSSN= EMPLOYEE.SSN
Some properties of JOIN • Consider the following JOIN operation: • R(A1, A2, . . ., An) S(B1, B2, . . ., Bm) R.Ai=S.Bj • Result is a relation Q with degree n + m attributes: • Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order. • The resulting relation state has one tuple for each combination of tuples—r from R and s from S, but only if they satisfy the join condition r[Ai]=s[Bj] • Hence, if R has nRtuples, and S has nStuples, then the join result will generally have less thannR * nStuples.
Theta-join • The general case of JOIN operation is called a Theta-join: R S theta • Theta is a Boolean expression on the attributes of R and S; for example: • R.Ai<S.Bj AND (R.Ak=S.Bl OR R.Ap<S.Bq) • Theta can have any comparison operators {=,≠,<,≤,>,≥,}
Equi-Join • The join condition that involves only equality comparison operator =, is called an EQUIJOIN. • The result of EQUIJOIN always have one or more pairs of attributes (names need not be identical) that have identical values in every tuple. • Example: • DEPT_MGR DEPARTMENT MGRSSN=SSN EMPLOYEE • Retrieve a list of each female employee’s dependents F SEX=’F’(EMPLOYEE) EN FNAME, LNAME, SSN (F) E_DP EN DEPENDENT SSN=ESSN
Issue with Equijoin Operation • Superfluous column
NATURAL JOIN Operation • NATURAL JOIN operation (denoted by *) is created to get rid of the superfluous attribute in an EQUIJOIN condition. • The two join attributes, or each pair of corresponding join attributes must have the same name in both relations • If this is not the case, a renaming operation is applied first. • Example: To apply a natural join on the DNUMBER attributes of DEPARTMENT and DEPT_LOCATIONS, it is sufficient to write: • DEPT_LOCS DEPARTMENT * DEPT_LOCATIONS • Only attribute with the same name is DNUMBER • An implicit join condition is created based on this attribute: DEPARTMENT.DNUMBER=DEPT_LOCATIONS.DNUMBER
Example:Natural Join • Another example: Q R(A,B,C,D) * S(C,D,E) • The implicit join condition includes each pair of attributes with the same name, “AND” together: • R.C=S.C AND R.D.S.D • Result keeps only one attribute of each such pair: • Q(A,B,C,D,E)
Complete Set of Relational Operations • The set of operations including SELECT , PROJECT , UNION , DIFFERENCE - , RENAME , and CARTESIAN PRODUCT X is called a complete set because any other relational algebra expression can be expressed by a combination of these five operations. • For example: • R S = (R S ) – ((R - S) (S - R)) • R <join condition>S = <join condition> (R X S)
DIVISION (Binary Operation) • The division operation is applied to two relations • R(Z) S(X), where X Z. • Let Y = Z – X • We have Z = X Y and Y is a set of attributes of R that are not the attributes of S. • The result of DIVISION is a relation T(Y) • For a tuple t to appear in the result T of the DIVISION, the values in t must appear in R in combination with everytuple in S.
Example of DIVISION • Retrieve all employees who work on all the project that John Smith works on • Smith fname=‘John’ and lname=‘Smith’ (Employee) • Smith_Pnos Pno (Works_onessn=ssn Smith) • Ssn_Pnos Essn,Pno (Works_on) • SSNS(ssn) Ssn_PnosSmith_Pnos
Query Tree • An internal data structure to represent a query • Standard technique for estimating the work involved in executing the query, the generation of intermediate results, and the optimization of execution • Nodes stand for operations like selection, projection, join, renaming, division, …. • Leaf nodes represent base relations • A tree gives a good visual feel of the complexity of the query and the operations involved • Algebraic Query Optimization consists of rewriting the query or modifying the query tree into an equivalent tree.
Additional Relational Operations: Aggregate Functions and Grouping • A type of request that cannot be expressed in the basic relational algebra is to specify mathematical aggregate functions on collections of values from the database. • Examples: • retrieve the average or total salary of all employees • Retrieve total number of employee tuples. • Common functions applied to collections of numeric values include • SUM, AVERAGE, MAXIMUM, and MINIMUM. • COUNT function is used for counting tuples or values.
Aggregate Function Operation • Use of the Aggregate Functional operation ℱ • ℱMAX Salary (EMPLOYEE) retrieves the maximum salary value from the EMPLOYEE relation • ℱMIN Salary (EMPLOYEE) retrieves the minimum Salary value from the EMPLOYEE relation • ℱSUM Salary (EMPLOYEE) retrieves the sum of the Salary from the EMPLOYEE relation • ℱCOUNT SSN, AVERAGE Salary (EMPLOYEE) computes the number of employees and their average salary • Note: count just counts the number of rows, without removing duplicates
Using Grouping with Aggregation • Grouping can be combined with Aggregate Functions • Example: For each department, retrieve the DNO, COUNT SSN, and AVERAGE SALARY • A variation of aggregate operation ℱ • DNO ℱCOUNT SSN, AVERAGE Salary (EMPLOYEE) • This operation groups employees by DNO and computes the count of employees and average salary per department
Recursive Closure Operation • Recursive Closure Operations cannot be specified in general using Relational Algebra • Example of a recursive operation is to retrieve all SUPERVISEES of an EMPLOYEE e at all levels — that is, • all EMPLOYEE e’ directly supervised by e; • all employees e’’ directly supervised by each employee e’; • all employees e’’’ directly supervised by each employee e’’; • and so on. • Although it is possible to retrieve employees at each level and then take their union, we cannot, in general, specify a query such as “retrieve the supervisees of ‘James Borg’ at all levels” without utilizing a looping mechanism. • The SQL3 standard includes syntax for recursive closure.
Outer Join Operation • In INNER JOIN, tuples without a matchingtuple are eliminated from the join result • Tuples with null are also eliminated • This amounts to loss of information. • OUTER joins operations are used when we want to keep all the tuples in R, or all those in S, or all those in both relations in the result of the join, regardless of whether or not they have matching tuples in the other relation.
Outer Join Operation • Left outer join: keeps every tuple in the first or left relation R in R S • if no matching tuple is found in S, then the attributes of S in the join result are filled with null values. • Right outer join: keeps every tuple in the second or right relation S in the result of R S. • Full outer join: keeps all tuplesin both the left and the right relations. It is denoted by
Left Outer Join • List the employees name and the department name that they manage. If they don’t manage one, then indicate this with a null value. • Temp (Employee Ssn=Mgr_Ssn Department) • Result Fname, Minit, Lname, Dname(Temp)
OUTER UNION Operations • The outer union operation take the union of tuples in two relations R(X, Y) and S(X, Z) that are partially compatible, • That is, only some of their attributes, say X, are type compatible. • The attributes that are type compatible are represented only once in the result, and • The attributes that are not type compatible from either relation are also kept in the result relation T(X, Y, Z).
Outer Join Example • An outer union can be applied to two relations whose schemas are STUDENT(Name, SSN, Department, Advisor) and INSTRUCTOR(Name, SSN, Department, Rank). • Tuples from the two relations are matched based on having the same combination of values of the shared attributes— Name, SSN, Department. • If a student is also an instructor, both Advisor and Rank will have a value; otherwise, one of these two attributes will be null. • The result relation STUDENT_OR_INSTRUCTOR will have the following attributes: STUDENT_OR_INSTRUCTOR (Name, SSN, Department, Advisor, Rank)
Examples of Queries in Relational Algebra • Q1: Retrieve the name and address of all employees who work for the ‘Research’ department. RESEARCH_DEPT DNAME=’Research’ (DEPARTMENT) RESEARCH_EMPS (RESEARCH_DEPT DNUMBER= DNOEMPLOYEEEMPLOYEE) RESULT FNAME, LNAME, ADDRESS (RESEARCH_EMPS)
Examples of Queries in Relational Algebra • Q6: Retrieve the names of employees who have no dependents. ALL_EMPS SSN(EMPLOYEE) EMPS_WITH_DEPS(SSN) ESSN(DEPENDENT) EMPS_WITHOUT_DEPS (ALL_EMPS - EMPS_WITH_DEPS) RESULT LNAME, FNAME (EMPS_WITHOUT_DEPS * EMPLOYEE)
Examples of Queries in Relational Algebra • Q5: Retrieve the names of all employees with two or more dependents. T1(SsnNo_of_dependents) Essn ℱ COUNT Dependent_name(DEPENDENT) T2No_of_dependents >1(T1) RESULT LNAME, FNAME (T2 * EMPLOYEE)
Chapter Summary • Relational Algebra • Unary Relational Operations • Relational Algebra Operations From Set Theory • Binary Relational Operations • Additional Relational Operations • Examples of Queries in Relational Algebra