550 likes | 895 Views
Banner and the SQL Select Statement: Part Three (Joins). Mark Holliday Department of Mathematics and Computer Science Western Carolina University 4 November 2005 and 11 November 2005 (updated: 11 November 2005). Outline. The Goal The Concepts A First Example Single Table Selects
E N D
Banner and the SQL Select Statement: Part Three (Joins) Mark Holliday Department of Mathematics and Computer Science Western Carolina University 4 November 2005 and 11 November 2005 (updated: 11 November 2005)
Outline • The Goal • The Concepts • A First Example • Single Table Selects • Joins • Multiple Connected Select Statements
A First Example • Outline • The Relational Model: Single Table • Lab 1: TOAD, Schema Browser • Some Structured Query Language (SQL) Basics • Lab 2: TOAD, SQL Editor
Single Table Selects • Outline • WHERE clause: single condition, multiple conditions • Lab 3: • Order By; Aggregate Functions • Lab 4: • Group By; Having • Lab 5:
Joins • Outline • Why Multiple Tables? • Inner Joins • Lab 6: • Outer joins • Lab 7:
Why Multiple Tables? (franz) One Table database: keeps track of all purchases at our store (known as the ‘flat file’)
Every time a new row is inserted into the table, all columns will be updated. This results in unnecessary "redundant data". For example, every time Wolfgang Schultz purchases something, the following rows will be inserted into the table: Why Multiple Tables? (franz)
What happens if we ONE DAY find outWolfgang Schultz’s last name is really spelled “S-h-o-o-l-t-z-e” instead of “S-c-h-u-l-t-z”? (franz)He has purchased LOTS of stuff from us.And…. He has been making purchases at our store for over five years. In fact, he is our very BEST customer. His purchases provide for most of our gross sales.But now… He says that he will take his business elsewhere if we don’t get this problem corrected!And YOU have to get it corrected NOW!!! Why Multiple Tables?
Hold that thought ! Why Multiple Tables? (franz) This helps to exemplify WHY relational database table structures are so nice ...
Why Multiple Tables? (franz) Original ‘flat file’ table:
Why Multiple Tables? (franz) For this example, an ideal database would have two tables: • One for keeping track of customer information (customer_info) • And the other to keep track of what they purchase (purchases). customer_info table: purchases table:
Our new tables: (franz) The ‘id’ number is the common element that ties the tables together.
Why Multiple Tables? (franz) • After redesigning our data structure into two tables, • whenever a change for the name or address needs to be made to our repeating customer, • we update the “customer_info” table only. • Only the second table, “purchases”, needs to be updated when purchases are made.
Why Multiple Tables? (franz) We've just eliminated the space-taking, time-consuming, useless redundant data of our flat file. That is, we've just NORMALIZED this database! This is the basis for relational database structure.
Joins: Introduction Unfortunately, having multiple tables introduces a problem. Problem: Information you need for a query is often in more than one table. Example: Find the zip code of the person whose first name is “Ab” and last name is “Mazlan”. • need the spraddr table for the zip code and • need the spriden table for the name
Joins: Introduction Solution: Need a way to temporarily (just in the query) connect a row in one table with the row (or rows) in another table that “match” it. Refined Solution: Use some type of JOIN operation
Joins: Introduction Problem: How do we find the “matching” rows in the other table? • In other words, how do we do a join? Solution: • Conceptually, match each row in the left table with every row in the right table. • take the Cartesian Product of the two tables • Only keep the row pairs that “match”.
Joins: Introduction (Step One of Join) Cartesian Product --
Joins: Introduction (Step Two of Join) only keep rows that have pidm1 = pidm2
Joins: Introduction Questions: • When does a row from the left table “match” a row from the right table? • What to do with a row in one table that does not match a row in the other table?
Joins: Introduction Question One: When does a row from the left table “match” a row from the right table? Answer: The two rows match if the condition expression is TRUE. • The expression is a sequence of conditions connected by AND. • Each condition is a comparison of a left table column and a right table column • The comparison operator is equality.
Joins: Introduction Example condition expression • just one condition spriden.spriden_pidm = spraddr.spraddr_pidm
Joins: Introduction Good relational database design => the only common column in two tables is the key for each table => use the key of each table to get from a left table row to the matching right table row • they have the same key value
Joins: Introduction Question Two: What to do with a row in one table that does not match a row in the other table? • ignore them: inner join • don’t ignore them: outer join • This section covers inner joins
INNER Join (franz) • An example using reserved words to indicate a join: • SELECT spriden_last_name, spriden_first_name, spraddr_city, spraddr_stat_code, spraddr_zip • FROM spriden INNER JOIN spraddr • ON spriden_pidm = spraddr_pidm • Implicitly, • take cartesian product of spriden and spraddr • 2) only keep the rows that meet the condition
INNER Join (franz) Syntax for a join using reserved words SELECT "list-of-columns“ FROMtable1 [“type”JOIN] table2 ON “field matching” WHERE "search-condition(s)"
INNER Join (franz) Table ‘spriden’: Table ‘spraddr’:
INNER Join (franz) • SELECT spriden_last_name, spriden_first_name, spraddr_city, spraddr_stat_code, spraddr_zip • FROM spriden INNER JOIN spraddr • ON spriden_pidm = spraddr_pidm
Task: Select from ‘spriden’ for the active record (spriden_change_ind is null). Pull the permanent mailing address types (‘MA’) in the US (spraddr_natn_code is null). INNER Join (franz) This will pull a large amount of data. Let’s create a smaller subset by picking zip codes on the northern East Coast (zip codes starting with ‘0’). Let’s also order our output by the last name then first name.
INNER Join (franz) To accomplish this we would take the previous SQL and make these additions: SELECT spriden_last_name, spriden_first_name, spraddr_city, spraddr_stat_code, spraddr_zip FROM spriden INNER JOIN spraddr ON spriden_pidm = spraddr_pidm WHERE spriden_change_ind is null and spraddr_atyp_code = 'MA' and spraddr_zip between '000%' and '1%' and spraddr_natn_code is null ORDER BY spriden_last_name, spriden_first_name
INNER Join (franz) SELECTsubstr(spriden_last_name,1,12) || ', ' ||substr(spriden_first_name,1,12), rtrim(spraddr_city) || ', ' || spraddr_stat_code || ' ' || spraddr_zip FROM spriden INNER JOIN spraddr ON spriden_pidm = spraddr_pidm WHERE spriden_change_ind is null and spraddr_atyp_code = 'MA' and spraddr_zip between '000%' and '1%' and spraddr_natn_code is null ORDER BY spriden_last_name, spriden_first_name
INNER Join (franz) Part of the Resultant Subset Would Be:
If we had left out the check for a null change indicator, we would have received duplicate rows of data. (We would get the Gary Abbot record with a null indicator, and also the one with an “I” indicator.) INNER Join (franz) SELECTsubstr(spriden_last_name,1,12) || ', ' ||substr(spriden_first_name,1,12), rtrim(spraddr_city) || ', ' || spraddr_stat_code || ' ' || spraddr_zip FROM spriden INNER JOIN spraddr ON spriden_pidm = spraddr_pidm WHERE spraddr_atyp_code = 'MA' and spraddr_zip between '000%' and '1%' and spraddr_natn_code is null …spriden_change_ind is null ORDER BY spriden_last_name, spriden_first_name
SELF Join The left table and the right table in an inner join might be the same table. This is called a SELF Join. When would this be useful?
SELF Join (franz) ‘spiffy’ table: Syntax: SELECT e.first_name EMPLOYEE_FIRST, e.last_name EMPLOYEE_LAST, m.first_name MANAGER_FIRST, m.last_name MANAGER_LAST FROM spiffy e INNER JOIN spiffy m ON e.manager_id = m.employee_id
SELF Join Previous Inner Join examples: • instances of two different tables joining on the same column in both tables (e.g. pidm) The Self Inner Join example: • two instances of the same table joining on different columns of the same table (e.g. manager_id and employee_id) SELECT e.first_name EMPLOYEE_FIRST, e.last_name EMPLOYEE_LAST, m.first_name MANAGER_FIRST, m.last_name MANAGER_LAST FROM spiffy e INNER JOIN spiffy m ON e.manager_id = m.employee_id
SELF Join (franz) Result:
Laboratory Six • Objectives: • Develop competence with inner joins • Steps: • First Query
Laboratory Six • Problem: Find the area code and phone number for every one whose last name is “Holliday.” • Hint: Use the spriden and sprtele tables.
Laboratory Six • Solution One: • SELECT sprtele_phone_area, sprtele_phone_number • FROM spriden INNER JOIN sprtele • ON spriden_pidm = sprtele_pidm • WHERE spriden_last_name = ‘Holliday’ • Solution Two: • SELECT sprtele_phone_area, sprtele_phone_number • FROM spriden, sprtele • WHERE spriden_pidm = sprtele_pidm and • spriden_last_name = ‘Holliday’
OUTER Join (franz) Notice that ‘spriden’ has records for Bagnall, Baker and Barksdale. ‘Spbpers’ has records for Bagnall and Barksdale, but not Baker.
OUTER Join (franz) If we join the tables by an inner join on pidm, we would only receive rows for Bagnall and Barksdale. To pick up all the records in ‘spriden’, and join it with the data in ‘spbpers’ we would use an outer join.
OUTER Join (franz) OUTER Joins can be split into three types: • LEFT outer join • RIGHT outer join • FULL outer join “Left” or “Right” designates what your base table is for the join. “Full” returns all rows in both tables.
LEFT OUTER Join (franz) SELECT spriden_last_name, spriden_first_name, spbpers_sex, spbpers_birth_date FROM spriden LEFT OUTER JOIN spbpers ON spriden_pidm = spbpers_pidm WHERE spriden_change_ind is null and spriden_entity_ind = 'P' ORDER BY spriden_last_name, spriden_first_name
LEFT OUTER Join (franz) Result:
RIGHT OUTER Join (franz) SELECT spriden_last_name, spriden_first_name, spbpers_sex, spbpers_birth_date FROM spriden RIGHT OUTER JOIN spbpers ON spriden_pidm = spbpers_pidm WHERE spriden_change_ind is null and spriden_entity_ind = 'P' ORDER BY spriden_last_name, spriden_first_name
RIGHT OUTER Join (franz) Result:
FULL OUTER Join (franz) SELECT spriden_last_name, spriden_first_name, spbpers_sex, spbpers_birth_date FROM spriden FULL OUTER JOIN spbpers ON spriden_pidm = spbpers_pidm WHERE spriden_change_ind is null and spriden_entity_ind = 'P' ORDER BY spriden_last_name, spriden_first_name
FULL OUTER Join (franz) Result: PIDM added to indicate which rows were selected.