380 likes | 388 Views
Transactions, Relational Algebra, XML. February 11 th , 2004. Transactions. Transaction = group of statements that must be executed atomically Transaction properties: ACID ATOMICITY = all or nothing CONSISTENCY = leave database in consistent state
E N D
Transactions, Relational Algebra, XML February 11th, 2004
Transactions • Transaction = group of statements that must be executed atomically • Transaction properties: ACID • ATOMICITY = all or nothing • CONSISTENCY = leave database in consistent state • ISOLATION = as if it were the only transaction in the system • DURABILITY = store on disk !
Transactions in SQL • In “ad-hoc” SQL: • Default: each statement = one transaction • In “embedded” SQL: BEGIN TRANSACTION [SQL statements] COMMIT or ROLLBACK (=ABORT)
Transactions: Serializability Serializability = the technical term for isolation • An execution is serial if it is completely before or completely after any other function’s execution • An execution is serializable if it equivalent to one that is serial • DBMS can offer serializability guarantees
Serializability • Enforced with locks, like in Operating Systems ! • But this is not enough: User 1 User 2 LOCK A [write A=1] UNLOCK A . . .. . .. . .. . . LOCK B [write B=2] UNLOCK B LOCK A [write A=3] UNLOCK A LOCK B [write B=4] UNLOCK B time What is wrong ?
Serializability • Solution: two-phase locking • Lock everything at the beginning • Unlock everything at the end • Read locks: many simultaneous read locks allowed • Write locks: only one write lock allowed • Insert locks: one per table
Isolation Levels in SQL • “Dirty reads” SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED • “Committed reads” SET TRANSACTION ISOLATION LEVEL READ COMMITTED • “Repeatable reads” SET TRANSACTION ISOLATION LEVEL REPEATABLE READ • Serializable transactions (default): SET TRANSACTION ISOLATION LEVEL SERIALIZABLE Reading assignment: chapter 8.6
Relational Algebra • Formalism for creating new relations from existing ones • Its place in the big picture: Declartivequerylanguage Algebra Implementation Relational algebraRelational bag algebra SQL,relational calculus
Relational Algebra • Five operators: • Union: • Difference: - • Selection: s • Projection: P • Cartesian Product: • Derived or auxiliary operators: • Intersection, complement • Joins (natural,equi-join, theta join, semi-join) • Renaming: r
1. Union and 2. Difference • R1 R2 • Example: • ActiveEmployees RetiredEmployees • R1 – R2 • Example: • AllEmployees -- RetiredEmployees
What about Intersection ? • It is a derived operator • R1 R2 = R1 – (R1 – R2) • Also expressed as a join (will see later) • Example • UnionizedEmployees RetiredEmployees
3. Selection • Returns all tuples which satisfy a condition • Notation: sc(R) • Examples • sSalary > 40000(Employee) • sname = “Smithh”(Employee) • The condition c can be =, <, , >,, <>
Find all employees with salary more than $40,000. s Salary > 40000(Employee)
4. Projection • Eliminates columns, then removes duplicates • Notation: P A1,…,An(R) • Example: project social-security number and names: • PSSN, Name (Employee) • Output schema: Answer(SSN, Name)
5. Cartesian Product • Each tuple in R1 with each tuple in R2 • Notation: R1 R2 • Example: • Employee Dependents • Very rare in practice; mainly used to express joins
Relational Algebra • Five operators: • Union: • Difference: - • Selection: s • Projection: P • Cartesian Product: • Derived or auxiliary operators: • Intersection, complement • Joins (natural,equi-join, theta join, semi-join) • Renaming: r
Renaming • Changes the schema, not the instance • Notation: rB1,…,Bn (R) • Example: • rLastName, SocSocNo (Employee) • Output schema: Answer(LastName, SocSocNo)
Renaming Example Employee Name SSN John 999999999 Tony 777777777 • LastName, SocSocNo (Employee) LastName SocSocNo John 999999999 Tony 777777777
Natural Join • Notation: R1 ⋈ R2 • Meaning: R1 ⋈ R2 = PA(sC(R1 R2)) • Where: • The selection sC checks equality of all common attributes • The projection eliminates the duplicate common attributes
Employee Dependents = PName, SSN, Dname(sSSN=SSN2(Employee x rSSN2, Dname(Dependents)) Natural Join Example Employee Name SSN John 999999999 Tony 777777777 Dependents SSN Dname 999999999 Emily 777777777 Joe Name SSN Dname John 999999999 Emily Tony 777777777 Joe
Natural Join • R= S= • R ⋈ S=
Natural Join • Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S ? • Given R(A, B, C), S(D, E), what is R ⋈ S ? • Given R(A, B), S(A, B), what is R ⋈ S ?
Theta Join • A join that involves a predicate • R1 ⋈q R2 = sq (R1 R2) • Here q can be any condition
Eq-join • A theta join where q is an equality • R1 ⋈A=B R2 = s A=B (R1 R2) • Example: • Employee ⋈SSN=SSN Dependents • Most useful join in practice
Semijoin • R ⋉ S = PA1,…,An (R ⋈ S) • Where A1, …, An are the attributes in R • Example: • Employee ⋉ Dependents
Semijoins in Distributed Databases • Semijoins are used in distributed databases Dependents Employee network Employee ⋈ssn=ssn (s age>71 (Dependents)) T = PSSNs age>71 (Dependents) R = Employee ⋉ T Answer = R ⋈ Dependents
seller-ssn=ssn pid=pid buyer-ssn=ssn Complex RA Expressions Pname Person Purchase Person Product Pssn Ppid sname=fred sname=gizmo
Operations on Bags A bag = a set with repeated elements All operations need to be defined carefully on bags • {a,b,b,c}{a,b,b,b,e,f,f}={a,a,b,b,b,b,b,c,e,f,f} • {a,b,b,b,c,c} – {b,b,c,c,c,d} = {a,b} • sC(R): preserve the number of occurrences • PA(R): no duplicate elimination • Cartesian product, join: no duplicate elimination Important ! Relational Engines work on bags, not sets ! Reading assignment: 5.3 – 5.4
Finally: RA has Limitations ! • Cannot compute “transitive closure” • Find all direct and indirect relatives of Fred • Cannot express in RA !!! Need to write C program
XML • eXtensible Markup Language • XML 1.0 – a recommendation from W3C, 1998 • Roots: SGML (a very nasty language). • After the roots: a format for sharing data
Why XML is of Interest to Us • XML is just syntax for data • Note: we have no syntax for relational data • But XML is not relational: semistructured • This is exciting because: • Can translate any data to XML • Can ship XML over the Web (HTTP) • Can input XML into any application • Thus: data sharing and exchange on the Web
XML Data Sharing and Exchange application application object-relational Integrate XML Data WEB (HTTP) Transform Warehouse application relational data legacy data Specific data management tasks
From HTML to XML HTML describes the presentation
HTML <h1> Bibliography </h1> <p> <i> Foundations of Databases </i> Abiteboul, Hull, Vianu <br> Addison Wesley, 1995 <p> <i> Data on the Web </i> Abiteoul, Buneman, Suciu <br> Morgan Kaufmann, 1999
XML <bibliography> <book> <title> Foundations… </title> <author> Abiteboul </author> <author> Hull </author> <author> Vianu </author> <publisher> Addison Wesley </publisher> <year> 1995 </year> </book> … </bibliography> XML describes the content