1 / 35

Deleting LOTS of Data

Deleting LOTS of Data. Author: Randy Cunningham, OCP. Why Use DELETE?. Retention policies; good data hygiene Data de-duplication Corporate mergers & acquisitions Purge and archival processing Cleaning out temporary & work tables Reclaim storage space Obtain better performance?….

jemima
Download Presentation

Deleting LOTS of Data

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Deleting LOTSof Data Author: Randy Cunningham, OCP

  2. Why Use DELETE? • Retention policies; good data hygiene • Data de-duplication • Corporate mergers & acquisitions • Purge and archival processing • Cleaning out temporary & work tables • Reclaim storage space • Obtain better performance?…

  3. What Are The Issues? • DELETE never finishes • DELETE takes too long • Locking • Impact on application performance • Desired space not released by DELETE • UNDO space & errors

  4. What Happens During a DELETE X

  5. MVLog X 5 1 2 3 6 4 X X Undo Insert… Redo What Happens During a DELETE

  6. Alternatives to DELETE • TRUNCATE TABLE • CTAS, truncate and reinsert data to keep • Use partitioning: • Truncate partition • Drop partition • Exchange partition • DELETE confined to one, or a few, partitions

  7. TRUNCATE TABLE • Upside: typically many times faster than using an unqualified DELETE operation • Downsides: • Does not operate in transactional context (it’s DDL) • You can’t rollback a TRUNCATE • Cannot truncate parent rows in a R.I. Constraint • It is DDL and requires DROP TABLE privileges

  8. Delegating TRUNCATE rights CREATE PROCEDURE trunc_do (fqtn VARCHAR2) IS BEGIN EXECUTE IMMEDIATE'TRUNCATE TABLE '||fqtn; END trunc_do; GRANT EXECUTE ON trunc_do TOPUBLIC;

  9. Delegating TRUNCATE rights CREATE PROCEDURE trunc_do (fqtn VARCHAR2) IS BEGIN EXECUTE IMMEDIATE'TRUNCATE TABLE '||fqtn; END trunc_do; GRANT EXECUTE ON trunc_do TO PUBLIC;

  10. Delegating TRUNCATE rights CREATE PROCEDURE trunc_do (fqtn VARCHAR2) IS BEGIN EXECUTE IMMEDIATE'TRUNCATE TABLE '||fqtn; END trunc_do; CREATE PROCEDURE trunc (owner IN VARCHAR2 DEFAULT USER, table_name IN VARCHAR) AUTHID CURRENT_USER IS fqtn VARCHAR2(80) := owner || '.' || table_name; BEGIN EXECUTE IMMEDIATE-- Ensure DELETE rights 'DELETE FROM ' || fqtn || ' WHERE 1=2'; trunc_do (fqtn); END trunc; GRANT EXECUTE ON trunc TOPUBLIC;

  11. Create Table as Select… • Use where most of the data are deleted • Good for sidestepping UNDO issues • Provides quick & simple fault resilience • Downside: requires intermediate storage

  12. Create Table as Select: Example CREATE TABLE student_temp AS SELECT * FROM student WHERE status_flag IN ('A','N','T') OR enroll_date >= to_date (’02/15/2006’, ‘MM/DD/YYYY’); TRUNCATE TABLE student; INSERT /*+APPEND*/ INTO student SELECT * FROM student_temp; DROP TABLE student_temp;

  13. Create Table as Select: Observations • Run steps manually, or provide thorough error handling • All limitations of TRUNCATE TABLE apply • Ensure there is enough space for the CTAS temporary holding table • Be sure to use APPEND hint in the INSERT • Does not work for tables with LONG data type (use SQL*Plus COPY command instead)

  14. Table Partitioning • DROP PARTITION – ideal for temporal data • TRUNCATE PARTITION – for cyclic data • EXCHANGE PARTITION – for archival • DELETE … [PARTITION] – a full scan of the partition is quicker than a full table scan

  15. Dropping a Table Partition • Quickly eliminates an entire range or list • Example: ALTER TABLE enrollment DROP PARTITION Year2001 UPDATE GLOBAL INDEXES; 2002 2003 2004 2005 2006

  16. Jun May Apr Mar Feb Jan Dec Nov Oct Sep Aug Jul Truncating a Table Partition • Quick, easy way to manage cyclic partitions • Example: ALTER TABLE enrollment TRUNCATE PARTITION mar_data UPDATE GLOBAL INDEXES; X

  17. Exchanging a Table Partition • Quick way to excise data from the table, but retain it for archival or backup: CREATE TABLE enrolled03_2005 AS SELECT * FROM enrollment WHERE 1=2; ALTER TABLE enrollment EXCHANGE PARTITION mar_data WITH TABLE enrolled03_2005 UPDATE GLOBAL INDEXES;

  18. DELETE on a Table Partition • If the predicate filters the partition key, then only specific partitions are scanned. • In this case, it is not necessary to specify PARTITION in the DELETE command. • Look for these in the EXPLAIN PLAN: • PARTITION RANGE SINGLE • PARTITION RANGE ITERATOR

  19. If DELETE You Must… • Check the EXPLAIN PLAN for the DELETE! • DELETE in batches • Get indexes out of the way • Use ROWID to minimize interference • Use ROWID to implement parallelism

  20. Check EXPLAIN PLAN for DELETE • If most rows are being deleted, you will want to see a full table scan. • If most blocks are being visited, you will want to see a full table scan. • Optimize subquery form DELETE to perform well as a stand-alone SELECT.

  21. DELETE in batches • Delete a million (or so) rows at a time: DELETE FROM evt WHERE status = 'X' AND rownum <= 1000000; • Delete an unrelated partition at a time: DELETE FROM evt PARTITION div162 WHERE status = 'X';

  22. Get Indexes Out of the Way • Old way: Drop the indexes and build them from scratch following deletion • New way: Alter the indexes unusable and rebuild them following deletion: ALTER INDEXu_name_ixUNUSABLE; DELETE FROM namesWHERE … ALTER INDEXu_name_ixREBUILD;

  23. Before You Tamper With Indexes… • Be certain an index isn’t needed to facilitate the access path for your DELETE • Ensure that a significant proportion of the rows in the table are affected • Test to be sure that it is worthwhile • Be sure that other database operations are not relying on the index (best done during a scheduled maintenance window)

  24. ROWID is your friend • Isolates query operation from DELETE • Minimizes number of block changes • Is blazingly fast and efficient • Overcomes key preserved table restrictions • Can facilitate DELETE workload mgmt: • Restartability • Home-brew parallelization • Batching

  25. ROWID Case Study 1 • We have identified extraneous rows: SELECT * FROM ( SELECT domain_owner, latest_uce, MAX(latest_uce) OVER (PARTITION BY domain_owner) very_latest FROM spammers) WHERE latest_uce < very_latest

  26. ROWID Case Study 1 DELETE FROM ( SELECT * FROM ( SELECT domain_owner, latest_uce, MAX(latest_uce) OVER (PARTITION BY domain_owner) very_latest FROM spammers) WHERE latest_uce < very_latest)

  27. ROWID Case Study 1 Result: ORA-01752: cannot delete from view without exactly one key-preserved table

  28. ROWID Case Study 1 DELETE FROM spammers WHERE ROWID IN ( SELECT ROWID FROM ( SELECT domain_owner, latest_uce, MAX(latest_uce) OVER (PARTITION BY domain_owner) very_latest FROM spammers) WHERE latest_uce < very_latest)

  29. ROWID Case Study 2 - Deduplication DELETEFROM Customers WHERE ROWID IN ( SELECT DISTINCT ROWID FROM (SELECT ROWID, MIN (ROWID) OVER (PARTITION BY Cust_Last_Name, Cust_First_name) Best FROM Customers) WHERE ROWID <> Best);

  30. ROWID Case Study 3 - Staging CREATE GLOBAL TEMPORARY TABLE Rowid_Tbl (Xrowid ROWID) ON COMMIT PRESERVE ROWS; INSERT INTO Rowid_Tbl SELECT DISTINCT ROWID FROM (SELECT ROWID, …FROMHuge_Table …); DELETEFROMHuge_Table WHERE ROWID IN (SELECT Xrowid FROMRowid_Tbl) ;

  31. ROWID Case Study 4 – Workload DEFINE N=8 -- # buckets desired CREATE TABLE Delete_Driver (rid ROWID, pctile NUMBER, PRIMARY KEY (rid)); INSERT /*+APPEND*/ INTO Delete_Driver SELECT ROWID rid, ntile(&N) OVER (ORDER BY ROWID) pctile FROM Warranty WHERE Warranty_Expiration < TO_DATE('01/01/2002','MM/DD/YYYY');

  32. ROWID Case Study 4 - Workload • Each batch, whether run in serial or parallel, operates this way: Define Job=4 -- set between 1 and 8 DELETE FROM (SELECT NULL FROM Warranty, Delete_Driver WHERE Warranty.ROWID = Delete_Driver.rid AND pctile=&Job); COMMIT;

  33. ROWID Caveats • Not for use in portable SQL… it is specific to Oracle • Don’t use stale ROWIDs… don’t keep them around permanently • Not for use where rows migrate constantly

  34. Questions?

  35. Questions

More Related