1 / 16

Extraction, Transformation, and Loading (ETL)

Extraction, Transformation, and Loading (ETL). Loading. Objectives. After completing this lesson, you should be able to implement the following methods that are available for loading data: SQL*Loader External tables OCI and direct-path APIs Data Pump Export/import. Data-Loading Mechanisms.

maine
Download Presentation

Extraction, Transformation, and Loading (ETL)

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Extraction, Transformation,and Loading (ETL) Loading

  2. Objectives • After completing this lesson, you should be able to implement the following methods that are available for loading data: • SQL*Loader • External tables • OCI and direct-path APIs • Data Pump • Export/import

  3. Data-Loading Mechanisms • You can use the following mechanisms for loading a data warehouse: • SQL*Loader • External tables • OCI and direct-path APIs • Export/import • Data Pump

  4. Loading Mechanisms • SQL*Loader loads a formatted flat file into an existing table. • It can perform basic transformations while loading. • Direct-path loading may be used to decrease the load time. • When you use this method, data in the flat file is not accessible until the data is loaded.

  5. SQL*Loader: Example • Control file used for loading the SALES table: • The fact table can be loaded with the following command: • LOAD DATA INFILE sh_sales.dat APPEND INTO TABLE sales • FIELDS TERMINATED BY "|" • (PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID, QUANTITY_SOLD, AMOUNT_SOLD) • $ sqlldr sh/sh control=sh_sales.ctl \ direct=true

  6. Loading Mechanisms • External tables are read-only tables where the data is stored outside the database in flat files. • The data can be queried like a virtual table, using any supported language inside the database. • No DML is allowed and no indexes can be created. • The metadata for an external table is created using a CREATE TABLE statement. • An external table describes how the external data should be presented to the database.

  7. Applications of External Tables • External tables: • Allow external data to be queried and joined directly and in parallel without requiring it to be loaded into the database • Eliminate the need for staging the data within the database for ETL in data warehousing applications • Are useful in environments where an external source has to be joined with database objects and then transformed • Are useful when the external data is large and not queried frequently • Complement SQL*Loader functionalities: • Transparent parallelism • Full SQL capabilities for direct-path insertion

  8. Example of Defining External Tables CREATE TABLE sales_delta_xt ( prod_id NUMBER(6), cust_id NUMBER, time_id DATE,unit_cost, unit_price ...) ORGANIZATION external ( -- External Table TYPE oracle_loader–- Access Driver DEFAULT DIRECTORY data_dir–- Files Directory ACCESS PARAMETERS –- Similar to SQL*Loader ( RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII BADFILE log_dir:'sh_sales_%p.bad' LOGFILE log_dir:'sh_sales_%p.log_xt' FIELDS TERMINATED BY "|" LDRTRIM ) location ( 'sales_delta.dat', data_dir2:'sales_delta2.dat' )) PARALLEL 5 –- Independent from the number of files REJECT LIMIT UNLIMITED;

  9. Populating External Tables with Data Pump CREATE TABLE emp_ext (first_name, last_name, department_name) ORGANIZATION EXTERNAL ( TYPE ORACLE_DATAPUMP DEFAULT DIRECTORY ext_dir LOCATION ('emp1.exp',) ) AS SELECT e.first_name,e.last_name,d.department_name FROM employees e, departments d WHERE e.department_id = d.department_id AND d.department_name in ('Marketing', 'Purchasing');

  10. Privileges for External Tables • Access to the external tables for other users requires: • SELECT on the table definition • READ access to the directory containing the file • WRITE access to the directory for the bad file and log file GRANT SELECT ON sh.sales_delta_xt TO oe; GRANT READ ON DIRECTORY data_dir TO oe; GRANT WRITE ON DIRECTORY log_dir TO oe;

  11. Defining External TablesUsing SQL*Loader • After creating a control file, SQL*Loader can generate a log file with the SQL commands to: • Create the metadata for the external table • Insert the data into the target table • Drop the metadata for the external table sqlldr sh/sh control=sales_dec00.ctl EXTERNAL_TABLE=GENERATE_ONLY LOG=sales_dec00.sql

  12. Data Dictionary Information for External Tables • DBA_EXTERNAL_LOCATIONS • OWNER • TABLE_NAME • LOCATION • DIRECTORY_OWNER • DIRECTORY_NAME • DBA_DIRECTORIES • OWNER • DIRECTORY_NAME • DIRECTORY_PATH • DBA_EXTERNAL_TABLES • OWNER • NAME • TYPE_OWNER • TYPE_NAME • DEFAULT_DIRECTORY_OWNER • DEFAULT_DIRECTORY_NAME • REJECT_LIMIT

  13. Changing External Data Properties • Using the ALTER TABLE command, you can change: • DEFAULT DIRECTORY • ACCESS PARAMETERS • LOCATION • REJECT_LIMIT • Degree of parallelism • Useful in situations where external files are changing: ALTER TABLE sales_delta_xt LOCATION ('newfile1.dat')

  14. Other Loading Methods • OCI and direct-path APIs: • Allow transformation and loading at the same time • Access an online source • Do not require an intermediary step such as a flat file • Export/import: • Is good for small loads • Allows for easy transfers between Oracle databases on different operating systems

  15. Summary • In this lesson, you should have learned how to implement the following methods that are available for loading data: • SQL*Loader • External tables • OCI and direct-path APIs • Data Pump • Export/import

  16. Practice 5: Overview • This practice covers the following topics: • Loading data from a flat file using SQL*Loader • Loading data from a flat file using external tables

More Related