Sapere On Line - basic courses, professional and advanced

LINUX

     Is Linux Really FREE?
     Getting Linux
     So What Do I Gain?
     What Do I Lose?
     Getting a Free Copy
     Dual Booting
Preserving Your Data
Windows on Linux
Getting Ready for Your Installation
Hardware Considerations
Dual Booting Revisited
An Installation Comparison
A Mandrake Linux Install
Putting On The Red Hat
Rambler's 

Top100

 

Linux

Data Loading and
Management

In Chapter 24, “High-Speed Data Movement,” we covered the use of Streams, Transportable
Tablespaces, and the new high-speed import/export mechanism in Oracle Database 10g. In this
chapter, we’ll discuss facilities more typically used in batch loading. As noted in the previous
chapter, the scheduling and maintenance of loading processes are often managed by a database
administrator. A database administrator and database architect are usually directly involved in the
design of such loading scripts, since they must perform well and provide a true representation of
the data that business analysts desire. This chapter provides an introduction into the capabilities
that Oracle has in this area. You’ll likely want to further consult data warehousing and database
administration books on this topic.
Batch loads are typically used when very large data volumes are loaded and where data
transformations are necessary and complex. Oracle includes SQL*Loader, External Tables, and
other extraction, transformation, and loading (ETL) extensions to the database. We describe these
capabilities and provide some example scripts in this chapter. Since ETL scripts are increasingly
more likely to be created using ETL tools instead of through manual programming, we also
describe how you can use Oracle’s Warehouse Builder (OWB) tool to generate ETL scripts.
Aclassic approach to ETL includes processing each needed transformation separately with creation
and loading of temporary staging tables at each step. This approach provides a means to easily
determine where problems might occur in a series of ETL processes. The alternative approach,
used where load times and available disk space are a concern, is to combine multiple transformations
into a single SQL statement.
In this chapter, we describe many techniques for speed-up, including combining of transformations
and leveraging ETL functionality that Oracle has introduced in the database. We discuss how
minimizing steps and Oracle’s features can enable parallel pipelined data transformations to occur.
Parallel pipelining is an important concept in speed-up. Parallel pipelining enables loading to begin
while other data is still going through the transformation process. Leveraging pipelining and other
features outlined here can make a dramatic improvement in reducing the time it takes to extract,
transform, and load your data.
We’ll also describe how partitions in Oracle’s Partitioning Option can be leveraged to reduce these load
windows, and we’ll touch on the usage of Enterprise Manager for scheduling jobs and maintenance of
partitions.SQL*Loader
SQL*Loader enables loading data from flat files into an Oracle database. In legacy relational databases
where ODBC is not supported, or where nonrelational databases are used, dumping the data into a flat
file may provide the best means to make the data available for movement. Of course, you might have
your own set of flat files containing data from spreadsheets or other sources that you also want to load
into Oracle.
Higher-speed direct-path loading provided through SQL*Loader can be used where data transformations
between source and target are limited to data type conversions and handling NULLs. Where more complex
transformations are needed, you can write the transformations in SQL and use the slower conventional-path
loader. For additional speed-up, SQL*Loader can be set up to load jobs in parallel, providing scalable
performance gains by taking advantage of multiple CPUs that are common in the database server computer
platforms of today.
Direct path loading does have some restrictions. Tables to be loaded cannot have any transactions pending
and the tables cannot be clustered. When loading a single partition, a global index must not have been
applied. Referential and check constraints must not be enabled on the table that the partition is a member
of. Finally, enabled triggers are not allowed.
For purposes of illustration, assume you have a sales data file from October of 2003 that you want to
load into the sales fact table that is defined in the sh schema provided as part of the Oracle Database 10g
standard installation. The name of the flat source data file in our illustration will be salesOct2003.dat,
with fields in this file delimited by semicolons (;). A control file would need to be created that could be
called from SQL*Loader. You will name the control file sh_sales_fact.ctl. The control file is created
containing syntax similar to the following:
LOAD DATA
INFILE salesOct2003.dat
APPEND
INTO TABLE sh.sales
FIELDS TERMINATED BY “;”
(PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID,
QUANTITY_SOLD, AMOUNT_SOLD)
The data could be direct-path loaded using the following command from a valid administrator account:
sqlldr admin/password control=sh_sales_fact.ctl direct=true
640
Chapter 25
Alternatively, through Oracle Enterprise Manager (installed as part of a standard Oracle Database 10g
installation), you could run the control file by selecting the Maintenance tab and then the Load Data
from File link. This wizard-driven utility first asks you for the control filename (see Figure 25-1), the data
filename (if not specified in the control file), the load method, and options governing row skipping, load
termination, and indexes. The wizard then provides you with a summary of your selections. You can
then choose to submit the job or schedule it.SapereOnLine
 


ORACLE

Oracle Data Dictionary
Installing Oracle
Introduction to SQL
Extended SQL
Indexes
Constraints
Other Database Structures
Functions
Distributed Queries, Transactions, and Databases
PL/SQL Basics
PL/SQL and SQL
PL/SQL Packages
Introduction to Java Database Programming
Triggers
Regular Expressions and Expression Filter
Object Relational Interactions with Oracle
Oracle XML DB
HTML-DB
High-Speed Data Movement
Data Loading and Management
Business Intelligence Query
Business Intelligence Analysis
Optimization
© 2008 The Company, Inc. All rights reserved. Terms of Use and Disclaimer


-