 |
 |
|
 |
 |
 |
Distributed Queries,
Transactions, and Databases |
|
|
In organizations where ownership of data is distributed, data can reside in several different databases
and on different systems or nodes. These distributed databases might be built on database products
from a single vendor or from multiple vendors. Where data must be obtained from multiple database
sources to make business decisions, distributed queries are used. When updates of data in multiple
databases are needed, distributed transactions can be implemented.
In this chapter, we cover some of the key technologies used to link together distributed data and
databases. For example, distributed Oracle databases can be tied together by establishing a database
link from one database to another. As a programmer, you can make access to these distributed sources
of data appear to be location-transparent by incorporating views, procedures, and synonyms. Where
distributed updates occur across multiple Oracle databases as transactions, there are additional considerations
that we cover. We also cover the linkage to distributed databases that are non-Oracle through
the use of Heterogeneous Services, Open Database Connectivity (ODBC), or gateways.
Management of distributed databases can be handled in an autonomous manner such that each
database is managed through its own management interface. In fact, as of Oracle Database 10g,
each Oracle database installation includes an installation of Enterprise Manager Database Control
on that server. To simplify management of multiple Oracle instances, distributed Oracle databases
can be managed through a single interface called Enterprise Manager Grid Control. Since this is a
programmers’ book, we touch on Enterprise Manager in various portions of this chapter, as it relates
to enabling distributed databases, but we suggest you consult administrators’ documentation and
other sources for more details regarding managing distributed Oracle databases.
An alternative to deploying distributed databases used by many organizations is to instead consolidate
the data to a single database. Companies often build operational data stores and enterprise data warehouses
as consolidation databases to guarantee data consistency and to create a repository representing
a single version of the truth. Other advantages to this approach include tuning simplification, since optimizer
performance can be more predictable for a single consolidated database. Of course, the transaction
management is also entirely handled by the single database engine. We cover methods for data movement
used in loading such repositories elsewhere (Chapters 24 and 25) in this book. In those chapters,
we describe high-speed data movement using the Data Pump, Transportable Tablespaces, and Streams,
and more traditional data loading and management techniques that are useful in moving data to a single
database, including the use of SQL*Loader, External Tables, and Change Data Capture.
We’ll begin by exploring how to link distributed Oracle databases and some programming considerations
in using these links.Linking Distributed Oracle Databases
Distributed Oracle databases are linked for queries by establishing and leveraging database links. A
database link is most often created by a database administrator responsible for managing access to the
distributed sources of data.
Creation of the database link uses the name of the database to be linked to and the name of the server or
domain it resides on. This combination of database name and server name is represented in the global
database name. The global name better ensures uniqueness, since in many organizations it is very possible
to have common database names on more than one node (e.g., a database named sales could be
deployed in two different departments for two different purposes). The naming parameters can be initialized
through Oracle Enterprise Manager’s Database Control and are found in the Initialization
Parameters under the Administration Tab. Figure 14-1 shows the SPFile interface where a database
domain name is being added. Domain names must follow standard Internet naming conventions. In the
example, a db_domain name is provided for a database at the Stack company.Once a domain name enabling global names has been established, the next step is to create the database
link from one database to another. Links are typically created as private or public links depending on the
privileges assigned to create such links and the access that is to be granted to the users. By default, link
users and their passwords will be the same as at the originating database. Users are restricted where
private links are available, whereas any user can access a public link.
Links can also be created with specified usernames and passwords. This is called an authenticated link
and can be used for private or public connections.
A private link would be created vian SQL using the global database name for the database named orcl
on the us.stack.com domain as follows:
CREATE DATABASE LINK orcl.us.stack.com;
A public link could be created vian SQL using a similar syntax, but specifying public:
CREATE PUBLIC DATABASE LINK orcl.us.stack.com;
Adatabase link can also be created using Oracle Enterprise Manager. Figure 14-2 shows the interface through
which a DBA could create a link (found under the Administration tab in Enterprise Manager Database Control)
by providing a database link name, Oracle*Net service name, connection type, username, and password
information.Once the link is created, you can connect through the link by specifying the global database name. For
example, a query might be formulated as follows:
SELECT * FROM sh.products@orcl.us.stack.com;
317
Distributed Queries, Transactions, and Databases
Views, synonyms, and procedures are useful in distributed database programs, since they can create the
appearance of location transparency hiding the global naming that is used. In the following example, a view
named supply would be formed and provide location transparency by specifying the global database name
in a SELECT from a remote products table while also performing a SELECT from a local inventory table:
CREATE VIEW supply AS
SELECT i.product_id, i.warehouse_id, i.quantity_on_hand, p.prod_name, p.prod_status
FROM oe.inventories i, sh.products@orcl.us.stack.com p
WHERE i.product_id = p.prod_id;
Synonyms are references to objects that can be created for tables, types, views, materialized views,
sequences, procedures, functions, and packages. A synonym might be created for the previous products
table in the orcl.us.stack.com database using the following syntax:
CREATE SYNONYM products FOR sh.products@orcl.us.stack.com;
Alternatively, a DBA could also create a synonym using Enterprise Manager Database Control (under
the Administration tab) as shown in Figure 14-3:Procedures can provide transparency by referencing linked data and embedding the global name in the
reference or by leveraging synonyms. As a programmer, you will likely work with your DBAs to understand
the alternatives available to you for building access to distributed data into your applications.
Distributed applications using database links do require some considerations. Since you may be retrieving
large quantities of data across your network, you should also design your applications such that only
necessary data should be transferred. For best optimization, you should use the cost-based optimizer
318
Chapter 14
regardless of Oracle database version. (Note that for Oracle Database 10g and subsequent versions, the
cost-based optimizer is the only optimizer supported by Oracle.)
The cost-based optimizer will rewrite a distributed query using collocated inline views where possible.
A collocated inline view enables selection of data from multiple tables in a single local database, thus
reducing remote database access. The optimization includes merging all mergeable views, creating the
collocated inline view, performing a collocated query block test, and rewriting the query using the collocated
inline views. Of course, the optimization is completely transparent to the person or the application
that submits the query. Such optimization is generally not as effective in situations where there are
aggregates, subqueries, or complex SQL present.
In addition to leveraging the optimizer, a method sometimes employed to improve performance and
help optimization is by building views. For example, you might build a view over multiple remote
tables. Building procedural code is always a possibility, though not often needed for distributed queries.
In situations where updates occur across distributed databases composed of multiple separate
databases, failure of part of a distributed statement could trigger an integrity constraint violation. We’ll
next discuss two-phase commit, since your applications will leverage this capability but should also
handle unusual occurrences when transactions are updated.SapereOnline |
|
|
|
|
|
|
 |
ORACLE
|
|
|
|
|
| |
|
 |
|
 |
|
 |