 |
 |
Business Intelligence
Analysis |
|
|
As business intelligence can be defined as the retrieval and manipulation of data needed to produce
information required for business decisions, data analysis often plays a critical role. This analysis
of data increasingly occurs in the database and leverages analysis functions provided by the database.
Until recently, most business analysts extracted data out of the database into spreadsheets, client/server
business intelligence tools on the desktop, and special online analytical processing (OLAP) servers.
The growth of browser-based deployment strategies created the desire to perform more analysis
where the data is—in the database. As a result, a growing amount of analysis now uses extensions
to SQL and Java APIs provided in Oracle. Many popular business intelligence tools, such as Oracle
Discoverer, Business Objects, and Cognos, provide support for the extended SQL and the APIs.
This chapter describes the use of standard database analytic capabilities and also describes some
of the optional components of the Oracle database: the OLAP Option and Data Mining Option.
These solutions provide a range of analysis sophistication, from simple mathematical functions to
trending and forecasting to the application of sophisticated algorithms and mathematical models
used in making predictions. Note that some of these examples are date specific and that Oracle
periodically revises the sample data set to make the data appear more current or more interesting.
If your query runs but the results vary, chances are that this is due to such changes in the data set.
Simply changing the year you query to one that is relevant to your data set will yield results. We’ll
begin by examining the simpler techniques possible using analytic SQL extensions in Oracle’s database.
SQL Analytics and Statistics
Oracle’s family of SQL analytic functions was first introduced in Oracle9i. The two releases of Oracle9i
focused on providing the library of analytic functions defined in the 1999 ANSI SQL extensions.
Oracle Database 10g adds many more types of analytics and statistics beyond those in the standard.
Some of the commonly used functions include ranking and percentiles, cumulative and moving
aggregate windows, reporting, lag/lead analysis, first/last analysis in an ordered group, linear
regression and other statistics, inverse percentile, and hypothetical rank and distribution.A multistep processing flow is defined when programs are written to use these functions. First, joins,
WHERE, GROUP BY, and HAVING clauses are performed. The GROUP BY is used to create analytic partitions
in order to divide query results into groups of rows. (Note: These analytic partitions are not be confused
with Oracle’s table partitions.) In the next step of the process, analytic functions are applied to the results
set. When a query has an ORDER BY clause at the end, that processing then occurs.
In the following sections, we provide examples showing usage of many of these functions in SQL
statements and expected results using the sample schema (sh and oe) and data that are part of the
standard installation of an Oracle Database 10g release. This chapter serves as an introduction to these
functions. Additional examples of these and other analytic functions are provided in the Oracle Data
Warehousing Guide, part of the standard database documentation set provided by Oracle.
Ranking and Percentiles
Ranking functions sort values from smallest to largest (or vice versa) and provide a numeric indication
of where a value falls on the list of values. The default for ranking is ascending order (the smallest value
ranked first, followed by the second smallest, the third smallest, and so on) though you can also specify
rank in descending order. Simple ranking (the RANK function) handles ties by giving equal values the
same rank and then giving the next value a ranking based on where it falls on the list (so if two values
are tied for fifth, the next value would be ranked seventh). Alternatively, you could use Oracle’s DENSE_RANK,
which would instead rank the next value sixth if a tie occurs in the fifth ranking. Null values are considered
larger than any other value, though you can override this for either ascending or descending order by
specifying a NULLS FIRST or NULLS LAST clause.
Let’s look at some examples. You might want to see a ranking of the products you have in inventory
from fewest to most. The following syntax would enable you rank the quantity_on_hand using the
ranking function:
SELECT product_id,
TO_CHAR (quantity_on_hand, ‘9,999,999’) QUANTITY,
RANK() OVER (ORDER BY quantity_on_hand) AS quantity_rank
FROM oe.toronto_inventory;
The highest ranked results (lowest quantity in inventory) reported using Oracle’s sample data set would
appear as in Figure 27-1 in iSQL*Plus, showing the product ID, quantity, and quantity rank. |
|
|
|
|
|