Approximate Query Processing appears to be a major feature of Oracle Database 12c Release 2 SQL. In 12cR1 patchset 2 (12.1.0.2), we saw one approximate query function appear – APPROX_COUNT_DISTINCT to use instead of COUNT(DISTINCT). In 12.2 there are several APPROX functions introduced:
APPROX_COUNT_DISTINCT_DETAIL
APPROX_COUNT_DISTINCT_AGG
TO_APPROX_COUNT_DISTINCT
APPROX_MEDIAN
APPROX_PERCENTILE
APPROX_PERCENTILE_DETAIL
APPROX_PERCENTILE_AGG
TO_APPROX_PERCENTILE
According to Oracle documentation, “Approximate query processing is primarily used in data discovery applications to return quick answers to explorative queries. Users typically want to locate interesting data points within large amounts of data and then drill down to uncover further levels of detail. For explorative queries, quick responses are more important than exact values.”
The interesting part is that you can utilize the Approximate functions without changing code. There are three initialization parameters introduced to control which functions should be treated as an approximate function during run time… cool!
The initialization parameters are:
approx_for_aggregation
approx_for_count_distinct
approx_for_percentile
To replace only the COUNT(DISTINCT)
function in queries with the APPROX_COUNT_DISTINCT
function, set the approx_for_count_distinct
initialization parameter to TRUE
for the current session or for the entire database.
To replace percentile functions with the corresponding functions that return approximate results, set approx_for_percentile
to PERCENTILE_CONT
, PERCENTILE_DISC
, or ALL
(replaces all percentile functions) for the current session or for the entire database.
To run all queries using the corresponding SQL functions that return approximate results instead of the specified SQL functions, set the approx_for_aggregation
initialization parameter to TRUE
for the current session or for the entire database.
Read more… Oracle Documentation
Thanks, Feature is good, but I discovered that returned approximate count is sometime greater than actual count of the rows. Do you know what logic Oracle has applied there?