Something new I learned about estimated cardinalities...
Another thing that has bothered me - is the question "why doesn't the optimizer learn from its mistakes". If it estimated a certain step in a given plan would return 5,000 rows - and it discovers through experience "it only returns 5 rows" - why doesn't it 'learn' from that.
Well, 11g is addressing both of these. I was aware of the 2nd issue being addressed - features like adaptive cursor sharing are all about that.
But I stumbled on a new feature (first available in 11.1 releases) that started affecting table functions (in 11.2) - that feature is called 'cardinality feedback'.
Normally, when you have a pipelined function - the estimated cardinality is computed based on your blocksize - the default number of rows that will come from it are based on your database block size. I have an 8k block size so....
.
ops$tkyte%ORA11GR2> create or replace type str2tblType
as table of varchar2(30);
2 /
Type created.
ops$tkyte%ORA11GR2> create or replace
2 function str2tbl( p_str in varchar2, p_delim in varchar2 default ',' )
3 return str2tblType
4 PIPELINED
5 as
6 l_str long default p_str || p_delim;
7 l_n number;
8 begin
9 loop
10 l_n := instr( l_str, p_delim );
11 exit when (nvl(l_n,0) = 0);
12 pipe row ( ltrim(rtrim(substr(l_str,1,l_n-1))) );
13 l_str := substr( l_str, l_n+1 );
14 end loop;
15 return;
16 end;
17 /
Function created.
ops$tkyte%ORA11GR2> column plan_table_output format a80 truncate
ops$tkyte%ORA11GR2> variable in_list varchar2(255)
ops$tkyte%ORA11GR2> exec :in_list := 'DBMS_PIPE,DBMS_OUTPUT,UTL_FILE';
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select *
2 from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t;
Execution Plan
----------------------------------------------------------
Plan hash value: 2407808827
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8168 | 16336 | 29 (0
| 1 | COLLECTION ITERATOR PICKLER FETCH| STR2TBL | 8168 | 16336 | 29 (0
--------------------------------------------------------------------------------
The optimizer guesses 8168 rows. It all probability - the real number of rows is not anywhere near 8168. If we use this estimated row count in a bigger query - we'll probably end up with the wrong plan (I like to say - wrong card=wrong plan, right card=right plan - where card is cardinality).
Now, if I run the query and ask Oracle "what plan did you use", we get to see 'reality' - not an explain plan (explain plans are many times not representative of reality!) and reality said:
ops$tkyte%ORA11GR2> set autotrace off
ops$tkyte%ORA11GR2> set serveroutput off
ops$tkyte%ORA11GR2> select *
2 from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t;
COLUMN_VALUE
------------------------------
DBMS_PIPE
DBMS_OUTPUT
UTL_FILE
ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID cu030hs8vrjjn, child number 1
-------------------------------------
select * from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t
Plan hash value: 2407808827
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 29 (100
| 1 | COLLECTION ITERATOR PICKLER FETCH| STR2TBL | 8168 | 16336 | 29 (0
--------------------------------------------------------------------------------
13 rows selected.
Which is not any different from explain plan at this point. Reality says "I came up with a plan based on an estimated cardinality of 8168 rows"... However, the database actually *ran* the query this time - and the database has learned from its mistake:
.
ops$tkyte%ORA11GR2> select *
2 from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t;
COLUMN_VALUE
------------------------------
DBMS_PIPE
DBMS_OUTPUT
UTL_FILE
ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID cu030hs8vrjjn, child number 2
-------------------------------------
select * from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t
Plan hash value: 2407808827
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 29 (100
| 1 | COLLECTION ITERATOR PICKLER FETCH| STR2TBL | 6 | 12 | 29 (0
--------------------------------------------------------------------------------
Note
-----
- cardinality feedback used for this statement
17 rows selected.
Apparently, it hard parsed that query - it did not reuse the plan. We know that it hard parsed since it just created child #2 - and we can see the plan is different - the row count is much lower.
Now, in this example, onl y the estimated row counts changed - the actual plan is the same (the query is after all very simple). Will this affect real world queries?
Yes, it will :)
ops$tkyte%ORA11GR2> create table data.
2 as
3 select *
4 from all_objects;
Table created.
ops$tkyte%ORA11GR2> create index data_idx on data(object_name);
Index created.
ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'DATA' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> set serveroutput off
ops$tkyte%ORA11GR2> with T as
2 ( select distinct * from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t where rownum > 0 )
3 select * from data, t where data.object_name = t.column_value
4 /
....
10 rows selected.
ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor());
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID bv5ar1b8sft67, child number 0
-------------------------------------
with T as ( select distinct * from TABLE(cast( str2tbl( :in_list ) as
str2tblType) ) t where rownum > 0 ) select * from data, t where
data.object_name = t.column_value
Plan hash value: 892089582
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 319 (100)| |
|* 1 | HASH JOIN | | 13730 | 1528K| 319 (1)| 00:00:04 |
| 2 | VIEW | | 8168 | 135K| 30 (4)| 00:00:01 |
| 3 | HASH UNIQUE | | 8168 | 16336 | 30 (4)| 00:00:01 |
| 4 | COUNT | | | | | |
|* 5 | FILTER | | | | | |
| 6 | COLLECTION ITERATOR PICKLER FETCH| STR2TBL | 8168 | 16336 | 29 (0)| 00:00:01 |
| 7 | TABLE ACCESS FULL | DATA | 72236 | 6842K| 288 (1)| 00:00:04 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("DATA"."OBJECT_NAME"="T"."COLUMN_VALUE")
5 - filter(ROWNUM>0)
27 rows selected.
Ugh, one of those (tongue in cheek) nasty full scans and hash joins (they are good - when they are appropriate, they aren't appropriate here...)
as opposed to
ops$tkyte%ORA11GR2> with T as
2 ( select distinct * from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t where rownum > 0 )
3 select * from data, t where data.object_name = t.column_value
4 /
...
10 rows selected.
ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor());
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID bv5ar1b8sft67, child number 1
-------------------------------------
with T as ( select distinct * from TABLE(cast( str2tbl( :in_list ) as
str2tblType) ) t where rownum > 0 ) select * from data, t where
data.object_name = t.column_value
Plan hash value: 3947981921
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 48 (100)| |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 10 | 1140 | 48 (3)| 00:00:01 |
| 3 | VIEW | | 6 | 102 | 30 (4)| 00:00:01 |
| 4 | HASH UNIQUE | | 6 | 12 | 30 (4)| 00:00:01 |
| 5 | COUNT | | | | | |
|* 6 | FILTER | | | | | |
| 7 | COLLECTION ITERATOR PICKLER FETCH| STR2TBL | 6 | 12 | 29 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | DATA_IDX | 2 | | 2 (0)| 00:00:01 |
| 9 | TABLE ACCESS BY INDEX ROWID | DATA | 2 | 194 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - filter(ROWNUM>0)
8 - access("DATA"."OBJECT_NAME"="T"."COLUMN_VALUE")
Note
-----
- cardinality feedback used for this statement
33 rows selected.
Nice - it learned from the errors of its ways...
Note: this affects more than just table functions - read some of the links found on google to see other interesting examples.
Very nice.


