Wednesday, April 28, 2010

Something new I learned about estimated cardinalities...

I've used pipelined functions and other table functions many times in the past. One of the drawbacks of using them is that the optimizer has no clue what cardinality to expect from them.

Another thing that has bothered me is the question "why doesn't the optimizer learn from its mistakes?" If it estimated that a certain step in a given plan would return 5,000 rows, and it discovers through experience that it only returns 5 rows, why doesn't it 'learn' from that?

Well, 11g is addressing both of these. I was aware of the second issue being addressed - features like adaptive cursor sharing are all about that.

But I stumbled on a new feature (first available in the 11.1 releases) that started affecting table functions in 11.2 - a feature called 'cardinality feedback'.

Normally, when you have a pipelined function, the estimated cardinality is computed from your database block size - the default number of rows assumed to come from it is a function of the block size. I have an 8k block size, so....


ops$tkyte%ORA11GR2> create or replace type str2tblType
as table of varchar2(30);
2 /

Type created.

ops$tkyte%ORA11GR2> create or replace
2 function str2tbl( p_str in varchar2, p_delim in varchar2 default ',' )
3 return str2tblType
4 PIPELINED
5 as
6 l_str long default p_str || p_delim;
7 l_n number;
8 begin
9 loop
10 l_n := instr( l_str, p_delim );
11 exit when (nvl(l_n,0) = 0);
12 pipe row ( ltrim(rtrim(substr(l_str,1,l_n-1))) );
13 l_str := substr( l_str, l_n+1 );
14 end loop;
15 return;
16 end;
17 /

Function created.

ops$tkyte%ORA11GR2> column plan_table_output format a80 truncate
ops$tkyte%ORA11GR2> variable in_list varchar2(255)
ops$tkyte%ORA11GR2> exec :in_list := 'DBMS_PIPE,DBMS_OUTPUT,UTL_FILE';

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2> set autotrace traceonly explain
ops$tkyte%ORA11GR2> select *
2 from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t;

Execution Plan
----------------------------------------------------------
Plan hash value: 2407808827

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8168 | 16336 | 29 (0
| 1 | COLLECTION ITERATOR PICKLER FETCH| STR2TBL | 8168 | 16336 | 29 (0
--------------------------------------------------------------------------------

The optimizer guesses 8168 rows. In all probability, the real number of rows is nowhere near 8168. If we use this estimated row count in a bigger query, we'll probably end up with the wrong plan (I like to say: wrong card = wrong plan, right card = right plan - where card is cardinality).
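(As an aside: before cardinality feedback existed, the usual workaround was to tell the optimizer the cardinality yourself. One way - undocumented, so treat this as a sketch of the idea rather than a recommendation - is the CARDINALITY hint:)

```sql
-- Sketch only: CARDINALITY is an undocumented hint; here it tells the
-- optimizer to assume the table function returns about 3 rows instead
-- of the block-size based default of 8168.
select /*+ cardinality( t 3 ) */ *
  from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t;
```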

Now, if I run the query and ask Oracle "what plan did you use?", we get to see 'reality' - not an explain plan (explain plans are often not representative of reality!). And reality said:


ops$tkyte%ORA11GR2> set autotrace off
ops$tkyte%ORA11GR2> set serveroutput off
ops$tkyte%ORA11GR2> select *
2 from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t;

COLUMN_VALUE
------------------------------
DBMS_PIPE
DBMS_OUTPUT
UTL_FILE

ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID cu030hs8vrjjn, child number 1
-------------------------------------
select * from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t

Plan hash value: 2407808827

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 29 (100
| 1 | COLLECTION ITERATOR PICKLER FETCH| STR2TBL | 8168 | 16336 | 29 (0
--------------------------------------------------------------------------------


13 rows selected.


Which is not any different from the explain plan at this point. Reality says "I came up with a plan based on an estimated cardinality of 8168 rows"... However, the database actually *ran* the query this time - and it has learned from its mistake:


ops$tkyte%ORA11GR2> select *
2 from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t;

COLUMN_VALUE
------------------------------
DBMS_PIPE
DBMS_OUTPUT
UTL_FILE

ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID cu030hs8vrjjn, child number 2
-------------------------------------
select * from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t

Plan hash value: 2407808827

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 29 (100
| 1 | COLLECTION ITERATOR PICKLER FETCH| STR2TBL | 6 | 12 | 29 (0
--------------------------------------------------------------------------------

Note
-----
- cardinality feedback used for this statement


17 rows selected.

Apparently, it hard parsed that query - it did not reuse the plan. We know that it hard parsed because it just created child cursor #2 - and we can see the estimate is different: the row count is much lower.

Now, in this example, only the estimated row counts changed - the actual plan is the same (the query is, after all, very simple). Will this affect real-world queries?

Yes, it will :)
ops$tkyte%ORA11GR2> create table data
2 as
3 select *
4 from all_objects;

Table created.

ops$tkyte%ORA11GR2> create index data_idx on data(object_name);

Index created.

ops$tkyte%ORA11GR2> exec dbms_stats.gather_table_stats( user, 'DATA' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> set serveroutput off
ops$tkyte%ORA11GR2> with T as
2 ( select distinct * from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t where rownum > 0 )
3 select * from data, t where data.object_name = t.column_value
4 /
....
10 rows selected.

ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID bv5ar1b8sft67, child number 0
-------------------------------------
with T as ( select distinct * from TABLE(cast( str2tbl( :in_list ) as
str2tblType) ) t where rownum > 0 ) select * from data, t where
data.object_name = t.column_value

Plan hash value: 892089582

--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 319 (100)| |
|* 1 | HASH JOIN | | 13730 | 1528K| 319 (1)| 00:00:04 |
| 2 | VIEW | | 8168 | 135K| 30 (4)| 00:00:01 |
| 3 | HASH UNIQUE | | 8168 | 16336 | 30 (4)| 00:00:01 |
| 4 | COUNT | | | | | |
|* 5 | FILTER | | | | | |
| 6 | COLLECTION ITERATOR PICKLER FETCH| STR2TBL | 8168 | 16336 | 29 (0)| 00:00:01 |
| 7 | TABLE ACCESS FULL | DATA | 72236 | 6842K| 288 (1)| 00:00:04 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("DATA"."OBJECT_NAME"="T"."COLUMN_VALUE")
5 - filter(ROWNUM>0)


27 rows selected.

Ugh, one of those (tongue in cheek) nasty full scans and hash joins. They are good - when they are appropriate - but they aren't appropriate here...

as opposed to

ops$tkyte%ORA11GR2> with T as
2 ( select distinct * from TABLE(cast( str2tbl( :in_list ) as str2tblType) ) t where rownum > 0 )
3 select * from data, t where data.object_name = t.column_value
4 /
...
10 rows selected.

ops$tkyte%ORA11GR2> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID bv5ar1b8sft67, child number 1
-------------------------------------
with T as ( select distinct * from TABLE(cast( str2tbl( :in_list ) as
str2tblType) ) t where rownum > 0 ) select * from data, t where
data.object_name = t.column_value

Plan hash value: 3947981921

----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 48 (100)| |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 10 | 1140 | 48 (3)| 00:00:01 |
| 3 | VIEW | | 6 | 102 | 30 (4)| 00:00:01 |
| 4 | HASH UNIQUE | | 6 | 12 | 30 (4)| 00:00:01 |
| 5 | COUNT | | | | | |
|* 6 | FILTER | | | | | |
| 7 | COLLECTION ITERATOR PICKLER FETCH| STR2TBL | 6 | 12 | 29 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | DATA_IDX | 2 | | 2 (0)| 00:00:01 |
| 9 | TABLE ACCESS BY INDEX ROWID | DATA | 2 | 194 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

6 - filter(ROWNUM>0)
8 - access("DATA"."OBJECT_NAME"="T"."COLUMN_VALUE")

Note
-----
- cardinality feedback used for this statement


33 rows selected.



Nice - it learned from the errors of its ways...


Note: this affects more than just table functions - a search for 'cardinality feedback' will turn up other interesting examples.

Very nice.

Tuesday, April 27, 2010

That old restart problem again...

Not something entirely new learned today - rather a revisited "something I learned" coupled with an "I didn't necessarily expect it in this case".

It is the old "statement restart" and "evil triggers" issue. I had an asktom question asking why a row seemed to be getting deleted twice. You should read that link, I'll be referring to it here.

I immediately knew what the issue was (I was pretty sure). Before I even read the test case - I had already pointed them to three articles I wrote on the subject a while ago...

But when I looked at their test case and modified it slightly to be smaller and easier to read - I was frankly surprised at what was happening - but ok with the behavior.

It takes a series of bad things happening for the issue the poster was seeing to occur. You have to:

a) use a trigger (evil)
b) do something non-transactional in that trigger - for example, modify a package global variable
c) do slow-by-slow (row-by-row) processing (with a big mass operation they could still see a restart, but it would perhaps be less likely)

Here is a snippet of the code again:


ops$tkyte%ORA11GR2> create or replace PACKAGE pacepack
2 as
3 type array is table of number index by varchar2(40);
4 g_data array;
5 g_cnt number;
6 end pacepack;
7 /

Package created.

That package just has some global variables (evil global variables - evil, evil; the original poster didn't use globals directly in their API, they had getter/setter functions, but those behaved like globals as well). We'll create a row trigger that records the rowids of the rows it has processed - even if Oracle rolls back the firing statement, the modifications our trigger made to those globals won't roll back.

Now, we'll code the trigger:

ops$tkyte%ORA11GR2>
ops$tkyte%ORA11GR2> create table delete_data
2 as
3 select owner, object_name
4 from dba_objects
5 where rownum < 53001;

Table created.




ops$tkyte%ORA11GR2> create or replace trigger delete_data_bt
2 before delete or update on delete_data
3 for each row
4 begin
5 if ( pacepack.g_data.exists(rowidtochar(:old.rowid)) )
6 then
7 dbms_output.put_line( 'doing "' || :old.rowid ||
'" again was called ' || pacepack.g_cnt );
8 else
9 pacepack.g_data(rowidtochar(:old.rowid)) := 1;
10 end if;
11 pacepack.g_cnt := pacepack.g_cnt + 1;
12 end;
13 /

Trigger created.

The trigger checks to see if the PL/SQL global table variable already has the rowid entered in it - if so, it says "I already did this one"; otherwise it sets a flag showing it processed that one for the first time. We also increment a count (you know, to count the rows we've deleted or updated).

Now we process the data:

ops$tkyte%ORA11GR2> declare
2 CURSOR L_DELETE_CSR IS
3 SELECT *
4 FROM delete_data
5 for update ;
6 l_cnt number := 0;
7 BEGIN
8 pacepack.g_data.delete;
9 pacepack.g_cnt := 0;
10 for l_delete_row in l_delete_csr
11 loop
12 update delete_data
13 set owner = lower(owner)
14 where current of L_delete_CSR;
15 l_cnt := l_cnt + 1;
16 end loop;
17 dbms_output.put_line( 'trigger count = ' || pacepack.g_cnt ||
' local count = ' || l_cnt );
18 END;
19 /
doing "AAAYMdAAEAABFrnABH" again was called 8827
doing "AAAYMdAAEAABFwiAAx" again was called 20140
doing "AAAYMdAAEAABFyeACk" again was called 31405
doing "AAAYMdAAEAABFzaABE" again was called 42670
trigger count = 53004 local count = 53000

PL/SQL procedure successfully completed.
As you can see - our trigger was fired 53,004 times - for 53,000 rows. Our counts do not match and we can see the four updates that were restarted.

Why they were restarted is not really relevant - the fact is, they CAN be restarted; we can show that they can be restarted, and it is documented that they can be. Any statement can be restarted - the database takes a savepoint before executing the statement and, if it deems it necessary (which it obviously did here), rolls back to that savepoint and does the statement over.

So, expect the restart - and stop using triggers. If you do use triggers, be very, very careful to make sure they are not subject to issues with restarts (e.g.: NO autonomous transactions, NO modifications of global variables, NO API calls with side effects that don't roll back - setter functions, for example)!

Saturday, April 24, 2010

Did you know...

It is Saturday (ok, you probably knew that) so I'll do something non-Oracle. Did you know carrots aren't really supposed to be orange? They should be purple. And they will be, if you grow a few generations of the orange ones - they revert back to their natural state.

So, much like the yellow, easy-to-peel bananas we enjoy, orange carrots are a product of "us".

I wonder what Bugs Bunny would have to say about that...

Friday, April 23, 2010

A new thing about SQL*Plus

What I learned today was a new feature in the venerable old tool, SQL*Plus.

I was asked about trapping the SP2 errors in SQL*Plus, those you get when you have a bad SQL*Plus command - like this:
ops$tkyte%ORA11GR2> selct * from dual;
SP2-0734: unknown command beginning "selct * fr..." - rest of line ignored.

In that case - neither of the SQL*Plus error handling bits:

OSERROR
SQLERROR

will work for you - it is not an OS error like "unable to open spool file", and it is not a SQL error - selct is not SQL; it never got to the SQL layer. That SP2 error is uncatchable. And I wrote as much.

Things change - my answer was dead-on correct for version 10gR2 and before (hey, the question was from someone using 10gR2 - so technically, I was correct ;) ). It is no longer technically true in 11g and above.

SQL*Plus in 11g added an "error logging" facility. A session may issue:

SQL> set errorlogging on

and have any SQL, OS or SP2 errors logged into a logging table, similar to DML error logging. Additionally - you can have your errors tagged with an identifier, making it easy to find your error records. So, you can now check (using SQL) at various points in time to see if you've hit an error - or your program that runs sqlplus and runs a script can check to see if any errors occurred in your session easily.
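For example, a script might tag its errors and then look for them afterwards - something along these lines (SPERRORLOG is the default logging table; the identifier value and the query are my own sketch, so check the exact behavior in your release):

```sql
SQL> set errorlogging on identifier nightly_load
SQL> selct * from dual;
SP2-0734: unknown command beginning "selct * fr..." - rest of line ignored.
SQL> select statement, message
  2    from sperrorlog
  3   where identifier = 'nightly_load';
```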

Thanks to Enrique Aviles for pointing it out and thanks to Arup Nanda for writing it up (ctrl-F for "SQL*Plus Error Logging" on that page).

Note that you need the 11g SQL*Plus, not just an 11g database with an old sqlplus connected to it! This is a feature of SQL*Plus.

On the flip side, though, this means it is available for older database releases! You can connect to 9i with the 11g SQL*Plus and use this:

[tkyte@dellpe ~]$ sqlplus scott/tiger@ora9ir2

SQL*Plus: Release 11.2.0.1.0 Production on Fri Apr 23 15:36:51 2010

Copyright (c) 1982, 2009, Oracle. All rights reserved.


Connected to:
Oracle9i Enterprise Edition Release 9.2.0.8.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production

scott%ORA9IR2> set errorlogging on
scott%ORA9IR2> selct * from dual;
SP2-0734: unknown command beginning "selct * fr..." - rest of line ignored.
scott%ORA9IR2> select timestamp, username, script, statement, message
2 from sperrorlog;

TIMESTAMP
---------------------------------------------------------------------------
USERNAME
-------------------------------------------------------------------------------
SCRIPT
-------------------------------------------------------------------------------
STATEMENT
-------------------------------------------------------------------------------
MESSAGE
-------------------------------------------------------------------------------
23-APR-10 03.37.02.000000 PM
SCOTT

selct * from dual;
SP2-0734: unknown command beginning "selct * fr..." - rest of line ignored.

Wednesday, April 21, 2010

Something I recently unlearned...

This is how many of the things I learn every day come into being. They are actually things I have to "unlearn", because what used to be true has changed over time.

Once upon a time - I remember the day I learned this, during a benchmark in 1993 - I learned that UNINDEXED foreign keys had some locking implications. Specifically, if you:
  • update the parent primary key (which does happen, some 'frameworks' update every column even if the value did not change)
  • delete from parent
then you should probably index the foreign key in the child table - otherwise a full table lock will be placed on the child table for the duration of the transaction.

Then Oracle 9i was released and I had to relearn the rule. The rule in 9i was still as above - just modified as to the duration of the lock (many people think the restriction actually went away, but it did not; it was changed). In 9i and above, if you update the parent or delete from the parent with an unindexed foreign key, the child table is still locked - just for the duration of the update or delete! The lock is released when the statement completes - not when you commit. This was "better", but the lock still exists.

Sometime during 9i, I learned yet another modification to the rule above. The rule in 9i now has to include:
  • if you merge into the parent table
in addition to update and delete. As I was getting ready to add that to the 2nd Edition of Expert Oracle Database Architecture, I learned something new: the rule has changed again. The MERGE doesn't always lock the table anymore in 11g Release 1 and above - so we are back to just update and delete (sort of!).

Here is the small test case you can use to verify - the set up is:
ops$tkyte%ORA9IR2> create table p ( x int primary key, y int );
Table created.

ops$tkyte%ORA9IR2> insert into p values ( 1, null );
1 row created.

ops$tkyte%ORA9IR2> insert into p values ( 2, null );
1 row created.

ops$tkyte%ORA9IR2> create table c ( x references p );
Table created.

ops$tkyte%ORA9IR2> create or replace
procedure modify_p( p_what in varchar2 )
2 as
3 pragma autonomous_transaction;
4 deadlock exception;
5 pragma exception_init( deadlock, -60 );
6 begin
7 if ( p_what = 'DELETE' ) then delete from p where x = 2;
8 elsif ( p_what = 'UPDATE' ) then update p set x = 2 where x = 2;
9 elsif ( p_what = 'MERGE' ) then
10 merge into p using (select 2 x, 42 y from dual) d
11 on (p.x=d.x)
12 when matched then update set y = d.y
13 when not matched then insert(x,y) values (d.x,d.y);
14 end if;
15 rollback;
16 dbms_output.put_line( p_what || ': successful...' );
17 exception
18 when deadlock then
19 dbms_output.put_line( p_what ||
': we deadlocked, we needed full table lock');
20 rollback;
21 end;
22 /
Procedure created.
So, a parent table with two rows - 1 and 2. An empty child table with an unindexed foreign key. A stored procedure that runs as an autonomous transaction - so it cannot share the locks of the parent transaction; if the parent transaction has anything locked, the autonomous transaction will NOT be able to lock it as well. The autonomous transaction attempts one of the following:
  • delete row X=2 from parent
  • update row X=2 in parent
  • merge into row x=2 in parent using when matched then update, when not matched then insert
and if it deadlocks, it prints out a message telling us so and rolls back. If successful, it prints out a message and likewise rolls back.

To test, we just insert into the child table a record that points to row x=1 in the parent (we'll never touch that row in the parent table) and then try the three DML operations:
ops$tkyte%ORA9IR2> insert into c values ( 1 );
1 row created.

ops$tkyte%ORA9IR2> exec modify_p( 'DELETE' );
DELETE: we deadlocked, we needed full table lock
PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2> rollback;
Rollback complete.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> insert into c values ( 1 );
1 row created.

ops$tkyte%ORA9IR2> exec modify_p( 'UPDATE' );
UPDATE: we deadlocked, we needed full table lock
PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2> rollback;
Rollback complete.

ops$tkyte%ORA9IR2> insert into c values ( 1 );
1 row created.

ops$tkyte%ORA9IR2> exec modify_p( 'MERGE' );
MERGE: we deadlocked, we needed full table lock
PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2> rollback;
Rollback complete.

There you go, you can see it deadlocked on all three - they all needed to lock the child table before doing their work.

If you run that in 10gR1 and 10gR2, you'll see the same results - all three lock. However, starting in 11g Release 1, you'll see this:
ops$tkyte%ORA11GR1> insert into c values ( 1 );
1 row created.

ops$tkyte%ORA11GR1> exec modify_p( 'DELETE' );
DELETE: we deadlocked, we needed full table lock
PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR1> rollback;
Rollback complete.

ops$tkyte%ORA11GR1> insert into c values ( 1 );
1 row created.

ops$tkyte%ORA11GR1> exec modify_p( 'UPDATE' );
UPDATE: we deadlocked, we needed full table lock
PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR1> rollback;
Rollback complete.

ops$tkyte%ORA11GR1> insert into c values ( 1 );
1 row created.

ops$tkyte%ORA11GR1> exec modify_p( 'MERGE' );
MERGE: successful...
PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR1> rollback;
Rollback complete.


That merge no longer locks the child table. Curious about this change, I wanted to see if it was official or not and found bug 5970280 - from which I learned that it was officially changed, and that some of you running 10.2 might see a different result for this test case (the fix was backported and is available for 10.2).

The fix is more complex than it appears (aren't they always?). It is not just "turn off the lock for MERGE", it is "analyze the merge and:
  • if the merge just inserts - treat it as an insert
  • if the merge does an update, or an update and insert, and we are NOT updating the primary key - treat it as an update of non-primary-key columns
  • if the merge does an update, or an update and insert, and we ARE updating the primary key - lock the child table
  • if the merge includes a delete - treat it as a delete and lock the child table"

So, if your merge was:

10 merge into p using (select 2 x, 42 y from dual) d
11 on (p.y=d.x)
12 when matched then update set x = d.x
13 when not matched then insert(x,y) values (d.x,d.y);


then you would see:

ops$tkyte%ORA11GR1> exec modify_p( 'MERGE' );
MERGE: we deadlocked, we needed full table lock
PL/SQL procedure successfully completed.


or if your merge included a possible delete branch, you would see the same.

So, I guess the rule in 11gR1 and above is, if you
  • update parent primary key
  • delete from parent
  • use a merge that does either of the above
and you have an unindexed foreign key - you should expect a full table lock on the child table. If you index that foreign key - no untoward locking will take place.
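To verify that last point with the same test case, I would simply index the foreign key and re-run. All three operations should now report "successful", since no child-table lock is needed (the output below is what I would expect, not a captured run):

```sql
ops$tkyte%ORA11GR1> create index c_idx on c(x);

Index created.

ops$tkyte%ORA11GR1> insert into c values ( 1 );

1 row created.

ops$tkyte%ORA11GR1> exec modify_p( 'DELETE' );
DELETE: successful...

PL/SQL procedure successfully completed.
```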

Things change over time :)

Tuesday, April 20, 2010

What will happen if...

So, something else I learned recently...

Say you have a table T:
ops$tkyte%ORA10GR2> create table t ( x int );
Table created.
ops$tkyte%ORA10GR2> insert into t values ( 1 );
1 row created.
ops$tkyte%ORA10GR2> commit;
Commit complete.
and you update a row in that table:
ops$tkyte%ORA10GR2> update t set x = x+1;
1 row updated.
and using an evil autonomous transaction, you try to lock that row (in the same session, but in a new transaction - one that cannot 'see' the effects of the parent transaction):
ops$tkyte%ORA10GR2> declare
2 pragma autonomous_transaction;
3 l_rec t%rowtype;
4 begin
5 select * into l_rec from t for update wait 5;
6 commit;
7 end;
8 /
declare
*
ERROR at line 1:
ORA-30006: resource busy; acquire with WAIT timeout expired
ORA-06512: at line 5
So far, it all seems normal. You asked to wait for 5 seconds, you did, and you timed out. But what happens if you wait longer?
ops$tkyte%ORA10GR2> declare
2 pragma autonomous_transaction;
3 l_rec t%rowtype;
4 begin
5 select * into l_rec from t for update wait 6;
6 commit;
7 end;
8 /
declare
*
ERROR at line 1:
??????????????????????????????????
What error are you expecting that time? Hint: it is not ORA-30006. It is:
ops$tkyte%ORA10GR2> declare
2 pragma autonomous_transaction;
3 l_rec t%rowtype;
4 begin
5 select * into l_rec from t for update wait 6;
6 commit;
7 end;
8 /
declare
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at line 5
Apparently the self-deadlock detection kicks in before the timeout happens, and the deadlock detection code does not take the "we will eventually time out" into account.

And yes, this applies to multi-session cases as well - run this script to see that:
drop table t;
set echo on
create table t ( x int );
insert into t values ( 1 );
insert into t values ( 2 );
commit;
select * from t where x = 1 for update;
set echo off
prompt in another session issue:
prompt select * from t where x = 2 for update;;
pause
prompt in another session issue:
prompt select * from t where x = 1 for update wait 10;;
set echo on
select * from t where x = 2 for update;
It will deadlock one of the sessions - without waiting for the "wait" timeout.

So, deadlock detection trumps a waiting period. If the waiting period expires before the deadlock routines kick in, you get the ORA-30006; if the deadlock routines kick in before the timeout, you get the ORA-00060.

Monday, April 19, 2010

NO_DATA_NEEDED - something I learned recently

I'll be writing about this in Oracle Magazine shortly as part of the asktom column - but thought I'd mention it here too.

Recently on asktom.oracle.com, I was asked a question about the pre-defined exception NO_DATA_NEEDED. At first I thought that it was a typo - they really meant NO_DATA_FOUND - since I hadn't heard of or read about that exception. But looking a little deeper, I discovered what it was.

If you ever write a pipelined function, there is a good chance you need to be aware of it. Don't go searching for it in the documentation - it isn't there yet (it should appear in the next dot release). Don't google for it - you won't find much about it yet either. But it is something we've needed, probably knew we needed, and just never thought about.

What if you have a pipelined function that does something like:

a) open file
b) read line - pipe row
c) when no more data, close file and return

It works perfectly - if you read all records from the file. However, if you call it from a query such as:

select * from table(pipelined_function( '/tmp/foo.dat' )) where rownum = 1;


What happens if there were zero records in /tmp/foo.dat to read? One record? More than one record?

Well, "it depends". In all likelihood, if there were one or more records, you would leave a file handle open - and if you called this function over and over, you would leak an open file each time and eventually run out of file handles. So you would have an error that would "sometimes happen" and "sometimes not happen" - in other words, one of those strange non-reproducible bugs that only happens when it rains on a Tuesday after midnight but before 8am.

Enter NO_DATA_NEEDED - an exception that doesn't behave like any other exception. An exception that is raised - but does not cause failure. An exception that can be caught, but if it isn't - everything is still OK.


Say you have code like this:


create or replace function
foo( inputs ... )
return some_type
PIPELINED
as
/* declaration */
begin
/* initialization */

loop /* process rows */
pipe row(i);
end loop;

/* clean up */
return;
end;
/

The clean-up code would execute and do the right thing if you exit the loop - but not if the client simply stops fetching from this function. The NO_DATA_NEEDED exception is there for just such a case. Here is a concrete example:


SQL> create or replace function
2 generate_data( n in number )
3 return sys.odciNumberList
4 PIPELINED
5 as
6 begin
7 dbms_output.put_line
8 ( '===>>> INITIALIZE' );
9 for i in 1..generate_data.n
10 loop
11 dbms_output.put_line
12 ( '===>>> PROCESS' );
13 pipe row(i);
14 end loop;
15 dbms_output.put_line
16 ( '===>>> CLEAN UP' );
17 return;
18 end;
19 /

Function created.


That is a pretty straightforward PL/SQL pipelined function - if we run it to completion, we see this output:


SQL> select *
2 from table(generate_data(2));

COLUMN_VALUE
------------
1
2

===>>> INITIALIZE
===>>> PROCESS
===>>> PROCESS
===>>> CLEAN UP
SQL>


Which is what we expect - but what if we don't fetch two rows from that function? What if we only fetch one?


SQL> select *
2 from table(generate_data(2))
3 where rownum = 1;

COLUMN_VALUE
------------
1

===>>> INITIALIZE
===>>> PROCESS
SQL>


As you can see, we did the initialize and one process bit of our code, but the rest was just skipped over, because the invoking SQL statement did not need it. We didn't see any error (we would expect an unhandled exception to raise an error!) - it just appears to have worked.

There was, however, an exception raised - an exception that does not have to be caught. It will be ignored entirely if it is not caught. It differs from every other exception in that regard - we would expect an unhandled exception to propagate to the client and appear as "an error". Let's see what happens with our code if we implement an error handler:

SQL> create or replace function
2 generate_data( n in number )
3 return sys.odciNumberList
4 PIPELINED
5 as
6 begin
7 dbms_output.put_line
8 ( '===>>> INITIALIZE' );
9 for i in 1..generate_data.n
10 loop
11 dbms_output.put_line
12 ( '===>>> PROCESS' );
13 pipe row(i);
14 end loop;
15 dbms_output.put_line
16 ( '===>>> CLEAN UP' );
17 return;
18 exception
19 when no_data_needed
20 then
21 dbms_output.put_line
22 ( '***>>> CLEAN UP' );
23 return;
24 end;
25 /

Function created.

On line 19 we catch the predefined exception NO_DATA_NEEDED, and on line 21 we announce that we are cleaning up (releasing any resources that need to be released). Now when we run this pipelined function without exhausting it, we see:

SQL> select *
2 from table(generate_data(2))
3 where rownum = 1;

COLUMN_VALUE
------------
1

===>>> INITIALIZE
===>>> PROCESS
***>>> CLEAN UP
SQL>

As you can see, our special clean-up code (we used ***>>> to announce it) was executed and we could clean up any resources we had allocated.
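To tie this back to the file-reading scenario that started this discussion, a fixed-up version of that pipelined function might look like the following sketch (read_file and the MY_DIR directory object are my own hypothetical names):

```sql
create or replace function read_file( p_fname in varchar2 )
return sys.odciVarchar2List
PIPELINED
as
    l_file  utl_file.file_type;
    l_line  varchar2(4000);
begin
    -- MY_DIR is a hypothetical directory object you would have created
    l_file := utl_file.fopen( 'MY_DIR', p_fname, 'r' );
    loop
        begin
            utl_file.get_line( l_file, l_line );
        exception
            when no_data_found then exit;  -- end of file reached
        end;
        pipe row( l_line );
    end loop;
    utl_file.fclose( l_file );             -- normal clean up path
    return;
exception
    when no_data_needed then
        utl_file.fclose( l_file );         -- client stopped fetching: still close the file
        return;
end;
/
```

Without the NO_DATA_NEEDED handler, a "where rownum = 1" caller would leave that file handle open.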

Saturday, April 17, 2010

What I learned today :)

That I grew my beard for a reason.

I am apparently more trustworthy looking simply because of it.

And here I thought I was doing it to save money on razors. Good side effects are always nice.

Friday, April 16, 2010

Never Imagined...

I never imagined that I would have to cancel a seminar, because of....

A volcano.

I was supposed to be in Hungary for a two day seminar Monday and Tuesday - which I had to cancel since the airlines told me the earliest I would be able to leave for Europe would be .... Tuesday night!

I also had to cancel a planned in person appearance at the Bulgaria Oracle User Group - we are going to try to make it a 'virtual' appearance for some technical sessions if possible. That'll take place next Friday the 23rd.

So, to anyone that was signed up for the seminar in Budapest - I apologize, but there is no way to get there from here right now. We'll be looking for a reschedule date now...

Evaluating an expression, like a calculator...

Today, I learned a new trick - available in 11g Release 1 and above - that I wasn't aware of. This is pretty cool.

A frequently asked question in the past has been:

I have a string with some calculation in it - like "1+2/3". I would like to evaluate that string and get the result. How do I do that?

Historically - the answer has been "dynamic SQL, but please be careful to not flood the shared pool with tons of literal SQL and be really careful about SQL Injection!"

Now, I've learned a new way. It came from this post - thanks to the frequent asktom poster "Sokrates".


ops$tkyte%ORA11GR1> select * from v$version;

BANNER
-------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
PL/SQL Release 11.1.0.7.0 - Production
CORE 11.1.0.7.0 Production
TNS for Linux: Version 11.1.0.7.0 - Production
NLSRTL Version 11.1.0.7.0 - Production

ops$tkyte%ORA11GR1> variable x varchar2(40)
ops$tkyte%ORA11GR1> exec :x := '55+42*123/3'

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR1>
ops$tkyte%ORA11GR1> select xmlquery(replace( :x, '/', ' div ' )
returning content ) .getNumberVal()
2 from dual
3 /

XMLQUERY(REPLACE(:X,'/','DIV')RETURNINGCONTENT).GETNUMBERVAL()
--------------------------------------------------------------
1777

It works only for very simple expressions - XQuery arithmetic expressions are very simple. They support +, -, *, div, idiv (integer division), and mod.


ops$tkyte%ORA11GR1> exec :x := '(55+42-124) idiv 3 div 2 mod 5*2'

PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR1> /

XMLQUERY(REPLACE(:X,'/','DIV')RETURNINGCONTENT).GETNUMBERVAL()
--------------------------------------------------------------
-9
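To double-check that -9 by hand: XQuery's multiplicative operators all share one precedence level and evaluate left to right, and its mod keeps the sign of the dividend (like C's fmod). A Python sketch of the arithmetic - not Oracle code:

```python
import math

step = 55 + 42 - 124          # -> -27
step = math.trunc(step / 3)   # idiv truncates toward zero -> -9
step = step / 2               # div -> -4.5
step = math.fmod(step, 5)     # mod keeps the dividend's sign -> -4.5
step = step * 2               # -> -9.0
```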


See http://www.xquery.com/tutorials/guided-tour/xquery-operators.html for a guided tour of what you can do.
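For comparison, the same "evaluate a string, but never execute arbitrary code" idea can be sketched outside the database by walking a parsed expression tree and allowing only arithmetic nodes. This is a Python sketch, purely illustrative - nothing Oracle-specific here:

```python
import ast
import operator

# Whitelist of arithmetic operations; anything else is rejected.
OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Mod: operator.mod, ast.USub: operator.neg,
}

def evaluate(expr):
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

result = evaluate("55+42*123/3")   # -> 1777.0
```

Anything that is not plain arithmetic - a function call, an attribute lookup - raises an error rather than executing, which is exactly the property the "just use dynamic SQL" approach lacks.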

Thursday, April 15, 2010

to_char(dt,'yyyy') is almost as good as trunc(dt,'y')...

I'm in the middle of updating Expert Oracle Database Architecture Edition 1 to Edition 2. It'll cover up through Oracle Database 11g Release 2. While doing it, I've been discovering that "things change over time". Well, Ok, that is not a new discovery - I've said that before...

If you are interested in the new edition, you can read the unedited (and probably incorrect) updates now - buy the "alpha book" in PDF, which will entitle you to a full PDF of the final copy. See http://apress.com/book/view/1430229462 for details.


Anyway, while updating the book - I was re-running all of my 'test' scripts - the scripts I use to demonstrate points - and of course since things change - the results of some of the scripts have changed as well. Things like private redo strands affected my measurements in the Redo/Undo chapter (rendering some of the old examples useless...). And other small changes pop up and make themselves apparent.

For example, to_char(dt,fmt) has changed, for the better. It is faster...

In the first edition, I wrote:

For example, they will use
Where to_char(date_column,'yyyy') = '2005'
instead of
Where trunc(date_column,'y') = to_date('01-jan-2005','dd-mon-yyyy')
The latter is a far more performant and less resource-intensive approach.

Note: in the following - you would not use literals in the query, you would use bind variables to represent the dates! Just an example...


While I would still *prefer* you to use trunc(date_column,'y') - it says so much more than the to_char does semantically - it cannot be for the reason I wrote. In the first edition, I showed by executing a query against a copy of ALL_OBJECTS that it was more efficient:

select count(*)
from
t where to_char(created,'yyyy') = '2005'

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 4 0.01 0.05 0 0 0 0
Execute 4 0.00 0.00 0 0 0 0
Fetch 8 0.41 0.59 0 372 0 4
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 16 0.42 0.64 0 372 0 4

select count(*)
from
t where trunc(created,'y') = to_date('01-jan-2005','dd-mon-yyyy')

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 4 0.00 0.00 0 0 0 0
Execute 4 0.00 0.00 0 0 0 0
Fetch 8 0.04 0.16 0 372 0 4
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 16 0.04 0.16 0 372 0 4

but when I re-ran that in 11g Release 2 - I discovered:


select count(*)
from
t where to_char(created,'yyyy') = '2005'

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 4 0.00 0.00 0 0 0 0
Execute 4 0.00 0.00 0 0 0 0
Fetch 8 0.09 0.10 0 364 0 4
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 16 0.09 0.10 0 364 0 4

select count(*)
from
t where trunc(created,'y') = to_date('01-jan-2005','dd-mon-yyyy')


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 4 0.00 0.00 0 0 0 0
Execute 4 0.00 0.00 0 0 0 0
Fetch 8 0.07 0.07 0 364 0 4
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 16 0.07 0.07 0 364 0 4

*note - in the above, I used the same number of rows in 11g as in 10g

Well, that isn't too different anymore, is it? They made to_char a bit more efficient than it used to be - to_char and trunc are now about the same. In fact, that change happened with 10g Release 2! (Edition 1 of Expert Oracle Database Architecture covered up through 10g Release 1.)

Fortunately though, my ultimate conclusion - that:


select count(*)
from
t where created between to_date('01-jan-2005','dd-mon-yyyy') and
to_date('01-jan-2006', 'dd-mon-yyyy')-1/24/60/60


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 4 0.00 0.00 0 0 0 0
Execute 4 0.00 0.00 0 0 0 0
Fetch 8 0.00 0.00 0 364 0 4
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 16 0.00 0.00 0 364 0 4


is the right way to code that query still stands - even in 11g Release 2, it is significantly faster than either to_char or trunc. And it'll be infinitely more "index friendly" in general than either of them.
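The only subtle bit in that predicate is the -1/24/60/60: Oracle date arithmetic works in days, so that expression subtracts one second, turning "midnight of January 1st, 2006" into the last second of 2005. A quick sketch of the same unit arithmetic (Python, just to illustrate - not Oracle code):

```python
from datetime import datetime, timedelta

# Oracle date arithmetic is in days; 1/24/60/60 of a day is one second.
one_second = timedelta(days=1) / 24 / 60 / 60

range_start = datetime(2005, 1, 1)
range_end = datetime(2006, 1, 1) - one_second   # 2005-12-31 23:59:59
```

Since Oracle's DATE type has one-second granularity, that upper bound is safe for BETWEEN; with TIMESTAMP columns (which carry fractional seconds) you would want an explicit >= and < half-open range instead.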


So, things change - the tuning advice of yesterday might well be meaningless or - at least less meaningful - today...

Glad I have my test scripts :) It would be much harder to update the book without them - I wouldn't have a big red flag staring me in the face.

That is how today's myths get started - most of the myths surrounding the database have their origins in some nugget of truth from the past. But unless you have some method of seeing when things change, it'll be hard to figure out what has changed.

Wednesday, April 14, 2010

Been a while - new theme coming...

Been a while since I've written here. I've decided to start posting again from time to time, but I'm going to follow a theme for a while.

The theme will be "I didn't know that..." - or "What I learned new today..."

I am constantly amazed at what I don't know - given that I've been using Oracle for 23 years now, you might think that would be a small set of things. It isn't; it is quite large.

Here is one thing I learned new today. It is about SQL*Plus - a pretty simple tool that I've been using for a long, long time. I've been using it since it was an extra-cost option to the database!

Anyway, did you know that you could put @file into a SQL command?


ops$tkyte%ORA10GR2> !cat test.sql
select * from dual;

ops$tkyte%ORA10GR2> explain plan for
2 @test

Explained.

ops$tkyte%ORA10GR2> l
1 explain plan for
2* select * from dual
ops$tkyte%ORA10GR2>


Neat - I did not know that.


I have more, I'll let them leak out day by day (or so)...