Thursday, April 26, 2007

Version 42 has gone production...

Based on my decidedly tongue in cheek answer to question 13, Doug Burns showed me a link to the first inkling of version 42 of Oracle :)

...

The Oracle Divining Kit

For Accessing Psychic-Intuitive Information. Seven Tools (or more) in One Kit Fits in Pocket or Purse (3.25" x 5.25" x .5")

...

And we have the first truly "self healing" feature...  The irony on that page is funny.  If you listen to some of the advice on the internet as to how to tune Oracle - that "healing kit" almost makes sense - here is an excerpt!  I know I've read this same advice for database tuning!  Remember - usually three breaths is more than enough! And when you get really good - just one will do :)

Steps In Developing Divining/Diagnosis Skill:

  1. Disassociate by using the psychic breath. (Focus your awareness between your eyes, take a deep breath and move backward into the middle of your head, your psychic center, focusing inward, letting go of the conscious mind controller as you exhale.) Usually three breaths is enough; one breath will do it when you become skilled. When your conscious mind-chatter stills, and you feel yourself expand, you are in your psychic-subconscious space.
  2. Ask Permission First. Operate within the boundaries of spiritual law. Use your divining device to determine:
    • Can I Ask This Question? (Have I cleared my vested interest in the outcome of the question?)
    • May I Ask This Question? (Is it appropriate for me to have this information, will I use it correctly?)
    • Should I Ask This Question (Will I be intruding in someone else's privacy? Will this information injure me or someone else? Is this question to the highest and best spiritual good of all concerned? Is the information available to me?)
    • Do I Have Permission To Perform Healing Services For This Person? Sometime people have a vested interest in their illness, and it blocks the healing energies. Or, is it to their highest good for me to heal them?
  3. Be Explicit In Your Question: The subconscious mind is very literal, and operates like a computer.
  4. Avoid Habitual Responses: The subconscious mind will deliver information gathered in previous experiences first. Avoid this by ordering it to ignore all previous divining experiences concerning the subject. Ask the question a different way.
  5. Avoid Influences From Other People: Your subconscious mind picks up erroneous information from others telepathically, particularly if they have strong convictions on a subject. Block them out, focus on your own search, rely on yourself.
  6. Use Energy Sensitive Divining Devices! (Most important!) Any tool can be used in divining, just as in dowsing, that will indicate a positive or negative response from the subconscious mind. Pendulums, (any object suspended from a chain or string), L-rods, wands, etc. all may be used, as in dowsing. However, the more movement required by the hand to operate the device, the less reliable is the answer. Exaggerated, or slow movements (as with most dowsing tools) gives the conscious mind opportunity to over-ride the subconscious with preconception, habitual response, and vested interest.

but wait - that's not all - if you order right now.... :)  very funny.

How not to ask 20 questions...

Yesterday I pointed to a thread on otn that was purportedly started by a really obnoxious question.  While today I think it might have been a troll (since the query was in fact cut and pasted from an older thread), it is not out of the realm of possibility.  It very easily could be "real".  I see them all of the time.

These sorts of demands come in various shades of "overbearingness" - from "almost ok, but a tad pushy" to "way over the top".  The example yesterday was in the "way over the top" category - and they happen quite often.  The example I bring to you today is in the middle somewhere.  Polite, almost - but with a very "entitled" feel to it.  I got home quite late last night from traveling and checked my email - the email at the bottom was delivered at 1:30am and was sent by someone whose questions I had answered on asktom earlier that day.  I answered their multipart question (something I ask people not to do on the form they use to submit questions - do not ask more than one question) using a "search" link on asktom (eg: I answered by providing a link that just searched asktom using their question terms as the input...).  I also pointed out how "insufficiently framed" the other questions were and sort of guessed at their intent.  I guess I overachieved because now it was time to play 20 questions...  I answered their email with "hahahaha, a few questions - funny.  Amazing".

I wonder how many of them have answers that could be found via search...  They have pretty much all been asked and answered (a couple of times..)

Just for grins, I was almost tempted to answer with amusing (to me anyway) responses, like this one for #13:

13 - That is a feature planned for version 42 of Oracle.  It will be part of the prescient option under design now.  Using this new feature, you won't actually have to load any data into the database - it'll just know the answers to your questions.

I was too tired though :)

Hi Thomas,

Here I've few more questions. Please send me the answers. Its quite
urgent.

1.How do I return a result set in batches, e.g. get rows 1 - 10, then
11-20, then next 10 and so on ?

2.How do I report the top 10 product members of the sales team in each
state/county ?

3.How do I embed single-quote marks in an SQL statement

4. Oracle seems to be very slow reporting large execution plans through
explain plan. Is there any way to speed this up ?

5.What is the most efficient way to update one table with values from
another table ? (Update/Delete with subquery) (170K file)

6.How do I find my SID (session id) from SQL*Plus

7.In a 'start with / connect by' query, how do I get child rows under
each parent to come out in sorted order ?

8.What bitwise operations does Oracle support ?

9.How can I convert numbers from Hex to Decimal (or any other bases) ?

10.What's the quickest way of deleting all (or a large fraction of) the
data in a table ?

11.I am using to_char(num_col,'9999') to get a 4-character output, but
is keeps coming out one column too wide. What's wrong ?

12.How do you write a query that ignores the effects of upper and lower
case ?

13.Can I find out how many rows will be affected by an SQL statement
without actually running it ?

14.How can I select a random collection of rows from a table ?

15.How do I export a database table to a flat file ?

16.What is the best way to identify and eliminate duplicate rows in a
table ?

17.How can I find out how many rows were affected by an SQL statement
after I have executed it ?

18.Is it possible to write a trigger referencing more than one table - I
want a trigger to audit a master/detail pair.

19.Is it possible to easily convert existing SQL scripts into a format
that cannot be read or altered? This is to improve security on an
existing site.

20.Is there a way to read the names of a set of files with a given
extension from a directory as if from a SQL cursor

Wednesday, April 25, 2007

How not to ask a question...

A friend of mine sent me this link.... Ouch, that hurts just to read it.

Monday, April 23, 2007

Share and Share alike...

I learned two new things today :) 

First, something about a PLSQL memory optimization that takes place at the 2000 byte boundary - one that affected the findings of a test case...

And second, something Alberto Dell'Era said on the same thread that made me think and go "hmmmm" for a minute.

I did not realize that running my test backwards would come to a totally different outcome, very cool.  Another bit of trivia that can be used to explain the sometimes unexplainable.  The number of child cursors you'll see due to a length bind mismatch depends on the initial order of execution after an instance is started.  The child cursors are created with "small bind spaces" that can grow by spawning new child cursors - but if the first execution binds a big value and the rest get smaller - that won't happen.  Read this entry, and compare this simple test to it:

ops$tkyte%ORA10GR2> create table t ( x varchar2(2000) );
Table created.

ops$tkyte%ORA10GR2> declare
2 a varchar2(1) := 'x';
3 b varchar2(100) := rpad('x',100,'x');
4 c varchar2(500) := rpad('x',500,'x');
5 d varchar2(1000) := rpad('x',1000,'x');
6 begin
7 insert into t big_to_small values(d);
8 insert into t big_to_small values(c);
9 insert into t big_to_small values(b);
10 insert into t big_to_small values(a);
11 insert into t small_to_big values(a);
12 insert into t small_to_big values(b);
13 insert into t small_to_big values(c);
14 insert into t small_to_big values(d);
15 end;
16 /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> column sql_text format a45
ops$tkyte%ORA10GR2> column address new_val ADDR
ops$tkyte%ORA10GR2> select sql_text, address, child_address
2 from v$sql
3 where sql_text like 'INSERT%INTO%T%VALUES(%:B1%)'
4 /

SQL_TEXT ADDRESS CHILD_AD
--------------------------------------------- -------- --------
INSERT INTO T BIG_TO_SMALL VALUES(:B1 ) 29A77AEC 29AEC3DC
INSERT INTO T SMALL_TO_BIG VALUES(:B1 ) 260543F8 2993E05C
INSERT INTO T SMALL_TO_BIG VALUES(:B1 ) 260543F8 2994BB4C
INSERT INTO T SMALL_TO_BIG VALUES(:B1 ) 260543F8 29B3AA68

Sharing SQL...

A question that pops up from time to time is:

I'm using binds, just like you said, but there are still multiple copies of my SQL in the shared pool.  Why is that?

There could be many reasons for that - and if you peek at the documentation for V$SQL_SHARED_CURSOR, you'll see many of them.

For example, this morning - I received an email that said in part:

 

For my post at AskTom, I am confused, after changing the code, it is not shared 100%, see this
SQL_TEXT                                                     PARSE_CALLS EXECUTIONS ADDRESS
------------------------------------------------------------ ----------- ---------- --------
SELECT ROWID,TITLE_NO,MATERIAL_NO,EDITION_NO,TITLE_NAME,AUTH         424        423 B040A9B0
OR,SUBJECTS,ED_PUB,CLASS_NO,MATERIAL_NAME,PUB_DATE,LANGUAGE_
NAME,PAGES,LOCATIONS FROM VM_LIB_MASTER WHERE TITLE_NAME LIK
E :1
SELECT ROWID,TITLE_NO,MATERIAL_NO,EDITION_NO,TITLE_NAME,AUTH         174        176 B040A9B0
OR,SUBJECTS,ED_PUB,CLASS_NO,MATERIAL_NAME,PUB_DATE,LANGUAGE_
NAME,PAGES,LOCATIONS FROM VM_LIB_MASTER WHERE TITLE_NAME LIK
E :1


The same statement was repeated.

I think this will almost certainly turn out to be a "BIND MISMATCH".  They are binding :1, it is probably a string - and Oracle will reserve cursor space for the bind variables with the cursor.  Oracle does this in "steps" - so if you bind between 0 and 32 bytes - that is one bind size.  Greater than 32, less than or equal to 128 - another.  You can observe this behavior as follows (while trashing your shared pool of course :) - do not do this on your real system please!):

ops$tkyte%ORA10GR2> create table t ( x varchar2(4000) );

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
2 l_last number := 0;
3 l_curr number := 0;
4 begin
5 for i in 1 .. 4000
6 loop
7 execute immediate
8 'declare
9 a varchar2('||i||') := rpad(1,'||i||',1);
10 begin
11 insert into t look_for_me values(a);
12 end;';
13
14 select count(*) into l_curr
15 from v$sql
16 where sql_text like 'INSERT%INTO%T%LOOK_FOR_ME%VALUES(%:B1%)';
17
18 if ( l_last <> l_curr )
19 then
20 dbms_output.put_line( 'New version at byte: ' || i );
21 l_last := l_curr;
22 end if;
23 end loop;
24 end;
25 /
New version at byte: 1
New version at byte: 33
New version at byte: 129
New version at byte: 2001

PL/SQL procedure successfully completed.

 So, I dynamically executed a PLSQL snippet that would read:

declare
a varchar2(I) := rpad(1,I,1);
begin
insert into t look_for_me values (a);
end;

with different values for I at runtime.  I looked in v$sql to see when a new version would pop up - and just reported that out.  So, for that one insert statement - which will always appear the same in V$SQL - there could be four versions:

ops$tkyte%ORA10GR2> select sql_text from v$sql
2 where sql_text like 'INSERT%INTO%T%LOOK_FOR_ME%VALUES(%:B1%)';

SQL_TEXT
------------------------------------------------------------
INSERT INTO T LOOK_FOR_ME VALUES(:B1 )
INSERT INTO T LOOK_FOR_ME VALUES(:B1 )
INSERT INTO T LOOK_FOR_ME VALUES(:B1 )
INSERT INTO T LOOK_FOR_ME VALUES(:B1 )

ops$tkyte%ORA10GR2>

Now, that doesn't sound so bad - but - what happens when there are TWO binds, how many possible SQL statements could we end up with then?

ops$tkyte%ORA10GR2> create table t2 ( x varchar2(4000), y varchar2(4000) );

Table created.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> declare
2 type array is table of number;
3 l_data array := array(1,33,129,2001);
4 l_last number := 0;
5 l_curr number := 0;
6 begin
7 for i in 1..l_data.count
8 loop
9 for j in 1..l_data.count
10 loop
11 execute immediate
12 'declare
13 a varchar2('||l_data(i)||') := rpad(1,'||l_data(i)||',1);
14 b varchar2('||l_data(j)||') := rpad(1,'||l_data(j)||',1);
15 begin
16 insert into t2 look_for_me values(a,b);
17 end;';
18 dbms_output.put_line( l_data(i) || ', ' || l_data(j) );
19 select count(*)
20 into l_curr
21 from v$sql
22 where sql_text like 'INSERT%INTO%T2%LOOK_FOR_ME%VALUES(%)';
23 if ( l_last <> l_curr )
24 then
25 dbms_output.put_line( 'change at ' || l_data(i) || ', ' || l_data(j) );
26 l_last := l_curr;
27 end if;
28 end loop;
29 end loop;
30
31 end;
32 /
1, 1
change at 1, 1
1, 33
change at 1, 33
1, 129
change at 1, 129
1, 2001
change at 1, 2001
33, 1
change at 33, 1
33, 33
33, 129
33, 2001
129, 1
change at 129, 1
129, 33
129, 129
129, 2001
2001, 1
change at 2001, 1
2001, 33
2001, 129
2001, 2001

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2>
ops$tkyte%ORA10GR2> select sql_text
2 from v$sql
3 where sql_text like 'INSERT%INTO%T2%LOOK_FOR_ME%VALUES(%)';

SQL_TEXT
--------------------------------------------------
INSERT INTO T2 LOOK_FOR_ME VALUES(:B2 ,:B1 )
INSERT INTO T2 LOOK_FOR_ME VALUES(:B2 ,:B1 )
INSERT INTO T2 LOOK_FOR_ME VALUES(:B2 ,:B1 )
INSERT INTO T2 LOOK_FOR_ME VALUES(:B2 ,:B1 )
INSERT INTO T2 LOOK_FOR_ME VALUES(:B2 ,:B1 )
INSERT INTO T2 LOOK_FOR_ME VALUES(:B2 ,:B1 )
INSERT INTO T2 LOOK_FOR_ME VALUES(:B2 ,:B1 )

7 rows selected.

Sort of interesting and maybe not what you would expect - you might have expected 16 (or 4).  But, it is seven.


So, how to prevent this type of bind mismatch?  By using consistent bind lengths.  In 10g - this is easy, just use PL/SQL and a consistent variable length:

ops$tkyte%ORA10GR2> create table t ( x varchar2(2000) );

Table created.

ops$tkyte%ORA10GR2> declare
2 a varchar2(1000) := 'x';
3 b varchar2(1000) := rpad('x',100,'x');
4 c varchar2(1000) := rpad('x',500,'x');
5 d varchar2(1000) := rpad('x',1000,'x');
6 begin
7 insert into t values(a);
8 insert into t values(b);
9 insert into t values(c);
10 insert into t values(d);
11 end;
12 /

PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> column address new_val ADDR
ops$tkyte%ORA10GR2> select parsing_user_id puid, parsing_schema_id psid,
2 sql_text, address, child_address
3 from v$sql
4 where sql_text like 'INSERT%INTO%T%VALUES(%:B1%)'
5 /

PUID PSID SQL_TEXT ADDRESS CHILD_AD
---------- ---------- ------------------------------ -------- --------
125 125 INSERT INTO T VALUES(:B1 ) 40D926A8 40D925C4


Unfortunately, in 9i - that won't work the same:

ops$tkyte%ORA9IR2> select parsing_user_id puid, parsing_schema_id psid,
2 sql_text, address, child_address
3 from v$sql
4 where sql_text like 'INSERT%INTO%T%VALUES(%:B1%)'
5 /

PUID PSID SQL_TEXT ADDRESS CHILD_AD
---------- ---------- ------------------------------ -------- --------
72 72 INSERT INTO T VALUES(:B1 ) 5BAD9938 5BE36514
72 72 INSERT INTO T VALUES(:B1 ) 5BAD9938 5BB81798
72 72 INSERT INTO T VALUES(:B1 ) 5BAD9938 5BC2E21C


In 9i PLSQL, we have to do a bit more work - padding the bind out to the max length and using rtrim to pull it back in.  (So, you would do this ONLY if you had big bind mismatch problems - any varchar2 less than 128 bytes would never need this treatment, and any varchar2 that is "almost always between 200 and 500 characters" would probably not need it either.)

ops$tkyte%ORA9IR2> create table t ( x varchar2(2000) );

Table created.

ops$tkyte%ORA9IR2> declare
2 a varchar2(1000) := 'x';
3 b varchar2(1000) := rpad('x',100,'x');
4 c varchar2(1000) := rpad('x',500,'x');
5 d varchar2(1000) := rpad('x',1000,'x');
6 begin
7 a := rpad(a,1000);
8 insert into t values(rtrim(a));
9 b := rpad(b,1000);
10 insert into t values(rtrim(b));
11 c := rpad(c,1000);
12 insert into t values(rtrim(c));
13 d := rpad(d,1000);
14 insert into t values(rtrim(d));
15 end;
16 /

PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2> column address new_val ADDR
ops$tkyte%ORA9IR2> select parsing_user_id puid, parsing_schema_id psid,
2 sql_text, address, child_address
3 from v$sql
4 where sql_text like 'INSERT%INTO%T%VALUES(%:B1%)'
5 /

PUID PSID SQL_TEXT ADDRESS CHILD_AD
---------- ---------- --------------------------------- -------- --------
72 72 INSERT INTO T VALUES(RTRIM(:B1 )) 5BAEDDEC 5BAED6B8

Now, that said - bind mismatch is only one of many possible reasons for this.  So, take a peek at V$SQL_SHARED_CURSOR next time you see sql that looks like it should be shared - but isn't - to see why...
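For reference, here is a minimal sketch of the kind of query you could use to line a statement's child cursors up with the reasons they were not shared.  (The flag column list is trimmed here - the exact set of Y/N columns in V$SQL_SHARED_CURSOR varies by release - and the sql_text filter is just the one from the example above.)

select s.sql_text, c.child_address, c.bind_mismatch, c.optimizer_mismatch
  from v$sql s, v$sql_shared_cursor c
 where c.address = s.address
   and c.child_address = s.child_address
   and s.sql_text like 'INSERT%INTO%T%VALUES(%:B1%)'
 order by s.sql_text, c.child_address;

Each flag column is simply 'Y' or 'N' - a 'Y' in BIND_MISMATCH for a given child cursor tells you that child exists because of a bind metadata difference, such as the graduated bind lengths above.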

A quick Advert...

Check out https://conference.oracle.com/imtapp/app/conf_enrollment.uix?mID=73399764 if you have time around 2pm today (east coast US time).  The key is oracle2

Thursday, April 19, 2007

A reader asks...

I was asked this question on asktom, but it seems more appropriate here than over there.

I would like to ask a bit different question than the one that you are answering on this site because I am really really wondering on this for quite a while..  How long you are working on SQL and would like to know how you go about answering questions in such different dimension be in related to tuning , DBA activities,SQL queries , backups.... name the topic and tom has answer for it..  some times the questions themselves are so vague and confusing that one needs to scratch his/her head to make out what is being asked, but still you seems to answering these correctly, how you go about all this...  pls ignore this in case you feel like this question does not match with motto of this site ..

To me, it all comes down to one thing, well - ok, two things:

  • Experience
  • Specialization

Maybe the two are really the same - if you are really experienced, you must have specialized.  If you specialize - you can become very experienced.

It takes years, double digit number of years sometimes - to become truly experienced.  I still remember learning SQL in 1987 (using SQL/DS on VM/CMS and DB2 on MVS).  In 1993, when I first joined Oracle, I knew SQL - but I didn't know SQL yet.  I can say the same thing about C however (it is not just SQL, any technology/language can fit there).  I've never achieved the knowledge of C that I currently have of SQL, in fact.  But I am really really good at programming C.

After a while, you get to see so many things, you can appear "magical" sometimes.  Just this morning someone came to asktom and asked about some "sqlplus settings".  Turns out - the problem they were experiencing had nothing whatsoever to do with sqlplus at all - it was a server setting.  It was, however, a problem I've seen time and time and time again.  I recognized it immediately.  But if you look at the question - it is pretty vague, not much to go on.  But since it followed a pattern I had seen so many times before - I felt pretty confident I knew what it was.  Other experienced people (Michel Cadot there - on that thread for example) didn't pick up on it - but only because they had never seen it before.  Which in this case - could be a good thing for them (it is good for Michel that they weren't aware of it).  The only people that would be aware of the stated issue would be those that had been subjected to the evil cursor_sharing=force/similar setting.  It could well be that Michel just was never exposed to that!  That would put them into the "very lucky" category.

Which brings up another point: the breadth of experience.  Prior to coming to Oracle in 1993 - I had one employer.  All of my experiences were based on that one employer and what I did there.  I was a PL/1 and SQL programmer on a mainframe.  I was a SAS programmer (they called me "sas master" way back when).  I was a C and embedded SQL programmer, I did some Ada on the side.  I became the lead database guy (Developer/DBA was my role).  So, in six years I had 4 evolutionary roles working on 4 different projects.  That gave me a fairly broad look at things - or so I thought.  When I came to Oracle - all of a sudden I was working with dozens of customers on dozens and dozens of systems.  I really became "broad" then - it was very much like consulting (but I didn't charge, I am in pre-sales).  The more experiences you can get yourself exposed to - the more experienced you become.  In dealing with so many different people, using so many different tools, on so many different platforms, under so many different circumstances, with so many different sets of standard operating procedures - I learned a lot.

What I learned most of all - if I did things in the database - man, was I portable, debuggable, tunable, flexible, whatever-able :)  It was those early experiences that really made me database centric, to treat the database as my operating system.  I would work on Windows or maybe use a Solaris server - but not really know what I was going to be deployed on.  As long as I did 99% of the work in the database - I was pretty confident it would roll out on whatever I was going to have to work on.

So, in short, it all comes down to experience, your experiences (work on the same project for 10 years and - well.... I'd rather work on 10 projects in 10 years or 20 projects or more), and whether you were able to really focus in on what you enjoy doing.

And - I will refer people to:

The fine art of asking a question in a sensible way that at least has a chance of getting an intelligent response is one many of us need to work on....

Why understanding...

Why understanding matters.  I think many will find this interesting.  Interesting because of the ways to keep dependent data within a group that are demonstrated... and mostly interesting because of the implications of read consistency and the effect of that on recursive SQL (SQL executed on behalf of your SQL - in this case - SQL hidden in PL/SQL called from SQL)...

Friday, April 13, 2007

What is old is new again...

So, a year and a half ago - I wrote about what I term 'IM speak'.  The thread had been idle for a long time - when all of a sudden yesterday, it started getting tons of comments.  So far - 40 new comments.  From a readership I do not normally get :)  High school students, from here.

All I know is I got linked to from their private discussion forum, so I cannot see the discussion that is causing them to come and read that entry - but come and read they are, a good 5% of my traffic is coming from their discussion board this morning.

Interesting to read their comments - but I will say they need to understand what hyperbole is all about (the 'IM speak' example I got from the aoltranslator site is of course an exaggeration).

 

updated a bit later

Neat - I see it became part of a school assignment to read that entry.  Small world - I just spoke at American University as part of a class.  Last week during spring break - the local Oracle Academy class came in and I got to speak to them (brought my kids Alan and Megan to listen in as well - the talk was about careers and how mine progressed - they had never seen me do my job, so this was a good first presentation to hear).  Next month, I'm doing a career day in Brooklyn as well. 

Development DBA...

Oxymoron or a good thing?  That is the question - one that I have a definite opinion on.  A database application under development will succeed or fail based largely on whether it was developed by a database savvy group of people - a DBA/Developer and a Developer/DBA set of people.  We don't need a DBA, we don't need coders (developers) - we need Developers with DBA'ish knowledge, we need DBAs with knowledge of the needs of developers.  If you have that - you will increase the likelihood of success in a large way.

Doug Burns has just formally written this up - warning, the link I'm pointing to will actually include "SQL Server" content :)  He wrote "What use is a Development DBA" - check it out.

Tuesday, April 10, 2007

Hooray...

I really liked this tongue in cheek article.

And I have a feeling it will generate some level of debate...  I mostly agree with it (sort of like a diatribe about generic coding in general would get a thumbs up from me).

Friday, April 06, 2007

When the explanation doesn't sound quite right...

I was asked recently

Under what conditions, autotrace & explain plan can not give the correct execution plan of a sql?

The question came in email - not via asktom.  But I found it of enough general interest to write about it here.  As an aside, I rarely, if ever, answer questions emailed to me directly.  It just isn't a scalable solution.  This is my attempt to make this answer "scale"...

To start with the answer to this - we need to understand that autotrace is just a feature of SQL*Plus that automates an explain plan for us - so, autotrace and explain plan are sort of synonymous in this regard.  I'll be using Oracle 10g Release 2 in these examples, with autotrace or sql_trace=true and TKPROF - you can get the same results in 9i and later using EXPLAIN PLAN and DBMS_XPLAN.DISPLAY to see the results.
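For completeness, a minimal sketch of that EXPLAIN PLAN / DBMS_XPLAN.DISPLAY approach (the query here is just a placeholder - substitute your own):

explain plan for
select id, object_name from t where id = :some_bind;

select * from table( dbms_xplan.display );

Note that explain plan will happily accept a bind it has never seen a value for - which, as we'll see below, is part of the problem.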

Explain Plan is in the here and now...

This is the first problem with explain plan - it is in the "here and now".  It uses the current optimizer environment, the current set of statistics and so on.  That means the explain plan you see in a tkprof could differ from the REAL PLAN used 5 minutes ago (when performance was 'bad').  For example:

ops$tkyte%ORA10GR2> create table t
2 as
3 select a.*, 1 id
4 from all_objects a
5 where rownum = 1;
Table created.

ops$tkyte%ORA10GR2> create index t_idx on t(id);
Index created.

ops$tkyte%ORA10GR2> alter session set sql_trace=true;
Session altered.

ops$tkyte%ORA10GR2> select id, object_name from t where id = 1;
ID OBJECT_NAME
---------- ------------------------------
1 ICOL$

ops$tkyte%ORA10GR2> insert into t select a.*, 1 from all_objects a;
50338 rows created.

ops$tkyte%ORA10GR2> select id, object_name from t where id = 1;
ID OBJECT_NAME
---------- ------------------------------
1 ICOL$
...
1 WRH$_TABLESPACE_STAT
50339 rows selected.

ops$tkyte%ORA10GR2> alter session set sql_trace=false;
Session altered.


Now, running TKPROF:

$ tkprof /home/ora10gr2/admin/ora10gr2/udump/ora10gr2_ora_12173.trc ./tk.prf 
aggregate=no sys=no explain=/

We will discover (in 10g, where dynamic sampling will kick in!) this conundrum:

select id, object_name from t where id = 1

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 3357 0.33 0.28 0 7490 0 50339
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3359 0.33 0.28 0 7490 0 50339

Rows Row Source Operation
------- ---------------------------------------------------
50339 TABLE ACCESS BY INDEX ROWID T (cr=7490 pr=0 pw=0 time=402830 us)
50339 INDEX RANGE SCAN T_IDX (cr=3478 pr=0 pw=0 time=151058 us)(object id 70390)


Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
50339 TABLE ACCESS (FULL) OF 'T' (TABLE)

Note how the "Row Source Operation" (what I like to call 'reality') differs from the "Execution Plan"  (I'll call that the 'guess').  What happened here was that the row source operation is captured in the trace file at the time of execution - it reflects what REALLY took place as that query executed.  We followed this sequence of operations:



  1. loaded a single row into the table T
  2. ran a query against T - that did a hard parse.  At the time of the hard parse, Oracle 10g dynamically sampled the table and found it to be very small - and id=1 to be rather selective.  Based on that - it said "let's range scan the index"
  3. loaded a lot more data into the table.  All with the same ID=1 value however. 
  4. ran the same query from step 2 - this was a soft parse and just reused the plan generated back when only one row existed in the table.  As you can see however - that led to inefficient execution.  We read every row from that table via the index.
  5. Executing TKPROF with explain= shows an entirely different plan.  That is because explain plan always does a hard parse, it evaluated the query plan "in the here and now, as of this moment in time".  It dynamically sampled the table again - found it to be large and ID=1 to not be selective.  Explain plan shows us that if we hard parsed that query right now - it would full scan.  However, all executions of that query in "real life" will index range scan as long as that plan is cached in the shared pool...

An important note for this example - the placement of the ALTER SESSION SET SQL_TRACE=TRUE is important.  I needed to set it before running the query the first time.  As an exercise - move it to just before the second execution of the query and you'll find (from the tkprof) that the query is hard parsed the second time - and the row source operation in the tkprof will be a full scan.  That is because the first time a query is executed with sql_trace=true (as opposed to the default of false), it will be hard parsed - as of right now.


 Explain plan is blind to the bind


Explain plan does not "bind peek".  This is pretty easy to observe:

ops$tkyte%ORA10GR2> create table t
2 as
3 select a.*, 1 id
4 from all_objects a
5 where rownum <= 5000;
Table created.

ops$tkyte%ORA10GR2> update t
2 set id = 99
3 where rownum = 1;
1 row updated.

ops$tkyte%ORA10GR2> create index t_idx on t(id);
Index created.

ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'T', method_opt=> 'for all indexed columns size 254' );

PL/SQL procedure successfully completed.

So we have created some skewed data.  If we say "where id=1", we would expect a full scan (index would be inefficient).  If we say "where id = 99", we would expect an index range scan - as id=99 returns a single row.  Using two queries that differ only in bind names (which is sufficient to prevent cursor sharing - these are two DIFFERENT queries to Oracle!), we'll execute a query with a bind set to the value 99 and then another with a bind set to 1.

ops$tkyte%ORA10GR2> variable x_is_99_first number
ops$tkyte%ORA10GR2> variable x_is_1_first number
ops$tkyte%ORA10GR2> exec :x_is_99_first := 99; :x_is_1_first := 1;
PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> alter session set sql_trace=true;
Session altered.

ops$tkyte%ORA10GR2> select id, object_name from t where id = :x_is_99_first;

ID OBJECT_NAME
---------- ------------------------------
99 ICOL$

ops$tkyte%ORA10GR2> select id, object_name from t where id = :x_is_1_first;

ID OBJECT_NAME
---------- ------------------------------
1 I_USER1
....
1 USER_SCHEDULER_PROGRAM_ARGS
4999 rows selected.

 


Now we'll just flip flop the values and re-execute the queries.  Note that they will soft parse, just reuse the existing plans generated from above.

ops$tkyte%ORA10GR2> exec :x_is_99_first := 1; :x_is_1_first := 99;
PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> select id, object_name from t where id = :x_is_99_first;

ID OBJECT_NAME
---------- ------------------------------
1 I_USER1
...
1 USER_SCHEDULER_PROGRAM_ARGS
4999 rows selected.

ops$tkyte%ORA10GR2> select id, object_name from t where id = :x_is_1_first;

ID OBJECT_NAME
---------- ------------------------------
99 ICOL$

ops$tkyte%ORA10GR2> alter session set sql_trace=false;
Session altered.

Reviewing the TKPROF report first for the "x is 99 first" query we see:

select id, object_name from t where id = :x_is_99_first

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 3 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 3 0 1

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 120 (OPS$TKYTE)

Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS BY INDEX ROWID T (cr=3 pr=0 pw=0 time=59 us)
1 INDEX RANGE SCAN T_IDX (cr=2 pr=0 pw=0 time=32 us)(object id 70394)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
1 TABLE ACCESS MODE: ANALYZED (FULL) OF 'T' (TABLE)
********************************************************************************
select id, object_name from t where id = :x_is_99_first

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 335 0.02 0.02 0 739 0 4999
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 337 0.02 0.02 0 739 0 4999

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 120 (OPS$TKYTE)

Rows Row Source Operation
------- ---------------------------------------------------
4999 TABLE ACCESS BY INDEX ROWID T (cr=739 pr=0 pw=0 time=50042 us)
4999 INDEX RANGE SCAN T_IDX (cr=344 pr=0 pw=0 time=30018 us)(object id 70394)


Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
4999 TABLE ACCESS MODE: ANALYZED (FULL) OF 'T' (TABLE)


 


So, the "real plan" used is an index range scan - both times.  But, explain plan - which cannot, does not bind peek - will say "full scan".  The reason?  explain plan is optimizing "select * from t where id = ?" - and it says "5,000 rows, 2 values of id, id is not selective, full scan".  But the optimizer is optimizing the query "select * from t where id = 99" - because it peeked at the bind the first time!  The soft parse won't peek (else it would be a hard parse!) and just reused the existing plan - the inefficient range scan to read every row out.


On the other hand, looking at the "x is 1 first" query:

select id, object_name from t where id = :x_is_1_first

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 335 0.01 0.01 0 398 0 4999
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 337 0.01 0.01 0 398 0 4999

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 120 (OPS$TKYTE)

Rows Row Source Operation
------- ---------------------------------------------------
4999 TABLE ACCESS FULL T (cr=398 pr=0 pw=0 time=15094 us)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
4999 TABLE ACCESS MODE: ANALYZED (FULL) OF 'T' (TABLE)
********************************************************************************
select id, object_name from t where id = :x_is_1_first

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 67 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 67 0 1

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 120 (OPS$TKYTE)

Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS FULL T (cr=67 pr=0 pw=0 time=82 us)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
1 TABLE ACCESS MODE: ANALYZED (FULL) OF 'T' (TABLE)

Explain plan appears to have gotten it right - but only by accident.  It is just a coincidence that the plans "match" - they were arrived at by very different thought processes.  The optimizer optimized 'where id=1' and said "about 5,000 rows, about 4,999 will be returned, full scan".  The explain plan optimized "where id=?" and said "about 5,000 rows in the table, two values for ID, about 50% of the table will be returned, full scan".


So, that example shows explain plan "getting it wrong" because it is blind to the bind - and shows the effect of bind variable peeking (which you can read more about on asktom using the link above...)


Explain plan doesn't see your datatype...


The last bit about explain plan I'll look at is the fact that explain plan doesn't see your bind datatype.  It presumes all binds are varchar2's regardless of how the developer is binding.  Consider:

ops$tkyte%ORA10GR2> create table t
2 ( x varchar2(10) primary key,
3 y date
4 );
Table created.

ops$tkyte%ORA10GR2> insert into t values ( 1, sysdate );
1 row created.

ops$tkyte%ORA10GR2> variable x number
ops$tkyte%ORA10GR2> exec :x := 1
PL/SQL procedure successfully completed.

ops$tkyte%ORA10GR2> alter session set sql_trace=true;
Session altered.

ops$tkyte%ORA10GR2> select * from t where x = :x;

X Y
---------- ---------
1 06-APR-07

ops$tkyte%ORA10GR2> alter session set sql_trace=false;
Session altered.

So, we have a table with a varchar2 datatype for the primary key - but we only stuff numbers in there.  End users and developers know it is always a number and then presume the type is a number (makes sense) - but someone used the wrong datatype (just in case maybe....).  When we look at the TKPROF we'll see the explain plan mismatch:

select * from t where x = :x

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 1 0 0
Fetch 2 0.00 0.00 0 7 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 8 0 1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 120 (OPS$TKYTE)

Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS FULL T (cr=7 pr=0 pw=0 time=76 us)

Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT MODE: ALL_ROWS
1 TABLE ACCESS (BY INDEX ROWID) OF 'T' (TABLE)
0 INDEX (UNIQUE SCAN) OF 'SYS_C0013586' (INDEX (UNIQUE))



 


Explain plan - the 'execution plan' - shows an index unique scan, but reality (the row source operation) shows we full scanned.  DBMS_XPLAN (autotrace in 10gR2 uses this package, which was introduced in 9iR2 - you can use it directly if you like) shows us why we are full scanning:

ops$tkyte%ORA10GR2> select * from t where x = to_number(:x);

Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 16 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 1 | 16 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(TO_NUMBER("X")=TO_NUMBER(:X))

Note
-----
- dynamic sampling used for this statement

So, when I told explain plan "hey, we have a NUMBER here" using to_number(), we can see what happened.  In the predicate information, we see that when you compare a number to a string, Oracle will apply TO_NUMBER() to the string.  That of course makes using the index on the string column impossible!


So, this example shows two things.  Firstly, that explain plan assumes varchar2 (so use to_number, to_date or literals to convey the right type to explain plan!).  Secondly, that implicit conversions are evil and should always be avoided.
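And just to close the loop - a minimal sketch of what happens when the bind is declared with the column's real datatype (a varchar2 here).  No conversion gets wrapped around the column, so the primary key index is usable again - and in this case explain plan and the real plan will agree, because the bind genuinely is a varchar2:

variable x varchar2(10)
exec :x := '1'

select * from t where x = :x;

select * from table( dbms_xplan.display_cursor );
-- with serveroutput off, this shows the plan of the query we just ran -
-- expect an INDEX UNIQUE SCAN of the primary key index, not a full scan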

Tuesday, April 03, 2007

So, in your opinion...

I was asked to review a SQL feature under consideration. It looks pretty neat - pattern matching over partitioned, ordered rows. Answer questions like "find me the stocks that are exhibiting a 'W' pattern" (trying to catch them on the up-swing). Or maybe "find me things that started out expensive, but have (rapidly) over time become very cheap". Or network usage / audit trail analysis (looking for specific patterns of behavior).

So, I got the paper (you can get it here) and read it. I gave my feedback, but I'd like to hear yours - specifically:

  • Do you think you'd find a SQL feature like this useful - truly useful?
  • Do you think the suggested implementation is "complete" enough?
  • Anything else you'd like to say.

I can assure you the people working on this will be reading your feedback - definitely.

A word of warning though - the paper is a bit "dense", it might well take more than a single reading to 'get it'. A working knowledge of the existing set of analytic functions and their syntax would be really useful. If I didn't know analytics as well as I do - I'm not sure how far I'd make it through the paper and still be able to understand it.

I can say - I've seen the SQL needed to recognize a "W" using analytics...  It is not "pretty" - this proposed language feature would make it pretty easy, once you 'got it'.

You just have to hope it isn't really true...

Sometimes, you read something and just think to yourself "this cannot really have happened".  But in the back of your head, you just know "of course it did".


Unique passwords system wide. Stored in clear text. Used as primary/foreign keys even (a password??!?!). Massive update cascade problems - just to update your password. Never mind that once you know that passwords must be unique and that you get an error message to the effect "that password is already in use", you can start building a dictionary to attack with pretty easily.
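Just to make the data modeling problem concrete - a completely hypothetical sketch (these table and column names are made up, not taken from the story) of what "password as primary/foreign key" implies:

create table users
( password  varchar2(30) primary key,   -- clear text, must be unique system wide
  username  varchar2(30)
);

create table orders
( order_id  number primary key,
  password  varchar2(30) references users(password)   -- child rows point at the password
);

-- changing "your" password is no longer a single row update.  With an immediate
-- foreign key you end up doing something like: create the new parent row, repoint
-- every child row, then remove the old parent row:
insert into users ( password, username )
  select :new_pwd, username from users where password = :old_pwd;
update orders set password = :new_pwd where password = :old_pwd;
delete from users where password = :old_pwd;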


You probably cannot make that story up :)


On another note


Here is an interesting approach to phishing.  Instead of trying to "trick" me out of money, this fellow is just asking me for some - very politely and all, but still.  Does this stuff actually work?  I mean - who would actually give them something?  I have to think that someone smart enough to initiate an electronic payment like that would be/should be smart enough to just say "no" :)



Good day to you.
I am XXXXXX.


I work in a telecommunication company and to be specific,I work as an information system Auditor in the company.


I am a holder of certificates in the field of Information Technology and Accounting. I have registered for CISA examination{Certified information system auditor}. A friend of mine living in US have assisted me by sending a total of 390 dollars into the CISA account due to the difficulty faced in my country in sending money out of the country.It is not easy sending money out of the country.I am expected to send 480 dollars.It means I have a balance of 90 dollas to send to CISA in US.The deadline for the payment is 11th of Afril. Please I will be glad if I could be assisted by you paying in 90 dollas into the account.The bank name:Lasalle national Bank Chicago,Illinois USA. ABA numb.000000000. ISACA a/c 00-0000-00. SWIFT code:XXX0000X. My ID:00000000. The three names are, XXXXXXXXX.I will be glad to hear from you even if it will not be possible for you to send so that I can look for another alternative.


Thanks. XXXXXXXX.

Sunday, April 01, 2007

Neat Error Message...

 

WordPress database error: [MySQL server has gone away]
SELECT DISTINCT YEAR(post_date) AS `year`, MONTH(post_date) AS `month`, count(ID) as posts FROM wp_posts WHERE post_date < '2007-04-01 20:12:25' AND post_date != '0000-00-00 00:00:00' AND post_status = 'publish' GROUP BY YEAR(post_date), MONTH(post_date) ORDER BY post_date DESC

 

Neat error message - "has gone away"....

Bummed about the lack of bind variables, I rather suspect that by hard coding the date in there, they might tend to generate lots of unique SQL statements...

Funny, I wonder how you can order by post_date, after stripping the day, hour, minute and second away - that part doesn't make sense to me...  Post_date doesn't exist anymore after the GROUP BY!  How could you possibly sort by it?
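If I were to sketch it (and this is just a sketch, not the actual WordPress code), I'd bind the cutoff date and sort by something that still exists after the GROUP BY - the grouped year/month columns, or an aggregate of post_date:

SELECT YEAR(post_date)  AS `year`,
       MONTH(post_date) AS `month`,
       COUNT(ID)        AS posts
  FROM wp_posts
 WHERE post_date < ?                  -- bind the cutoff instead of hard coding it
   AND post_date != '0000-00-00 00:00:00'
   AND post_status = 'publish'
 GROUP BY YEAR(post_date), MONTH(post_date)
 ORDER BY `year` DESC, `month` DESC;  -- or: ORDER BY MAX(post_date) DESC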