10 Oracle SQL features you probably didn’t know

10 Oracle SQL features you probably didn’t know

I can not guarantee that you never heared of this. But the majority of developers doesn’t know or doesn’t seem to know. And honestly –  most of this information is fairly useless – and mostly harmless.

10. Why we have the CONCAT function

There is a concat function that can be used instead of the concat operator ||.

Nobody uses that! But there is a reason why it exists.

This is from the 8.1.7 Oracle documention

On most platforms, the concatenation operator is two solid vertical bars, as shown in Table 3-3. However, some IBM platforms use broken vertical bars for this operator. When moving SQL script files between systems having different character sets, such as between ASCII and EBCDIC, vertical bars might not be translated into the vertical bar required by the target Oracle environment. Oracle provides the CONCAT character function as an alternative to the vertical bar operator for cases when it is difficult or impossible to control translation performed by operating system or network utilities. Use this function in applications that will be moved between environments with differing character sets.

So we do have this function, because in the past this || was not consistent over all the different platforms using different character sets. Since nobody seems to use IBM EBCDIC anymore, there is no real need to use the CONCAT function.

9. select UNIQUE

Instead of doing a SELECT DISTINCT you can do SELECT UNIQUE.

But you shouldn’t. None of them. Forget I mentioned this.

Ok here is an example.

Example

select unique e.deptno from scott.emp e;

DEPTNO
-----
30
20
10

It is also possible to do SELECT COUNT(UNIQUE …) . Not sure when that one was introduced, but it seems to work now.

Sven says: “SELECT DISTINCT|UNIQUE should be considered a bug in real production code.”

I have yet to find an example where SELECT DISTINCT is needed. More likely there is a bug in the data model or missing joins in the where clause. GROUP BY or sometimes EXISTS are the better long term alternatives.

Using SELECT DISTINCT is absolutly fine for developer ad-hoc queries.

I feel this whole concept is a bit inconsistent. 12c introduced the new approximation function APPROX_COUNT_DISTINCT, but there is no APPROX_COUNT_UNIQUE!

8. there is an ANSI datatype DATE

The ANSI datatype DATE does not hold any time information (no hours, minutes, seconds). That is what the ANSI datatype TIME is for. Oracle does not officially know these two datatypes.

However:

EXTRACT function

Purpose

EXTRACT extracts and returns the value of a specified datetime field from a datetime or interval expression. The expr can be any expression that evaluates to a datetime or interval data type compatible with the requested field:

  • If YEAR or MONTH is requested, then expr must evaluate to an expression of data type DATE, TIMESTAMP, TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH LOCAL TIME ZONE, or INTERVAL YEAR TO MONTH.
  • If DAY is requested, then expr must evaluate to an expression of data type DATE, TIMESTAMP, TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH LOCAL TIME ZONE, or INTERVAL DAY TO SECOND.
  • If HOUR, MINUTE, or SECOND is requested, then expr must evaluate to an expression of data type TIMESTAMP, TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH LOCAL TIME ZONE, or INTERVAL DAY TO SECOND. DATE is not valid here, because Oracle Database treats it as ANSI DATE data type, which has no time fields.
  • If TIMEZONE_HOUR, TIMEZONE_MINUTE, TIMEZONE_ABBR, TIMEZONE_REGION, or TIMEZONE_OFFSET is requested, then expr must evaluate to an expression of data type TIMESTAMP WITH TIME ZONE or TIMESTAMP WITH LOCAL TIME ZONE.

So essentially the doc says, that EXTRACT uses an ANSI DATE as an expression input.

Proof

select extract(day from sysdate) from dual;
21

select extract(second from sysdate) from dual;
ORA-30076: invalid extract field for extract source

select extract(second from systimestamp) from dual;
4.823

The error appears, because sysdate is kind of converted into a ANSI DATE and this does not hold time information.

 

Problem is: We just can not use this ANSI date type for normal activities.

Alternative for the ANSI date: TRUNC(datetime). In 12c as a virtual column.

Problem solved. ✔

 

7. group by ()

You can group by an empty parenthesis. Or let’s call it group by the FULL SET. This is in fact useful in some very specific cases.

It gives us the ability to get a NO_DATA_FOUND error if there is no data at all. Oracle guarantees that an aggregation select without a group by will always return a record. GROUP BY () can be added if you want to change this default behaviour on purpose.

Example

Let’s say we want to count how many employees are in one specific department.

select count(e.empno) emp#
from scott.dept d
left join scott.emp e on e.deptno=d.deptno
where d.dname = 'OPERATIONS';

EMP#
----
0

Since there are no employees in operations, this returns 0. This is correct and the information I want.

More specifically this query returns one row, but the e.empno is null and therefore is not counted. Count(*) would have returned 1!

But if we use a department name that does not even exists, then this will also return 0.

select count(e.empno) emp#
from scott.dept d
left join scott.emp e on e.deptno=d.deptno
where d.dname = 'FANCY_CLOUD_SPACE';

EMP#
----
0

Not correct! In this case I prefer to get a no data found error.

Grouping on the full set gives us this option.

select count(e.empno) emp#
from scott.dept d
left join scott.emp e on e.deptno=d.deptno
where d.dname = 'FANCY_CLOUD_SPACE'
group by ();

no rows found.

Btw: the SQL Developer syntax (pre-)parser identifies that () as an syntax error. But it will execute it successfully.

I have to admit, that in this case it would be much better to group by the department. But maybe it would be possible to get the count for multiple departments using an IN list. Then the group on department level would not help much.

 

Btw: GROUP BY NULL does the same thing, and might be less confusing for some developers. I personally prefere GROUP BY () because the () also is used in GROUPING_SETS.

 

6. + vs. –

In some edge cases using a different operator (+|-) leads to disturbing results.

Example: Substracting a tiny number from 1 does not equal 1. But adding a tiny number from 1 does equal 1!

select * from dual
where 1 = 1-0.000000000000000000000000000000000000001;

No data found!

This is expected behaviour…

select * from dual
where 1 = 1+0.000000000000000000000000000000000000001;

DUMMY
-----
X

Wat? Why do we get a result?!

The solution is simple. We overstepped the maximum precision of the number datatype when doing the addition. Max precision is 38 digits. The addition would require a precision of 39. Instead the result is rounded (or truncated) to a precision of 38 digits, which happens to equal 1.  The substraction result still is in the 38 precision range and therefore exact (very slightly less than 1).

Don’t confuse precision with maximum value! Precision is the number of different digits that we can store. This can then be moved by an exponent to the left or right for very large or very small values.

 

To put this into perspective

That precision is so extremly high, that we have major troubles visualizing it. So here is a comparsion. The comparison data comes from this hugely interesting site: http://money.visualcapitalist.com/all-of-the-worlds-money-and-markets-in-one-visualization/

The complete monetary assets in the whole word is estimated about 80 trillion $ (broad money=coins, banknotes, savings accounts, time deposits, …). That is a 80,000,000,000,000  or an 8 with 13 zeros at the end or 8e13 in scientific exponential notation.

select 8e13 from dual;

Considering all the national and private debts is even higher. Here we reach about  200 trillion $.

And if we consider derivates (=high risk gambles)  then the high estimations go as far as 1.2 quadrillion $.  That is a 1 followed by 24 more digits (mostly zeros) or 1.2e24.

Since all this is expressed in dollars, we can increase the number even further by chooseing a different currency.  For example the Nepalease Rupee is worth a little less than 0.01 dollar. So all the worlds derivatees could be expressed in Nepalease Rupees using a number as big as 1.2e26.

That means using a number with a precision of 38 we can easily express all the money in the world including derivates up to the very last Nepalease Rupee. And we still have only used 2/3rds of the maximum available precision. Or to say it differently. If we duplicate the planet Earth 1,000,000,000,000 (1 trillion) times – we could still give the exact amount of money on all planets in Napalease Rupees up to the very last Rupee using the number datatype.

That’s quite impressive!

Btw. The biggest number that can be expressed using the NUMBER datatype is 999…(38 9’s) x10125 . Which is way way bigger than anything we discussed so far.

 

5. instead of IN we can use = ANY

select 'TRUE'
 from dual
 where 1 = ANY (1,2,3,4,5)
TRUE

There are several similar unusual “replacement” options.

e.g.

  • instead of != we can also use ^=.
  • NOT IN is the same as  != ALL
  • SOME is a synonym for ANY – but only in sql!
  • <=LEAST | >=GREATEST can be replaced by <|>= ALL
  • >=LEAST | <=GREATEST can be replaced by >|<= SOME

I expect that SOME people do know about these operators, since questions about them are part of the SQL expert exam. So from ALL experts who took the exam at LEAST the few who passed should know SOMEthing about ANY.

The last replacement differs how NULL values are treated. In theory there could be cases where this comes in handy. I so far never had such an encounter.

Example

with testdata
as (select 1 a, 2 b, 3 c, 4 d from dual union all
   select 5 a, 4 b, 3 c, 2 d from dual union all
   select 1 a, null b, 2 c, null d from dual union all
   select null a, null b, null c, null d from dual)
select * from testdata
where 3 >= LEAST(a,b,c,d);

 

A B C D
-------
1 2 3 4
5 4 3 2

 

with testdata
as (select 1 a, 2 b, 3 c, 4 d from dual union all
    select 5 a, 4 b, 3 c, 2 d from dual union all
    select 1 a, null b, 2 c, null d from dual union all
    select null a, null b, null c, null d from dual)
select * from testdata
where 3 >= ANY(a,b,c,d);

 

A B C D
-------
1 2 3 4
5 4 3 2
1 - 2 -

4. external table on multiple files

External tables can load multiple files at once! As long as they have the same structure of cause. So here is an example.

CREATE TABLE EXT_DUMMY
(
    "RECORDTYPE" VARCHAR2(100 BYTE), 
    "COL1" VARCHAR2(100 BYTE), 
    "COL2" VARCHAR2(100 BYTE) 
)
ORGANIZATION EXTERNAL
(
    TYPE ORACLE_LOADER DEFAULT DIRECTORY "IMPORT_BAD_FILE" 
    ACCESS PARAMETERS ( 
             records delimited BY newline 
             FIELDS TERMINATED BY ';' 
             MISSING FIELD VALUES ARE NULL 
               ( RECORDTYPE CHAR
               , COL1 CHAR 
               , COL2 CHAR 
               ) 
    ) 
    LOCATION ( 'Testfile1.txt, Testfile2.txt' )
)
    reject limit 10
;

When we then do a select on the external table, we will see the data from both files.

We can also switch the external table to a different file.

alter table EXT_DUMMY location ('Testfile3.txt' )

 

For the experts: We can use the ROWID to find out which file was used to load a specific record. Special thanks go to OTN-forum member Odie_63 who found this solution. More info in this old OTN forum thread.

with ext_loc as (
      select position-1 as pos
           , name as filename
      from sys.external_location$
      where obj# = ( select object_id
                     from user_objects
                     where object_name = 'EXT_DUMMY' )
    )
select x.filename,
       t.*
from EXT_DUMMY t
join ext_loc x 
on x.pos = to_number(regexp_substr(dump(t.rowid,10,9,1),'\d+$'))
;

It’s a very clever piece of software. Essentially it extracts the filenumber from the rowid, looks up the file number in the data dictionary and combines that with our data set.

This select was done in a 10g database. In 12c we can probably use dbms_rowid to do the same, instead of regexp_substr(dump(rowid)).

3. insert default values

We can insert into a table using the “default” keyword to force default behaviour for this column.

Example

-- setup
create table swe_default_test 
(col1 varchar2(10) not null,
 col2 varchar2(10) default 'TEST' not null
 );
Table SWE_DEFAULT_TEST created.

-- test
insert into swe_default_test (col1) values ('X');
1 row inserted.

insert into swe_default_test (col1,col2) values ('X',null);
ORA-01400: cannot insert NULL into ("MYUSER"."SWE_DEFAULT_TEST"."COL2")

insert into swe_default_test (col1,col2) values ('X',default);
1 row inserted.

-- double check
select * from swe_default_test;
COL1 COL2 
---------- ----------
X TEST 
X TEST 

-- cleanup
drop table swe_default_test;
Table SWE_DEFAULT_TEST dropped.

 

In 12c we now have a new and I think better option to do the same. The default column can be defined additionally with “ON NULL”. Which has the effect, that inserting a NULL value, will lead to using the default value instead. Of cause the “default” keyword still works too.

-- setup
create table swe_default_test12c
(col1 varchar2(10) not null,
 col2 varchar2(10) default on null 'TEST' not null
 );
Table SWE_DEFAULT_TEST12C created.

-- test
insert into swe_default_test12c (col1,col2) values ('X',null);
1 row inserted.

insert into swe_default_test12c (col1,col2) values ('X',default);
1 row inserted.

-- doublecheck
select * from swe_default_test12c;
COL1 COL2 
---------- ----------
X TEST 
X TEST 

-- cleanup
drop table swe_default_test12c;
Table SWE_DEFAULT_TEST12C dropped.

As we can see both cases now work. The one using a NULL value and also useing the DEFAULT keyword.

This new 12c “default on null” feature can be used to replace the typical BEFORE ROW INSERT trigger. More info how to do this in Sequence and Audit columns with Apex 5 and 12c

 

2. (1,2) = ((1,2))

We can compare expression lists using the = operator. But the right side of the comparison needs an extra set of parenthesis.

select 'TRUE' checked from dual
 where (1,2,3,4,5) = ((1,2,3,4,5));
CHECKED
-------
TRUE

This is just a shorthand form of

1=1 and 2=2 and 3=3 and 4=4 and 5=5

If we do not use the second set of parenthesis on the right hand side of the comparison, then we get an error.

select 'TRUE' checked from dual
 where (1,2,3,4,5) = (1,2,3,4,5) ;
ORA-00920: invalid relational operator

This is documented. Although a tiny bit difficult to read in the syntax diagrams.

See: SQL Reference – Simple Comparison

syntax_compare_lists_1syntax_compare_lists_2

 

1. DUAL is a real table

Although there are some special optimizations in various tools, dual is a real table residing in the sys schema.

select owner, table_name
 from dba_tables
 where table_name = 'DUAL';
OWNER TABLE_NAME
----------------
SYS DUAL

 

In older database versions it was possible to do an insert into the DUAL table.

Warning! This is extremly dangerous. It will probably break ALL applications in your database.

insert into dual values ('Y');
select mysequence.nextval into v_id from dual;

TOO_MANY_ROWS error!

And nobody captures the TOO_MANY_ROWS exception for selects from dual.

Btw: A working safety mechanism is to add rownum = 1 to your query.

select mysequence.nextval into v_id from dual where rownum = 1;

Fortunately manipulating DUAL does not work anymore in recent versions of the database. I think 12c introduced it. Not even DBAs have the privs anymore to do this.

ORA-01031: insufficient privileges
*Cause: An attempt was made to perform a database operation without
the necessary privileges.

0. The Oracle SQL Syntax is documented!

Bam! There you have it! You wouldn’t have thought that. Well, you are not alone. So many developers seem to have no clue that this exists or where it exists.

When googeling use “oracle 12c doc” or “oracle 12.2 SQL ref” in front of your search term. It helps to find the proper pages.

12.2 doc portal

12.2 SQL introduction

12.2 SQL Language reference

 

 

Special thanks go out to Stefanie R., who gave me the idea for this blog post!

 

adaptive cursor sharing and DBMS_SQL

A recent post in the OTN mentioned that DBMS_SQL does not use bind peeking for binded variables. I couldn’t believe that, so I decided to do some tests for myself. The findings are strange…

This is potentially relevant for Apex developers, since the Apex engine uses DBMS_SQL. I still have to do further testing to check the behaviour in Apex.

First I setup some test to show bind peeking and adaptive cursor behaviour using normal statements in SQL*Plus or SQL Developer. After that we move to dynamic SQL, especially DBMS_SQL, and try the same again.

scenario setup

create skewed testdata

--drop table demo_big;
create table demo_big as
select level as id, 
       case when mod(level,10000)=0 
            then 'VALID' 
            else 'INVALID' 
       end as status
from dual
connect by level <= 1000000;

desc demo_big;

Name   Null Type
------ ---- -----------
ID          NUMBER
STATUS      VARCHAR2(7)

select status, count(*) 
from demo_big 
group by rollup(status);
STATUS     COUNT(*)
INVALID    999900
VALID      100
           1000000

So we have a few VALID values and a lot of INVALID ones.

Even if we have only two different values an index will be useful on this column. The data distribution is so skewed that any access trying to read the VALID values would profit from an index. However if we access the INVALID column we don’t want to use the index and instead want a full table scan.

-- create indexes on all the important columns
create unique index demo_big_id_ix on demo_big(id);
create index demo_big_status_ix on demo_big(status);

create statistical data(histograms)

First we create the statistics so that the optimizers knows what is in that table and how the data looks like.

-- create statistics and test histogram
execute dbms_stats.gather_table_stats(user, 'DEMO_BIG', method_opt=>'for all indexed columns size skewonly');

Then we check the data dictionary checks to see what has been created so far.
The hist_numtochar2 function is copied from Martin Widlake (Source: https://mwidlake.wordpress.com/2009/08/11/). It just helps to do a crude translation of the numerical histogram bucket endpoints. The code of the function can be found at the end of this post.

I don’t show the results from all selects but the last one. The other selects are here just as references. They are helpful to see what kind of statistics are in place.

select table_name, num_rows, blocks, last_analyzed
from user_tables
where table_name = 'DEMO_BIG';

select table_name, column_name, num_distinct, histogram, num_buckets, sample_size 
from user_tab_columns
where table_name = 'DEMO_BIG';

select *
from user_histograms
where table_name = 'DEMO_BIG' and column_name = 'STATUS';

select table_name, column_name, endpoint_number, endpoint_value, hist_numtochar2(endpoint_value) as translated_value
from user_histograms
where table_name = 'DEMO_BIG' and column_name = 'STATUS';

Here we see a frequency histogram with two buckets for the column STATUS.

TABLE     COLUMN  ENDPOINT_NUMBER    ENDPOINT_VALUE           TRANSLATED_VALUE
DEMO_BIG  STATUS  999900     380626532452853000000000000000000000    INVALJ*
DEMO_BIG  STATUS  1000000    447861930473196000000000000000000000    VALID

The first bucket holds 999900 values where status= INVALID.
The next bucket holds 1000000-999900 = 100 where status = VALID.

This of cause matches exactly what we created. So the statistical info in the dictionary is absolutly correct.

Tests

Now that our setup is in place, we can do some basic testing to see different plans.

check execution plan with LITERALS

-- test different cursor/execution plan using plain selects
select count(*) from demo_big where status = 'VALID';
select * from table(dbms_xplan.display_cursor);
----------------------------------------------------------------------------------------
| Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                    |       |       |     3 (100)|          |
|   1 |  SORT AGGREGATE   |                    |     1 |     8 |            |          |
|*  2 |   INDEX RANGE SCAN| DEMO_BIG_STATUS_IX |   100 |   800 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("STATUS"='VALID')
select count(*) from demo_big where status = 'INVALID';
select * from table(dbms_xplan.display_cursor);
-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |   701 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     8 |            |          |
|*  2 |   TABLE ACCESS FULL| DEMO_BIG |   999K|  7811K|   701   (2)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("STATUS"='INVALID')

Perfect! As expected one does an index access / index range scan, the other does a full table scan.

check execution plan with BIND parameters

select count(*) from demo_big where status = :P_ENTER_VALID;
select * from table(dbms_xplan.display_cursor);
----------------------------------------------------------------------------------------
| Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                    |       |       |     3 (100)|          |
|   1 |  SORT AGGREGATE   |                    |     1 |     8 |            |          |
|*  2 |   INDEX RANGE SCAN| DEMO_BIG_STATUS_IX |   100 |   800 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("STATUS"=:P_ENTER_VALID)
select count(*) from demo_big where status = :P_ENTER_INVALID;
select * from table(dbms_xplan.display_cursor);
-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |   701 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     8 |            |          |
|*  2 |   TABLE ACCESS FULL| DEMO_BIG |   999K|  7811K|   701   (2)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("STATUS"=:P_ENTER_INVALID)

The two statements are not identical because the name of the bind parameter is different. Because of that we get two different cursors. Each with a different execution plan.
This test shows that bind peeking works. During the hard parse phase the value of the binded parameter was checked (peeked) so that the correct estimations for the resulting rows/cardinalities were made. Which led in turn to the correct plan for each of the two different statements. However this first parameter “freezes” the execution plan. So that if we change the binded value, then the same plan is reused.

This behaviour was enhanced in 11g with the introduction of adaptive cursor sharing and got steadily improved since then.

To test adaptive behaviour we run the first query again a few times (at least 4 times). But this time we do not pass VALID, but instead INVALID as a parameter.

After that we can see a new child cursor 1 for the sql_id “7rjdcm7v7hfrs”.

select is_bind_sensitive, is_bind_aware, sql_id, child_number, sql_text
from v$sql
where upper(sql_text) like 'SELECT%FROM DEMO_BIG WHERE%'  and sql_text not like '%v$sql%'
;
IS_BIND    IS_BIND SQL_ID        CHILD   SQL_TEXT
_SENSITIVE _AWARE                _NUMBER
Y          N       7rjdcm7v7hfrs 0       select count(*) from demo_big where status = :P_ENTER_VALID
Y          Y       7rjdcm7v7hfrs 1       select count(*) from demo_big where status = :P_ENTER_VALID
Y          N       5zkmtfj331xmc 0       select count(*) from demo_big where status = :P_ENTER_INVALID
select * from table(dbms_xplan.display_cursor('7rjdcm7v7hfrs',0));
----------------------------------------------------------------------------------------
| Id  | Operation         | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                    |       |       |     3 (100)|          |
|   1 |  SORT AGGREGATE   |                    |     1 |     8 |            |          |
|*  2 |   INDEX RANGE SCAN| DEMO_BIG_STATUS_IX |   100 |   800 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("STATUS"=:P_ENTER_VALID)

select * from table(dbms_xplan.display_cursor('7rjdcm7v7hfrs',1));
-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          |       |       |   701 (100)|          |
|   1 |  SORT AGGREGATE    |          |     1 |     8 |            |          |
|*  2 |   TABLE ACCESS FULL| DEMO_BIG |   999K|  7811K|   701   (2)| 00:00:01 |
-------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("STATUS"=:P_ENTER_VALID)

This is adaptive behaviour. After a few bad tries a second execution plan is created for the same cursor and used. How many tries are needed? Often it changes on the third try. But it can happen that more are needed.

Test with DBMS_SQL

Now comes the more difficult part. Setup a small plsql block to use DBMS_SQL to run the same statement again using binded parameters.

-- testcase for BIND peeking/aware using DBMS_SQL
declare
  curid    NUMBER;
  ret      INTEGER;
  sql_stmt VARCHAR2(200);
begin
  sql_stmt := 'select count(*) from demo_big where status = :P_STATUS';

  -- get cursor handle
  curid := DBMS_SQL.OPEN_CURSOR;
  DBMS_SQL.PARSE(curid, sql_stmt, DBMS_SQL.NATIVE);

  DBMS_SQL.BIND_VARIABLE(curid, 'P_STATUS', 'VALID');
  ret := DBMS_SQL.EXECUTE_and_fetch(curid);

  DBMS_SQL.PARSE(curid, sql_stmt, DBMS_SQL.NATIVE);
  DBMS_SQL.BIND_VARIABLE(curid, 'P_STATUS', 'INVALID');
  for i in 1..5 loop
    ret := DBMS_SQL.EXECUTE_and_fetch(curid);
  end loop;

DBMS_SQL.close_cursor(curid);
end;
/

The v$sql view has two interesting columns.
IS_BIND_SENSITIVE shows cursors where the execution plan can evolve.
IS_BIND_AWARE shows child cursors where a new plan was created, meaning that the cursor was evolved.

select is_bind_sensitive, is_bind_aware, sql_id, child_number, sql_text
from v$sql
where upper(sql_text) like 'SELECT%FROM DEMO_BIG WHERE%'
and sql_text not like '%v$sql%'
;
IS_BIND_SENSITIVE    IS_BIND_AWARE    SQL_ID    CHILD_NUMBER    SQL_TEXT
Y    N    7rjdcm7v7hfrs    0    select count(*) from demo_big where status = :P_ENTER_VALID
Y    Y    7rjdcm7v7hfrs    1    select count(*) from demo_big where status = :P_ENTER_VALID
N    N    3kpu54a461gkm    0    select count(*) from demo_big where status = :P_STATUS
N    N    3kpu54a461gkm    1    select count(*) from demo_big where status = :P_STATUS
Y    N    5zkmtfj331xmc    0    select count(*) from demo_big where status = :P_ENTER_INVALID
N    N    fjjm63y7c6puq    0    select count(*) from demo_big where status = :P_STATUS2
N    N    1qx03gdh8712m    0    select count(*) from demo_big where status = 'INVALID'
N    N    2jm3371mug58t    0    select count(*) from demo_big where status = 'VALID'

The two child cursors

-- find the cursor id
select sql_id, child_number, bucket_id, count, is_bind_sensitive, is_bind_aware, sql_text
from v$sql s
left join v$sql_cs_histogram h using (sql_id, child_number)
where upper(s.sql_text) like 'SELECT%FROM DEMO_BIG WHERE%'
and s.sql_text not like '%v$sql%'
;

-- check the execution plan for both child cursors
select * from table(dbms_xplan.display_cursor('3kpu54a461gkm',0));
select * from table(dbms_xplan.display_cursor('3kpu54a461gkm',1));

-- see the plans in the SGA
select * from v$sql_plan where sql_id = '3kpu54a461gkm';
select * from v$sql_plan where sql_id = 'fjjm63y7c6puq';

Now the strange thing is: The first cursor is using a FULL table scan. But the first execution was done using the VALID value and should have resulted in the index range scan. The second child cursor does not even have an execution plan!

NOTE: cannot fetch plan for SQL_ID: 3kpu54a461gkm, CHILD_NUMBER: 1
      Please verify value of SQL_ID and CHILD_NUMBER; 
      It could also be that the plan is no longer in cursor cache (check v$sql_plan)

What is going on here? v$sql has a column EXECUTIONS which tells us how often this child cursor was called. It is always 0 for the child 1 from the DBMS_SQL cursor!

I did several more tests using DBMS_SQL. Even a case where the cursor was closed and opened several times. All with the same result.

Interpreting the results

I’m still not yet exactly sure what is going on there. It seems as if bind peeking and adaptive cursor sharing does not work with DBMS_SQL. But why do we see then two child cursors? It seems as if the different parameter values at least have the effect that a new child is created. And this happens only when there is a need for a different execution plan. But where is the plan for that? I still have some doubts. Maybe the execution plan in v$sql is lying is this case? Since DBMS_SQL goes deep into the internals it might be that some of the normal behaviours are not reflected in some of the views.

The cursor itself is in the private SQL workarea and I never checked that. Another approach would be to setup a scenario where we can measure the perormance difference. The test case I used was too small to see a desicive difference between the two possible plans.

Also we have to remember that the need for DBMS_SQL is rare. A normal select with binded parameters is certainly not a case where need dynamic SQL. A more typical case would be a cursor | statement where we do not know at compile time what columns are returned. Then we can use DBMS_SQL to analyse the structure of such a cursor and react on that.

However if we build some kind of dynamic frameworks and think about using DBMS_SQL we should rethink our strategy. Maybe it is easier to provide all the possible cases as plsql apis and thereby compiling during creation, instead of building the statement in a completly dynamic fashion but suffering some essential drawbacks.

Recommendations

1) Avoid DBMS_SQL, consider to use native SQL (execute_immediate) instead
2) If you have a skewed data distribution, make sure your plans are bind_sensitive
3) If you can guarantee an even data distribution, consider to add the NO_BIND_AWARE hint. This should be needed only in some extrem situations (very high performance requirements or cursor cache issues)

Appendix

The function that I used previously:

create or replace function hist_numtochar2(p_num number
,p_trunc varchar2 :='Y') return varchar2
-- Author: Martin Widlake
-- Source: https://mwidlake.wordpress.com/2009/08/11/
is
  m_vc varchar2(15);
  m_n number :=0;
  m_n1 number;
  m_loop number :=7;
begin
  m_n :=p_num;
  if length(to_char(m_n))>36 then
    --dbms_output.put_line ('input too short');
    m_vc:='num format err';
  else
    if p_trunc !='Y' then
      m_loop :=15;
    else
      m_n:=m_n+power(256,9);
    end if;
    --dbms_output.put_line(to_char(m_N,'999,999,999,999,999,999,999,999,999,999,999,999'));
    for i in 1..m_loop loop
      m_n1:=trunc(m_n/(power(256,15-i)));
      --    dbms_output.put_line(to_char(m_n1));
      if m_n1!=0 then m_vc:=m_vc||chr(m_n1);
      end if;
      dbms_output.put_line(m_vc);
      m_n:=m_n-(m_n1*power(256,15-i));
    end loop;
  end if;
  return m_vc;
end;
/

News for developers from #OOW16 about 12.2

The following information I obtained during various sessions during OOW16. Sometimes they resemble just an impression I got while a certain feature was mentioned during a speak. So use the information with caution.

new db version 12.2

So the 12.2 version is out there now, but most of us won’t be able to use it very soon. Why? Because Oracle choose to do a cloud first approach. That means 12.2 is available for the Oracle Cloud but not for other installations yet. And my guess would be that first enterprise edition will get it and standard edition (SE2) might even get it a bit later.

Still there are many exciting new features in it and it makes sense to get ourselves familiar with them. So here are my favourites with a slight focus on developer relevance.

There is documentation about 12.2 new features out already, but it is extremely hard to find. It is hidden in the docs for the new Exadata Express Cloud service. So here is the quick link to see the new features guide.

To summarise many of the new features focus on improving availability of the database. Essentially giving developers and DBAs more options to keep the applications running even when encountering errors or while changes are takes place.

The second set of enhancements seem to be things that further extend capabilities of features that were added in 12.1.

And of cause a lot of performance improving possibilities had been added (this seems to by typical for an R2 version).

longer identifiers (128 chars)

Identifiers can now be put to 128 chars long. This applies for almost all objects including table, column, index and constraint names. Long awaited and finally there.

I suggest not to overdo it at the beginning, I suspect that many external tools that work with the Oracle database might not fully support such long identifiers yet. And of cause some common sense when naming your tables and columns should be applied as well. For example please do not repeat the table name again in the column name. However it will help greatly when you apply naming conventions on foreign key constraints or indexes.

There seems to be a new plsql constant Ora_Max_Name_Len that holds the maximum possible value in your database. It can be used at compile time to set the sizes of variables. There is an example in the plsql section later.

SQL functions

improved LISTAGG

LISTAGG now has an option not to error out when the list gets too long. Instead the list will be cut off and a default value (ellipsis “…”) is added. Additionally we can add the number of found results.

To do so add the new overflow clause, for example:

ON OVERFLOW TRUNCATE WITH COUNT

Unfortunatly LISTAGG is still not able to build distinct lists. If you want that, vote for this feature in the OTN database ideas space: https://community.oracle.com/ideas/12533

conversions with error handling

All conversion functions have now an added new clause that decides how an error is handled, when the conversion fails because of a formatting error. CAST, TO_NUMBER, TO_DATE and all the other TO_xxx functions got this improvement. The only exception is TO_CHAR. Because everything can be converted into char.

CAST_IN_DB12_2.gif

examples:

CAST(prod_id AS NUMBER DEFAULT 0 ON CONVERSION ERROR)
TO_DATE(last_processed_day DEFAULT date'2016-01-01' ON CONVERSION ERROR, 'DD-MON-RRRR','nls_date_language=AMERICAN')

So in the case where the conversion would result in an error, we can provide an alternative default value instead. This is highly useful!

Additionally there is a new function VALIDATE_CONVERSION which can be used to find values in a column where conversion would not be possible. It returns 0 for invalid values and 1 for correct ones.

new approximation functions

In 12R1 we already got APPROX_COUNT_DISTINCT
new are APPROX_PERCENTILE, APPROX_MEDIAN
These approximation functions can now be used in materialized views!
And there is a parameter that is able to switch the exact functions to the approximate versions.

alter session set approx_for_aggregation = 'TRUE';

Also new are APPROX_COUNT_DISTINCT_DETAIL, APPROX_COUNT_DISTINCT_AGG and TO_APPROX_COUNT_DISTINCT allowing to build hierarchies with those aggregation functions. Something we were not able to do in the past without rerunning the function on each aggregation level.

case insensitive linguistic queries

It is now possible to do searches using a case insensitive option, e.g.  BINARY_CI for various functions like LIKE. Those functions are able to use stemming. Not much detail about it yet, but it sounds like it can be used instead of putting UPPER around the columns.  At the moment I have no further information about possible performance impacts of that.

greatly enhanced JSON support

better simplified JSON

There is a complex JSON syntax using all the JSON functions and a simplified JSON syntax using dot notation. In general a lot of work was done to enhance the simplified JSON syntax. Like you can access elements of an array now.

JSON_EXISTS with predicates

If seen short examples where “and” expressions/filters using an && operator were done inside some SQL statement using JSON_EXISTS. Essentially JSON_EXISTS allows to add filters to the path expression using a solid set of operators.

build JSON with SQL/JSON

Using a similar syntax as the older sql/xml functions (XMLELEMENT, XMLAGG) we now have some new SQL functions to help us createing a JSON document. The output of those functions is VARCHAR2.

JSON_OBJECT, JSON_OBJECTAGG, JSON_ARRAY, JSON_ARRAYAGG

 

Views on JSON documents / Data Guide

The Data Guide allows to analyse JSON documents and automatically build views and virtual columns for elements in our JSON object.

The data dictionary has a list of all columns with an enabled data guide

USER|ALL|DBA_JSON_DATAGUIDE

We can access the dataguide in SQL with the functions JSON_DATAGUIDE or JSON_HIERDATAGUIDE or in PLSQL with the function DBMS_JSON.GET_INDEX_DATAGUIDE.

The plsql functions need an json search index  to work. If it exists then those functions should be preferred, since they work on persisted index data. If the json documents in the columns do all have a completly different structure, then it might be better to use the SQL functions and not use a json search index.

See also: Multiple Data Guides per Document Set

Based upon the data guide more operations are possible. For example we can easily create views and/or virtual columns that extract information from the JSON document as relational data.

To create a virtual column that shows data from our json column in a relational way we can use  DBMS_JSON.addVC.

To create a view that exposes json document data in a table like structure, we can use DBMS_JSON.createViewonDemand. This is based upon a JSON_TABLE function.

 

JSON search index

example

CREATE SEARCH INDEX po_dg_only_idx ON j_purchaseorder (po_document) FOR JSON;

 

GeoJSON support

If the JSON includes coordinates that are  conform with the GeoJSON standard, then it is possible to do geolocation searches on that JSON document.

GeoJSON also is supported during Spatial Queries and can be converted directly into Spatial Geometry.

more JSON enhancements

  • the JSON sql functions are now available in plsql as well => especially IS JSON will be useful
  • highly improved JSON search index
  • new predefined plsql object types JSON_OBJECT_T, JSON_ARRAY_T, JSON_ELEMENT_T, JSON_KEYLIST_T, JSON_SCALAR_T

MV enhancements

ON STATEMENT refresh for MV

We had ON DEMAND and ON COMMIT before. Now Materialised Views can be refreshed after DML changes to the base tables without having the need to wait for the commit

refresh statistics view

This was long overdue! We are now able to see statistics when the materialised view was refreshed. The name of the data dictionary view is not completely clear now, but I suspect DBMS_MVREF_STATS.

There is also a new package DBMS_MVIEW_STATS that can be used to organise the collection and the cleanup of those statistics.

improvements for partitioned objects

  • MOVE TABLE, SPLIT PARTITION and some other partitioned operations can be done online. That means they will not disrupt ongoing DML operations. And this includes automatic index maintenance.
  • CREATE TABLE FOR EXCHANGE prepares a non partitioned table to be partitioned.

 

superfast analytics

Join Groups

It is now possible to define columns as join groups. A JOIN GROUP is a new database object and it greatly increases performance of Join and analytical queries over those columns.

In Memory expressions

An INMEMORY expression is a special kind of virtual column that has an additional INMEMORY attribute attached to it. By that attribute special optimisations kick in that speed up access to this expression. This method is also used by some of the JSON optimisations.

A data dictionary view is added to help finding expressions that could profit from such virtual columns.

dba|all|user_expression_statistics

Analytical views

Analytical views are an easy way to define dimensional hierarchies and how they are rolled up. An Analytical View is a new object type.

This seems to be a completely new feature. Slightly based on the older dimensional cube possibilities. I’m not sure if this will be available for all editions later or if it will be an additional cost feature.

 

PL/SQL stuff

pragma deprecate

This feature was surprising to me, but I find it hugely interesting.

You can declare now plsql methods as deprecated by using this pragma. The “only” thing that happens is, if such a function is used you will get a compiler warning (PLW-something).

syntax example

function myFunc return varchar2
is
   pragma deprecate 'myFunc is deprecated. Use yourFunc instead!';
begin
  return 'x';
end myFunc;

So far I didn’t miss that feature, but I immediately have some projects in mind where I would use it.

static expressions instead of literals

All areas where a literal is to be used, can now be replaced by a so-called “static expression”. e.g.

declare
 myTab varchar2( Ora_Max_Name_Len + 2);
 myObject varchar2(2* (Ora_Max_Name_Len + 2));
begin
 myTab := '"Table"';
 myObject := '"ThisSchema".' || myTab ;
 ...

This works only as long as the expression can be resolved at compilation time.

As such it will not improve or change writing dynamic queries, however there is some impact for deployment scripts. So whenever you needed to use manually written compiler directives, this might be a new alternative.

minor changes

  • ACCESSIBLE BY for sub modules
  • bind plsql only datatypes to dbms_sql => this essentially just finishes what 12.1.0.2 already allowed for anonymous blocks and native SQL (execute immediate).
  • some enhancements for PL/scope: SQL_ID for static sql statements, reports where native SQL is used
  • slight enhancements for dbms_hprof: can now include SQL_IDs,  improved reporting   for sub cursors

new code coverage tool

It splits code into blocks and tells the developer if and how often those blocks are used during some sample run. Code that is not reached is coloured differently (red).

Blocks can be marked with a pragma

pragma coverage('NOT_FEASIBLE');

So that this block is not marked when it is not covered. The tool is currently only a new DBMS package, but it will be integrated into the next SQL Developer version.

dbms_plsql_coverage_….

 

developer.oracle.com

It is Oracles new landing page for developers. Strongly influenced by Steven Feuerstein.

screen-shot-2016-09-20-at-12-21-18

debugger enhancements

Can execute a sql now when the debugger encounters a breakpoint. Not implemented yet directly in SQL Developer, but it will be there in a future version. In general the new SQL developer already supports the enhanced debugging capabilities of the database, but not all is possible yet.

The debugger also is not available for the cloud yet. The protocol used, is not suited for cloud operations.

EBR (edition based redefinition)

Editioned objects no longer in use can be cleaned up automatically in the background

All we have to do is to drop the edition. Even in cases where this was not possible before.

CBO

CBO improved adaptive handling

The adaptive features of the CBO that are already in the database can now be better controlled. And the defaults were set in a way, that upgrades from 11g result in very similar behaviour.

OPTIMIZER_ADAPTIVE_FEATURES = TRUE or FALSE => deprecated

replaced by two new parameters

OPTIMIZER_ADAPTIVE_PLANS => default = TRUE

OPTIMIZER_ADAPTIVE_STATISTICS => default = FALSE

So if you switch from 12.1 to 12.2 and if you had OPTIMIZER_ADAPTIVE_FEATURES=TRUE then you might want to set the second parameter to TRUE again. If you switch from 11g to 12.2 you probably want the defaults.

The dbms_stats package got some improvements that go together with it.

Dbms_stats: auto_stat_extensions

At the moment not much is known about that yet.

Mview query rewrite

The CBO can now rewrite a query to a materialised view even if the view is stale at that moment. Materialized view logs are considered to return the correct results.

more CBO based things

  • new histogram types
  • GTT use session private statistics by default
  • automated SQL plan management
    Even after DB upgrade the capture of the old plan can be done by setting the parameter OPTIMIZER_FEATURES_ENABLED = ‘11.2.0.4’
    Plans can then evolve later
  • copy optimiser metadata from pre production to prod.
    EXPDP/IMPDP or dbms_stats.transfer_stats

application containers

Instead of just having a CDB and several PDBs in it, we can now define an Application Container. This serves as a kind of general repository for all PDBs that are part of that application container.

This is highly interesting.

The typical use case seems to be that you would roll out your application to various customers (in the cloud) by using a PDB for each customer. However all the common objects would be added into the application container, so that changes / enhancements can be rolled out to all customers at once or one after the other. The application container keeps track of you upgrade scripts an is able to “replay” those scripts in the specific PDBs. Only that is is not a replay of the script, instead the objects are linked from the PDB to the Application Container. I think this is the same mechanism as currently the CDB objects/metadata are made available on the PDBs.

Objects can be shared from the Application Container to the PDB by using the SHARING clause in the create statement.

SHARE NONE|METADATA|OBJECT

SHARE METADATA would only share the object definition, while SHARE OBJECT would also share the data.

It is possible to combine application containers with edition based redefinition. Essentially it seems as if the editions are copied from the Application Container to the PDB as all other Objects are copied/linked. The Application container just keeps track of what needs to be installed in a specific PDB.

shards

A new very special way to have locally distributed database. Seems to cover the same concept as shards in some noSql DBs. If the application is distributed over a set of databases, you can now declare them as sharded somehow. And if you define a table as sharded then its data will be split to the different databases. You use a shard key (very similar to a partitioned key) and that key defines where the data is going.

Exadata Express Cloud service

Exadata Express Cloud Service

Essentially you get a PDB with a few GB of space on an Exadata Machine running Oracle EE for only 175$ per month. Pricelist here.

SQL Developer version 4.1.5 supports the Exacta Express Cloud service. You can drag and drop objects (tables, etc.) directly onto the connection and get it moved there.

The Exadata Express Service includes Apex , ORDS and SODA. So it can also serve as a kind of JSON document storage using rest interface.

 

ORDS

Jetty is now supported for production mode. That means you can now run ORDS in standalone mode also for a production environment.

Since ORDS is not part of the database this does not depend on using database version 12.2.