Scalable sequences – size check

Motivation

Recently I have been speculating about where and when to use scalable sequences, especially whether it makes sense to use them in general for sequence-driven ID columns of larger tables. I know this sounds a bit like an early sign of CTD (compulsive tuning disorder), but at least I’m aware of it.

Scalable sequences offer a performance benefit in certain special situations (hot block, index contention). This post is not about those benefits.

I assume the reader has a basic understanding of what a scalable sequence is. If not, check out this older post of mine: All about sequences.

A scalable sequence provides larger numeric values than a normal sequence does.

A normal sequence => 3
A scalable sequence => 10176700000003

Here I only analyze and test the effects of those bigger numbers, especially for indexes.

Side note: The first 3 digits of the scalable sequence are the instance number + 100. The next 3 digits are the session id. The remainder is the normal sequence value. Different sessions provide different numbers and because of this the sequence numbers are not monotonically increasing anymore, at least not when looking at multiple sessions. Which also means inserts are done all over the index and not only on the right-hand side of the index.
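As a small illustration, the example value from above can be decomposed like this (a sketch, assuming the 3+3 digit prefix layout just described):

-- decompose the scalable sequence value from above
select substr('10176700000003', 1, 3)         as instance_part -- instance number + 100
     , substr('10176700000003', 4, 3)         as session_part  -- session id
     , to_number(substr('10176700000003', 7)) as sequence_part -- the normal sequence value
from dual;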

Bigger numbers need more storage

The number 3 needs 2 bytes, the number 10176700000003 needs 8 bytes of storage. Use the DUMP function to measure it. Values provided by scalable sequences might use up to 18 bytes of storage.
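A quick way to check this yourself (the Len part of the DUMP output is the number of bytes used):

select dump(3)              as normal_value
     , dump(10176700000003) as scalable_value
from dual;
-- normal_value   => Typ=2 Len=2: ...
-- scalable_value => Typ=2 Len=8: ...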

This affects multiple areas: the table segment size, but also indexes tend to be bigger. We can safely assume the sequences are used to provide values for ID columns, which are almost always supported by a unique index. But there are also foreign key columns using those numbers, and there the index will typically not be unique.

Because the values are bigger, fewer index entries fit into a single index block – meaning we need more index blocks. Here I focus explicitly on the size of indexes and not on the size of columns. I believe that a few additional bytes for a whole table row are usually not relevant. In the case of an index however, the whole row is that single value plus references (index branch block) or rowids (index leaf block).

For comparison reasons I also check the size of a reverse key index, since those were used in the past to combat index contention issues. Scalable sequences are the new kid in town that should replace reverse key indexes.

So let’s measure it.

Testscript 1 – single session insert

First we create a table with 3 number columns. Then we create a normal and a scalable sequence to provide the values. We add two normal indexes and one reverse key index on those 3 columns and populate the table with values originating from the sequences.

--------------------------------------------------------------------------------
-- Scenario: Test index size
--------------------------------------------------------------------------------
-- create a table with three columns to be filled with IDs
create table test_sequence_effects 
  (id_normal number, 
   id_scalable number,
   id_reverse number);

-- create a normal and a scalable sequence
create sequence test_sequence_normal;
create sequence test_sequence_scale scale;

-- create 3 indexes
create index test_sequence_normal_ix on test_sequence_effects(id_normal); 
create index test_sequence_scale_ix on test_sequence_effects(id_scalable);
create index test_sequence_reverse_ix on test_sequence_effects(id_reverse) reverse ;

-- insert 1 million datapoints
insert into test_sequence_effects 
select test_sequence_normal.nextval, test_sequence_scale.nextval, test_sequence_normal.nextval 
from dual connect by level <= 1000000;

commit;

-- create statistics
execute dbms_stats.gather_index_stats(user, 'TEST_SEQUENCE_NORMAL_IX',estimate_percent => 100);
execute dbms_stats.gather_index_stats(user, 'TEST_SEQUENCE_SCALE_IX',estimate_percent => 100);
execute dbms_stats.gather_index_stats(user, 'TEST_SEQUENCE_REVERSE_IX',estimate_percent => 100);

result after single session insert

We can have a look at the index statistics and the segment sizes.

-- check index stats
select index_name, blevel, leaf_blocks, distinct_keys, clustering_factor 
from dba_indexes 
where index_name like 'TEST_SEQUENCE_%'; 
INDEX_NAME               BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR
------------------------ ------ ----------- ------------- -----------------
TEST_SEQUENCE_NORMAL_IX       2        1999       1000000              4271
TEST_SEQUENCE_SCALE_IX        2        3379       1000000              4271
TEST_SEQUENCE_REVERSE_IX      2        2965       1000000            999996

Now all three indexes still have the same blevel, meaning for a single direct index access the performance should be identical. However, when the number of entries increases, the ID values created by a scalable sequence will eventually add a new level to the btree before the same happens for IDs from a normal sequence. I tested when this will happen. If you are curious, check out the appendix section “Blevel 3 break point”, but try to guess the value before you check.

-- check index size
select segment_name, segment_type, segment_subtype, blocks, bytes/1024/1024 as size_in_MB
from user_segments
where segment_name like 'TEST_SEQUENCE_%';
SEGMENT_NAME             SEGMENT_TYPE SEGMENT_SUBTYPE BLOCKS SIZE_IN_MB
------------------------ ------------ --------------- ------ ----------
TEST_SEQUENCE_EFFECTS    TABLE        ASSM              4352         34
TEST_SEQUENCE_NORMAL_IX  INDEX        ASSM              2176         17
TEST_SEQUENCE_REVERSE_IX INDEX        ASSM              3072         24
TEST_SEQUENCE_SCALE_IX   INDEX        ASSM              3584         28
table and index size

The scalable index in this experiment is 65% larger than the normal index. This increase can be attributed directly to the bigger numbers that need to be stored.

We have to keep in mind that the test scenario was run in a single session. In reality multiple sessions will be used to run the insert(s). This is where scalable sequences should give a performance boost, even at the cost of potentially slightly higher storage consumption.

Testscript 2 – parallel inserts from multiple sessions

Ok, now we repeat the same experiment, but using multiple sessions instead of a single session.

I created a code block that repeatedly does single row inserts in batches of 10 and then waits 1 second. This was run in up to 100 parallel sessions using dbms_scheduler jobs until approximately 1 million rows were inserted. The code for this test can be found in the appendix section “Testscript 2 – code”. Because of the 1 sec pause this will run for some time.

Index stats after parallel inserts

INDEX_NAME               BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR
------------------------ ------ ----------- ------------- -----------------
TEST_SEQUENCE_NORMAL_IX       2        3649       1000198            943211
TEST_SEQUENCE_SCALE_IX        2        6577       1000198             32329
TEST_SEQUENCE_REVERSE_IX      2        3240       1000198            986633

The number of blocks increased for all indexes. But more importantly we see that the clustering factor for the normal index went through the roof and is now close to the number of rows in the table. Which is bad. It also shows that scalable sequences seem to keep the promise of avoiding index contention and hot blocks, even if this test was not set up to measure any performance implications.

If you want to know more about clustering factor, read appendix section “index clustering factor”.

Side note: There were a few more rows inserted than in the single session approach (0.02%). That is because of how the stopping mechanism for the parallel jobs worked. It does not influence the comparison at all.

Index sizes after parallel inserts

SEGMENT_NAME             SEGMENT_TYPE SEGMENT_SUBTYPE BLOCKS SIZE_IN_MB
------------------------ ------------ --------------- ------ ----------
TEST_SEQUENCE_EFFECTS    TABLE        ASSM              4480         35
TEST_SEQUENCE_NORMAL_IX  INDEX        ASSM              3840         30
TEST_SEQUENCE_REVERSE_IX INDEX        ASSM              3328         26
TEST_SEQUENCE_SCALE_IX   INDEX        ASSM              6784         53

So the size of the normal index rose from 17 to 30 MB (76% increase) and the size of the scalable index from 28 to 53 MB (89%). The reason for both increases should be that we now see more 50-50 index block splits, resulting in a considerable number of blocks that are not full and are rarely reused.

Overall the size didn’t explode, but it is still a noticeable amount.

Side note: It is possible to shrink the indexes after such parallel insert sessions, but the effect is not great. Shrink and coalesce really should be considered when frequent deletes are run against the table. Scalable sequences potentially can provide values that might fill some of the gaps created by deletes. An index rebuild is not needed or advisable as long as more inserts are to come.
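For reference, these are the kind of maintenance commands meant here (a sketch; as said, the space gain after such parallel inserts was not great):

alter index test_sequence_scale_ix shrink space;
alter index test_sequence_scale_ix coalesce;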

Conclusion

Values created by scalable sequences use more storage space than values from normal sequences. For indexes we can expect a growth in index size between 50% and 100% compared to a normal index. The growth is partially because of the larger numbers, but also because there will be more unused space in many of the index blocks (caused by 50-50 index block splits instead of 90-10 splits).

The clustering factor for scalable sequences is always better than or equal to any alternative.

For all customers that I work with, such an increase in storage size would not be a problem at all, especially if there is a performance improvement.

For me the storage size increase is not an argument against scalable sequences.

Appendix

Testscript 2 – code

--------------------------------------------------------------------------------
-- Scenario: Test index size after parallel inserts
--------------------------------------------------------------------------------
-- cleanup logic
drop table test_sequence_effects purge;
drop sequence test_sequence_normal;
drop sequence test_sequence_scale;

execute dbms_scheduler.drop_program('test_sequence_do_inserts');
begin
 for j in (select job_name from all_scheduler_jobs where job_name like 'TEST_SEQUENCE%') loop
     dbms_scheduler.drop_job(j.job_name);
 end loop;    
end;
/
--------------------------------------------------------------------------------
-- create a table with three columns to be filled with IDs
create table test_sequence_effects 
  (id_normal number, 
   id_scalable number,
   id_reverse number);

-- create a normal and a scalable sequence
create sequence test_sequence_normal;
create sequence test_sequence_scale scale;

-- create 3 indexes
create index test_sequence_normal_ix on test_sequence_effects(id_normal); 
create index test_sequence_scale_ix on test_sequence_effects(id_scalable);
create index test_sequence_reverse_ix on test_sequence_effects(id_reverse) reverse ;

-- insert 1 million datapoints
-- run as 100 parallel scheduled jobs 
-- each job inserts a batch of 10 IDs using single row inserts 
-- each job waits for 1 second before the next batch is run
-- each job stops after the max number for the normal ID is reached.

--create a program for dbms_scheduler to be used in jobs later
begin
    DBMS_SCHEDULER.create_program (
    program_name   => 'test_sequence_do_inserts',
    program_type   => 'PLSQL_BLOCK',
    program_action => q'[declare
  last_id number;
  max_id number := 1000000;
  cnt_id number := 0;
  e_stop exception;
  pragma exception_init (e_Stop, -20022);
begin 
  for j in 1..max_id/10 loop
    -- single insert for 10 rows 
    for i in 1..10 loop
        insert into test_sequence_effects 
        values (test_sequence_normal.nextval -- ID_NORMAL
             , test_sequence_scale.nextval  -- ID_SCALABLE
             , test_sequence_normal.nextval -- ID_REVERSE
             )
        returning id_normal into last_id;
        cnt_id := cnt_id + 1;
        if last_id >= max_id then 
          commit;
          raise e_stop;
        end if;   
    end loop;
    commit; 
    -- pause for 1 sec  
    dbms_session.sleep(1);
  end loop;  
exception
 when e_stop then
   -- finish gracefully
   dbms_output.put_line('This session inserted '||to_char(cnt_id)||' rows');
   commit;
end;
]',
    enabled        => TRUE,
    comments       => 'Insert 10 rows, 1 sec pause.');


   -- activate the program. This does not run the logic yet.
   DBMS_SCHEDULER.enable (name => 'test_sequence_do_inserts');

end;
/

-- create multiple jobs
begin 
  for j in 1..100 loop
     dbms_scheduler.create_job
     (job_name      => 'test_sequence_do_inserts_job'||to_char(j,'FM000'),
      program_name  => 'test_sequence_do_inserts',
      start_date => systimestamp,
      enabled => true);

     dbms_scheduler.run_job (job_name => 'test_sequence_do_inserts_job'||to_char(j,'FM000'),
                              use_current_session => false);
   end loop;
end;
/



After executing this script the scheduled jobs run for quite some time.

We can check the number of rows in the table to see if it is finished.

select max(id_normal) last_value,count(*) row_count from test_sequence_effects;

other interesting selects to monitor the scheduled jobs

-- check scheduler jobs
select * from all_scheduler_programs;
select * from all_scheduler_jobs where job_name like 'TEST_SEQUENCE%';
select * from all_scheduler_running_jobs where job_name like 'TEST_SEQUENCE%';
select * from all_scheduler_job_log where job_name like 'TEST_SEQUENCE%';
select * from all_scheduler_job_run_details where job_name like 'TEST_SEQUENCE%';

And of course the statistics need to be gathered.

-- compute statistics for all indexes
execute dbms_stats.gather_index_stats(user, 'TEST_SEQUENCE_NORMAL_IX',estimate_percent => 100);
execute dbms_stats.gather_index_stats(user, 'TEST_SEQUENCE_SCALE_IX',estimate_percent => 100);
execute dbms_stats.gather_index_stats(user, 'TEST_SEQUENCE_REVERSE_IX',estimate_percent => 100);

Index clustering factor

What is the clustering factor of an index? Essentially it tells us how well the order of the index entries matches the order of the table data: how many table blocks need to be read if each row is accessed via the index? The value itself needs to be compared with the number of blocks in the table (minimum possible value) and the number of rows (maximum possible value).
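A quick way to put the values into perspective is to select them side by side (a sketch using the test table from above):

select i.index_name
     , i.clustering_factor
     , t.blocks   as table_blocks  -- best case: clustering factor close to this
     , t.num_rows as table_rows    -- worst case: clustering factor close to this
from user_indexes i
join user_tables  t on t.table_name = i.table_name
where i.table_name = 'TEST_SEQUENCE_EFFECTS';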

Here are some links that might improve your understanding of the clustering factor:

http://dbaparadise.com/2018/05/what-is-index-clustering-factor/

Ask Tom: https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:1032431852141

David Fitzjarrell in detail about the clustering factor and Oracle’s calculation of it

Blevel 3 break point

I ran some further tests to see when an index based upon a scalable sequence reaches blevel 3.

For a single session insert, blevel 3 was reached after approximately 38 million inserts.

INDEX_NAME                  BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR
--------------------------- ------ ----------- ------------- -----------------
TEST_SEQUENCE_NORMAL_IX          2       81148      38200000            173362
TEST_SEQUENCE_REVERSE_IX         2      125912      38200000          38199995
TEST_SEQUENCE_SCALE_IX           3      129055      38200000            173362

For multiple sessions the same number of leaf blocks should be reached earlier. However the distribution would be considerably more even. Both effects will partially cancel each other out. So far I didn’t measure it.

lateral join – decorrelation gone wrong

A colleague made me aware of the following misbehaviour of Oracle’s optimizer. Shoutout to Christine S. who discovered the problem.

demo

create table a 
as select 1 x from dual;

create table b 
as select 1 x, 1 y from dual union all 
   select 1 x ,2 y from dual;

select a.x, c.y
from  a,
     lateral (select b.x, max (b.y) y from b where b.x = a.x) c
where a.x = c.x;

result

         X          Y
---------- ----------
         1          2

At first glance this looks like what we intended to see. However a closer inspection of the code reveals that select b.x, max (b.y) y from b is not valid syntax: b.x is not an aggregate and there is no GROUP BY clause. We do get an error message when running the subquery directly.

ORA-00937: not a single-group group function

We also get the same error message when we use the select with the lateral keyword to drive a materialized view.

create materialized view mv
as
select a.x, c.y
from a,
     lateral (select b.x, max (b.y) y from b where b.x = a.x) c
where a.x = c.x;
ERROR at line 5:
ORA-00937: not a single-group group function

The ORA-937 error message is correct and should be displayed.

That we get a result in the first case, is an error!

reason

Oracle transforms the lateral join into a decorrelated lateral join. During this transformation the syntax error disappears. We can see this when checking the outline section of the execution plan (see below). When the same query is used in a materialized view, this transformation apparently does not happen, which is why we get the error there.

select *
from dbms_xplan.display_cursor(sql_id=>'2c1nchvqtjt6s', format=>'+outline');
Plan hash value: 3244120894

----------------------------------------------------------------------------------------
| Id  | Operation            | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                 |       |       |     7 (100)|          |
|*  1 |  HASH JOIN           |                 |     1 |    29 |     7  (15)| 00:00:01 |
|   2 |   TABLE ACCESS FULL  | A               |     1 |     3 |     3   (0)| 00:00:01 |
|   3 |   VIEW               | VW_DCL_A18161FF |     1 |    26 |     4  (25)| 00:00:01 |
|   4 |    HASH GROUP BY     |                 |     1 |     6 |     4  (25)| 00:00:01 |
|   5 |     TABLE ACCESS FULL| B               |     2 |    12 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      DB_VERSION('19.1.0')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$B1519A3D")
      DECORRELATE(@"SEL$2")
      OUTLINE_LEAF(@"SEL$13D91C5D")
      OUTER_JOIN_TO_INNER(@"SEL$71877063" "C"@"SEL$1")
      OUTLINE(@"SEL$2")
      OUTLINE(@"SEL$71877063")
      DECORRELATE(@"SEL$B1519A3D")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$B1519A3D")
      DECORRELATE(@"SEL$2")
      FULL(@"SEL$13D91C5D" "A"@"SEL$1")
      NO_ACCESS(@"SEL$13D91C5D" "C"@"SEL$1")
      LEADING(@"SEL$13D91C5D" "A"@"SEL$1" "C"@"SEL$1")
      USE_HASH(@"SEL$13D91C5D" "C"@"SEL$1")
      FULL(@"SEL$B1519A3D" "B"@"SEL$2")
      USE_HASH_AGGREGATION(@"SEL$B1519A3D")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."X"="C"."X")

The name of the view VW_DCL_xxx indicates that it was used for a decorrelated lateral operation. For a normal lateral operation the name of the view would be VW_LAT_xxx.

opinion

I’m a big fan of lateral joins. For more insights what a lateral join does, check this older blog post by the Optimizer Development Group: http://optimizermagic.blogspot.com/2007/12/outerjoins-in-oracle.html

Since 12c (or maybe a little earlier than that) there were some optimizer enhancements to lateral joins, called decorrelation. In essence it means a lateral join, which can be seen as a kind of correlated subquery, is rewritten into an uncorrelated subquery under certain specific circumstances.
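Conceptually the decorrelated form of the demo query looks roughly like this (a sketch of the idea, not the exact internal rewrite):

select a.x, c.y
from a
join (select b.x, max(b.y) y from b group by b.x) c
  on a.x = c.x;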

I believe this rewrite is almost always wrong. If the developer on purpose decides to use the lateral keyword (same goes for cross apply and outer apply) the optimizer should not try to correct this.

Other authors have written about problems with decorrelation before.

The case demonstrated here is just another example where decorrelation has gone wrong. An optimizer hint should not change the result of a query, and even more so should not change whether the query is formulated correctly or not. There are exceptions to this rule, where the intention of the hint is to change the outcome, for example ignore_row_on_dupkey_index or fresh_mv. But the (undocumented) DECORRELATE/NO_DECORRELATE hints are no such cases.

workaround

There are two ways to disable the decorrelation and get the expected ORA- error message.

a) use the NO_DECORRELATE hint.

SQL> select /*+ no_decorrelate(@"SEL$2") */
  2   a.x, c.y
  3  from  a,
  4       lateral (select b.x, max (b.y) y from b where b.x = a.x) c
  5  where a.x = c.x;
     lateral (select b.x, max (b.y) y from b where b.x = a.x) c
                     *
ERROR at line 4:
ORA-00937: not a single-group group function

b) switch the decorrelation feature off using the hidden (underscore) parameter _optimizer_ansi_join_lateral_enhance

SQL> alter session set "_optimizer_ansi_join_lateral_enhance"=false;

Session altered.

SQL> select a.x, c.y
  2  from  a,
  3       lateral (select b.x, max (b.y) y from b where b.x = a.x) c
  4  where a.x = c.x;
     lateral (select b.x, max (b.y) y from b where b.x = a.x) c
                     *
ERROR at line 3:
ORA-00937: not a single-group group function

 

conclusion

Although LATERAL joins are a nifty feature, always be aware that such features often come with bugs.

plsql function: who_called – simplify your exception handler!

Does your error logging logic still look like this?

example 1 – old school handler

procedure doSomething is
begin
   ...
exception
  when others then
     pk_log.error(p_module    => $$PLSQL_UNIT, 
                  p_submodule => 'doSomething', 
                  p_message   => 'Something not done!'||sqlerrm);
     raise;
end;

Here is what I prefer!

example 2 – compact handler

procedure doSomething is
begin
   ...
exception
  when others then
     pk_log.error('Something not done!');
     raise;
end;

Such code is way more convenient! Of course this will store the same information as the previous example 1.

To get there some parts have to be moved into the logging method (pk_log.error).

  1. sqlerrm
    is the error message
    When moving it, this should also be improved (at least use dbms_utility.format_error_stack)
  2. $$PLSQL_UNIT
    is the name of the package (or of the standalone procedure or function)
  3. ‘doSomething’
    is the hardcoded name of the procedure
    It needs to be gathered dynamically at runtime inside the logging package.

The biggest issue is 3.) – to find the name of the module at runtime. The function who_called can solve that. It also gathers the package name on the way, so 2.) is also covered.

Side note: Is hardcoding the procedure name a problem?

Not really. However if the name changes one needs to change it in several lines. And in case one forgets to change the error handler, the compiler will not issue an error. Then the problem might go unnoticed for some time and could be misleading in case of exceptions.

I have two versions of the who_called function. The first, standalone version is not recommended for real use. Instead use the packaged version. However the first version demonstrates the implemented logic in a clear way.

Both functions will find the full name of the module that called this code block. The default behaviour is that modules from the same package are ignored. This allows easy usage inside packages meant for logging, auditing or testing. Everything that is instrumentation code can usually profit from this little helper.

Usecase 1 – demo purposes – standalone function

This logic assumes that we use the function to find out who called the procedure that is currently running. So not the name of the procedure itself, but the name of the caller.

Installation code

-- Cleanup
-- drop function who_called;

-- Usecase 1: standalone function - simple demo version

-- check the call stack to find the calling module.
create or replace function who_called return varchar2
  is
    v_caller utl_call_stack.unit_qualified_name;
    v_name      varchar2(128);
begin
    -- print the call stack just for demo purposes
    for i in 1..utl_call_stack.dynamic_depth  loop
      sys.dbms_output.put_line(
         rpad('+',utl_call_stack.dynamic_depth-i,'+')||
         utl_call_stack.concatenate_subprogram( utl_call_stack.subprogram(i) )
         );
    end loop;

    
    -- step backwards through the call stack 
    --      1 = current module = who_called function
    --      2 = the module that calls who_called 
    --      3 = the caller!
    v_caller := utl_call_stack.subprogram(3);

    v_name := utl_call_stack.concatenate_subprogram(v_caller);

    return v_name;

end who_called;
/

example 3 usage

set serveroutput on

-- Testcase 1.1 anonymous block
declare
  procedure DoSomething as
  begin
    dbms_output.put_line('I want to doSomething!');
    dbms_output.put_line('Who called? '|| who_called);
  end;

  procedure DoMore as
  begin
    doSomething;
    dbms_output.put_line('I want to doMore!');
    dbms_output.put_line('Who called? '|| who_called);
  end;
begin
  doMore;
end;
/

output

I want to doSomething!
+++WHO_CALLED
++__anonymous_block.DOSOMETHING
+__anonymous_block.DOMORE
__anonymous_block
Who called? __anonymous_block.DOMORE
I want to doMore!
++WHO_CALLED
+__anonymous_block.DOMORE
__anonymous_block
Who called? __anonymous_block


PL/SQL procedure successfully completed.

Each time the function is called, it prints the full call stack to the screen. This is just for demonstration purposes.
The first time, the function who_called is executed from inside the submodule doSomething.
The call stack at this point looks like this:

Position in stack  Module
1                  who_called
2                  doSomething
3                  doMore
4                  __anonymous_block

Line 5 in the anonymous block is the line with the who_called function (call stack 1). It is part of the doSomething procedure (call stack 2). On position 3 of the call stack we always find the caller – here the procedure doMore that did execute doSomething.

The second time the function is called is in the doMore procedure. And the call stack looks like this:

Position in stack  Module
1                  who_called
2                  doMore
3                  __anonymous_block

In this case the caller is the anonymous block.

The example clearly shows that we can fetch the name of any module in the call stack (including anonymous blocks). We just have to fetch the name from the proper level in the call stack.

Usecase 2 – error and trace logging – packaged version

This works a bit differently when used inside a larger package that is used to instrument the code base. Typically we add error logging logic and debug (trace) logging calls. In that scenario, we want to find out the name of the module where the trace or error logging call is made.

The logic assumes that we have a package used for this instrumentation purpose only. Which also means, that all calls inside this package can be ignored. So essentially we want to find out what is the full name of the procedure (or function) where the log error or the log trace call is done. The full name can be something like package.module.submodule.subsubmodule.

Installation code

When testing this, make sure you do not have/own a package pk_log already. This would overwrite it.

If you like it, copy the code for the who_called function into your own instrumentation package.

-- Cleanup
-- drop function who_called;

-- who_called packaged version 
-- check the call stack to find the calling module.
-- calls from this package are not considered
create or replace package pk_log as
/* this is a very simplified example for error logging 
   only demonstration purposes of the WHO_CALLED function 
   your error logging logic must be better!
*/
  procedure error(message in varchar2);
  procedure trace(message in varchar2); 
end pk_log;
/

create or replace package body pk_log as
    function who_called return varchar2
      is
        -- author: Sven Weller, 2021, syntegris information solutions GmbH
        v_pkg    varchar2(128) := $$plsql_unit; -- name of the package that holds the who_called function
        v_caller utl_call_stack.UNIT_QUALIFIED_NAME;
        v_name varchar2(128);
    begin
        -- step backwards through the call stack 
        --      1 = current module = who_called function
        --      2 = the module that calls who_called 
        --      3 = the caller!
        -- we could start later (level 3 or even 4), if we knew exactly how many other submodules are always in the call stack 
        -- This might be risky if function inlining is possible, but that can be prevented by a pragma. Example in proc error
        for i in 2..utl_call_stack.dynamic_depth  loop

          v_caller := utl_call_stack.subprogram(i);

          v_name := utl_call_stack.concatenate_subprogram(v_caller);

          -- at lexical depth 1 we find the name of the package
          if v_caller(1) != v_pkg then
            -- In some scenarios, we might want to go even one step upward the call stack. Typically not for error logging purposes.
            -- In such cases we could simply fetch one more name from the call stack
            -- v_caller := utl_call_stack.subprogram(i+1);
            
            -- build the full name, but only if we found the caller
            v_name := utl_call_stack.concatenate_subprogram(v_caller);
            -- no need to go further through the call stack
            exit;
          end if;
        end loop;

        return v_name;

    end who_called;

    procedure writeMessage(message in varchar2, message_type in varchar2) as
    begin
      case message_type 
      when 'E' then
        sys.dbms_output.put_line('ERROR at ' || who_called);
      when 'T' then
        sys.dbms_output.put_line('Info from ' || who_called);
      else  
        sys.dbms_output.put_line('ERROR! Unknown Message Typ ' || message_type || ' in '||who_called);
      end case;        
      sys.dbms_output.put_line(message);

      if message_type = 'E' then 
        -- Add full error stack
        sys.dbms_output.put_line(dbms_utility.format_error_stack);
        -- Add line where the error happened
        -- Only the last line from the backtrace is relevant, all other lines are already in the error stack
        -- compare output to sys.dbms_output.put_line(dbms_utility.format_error_backtrace);
        sys.dbms_output.put_line(' in '||nvl(UTL_CALL_STACK.backtrace_unit(utl_call_stack.backtrace_depth),'anonymous block')
                               ||' at line '||sys.utl_call_stack.backtrace_line(utl_call_stack.backtrace_depth)
                               );
      end if;
    end writeMessage; 

    procedure error(message in varchar2) as
      pragma autonomous_transaction;
    begin
      pragma inline(writeMessage,'NO'); -- we don't want to inline in case of plsql_optimize_level = 3.
      writeMessage(message,'E' );
    end error;
    procedure trace(message in varchar2) as
      pragma autonomous_transaction;
    begin
      writeMessage(message,'T' );
    end trace;   
end pk_log;
/

example 4 call package, log trace and log error

set serveroutput on

-- Testcase 2.1 anonymous block
declare
  procedure DoSomething as
  begin
    pk_log.trace('I want to doSomething!');
  end;

  procedure DoMore as
  begin
    doSomething;
    raise no_data_found;
  end;
begin
  doMore;
exception 
  when others then
    pk_log.error('I wanted to do more, but this happened!');
    raise;
end;
/

output

This is the dbms_output (the simplified trace and error logs).

Info from __anonymous_block.DOSOMETHING
I want to doSomething!
ERROR at __anonymous_block
I wanted to do more, but this happened!
ORA-01403: no data found
ORA-06512: at line 10

 in anonymous block at line 13

The output is followed by the exception from the final RAISE; of the block (from SQL developer).

Error starting at line : 4 in command -
declare
  procedure DoSomething as
  begin
    pk_log.trace('I want to doSomething!');
  end;

  procedure DoMore as
  begin
    doSomething;
    raise no_data_found;
  end;
begin
  doMore;
exception 
  when others then
    pk_log.error('I wanted to do more, but this happened!');
    raise;
end;
Error report -
ORA-01403: no data found
ORA-06512: at line 17
ORA-06512: at line 10
ORA-06512: at line 13
01403. 00000 -  "no data found"
*Cause:    No data was found from the objects.
*Action:   There was no data from the objects which may be due to end of fetch.

In SQL*Plus we get one more line, because of the additional RAISE in line 17.

There is something special in the code. The last line from the error backtrace is captured and returned: "in anonymous block at line 13".

This last line is missing in the error stack, which means it is usually not returned when you only store dbms_utility.format_error_stack.

To add this last line utl_call_stack.backtrace_line(utl_call_stack.backtrace_depth) is executed. All other lines from the error backtrace are already in the error stack (depends on DB version) – so those can be ignored.

Conclusion

For my purposes it works fine and the resulting code is very clean.

I currently use it in an audit package based upon Connor McDonald’s Audit Generator, which was massively reworked for some custom needs. Maybe I will publish this one day. But that is another story.

who_called advantages

  • Way less to type – fewer mistakes
  • much easier to see what the error handler does
  • concentrate coding effort on the important bits – mostly the additional information that should be added to the error

who_called disadvantages

  • A tiny bit slower than hard coding the package and submodule name
    => For exception handling this is not relevant.
    => In huge debug or trace scenarios this time could add up if extreme performance is needed. Conditional compilation can be applied to remove instrumentation calls if you compile for extreme plsql performance.
  • since part of the logic is hidden, other developers might still want to add the module name to the message
  • module name is fetched at run time, not at compile time
    => this can be an issue, especially if optimization level 3 is used. Function inlining (check the compiler warnings) is a method that the plsql optimizer can choose to speed up logic. However it also means that the function will not exist anymore at runtime, and utl_call_stack will not be able to report the name of the module correctly. Note that there is a compiler warning telling us that inlining happened (which usually is a good thing).

Feel free to adapt this code snippet to your special needs. I’m convinced this will help you to write simpler and more efficient error handlers.

Small demo about sequences and identity columns

A recent twitter discussion led me to (re)test a little more of the syntax around auto-incrementing ID columns that work without triggers. I was especially curious about the LIMIT VALUE option, which I didn’t know before.

identity options

This option is documented (as opposed to RESTART, which still seems not to have made it into the docs).

Demo 1 – Attach a sequence to a table via the DEFAULT option

The comments explain what I’m trying to do/show.

-- Create some sequences to be used later
create sequence mySeq1;
Sequence MYSEQ1 created.

create sequence mySeq2;
Sequence MYSEQ2 created.

-- Table with PK not tied to a sequence
create table t1 (id  number not null primary key
               , txt varchar2(30)); 
Table T1 created.

-- fill PK using a Sequence
insert into t1 values (mySeq1.nextval, 'Test1');
1 row inserted.

-- set the sequence to a new start value 55
alter sequence mySeq1 restart start with 55;
Sequence MYSEQ1 altered.

-- insert more data
insert into t1 values (mySeq1.nextval, 'Test2');
1 row inserted.

-- check data
select * from t1;
ID TXT
1 Test1
55 Test2

Unfortunately we can not easily change our PK into an identity column.

alter table t1 modify id generated as identity;

ORA-30673: column to be modified is not an identity column

alter table t1 modify id generated by default on null as identity;

ORA-30673: column to be modified is not an identity column

Other syntax variants, including LIMIT VALUE, do not work either.

However it is easily possible to use a default value for the column and populate it by an existing (user managed) sequence.

-- Use a default value column and a different sequence
alter table t1 modify id default on null mySeq2.nextval;
Table T1 altered.

-- Trying to insert a value gives a UK error, 
-- because we used a fresh sequence which gave us value 1,
-- which exists already.
insert into t1 (txt) values ('Test3');
ORA-00001: unique constraint (MYUSER.SYS_C00145531) violated

-- set the sequence to a different value using RESTART
alter sequence mySeq2 restart start with 70;
Sequence MYSEQ2 altered.

insert into t1 (txt) values ('Test3');
1 row inserted.

insert into t1 values (null, 'Test4');
1 row inserted.

insert into t1 values (default, 'Test5');
1 row inserted.

-- Check data
select * from t1;
ID TXT
1 Test1
55 Test2
70 Test3
71 Test4
72 Test5

So Demo 1 shows some commands around normal ID columns that are populated via a default value setting.

Demo 2 – Use identity column from the start

create table t2 (id  number generated always as identity primary key
               , txt varchar2(30)); 
Table T2 created.

-- Try to insert some rows
insert into t2 values (null, 'Test1');
ORA-32795: cannot insert into a generated always identity column

insert into t2 values (default, 'Test2');
1 row inserted.

insert into t2 (txt) values ('Test3');
1 row inserted.

-- Try to insert a record via plsql
declare 
  r t2%rowtype;
begin 
  r.txt := 'Test4';
  insert into t2 values r;
end;
/

ORA-32795: cannot insert into a generated always identity column

-- hide the ID column then try row insert by record again
alter table t2 modify id invisible;
Table T2 altered.

declare 
  r t2%rowtype;
begin 
  r.txt := 'Test5';
  insert into t2 values r;
end;
/
PL/SQL procedure successfully completed.

-- Check data
select * from t2;
TXT
Test2
Test3
Test5

The ID is missing? Well it is there.

select id, txt from t2;
ID TXT
1 Test2
2 Test3
3 Test5

Hiding the ID column is one possible way to make certain types of inserts work again. Not my preferred way though.

-- make column visible again
alter table t2 modify id visible;
Table T2 altered.

-- logical column order is now changed. This can impact DML that depends on column order!
select * from t2;
TXT ID
Test2 1
Test3 2
Test5 3
-- restore original column order by toggling other columns invisible
alter table t2 modify txt invisible;
Table T2 altered.

alter table t2 modify txt visible;
Table T2 altered.

-- modify to allow other data values (instead of generated ALWAYS)
alter table t2 modify id generated by default on null as identity ;
Table T2 altered.

-- Try inserts again that didn't work previously
insert into t2 values (null, 'Test1');
1 row inserted.

declare 
  r t2%rowtype;
begin 
  r.txt := 'Test4';
  insert into t2 values r;
end;
/
PL/SQL procedure successfully completed.

-- Check data
select * from t2;
ID TXT
1 Test2
2 Test3
3 Test5
4 Test1
5 Test4
-- add custom IDs
insert into t2 values (-1, 'Test6');
1 row inserted.

insert into t2 values (100, 'Test7');
1 row inserted.

-- Check data
select * from t2;
ID TXT
1 Test2
2 Test3
3 Test5
4 Test1
5 Test4
-1 Test6
100 Test7

So now we can insert other ID values into that column, even though we have an identity column. Eventually this will lead to a UK error when the value created by the Oracle managed sequence reaches 100.

This can be solved with the LIMIT VALUE clause.

-- Check current high water mark (last value) of sequence identity column
select identity_column, data_default from user_tab_columns where table_name = 'T2' and column_name = 'ID';

IDENTITY_COLUMN	DATA_DEFAULT
-------
YES	"MYUSER"."ISEQ$$_258212".nextval

-- using the sequence name from the previous select
select last_number, cache_size 
from user_sequences where sequence_name = 'ISEQ$$_258212';

LAST_NUMBER	CACHE_SIZE
-------
24	20

-- reset Identity column to the highest value that is currently in the table 
alter table t2 modify ID  generated by default on null as identity start with limit value;
Table T2 altered. 

-- Check high water mark again 
select last_number, cache_size 
from user_sequences where sequence_name = 'ISEQ$$_258212';

LAST_NUMBER	CACHE_SIZE
-------
101	20

-- insert 
insert into t2 values (default, 'Test8');
1 row inserted.

-- Check data
select * from t2;
ID TXT
1 Test2
2 Test3
3 Test5
4 Test1
5 Test4
-1 Test6
100 Test7
101 Test8

Success!

So Demo 2 showed how to work with an identity column, including the possibility to switch it from generated always as identity to generated by default on null. Which opens up some ways to manipulate the data inside this managed ID column.

Remember: we can not influence (alter) the Oracle managed sequence that drives the identity column directly. But we can change its properties via ALTER TABLE ... MODIFY.
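For example, the underlying sequence can be repositioned at a fixed value (a sketch, assuming START WITH also accepts a literal value here, analogous to the LIMIT VALUE command above; 1000 is just an example):

alter table t2 modify id generated by default on null as identity start with 1000;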

cleanup code

-- cleanup
drop sequence mySeq1;
drop sequence mySeq2;
drop table t1 purge;
drop table t2 purge;

How to avoid misleading plsql warning PLW-07206: analysis suggests that the assignment to ‘V_DUMMY’ may be unnecessary

PLW-messages are warnings issued by the PL/SQL compiler.

The PLW-07206 warning tells us that we have some code (an assignment statement) that is not needed. Usually because the target of the assignment is not used later anywhere.

PLW-07206: analysis suggests that the assignment to <something> may be unnecessary

Problem demonstration

Example 1)

Here is a very reduced demonstration of the issue.

First we enable the PL/SQL warnings. I assume this setting is still in place for all following examples.

-- set plsql warning on
alter session set PLSQL_WARNINGS = 'ENABLE:ALL';

In SQL Developer we can set the warning level in the preferences, but this setting is not applied to code running in a SQL worksheet as a script.

Then a function is created.

create or replace function dummy_existsCheck return boolean 
authid definer
is
      v_dummy number(1);
begin
      select 1
      into v_dummy
      from all_objects
      where 1=2;
      return true;  
      -- Do something here if a record was found
exception  
     when no_data_found then
      -- Do something else here if NO record was found
      return false;
end dummy_existsCheck;
/  

Compiling the function shows the warning.

Function DUMMY_EXISTSCHECK compiled
LINE/COL ERROR

6/7 PLW-07206: analysis suggests that the assignment to 'V_DUMMY' may be unnecessary

Example 2)

Here is an example where the warning is correctly added.

create or replace function dummy_existsCheck return boolean 
authid definer
is
   v_dummy varchar2(1);
begin
   v_dummy := 'B';
     
   -- Do something here without using v_dummy
   return null; 
end dummy_existsCheck;
/ 
Function DUMMY_EXISTSCHECK compiled
LINE/COL ERROR

6/5 PLW-07206: analysis suggests that the assignment to 'V_DUMMY' may be unnecessary

The difference between the two examples is that a SELECT statement in PL/SQL needs an INTO clause, so the dummy variable is required there. The warning in the first example is therefore wrong.

The compiler is aware of this situation already. For some strange reason the warning goes away when we change the datatype of the variable from number to varchar2. Which is also the suggested workaround.

Workaround for example 1)
create or replace function dummy_existsCheck return boolean 
authid definer
is
      v_dummy varchar(1);
begin
      select 'x'
      into v_dummy
      from all_objects
      where 1=2;
      return true;  
      -- Do something here if a record was found
exception  
     when no_data_found then
      -- Do something else here if NO record was found
      return false;
end dummy_existsCheck;
/  

Function DUMMY_EXISTSCHECK compiled

No warning issued.

Obviously this is a bug. Although a very minor one.

Example 3)

Here is another slightly more complex example. The same message appears when we assign the result of a function. Because a function always has a return value, we need to assign it to some variable, even if that variable is not used anywhere later in the code.

create or replace function dummy_getID(p_table in varchar2) return number
authid definer
is
      v_result number(1);
begin
      select 1
      into v_result
      from user_tables
      where table_name = p_table;
      
      return v_result;
end dummy_getID;
/  

create or replace function dummy_existsCheck return boolean
authid definer
is
  v_tab varchar2(30) := 'DUMMY';
  v_dummy number(1);
begin
  v_dummy := dummy_getID(v_tab);
  return true;
exception
  when no_data_found then
    dbms_output.put_line('Table '||v_tab||' does not exist.');
    return false;
end dummy_existsCheck;
/ 

Function DUMMY_GETID compiled
Function DUMMY_EXISTSCHECK compiled
LINE/COL ERROR

7/3 PLW-07206: analysis suggests that the assignment to 'V_DUMMY' may be unnecessary

Workaround for example 3)

Again, changing the variable to a varchar2 data type, avoids the warning.

create or replace function dummy_existsCheck return boolean
authid definer
is
  v_tab varchar2(30) := 'DUMMY';
  v_dummy varchar2(1);
begin
  v_dummy := dummy_getID(v_tab);
  return true;
exception
  when no_data_found then
    dbms_output.put_line('Table '||v_tab||' does not exist.');
    return false;
end dummy_existsCheck;
/ 
Function DUMMY_EXISTSCHECK compiled

Datatype check

Surprisingly, other numeric data types also avoid the warning. It seems that the warning appears mostly for number(x).

The following data types did not produce the warning:

  • varchar2(1)
  • varchar(1)
  • char(1)
  • pls_integer
  • binary_integer
  • number
  • float

The following data types did produce the warning:

  • number(1)
  • number(38)
  • decimal
  • integer
  • integer(1)
  • simple_integer
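Each datatype was checked with a small variation of example 2, roughly like this sketch (dummy_datatype_check is just a throwaway name; swap the declaration of v_dummy for the datatype under test and adjust the assigned value if needed):

create or replace function dummy_datatype_check return boolean
authid definer
is
   v_dummy number(1);   -- replace with the datatype to test
begin
   v_dummy := 1;
   -- v_dummy is intentionally not used after this point
   return null;
end dummy_datatype_check;
/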

Conclusion

Compiler warnings are a good thing. Wrong or misleading warnings are not.

PLW-07206 can (sometimes) be avoided by changing the data type of the assignment target from number(x) to some other data type.

Cleanup demo code
drop function dummy_existsCheck;
drop function dummy_getID;

Warning! Hidden traps – comments in SQL*Plus

I am a big fan of comments (see also CDD Manifesto a bit further down). Comments are simple ways to add information to objects/code without changing code.

Comments that work in SQL or PLSQL might behave differently in SQL*Plus. This post is to create awareness about such traps.

The basics

comments in SQL and PLSQL

Comments in SQL and PLSQL come in two flavors. The one line comment and the multi line comment.

-- this is a one line comment. 

The double hyphen "--" starts the one line comment. It ends at the end of the line.

/* this
is a
mutli-line
comment */

A slash "/" followed by an asterisk "*" starts a multi line comment. And it is ended in the opposite way.

Comments do not stack.

-- this /* will
not work */

This also will not work.

/* line 1
   /* line 2
      /* line 3
      line 4 */
   line 5 */
line 6 */

The multi-line comment ends after the first time "*/" is encountered. So lines 5 and 6 are not commented; instead line 5 will raise a syntax error.

In SQL a very special form of a comment is a hint. A hint tries to give the SQL optimizer additional instructions. A plus sign "+" right after the comment opener indicates that the comment is a hint.

select /*+ warp_speed(5) */ * from big_table;

A hint could be compared to an annotation in other languages. The comment syntax is used to add some special meaning to the surrounding code. This post is not about such annotations.

comments in SQL*Plus

In general we run SQL statements, PLSQL blocks and SQL*Plus commands in a SQL*Plus script. Typical examples are SELECT statements, CREATE OR REPLACE PACKAGE commands and SET commands.

The normal SQL and PLSQL comments work in SQL*Plus as well. Additionally we have a REM(ARK) command to add comments.

SQL> REM this is a SQL*plus comment

We can also use PROMPT to have a kind of echo inside a SQL*Plus script. Although it is not a real comment, it is used in a similar way.

SQL> PROMPT *** heading ***
*** heading ***

Surprising differences

SQL*Plus interprets comments slightly differently than SQL or PL/SQL. Here are some examples. Try to guess what happens before reading the result. The examples do work in SQL Developer, but behave differently in SQL*Plus.

Example 1

generate some DDL commands…

select 'drop '||object_type||' '||object_name||';' 
from user_objects 
--where object_type = 'VIEW';
where object_type = 'TRIGGER';

Result 1 (real object names changed)
... many similar rows ...
'DROP'||OBJECT_TYPE||''||OBJECT_NAME||';'
drop SYNONYM PLSQL_PROFILER_DATA;
drop SYNONYM PLSQL_PROFILER_RUNNUMBER;
drop SYNONYM PLSQL_PROFILER_RUNS;
drop SYNONYM PLSQL_PROFILER_UNITS;
drop INDEX ABC_UK01;
drop PACKAGE MYPK;
drop PACKAGE BODY MYPK;
drop TABLE MYTAB;
drop DATABASE LINK DISTANT.WORLD.LOCAL;
drop DATABASE LINK REMOTE.WORK.LOCAL;
2243 rows selected.
SQL> SP2-0734: unknown command beginning "where obje…" - rest of line ignored.
SQL>

2243 rows selected? Yes there were some triggers, but not that many. When SQL*Plus finds a semicolon at the end of a line, it interprets this as the end of the command. Even if that semicolon is commented out. In this example the statement produced a drop command for each object in the schema. But the intention was to only drop all triggers.

Example 2

Add plsql doc information…

create or replace function convertNum(p1 in varchar2) return number
is
/***************************
@usage: converts a string into a number. 
        If conversion is not possible, 0 is returned.
@author: Sven
@param : p1 : input string
@return : converted number
****************************/
begin
  return to_number(p1 default '0' on conversion error);
end convertNum;
/

Result 2
SQL>
create or replace function convertNum(p1 in varchar2) return number
is
/***************************
@usage: converts a string into a number.
SP2-0310: unable to open file "usage:.sql"
        If conversion is not possible, 0 is returned.
@author: Sven
SP2-0310: unable to open file "author:.sql"
@param : p1 : input string
SP2-0310: unable to open file "param.sql"
@return : converted number
SP2-0310: unable to open file "return.sql"
****************************/
begin
  return to_number(p1 default '0' on conversion error);
end convertNum;
  9  /

Function created.

SQL>

An @ at the start of a line is always interpreted as a call to a script even if it is inside a multi line comment. This lowers the usefulness of the “generate DB doc” functionality in SQL Developer a lot. Putting additional blanks before the @ will not change this behaviour.

Example 3

comment out optional params…

SQL> set serveroutput on -- size unlimited
Result 3
SP2-0158: unknown SET option "--"

This means the SET option is ignored. There is an error message, but it is a SQL*Plus (SP2) error message. This would not stop a script that has WHENEVER SQLERROR EXIT set at the beginning.

Example 4

comment a statement…

select sysdate from dual; -- print current day

Result 4
SQL> select sysdate from dual; -- print current day
  2

Do you notice the 2 at the beginning? The statement is not terminated. The buffer waits for the next line, and the next command will probably finish the buffer and run into a syntax error.

Warning

Personally I think that those examples are bugs. But it is documented behaviour, so Oracle probably disagrees. There is an extra section in the SQL*Plus documentation dedicated to comments in scripts, where these and more problems are described.

Remember

If you intend to run some plsql code or a SQL statement from a SQL*Plus script, then

  • do not have semicolons in a comment, especially not at the end of the line
  • do not use @ in the comment, especially not at the beginning of a line
  • do not use -- inside SELECTs (hints seem to work)
  • avoid using -- or /* */ to write SQL*Plus comments; instead use REMARK or PROMPT

More side effects

Comments near the beginning of a command can confuse SQL*Plus.

SQL> create /* testproc */ procedure test
2 is
3 begin null;
Warning: Procedure created with compilation errors.
SQL> end;
SP2-0042: unknown command "end" - rest of line ignored.
SQL> /

Blank lines in comments can break the code. This can be suppressed by using SET SQLBLANKLINES ON.

SQL> SET SQLBLANKLINES ON;
SQL> select sysdate from dual
2 /* this is a multi
3 line
4
5 comment */
6 ;
SYSDATE
25-AUG-20

SQL> SET SQLBLANKLINES OFF;
SQL> select sysdate from dual
2 /* this is a multi
3 line
4
SQL> comment */
2 ;
comment */
*
ERROR at line 1:
ORA-00969: missing ON keyword
SQL>

SQL*Plus tries to substitute an ampersand "&" in a comment with the value of a substitution variable. This can be suppressed using SET DEFINE OFF. Not a big deal, unless we do not want to suppress it.

SQL> select sysdate from dual
2 -- current day at marks&spencers
3 ;
Enter value for spencers:
old 2: -- current day at marks&spencers
new 2: -- current day at marks
SYSDATE
25-AUG-20
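With substitution switched off, the ampersand is left alone (a sketch, output shortened):

SQL> SET DEFINE OFF
SQL> select sysdate from dual
  2  -- current day at marks&spencers
  3  ;
SYSDATE
25-AUG-20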

A comment inside a PROMPT is not a comment, but will be printed to the screen.

SQL> prompt /* comment or not comment that is the question */
/* comment or not comment that is the question */

If you want a “;” to be printed to a spool file using PROMPT, use two “;;”.

SQL> prompt test;;
test;

Additional thoughts

Good developers can get a solid overview of unknown code just by reading the comments.

Set your syntax highlighting in such a way that makes it easy to read comments. Unfortunately Oracle SQL Developer some time ago switched from green comments to grey comments. My recommendation is to change this setting.

Comment Driven Development (CDD) Manifesto

Comments are good!

Every code needs comments!

Comment first, code later!

Comments state code intention.

Comments describe why a technical decision was made

or not made.

Comments can give examples.

The world needs more comments!

A quirk with object dependencies under EBR

TL;DR;

USER_DEPENDENCIES does not show dependencies between objects in different editions. Which means, even when no dependencies are shown, we still can get errors when trying to drop objects. This is especially relevant when trying to drop TYPEs.

The basics

Oracle tracks dependencies between objects, for example packages but also between types. It provides us a view USER_DEPENDENCIES (and the matching ALL_|DBA_|CDB_DEPENDENCIES views) where we can see how those objects are related (are calling each other).

Here are two small examples

Example 1) Table + View

create table myTab (col1 number);
create view myView as select * from myTab;
Table MYTAB created.
View MYVIEW created.

Example 2) Object Type + Collection Type

create type myOT as object (col1 number);
/
create type myCT as table of myOT;
/
Type MYOT compiled

Type MYCT compiled

Further code demos will all assume that those 4 objects are in place unless mentioned otherwise.

The data dictionary

select * from user_dependencies;
NAME   TYPE REFERENCED_OWNER REFERENCED_NAME REFERENCED_TYPE REFERENCED_LINK_NAME SCHEMAID DEPENDENCY_TYPE
------ ---- ---------------- --------------- --------------- -------------------- -------- ---------------
MYOT   TYPE SYS              STANDARD        PACKAGE                                   122 HARD
MYCT   TYPE SYS              STANDARD        PACKAGE                                   122 HARD
MYVIEW VIEW SVEN             MYTAB           TABLE                                     122 HARD
MYCT   TYPE SVEN             MYOT            TYPE                                      122 HARD

We can ignore the references to the sys.standard package. I think they come from references to datatypes that are defined there. Those dependencies do not affect the following quirks.

The data dictionary shows that the view MYVIEW depends on the table MYTAB (example 1) and that the collection type MYCT depends on the object type MYOT (example 2).

The dependency type HARD tells us, that we are not allowed to drop the referenced objects without breaking|invalidating the dependent object.

drop table mytab;
Table MYTAB dropped.
select status from user_objects where object_name = 'MYVIEW';
INVALID

For types we get an error message when trying to drop it.

drop type myot;
ORA-02303: cannot drop or replace a type with type or table dependents

Side note: Another dependency type would be REF, which we encounter for example when using materialized views (MV). If a table is dropped that is used by a select in the materialized view, then we can still fetch data from the MV.

To successfully drop both types we first need to drop the collection type MYCT before we can drop the referenced object type MYOT.

drop type myct;
drop type myot;
Type MYCT dropped.

Type MYOT dropped.

Ok, this worked as expected: we dropped the objects in the opposite order to the one in which we created them.

How does it work under edition based redefinition (EBR)?

Views and types are both editionable objects. That means we can have several “versions” of them in the same schema under different editions.

Assuming my typical 3 edition hierarchy

ORA$BASE ⇒ BETA ⇒ ALPHA 
(Parent ⇒ child1 ⇒ child2)  
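If you need to build such a hierarchy first, this is a minimal sketch (assuming the necessary privileges, e.g. CREATE ANY EDITION and a USE grant on the editions, are in place):

create edition beta as child of ora$base;
create edition alpha as child of beta;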

I will only use ORA$BASE and ALPHA for this experiment. The 4 objects all have been created in ORA$BASE. Which means, they can be accessed also in the child editions. So let’s switch to ALPHA.

alter session set edition="ALPHA";
select sys_context('userenv','current_edition_name') from dual;
ALPHA

The data dictionary entries are the same as in the base edition.

select * from user_dependencies where name like 'MY%';
NAME   TYPE REFERENCED_OWNER REFERENCED_NAME REFERENCED_TYPE REFERENCED_LINK_NAME SCHEMAID DEPENDENCY_TYPE
------ ---- ---------------- --------------- --------------- -------------------- -------- ---------------
MYOT   TYPE SYS              STANDARD        PACKAGE                                   122 HARD
MYCT   TYPE SYS              STANDARD        PACKAGE                                   122 HARD
MYVIEW VIEW SVEN             MYTAB           TABLE                                     122 HARD
MYCT   TYPE SVEN             MYOT            TYPE                                      122 HARD

Now we compile one of the types, but NOT the other.

alter type myct compile;
Type MYCT altered.

This did actualize the type MYCT in the child edition. Actualization is a concept of EBR: when a parent object is (re)compiled in a child edition, the code afterwards exists twice, in the parent and in the child edition.
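A quick check shows the effect of the actualization (a sketch; the same view is discussed in more detail a bit further down):

select edition_name, object_name, object_type
from user_objects_ae
where object_name = 'MYCT';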

We go back to the parent edition and repeat our test now.

alter session set edition="ORA$BASE";
select sys_context('userenv','current_edition_name') from dual;
ORA$BASE

We try again to drop the types in the correct drop order.

drop type myct;
drop type myot;
Type MYCT dropped.

Error starting at line : 2 in command -
drop type myot
Error report -
ORA-02303: cannot drop or replace a type with type or table dependents

This time we cannot drop the object type. Unfortunately the data dictionary does not tell us which object causes this dependency problem.

select * from user_dependencies where name like 'MY%';
NAME    TYPE  REFERENCED_OWNER  REFERENCED_NAME  REFERENCED_TYPE  REFERENCED_LINK_NAME  SCHEMAID  DEPENDENCY_TYPE
MYOT    TYPE  SYS               STANDARD         PACKAGE          (null)                122       HARD
MYVIEW  VIEW  SVEN              MYTAB            TABLE            (null)                122       HARD

MYOT is not in the list of referenced names anymore. So we should be able to drop it.

The problem is that the view USER_DEPENDENCIES only shows objects that are in the current edition or that are not editionable (like sys.standard). But we have created a dependency between the object MYCT, which still exists in edition ALPHA, and the object MYOT from the parent edition ORA$BASE.

Which objects exist in which edition can be checked using the user_objects_AE view. The AE stands for all editions.

select edition_name, object_name, object_type, editionable
from user_objects_ae
where object_name like 'MY%';
EDITION_NAME  OBJECT_NAME  OBJECT_TYPE   EDITIONABLE
ORA$BASE      MYCT         NON-EXISTENT
ALPHA         MYCT         TYPE          Y
ORA$BASE      MYOT         TYPE          Y
              MYTAB        TABLE
ORA$BASE      MYVIEW       VIEW          Y

Because the type MYOT was never actualized in the edition ALPHA, we only have this object in the parent ORA$BASE. The dependency crosses from one edition to the other. In ALPHA we are still able to see this dependency in the dictionary.

So how can we drop the object in ORA$BASE?

There are three possible ways. Which way to choose depends on the reason why you dropped the object in the first place.

a) Use the force!

drop type myot force;

I do not recommend this, but there are situations where it is the easiest way. Be aware that this also invalidates any usage of the dependent collection type in the child edition. It is very easy to overlook such an issue, therefore only use FORCE when you understand what the reason for the error is and why it is ok to use FORCE.

b) drop the collection type in the child edition first.

alter session set edition="ALPHA";
drop type myct;

alter session set edition="ORA$BASE";
drop type myot;

c) actualize the object type in the child edition
This is easiest when done before dropping the type.

alter session set edition="ALPHA";
alter type myot compile;

Compiling the type creates a copy of the code in the data dictionary. From that point on any drop will leave a proper reference in the data dictionary of each edition. Cross edition references will not happen as long as all objects are always actualized. That is also why I start creating new objects in ALPHA and then move them to BETA and later, for final testing, to ORA$BASE.
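
A small sketch of that habit: one way (among others) to actualize everything you own in the current edition is to recompile the whole schema. Note that compile_all => true also recompiles objects that are currently valid.

-- run this in the child edition (e.g. ALPHA) to actualize all editionable objects of the schema
begin
  dbms_utility.compile_schema(schema => user, compile_all => true);
end;
/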

Further complications

This looks a bit complicated, but after all we knew what we were doing. So it shouldn't pose too big of a problem, right?

Let me introduce you to the 12.2 Oracle maintained sys_plsql types.

Since 12.2 plsql pipelined table functions do not need a dedicated SQL type anymore. If we create such a function, typically inside some package, we can create the needed structures (types) purely inside the package spec.

I also dropped all the previously created objects to avoid confusion.

Example

create or replace package myPkg
  authid definer
is 
  type ot_typ is record (col1 number, col2 varchar2(30));
  type ct_typ is table of ot_typ;
  
  function myFnc return ct_typ pipelined;
end;
/

create or replace package body myPkg 
is 
  
  function myFnc return ct_typ pipelined 
  is
    vRow ot_typ := ot_typ();
  begin
    for i in 1..5 loop
      vRow.col1 := i;
      vRow.col2 := chr(ascii('A')-1+i);
      pipe row(vRow);
    end loop;
    return; -- a pipelined function should end with a bare RETURN
  end myFnc;
  
end;
/

Now run the function. Since 12.2 there is no need to add the TABLE() operator anymore. However, because of name resolution issues, the function then needs a trailing pair of parentheses.

-- table "operator" syntax
select * from table(myPkg.myFnc); 

-- syntax without table keyword
select * from myPkg.myFnc(); 
COL1 COL2
1    A
2    B
3    C
4    D
5    E

The function successfully returns five rows. No SQL type was actively created. This was needed in the past but is not needed anymore.

Also if we check the dictionary, we do not see any new types. But they secretly exist. If we check user_dependencies (or dba_objects) we can see them.

select * from user_types;
no data found
select * from user_dependencies;
NAME                        TYPE          REFERENCED_OWNER  REFERENCED_NAME         REFERENCED_TYPE  REFERENCED_LINK_NAME  SCHEMAID  DEPENDENCY_TYPE
MYPKG                       PACKAGE       SYS               STANDARD                PACKAGE          (null)                122       HARD
MYPKG                       PACKAGE BODY  SYS               STANDARD                PACKAGE          (null)                122       HARD
SYS_PLSQL_9C0C5336_24_1     TYPE          SYS               STANDARD                PACKAGE          (null)                122       HARD
SYS_PLSQL_9C0C5336_DUMMY_1  TYPE          SYS               STANDARD                PACKAGE          (null)                122       HARD
SYS_PLSQL_9C0C5336_9_1      TYPE          SYS               STANDARD                PACKAGE          (null)                122       HARD
MYPKG                       PACKAGE BODY  SVEN              MYPKG                   PACKAGE          (null)                122       HARD
SYS_PLSQL_9C0C5336_24_1     TYPE          SVEN              SYS_PLSQL_9C0C5336_9_1  TYPE             (null)                122       HARD

The view shows a set of types whose names are almost identical. Most importantly, type SYS_PLSQL_9C0C5336_24_1 depends on type SYS_PLSQL_9C0C5336_9_1.

This was the output in 18c (18.7). The result changed in a 19c database. More about that later.

These two types match the type definitions in the package spec.

type ot_typ is record (col1 number, col2 varchar2(30));
type ct_typ is table of ot_typ;

Depending on how the package is then changed in different editions, it can happen that a cross edition dependency is created for those sys_plsql types. This can then prevent further compilation of the package in one of the editions.

We encountered such a scenario during development in an 18.7 database. However, the types used there were considerably more complex. At some point we were not able to compile a package body because of a missing spec. However, the spec was there and apparently could be compiled (but secretly would not compile properly).

I was able to track down the problem to a cross edition type dependency for the generated sys_plsql types. Dropping those types (in the relevant editions) allowed us to recompile the package spec and the package bodies.

If you encounter such a case, dropping all the sys_plsql types in each edition and then recompiling the package solves the issue.
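
As a rough sketch (based on the 18c behaviour where the generated types are visible; the type names will differ in your database):

-- run in each edition: find the generated types
select distinct name
from user_dependencies
where name like 'SYS_PLSQL_%';

-- then, still per edition, drop them and recompile the package
-- drop type SYS_PLSQL_9C0C5336_24_1 force;
-- drop type SYS_PLSQL_9C0C5336_9_1 force;
-- alter package myPkg compile;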

Behaviour in 19c (19.3)

The general cross edition type dependency issue also exists in 19c.

The sys_plsql types are no longer shown in the dictionary views – I am not sure whether the same mechanism is still used internally. Using a simple test case I could not reproduce the package problem caused by the sys_plsql types that we had encountered in the 18c (18.7) database – neither in 18c nor in 19c. Which does not mean that the problem is gone, only that it is tricky to recreate.

The 18c and 19c patch release notes do not indicate any changes for the sys_plsql types – so it is not clear whether a related bug was fixed. Maybe some internal behaviour was changed without a note being added to the documentation.

However in the EBR guide the following is mentioned under Evolutionary capability improvements

The dependency model is now fine-grained: e.g. adding a new column to a table, or a new subprogram to a package spec, no longer invalidates the dependants.

Conclusion

Make sure to always compile all code in your specific edition. This actualizes all the objects and the editions are then independent from each other.

Start by adding new code objects in the child edition first, before you move them to the parent.

Under EBR consider using dedicated SQL types for table functions, even if they are not strictly needed anymore. This gives you a little more control over the code and a better chance to react to problematic dependency issues.

10 Oracle SQL functions and function parameters you might not know

Functions mentioned here are in alphabetical order

BITAND function but no BITOR function

There is a BITAND function in SQL but no BITOR function.

BITAND lets us compare two values bit by bit. It returns a value in which each bit is the result of an AND between the corresponding bits of the two operands – a bit is set in the result only if it is set in both operands. This is useful to compare a set of flags (yes/no values) that are stored inside the same integer field.

BIN_TO_NUM is a helper function that “translates” bits into (decimal) numbers.

BITAND example

SELECT BITAND(
    BIN_TO_NUM(1,0,0),  -- 4 in decimal
    BIN_TO_NUM(0,0,1)   -- 1 in decimal
    ) check_flag
  from dual;

0

So 4 and 1 have no bits in common.

SELECT BITAND(
    BIN_TO_NUM(1,0,1),  -- 5 in decimal
    BIN_TO_NUM(1,0,0)   -- 4 in decimal
    ) check_flag
  from dual;

4
4 = 0×2⁰ + 0×2¹ + 1×2² = BIN_TO_NUM(1,0,0)

So 5 and 4 have bit 3 in common. (Bit counting starts with 1, but the powers of 2 start with 0.)

BITOR would combine the bits from each operand.

Fortunately BITOR can be calculated using BITAND. We need to sum the operands and then subtract the BITAND result from that.

Here is an example

select 1 + 4  
       - BITAND(1,4) bitor_compare
  from dual;

5

select 5 + 4  
       - BITAND(5,4) bitor_compare
from dual;

5

CAST with default on conversion error

The CAST function allows us to convert values from one datatype into another.

Parameter enhancements have been introduced in 12.2 along with similar enhancements to many datatype conversion functions.

https://docs.oracle.com/en/database/oracle/oracle-database/20/sqlrf/CAST.html#GUID-5A70235E-1209-4281-8521-B94497AAEF75

Let’s experiment a little with it.

select cast('01-01-20' as date default null on conversion error
           ,'dd-mm-rr') 
from dual;
01-Jan-20

select cast('INVALID' as date default null on conversion error
           ,'dd-mm-rr') 
from dual;
(null)

Works like a charm. The string was converted into a date (and converted back into a string to print it on the screen using local nls settings). An invalid string could not be converted, instead the default NULL value was chosen.

Sometimes we want to return today in case there is a conversion error…

select cast('01-01-20' as date default sysdate on conversion error
           ,'dd-mm-rr') 
from dual;
01-Jan-20

select cast('INVALID' as date default sysdate on conversion error
           ,'dd-mm-rr') 
from dual;
17-Mar-20

seems to work!

..not so fast

alter session set nls_date_format = 'day-Mon-rr';
select cast('01-01-20' as date default sysdate on conversion error
           ,'dd-mm-rr') 
from dual;

ORA-01858: a non-numeric character was found where a numeric was expected

Although we used CAST with a default on conversion error, we still got a conversion error!
The problem here is the default parameter. It needs to be a string that matches the fmt parameter. Since SYSDATE is a date, an implicit conversion to a string happened, using the session's NLS_DATE_FORMAT. The result ('day-Mon-rr') does not match the function parameter fmt ('dd-mm-rr'), which then leads to the conversion error.

Unfortunately the default parameter can not be an expression.

select cast('01-01-20' as date 
            default to_char(sysdate,'dd-mm-rr') 
            on conversion error
           ,'dd-mm-rr')
from dual;

ORA-43907: This argument must be a literal or bind variable.

Also nls_date_format is not allowed as nlsparam.

select cast('01-01-20' as date default sysdate on conversion error
           ,'dd-mm-rr'
           ,'nls_date_format=dd-mm-rr')
from dual;

ORA-12702: invalid NLS parameter string used in SQL function
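
A workaround that avoids the implicit conversion altogether (a sketch, not an official recommendation) is to supply the default as a string literal that already matches the fmt parameter. The default string is then converted with the same 'dd-mm-rr' format, so the session nls_date_format no longer matters.

select cast('INVALID' as date default '01-01-00' on conversion error
           ,'dd-mm-rr') 
from dual;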

The same could happen for other conversions, like casting a string into a number. This example is from the docs.

SELECT CAST(200
       AS NUMBER
       DEFAULT 0 ON CONVERSION ERROR)
  FROM DUAL;

It is a very, very bad example, since 200 is already a number. So let's assume the input is the string '200.00' but the default is still 0. And we add the other parameters to specify the format of this string.

select cast('200.00' as number default 0 on conversion error
           ,'FM999d00'
           ,'nls_numeric_characters=.,')
from dual;
200

select cast('200x00' as number default 0 on conversion error
           ,'FM999d00'
           ,'nls_numeric_characters=.,')
from dual;
0

Seems to work. But here again we can try to manipulate the session settings to cause a wrong conversion. Notice that the alter session settings and the nlsparam settings differ.

alter session set nls_numeric_characters = ',.';
Session altered.

select cast('200x00' as number default 0 on conversion error
           ,'FM999d00'
           ,'nls_numeric_characters=.,')
from dual;
0

Good. Fortunately the number conversion is slightly more robust than the date conversion. We can not set a session default number format (it is always “TM9” I believe) in the same way as we can set the default date format.

However we get into trouble once the default value uses non-integer numbers.

select cast('200x00' as number default 0.5 on conversion error
           ,'FM999d00'
           ,'nls_numeric_characters=.,')
from dual;

5

The default was 0.5 but we got 5! The reason: the default 0.5 is first implicitly converted into the string '0,5' using the session setting (decimal comma). When that string is then converted back into a number using the given fmt and nlsparam (decimal point), the comma is not treated as the decimal separator and the result becomes 5.

The correct way to do it, of course, is to use a string with a format that reflects the fmt and nlsparam parameters.

select cast('200x00' as number default '0.5' on conversion error
           ,'FM999d00'
           ,'nls_numeric_characters=.,')
from dual;

0,5

Now we got the correct default value returned!

CHECKSUM

Checksum can be used to quickly calculate a checksum over a set of rows. There is an analytic version of it too.

https://docs.oracle.com/en/database/oracle/oracle-database/20/sqlrf/checksum.html#GUID-3F55C5DF-F23A-4B2F-BC6F-E03B34B78BA8

If you want to quickly see if data was changed between two databases, you can run a checksum over each column and compare the result on both environments.

select checksum(phone_number)
from hr.employees;

Never heard of this function before? Probably because it just got introduced in 20c.
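
Since there is an analytic version too, a per-group checksum could look like this (a sketch, again 20c):

select department_id,
       checksum(phone_number) over (partition by department_id) as dept_checksum
from hr.employees;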

Dump

The dump function is often used to inspect binary values. Useful for example when solving character set issues.

Not many know that it has some additional parameters.

https://docs.oracle.com/en/database/oracle/oracle-database/20/sqlrf/DUMP.html#GUID-A05793C9-B35D-4BA7-B68C-E3693BCF47A5

Especially interesting is the return_fmt parameter. It allows the dump to be returned in octal (8), hex (16), decimal (10 = default) or a special format (17). Return format "17" shows single byte values as readable ASCII characters, while multi byte values that can not be converted are shown in hex.

For example the “€” symbol in UTF-8 uses 3 Bytes: E2 82 AC

Source: https://unicode-table.com/en/20AC/

Dump-17 will return this

select dump('abc€',17) from dual;
Typ=96 Len=6: a,b,c,e2,82,ac

The string consists of 6 bytes. The first 3 are single byte characters and are shown as ASCII. The 4th character is the Euro sign, which needs 3 bytes in UTF-8. Format 17 is interesting because it helps us to focus on and find the problematic characters.

Another interesting option is to add 1000 to the format. This will add the character-set to the output.

select dump('abc€',1010) from dual;
Typ=96 Len=6 CharacterSet=AL32UTF8: 97,98,99,226,130,172

Of course that works with NCHAR too.

select dump(n'abc€',1010) from dual;
Typ=96 Len=8 CharacterSet=AL16UTF16: 0,97,0,98,0,99,32,172

LNNVL

This is a very special function. Apart from the unspeakable name it is used for Oracle internal SQL transformations by the optimizer. Here is a document that describes such transformations. LNNVL returns a Boolean and can be used directly inside a where clause. This is not yet possible for user defined functions that return Boolean, but other such functions do exist (e.g. regexp_like).

I sometimes use it to find rows where two values differ. If the comparison value is NULL I still want to consider it to be different.

Instead of using the inequality operator != we have to use its opposite, the equality operator =. This is just how LNNVL works.

select dummy from dual
where lnnvl('A'='B');
X

select dummy from dual
where lnnvl('A'='A');
No data found.

select dummy from dual
where lnnvl('A'=null);
X

This assumes that ‘A’ always is a not null value.

To get the same result, a comparison condition typically also needs to consider the NULL case.

select dummy from dual
where (('A'!='B' and 'B' is not null) OR 'B' is null);

Since more conditions like this might follow, the statement very soon becomes cluttered with OR checks, parenthesis and IS NULL comparisons.
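
Here is a sketch of that "find differing rows" use case. The staging table employees_stage is made up for this example; the point is that a NULL in the compared column still counts as "different".

select e.employee_id, e.salary, s.salary as staged_salary
from hr.employees e
join employees_stage s on s.employee_id = e.employee_id
where lnnvl(e.salary = s.salary); -- rows where the salaries differ or one of them is NULL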

Unfortunately, since not many developers are familiar with this function, we should always add a comment explaining its purpose and behaviour.

NANVL

NANVL is similar to NVL.

NVL returns a value, when NULL is encountered.
NANVL returns a value when NaN (not a number) is encountered.

NaN is part of the binary_float and binary_double datatypes.

select to_binary_float(1/17) from dual;
0,05882353

select to_binary_float('NaN') from dual;
NaN

Let's apply NANVL to it.

select nanvl(to_binary_float('NaN'),'0') from dual;
0.0

Or we could set it to negative infinity…

select nanvl(to_binary_float('NaN'),'-INF') from dual;
-Infinity

Somehow interesting, but rarely useful I think.

NEXT_DAY function parameter

The NEXT_DAY function needs a second parameter that defines what target weekday it points to.


The argument char must be a day of the week in the date language of your session, either the full name or the abbreviation. The minimum number of letters required is the number of letters in the abbreviated version. Any characters immediately following the valid abbreviation are ignored. The return value has the same hours, minutes, and seconds component as the argument date.

https://docs.oracle.com/en/database/oracle/oracle-database/20/sqlrf/NEXT_DAY.html#GUID-01B2CC7A-1A64-4A74-918E-26158C9096F6

This second parameter char works differently than what you might expect. Here is an example.

alter session set nls_date_format = 'Dy, DD-MON-YYYY';
Session altered.

alter session set nls_language='AMERICAN';
select next_day(sysdate,'FRIDAY') from dual;
Fri, 20-MAR-2020

alter session set nls_language='GERMAN';
Session altered.

select next_day(sysdate,'FREITAG') from dual;
Fr, 20-MRZ-2020

select next_day(sysdate,'FRIDAY') from dual;
Fr, 20-MRZ-2020

Isn't that strange? Although my session language is GERMAN, the American spelling still seems to work!

Let’s test this with a different language. In Spanish “friday” would be “viernes”.

select next_day(sysdate,'VIERNES') from dual;
ORA-01846: Kein gültiger Wochentag
"not a valid day of the week"

alter session set nls_language='SPANISH';
Session altered.

select next_day(sysdate,'VIERNES') from dual;
Vie, 20-MAR-2020

select next_day(sysdate,'FRIDAY') from dual;
ORA-01846: día de la semana no válido
"not a valid day of the week"

Ok. So Spanish works as long as the NLS settings are correct, but it does not accept the American spelling.

Is German special? In a way, yes. In German, several weekdays have an abbreviation that also matches the start of the American spelling, and the abbreviation is only two letters short. This is especially true for the days that are most likely to be used in the NEXT_DAY function (Monday, Friday, Saturday – but not Sunday!).

For “FREITAG” the abbreviation is “FR”. The NEXT_DAY function accepts anything as long as the char string starts with the abbreviation of a weekday in the correct language. So FREITAG, FRIDAY and even FRITZ or FROG all return the next friday.

alter session set nls_language='GERMAN';
Session altered.

select next_day(sysdate,'FR') from dual;
Fr, 20-MRZ-2020

select next_day(sysdate,'FRITZ') from dual;
Fr, 20-MRZ-2020

select next_day(sysdate,'FROG') from dual;
Fr, 20-MRZ-2020

Is this how you expected the char parameter to work?

Recommendation? Don’t use NEXT_DAY at all. I prefer TRUNC over NEXT_DAY any day.
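
For completeness, here is a sketch of how the "next Friday" could be computed with TRUNC instead. It is language independent, because trunc(date,'IW') always returns the Monday of the ISO week. Unlike NEXT_DAY it returns midnight, not the time component of the input date.

select case
         when trunc(sysdate) - trunc(sysdate,'IW') >= 4
         then trunc(sysdate,'IW') + 11 -- today is Fri/Sat/Sun => Friday of next week
         else trunc(sysdate,'IW') + 4  -- otherwise Friday of the current week
       end as next_friday
from dual;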

POWER (sql) = ** (plsql)

In SQL to calculate an exponentiation we have to use the POWER function.

Example 2³

select power(2,3) from dual;
8

In plsql we can use the ** operator instead.

set serveroutput on
begin
  dbms_output.put_line(2**3); 
end;
/
8

Unfortunately the ** operator will not work in SQL.

ROUND_TIES_TO_EVEN function

Everybody knows the ROUND function. ROUND rounds a value up when the fractional part is exactly one half.

1.5 ≈ 2
2.5 ≈ 3
...
7.5 ≈ 8
8.5 ≈ 9
9.5 ≈ 10

This is what statisticians call a biased function. The computed values do not reflect the distribution of the original values – especially if there are many 0.5 decisions.

An unbiased version of rounding is to round one value up, the next down, then up again.

1.5 ≈ 2
2.5 ≈ 2
...
7.5 ≈ 8
8.5 ≈ 8
9.5 ≈ 10

This is commonly known as banker's rounding (in German "mathematisches Runden", not to be confused with the biased "kaufmännisches Runden").

The round_ties_to_even function has been introduced in 18c.

https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/ROUND_TIES_TO_EVEN-number.html#GUID-49919B6B-4337-4812-A248-B5D98F102DBD
select level+0.5 as val, 
       round(level+0.5), 
       round_ties_to_even(level+0.5)
from dual connect by level <= 10;
VAL   ROUND(VAL)  ROUND_TIES_TO_EVEN(VAL)
1.5   2           2
2.5   3           2
3.5   4           4
4.5   5           4
5.5   6           6
6.5   7           6
7.5   8           8
8.5   9           8
9.5   10          10
10.5  11          10

This new rounding function implements banker's rounding. Especially when adding up the rounded values, the rounded sum differs less from the exact (unrounded) sum than with normal rounding.
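
A small sketch that shows the effect on the sum, using the same ten values as above:

select sum(level+0.5)                     as exact_sum,        -- 60
       sum(round(level+0.5))              as sum_round,        -- 65
       sum(round_ties_to_even(level+0.5)) as sum_ties_to_even  -- 60
from dual connect by level <= 10;

For this particular data set the unbiased version hits the exact sum.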

STANDARD_HASH function

If we quickly want to create some hash values we can use the standard_hash function.

select standard_hash(dummy) from dual;
C032ADC1FF629C9B66F22749AD667E6BEADF144B

The default uses an SHA1 hash algorithm, but we can specify SHA1, SHA256, SHA384, SHA512 and MD5.

select standard_hash(dummy,'SHA512') from dual;
3173F0564AB9462B0978A765C1283F96F05AC9E9F8361EE1006DC905C153D85BF0E4C45622E5E990ABCF48FB5192AD34722E8D6A723278B39FEF9E4F9FC62378

MD5 is not recommended at all (https://www.md5online.org/blog/why-md5-is-not-safe/). SHA1 is also not considered secure anymore, but you can still use it safely to create hash values for non-security purposes.

Conclusion

That’s it. 10 functions or parameters of functions that you probably didn’t know before. At least not all of them.

SQL Quickie: How to reset an identity column

Introduction

Since Oracle 12 we can declare columns as an identity column. Other databases call this feature "autoincrement" column – a term that can easily lead to misunderstandings, but is already well established. In some of those databases such a column allows resetting the current value. There is no (documented) way to do that with an Oracle identity column. A recent discussion in the ODC forums led me to think about whether it is possible to set an identity column to a custom value. And yes, it is.

TL;DR;

  • Change the increment of the sequence that supports the identity column by modifying the column.
  • Increase the value by an insert+rollback.
  • Reset the sequence by modifying the column again.

or use the undocumented RESTART option (12.2 onwards)

alter table demo_identity_reset
modify id generated always as identity restart start with 60;

Showcase

demo setup

First we create a table with an identity column. We use default settings for the sequence, which for example means increment +1 and cache 20.

/* prepare demo table */
create table demo_identity_reset
(id number generated always as identity primary key
,name varchar2(500)
);

Then insert some demo values.

/* add some data */
insert into demo_identity_reset (name) values ('test1');
insert into demo_identity_reset (name) values ('test2');
commit;
select * from demo_identity_reset;
ID NAME
1 test1
2 test2

We now have a table with an identity column where some values are used and more values are currently cached in the SGA.

Set a custom value (12.1)

Gather info

First we need to find the name of the sequence. We could simply look at all sequences, but it is more correct to look at the column definition: the name of the sequence can be seen in the default value of the identity column.

/* check data dictionary and find the sequence name*/
select * from user_sequences;

or

/* check data dictionary and find the sequence name*/
select data_default
from user_tab_columns
where table_name = 'DEMO_IDENTITY_RESET'
and identity_column = 'YES';
DATA_DEFAULT
"MYSCHEMA"."ISEQ$$_172206".nextval

In my case the sequence name is ISEQ$$_172206. The generated name will always start with ISEQ$$_.

Other information we need is the current value, the cache size and the increment. The only decent way to see the current value is to set the sequence to nocache and then check the data dictionary. We need to remember the original cache size setting so that we can restore it later.

Start the change

From this step onward no other session should insert at the same time. It might be a good idea to lock the table in exclusive mode during those steps.

The cache also needs to be set to nocache to prevent caching any values with the wrong increment during the next steps.

/* reset the cache so that we can see the next value */
select cache_size 
from user_sequences 
where sequence_name = 'ISEQ$$_172206';

alter table demo_identity_reset 
modify id generated always as identity nocache;

Cache size was 20. The sequence now is not caching anymore. This is shown in the user_sequences dictionary view.

Now we can read the next value from the data dictionary and use that information to set an increment that jumps to our target value.

Here I assume a target value of 60.

/* find the next value */
select last_number from user_sequences 
where sequence_name = 'ISEQ$$_172206';

/* calculate the increment: 57 = 60(target value) - 3(last number) */

/* change the increment so that it jumps over all the other values. */
alter table demo_identity_reset 
modify id generated always as identity increment by 57 nocache; 

The last number was 3. Last number will always hold the next value after the cached values.

-- this does a sequence.nextval which we can not call otherwise
insert into demo_identity_reset (name) 
values ('dummy. rollback immediatly.'); 
rollback;

It doesn't matter much which value was inserted (here: 59). In case you want to avoid any gaps, this insert needs to be one of the real inserts, and the increment needs to be one more (58 instead of 57), so that instead of a rollback you can do a commit.
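
A minimal sketch of that gap-free variant, using the same numbers as in this example:

/* jump by 58 instead of 57 and keep the row */
alter table demo_identity_reset 
modify id generated always as identity increment by 58 nocache;

insert into demo_identity_reset (name) values ('first real row after the jump');
commit; -- this row receives the target ID 60; afterwards reset increment and cache as shown below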

/* reset the changes */
alter table demo_identity_reset 
modify id generated always as identity increment by 1 cache 20;

Done. The next insert will now start with our target value 60.

Set a custom value (12.2)

I tested this in 19c on LiveSQL, but it should work in all versions of the 12.2 database family (12.2.0.1, 18c, 19c).

Instead of the complicated series of steps in 12.1 we can do it all in a single command. Unfortunately this command is undocumented, so use it at your own risk.

alter table demo_identity_reset 
modify id generated always as identity restart start with 60;

The RESTART option was mentioned in one of the new feature guides and Roger Troller made me aware of it in one of his blog posts. However currently (19c) it is still missing in the SQL reference documentation.

Test the change

The following steps are not needed anymore. But they are proof that the concept works.

/* check the result in the dictionary*/
select last_number, cache_size 
from user_sequences 
where sequence_name = 'ISEQ$$_172206';
LAST_NUMBER    CACHE_SIZE
 60    20
/* test the result using the table */
insert into demo_identity_reset (name) values ('test3');
commit;
select * from demo_identity_reset;
ID  NAME
1   test1
2   test2
60  test3

Yes it works!

Cleanup code

Drop the table and the sequence. A simple drop table is not enough, because the sequence will continue to exist as long as the table is still in the recycle bin.

/* undo testcase */
drop table demo_identity_reset purge;

Further information

In general I would not recommend doing this in a production environment. But there are situations where you want to do it; for example after copying a production schema to a test database and cleaning out many rows, you might want to reset the sequences too.

Such situations are rare and probably once-in-a-while activities. That is also why I didn't include a lock table command: the lock would only hold until the next alter table command (DDL commits and releases it), which makes it not really safe to rely on.

It is not possible to use an alter sequence command for sequences that are connected with an identity column. It results in an error message.

alter sequence "ISEQ$$_172206" restart start with 60;

ORA-32793: cannot alter a system-generated sequence

Here is an example of the definition of an autoincrement column in another database (MySQL).

https://dev.mysql.com/doc/refman/5.7/en/example-auto-increment.html

The AUTO_INCREMENT attribute can be used to generate a unique identity for new rows

No value was specified for the AUTO_INCREMENT column, so MySQL assigned sequence numbers automatically. You can also explicitly assign 0 to the column to generate sequence numbers, unless the NO_AUTO_VALUE_ON_ZERO SQL mode is enabled.

If the column is declared NOT NULL, it is also possible to assign NULL to the column to generate sequence numbers.

When you insert any other value into an AUTO_INCREMENT column, the column is set to that value and the sequence is reset so that the next automatically generated value follows sequentially from the largest column value.

Updating an existing AUTO_INCREMENT column value in an InnoDB table does not reset the AUTO_INCREMENT sequence as it does for MyISAM and NDB tables.

MySQL 5.7 Reference Manual – 3.6.9 Using AUTO_INCREMENT

Some quick facts about sequence caches and gaps in IDs

  1. Oracle sequences generate key values without gaps (minor exception to the rule: scalable sequences).
  2. Surrogate keys can and will have gaps.
  3. Gaps exist because it is not worth the huge effort to avoid them.

The biggest misunderstanding is probably the difference between the value coming from a sequence and the primary key value inside a database column (ID column). We can use a sequence to set a surrogate key. Typically we do this by a database trigger or an identity column or even as a value directly during the insert.

“The history of sequence caches is a history of misunderstandings.”

somebody on the internet, 1963

Essentially it is a two-step process. The new ID value is created by sequence.nextval and then used during the insert. Oracle provides the logic to get a new sequence value in a fast, performant and multi-session friendly way. The developer is responsible for making sure this value is not wasted.
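
A minimal sketch of those two steps (table and sequence names are made up):

-- fetch the next value and use it in the insert
insert into invoices (id, customer_name)
values (invoices_seq.nextval, 'ACME');
commit;
-- if this insert fails or is rolled back, the fetched value is gone => a gap that was not caused by the cache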

The sequence cache is a performance instrument. A lost cache value means we will see gaps in future ID values. There is only a very limited number of activities that make cached sequence values "disappear". The majority of gaps originate from other activities, like failed inserts or deletes. Many developers seem to think that such a gap is caused by the sequence cache. I find it important to separate which reason ultimately led to a gap in an ID value.

A gap analysis can be the first step. The following select finds gaps and orders them by size. Put your own table and PK column into the “ids” subquery and find out how the gaps are distributed in that table.

-- gap analysis
with ids as (select region_id as id, lead(region_id) over (order by region_id) as nextid from regions)
   , gaps as (select id, nextid - id as gap_size from ids where nextid > id + 1)
select gap_size , count(*) gaps_found  
from gaps
group by gap_size
order by count(*) desc;

Example result

GAP_SIZE  GAPS_FOUND
20        5
3         2
2         2
19        1

Here the data shows a few small gaps. Either some insert didn’t make it or rows had been deleted. But there are also several gaps of size 19 and 20, which means the sequence cache was probably lost at least 6 times.

To make it clear: IDs are allowed to have gaps. However, when an end user complains about constantly seeing gaps in some ID, then we should investigate what is going on. Frequent gaps should not be the expected normal behaviour – it is worth finding the root cause of such an issue.

So here is a (non complete) list of activities that do or do not influence the caching of sequences and the existence of gaps in an ID column.

action | result
multiple sessions | cache used
multiple sessions with one session slow to commit | IDs not in order of commit => not a gap eventually, but looks like a gap until final commit
alter system flush shared_pool; | cache values lost => gap in ID
shutdown immediate; | cache values not lost
shutdown abort; | cache values lost => gap in ID
insert; commit; | no gap
insert; rollback; | gap in ID
failed insert | gap in ID
delete; commit; | gap in ID
switch sequence from cache to nocache | cache values not lost
kill session | possible gap in ID (values not used/committed are lost)
different RAC sessions | cache values not lost (but separate caches on each RAC node)
SGA size too small | cache values lost => gap in ID (sequence cache ages out of shared pool per LRU list)
session sequences | not useful for unique ID values => duplicate ID
scalable sequences | cache values not lost => planned large gap in ID (because of prefix)
alter system cancel SQL ‘SID, serial’; (insert killed) | gap in ID
alter system disconnect session post_transaction; | no gap (unless sequence was fetched before the transaction started)
alter system disconnect session immediate; | gap in ID
sequence restart | possible duplicate IDs
sequence interval > 1 | planned gap in ID

In my experience, a sequence aging out of the shared pool is the most common cause of lost sequence values that actually get noticed. A strong indication is when frequent gaps of 20 values exist.

Possible workarounds are

a) set the sequence to nocache if the sequence is used rarely
alter sequence myseq nocache;
This might cost performance.

b) pin the sequence in the shared pool
dbms_shared_pool.keep('MYSCHEMA.MYSEQ','Q')
Pinning the sequence is useful when you have times with lots of concurrent activity followed by larger idle periods (a verification query is sketched after option c below).
This costs a little memory.

c) increase the SGA size – sometimes it helps to increase the restricted shared pool area.
This costs memory (and money) that is then not available for other non-database processes.
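
To verify that the pinning from option b) worked, a quick look into v$db_object_cache should do (a sketch; requires access to the v$ view):

select owner, name, type, kept
from v$db_object_cache
where name = 'MYSEQ'
and type = 'SEQUENCE';
-- KEPT = YES means the sequence will not age out of the shared pool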

Conclusions

  • Small gaps in surrogate keys (ID columns) are natural, mostly because of delete statements or failed inserts.
  • The sequence object typically does not produce gaps.
  • Very frequent gaps of size 20 (default cache size) are not natural and might need investigation.