Sequence and Audit columns with Apex 5 and 12c

Introduction

In the good ol’ times database triggers were used to populate database columns like primary identifiers or audit columns.

Here is an example of such a trigger.
For sake of simplicity I concentrate on the insert trigger.

create table swe_demo (id number primary key
                      ,col1 number
                      ,col2 varchar2(30)
                      ,inserted_date date not null
                      ,inserted_from varchar2(30) not null);

create sequence swe_demo_seq cache 10000;

create or replace trigger swe_demo_bri_trg
  BEFORE INSERT ON swe_demo
  FOR EACH ROW
BEGIN
   -- record needs always a key
   IF :new.id IS NULL
   THEN
      :new.id := swe_demo_seq.NEXTVAL;
   END IF;

   -- timestamp of the last changes
   :new.inserted_date := SYSDATE;
   :new.inserted_from := COALESCE(v('APP_USER'), user);
END swe_demo_bri_trg;
/

What does it do?

  • The ID column is filled with a sequence value.
  • The inserted_date column is filled with sysdate.
  • The inserted_from column is filled with the current apex user or if that is not set, then the connected user is choosen.

It is time to revisit that functionality and see which new and maybe better conceptes are offered when the newest technologies are in place.

This was tested against Apex 5.0.1 and Oracle 12.1.0.1 .
It should work also in Apex 5.0.3 and Oracle 12.1.0.2

Solution

If you are just looking for the final code – here is my “perfect” solution.

create sequence swe_demo_seq cache 10000;
create table swe_demo
  (id number default swe_demo_seq.nextval primary key
   ,col1 number
   ,col2 varchar2(30)
   ,inserted_date date default sysdate not null
   ,inserted_from varchar2(30) default coalesce(
          sys_context('APEX$SESSION','app_user')
         ,regexp_substr(sys_context('userenv','client_identifier'),'^[^:]*')
         ,sys_context('userenv','session_user')
         )
      not null);

The inserted_from value is now more accurate.

You might have noticed that the need for an insert trigger is gone.
This increases performance considerably.

To insert 1 mill rows the old version needed more than a minute. This can be considered fast.
Elapsed: 00:01:12.904

The new version needs less than 6 seconds. This is a blizzard!
Elapsed: 00:00:05.598

I admit that the test was specifically set up to show a major difference. Feel free to set up your own more realistic test cases.

Still 12 times faster is really nice.

The test code to get this result can be seen in chapter TC4.

Considerations and test cases

How to find out about the current user

This line

COALESCE(v('APP_USER'), user)

from the example is really bad.

v(‘APP_USER’) is a bit slow, although not as slow as I suspected. It will not work on databases without apex installations. “v” is just a public synonym for an apex function that looks up session specific values in the apex repository. Surprisingly the “user” function is a much bigger drag. See TC1 below for performance comparisons.

In older apex versions a good way to find out who was logged in, was to read the client_identifier from the userenv context.

sys_context('userenv','client_identifier')

The structure of this did change between the different apex versions. In Apex 5 it now holds the APP_USER and the Session ID, e.g. SVEN:0123456789

A context is an extremly efficient way to read global session or application data.

Apex 5 introduced a new context with the namespace “APEX$SESSION”. This context is populated by the apex engine (APEX_050000.WWV_FLOW_SESSION_CONTEXT) and holds app_user, app_session and workspace_id. There is a good chance that future versions will add more parameters to this namespace.

See also: http://jeffkemponoracle.com/2015/11/24/apex-5-application-context/

To read the logged in user we fetch SYS_CONTEXT(‘APEX$SESSION’,’APP_USER’).
To read the user if there is something non-apex running, we should read the client_identifer. But there is more to consider.

database links

In many apex projects we do not insert into the local apex database, but into some remote project schema. This is often the case when I just add some reporting or monitoring solution in apex, without interfering with the real project database too much. So lets compare how the different contexts behave when done local vs. remote (over a db link).

12c introduced some changes to the userenv namespace. Especially the “current_user” is now deprecated.
Here is a list of all parameters that belong to the userenv namespace:
https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions165.htm#g1513460

The interesting ones are:

  • client_identifier
    Returns an identifier that is set by the application through the DBMS_SESSION.SET_IDENTIFIER procedure, the OCI attribute OCI_ATTR_CLIENT_IDENTIFIER, or the Java class Oracle.jdbc.OracleConnection.setClientIdentifier. This attribute is used by various database components to identify lightweight application users who authenticate as the same database user.
  • current_schema
    Name of the default schema being used in the current schema. This value can be changed during the session with an ALTER SESSION SET CURRENT_SCHEMA statement.
  • current_user
    Deprecated – Use the SESSION_USER parameter instead.
  • session_user
    For enterprises users, returns the schema. For other users, returns the database user name by which the current user is authenticated. This value remains the same throughout the duration of the session.
  • dblink_info
    Returns the source of a database link session. Specifically, it returns a string of the form:SOURCE_GLOBAL_NAME=dblink_src_global_name, DBLINK_NAME=dblink_name, SOURCE_AUDIT_SESSIONID=dblink_src_audit_sessionid
  • authenticated_identity
    Returns the identity used in authentication. In the list that follows, the type of user is followed by the value returned: …

For testing purposes I just made a loopback link to the same database but a different schema. The testscript can be found in section TC2.

The following tests were done:
1) An insert from an apex page into a view that included a db link to REMOTE_B
2) An insert from an apex page, directly into a local view (LOCAL_A) which was mapped to demo.swe_demo.
5) A direct database insert in the view with the DB link (remote insert)
6) A direct database insert in the view without the db link
7) an insert from a DBA “IAMDBA” using alter session set current_schema

This was the result

Id Some Text Inserted Date Inserted From App User Apex Session Client Identifier Current Schema Current User User Session User Authenticated Identity Dblink Info
1 apex remote insert 22-FEB-16 SVEN SVEN:4628609689353 DEMO DEMO REMOTE_B REMOTE_B REMOTE_B SOURCE_GLOBAL_NAME=DB_A.11623201
2 apex local insert 22-FEB-16 SVEN SVEN SVEN SVEN:4628609689353 DEMO DEMO APEX_PUBLIC_USER APEX_PUBLIC_USER APEX_PUBLIC_USER
5 direct remote insert 22-FEB-16 REMOTE_B DEMO DEMO REMOTE_B REMOTE_B REMOTE_B
6 direct local insert 22-FEB-16 LOCAL_A DEMO DEMO LOCAL_A LOCAL_A LOCAL_A
7 direct insert current_schema 22-FEB-16 IAMDBA DEMO DEMO IAMDBA IAMDBA IAMDBA

I did some more tests, for example using definer and invoker rights procedures, but they didn’t reveal any important differences.

The findings

APEX$SESSION is not populated via db link, but CLIENT_IDENTIFIER is.
Other than those two SESSION_USER has the best information and is always populated.

Therefore in an apex near environment the best expression to find out who inserted some record would be

coalesce(
   sys_context('APEX$SESSION','app_user')
   ,regexp_substr(sys_context('userenv','client_identifier'),'^[^:]*')
   ,sys_context('userenv','session_user')
   )

One should also notice that current_user returns the same values as current_session. This is contrary to the documentation in older database versions. And Oracles suggestion to use session_user instead is certainly correct, but you should be aware that then a different user might be stored than before.

Authenticated_Identity promises to be interesting in very specific security environments.

12c identity columns and default values

With 12c we can consider to use an identity column as our primary key. Also it is now possible to use a sequence as a default value for the column.

The base syntax for an identity column is

id number generated as identity

The base syntax for a default value column is

id number default swe_demo_seq.nextval

There are several differences between the two.
An identity column essentially prevents that values are inserted into this column. So that it is ensured that always the sequence was used to populate the data. The default value column uses the sequence only if null was inserted. This is essentially the same functionality what the trigger did.

It is possible to setup an identity column to behave almost the same as the default value column.

The extended syntax for such an identity column is

id number generated by default on null as identity (cache 10000)

I slightly prefer the default value column syntax for two reasons

  1. The default value syntax is also used for the two audit columns. So it is consistent to use the same mechanism.
  2. Default value columns allow more control over the sequence object.

I plan to write an extra blog post to discuss the differences in more detail.

TC3 shows an performance comparison between trigger logic, default value and identity column. Default values and identity are equally fast. But the trigger looses by a large margin!

Various testcases

For most cases the results should be consistent in other environments too. However always test it your own. Here are the scripts that I used. They might help to setup your own test cases.

TC1) Compare performance of different expressions to read the client identifer

The following plsql block was used and run

set time on
set timing on
declare
  v_result varchar2(100);
begin
  for i in 1..1000000 loop
  v_result := ##EXPRESSION##;
  end loop;
end;
/

This is the output for several tested expressions.
Watch out that some expressions return different results.

I always did three runs to see if the results are consistent and copied one of the elapsed times to the table.

sys_context('userenv','client_identifier')

00:00:02.436
This is the base. It should not be possible to go faster.

substr(sys_context('userenv','client_identifier'),1
            ,instr(sys_context('userenv','client_identifier'),':')-1)

00:00:04.288
Finds the colon and returns everything before that. But returns NULL if no colon is in the client_identifer.

substr(sys_context('userenv','client_identifier'),1, 
   coalesce(nullif(instr(sys_context('userenv','client_identifier'),':'),0)-1,
 length(sys_context('userenv','client_identifier'))))

00:00:06.311
A little complex to read, isn’t it…

regexp_substr(sys_context('userenv','client_identifier'),'^[^:]*')	

00:00:03.459
This is the winner!

translate(sys_context('userenv','client_identifier'),'A:0123456789','A')

00:00:05.663
Surprisingly slow

It surprised me that the REGEXP_SUBSTR expression was the best performing. Previous comparisons between regular expressions and substr/instr combinations always went in favor of substr. It seems either the internal code was optimized by oracle. Or more likely that this specific task (simple search from the beginning of the string) is well suited for a fast regexp search.

In general all expressions were fast. Interesting side note is that REGEXP_SUBSTR and TRANSLASTE were a tiny bit slower for the second identifier, while the other expressions performed equally fast.

Some other expressions:

user 

Elapsed: 00:00:20.652
Really really slow!

sys_context('APEX$SESSION','app_user')

00:00:01.549
Blazingly fast!

v('APP_USER')

00:00:05.646
Not as slow as I thought!

TC2) Analyze the value of differnt contexts with regards to DB links

This script can be run as DBA to setup a demo user.
Make sure you do not have a schema named “demo” already in your system.


create user demo identified by demo;
grant unlimited tablespace to demo;

drop table demo.swe_demo ;
drop sequence demo.swe_demo_seq;

create table demo.swe_demo 
                      (id number primary key
                      ,some_text varchar2(30)
                      ,inserted_date           date not null
                      ,inserted_from           varchar2(30) not null
                      ,c_app_user                varchar2(30)
                      ,c_apex_session            varchar2(30)
                      ,c_client_identifier       varchar2(100)
                      ,c_current_schema          varchar2(30)
                      ,c_current_user            varchar2(30)
                      ,c_user                    varchar2(30)
                      ,c_session_user            varchar2(30)
                      ,c_authenticated_identity  varchar2(100)
                      ,c_dblink_info             varchar2(100)
                      ,c_external_name           varchar2(30)
                      );

create sequence demo.swe_demo_seq cache 10000;


create or replace trigger demo.swe_demo_bri_trg
  BEFORE INSERT ON demo.swe_demo
  FOR EACH ROW
BEGIN
   -- record needs a key
   IF :new.id IS NULL
   THEN
      :new.id := swe_demo_seq.NEXTVAL;
   END IF;

   -- timestamp of last changes 
   :new.inserted_date := SYSDATE;
   :new.c_app_user               := v('APP_USER');
   :new.c_apex_session           := sys_context('APEX$SESSION','app_user');
   :new.c_client_identifier      := sys_context('userenv','client_identifier');
   :new.c_current_schema         := sys_context('userenv','current_schema');
   :new.c_current_user           := sys_context('userenv','current_user');
   :new.c_user                   := user;
   :new.c_authenticated_identity := sys_context('userenv','authenticated_identity');
   :new.c_session_user           := sys_context('userenv','session_user');
   :new.c_dblink_info            := sys_context('userenv','dblink_info');
   :new.c_external_name          := sys_context('userenv','EXTERNAL_NAME');
   
   -- recommended:
   :new.inserted_from        := coalesce(sys_context('APEX$SESSION','app_user')
                                     ,regexp_substr(sys_context('userenv','client_identifier'),'^[^:]*')
                                     ,sys_context('userenv','session_user')
                                     );
   
END swe_demo_bri_trg;
/

show errors

grant select, insert, update, delete on demo.swe_demo to SchemaB;

TC3) Performance comparison for populating PK columns

--------------------------------------------------------------------------------
-- swe 12c demo identity columns
--------------------------------------------------------------------------------
set time on
set timing on

-- old logic using trigger
drop table swe_demo ;
drop sequence swe_demo_seq;
create table swe_demo (id number primary key, col1 number, col2 varchar2(30));
create sequence swe_demo_seq cache 10000;

create or replace trigger swe_demo_bri_trg
  BEFORE INSERT ON swe_demo
  FOR EACH ROW
BEGIN
   -- record needs a key
   IF :new.id IS NULL
   THEN
      :new.id := swe_demo_seq.NEXTVAL;
   END IF;

END swe_demo_bri_trg;
/

-- Performance test to insert 1 mill records
insert into swe_demo (col1)
select level from dual connect by level <= 1000000;
   
-- result
1,000,000 rows inserted.
Elapsed: 00:00:36.854

-- setup 12c logic using default value sequences
drop table swe_demo ;
drop sequence swe_demo_seq;
create sequence swe_demo_seq cache 10000;
create table swe_demo (id number default swe_demo_seq.nextval primary key, col1 number, col2 varchar2(30));


-- performance test to insert 1 mill records
insert into swe_demo (col1)
select level from dual connect by level <= 1000000;

-- result
1,000,000 rows inserted.
Elapsed: 00:00:04.068


-- 12c logic using identity column 
drop table swe_demo ;
drop sequence swe_demo_seq;
create table swe_demo (id number generated as identity primary key, col1 number, col2 varchar2(30));

-- Performance test to insert 1 mill records
insert into swe_demo (col1)
select level from dual connect by level <= 1000000;

--result 
1,000,000 rows inserted.
Elapsed: 00:00:08.626

There is a performance difference compared to the default value solution. This is purely based upon the sequence cache
The identity column was created with a default cache size of 20.

--Try to manually insert a row into the identiy column 
insert into swe_demo (id, col1)
values (null, -1);

SQL Error: ORA-32795: cannot insert into a generated always identity column
32795.0000 -  "cannot insert into a generated always identity column"
*Cause:    An attempt was made to insert a value into an identity column
           created with GENERATED ALWAYS keywords.
*Action:   A generated always identity column cannot be directly inserted.
           Instead, the associated sequence generator must provide the value.
;
-- use non default settings for the identity column
-- 12c logic using identity column 
drop table swe_demo purge;
create table swe_demo (id number generated by default on null as identity (cache 10000) primary key , col1 number, col2 varchar2(30));

-- Performance test to insert 1 mill records
insert into swe_demo (col1)
select level from dual connect by level <= 1000000;

--result 
1,000,000 rows inserted.
Elapsed: 00:00:03.763    

-- test insert
insert into swe_demo (id, col1)
values (null, -1);

1 row inserted.

insert into swe_demo (id, col1)
values (-1, -1);

1 row inserted.

It seems as this is even a tiny bit faster than the default value solution. The test results were influenced heavily by other network or database activities.
Both execution times are very close. The top score however was done using an identity column.

TC4) Extend performance test TC3 with audit columns

--------------------------------------------------------------------------------
-- Adding audit columns to the previous example
-- Final comparison
--------------------------------------------------------------------------------
-- old logic 
drop table swe_demo purge;
drop sequence swe_demo_seq ;

create sequence swe_demo_seq cache 10000;
create table swe_demo (id number primary key
                      ,col1 number
                      ,col2 varchar2(30)
                      ,inserted_date date not null
                      ,inserted_from varchar2(30) not null);

create or replace trigger swe_demo_bri_trg
  BEFORE INSERT ON swe_demo
  FOR EACH ROW
BEGIN
   -- record needs always a key
   IF :new.id IS NULL
   THEN
      :new.id := swe_demo_seq.NEXTVAL;
   END IF;

   -- timestamp of the last changes
   :new.inserted_date := SYSDATE;
   :new.inserted_from := COALESCE(v('APP_USER'), user);
END swe_demo_bri_trg;
/

-- Performance test to insert 1 mill rows
insert into swe_demo (col1)
select level from dual connect by level <= 1000000;

1,000,000 rows inserted.
Elapsed: 00:01:12.904

-- new logic 
drop table swe_demo purge;
drop sequence swe_demo_seq;
create sequence swe_demo_seq cache 10000;
create table swe_demo
  (id number default swe_demo_seq.nextval primary key
   ,col1 number
   ,col2 varchar2(30)
   ,inserted_date date default sysdate not null
   ,inserted_from varchar2(30) default coalesce(
          sys_context('APEX$SESSION','app_user')
         , regexp_substr(sys_context('userenv','client_identifier'),'^[^:]*')
         ,sys_context('userenv','session_user')
         )
      not null);

-- Performance test to insert 1 mill rows
insert into swe_demo (col1)
select level from dual connect by level <= 1000000;

1,000,000 rows inserted.
Elapsed: 00:00:05.598

Side notes

EBR considerations

A trigger is an editionable object. A default column value is not editionable. So if the default expression needs to change, then this will be done in all editions at the same time. With a trigger we are able to choose.

Possible issues

The db_link_info context did not return what the documentation said.
This might be an issue specific to the oracle db version (12.1.0.1).

Security considerations

Consider if you want to trust the client_identifier. Any session can simply set this value and _identify_ as someone else. The apex$session context can not so easily be manipulated.

fragments

When using identity columns if you drop the table the automatically generated sequence (ISEQ$…) will stay. This is needed in case the table is flashbacked from the recycle bin. Therefore I always try to remember to purge the table.

drop table swe_demo purge;

Cleanup

Cleanup code for most of the generated objects:

drop user demo cascade;
drop table swe_demo purge;
drop sequence swe_demo_seq;

12 thoughts on “Sequence and Audit columns with Apex 5 and 12c

  1. But this only works on the new/inserted audit.
    When also need to audit last modification date/user on existing data this will not work, right?

    • Hello Bert,

      Yes this is correct. Currently there is no way to avoid the update trigger (after row update) that is used to fill audit columns. I opened a database idea for this. Feel free to vote for it. https://community.oracle.com/ideas/15760

      Also keep im mind that the performance with triggers is not bad for single row activities. However it hurts, when mass bulk inserts are done. It is more common to do bulk inserts (like data loads) than to do bulk updates. If there is a frequent need for bulk updates then often there are other ways to optimize that, like changeing the datamodel.

      regards
      Sven

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.