Introduction
In the good ol’ times database triggers were used to populate database columns like primary identifiers or audit columns.
Here is an example of such a trigger.
For sake of simplicity I concentrate on the insert trigger.
create table swe_demo (id number primary key ,col1 number ,col2 varchar2(30) ,inserted_date date not null ,inserted_from varchar2(30) not null); create sequence swe_demo_seq cache 10000; create or replace trigger swe_demo_bri_trg BEFORE INSERT ON swe_demo FOR EACH ROW BEGIN -- record needs always a key IF :new.id IS NULL THEN :new.id := swe_demo_seq.NEXTVAL; END IF; -- timestamp of the last changes :new.inserted_date := SYSDATE; :new.inserted_from := COALESCE(v('APP_USER'), user); END swe_demo_bri_trg; /
What does it do?
- The ID column is filled with a sequence value.
- The inserted_date column is filled with sysdate.
- The inserted_from column is filled with the current apex user or if that is not set, then the connected user is choosen.
It is time to revisit that functionality and see which new and maybe better conceptes are offered when the newest technologies are in place.
This was tested against Apex 5.0.1 and Oracle 12.1.0.1 .
It should work also in Apex 5.0.3 and Oracle 12.1.0.2
Solution
If you are just looking for the final code – here is my “perfect” solution.
create sequence swe_demo_seq cache 10000; create table swe_demo (id number default swe_demo_seq.nextval primary key ,col1 number ,col2 varchar2(30) ,inserted_date date default sysdate not null ,inserted_from varchar2(30) default coalesce( sys_context('APEX$SESSION','app_user') ,regexp_substr(sys_context('userenv','client_identifier'),'^[^:]*') ,sys_context('userenv','session_user') ) not null);
The inserted_from value is now more accurate.
You might have noticed that the need for an insert trigger is gone.
This increases performance considerably.
To insert 1 mill rows the old version needed more than a minute. This can be considered fast.
Elapsed: 00:01:12.904
The new version needs less than 6 seconds. This is a blizzard!
Elapsed: 00:00:05.598
I admit that the test was specifically set up to show a major difference. Feel free to set up your own more realistic test cases.
Still 12 times faster is really nice.
The test code to get this result can be seen in chapter TC4.
Considerations and test cases
How to find out about the current user
This line
COALESCE(v('APP_USER'), user)
from the example is really bad.
v(‘APP_USER’) is a bit slow, although not as slow as I suspected. It will not work on databases without apex installations. “v” is just a public synonym for an apex function that looks up session specific values in the apex repository. Surprisingly the “user” function is a much bigger drag. See TC1 below for performance comparisons.
In older apex versions a good way to find out who was logged in, was to read the client_identifier from the userenv context.
sys_context('userenv','client_identifier')
The structure of this did change between the different apex versions. In Apex 5 it now holds the APP_USER and the Session ID, e.g. SVEN:0123456789
A context is an extremly efficient way to read global session or application data.
Apex 5 introduced a new context with the namespace “APEX$SESSION”. This context is populated by the apex engine (APEX_050000.WWV_FLOW_SESSION_CONTEXT) and holds app_user, app_session and workspace_id. There is a good chance that future versions will add more parameters to this namespace.
See also: http://jeffkemponoracle.com/2015/11/24/apex-5-application-context/
To read the logged in user we fetch SYS_CONTEXT(‘APEX$SESSION’,’APP_USER’).
To read the user if there is something non-apex running, we should read the client_identifer. But there is more to consider.
database links
In many apex projects we do not insert into the local apex database, but into some remote project schema. This is often the case when I just add some reporting or monitoring solution in apex, without interfering with the real project database too much. So lets compare how the different contexts behave when done local vs. remote (over a db link).
12c introduced some changes to the userenv namespace. Especially the “current_user” is now deprecated.
Here is a list of all parameters that belong to the userenv namespace:
https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions165.htm#g1513460
The interesting ones are:
- client_identifier
Returns an identifier that is set by the application through the DBMS_SESSION.SET_IDENTIFIER procedure, the OCI attribute OCI_ATTR_CLIENT_IDENTIFIER, or the Java class Oracle.jdbc.OracleConnection.setClientIdentifier. This attribute is used by various database components to identify lightweight application users who authenticate as the same database user. - current_schema
Name of the default schema being used in the current schema. This value can be changed during the session with an ALTER SESSION SET CURRENT_SCHEMA statement. - current_user
Deprecated – Use the SESSION_USER parameter instead. - session_user
For enterprises users, returns the schema. For other users, returns the database user name by which the current user is authenticated. This value remains the same throughout the duration of the session. - dblink_info
Returns the source of a database link session. Specifically, it returns a string of the form:SOURCE_GLOBAL_NAME=dblink_src_global_name, DBLINK_NAME=dblink_name, SOURCE_AUDIT_SESSIONID=dblink_src_audit_sessionid - authenticated_identity
Returns the identity used in authentication. In the list that follows, the type of user is followed by the value returned: …
For testing purposes I just made a loopback link to the same database but a different schema. The testscript can be found in section TC2.
The following tests were done:
1) An insert from an apex page into a view that included a db link to REMOTE_B
2) An insert from an apex page, directly into a local view (LOCAL_A) which was mapped to demo.swe_demo.
5) A direct database insert in the view with the DB link (remote insert)
6) A direct database insert in the view without the db link
7) an insert from a DBA “IAMDBA” using alter session set current_schema
This was the result
Id | Some Text | Inserted Date | Inserted From | App User | Apex Session | Client Identifier | Current Schema | Current User | User | Session User | Authenticated Identity | Dblink Info |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | apex remote insert | 22-FEB-16 | SVEN | – | – | SVEN:4628609689353 | DEMO | DEMO | REMOTE_B | REMOTE_B | REMOTE_B | SOURCE_GLOBAL_NAME=DB_A.11623201 |
2 | apex local insert | 22-FEB-16 | SVEN | SVEN | SVEN | SVEN:4628609689353 | DEMO | DEMO | APEX_PUBLIC_USER | APEX_PUBLIC_USER | APEX_PUBLIC_USER | |
5 | direct remote insert | 22-FEB-16 | REMOTE_B | – | – | – | DEMO | DEMO | REMOTE_B | REMOTE_B | REMOTE_B | |
6 | direct local insert | 22-FEB-16 | LOCAL_A | – | – | – | DEMO | DEMO | LOCAL_A | LOCAL_A | LOCAL_A | |
7 | direct insert current_schema | 22-FEB-16 | IAMDBA | – | – | – | DEMO | DEMO | IAMDBA | IAMDBA | IAMDBA |
I did some more tests, for example using definer and invoker rights procedures, but they didn’t reveal any important differences.
The findings
APEX$SESSION is not populated via db link, but CLIENT_IDENTIFIER is.
Other than those two SESSION_USER has the best information and is always populated.
Therefore in an apex near environment the best expression to find out who inserted some record would be
coalesce( sys_context('APEX$SESSION','app_user') ,regexp_substr(sys_context('userenv','client_identifier'),'^[^:]*') ,sys_context('userenv','session_user') )
One should also notice that current_user returns the same values as current_session. This is contrary to the documentation in older database versions. And Oracles suggestion to use session_user instead is certainly correct, but you should be aware that then a different user might be stored than before.
Authenticated_Identity promises to be interesting in very specific security environments.
12c identity columns and default values
With 12c we can consider to use an identity column as our primary key. Also it is now possible to use a sequence as a default value for the column.
The base syntax for an identity column is
id number generated as identity
The base syntax for a default value column is
id number default swe_demo_seq.nextval
There are several differences between the two.
An identity column essentially prevents that values are inserted into this column. So that it is ensured that always the sequence was used to populate the data. The default value column uses the sequence only if null was inserted. This is essentially the same functionality what the trigger did.
It is possible to setup an identity column to behave almost the same as the default value column.
The extended syntax for such an identity column is
id number generated by default on null as identity (cache 10000)
I slightly prefer the default value column syntax for two reasons
- The default value syntax is also used for the two audit columns. So it is consistent to use the same mechanism.
- Default value columns allow more control over the sequence object.
I plan to write an extra blog post to discuss the differences in more detail.
TC3 shows an performance comparison between trigger logic, default value and identity column. Default values and identity are equally fast. But the trigger looses by a large margin!
Various testcases
For most cases the results should be consistent in other environments too. However always test it your own. Here are the scripts that I used. They might help to setup your own test cases.
TC1) Compare performance of different expressions to read the client identifer
The following plsql block was used and run
set time on set timing on declare v_result varchar2(100); begin for i in 1..1000000 loop v_result := ##EXPRESSION##; end loop; end; /
This is the output for several tested expressions.
Watch out that some expressions return different results.
I always did three runs to see if the results are consistent and copied one of the elapsed times to the table.
sys_context('userenv','client_identifier')
00:00:02.436
This is the base. It should not be possible to go faster.
substr(sys_context('userenv','client_identifier'),1 ,instr(sys_context('userenv','client_identifier'),':')-1)
00:00:04.288
Finds the colon and returns everything before that. But returns NULL if no colon is in the client_identifer.
substr(sys_context('userenv','client_identifier'),1, coalesce(nullif(instr(sys_context('userenv','client_identifier'),':'),0)-1, length(sys_context('userenv','client_identifier'))))
00:00:06.311
A little complex to read, isn’t it…
regexp_substr(sys_context('userenv','client_identifier'),'^[^:]*')
00:00:03.459
This is the winner!
translate(sys_context('userenv','client_identifier'),'A:0123456789','A')
00:00:05.663
Surprisingly slow
It surprised me that the REGEXP_SUBSTR expression was the best performing. Previous comparisons between regular expressions and substr/instr combinations always went in favor of substr. It seems either the internal code was optimized by oracle. Or more likely that this specific task (simple search from the beginning of the string) is well suited for a fast regexp search.
In general all expressions were fast. Interesting side note is that REGEXP_SUBSTR and TRANSLASTE were a tiny bit slower for the second identifier, while the other expressions performed equally fast.
Some other expressions:
user
Elapsed: 00:00:20.652
Really really slow!
sys_context('APEX$SESSION','app_user')
00:00:01.549
Blazingly fast!
v('APP_USER')
00:00:05.646
Not as slow as I thought!
TC2) Analyze the value of differnt contexts with regards to DB links
This script can be run as DBA to setup a demo user.
Make sure you do not have a schema named “demo” already in your system.
create user demo identified by demo; grant unlimited tablespace to demo; drop table demo.swe_demo ; drop sequence demo.swe_demo_seq; create table demo.swe_demo (id number primary key ,some_text varchar2(30) ,inserted_date date not null ,inserted_from varchar2(30) not null ,c_app_user varchar2(30) ,c_apex_session varchar2(30) ,c_client_identifier varchar2(100) ,c_current_schema varchar2(30) ,c_current_user varchar2(30) ,c_user varchar2(30) ,c_session_user varchar2(30) ,c_authenticated_identity varchar2(100) ,c_dblink_info varchar2(100) ,c_external_name varchar2(30) ); create sequence demo.swe_demo_seq cache 10000; create or replace trigger demo.swe_demo_bri_trg BEFORE INSERT ON demo.swe_demo FOR EACH ROW BEGIN -- record needs a key IF :new.id IS NULL THEN :new.id := swe_demo_seq.NEXTVAL; END IF; -- timestamp of last changes :new.inserted_date := SYSDATE; :new.c_app_user := v('APP_USER'); :new.c_apex_session := sys_context('APEX$SESSION','app_user'); :new.c_client_identifier := sys_context('userenv','client_identifier'); :new.c_current_schema := sys_context('userenv','current_schema'); :new.c_current_user := sys_context('userenv','current_user'); :new.c_user := user; :new.c_authenticated_identity := sys_context('userenv','authenticated_identity'); :new.c_session_user := sys_context('userenv','session_user'); :new.c_dblink_info := sys_context('userenv','dblink_info'); :new.c_external_name := sys_context('userenv','EXTERNAL_NAME'); -- recommended: :new.inserted_from := coalesce(sys_context('APEX$SESSION','app_user') ,regexp_substr(sys_context('userenv','client_identifier'),'^[^:]*') ,sys_context('userenv','session_user') ); END swe_demo_bri_trg; / show errors grant select, insert, update, delete on demo.swe_demo to SchemaB;
TC3) Performance comparison for populating PK columns
-------------------------------------------------------------------------------- -- swe 12c demo identity columns -------------------------------------------------------------------------------- set time on set timing on -- old logic using trigger drop table swe_demo ; drop sequence swe_demo_seq; create table swe_demo (id number primary key, col1 number, col2 varchar2(30)); create sequence swe_demo_seq cache 10000; create or replace trigger swe_demo_bri_trg BEFORE INSERT ON swe_demo FOR EACH ROW BEGIN -- record needs a key IF :new.id IS NULL THEN :new.id := swe_demo_seq.NEXTVAL; END IF; END swe_demo_bri_trg; / -- Performance test to insert 1 mill records insert into swe_demo (col1) select level from dual connect by level <= 1000000; -- result 1,000,000 rows inserted. Elapsed: 00:00:36.854 -- setup 12c logic using default value sequences drop table swe_demo ; drop sequence swe_demo_seq; create sequence swe_demo_seq cache 10000; create table swe_demo (id number default swe_demo_seq.nextval primary key, col1 number, col2 varchar2(30)); -- performance test to insert 1 mill records insert into swe_demo (col1) select level from dual connect by level <= 1000000; -- result 1,000,000 rows inserted. Elapsed: 00:00:04.068 -- 12c logic using identity column drop table swe_demo ; drop sequence swe_demo_seq; create table swe_demo (id number generated as identity primary key, col1 number, col2 varchar2(30)); -- Performance test to insert 1 mill records insert into swe_demo (col1) select level from dual connect by level <= 1000000; --result 1,000,000 rows inserted. Elapsed: 00:00:08.626 There is a performance difference compared to the default value solution. This is purely based upon the sequence cache The identity column was created with a default cache size of 20. --Try to manually insert a row into the identiy column insert into swe_demo (id, col1) values (null, -1); SQL Error: ORA-32795: cannot insert into a generated always identity column 32795.0000 - "cannot insert into a generated always identity column" *Cause: An attempt was made to insert a value into an identity column created with GENERATED ALWAYS keywords. *Action: A generated always identity column cannot be directly inserted. Instead, the associated sequence generator must provide the value. ; -- use non default settings for the identity column -- 12c logic using identity column drop table swe_demo purge; create table swe_demo (id number generated by default on null as identity (cache 10000) primary key , col1 number, col2 varchar2(30)); -- Performance test to insert 1 mill records insert into swe_demo (col1) select level from dual connect by level <= 1000000; --result 1,000,000 rows inserted. Elapsed: 00:00:03.763 -- test insert insert into swe_demo (id, col1) values (null, -1); 1 row inserted. insert into swe_demo (id, col1) values (-1, -1); 1 row inserted. It seems as this is even a tiny bit faster than the default value solution. The test results were influenced heavily by other network or database activities. Both execution times are very close. The top score however was done using an identity column.
TC4) Extend performance test TC3 with audit columns
-------------------------------------------------------------------------------- -- Adding audit columns to the previous example -- Final comparison -------------------------------------------------------------------------------- -- old logic drop table swe_demo purge; drop sequence swe_demo_seq ; create sequence swe_demo_seq cache 10000; create table swe_demo (id number primary key ,col1 number ,col2 varchar2(30) ,inserted_date date not null ,inserted_from varchar2(30) not null); create or replace trigger swe_demo_bri_trg BEFORE INSERT ON swe_demo FOR EACH ROW BEGIN -- record needs always a key IF :new.id IS NULL THEN :new.id := swe_demo_seq.NEXTVAL; END IF; -- timestamp of the last changes :new.inserted_date := SYSDATE; :new.inserted_from := COALESCE(v('APP_USER'), user); END swe_demo_bri_trg; / -- Performance test to insert 1 mill rows insert into swe_demo (col1) select level from dual connect by level <= 1000000; 1,000,000 rows inserted. Elapsed: 00:01:12.904 -- new logic drop table swe_demo purge; drop sequence swe_demo_seq; create sequence swe_demo_seq cache 10000; create table swe_demo (id number default swe_demo_seq.nextval primary key ,col1 number ,col2 varchar2(30) ,inserted_date date default sysdate not null ,inserted_from varchar2(30) default coalesce( sys_context('APEX$SESSION','app_user') , regexp_substr(sys_context('userenv','client_identifier'),'^[^:]*') ,sys_context('userenv','session_user') ) not null); -- Performance test to insert 1 mill rows insert into swe_demo (col1) select level from dual connect by level <= 1000000; 1,000,000 rows inserted. Elapsed: 00:00:05.598
Side notes
EBR considerations
A trigger is an editionable object. A default column value is not editionable. So if the default expression needs to change, then this will be done in all editions at the same time. With a trigger we are able to choose.
Possible issues
The db_link_info context did not return what the documentation said.
This might be an issue specific to the oracle db version (12.1.0.1).
Security considerations
Consider if you want to trust the client_identifier. Any session can simply set this value and _identify_ as someone else. The apex$session context can not so easily be manipulated.
fragments
When using identity columns if you drop the table the automatically generated sequence (ISEQ$…) will stay. This is needed in case the table is flashbacked from the recycle bin. Therefore I always try to remember to purge the table.
drop table swe_demo purge;
Cleanup
Cleanup code for most of the generated objects:
drop user demo cascade; drop table swe_demo purge; drop sequence swe_demo_seq;
Great post. Very interesting. We would still need a trigger for update auditing i.e. Updated_date, Updated_by.
Awesome.. I just today was looking for a solution to do this.. and nice! thanks you
This is really good stuff, I think it’s time for a change. Thank you!
Tx for share With the apex comunity, great tip!
Great post, Thanks!
Great post!
If you would like to see this information from this post incorporated into a (short, 10 min) presentation format, try this
https://prezi.com/g0cew4clllzk/the-perfect-trigger/
Thank you, Sven.
[…] This new 12c “default on null” feature can be used to replace the typical BEFORE ROW INSERT trigger. More info how to do this in Sequence and Audit columns with Apex 5 and 12c […]
[…] Sven Weller about the perfect insert trigger […]
But this only works on the new/inserted audit.
When also need to audit last modification date/user on existing data this will not work, right?
Hello Bert,
Yes this is correct. Currently there is no way to avoid the update trigger (after row update) that is used to fill audit columns. I opened a database idea for this. Feel free to vote for it. https://community.oracle.com/ideas/15760
Also keep im mind that the performance with triggers is not bad for single row activities. However it hurts, when mass bulk inserts are done. It is more common to do bulk inserts (like data loads) than to do bulk updates. If there is a frequent need for bulk updates then often there are other ways to optimize that, like changeing the datamodel.
regards
Sven
[…] In an older and quite successful blog post of mine I tried to find the “perfect” solution for audit columns. See: sequence-and-audit-columns-with-apex-5-and-12c. […]