fun with calendars and dates

Intro

[Image: 1911 Ottoman calendar, public domain]

Calendar and date arithmetic is not easy. And not because of the database, but because the stuff exists in the real world. The database does an excellent job of reflecting many, or most, of the strange attributes that calendar systems have.

Here is a collection of surprising, quirky or interesting things around usage of date and time values in an Oracle database. A lot of those things have nothing to do with the database, but instead with the complexity of the world.

By examining those edge cases we might learn something about the general workings of dates, calendars and the Oracle database.

5 Fun Facts

1) We live in the year 2561

alter session set nls_calendar = 'THAI BUDDHA';
select to_char(sysdate,'YYYY') as current_year from dual;

CURRENT_YEAR
2561

q.e.d.

The reason, of course, is that there are other calendars out there in the world, not only the default Gregorian one. And according to the Thai Buddha calendar we are now in 2561.

2) Persons existed who were born on the 30th of February

…but only in Sweden some time ago.

I got this from a source which doesn’t exist anymore. However, here is a site that seems to have the same information:
the-ultimate-guide-to-the-datetime-datatypes

Why is 1753 the earliest date for datetime?
Good question. It is for historical reasons. In what we sometimes refer to as the western world, we have had two calendars in modern time: the Julian and the Gregorian calendars. These calendars were a number of days apart (depending on which century you look at), so when a culture that used the Julian calendar moved to the Gregorian calendar, they dropped from 10 to 13 days. Great Britain made this shift in 1752 (1752-09-02 were followed by 1752-09-14). An educated guess why Sybase selected 1753 as earliest date is that if you were to store an earlier date than 1753, you would also have to know which country and also handle this 10-13 day jump. So they decided to not allow dates earlier than 1753. Note, however that other countries did the shift later than 1752. Turkey, for instance, did it as late as 1927.

Being Swedish, I find it a bit amusing that Sweden had the weirdest implementation. They decided to skip the leap day over a period of 40 years (from 1700 to 1740), and Sweden would be in sync with the Gregorian calendar after 1740 (but meanwhile not in sync with anyone). However, in 1704 and 1708 the leap day wasn’t skipped for some reason, so in 1712 which was a leap year, they inserted yet an extra day (imagine being born in Feb 30!) and then did the shift over a day like everyone else, in 1753.

Does the database know about this? Let’s find out:

alter session set nls_date_language='SWEDISH';
alter session set nls_territory='SWEDEN';
select to_date('30-02-1712','dd-mm-YYYY') from dual;
ORA-01839: date not valid for month specified

Using a date literal instead of TO_DATE doesn’t help either.

select to_char(date '1712-02-30') from dual;
ORA-01847: day of month must be between 1 and last day of month

So this doesn’t work. Why? Because the 30th of February only existed in the Julian calendar system. The Oracle database doesn’t support a Julian calendar – we cannot choose calendar systems that are not in use anymore.

Btw. the 29th of February 1712 does exist. Even in Sweden.

select to_char(date '1712-02-29','fmddth Month YYYY') from dual;
29th Februari 1712

The article also briefly mentioned that Turkey was the last country to switch to the Gregorian calendar. Before that, they had the Rumi calendar, which is a variation of the Julian one. It differs mostly in the starting year.

Turkey made the switch as late as 1927. After the 15th of February 1332 AH (Rumi) they skipped 13 days, so that the next day was the 1st of March 1927 AD (Gregorian). They switched the year, too.

Again, if we ask the database then it does not know about this gap.

set pagesize 100
alter session set nls_date_language='TURKISH';
alter session set nls_territory='TURKEY';
-- 28 days later...
with twentyeight as (select level lv from dual connect by level <=28)
select to_char(date '1927-02-10' + lv,'fmddth Month YYYY')
from twentyeight;
11th Şubat 1927
12th Şubat 1927
13th Şubat 1927
14th Şubat 1927
15th Şubat 1927
16th Şubat 1927
17th Şubat 1927
18th Şubat 1927
19th Şubat 1927
20th Şubat 1927
21st Şubat 1927
22nd Şubat 1927
23rd Şubat 1927
24th Şubat 1927
25th Şubat 1927
26th Şubat 1927
27th Şubat 1927
28th Şubat 1927
1st Mart 1927
2nd Mart 1927
3rd Mart 1927
4th Mart 1927
5th Mart 1927
6th Mart 1927
7th Mart 1927

Since the skipped days existed in the Gregorian calendar, there is no reason to leave them out.

Interestingly, in Turkey there should still be living persons who were born in or before the year 1332 (older than 92 years)!

3) There are no days between 4 Oct 1582 and 15 Oct 1582

select to_char(date '1582-10-4','ddth Mon YYYY') from dual;  
04th Oct 1582

+1 day

select to_char(date '1582-10-4' + 1,'ddth Mon YYYY') from dual; 
15th Oct 1582

04th Oct 1582 + 1 = 15th Oct 1582 ?

After the previous chapter the reason should be obvious.
At this time the Gregorian calendar skipped 10 days in order to catch up with the astronomical year.

From http://www.unc.edu/~rowlett/units/dictY.html#year

year (a or y or yr) [2]
a traditional unit of time usually equal to 365 or 366 days. We need a whole number of days for the calendar year used in ordinary life. Ancient astronomers knew that the year [1] is approximately 365 days long, and we now know the correct figure is approximately 365.242 days. If we use 365 as the number of days in every calendar year, the extra 0.242 day adds up quickly and causes large errors in predicting the seasons. To solve this problem, the Roman emperor Julius Caesar decreed in 46 BC that the calendar year should have 365 days generally, but that every fourth year should have an extra, or 366th, day. The longer year is called a leap year. In this Julian calendar, four years equal exactly 1461 days, so the average Julian year is exactly 365.25 days.

This was a big step toward accuracy in the calendar, but the Julian year is too long by 0.008 day, or a little over 11 minutes. By the time of the Renaissance, these 11-minute errors had accumulated to a total error of about 10 days (since the Council of Nicaea in 325 AD, which set the rules for deciding when Easter should be celebrated). The spring equinox was occurring near March 11 instead of March 21. In 1582, Pope Gregory XIII decreed that 10 days should be dropped from the calendar: the day after 1582 October 4 was October 15. To reduce future errors, the pope further decreed that years divisible by 100 are not leap years unless they are also divisible by 400. Thus 2000 and 2400 are leap years, but 2100, 2200, and 2300 are not. It took many years, but the Gregorian calendar has now been accepted as the civil calendar in all countries of the world.

With the Gregorian adjustment, there are exactly 146 097 days in every 400 years, and the average Gregorian year is exactly 365.2425 days. The Gregorian year is still too long, but by less than half a minute. It will take thousands of years for this error to accumulate to 1 day, so the calendar year and the tropical year are in good enough agreement to last us a long time.

And this time, the database with its default Gregorian calendar knows about it!

The database treats days that seem to fall inside this gap as if they were after the skipped days.

select to_char(date '1582-10-5','ddth Mon YYYY') from dual; 
15th Oct 1582

4) The north pole and the south pole are in different time zones

From wikipedia: https://simple.wikipedia.org/wiki/Time_zone

In the poles, the time is UTC in the North Pole and UTC+12 in the South Pole.

(and no this is not the “South Pole, Illinois, USA” nor the “North Pole, New South Wales, Australia”)

So in case you happen to live or work at the south pole, set your session time zone to UTC+12.

Let me simulate that for you:

col sessiontimezone format A20
col dbtimezone format A20
-- check current settings
select sessiontimezone, tz_offset(sessiontimezone),
       dbtimezone, tz_offset(dbtimezone)
from dual;
SESSIONTIMEZONE      TZ_OFFS DBTIMEZONE           TZ_OFFS
-------------------- ------- -------------------- -------
Europe/Berlin        +02:00  +02:00               +02:00

Currently I’m in Frankfurt, which is in the same time zone as Berlin, so 2 hours ahead of UTC.
Since I have to do some quick work at the south pole, let’s change my session settings to reflect that.

-- change to UTC+12
alter session set time_zone ='+12:00';
Session altered.
-- how late is it here currently?
select sysdate, current_date from dual;
SYSDATE           CURRENT_DATE     
----------------- -----------------
17.07.18 10:51:12 17.07.18 20:51:12

So here at the south pole, it is almost 9 p.m. whereas the database is still in Frankfurt at 11 a.m.

After enjoying The Great White Silence I need to return.

-- change back to original
alter session set time_zone=local;
Session altered.
-- doublecheck settings
select sessiontimezone, tz_offset(sessiontimezone),
       dbtimezone, tz_offset(dbtimezone)
from dual;
SESSIONTIMEZONE      TZ_OFFS DBTIMEZONE           TZ_OFFS
-------------------- ------- -------------------- -------
Europe/Berlin        +02:00  +02:00               +02:00

It is good to be back! Timezone travelling is exhausting.

5) The word calendar comes from the Latin word Kalendae which meant the first day of the month

A calendar is essentially a list of first month days.

Getting the first day of a month is easy.

trunc(sysdate,'MM') => 1st of the current month.

But what about the first day of a week? And which week are we in?

This is complex for two major reasons. There is quite some confusion about when a week starts and about what the first week of the year is. And by confusion I mean that different countries have different rules for it.

https://www.calendar-week.org/

The good thing is, the Oracle database has all this knowledge. A developer just has to understand it. As usual, NLS settings influence the behaviour.

Different date formats refer to those NLS settings.

‘D’ returns the number of the day inside the week.

‘WW’ returns the number of the week inside the year. Week 1 starts on the first day of the year and continues to the seventh day of the year.
‘IW’ returns the number of the ISO-week.

‘YYYY’ refers to the year
‘IYYY’ refers to the ISO-year.

some examples

What weekday is the first day of the week?

NLS_TERRITORY influences what the first day of a week is.

alter session set NLS_DATE_LANGUAGE = 'AMERICAN';

alter session set NLS_TERRITORY = 'AMERICA';
select to_char(trunc(sysdate,'D'),'Day') as "First Day in America" from dual;
alter session set NLS_TERRITORY = 'GERMANY';
select to_char(trunc(sysdate,'D'),'Day') as "First Day in Germany" from dual;
alter session set NLS_TERRITORY = 'IRAQ';
select to_char(trunc(sysdate,'D'),'Day') as "First Day in Iraq"  from dual;
select to_char(trunc(sysdate,'IW'),'Day') as "ISO First Day" from dual;

result

Session altered.
Session altered.

First Day in America
---------
Sunday   

Session altered.

First Day in Germany
---------
Monday   

Session altered.

First Day in Iraq
---------
Saturday 

ISO First Day
---------
Monday   

NLS_TERRITORY does not directly influence what the first week of the year is! To get the week number we always use a format model (IW or WW) that directly decides which week logic is applied.

However NLS_TERRITORY sets an ISO-Week flag. Unfortunately this is not well documented. I believe this flag is partially responsible for one of the behaviours mentioned in the “Stranger Things” section.

5 Stranger Things – bug or feature?


1) sysdate returns a date, but systimestamp does not return a timestamp

systimestamp returns a TIMESTAMP WITH TIME ZONE. localtimestamp would be the function that returns a plain TIMESTAMP. In many cases systimestamp is still the best option to use!

These are the datatypes and the matching pseudocolumns dealing with points in time:

  • DATE : sysdate, current_date
  • TIMESTAMP : localtimestamp
  • TIMESTAMP WITH TIME ZONE : systimestamp, current_timestamp
  • TIMESTAMP WITH LOCAL TIME ZONE

example
First modify the session so that the session time zone and the db time zone differ.

ALTER SESSION SET TIME_ZONE='+10:00';

show nls
DB_TIMEZONE +01:00
SESSION_TIMEZONE +10:00
SESSION_TIMEZONE_OFFSET +10:00

Check the pseudocolumns and convert them to TIMESTAMP where needed (using CAST). The same happens when the value is stored in a timestamp column: the database does the conversion implicitly.

select to_char(sysdate,'DD-MM-RR HH24:MI:SS') as "sysdate",
       to_char(localtimestamp,'DD-MM-RR HH24:MI:SS') as "localTS",
       to_char(current_timestamp,'DD-MM-RR HH24:MI:SS TZH') as "currentTS",
       to_char(systimestamp,'DD-MM-RR HH24:MI:SS TZH') as "sysTS",
       to_char(cast(current_timestamp as timestamp),'DD-MM-RR HH24:MI:SS') as "currentTS_converted",
       to_char(cast(systimestamp as timestamp),'DD-MM-RR HH24:MI:SS') as "sysTS_converted"
from dual;

Result (timestamp):

sysdate	              16-07-18 15:01:58
localTS	              16-07-18 23:01:58
currentTS             16-07-18 23:01:58 +10	
sysTS	              16-07-18 15:01:58 +02
currentTS_converted   16-07-18 23:01:58
sysTS_converted       16-07-18 15:01:58

Note that localtimestamp and current_timestamp return (and store) a different hour than sysdate. systimestamp returns the same hour after conversion and therefore matches sysdate.

It would be better to have a column that stores the time zone information too: either TIMESTAMP WITH TIME ZONE or TIMESTAMP WITH LOCAL TIME ZONE. Since the TZ info is stored, it no longer matters which datatype the source has.


select to_char(sysdate,'DD-MM-RR HH24:MI:SS') as "sysdate",
       to_char(localtimestamp,'DD-MM-RR HH24:MI:SS') as "localTS",
       to_char(current_timestamp,'DD-MM-RR HH24:MI:SS TZH') as "currentTS",
       to_char(systimestamp,'DD-MM-RR HH24:MI:SS TZH') as "sysTS",
       to_char(cast(localtimestamp as timestamp with local time zone),'DD-MM-RR HH24:MI:SS [TZR]') as "localTS_converted",
       to_char(cast(current_timestamp as timestamp with local time zone),'DD-MM-RR HH24:MI:SS [TZR]') as "currentTS_converted",
       to_char(cast(systimestamp as timestamp with local time zone),'DD-MM-RR HH24:MI:SS [TZR]') as "sysTS_converted"
from dual;

Result (timestamp with local time zone):

sysdate	              16-07-18 15:13:26
localTS	              16-07-18 23:13:26
currentTS             16-07-18 23:13:26 +10	
sysTS	              16-07-18 15:13:26 +02
localTS_converted     16-07-18 23:13:26 [+10:00]
currentTS_converted   16-07-18 23:13:26 [+10:00]
sysTS_converted       16-07-18 23:13:26 [+10:00]

As we can see, the result is always the same point in time.

2) Micro Intervals

And of course there are also datatypes that deal with durations.

  • NUMBER
  • INTERVAL YEAR TO MONTH
  • INTERVAL DAY TO SECOND

Why are there two interval types? Because they are independent measurement systems for a duration of time. And there is no correct conversion possible from one into the other.

Consider this: 29 days (INTERVAL DAY TO SECOND) could be slightly less, slightly more or sometimes even exactly 1 month (INTERVAL YEAR TO MONTH). We can only do this conversion if we additionally have some reference point, like the starting day. Without this reference point in time, we can’t correctly convert from one system of measurement into the other.

We can convert from number to interval using the functions numtoDSinterval or numtoYMinterval.

Let’s do some workday calculations.

My typical work day (ok, I might understate a bit) is from 9:30 to 18:30.

select to_date('18:30','HH24:MI') - to_date('9:30','HH24:MI') from dual;
0.375 

A date minus a date returns a number: the result in days. So I worked for 0.375 days. Let’s see that in interval format.

select numtoDSinterval(0.375, 'day') from dual;
+00 09:00:00.000000

Looks about right!

Can we also see that as a month interval? The MONTHS_BETWEEN function gives us the result as a number (= months) for the difference between two dates.

select months_between(to_date('19:30','HH24:MI')
                     ,to_date('10:30','HH24:MI')) 
from dual;
0

Hm… 9 hours equal 0 months…

Let’s assume somebody has a large shift requiring them to work for more than 24 hours.

-- how many months did he work?
select months_between(to_date('2018-07-02 19:30','YYYY-MM-DD HH24:MI')
                     ,to_date('2018-07-01 07:15','YYYY-MM-DD HH24:MI'))
from dual;
0.0487231182795698924731182795698924731183

And as an interval…

select numtoYMinterval(0.0487, 'MONTH') from dual;
+00-01

Hmm… 0.0487 Months => 0 Years 1 Month?

Is it always rounded up?

So what would a larger result be:

select numtoYMinterval(1.0487, 'MONTH') from dual;
+00-01

Hmm… 1.0487 Months => 0 Years 1 Month

This time it is not rounded up, but rounded down.

How about some more micro intervals?

numtoYMinterval(0.001, 'MONTH') => +00-00
numtoYMinterval(0.009, 'MONTH') => +00-00
numtoYMinterval(0.01,  'MONTH') => +00-01
numtoYMinterval(0.49,  'MONTH') => +00-01
numtoYMinterval(0.5,   'MONTH') => +00-01
numtoYMinterval(1.01,  'MONTH') => +00-01
numtoYMinterval(1.5,   'MONTH') => +00-02
numtoYMinterval(2.01,  'MONTH') => +00-02
numtoYMinterval(2.5,   'MONTH') => +00-03

Result

For very small values (0.01–0.49) numtoYMinterval rounds up instead of rounding to the nearest month. For larger values the rounding is consistent: they are rounded to the nearest month, as expected.

This behaviour was introduced somewhere between Oracle DB versions 10 and 11. It took me a while, but it is now accepted as an unpublished bug (26244914).

Recommendation: Avoid numtoYMinterval. There are several strange edge cases with that function.

3) date format “J” – Julian

We can use the J format mask to spell out numbers (Jsp).
J is the Julian day: the number of days since January 1, 4712 BC.

Let’s play around with it a little.

select to_char(sysdate,'J') from dual;

2458317

Of course when you execute it, you will get a different (larger) number. Time has passed.

We take this number, subtract it from sysdate, and then add the number of days we are interested in. This time using "sp" (= spelling) in the format parameter.

select to_char((sysdate-(2458317-5)),'Jsp') from dual;
Five

Cool! This way we can spell out any number. Unfortunately this only works in English, not in other languages.

Important side note: The calculation must be like this

sysdate-(2458317-5)

Although mathematically the same, this would have resulted in an error

(sysdate-2458317)+5
ORA-01841: (full) year must be between -4713 and +9999, and not be 0

The problem is that the intermediate date value is invalid (before the beginning of time).

To avoid hardcoding 2458317, we can calculate it from sysdate.

select to_char(sysdate-(to_number(to_char(sysdate,'J'))-123),'Jsp') spell_number from dual;
One Hundred Twenty-Three

Ok let’s try a large number…

select to_char(sysdate-(to_number(to_char(sysdate,'J'))-1721116),'Jsp') spell_number from dual;
One Million Seven Hundred Twenty Thousand Seven Hundred Fifty-One

And one more…

select to_char(sysdate-(to_number(to_char(sysdate,'J'))-1721117),'Jsp') spell_number from dual;
000000000000000000000000000000000000000000000000000000000000000000000000000000

Wat!? So spelling out One Million Seven Hundred Twenty Thousand Seven Hundred Fifty-One is ok, but One Million Seven Hundred Twenty Thousand Seven Hundred Fifty-Two is not…

Hold on a minute. Even the first number was not spelled correctly!
1721116 != One Million Seven Hundred Twenty Thousand Seven Hundred Fifty-One

As so often, there is a reason for that. The difference between the two numbers is exactly 365, which is our first clue. We need to understand which dates we are talking about.

The following rows depend heavily on the database version.

This is from a 12.2.0.1 DB. For readability, I removed several hundred rows from the output.

with testdata as (select level lv, 1721340-300+level days# from dual connect by level <= 400)
select lv, days#, 
       to_char(sysdate-(to_number(to_char(sysdate,'J'))-days#),'YYYY-MM-DD A.D.') dy,
       to_char(sysdate-(to_number(to_char(sysdate,'J'))-days#),'J') Julian,
       to_char(sysdate-(to_number(to_char(sysdate,'J'))-days#),'Jsp') spelled_number
from testdata;
LV	DAYS#	DY		JULIAN	SPELLED_NUMBER
1	1721041	0001-12-15 B.C.	1721041	One Million Seven Hundred Twenty-One Thousand Forty-One
2	1721042	0001-12-16 B.C.	1721042	One Million Seven Hundred Twenty-One Thousand Forty-Two
3	1721043	0001-12-17 B.C.	1721043	One Million Seven Hundred Twenty-One Thousand Forty-Three
4	1721044	0001-12-18 B.C.	1721044	One Million Seven Hundred Twenty-One Thousand Forty-Four
5	1721045	0001-12-19 B.C.	1721045	One Million Seven Hundred Twenty-One Thousand Forty-Five
6	1721046	0001-12-20 B.C.	1721046	One Million Seven Hundred Twenty-One Thousand Forty-Six
7	1721047	0001-12-21 B.C.	1721047	One Million Seven Hundred Twenty-One Thousand Forty-Seven
8	1721048	0001-12-22 B.C.	1721048	One Million Seven Hundred Twenty-One Thousand Forty-Eight
9	1721049	0001-12-23 B.C.	1721049	One Million Seven Hundred Twenty-One Thousand Forty-Nine
10	1721050	0001-12-24 B.C.	1721050	One Million Seven Hundred Twenty-One Thousand Fifty
11	1721051	0001-12-25 B.C.	1721051	One Million Seven Hundred Twenty-One Thousand Fifty-One
12	1721052	0001-12-26 B.C.	1721052	One Million Seven Hundred Twenty-One Thousand Fifty-Two
13	1721053	0001-12-27 B.C.	1721053	One Million Seven Hundred Twenty-One Thousand Fifty-Three
14	1721054	0001-12-28 B.C.	1721054	One Million Seven Hundred Twenty-One Thousand Fifty-Four
15	1721055	0001-12-29 B.C.	1721055	One Million Seven Hundred Twenty-One Thousand Fifty-Five
16	1721056	0001-12-30 B.C.	1721056	One Million Seven Hundred Twenty-One Thousand Fifty-Six
17	1721057	0001-12-31 B.C.	1721057	One Million Seven Hundred Twenty-One Thousand Fifty-Seven
18	1721058	0001-01-01 B.C.	1720693	One Million Seven Hundred Twenty Thousand Six Hundred Ninety-Three
19	1721059	0001-01-02 B.C.	1720694	One Million Seven Hundred Twenty Thousand Six Hundred Ninety-Four
20	1721060	0001-01-03 B.C.	1720695	One Million Seven Hundred Twenty Thousand Six Hundred Ninety-Five
21	1721061	0001-01-04 B.C.	1720696	One Million Seven Hundred Twenty Thousand Six Hundred Ninety-Six
22	1721062	0001-01-05 B.C.	1720697	One Million Seven Hundred Twenty Thousand Six Hundred Ninety-Seven
23	1721063	0001-01-06 B.C.	1720698	One Million Seven Hundred Twenty Thousand Six Hundred Ninety-Eight
24	1721064	0001-01-07 B.C.	1720699	One Million Seven Hundred Twenty Thousand Six Hundred Ninety-Nine
25	1721065	0001-01-08 B.C.	1720700	One Million Seven Hundred Twenty Thousand Seven Hundred

... snip 1721066-1721112 no major change there...

73	1721113	0001-02-25 B.C.	1720748	One Million Seven Hundred Twenty Thousand Seven Hundred Forty-Eight
74	1721114	0001-02-26 B.C.	1720749	One Million Seven Hundred Twenty Thousand Seven Hundred Forty-Nine
75	1721115	0001-02-27 B.C.	1720750	One Million Seven Hundred Twenty Thousand Seven Hundred Fifty
76	1721116	0001-02-28 B.C.	1720751	One Million Seven Hundred Twenty Thousand Seven Hundred Fifty-One
77	1721117	0000-00-00 0000	0000000	000000000000000000000000000000000000000000000000000000000000000000000000000000
78	1721118	0001-03-01 B.C.	1720752	One Million Seven Hundred Twenty Thousand Seven Hundred Fifty-Two
79	1721119	0001-03-02 B.C.	1720753	One Million Seven Hundred Twenty Thousand Seven Hundred Fifty-Three
80	1721120	0001-03-03 B.C.	1720754	One Million Seven Hundred Twenty Thousand Seven Hundred Fifty-Four

... snip 1721120-1721420 no major change there...

381	1721421	0001-12-29 B.C.	1721055	One Million Seven Hundred Twenty-One Thousand Fifty-Five
382	1721422	0001-12-30 B.C.	1721056	One Million Seven Hundred Twenty-One Thousand Fifty-Six
383	1721423	0001-12-31 B.C.	1721057	One Million Seven Hundred Twenty-One Thousand Fifty-Seven
384	1721424	0001-01-01 A.D.	1721424	One Million Seven Hundred Twenty-One Thousand Four Hundred Twenty-Four
385	1721425	0001-01-02 A.D.	1721425	One Million Seven Hundred Twenty-One Thousand Four Hundred Twenty-Five
386	1721426	0001-01-03 A.D.	1721426	One Million Seven Hundred Twenty-One Thousand Four Hundred Twenty-Six
387	1721427	0001-01-04 A.D.	1721427	One Million Seven Hundred Twenty-One Thousand Four Hundred Twenty-Seven
388	1721428	0001-01-05 A.D.	1721428	One Million Seven Hundred Twenty-One Thousand Four Hundred Twenty-Eight
389	1721429	0001-01-06 A.D.	1721429	One Million Seven Hundred Twenty-One Thousand Four Hundred Twenty-Nine
390	1721430	0001-01-07 A.D.	1721430	One Million Seven Hundred Twenty-One Thousand Four Hundred Thirty
391	1721431	0001-01-08 A.D.	1721431	One Million Seven Hundred Twenty-One Thousand Four Hundred Thirty-One
392	1721432	0001-01-09 A.D.	1721432	One Million Seven Hundred Twenty-One Thousand Four Hundred Thirty-Two
393	1721433	0001-01-10 A.D.	1721433	One Million Seven Hundred Twenty-One Thousand Four Hundred Thirty-Three
394	1721434	0001-01-11 A.D.	1721434	One Million Seven Hundred Twenty-One Thousand Four Hundred Thirty-Four
395	1721435	0001-01-12 A.D.	1721435	One Million Seven Hundred Twenty-One Thousand Four Hundred Thirty-Five
396	1721436	0001-01-13 A.D.	1721436	One Million Seven Hundred Twenty-One Thousand Four Hundred Thirty-Six
397	1721437	0001-01-14 A.D.	1721437	One Million Seven Hundred Twenty-One Thousand Four Hundred Thirty-Seven
398	1721438	0001-01-15 A.D.	1721438	One Million Seven Hundred Twenty-One Thousand Four Hundred Thirty-Eight
399	1721439	0001-01-16 A.D.	1721439	One Million Seven Hundred Twenty-One Thousand Four Hundred Thirty-Nine
400	1721440	0001-01-17 A.D.	1721440	One Million Seven Hundred Twenty-One Thousand Four Hundred Forty

Two strange things to observe. If we look closely we can see that the year 1 BC repeats itself. For example rows 17 and 383 both show the 31st of December 1 BC. I will say 1 BC[1] and 1 BC[2] to distinguish the entries for those years. The second observation is that the 29th of February 1 BC[2] exists and is kind of undefined (0000-00-00 0000 0000000). The 29th of February in year 1 BC[1] does not exist in the data. This is not shown in the sample set, but you can easily modify the query to see for yourself.

In older DB versions the second 1 BC occurrence would be year 0, for which some of the date functions also return the error message

ORA-01841: (full) year must be between -4713 and +9999, and not be 0

I believe there was some half-baked bug fix at Oracle: year 0 was aliased as year 1 BC. This also explains why there seems to be a leap day. Year 0 would have a leap day according to the logical rules (as defined by Pope Gregory XIII). In older database versions any day from the year 0 resulted in 0000-00-00 0000, which also created problems. Fortunately the issue only seems to appear when playing around with the J(ulian) format, and even then only for some very large values. The chance to hit this bug in real-world systems is extremely slim.

some more examples
Date based comparison

select case when 1721057 = 1721423  
            then 'equal' else 'unequal' end as "check" 
from dual;
unequal

As expected, those two numbers are not equal.
Now compare the same numbers, but added to the same date.

select case when to_date('01-01-4712 BC','DD-MM-YYYY AD') + 1721057 
               = to_date('01-01-4712 BC','DD-MM-YYYY AD') + 1721423 
            then 'equal' else 'unequal' end as "check"
from dual;
equal

Now they are equal.

consecutive day check

Negative years within a date literal are BC years. Here follows a short list of small date calculations around the problematic years and days.

select to_char(date '-0001-12-31','DD-Mon-YYYY AD') from dual;
31-Dec-0001 BC
select to_char(date '-0001-12-31' + 1,'DD-Mon-YYYY AD') from dual;
01-Jan-0001 AD

31-Dec-0001 BC + 1 = 01-Jan-0001 AD

select to_char(date '-0001-01-01' - 1,'DD-Mon-YYYY AD') from dual;
31-Dec-0002 BC

01-Jan-0001 BC – 1 = 31-Dec-0002 BC

select to_char(date '-0002-12-31' + 1,'DD-Mon-YYYY AD') from dual;
01-Jan-0001 BC

31-Dec-0002 BC + 1 = 01-Jan-0001 BC

select to_char(date '-0001-02-28' + 1,'DD-Mon-YYYY AD') from dual;
01-Mar-0001 BC

28-Feb-0001 BC + 1 = 01-Mar-0001 BC

Seems about right.

At least here Oracle adheres to the ISO 8601 rule that every date must be consecutive.

And some more edge cases

select to_char(date '-4712-1-1','YYYY A.D.') from dual;
4712 B.C.
select to_char(date '-1000000','YYYY A.D.') from dual;
ORA-01841: (full) year must be between -4713 and +9999, and not be 0

So there is no 1 Million Years B.C. in the Oracle DB.

4) The first ISO-year day for the 1st of January 2021 is in 2019

select to_char(trunc(date '2021-01-01','IYYY'),'YYYY-MM-DD') from dual;

2019-12-30

Now this looks strange! It happens in all supported DB versions.
According to the docs, IYYY is a supported format model for TRUNC. It is specified as “Year containing the calendar week, as defined by the ISO 8601 standard“.
Note that the doc mentions the calendar week. The 1st of January 2021 is in calendar week (IW) 53. That week is still part of ISO year 2020. And the first day (Monday) of the first ISO week of 2020 is the 30th of December 2019.

So it seems as if

TRUNC(:date,'IYYY') 

really does:

TRUNC(
      TRUNC(
            TRUNC(:date,'IW')  -- first day of the iso-week
     ,'YEAR') -- first day of the calendar YEAR regardless of iso
,'IW') -- first day of iso-week

I think this is illogical behaviour and as such should be considered a bug. However, it is correct in terms of ISO 8601, which calls this the “ISO week-numbering year”. So most likely it will not be fixed.

Remember: always truncate using ‘YEAR’ or ‘YYYY’; never use ‘IYYY’, it is just too confusing.

5) lateral SQL injection

Consider the following code. Is it safe?

create or replace procedure date_proc(p_date in date) is
  stmt varchar2(200);
  res  varchar2(100);
begin
  stmt:='select object_name from all_objects where created = ''' || p_date ||'''';
  dbms_output.enable;
  dbms_output.put_line(stmt);
  execute immediate stmt into res;
  dbms_output.put_line('result:'||res);
exception
  when no_data_found then
    dbms_output.put_line('result: no objects found');
end;
/

set serveroutput on
execute date_proc(sysdate);

Dynamic SQL is always at risk of SQL injection. However, since the code makes sure the value is a correct date, it doesn’t seem possible to tamper with it.

This is just some demo code. It has other issues as well (like losing the time portion of the input).

AFAIK David Litchfield was the first to describe this potential security issue. He called it lateral SQL injection. I prefer the name SQL injection by NLS.

See also: http://www.hexatier.com/lateral-sql-injection-in-oracle-database/

Consider the following NLS setting. Any client can modify its own NLS environment.

alter session set nls_date_format = 'DD/MM/YYYY"'' or 1=1"--';

Then run the statement again!

set serveroutput on
execute date_proc(sysdate);
select object_name from all_objects where created = '16/07/2018 ' or 1=1--'
ORA-01422: exact fetch returns more than requested number of rows

Ok, this happens because the code doesn’t expect multiple rows to be returned. We can modify that.

alter session set nls_date_format = 'DD/MM/YYYY"'' or 1=1 and rownum=1"--';

Fortunately the possibilities via NLS formats are limited. For example, the injected code can only be very short. When trying more complex modifications we get

ORA-01801: date format is too long for internal buffer

However sometimes this can potentially be used as a first step for more serious hacking.

A similar exploit is possible using the NUMBER datatype (via nls_numeric_characters).

The more interesting question is how to correct code like this.
One option is to use bind variables.

create or replace procedure date_proc(p_date in date) is
  stmt varchar2(200);
  res  varchar2(100);
begin
  stmt:='select object_name from all_objects where created = :dat';
  dbms_output.enable;
  dbms_output.put_line(stmt);
  execute immediate stmt into res using p_date ;
  dbms_output.put_line('result:'||res);
exception
  when no_data_found then
    dbms_output.put_line('result: no objects found');
end;
/

Other options include validating any concatenated input, for example by using dbms_assert.

Conclusion

Calendars are difficult. Oracle implemented calendar functionality very thoroughly.
It is the duty of a developer/DBA to understand the complexities around calendars, time zones, time conversions and Time Machines.

If you found any movie references – keep them!


plsql collection lookup with multiple keys

problem scenario

A recent question on OTN asked how to access a plsql collection using multiple keys.

It is a typical scenario to build a second collection in order to access the main data in the first collection. The second collection fulfills the same role as an index on a table: it allows fast access to the main data record. I usually call the second collection an “index collection”.

Here is an article by Steven Feuerstein who explains the concept of index collections in more detail: https://blogs.oracle.com/oraclemagazine/on-the-pga-and-indexing-collections

But the forum question was not how to do a simple key => value lookup, but instead how to use two keys (based upon record set values) to access the main collection.

solution

There are two general ways.
Combine the keys into a single key, or build a nested collection. I will show a quick example of both ways.

Way 1) Combined key

We can consolidate the two keys into one single key, typically using some delimiter, and then use that combined key for the index collection.

combinedKey := key1||':'||key2;

Of course we need to make sure the delimiter is some value that does not exist in any of the keys.

Example

set serveroutput on
declare
  cursor c is (select trunc((level+2)/3) lvkey, chr(ASCII('A')+mod(level*2,3)) letter, round(dbms_random.value(1,100)) val 
               from dual 
               connect by level <=10
                );
  type tabdata_t is table of c%rowtype index by binary_integer;   
  tabdata tabdata_t;
  
  type keylookup_t is table of binary_integer index by varchar2(100);
  keylookup keylookup_t;
begin
  -- load data
  open c;
  fetch c 
  bulk collect into tabdata; -- notice no limit clause here. For larger record sets you need to use a loop and LIMIT!
  close c;
  
  -- index data
  <<build_index>>
  for i in 1..tabdata.count loop
    keylookup(to_char(tabdata(i).lvKey)||':'||tabdata(i).letter) := i;
  end loop build_index;
  -- index is now complete. 
 
   -- Test the index first
  dbms_output.put_line('Index 1B=>'|| to_char(keylookup('1:B')));
  dbms_output.put_line('Index 3A=>'|| to_char(keylookup('3:A')));

  -- now fetch the data using the index
  dbms_output.put_line('Data 1B=>'|| tabdata(keylookup('1:B')).val);
  dbms_output.put_line('Data 3A=>'|| tabdata(keylookup('3:A')).val);
end;
/

Output

Index 1B=>2
Index 3A=>9
Data 1B=>32
Data 3A=>67

PL/SQL procedure successfully completed.

Way 2) collection of collection

A collection can be part of another collection. Nesting collections means we could do a lookup using two (or more) key values where each key is used for one collection.

lookup(key1) => col2
this returns a collection (e.g. col2). We can access the elements of the collection using the second key.

col2(key2) => element

or the short form:
lookup(key1)(key2) => element

For readability we could add a record layer in between, but that is not necessary.
lookup(key1).list(key2) => element

Example
Notice the definition of the index collection is in line 14 with the matching type definitions in line 11 and 12.

set serveroutput on
declare
  cursor c is (select trunc((level+2)/3) lvkey, chr(ASCII('A')+mod(level*2,3)) letter, round(dbms_random.value(1,100)) val 
               from dual 
               connect by level <=10
                );
  type tabdata_t is table of c%rowtype index by binary_integer;   
  
  tabdata tabdata_t;
  
  type letterlookup_t is table of binary_integer index by varchar2(10);
  type keylookup_t is table of letterlookup_t index by binary_integer; -- number not allowed!
  
  keylookup keylookup_t;
  empty_key letterlookup_t;
begin
  -- load data
  open c;
  fetch c 
  bulk collect into tabdata; -- notice no limit clause here. For larger record sets you need to use a loop and LIMIT!
  close c;
  
  -- index data
  <<build_index>>
  for i in 1..tabdata.count loop
    if keylookup.exists(tabdata(i).lvKey) then
      if keylookup(tabdata(i).lvKey).exists(tabdata(i).letter) then 
        -- same key twice?
        -- maybe add the values, maybe raise an error
        raise dup_val_on_index;
      else  
        dbms_output.put_line('build index KEY='||tabdata(i).lvKey||',+letter='||tabdata(i).letter);
        keylookup(tabdata(i).lvKey)(tabdata(i).letter) := i;
      end if;      
    else -- key not in index yet
      dbms_output.put_line('build index +KEY='||tabdata(i).lvKey||',+letter='||tabdata(i).letter);
      keylookup(tabdata(i).lvKey) := empty_key;
      keylookup(tabdata(i).lvKey)(tabdata(i).letter) := i;
    end if;      
      
  end loop build_index;
  -- index is now complete. 
 
  -- Let's access the data using some combinations of keyLv and letters
  -- Test the index first
  dbms_output.put_line('Index 1B=>'|| to_char(keylookup(1)('B')));
  dbms_output.put_line('Index 3A=>'|| to_char(keylookup(3)('A')));
  --dbms_output.put_line('1F='|| keylookup(1)('F')); -- this will raise NO_DATA_FOUND
  
  -- now fetch the data using the index
  dbms_output.put_line('Data 1B=>'|| tabdata(keylookup(1)('B')).val);
  dbms_output.put_line('Data 3A=>'|| tabdata(keylookup(3)('A')).val);
  
end;
/

Output
build index +KEY=1,+letter=C
build index KEY=1,+letter=B
build index KEY=1,+letter=A
build index +KEY=2,+letter=C
build index KEY=2,+letter=B
build index KEY=2,+letter=A
build index +KEY=3,+letter=C
build index KEY=3,+letter=B
build index KEY=3,+letter=A
build index +KEY=4,+letter=C
Index 1B=>2
Index 3A=>9
Data 1B=>62
Data 3A=>34

PL/SQL procedure successfully completed.

If we try to access a collection element that does not exist, we get an error (usually NO_DATA_FOUND). Since these index collections are always sparse, this is something to keep in mind. If you are not sure whether the key combination is already indexed, either check for existence or react to the NO_DATA_FOUND.

comparison

It is not easy to compare the two approaches. The first way looks slightly less complex. It also depends on how familiar other developers are with collections, and especially with nested collections. For many, the double parenthesis syntax “myCol()()” is a little complicated at the beginning.

For very complex scenarios the first way might be the better one. It depends on data distribution (the more sparsely populated the key combinations are, the better this way works) and on how many keys (= dimensions) we have.

I once measured the performance in a system where we needed 5 dimensions (5 different keys) to access some data. The combined key lookup was faster than building a complex collection of collections of collections of collections. But in the end the additional time to concatenate the key values was also a large performance burden.

So in my specific case
lookup('A:B:C:D:E') >>> lookup('A')('B')('C')('D')('E')

I do not think this performance experience is representative.

conclusion

It is possible to use two keys in an index collection to look up data in the main collection. Nested collections are a tool that every plsql developer should know about. Only when we know our tools can we decide when to use them and when not.

dbms_scheduler 12c – run EXTERNAL_SCRIPT

Introduction

With 12c we have several new job types for our scheduler jobs. One of them is EXTERNAL_SCRIPT. The other new job types are SQL_SCRIPT and BACKUP_SCRIPT.

From the Oracle 12.2 PL/SQL Packages and Types Reference:

‘EXTERNAL_SCRIPT’

This specifies that the job is an external script that uses the command shell of the computer running the job. For Windows this is cmd.exe and for UNIX based systems the sh shell, unless a different interpreter is specified by prefixing the first line of the script with #!.

In the past we could run an external script using the EXECUTABLE job type. This type is still available.

‘EXECUTABLE’

This specifies that the job is going to be run outside the database using an external executable. External jobs are anything that can be executed from the command line of the operating system. Anydata arguments are not supported with a job or program type of EXECUTABLE. The job owner must have the CREATE EXTERNAL JOB system privilege before the job can be enabled or run.

In general both options can do the same thing: execute something on the host OS.
So why should we change anything? Is there a difference?

Let’s find out.

schedule an EXTERNAL_SCRIPT

setup credentials first

We can create credentials using a dbms package or via SQL Developer.

begin
  dbms_credential.create_credential(credential_name => 'ORACLE_OS_CREDS',
                                    username        => 'oracle',
                                    password        => 'oracle',
                                    comments        => 'run scripts using oracle OS account');
end;
/

In the developer VM box the password is always oracle, which is why I included it here. You need to use your own correct password.

For real-world environments I suggest creating a specific OS account that is only allowed to execute the script and to do whatever needs to be done for this specific task, but not more. This account might need the “Log On As Batch Job” right under Windows (support note #2065024.1).

For demonstration purposes I stay here with the oracle credentials.

setup a scheduled job to run a linux script

Of course this works under Windows too, but I tested it only using Oracle Linux.

Here I set up 4 slightly different examples of how to run an EXTERNAL_SCRIPT job. After that we check and compare the output.

These are our four slightly different test scenarios.

  1. run a simple bash script.
  2. run a script, that has an error
  3. run a script with an error, but an exitcode=0
  4. same as 3., but using the undocumented FAIL_ON_SCRIPT_ERROR attribute

To test what happens if the script itself has an error, I added a change directory command pointing to a non-existent directory.
This command will result in an error.

cd /abcd/efgh/ijk

Here is the code to schedule the 4 jobs. Each job has a slightly different name.

-- First test a script that does not produce an error
declare
  v_jobname     varchar2(200);
  v_good_script clob;
begin
   v_jobname := upper('DEMO_SCHEDULED_EXTERNAL_SCRIPT_NO_ERROR');

  -- the following line breaks are important.
  -- Do NOT remove them, they are part of the linux script.
  v_good_script :=  '#!/bin/bash
echo "Job ok!"';

   dbms_scheduler.create_job(job_name   => v_jobname,
                              job_type   => 'EXTERNAL_SCRIPT',
                              job_action => v_good_script,
                              credential_name => 'ORACLE_OS_CREDS',
                              enabled    => false,
                              auto_drop  => false
                              );
   -- run the job
   dbms_scheduler.enable(v_jobname);         

end;
/

PL/SQL procedure successfully completed.

-- Now test a script that does produce an error
declare
  v_jobname    varchar2(200);
  v_bad_script clob;
begin
   v_jobname := upper('DEMO_SCHEDULED_EXTERNAL_SCRIPT_WITH_ERROR');

   v_bad_script :=  '#!/bin/bash
cd /abcd/efgh/ijk';

   dbms_scheduler.create_job(job_name   => v_jobname,
                              job_type   => 'EXTERNAL_SCRIPT',
                              job_action => v_bad_script,
                              credential_name => 'ORACLE_OS_CREDS',
                              enabled    => false,
                              auto_drop  => false
                              );
   dbms_scheduler.enable(v_jobname);
end;
/

PL/SQL procedure successfully completed.

-- Now test a script that does produce an error but uses exit 0
declare
  v_jobname    varchar2(200);
  v_bad_script clob;
begin
   v_jobname := upper('DEMO_SCHEDULED_EXTERNAL_SCRIPT_WITH_ERROR_EXIT0');

   v_bad_script :=  '#!/bin/bash
cd /abcd/efgh/ijk
exit 0';

   dbms_scheduler.create_job(job_name   => v_jobname,
                              job_type   => 'EXTERNAL_SCRIPT',
                              job_action => v_bad_script,
                              credential_name => 'ORACLE_OS_CREDS',
                              enabled    => false,
                              auto_drop  => false
                              );

   dbms_scheduler.enable(v_jobname);
end;
/

PL/SQL procedure successfully completed.

-- run script using attribute FAIL_ON_SCRIPT_ERROR
declare
  v_jobname    varchar2(200);
  v_bad_script clob;
begin
   v_jobname := upper('DEMO_SCHEDULED_EXTERNAL_SCRIPT_WITH_ERROR_FAILONERROR');

   v_bad_script :=  '#!/bin/bash
cd /abcd/efgh/ijk
exit 0';

   dbms_scheduler.create_job(job_name   => v_jobname,
                              job_type   => 'EXTERNAL_SCRIPT',
                              job_action => v_bad_script,
                              credential_name => 'ORACLE_OS_CREDS',
                              enabled    => false,
                              auto_drop  => false
                              );
  -- Make sure script errors result in a job error and are noticed.
   dbms_scheduler.set_attribute(
            name => v_jobname,
            attribute => 'FAIL_ON_SCRIPT_ERROR',
            value => true);                              

   -- run the job
   dbms_scheduler.enable(v_jobname);
end;
/

PL/SQL procedure successfully completed.

Note that all anonymous blocks executed successfully. This only means that we could create and start a scheduled job. It does not tell us what the result of the job was.

To find the job result, we need to check the data dictionary view ALL_SCHEDULER_JOB_RUN_DETAILS. The information is in there, but only AFTER the job has finished.

Results

The data dictionary gives us information about the result of the scheduler runs in the view ALL_SCHEDULER_JOB_RUN_DETAILS.

select replace(job_name,'DEMO_SCHEDULED_EXTERNAL_SCRIPT') as job_name,
       log_date, status, output, error#, errors, additional_info
from all_scheduler_job_run_details
where job_name like 'DEMO_SCHEDULED_EXTERNAL_SCRIPT%'
--and log_date >= sysdate - interval '5' minute
;

For presentation purposes I flipped rows and columns. So the next picture shows the columns from the DD view as rows.
[Image: run details query result, transposed]

Let’s go through the results step by step.

The first testcase did not have an error: the status of the run = SUCCEEDED. The output column also shows the stdout output, which is nice. So there is no need to spool the output into an extra file just to be able to see later what happened. The same column is also used for jobs of type SQL_SCRIPT to return dbms_output.
 
The second testcase had an error. The job status correctly reported that the script errored with exit code = 1 (column error#). And the errors column shows what kind of error happened.

"/tmp/job_2078996_3568888_script: line 2: cd: /abcd/efgh/ijk: No such file or directory
"

We also see that the script itself was created as a file in the /tmp folder, using job and log id for the file name “job_2078996_3568888_script”.
Also note that the additional_info column says

"EXTERNAL_LOG_ID="job_2078995_3568886",
ORA-27369: job of type EXECUTABLE failed with exit code: Operation not permitted
"

This is slightly misleading, since the job_type was EXTERNAL_SCRIPT and not EXECUTABLE. And “Operation not permitted” could lead us to suspect some issue with privileges (might be the case when “cd” doesn’t work) or with the credentials (definitely not the case here).

The third testcase had an error in the script, however it finished with exit 0 (= success). It makes sense that in this case the job run status is also marked as SUCCEEDED. However, the error “no such file or directory” can still be found in the errors column.

The fourth testcase uses a new feature. I’m not sure if it is already there in the 12.1 DB version; all my tests were done under 12.2. The all_scheduler_jobs view has a new column FAIL_ON_SCRIPT_ERROR. It defaults to FALSE. We can set it as an attribute for the scheduled job.

   dbms_scheduler.set_attribute(
            name => v_jobname,
            attribute => 'FAIL_ON_SCRIPT_ERROR',
            value => true);   

This is currently undocumented, but it works, as the demo above shows. I believe it is a documentation bug: the new column in the view is documented, but FAIL_ON_SCRIPT_ERROR is not in the list of allowed attributes.

The result is that even with EXIT=0, the job status goes to FAILED. We also see a different error number: 27382 instead of 1. 27382 seems to be the ORA error number that we also see in the additional_info column. That column now says “ORA-27382: job type EXTERNAL_SCRIPT has errors in the job action”. Not a major difference, but slightly better additional_info when we use this attribute.

Quite some interesting behaviour. So if we want to run a script that always exits with 0, we can still make our job go to FAILED if there are any errors inside that script.

I would guess that we find the same behaviour for the other new job types SQL_SCRIPT and BACKUP_SCRIPT. I haven’t tested that yet.

Comparison between EXTERNAL_SCRIPT and EXECUTABLE

Executable is the old way.

It requires configuring and starting an external agent. This is often a fairly complex task, involving configuration of the listener, tnsnames.ora, extproc.ora and some other parts, which are usually not under the control of a developer. Additionally, all external jobs that are executed using this agent run with the same OS privileges. Using credentials gives us a little more control.

Starting with 12.1, such external jobs can alternatively run with credentials, same as I already showed for running external scripts. So the credentials argument is only a half-baked one.

It is also difficult to track down an error in case something goes wrong. This is where EXTERNAL_SCRIPT seems much better than EXECUTABLE. I have to admit I only ran some very short tests on this, but I never found a disadvantage of running a scheduled job as EXTERNAL_SCRIPT instead of EXECUTABLE.

So if you have a choice, go with the newer option.

Integrate it into plsql including status check

How do we set up a plsql procedure that is able to run an external script, but also raises an exception in case the script runs into errors?

Let’s assume the external script is something like this:

# set environment
export PATH=/usr/local/bin:$PATH
export ORACLE_SID=XE
export ORAENV_ASK=NO
. /usr/local/bin/oraenv

CLASSPATH=fonts
#CLASSPATH=$CLASSPATH:$APEX_HOME/utilities
CLASSPATH=$CLASSPATH:.

export CLASSPATH

cd /opt/jasper/report1

java -jar runJasperReport.jar
exit

So this executes some jar file. I don’t want to wait until the java logic is finished, but I want to be informed if something basic goes wrong, like the jar file not being found.

Here is an example that I used in some APEX application. After the job is started, the code hangs around for a couple of seconds and checks if anything surprising happened. The code to start the job itself is not included, but you can assume that it is done in the same packaged procedure. The script is in the v_script variable.

Do not copy it 1:1, but understand it and adapt it to your needs.

...
    v_jobname             varchar2(128);
    r_job_details         all_scheduler_job_run_details%rowtype;
    r_job                 all_scheduler_jobs%rowtype;
    c_max_check_job_tries constant binary_integer :=3;
    v_message             varchar2(1000);
    v_script              clob;
begin
...
   -- check if the scheduled job did successfully start.
   -- In case some error happened after starting the job, we might not notice it, since the job runs in the background.
   -- e.g. ORA-27369: job of type EXECUTABLE failed with exit code: No such file or directory
   -- In such a case check the scheduler data dictionary almost immediately after job creation.
   apex_util.pause(p_seconds => 0.5);
   <<get_job_info>>
   for i in 1..c_max_check_job_tries loop 
      -- constant is set to 3, so max. wait time = 3.5 seconds.
   
        -- read info about scheduled job
        -- started/running jobs are in scheduler_jobs, finished jobs including results are in scheduler_job_run_details
        begin
          select * into r_job from all_scheduler_jobs where job_name = upper(v_jobname);
          logger.trace(p_message=>'Job "'||v_jobname||'" created with state='||r_job.state );
       exception
         when no_data_found then 
           -- Job might have stopped already, check run details!
           r_job.state := 'NOT FOUND';
       end;    

       if r_job.state in ('RUNNING','SCHEDULED') then
         -- RUNNING+SCHEDULED => Looks ok, the job runs; just to make sure, wait for another few seconds
         apex_util.pause(p_seconds =>  1);
         
       else
         begin
           -- for any other state check details
           select * into r_job_details from all_scheduler_job_run_details where job_name = upper(v_jobname);
           logger.trace(p_message=>'Job Details "'||v_jobname||'" with status='||r_job_details.status );
           if r_job_details.status= 'FAILED' or r_job.state= 'FAILED'  then
                v_message := 'Job "'|| v_jobname||'" with Error!';
                if r_job_details.additional_info like '%ORA-27369%' or r_job_details.additional_info like '%ORA-27382%' then
                  v_message := v_message ||' Jobaction='||v_script;
                end if;  
                pk_logging.pr_logError(p_message=>v_message);
                apex_error.add_error(   p_message => v_message,
                                        p_additional_info => r_job_details.additional_info,
                                        p_display_location => apex_error.c_inline_in_notification --apex_error.c_on_error_page
                                        );
                -- step out of loop and raise an error using the OS error message                
                Raise_application_error(-20001, r_job_details.errors);
           end if;      
         exception
           when no_data_found then
             -- Job not started yet or just about to finish...
             -- consider waiting a few seconds first, then raise an error if the job still is not there.
             -- last try?
             if i=c_max_check_job_tries then
                logger.error(p_message=>'Job "'||v_jobname||'" was not started!');
                Raise_application_error(-20001, 'Warning! Job "'|| v_jobname||'" wasn''t started (yet)! Check application log!');
             else
                -- wait 1 second until job is hopefully created
                apex_util.pause(p_seconds =>  1);
             end if;
         end;
       end if;   
   end loop get_job_info;   
   if r_job.state = 'SCHEDULED' then
      -- still scheduled? inform user
      Raise_application_error(-20001, 'Warning! Job "'|| v_jobname||'" needs longer than expected to start. Please monitor closely and inform the administrator!');
   end if;  
...

Security considerations

Running external jobs is always something where we need to take extra care, so that we do not punch holes into our security defense system.

I carefully watch out for two major security risks:
1) If we add something dynamically to the script that we are executing, like an extra parameter, make sure to sanitize all the inputs. Otherwise we could get some kind of injection problem.

2) The agent/account that runs our script should only get the least needed privileges. So it should NOT run under user oracle, like I did in the demo.
Create a separate account for that. Name the account something that logically points to the task it is supposed to do, for example as sketched below.

conclusion

The new job_type EXTERNAL_SCRIPT is useful.

Some of the hurdles that developers face when trying to run a host command are lowered. The script itself does not need to be deployed on the database server.

Reacting to errors in the script is possible, but we need to check the correct columns and use the proper settings.

cleanup demo jobs

-- cleanup
-- remove all the jobs
execute dbms_scheduler.drop_job(job_name => 'DEMO_SCHEDULED_EXTERNAL_SCRIPT_NO_ERROR');
execute dbms_scheduler.drop_job(job_name => 'DEMO_SCHEDULED_EXTERNAL_SCRIPT_WITH_ERROR');
execute dbms_scheduler.drop_job(job_name => 'DEMO_SCHEDULED_EXTERNAL_SCRIPT_WITH_ERROR_EXIT0');
execute dbms_scheduler.drop_job(job_name => 'DEMO_SCHEDULED_EXTERNAL_SCRIPT_WITH_ERROR_FAILONERROR');

Fetch module name from line number in package with pl/scope

Here is a small statement I am using to find the name of a submodule based upon error stack data (utl_call_stack.error_line(x) or dbms_utility.format_error_stack).

The function IDENTIFY_MODULE helps to diagnose errors. If an error happens in plsql the error stack returns only the name of a package (=unit) and the line number. Using this line number we can look up the pl/scope information and make a solid guess about the module name. For various reasons this is only a good guess, not a guarantee (see problem section below).

solution

The following code snippets will only work if the relevant packages were compiled with PLSCOPE_SETTINGS=’IDENTIFIERS:ALL’.

ALTER SESSION SET plscope_settings='identifiers:all';

SQL statement

fetch the nearest procedure or function

select * --type, name, signature, line, usage_id, usage_context_id
from all_identifiers
where object_name = :PACKAGE_NAME
and object_type= 'PACKAGE BODY'
and usage in ( 'DEFINITION','DECLARATION' )
and type in ('PROCEDURE','FUNCTION')
and line <= :LINE_NUMBER
order by line desc
fetch first 1 row only
;

This will find procedures and functions in our code that were declared just before the line of error.

There are many cases where we get some false positives.

But it is a good start.

plsql enrichment

Using a little bit of plsql we can make this logic more robust. And even get data about submodule hierarchies.

Put this function in the instrumentation package of your choice (for example logger) and then use it to improve log information. How this is done in detail is out of scope for this post.

create or replace function identify_module (p_owner in varchar2, p_unit in varchar2, p_line in number) return varchar2
is
  /**************************************************************
  /* Name          :  identify_module
  **************************************************************
  * description    :  uses PL/Scope to get additional info about the module name of a source code line
  *  This only works reliably for code compiled with optimization level = 1.
  *  Higher optimization levels might move code lines. The line reported in error and backtrace stacks (run time) can differ from the lines stored in PL/Scope or user_source (compile time).
  *
  * @author: Sven Weller, syntegris information solutions GmbH
  **************************************************************
  * parameter      
  * @param  : p_owner = schema name of unit
  * @param  : p_unit = package name
  * @param  : p_line = line of code for which we would like to see the name of the module
  * @return : concatenated submodule names  
  **************************************************************/
 cursor c_search_by_line (cv_owner in varchar2, cv_unit in varchar2, cv_line in number)
  is
    select /*+ first_rows(1) */ i.type, i.name, i.line, i.usage_id, i.usage_context_id, i.usage, i.signature
    from ALL_IDENTIFIERS i
    where i.owner = cv_owner
    and i.object_name = cv_unit
    and i.object_type = 'PACKAGE BODY'
    and i.line <= cv_line 
    -- context must be in same package body
    and i.usage_context_id in (select i2.usage_id from ALL_IDENTIFIERS i2 where i2.owner = cv_owner and i2.object_name = cv_unit and i2.object_type = 'PACKAGE BODY')
    order by line desc, usage_id asc
    ;
  
  cursor c_search_by_usage (cv_owner in varchar2, cv_unit in varchar2, cv_usage_id in number)
  is
    select /*+ first_rows(1) */ type, name, line, usage_id, usage_context_id, usage, signature
    from ALL_IDENTIFIERS
    where owner = cv_owner
    and object_name = cv_unit
    and OBJECT_TYPE = 'PACKAGE BODY'
    and usage_id = cv_usage_id 
    order by decode (usage, 'DEFINITION',1, 'DECLARATION', 2, 3), line desc,  usage_id asc
    ;

  r_result_byLine  c_search_by_line%rowtype;
  r_result_byUsage c_search_by_usage%rowtype;
  r_last_result    c_search_by_usage%rowtype;
  v_owner          all_identifiers.owner%type;
  v_modul_name     all_identifiers.name%type;
  v_first_type     all_identifiers.type%type;
  v_max_hierarchy  binary_integer := 5;
begin
  -- If owner is missing, use the current schema
  v_owner := coalesce(p_owner,sys_context('userenv','current_schema'));
  
  -- find the closest line and check its context.
  open c_search_by_line(v_owner, p_unit, p_line);
  fetch c_search_by_line into r_result_byLine;
  close c_search_by_line;

  if r_result_byLine.usage_context_id = 1 then
    -- we seem to be already in main package body.
    -- this can be either a problem during a parameter call 
    -- or the error happened in the initialization part of the package
    case r_result_byLine.usage 
      when 'DEFINITION' then
        v_modul_name :=r_result_byLine.type||' '||p_unit||'.'||r_result_byLine.name;
      when 'DECLARATION' then
        v_modul_name :='declaration of '||r_result_byLine.type||' '||p_unit||'.'||r_result_byLine.name;
      else 
        v_modul_name :='body of '||p_unit;
    end case;
    
  else    
      r_result_byUsage := r_result_byLine;
      --r_result_byUsage.usage_context_id := r_result_byLine.usage_id;
      
      -- find module names
      <<parent_modules>>
      loop 
        if r_result_byUsage.usage in ('DEFINITION', 'DECLARATION')  
           and r_result_byUsage.type in ('PROCEDURE','FUNCTION')
           and (r_last_result.signature != r_result_byUsage.signature or r_last_result.signature is null)
           then
             -- concat multiple submodule names
             v_modul_name := r_result_byUsage.name
                            ||case when v_modul_name is not null then '.'||v_modul_name end;
             v_first_type := coalesce(v_first_type, r_result_byUsage.type);
             -- remember result to compare if we get duplicate entries because of declaration->definition
             r_last_result := r_result_byUsage;
        end if;
        -- stop when package body level is reached
        exit when r_result_byUsage.usage_context_id in (0, 1) or v_max_hierarchy = 0;


        -- it seems to be a submodule, so do an additional call and fetch also the parent module
        -- use v_owner (the coalesced owner); p_owner might be null
        open c_search_by_usage(v_owner, p_unit, r_result_byUsage.usage_context_id);
        fetch c_search_by_usage into r_result_byUsage;
        close c_search_by_usage;
    
        
        -- safety counter to prevent endless loops
        v_max_hierarchy := v_max_hierarchy - 1;
      end loop parent_modules;  
    
      -- add info about type (FUNCTION/PROCEDURE)
      if v_modul_name is not null then
        v_modul_name :=v_first_type||' '||p_unit||'.'||v_modul_name;
      --else   
      --  v_modul_name := '--no submodule found--';
      end if;
  end if;  
  return v_modul_name;
exception
  when no_data_found then
    return null;
end identify_module;
/

Example

Check the result for each line of some test package.
You can run this example yourself in LiveSQL .

The function had to be modified slightly to use USER_IDENTIFIERS instead of ALL_IDENTIFIERS to be able to run in LiveSQL.

select line, identify_module(user, name, line) , text 
from user_source
where name='TEST_PACKAGE_FUNC_PROC'
and type = 'PACKAGE BODY';

Result

LINE	IDENTIFY_MODULE(USER,NAME,LINE)	TEXT
1		"package body Test_Package_Func_Proc "
2		"as "
3	declaration of VARIABLE TEST_PACKAGE_FUNC_PROC.GLOBAL_VAR	" global_var number := 0;"
4	FUNCTION TEST_PACKAGE_FUNC_PROC.TEST_FUNC	" function test_func (in_val in number) return number  "
5	FUNCTION TEST_PACKAGE_FUNC_PROC.TEST_FUNC	" is "
6	FUNCTION TEST_PACKAGE_FUNC_PROC.TEST_FUNC	" begin "
7	FUNCTION TEST_PACKAGE_FUNC_PROC.TEST_FUNC	" return in_val; "
8	FUNCTION TEST_PACKAGE_FUNC_PROC.TEST_FUNC	" exception "
9	FUNCTION TEST_PACKAGE_FUNC_PROC.TEST_FUNC	"  when others then "
10	FUNCTION TEST_PACKAGE_FUNC_PROC.TEST_FUNC	"    RAISE; "
11	FUNCTION TEST_PACKAGE_FUNC_PROC.TEST_FUNC	" end; "
12	FUNCTION TEST_PACKAGE_FUNC_PROC.TEST_FUNC	""
13	PROCEDURE TEST_PACKAGE_FUNC_PROC.TEST_PROC	" procedure test_proc  (in_val in number) "
14	PROCEDURE TEST_PACKAGE_FUNC_PROC.TEST_PROC	" is "
15	PROCEDURE TEST_PACKAGE_FUNC_PROC.TEST_PROC.SUBMODULE	"   procedure submodule( in_val in number) is"
16	PROCEDURE TEST_PACKAGE_FUNC_PROC.TEST_PROC.SUBMODULE	"   begin"
17	PROCEDURE TEST_PACKAGE_FUNC_PROC.TEST_PROC.SUBMODULE	"      dbms_output.put_line (in_val); "
18	PROCEDURE TEST_PACKAGE_FUNC_PROC.TEST_PROC.SUBMODULE	"   end;  "
19	PROCEDURE TEST_PACKAGE_FUNC_PROC.TEST_PROC.SUBMODULE	" begin "
20	PROCEDURE TEST_PACKAGE_FUNC_PROC.TEST_PROC	"   submodule(in_val); "
21	PROCEDURE TEST_PACKAGE_FUNC_PROC.TEST_PROC	" exception "
22	PROCEDURE TEST_PACKAGE_FUNC_PROC.TEST_PROC	"  when others then "
23	PROCEDURE TEST_PACKAGE_FUNC_PROC.TEST_PROC	"    RAISE; "
24	PROCEDURE TEST_PACKAGE_FUNC_PROC.TEST_PROC	" end; "
25	PROCEDURE TEST_PACKAGE_FUNC_PROC.TEST_PROC	"begin"
26	body of TEST_PACKAGE_FUNC_PROC	"  global_var := 1;"
27	body of TEST_PACKAGE_FUNC_PROC	"end; "

The test package was copied and modified based upon Hemant K Chitale's “Function and Procedure in Package” LiveSQL demo.

Some problems

  • Currently only meant for package bodies
  • Compiler optimization can move code. That means the line number of an error at runtime is not the same line number as during compile time. PL/Scope only gives us compile time information. So would all_source.
    The only (bad) workaround is to compile with optimization level 1 and recreate the error, so that the correct line of error is shown.
  • Errors that happen in the declaration part of a package can not always be resolved. We might get a false positive for the previously declared object.
  • Does not inform when package body is wrapped.

Further readings

10 more SQL and PLSQL things you might not know

Since my previous blog posts about 10 Oracle SQL features you might not know and “10 Oracle plsql things you probably didn’t know
raised quite some interest, I decided to add some more unknown features. Read carefully – I started to write this blog post on 1st of April, so there is an easter egg hidden somewhere in this post. If you are not sure, always test and verify for yourself.

10. NVL can also handle unusual datatypes

NVL can also handle unusual datatypes like BOOLEAN, COLLECTIONS and abstract data types (ADTs).

Example

set serveroutput on
declare
  b boolean;
begin
  if nvl(b,true) then
     dbms_output.put_line('TRUE');
  else
     dbms_output.put_line('FALSE');
  end if;
end;
/
TRUE

This is part of the sys.standard implementation.
But since boolean is only supported in PLSQL we can’t do much with that in SQL.

9. secret column name “rowlimit_$$_rownumber”

We shouldn’t use “rowlimit_$$_rownumber” or “rowlimit_$$_total” as a column name.

Here is what could happen:

select dummy as "rowlimit_$$_rownumber"
from dual
fetch first 3 rows only;

ERROR at line 1:

ORA-00918: column ambiguously defined

The reason for this can be found when we use the new 12c functionality to expand a query. Typically this is used for views, but it can also be applied to some other features – here, for the logic that does the LIMIT action.

Special thanks to OTN forum members Solomon Yakobson and padders who pointed at the issue in this thread.

What happens behind the scene is that the limit clause “fetch first 3 rows” is changed (expanded) into a subquery that adds a second column “rowlimit_$$_rownumber” to the query. This column uses the row_number analytic function and is later used to filter upon the relevant rows of the LIMIT clause. The error happens because we now have two columns with the same name.

And here is one way to see the expanded code.

set linesize 1000
set longc 1000
set long 1000
variable c clob
exec dbms_utility.expand_sql_text('select dummy from dual fetch first 3 rows only',:c)
print c
SELECT "A1"."DUMMY" "DUMMY"
FROM (  SELECT
            "A2"."DUMMY" "DUMMY",
            ROW_NUMBER() OVER (
                ORDER BY NULL
            ) "rowlimit_$$_rownumber"
        FROM "SYS"."DUAL" "A2"
    ) "A1"
WHERE "A1"."rowlimit_$$_rownumber"

“rowlimit_$$_total” has the same problem. It appears when we use PERCENT in the limit clause.

Example

select dummy as "rowlimit_$$_total"
from dual
fetch first 3 percent rows only;

ORA-00918: column ambiguously defined

And if we expand the working query we see the reason.

SELECT "A1"."DUMMY" "DUMMY"
FROM ( SELECT
            "A2"."DUMMY" "DUMMY",
            ROW_NUMBER() OVER(
                ORDER BY
                    NULL
            ) "rowlimit_$$_rownumber",
            COUNT(*) OVER() "rowlimit_$$_total"
        FROM "SYS"."DUAL" "A2"
    ) "A1"
WHERE "A1"."rowlimit_$$_rownumber"
        <= ceil("A1"."rowlimit_$$_total" * 3 / 100)

The PERCENT keyword requires a total count, and this total count is then used as a filter.

Fortunately the chance that we name our columns like this by accident is very, very low.

8. Do you know all plsql pragmas?

Pragmas are instructions for the plsql compiler. There are many of them. Here is the list of pragmas I know or heard about. Not all of them are documented. Not all of them can be used by developers. Several can only be used as sys and come with additional restrictions, so they are only for Oracle internal purposes. Still they pique our curiosity.

The documented and not deprecated pragmas are in bold. At least we should know all of those.

  • autonomous_transaction

    One of the most misunderstood things in plsql.

    Defines that the plsql logic runs independently from the main transaction.

    Typical use case: To log away an error, even if the main transaction is rolled back (a combined sketch of several documented pragmas follows after this list).

    It is not a workaround for mutating table errors!

  • builtin

    Defines SQL builtin functions and operators.

    This is an internal pragma for usage in package sys.standard.

  • coverage

    This is a new pragma in 12.2.

    from the 12.2. doc: COVERAGE pragma

    The COVERAGE pragma marks PL/SQL source code to indicate that the code may not be feasibly tested for coverage. The pragma marks a specific code section. Marking infeasible code improves the quality of coverage metrics used to assess how much testing has been achieved.

  • deprecate

    Adds a compile time warning if the object is referenced. The message of the warning can be influenced. This new pragma was introduced in 12.2. We can add it to code that should be replaced. Useful in environments where multiple teams of developers call/reference the same code.

  • exception_init

    Combines a plsql exception with an exception number.

  • fipsflag

    Another internal pragma that is used in package sys.standard.

    I guess that the FIPSFLAG pragma has something to do with FIPS from NIST.

    From https://www.howtogeek.com/245859/why-you-shouldnt-enable-fips-compliant-encryption-on-windows/

    FIPS stands for “Federal Information Processing Standards.” It’s a set of government standards that define how certain things are used in the government–for example, encryption algorithms. FIPS defines certain specific encryption methods that can be used, as well as methods for generating encryption keys. It’s published by the National Institute of Standards and Technology, or NIST.

    It seems that US-government computers have a FIPSFLAG enabled. Applications that run on these machines need to be FISMA compliant to be working on those machines.

    Also interesting in that context:
    https://blogs.technet.microsoft.com/secguide/2014/04/07/why-were-not-recommending-fips-mode-anymore/

    https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.200.pdf

  • inline

    Turns submodule inlining on or off. Submodule inlining is a plsql compiler feature implemented since 10g. The compiler can rewrite plsql code so that the resulting code runs faster. Among other options the compiler can add the code from inside a module directly at the point where that code is used (optimization level 3). This is called inlining. The performance advantage is that the expensive submodule call can be avoided. The disadvantage is that the same code is repeated everywhere the submodule was originally called. But we do not have to program this ourselves.

    So we as developers can follow the DRY (don’t repeat yourself) paradigm and the optimizer tunes this code for performance. This is the best of two worlds.

    Because the compiler already does a good job, the pragma is usually not needed. In rare cases we might want to enforce inlining even if compiled with optimization level 2.

  • interface

    Gateway for internal oracle functions to c libraries.
    It is heavily used inside the sys.standard package spec.

    An example

      --#### interface pragmas
    
      --#### Note that for any ICD which maps directly to a PVM
      --#### Opcode MUST be mapped to pes_dummy.
      --#### An ICD which invokes another ICD by flipping operands is
      --#### mapped to pes_flip, and an ICD whose result is the inverse of
      --#### another ICD is mapped to pes_invert
      --#### New ICDs should be placed at the end of this list, and a
      --#### corresponding entry must be made in the ICD table in pdz7
    
      PRAGMA interface(c,length,"pes_dummy",1);
      PRAGMA interface(c,substr,"pes_dummy",1);
      PRAGMA interface(c,instr,"pesist",1);
      PRAGMA interface(c,UPPER,"pesupp",1);
      PRAGMA interface(c,LOWER,"peslow",1);
      PRAGMA interface(c,ASCII,"pesasc");
    
  • new_names

    This is an internal pragma that restricts the use of particular new entries in package standard. It is only valid in package standard.

  • poke_mon

    This is an internal pragma that can be added by database machine learning code. So it might appear by random somewhere in your code. If the schema is pokemon enabled you can use this pragma to train your modules to react faster or to eliminate invalid input data. The pragma was introduced in 19.1.4 using the multi lingual engine (MLE). So far it is only available in autonomous databases (cloud first). If your modules have collected enough power they can be combined to overtake and replace other packages during recompilation. The ultimate goal is to remove all bad performing code from the database.

  • restrict_references
    (RNPS, WNPS, WNDS, RNDS, TRUST)

    This is an outdated pragma. I can remember setting this in an Oracle 7.3 database.
    It informs the database about the intended scope of the module. An error is raised if this pragma is violated.

    RNPS = read no package state
    WNPS = write no package state
    RNDS = read no database state
    WNDS = write no database state
    TRUST = trust me, and don't double check whether all dependent objects also behave correctly.

    This pragma shouldn’t be needed anymore. Instead make functions DETERMINISTIC.

  • serially_reusable

    Lose all state when the call is finished. Package variables, open cursors and other plsql state are reset when the package is declared with this pragma.

  • timestamp

    This pragma sets/modifies the timestamp value of a package. Valid only in SYS (and probably only for package standard).

  • udf

    User-defined function

    This pragma can be used if a function is mostly referenced directly inside a SQL statement. It avoids some of the additional overhead during the switch from the SQL to the PLSQL engine. Especially a simplified (less expensive) datatype check is done.

    While the udf pragma is really a great performance feature it is currently very limited. For example the function can only have numeric parameters. If one parameter is a date, then the udf pragma will silently not work anymore, so we will not gain the performance benefit. If you want to improve that behaviour feel free to vote up this database enhancement idea by @LotharFlatz.

    Btw: There are some indications that udf for functions with varchar2 parameters seem to be working in 12.1 but not anymore in 12.2. I didn’t verify this.
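
To make some of the documented pragmas more tangible, here is a minimal combined sketch showing exception_init, udf and autonomous_transaction together. The package name, the tax rate and the error_log table are made up for illustration.

create or replace package demo_pragmas as
  -- exception_init: map a named exception to ORA-02292
  e_child_record_found exception;
  pragma exception_init(e_child_record_found, -2292);

  function add_tax(p_value in number) return number;
  procedure log_error(p_message in varchar2);
end demo_pragmas;
/

create or replace package body demo_pragmas as

  function add_tax(p_value in number) return number
  is
    pragma udf; -- cheaper to call directly from SQL
  begin
    return p_value * 1.19;
  end add_tax;

  procedure log_error(p_message in varchar2)
  is
    pragma autonomous_transaction; -- runs independently from the main transaction
  begin
    insert into error_log(message, logged_at) -- error_log is an assumed table
    values (p_message, systimestamp);
    commit; -- ends only the autonomous transaction, not the caller's
  end log_error;

end demo_pragmas;
/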

How many of the documented pragmas did you know? And how many of the additional ones?
Did you catch them all?

7. LoC limit

There is a limit for how many lines of code (LoC) a plsql object can have.
The limit was increased to 2^26 DIANA (Descriptive Intermediate Attributed Notation for Ada) nodes (~6 million LoC) in Oracle 8i. Before that it was only about 3000 lines of code (2^15 DIANA nodes).

Nowadays there are other limits that are more likely to be encountered, before the LoC limit is reached. See also: https://docs.oracle.com/en/database/oracle/oracle-database/18/lnpls/plsql-program-limits.html#GUID-00966B4C-B9A5-47D4-94AA-54AEBCC07CE9

Remember: compiler optimizations like inlining might increase your lines of code quite a bit.

6. datatype signtype

There is a datatype signtype. It allows only the numbers -1, 0 and 1.

Example:

set serveroutput on
declare
  v_val pls_integer;
  v_sign signtype;
begin
  for i in 1..10 loop
    v_val := round(dbms_random.value(-5,5));
    v_sign := sign(v_val);
    dbms_output.put_line(v_sign);
  end loop;
end;
/
-1
1
-1
-1
-1
1
0
-1
-1
0

PL/SQL procedure successfully completed.

But this is PLSQL only. In SQL we can not use this type.

create table test(id number, s signtype);

ORA-00902: invalid datatype

Interesting, but so far I never found a need to use it.

5. functions without begin..end

We can declare functions that do not have a function body (no begin..end block).
Example:

create or replace function kommaSepariert(ctx in varchar2)
  return varchar2 deterministic parallel_enable
  aggregate using kommaSepariert_ot;

The secret here is that this function is a user-defined aggregate function that uses an object type. And the type body holds the function logic.

Here is the matching type definition
create or replace type kommaSepariert_ot as object(
 str varchar2(4000),

 static function odciaggregateinitialize(
   sctx in out kommaSepariert_ot)
   return number,

  member function odciaggregateiterate(
    self in out kommaSepariert_ot,
    ctx in varchar2)
    return number,

  member function odciaggregateterminate(
    self in kommaSepariert_ot,
    returnval out varchar2,
    flags in number)
    return number,

  member function odciaggregatemerge(
    self in out kommaSepariert_ot,
    ctx2 kommaSepariert_ot)
    return number);
/

create or replace type body kommaSepariert_ot as

  static function odciaggregateinitialize(
    sctx in out kommaSepariert_ot)
    return number
  is
  begin
    sctx := kommaSepariert_ot(null);
    return odciconst.success;
  end;

  member function odciaggregateiterate(
    self in out kommaSepariert_ot,
    ctx in varchar2)
    return number
  is
  begin
    if self.str is not null then
      self.str := self.str ||',';
    end if;
    self.str := self.str || ctx;
    return odciconst.success;
  end;

  member function odciaggregateterminate(
    self in kommaSepariert_ot,
    returnval out varchar2,
    flags in number)
    return number
  is
  begin
    returnval := self.str;
    return odciconst.success;
  end;

  member function odciaggregatemerge(
    self in out kommaSepariert_ot,
    ctx2 kommaSepariert_ot)
    return number
  is
  begin
    if self.str is not null then
      self.str := self.str ||',';
    end if;
    self.str := self.str || ctx2.str;
    return odciconst.success;
  end;
end;
/


Such functions have been used in the past to combine strings. Nowadays we can use LISTAGG.
Here is a quick demonstration of how it works:

with testdata as(select 'abc' t from dual union all
                select 'def' t from dual union all
                select 'ghi' t from dual union all
                select 'jkl' t from dual)
select kommasepariert(t)
from testdata
;
KOMMASEPARIERT(T)
abc,def,ghi,jkl

4. The select clause can influence the number of rows returned

I’m not talking about using select DISTINCT (this is another cruel way where the select clause can change the number of rows returned).

Here is a more surprising situation. Consider those two slightly different queries.

with tbl as (select 1 val from dual union all  
            select 2 val from dual union all  
            select 3 val from dual )  
    SELECT  CASE  0  
            WHEN  0         THEN  'A'  
            WHEN  SUM (val) THEN  'B'  
            END  AS c  
    FROM    tbl;  
Result (3 rows)
A
A
A
with tbl as (select 1 val from dual union all  
            select 2 val from dual union all  
            select 3 val from dual )  
    SELECT  CASE  6  
            WHEN  0         THEN  'A'  
            WHEN  SUM (val) THEN  'B'  
            END  AS c  
    FROM    tbl;  
Result (only 1 row)
B

So 3 rows are returned if we check against 0 and 1 row is returned if we check against 6.

This is a side effect of two rules.
Rule 1: A select with an aggregate function but without a group by clause is guaranteed to return a single row.
Rule 2: case statements use short-circuit evaluation.

In the first example the sum(val) was never evaluated, so no aggregation took place.

See also this otn thread where the situation was discussed.

I tested the behaviour in 12.1.0.2 SE and in 11.2.0.4 EE.
I also think this should be treated as a bug. Small changes like this to the select clause should not influence the number of rows returned.

3. Default behaviour for windowing clause in analytic functions

This is something I learned from the great Kim Berg Hansen (@Kibeha).

The default windowing clause is “RANGE BETWEEN unbounded preceding and current row”. This can sometimes lead to surprising, seemingly wrong results. In most cases we should switch and use ROWS BETWEEN. It is something a developer needs to be aware of.
default_windowing
from SQL reference – Analytic Functions

Whenever the order_by_clause results in identical values for multiple rows, the function behaves as follows:

CUME_DIST, DENSE_RANK, NTILE, PERCENT_RANK, and RANK return the same result for each of the rows.

ROW_NUMBER assigns each row a distinct value even if there is a tie based on the order_by_clause. The value is based on the order in which the row is processed, which may be nondeterministic if the ORDER BY does not guarantee a total ordering.

For all other analytic functions, the result depends on the window specification. If you specify a logical window with the RANGE keyword, then the function returns the same result for each of the rows. If you specify a physical window with the ROWS keyword, then the result is nondeterministic.

SUM is one of those “other” analytic functions.

Consider the following example. We have a table with a list of transactions. And we want to see the transaction value but also a cumulative sum for those values.

with testdata as 
   (select 1 trans_id, 10 transaction_value, trunc(sysdate-10) transaction_day from dual union all
    select 2 trans_id, 20 transaction_value, trunc(sysdate-8) transaction_day from dual union all
    select 3 trans_id, -10 transaction_value, trunc(sysdate-2) transaction_day from dual union all
    select 4 trans_id, 30 transaction_value, trunc(sysdate-2) transaction_day from dual union all
    select 5 trans_id, 100 transaction_value, trunc(sysdate) transaction_day from dual 
   )
select trans_id, transaction_day as trans_day, transaction_value as trans_value,  
       sum(transaction_value) over (order by transaction_day) cumulative_sum
from testdata
order by trans_id;   

Result

TRANS_ID TRANS_DAY	TRANS_VALUE	CUMULATIVE_SUM
1	 24.04.18 	10	        10
2	 26.04.18 	20	        30
3	 02.05.18 	-10	        50
4	 02.05.18 	30	        50
5	 04.05.18 	100	        150

As you can see, transactions 3 and 4 have the same cumulative sum. The reason is that our order criteria in the analytic window function does not separate those two rows.

There are two possible solutions. Either make sure that the order is not ambiguous, or use “rows between”.

with testdata as 
   (select 1 trans_id, 10 transaction_value, trunc(sysdate-10) transaction_day from dual union all
    select 2 trans_id, 20 transaction_value, trunc(sysdate-8) transaction_day from dual union all
    select 3 trans_id, -10 transaction_value, trunc(sysdate-2) transaction_day from dual union all
    select 4 trans_id, 30 transaction_value, trunc(sysdate-2) transaction_day from dual union all
    select 5 trans_id, 100 transaction_value, trunc(sysdate) transaction_day from dual 
   )
select trans_id, transaction_day as trans_day, transaction_value as trans_value,  
       sum(transaction_value) over (order by transaction_day 
       rows between unbounded preceding and current row) cumulative_sum
from testdata
order by trans_id;   
TRANS_ID TRANS_DAY	TRANS_VALUE	CUMULATIVE_SUM
1	 24.04.18 	10	        10
2	 26.04.18 	20	        30
3	 02.05.18 	-10	        20
4	 02.05.18 	30	        50
5	 04.05.18 	100	        150

2. batched commits

The performance of many small commits can be improved when doing batched commits.

Instead of writing

commit;

we can do

commit work write batch;

commit_batch

Here are two real world examples where this was tested.

a) I recommended using batched commits to a colleague of mine, who tried to tune a set of java logic that ran in a highly parallel mode. The goal was to do 1 select + 2 inserts + commit in 1000 parallel sessions per second.

Switching to batched commits was so hugely successful, that they now raised the performance requirement to 2500 concurrent sessions per second. Which also means now the ball is passed back to the java developers to come up with a better mode to execute lots of small checks against the db.

b) I also tested batched commits in a different and more general context.

Most of our code has code instrumentation logic. That means we can turn on debugging with a certain trace level, and while the code is running a lot of tracing information is written into a logging table. The instrumentation call (like logger.log_trace) uses an autonomous transaction to do so. Essentially it is a single insert followed by a commit. That also means that a lot of commits are executed, which can put stress on the log writer and the storage system.

So I compared what happens when we do a commit vs. a batched commit while writing lots of tracing data.

The batched commit was orders of magnitude faster than the normal commit.
I plan to write a separate article to show the exact measurements.

EDIT+UPDATE: I need to double check and retest this improvement. It is possible that other factors influenced the measurements. Like other processes that throttled the log writer or hardware changes to the underlying storage system. Also it seems as if plsql already does a batched nowait commit per default (https://docs.oracle.com/en/database/oracle/oracle-database/18/lnpls/static-sql.html#GUID-56EC1B31-CA06-4460-A098-49ABD4706B9C). It might depend on the database version, but since 12.2 it is now documented. I couldn’t confirm this, and my tests so far seem to indicate an improvement.

So what is the disadvantage? Why not always use batched commits?
To be clear: For normal situations stay with the normal commit. But if you run into issues where the log writer is not fast enough then this can be a possibility.

The drawback (as I understand it) is that in the case of a database crash you might not only lose the currently unfinished transactions, but also some transactions that were committed already but which the log writer didn't finalize yet. Typically all transactions from the last 3 seconds are at risk.

1. “CRASH” is a reserved plsql word

I have no idea why.

Oracle 18.1 PL/SQL Reserved Words and Keywords

plsql_reserved_crash

Basic SQL: IF..THEN..ELSE

How to write conditional logic in SQL

This is a question that is sometimes asked by programmers who just started using SQL.

If-Then-Else-diagram.svg
By P. Kemp – own work created using Dia, CC0, Link

Introduction

Programmatic 3rdGL languages all have an IF..THEN..ELSE construct.

Pseudocode:

IF #condition# THEN #doSomething# ELSE #doAnotherThing#

Since SQL is a 4thGL language, it has no need for conditional execution. This is sometimes surprising for beginners. IF..THEN..ELSE is very procedural thinking.

We do not tell SQL how to reach a certain result, instead we describe the wanted result.

Although conditional execution is not needed, there are cases that come quite close. If we want to distinguish data depending on other data, this can be done using conditional logic or conditional expressions.

In general expressions can not #doSomething# but they return a value (=data). SQL is all about data. A conditional expression in SQL is showing different data depending on other data.

SQL constructs

Here is a (non-complete) list of different ways we can write IF..THEN..ELSE in SQL. Be aware that most of those expressions are only ever useful in the SELECT clause of a query. The WHERE clause can do conditional logic simply by using AND+OR+NOT+() combinations. We do not need extra functional expressions there.

The functions towards the end of the list are a bit of an obscure choice for doing conditional logic. However they are useful to remember for some special scenarios.

  • CASE
  • DECODE(Oracle)
  • NULLIF
  • NVL or NVL2 or COALESCE
  • LNNVL
  • SIGN
  • ABS
  • GREATEST or LEAST

Each of those functions has advantages and disadvantages. I will try to give an indication where the usage seems proper.

All functions have a “translated” syntax in the following form.

SQL syntax: procedural code

The following business case is used
Requirement:
If a person is older than 65 years, he or she should get a pension.

Or to say it in more technical terms (specification):
If today the age of a person – based upon his or her birthdate – is equal to or greater than 65, then the retirement flag should be ‘Y’, else ‘N’.

CASE

CASE when a>b then x else z end: If a>b then x else z

The CASE function should always be the first choice. It is the best compromise between brevity and clarity of the expression.

case when trunc(sysdate) >= add_months(birthday,12*65)
     then 'Y'
     else 'N'
end

The add_months function will give us the day when the age of 65 is reached (retirement age). It also considers some special calendar effects. For example a person born on 29th Feb. 1952 will get the flag already on 28th Feb. 2017.

We have to be very careful and check some special dates to make sure that our logic is the one we are looking for. For simplicity I assume that this logic for calculating the retirement candidates is correct.

CASE also has a second syntax (simple case expression). But this allows only a comparison for equality.

CASE a when b then x else z end: If a=b then x else z

The syntax allows stacking multiple CASE statements. But most of the time this is not needed. Instead use several WHEN sections, and we can profit from short-circuit evaluation.

case when a>b then x 
     when a>c then y
     when d>f then z
     else q
end

Short-circuit evaluation here means that if a>b is true, the next conditions are not checked. Most importantly y, z and q are never calculated. And usually c, d and f are not calculated either. There are a few special exceptions. See this forum thread Oracle bug? Aggregates in CASE expressions negate short-circuit evaluation. for a discussion about the behaviour.

Code written using a CASE statement can sometimes get quite lengthy. But it is very close to natural language and therefore easy to maintain. CASE should be the first choice when conditional logic is needed.

DECODE

DECODE(a,b,x,c,y,z): if a=b then x elsif a=c then y else z

DECODE is nice if we need to check multiple values and provide an alternative value instead. I still like to use it in the ORDER BY clause to do rankings.
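
Here is a hedged sketch of that ranking trick; the table and column names are made up.

-- show VIP customers first, then partners, then everyone else
select customer_name, customer_type
from customers
order by decode(customer_type, 'VIP', 1, 'PARTNER', 2, 3),
         customer_name;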

DECODE is not part of ANSI SQL and can only be used in Oracle databases.

If the condition is more complex than a simple equality comparison, then a combination of DECODE and SIGN can be used. Especially if numeric values play a role.

decode(sign(trunc(sysdate) - add_months(birthday,12*65)),
        1, 'Y',
       -1, 'N',
        0, 'Y')

The expression “trunc(sysdate) – add_months(birthday,12*65)” returns a positive value if the person is older than 65 and a negative value if he/she is younger.

DECODE+SIGN was used very frequently before CASE was introduced in Oracle 8 or 9. You still find it sometimes in older code.

DECODE also can compare with NULL values.

NULLIF

NULLIF(a,b): If a=b then NULL else a

This function returns a NULL value if input A matches Input B otherwise A is returned.

The following (slightly constructed) example will return the birthdate for all persons that reached retirement age, but NULL for those who didn't. The GREATEST function is used to help us level the values.

nullif(birthdate,
       greatest(add_months(trunc(sysdate)+1,-65*12),
                birthdate)
       )

It is not so useful for the chosen example. But it is very handy if we want to ignore some values, especially in combination with aggregation functions, since they ignore NULL values during the aggregation.

NVL or NVL2 or COALESCE

NVL(a,b): If a is null then b else a
NVL2(a,b,c): If a is null then c else b
COALESCE(a,b,c,d): 
   If a is not null then a 
      elsif b is not null then b 
         elsif c is not null then c 
            elsif d is not null then d

NVL is often used for providing a default value, in case no value was found.
If the birthday of a person is an optional field, we might want to consider setting a default (=standard) age, for our calculation.

nvl(birthdate,date '1980-01-01')

The NVL2 syntax is a little less known, but it is very useful and should be remembered.
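
A small sketch of NVL2, reusing the birthdate column from the example (the labels are made up):

nvl2(birthdate, 'birthdate provided', 'birthdate missing')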

COALESCE is very similar to NVL. It returns the first value that is not null. But it can be used for checking more than one value. A major advantage is that the second and later expressions are only checked if the first expression is NULL (short-circuit evaluation). This can give a performance advantage over NVL if the second expression is expensive. This performance advantage is why many programmers always prefer COALESCE over NVL. However there are also some special compiler optimizations that kick in for NVL and not for COALESCE. As a rule of thumb I would stick with NVL if we have a simple second value. I use COALESCE if more than one value needs to be checked or if the second value is complex (like a plsql function or a subquery).

Return the most recent activity (latest date) for each shipped order:

coalesce(arrival_date, shipping_date, sent_date, creation_date) as latest_date

LNNVL

LNNVL(a=b): if not(a=b) or a is null or b is null then true

LNNVL is a strange and hard to understand function. It negates a condition. It is used internally by Oracle to rewrite certain queries (NOT IN into NOT EXISTS). Contrary to all other functions, LNNVL returns a boolean result and can be used in the where clause.

One usage is if we want to make sure two values are different and still want a result even if one of the values is null. So it can ease working with NULL values.

This select returns something if the values a and b are different.

select * from dual
where LNNVL(a=b);
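
Since dual has no columns a and b, here is a hedged runnable variant with made-up test data:

with t as (select 1 id, 10 a, 20 b from dual union all
           select 2 id, 10 a, null b from dual union all
           select 3 id, 10 a, 10 b from dual)
select id
from t
where lnnvl(a = b);

This returns rows 1 and 2 (values different, or one of them NULL), but not row 3 where both values are equal.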

SIGN

See decode.

The SIGN function can be used to rephrase a non-equal comparison into an equality comparison.

if a > b then

is (for numbers only) mathematically the same as

If SIGN(a-b) = 1 then

SIGN is a typical helper function for DECODE. Since DECODE can only compare for equality, SIGN helps to enhance that to do more complex comparisons.

ABS

ABS is sometimes used in combination with SIGN. In rare cases it avoids sorting the input data for the SIGN function.

if a != b then

is (for numbers only) mathematically the same as

if sign(ABS(a-b)) = 1 then

Also much easier would be:

not(a=b)

ABS => Not really useful anymore.

GREATEST or LEAST

GREATEST(a,b,c): if a > b then a else b => result1; if result1 > c then result1 else c

See NULLIF example.

GREATEST gives us the maximum value from a list of values. LEAST gives us the minimum. GREATEST and LEAST can be used to harmonize certain values and then to allow an equality comparison for them. As such they are similar to SIGN. However SIGN works only with numerical data, whereas GREATEST|LEAST can also be applied to strings.

As with all functions we have to be careful about NULL values. Remember NULL means “UNKNOWN”. So if one of the values in the list is NULL, then GREATEST|LEAST will return NULL (UNKNOWN).
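
A small hedged sketch of that harmonizing idea; the column name is made up. The expression clamps a value into the range 0..50, so that out-of-range values can be compared as equals afterwards:

least(greatest(discount_pct, 0), 50)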

further considerations

Of course it is also possible to do conditional logic using DML commands.
The WHERE clause of the DML command matches the IF part; the SET part of an update matches the THEN part. For a delete command, the THEN part is the deletion of the row.

For example a procedural requirement like
“if the data is older than 3 years then delete it”
can be translated into sql like this:

Delete from myData
where insert_date < add_months(trunc(sysdate),-3*12)

As we can see, the procedural requirement can be translated into SQL in a very elegant and straightforward way.

Another point to remember:

If you think a CASE expression is needed in the where clause, you are most probably wrong. One notable exception to this rule is if you want to access a function based index (fbi). If a function based index uses a CASE expression, then you must use exactly the same expression inside your where clause to be able to profit from this index.
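
A minimal sketch of such a function based index; table and column names are assumptions:

create index orders_prio_fbi on orders
  (case when priority > 5 then 'HIGH' end);

-- the where clause must repeat the identical expression to profit from the index
select *
from orders
where case when priority > 5 then 'HIGH' end = 'HIGH';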

Conclusion

SQL can do conditional logic. The first place to look for it is the WHERE clause. Conditional expressions can be done using the SQL functions CASE, NVL, COALESCE, NVL2, DECODE and NULLIF (in that order). Other functions can help to adapt conditional logic to the specific business requirements.

The EBnoR Manifesto

The “EBnoR” Manifesto

about

Edition based not only redefinition

Author: Sven-Uwe Weller

suw_logo

ceo syntegris information solutions GmbH

Germany


Mission Statement

Edition based redefinition

EBR – is not well known.

It is barely used.

Edition based not only redefinition

EBnoR – will change that.

EBR is a tool with unique possibilities.

EBnoR deals with cases

beyond

the originally intended scope.

EBR is difficult to use.

EBnoR is easy to use.

EBnoR avoids complexities.

EBnoR is powered by the strengths of EBR.


EBR Basics

Edition based redefinition is a unique feature of the Oracle database that is available since version 11.2. It can be used at no additional license costs in all database editions (EE, SE, SE1, SE2, XE).

The term “edition” is used to describe a set of plsql based objects. Among those object types are plsql packages, views and synonyms. A complete list is here in the Oracle documentation.

A different edition can hold different code for the same plsql object. In essence each edition resembles a version of your database application code. EBR allows to store and run different versions of your application at the same time. The different code versions do exist inside the database at the same time.

This opens up a whole set of some very special possibilities that usually do not exist in other environments. EBR allows to do an upgrade to a new application release without shutting down the application. The new code version is installed in a new edition. Currently running sessions will still work with the old release. Deployment problems can be tested using the new release. Once those issues are solved, then the new edition is made available (as the new default edition). Only when a user ends his (database) session and reconnects then the new code version from the now new default edition will be used.
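
For illustration, here is a minimal sketch of that workflow; the schema and edition names are made up.

-- enable the schema for editions (once; this cannot be undone)
alter user app_schema enable editions;

-- create an edition for the new release
create edition release_2 as child of ora$base;

-- install the new code version inside the new edition
alter session set edition = release_2;
-- ... run the release scripts here ...

-- once verified, make the new edition the default for new sessions
alter database default edition = release_2;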

distinction to VCS

EBR should not be confused with a version control system like SVN or GIT. Although there are some similarities it serves a different purpose. A version control system supports the development and deployment process for a team of programmers. They can store and merge code there. Forks and code branches can be used to support development that temporarily goes into different directions. Code merges allow to combine branches again. This is something that EBR can and will not do. EBR is about running the different code versions in parallel. Application end users profit from EBR, not developers. At best an edition resembles a release in the version control system.

Sven says: A VCS saves and documents code, EBR executes code.

versions of data

Data resides in tables. Tables are not editionable objects. That means two things. Changes to table structures are not editionable directly. And the data itself is not subject to code release changes. That is a good thing. The data and the code that works with the data is separated.

Creating a new edition means that code is duplicated. Code can be duplicated easily; duplicating data is a much bigger issue.

Sometimes a code change requires to change existing data. This increases the complexity to run the old and the new code version at the same time. EBR offers solutions using cross-edition triggers and editioning views. How to use cross edition triggers is described in Bryn Llewellyn’s excellent white paper from 2009 (http://www.oracle.com/technetwork/articles/grid/edition-based-redefinition-1-133045.pdf). Read and understand this paper first before you go on and read this manifesto.

“Executive Overview

Large, mission critical applications built on Oracle Database 11g Release 1 and earlier versions are often unavailable for tens of hours while the application’s database objects are patched or upgraded. Oracle Database 11g Release 2 introduces edition-based redefinition, a revolutionary new capability that allows online application upgrade with uninterrupted availability of the application. When the installation of the upgrade is complete, the pre-upgrade application and the post-upgrade application can be used at the same time. Therefore an existing session can continue to use the pre-upgrade application until its user decides to end it; and all new sessions can use the post-upgrade application. As soon as no sessions are any longer using the pre-upgrade application, it can be retired. In other words, the application as a whole enjoys hot rollover from the pre-upgrade version to the post-upgrade version.

This whitepaper explains how edition-based redefinition works, and how to write online application upgrade scripts using this capability, at the level of detail needed by engineers who will write such scripts.”


the current status

EBR has been on the market for a long time now. Still many developers and DBAs do not know about it. If they know about it, they are not using it actively. I identified a set of problem areas that are responsible for this current status of EBR. I see possible solutions. Using EBR the focus should not only be on redefining the code; instead EBR can open up a set of new and unique possibilities. Zero downtime application upgrades are only one of them. I call the set of solutions “Edition based not only redefinition” or short EBnoR.

problem areas under the influence of intelligent people (developers and DBAs)

  • the deployment dilemma: shortly before GO-live it is too late to think about EBR
  • the design change gamble: keep changes required by EBR at a minimum, design with EBR in mind
  • the zero downtime promise: perfectness – if even possible – is expensive. Don't go for perfectness, almost perfect is good enough.
  • the cross edition complexity: cross edition triggers are needed for extreme cases. Huge effort for only a tiny result.

problem areas influencing decision makers (architects and project managers)

  • the cross platform independency nonsense: EBR is only available for Oracle databases
  • the marketing confusion: editions are not license editions

EBnoR does offer solutions to each of those problem areas. Using small sidesteps from Oracle's recommended path will lead to major improvements in development time and flexibility, and ultimately results in a quicker learning curve.

The “deployment dilemma”

The advantages of EBR shine during one of the later steps in the software development lifecycle – the deployment phase. The dilemma is that at such a late point the additional programming effort to use EBR competes with adding more features or bug fixing of the application.

Oracle propagates that the advantage of using EBR to do application upgrades with minimal downtime is so big that it is worth the additional effort and time. This is true in very rare cases only, and mostly only for very large companies that can afford to spend money on this additional effort.

Problems

The problem is the decision point. Budget is reserved for a specific change in one application. There never is enough budget. There never is additional budget. Especially not shortly before the next application upgrade will go live. At this point there is not enough budget and not enough time.

EBR is a tool not an application itself. As such most companies hesitate to set up an extra project or a major software change just for using a tool.

Solution

The decision needs to be moved from budget owners to the developers and to an early point during the software life cycle. Only then will EBR find more integrations.

EBnoR: Start with EBR from day 1 in the project!

Use editions already during development!

The “design change gamble”

EBR requires some changes to the data design and to the schema.

One requirement is to add a view layer on top of all tables. Among others this allows to use cross edition triggers to do data changes. This is certainly a solid approach, however it is a major change to the data model. As such it will face criticism from various other parties involved.

Typical arguments that are raised against such a change are:

  • Performance will suffer because of the view
  • The views do not conform to our naming conventions
  • The views hide complexity
  • Some tools work (better) with tables instead of views
  • Enabling EBR can not be undone

The gamble is that EBR will take the blame if implemented at a final stage during development. If any issue arises, e.g. performance in the production environment drops, some will argue that the reason is in the additional view layer. Just because that was the only thing that was changed. It might be hard to prove otherwise, as long as comparable performance data is missing.

Editions can be dropped. And switching on editioning in the database has no negative implications, whether we use the feature or later remove editioning features.

Solution

EBnoR: Avoid major design changes!

Use a view layer if possible. If not, still use EBR without this extra layer.

Test early while using editioning!

The “zero downtime promise”

Absolute zero downtime is very hard to reach. Even a simple ALTER TABLE statement will lock the full table. During that time the table is not available. Sessions will wait (which is a different term for downtime) or even break.

Focus on changes that have a minimal impact on downtime. For example adding a column is usually no problem. Even if the column is added to the old application without adding the new code, it should not break anything. Write your application in such a way that added columns will not break existing code. This is easy to do: if the column is mandatory, always provide a default value. Updating the data in this column might need more time. Dropping a column should be avoided, because this usually breaks old application versions.
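
A minimal sketch of such a change; table and column names are made up. Adding a not null column with a default value is a quick metadata-only operation in recent database versions:

alter table orders add (order_channel varchar2(10) default 'WEB' not null);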

If the normal downtime window is a few days, then using EBR can change that window to less than an hour by focusing on the DML changes (tables mostly). For a normal user such a small downtime window is the same as zero. He will often not even notice such a short break. This does not mean that the new application version needs to be available after an hour. But just that after an hour the user is able to continue to work with the old version. After that downtime window the new edition can be installed, tested and then a gradual switch over of user sessions can be done.

Zero downtime is expensive. Even with EBR.

Solution

EBnoR: Aim for very low but not for zero downtime.

Don’t promise zero to users!

The “crossedition complexity”

Bryn's paper describes how to use cross edition triggers to handle data changes between one application version and the next one. There is a major intrinsic problem with that. Let us first investigate how cross edition triggers work.

Cross edition triggers allow to manipulate data in one edition and add transformation logic so that the same dataset is correctly represented in the next or previous edition.

  • “A forward crossedition trigger is fired by application DML issued by sessions using the pre-upgrade edition. Such a trigger is used to implement transformations from the old representation forwards into the new representation.“
  • “A reverse crossedition trigger is fired by application DML issued by sessions using the post-upgrade edition. Such a trigger is used to implement transformations from the new representation backwards into the old representation.”

The problem is not how to write a trigger, but in the complexity of forward and backward data manipulations.

Sometimes the data manipulation needed to go to the next application version is simple, but can not be automated easily. Manual data cleansing actions can be such a case.

Some scenarios do not have a simple 1 to 1 transformation rule from data in the edition A and data in edition B. To do the forward transformation is usually part of the project. Very often it is much harder to provide the backward transformation. Sometimes this is only possible while keeping the old data structure thereby duplicating and replicating data. This defies the original reason for change.

The EBR solution is to have this change only temporarily. Therefore old editions should be dropped very soon after the application was switched to the new code (and data) version.

Problems

  • Often requires to store several versions of the same data. Usually in different columns.
  • Need to automate data transformation rules
  • Huge additional effort not related to a better application
  • Multiple editions require lots of cross edition triggers

Solution

Start slow and avoid cross edition data transformations!

This needs some explanation and examples.

Example

For example, assume a design decision was made to trim the time component from all date values in a column, so that instead of a range comparison an equality comparison can be made.

Old code:

dateCol >= trunc(:searchDate) and dateCol < trunc(:searchDate) + 1

New code:

dateCol = trunc(:searchDate)

The typical change includes a DML statement that updates the column using TRUNC.

update myTable
set dateCol = trunc(dateCol)
where dateCol != trunc(dateCol);

This change can not be undone because information (the time component) is lost.

So writing a reverse crossedition trigger that mimics the exact same source data is difficult.

One solution could involve keeping the old column including time precision and adding a second column that holds only the truncated date value. Then add (or change) an editioning view in the old and new edition that serves as an api to the relevant column. A forward crossedition trigger would truncate the data that is inserted using the old application version. But without knowing where the time component is coming from, it would be impossible to write a reverse crossedition trigger that adds the time correctly. A default logic would be needed, maybe depending on some other values – like requiring the time to lie between the times of the previous and following rows. So we would need several new objects and fairly complex code to do the reverse change; the data model would be more complex than with the simple update solution; but above all, data would be duplicated.

But why care? In this specific example the old application code might still work, even if the time component is truncated. So there are two easy ways to avoid the complexity:

  1. run the update several times, for example after each batch load of new data
  2. add a normal db trigger in the old edition that truncates all newly inserted date values
:new.dateCol := trunc(:new.dateCol);
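
A minimal sketch of such a trigger, using the table and column names from the example:

create or replace trigger mydata_trunc_datecol_trg
  before insert or update of dateCol on myTable
  for each row
begin
  :new.dateCol := trunc(:new.dateCol);
end;
/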

Yes, this would require a change in the old edition and as such would change old application code. If this can not be done because of organisational hurdles, then create an intermediate “preparation” edition and add the trigger there.

Consider a step by step approach. There are many situations out there where we can get away without the need to implement additional cross edition logic. Avoiding this helps to get EBR started and to become acquainted with the feature. Once we are used to it, we can add more – eventually even using cross edition logic for specific use cases.

EBnoR: Do not depend on editioning views, do not use cross edition triggers.

Avoid deleting objects, don’t hesitate to add objects (especially columns).

If you need to delete objects this will influence all editions. After careful consideration just do it. After all, rules are there to be broken. Announce it as a major application version which will take a small downtime. Do not let the zero downtime paradigm make you write complex code. Complex code in the end costs more in terms of maintainability and testing effort.

learning curve considerations

A learning curve comparison shows how a reduced approach will ease the first steps into the feature.

  • plsql code versions: start using EBR already during development. Then the developers get used to working with editions. Building a new release and deploying it in a specific edition then is nothing strange anymore. Do development and tests in the same database but in different editions. Each step in the development lifecycle deserves a new edition.
  • design changes: avoid changes in the data model that would influence old editions in a negative way. For example avoid dropping columns. Also write your code in a way it does not break when a new column is added to a table. Model your data first before you start to build a new application.
  • low or zero downtime: often a tiny downtime is ok. If the aim is to be fast, but not perfect, the additional effort shrinks considerably. Do not aim for perfection!
    Example: Adding a column to a table requires that the table is locked for a moment, especially if the new column is not null and is filled with some extra values. Other sessions might wait for this. If the table is very large such a lock can take some time. If it takes a minute, that's fine. It will not be zero downtime, but almost every project can afford to wait a minute during a deployment. If it takes half a day, then you must invest effort to optimise it.
  • cross edition data: avoid using cross edition triggers. They serve only a very specific, very special purpose. Build your data changes in a way that you do not depend on cross edition triggers. The need for cross edition triggers is the main reason for the increased programming effort when EBR is used.

The “cross platform independency nonsense”

“EBR can not be used in our company because we want to stay database vendor independent”

Such an argument can often be heard from java developers or other programmers outside of the database world.

An analogy

You bought a Porsche 911. This is an expensive but very fast car. However you choose not to drive it in 5th and 6th gear, because there are other cars out there that do not have a 5th or 6th gear. Does this make sense?

In my opinion the claim of database independence is often made to hide the developer's inability or unwillingness to learn about the unique features their specific database offers. In general it is easy to structure code in such a way that vendor specific features are encapsulated and properly modularized. This will make future changes to another database less problematic.

This is also true for EBR. If you decide to enable and use editions you can do it in such a way that all the specific implementation details are hidden from the normal code. The developers and the DBAs should be “edition aware”. But after some initial setup, there is no extra effort that EBR requires. In general you would simply use a connect string that connects to the edition of your choice. There is almost no need to consider editions directly in the code.

There are a few exceptions. For example, if you start a scheduled job then you probably want that job to run in the same edition as the session that creates it. This is possible using a job class with a service that is tied to an edition. Or more generally: every time some kind of client accesses your database, you want to be able to influence the edition that is used at connection time. Here is a sketch of the job class approach.
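All names in this sketch are made up; the building blocks are DBMS_SERVICE and DBMS_SCHEDULER (assuming a single instance setup where DBMS_SERVICE can be used directly):

-- a service that is tied to an edition
begin
  dbms_service.create_service(
    service_name => 'app_v2_svc',
    network_name => 'app_v2_svc',
    edition      => 'RELEASE_V2');
  dbms_service.start_service('app_v2_svc');
end;
/

-- a job class that uses this service
begin
  dbms_scheduler.create_job_class(
    job_class_name => 'APP_V2_JOBS',
    service        => 'app_v2_svc');
end;
/

-- jobs created with this job class will run in edition RELEASE_V2
begin
  dbms_scheduler.create_job(
    job_name   => 'NIGHTLY_CLEANUP',
    job_type   => 'STORED_PROCEDURE',
    job_action => 'maintenance_pkg.cleanup',
    job_class  => 'APP_V2_JOBS',
    enabled    => true);
end;
/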

Solution

Modularize and structure your code well!

Immediately counter the argument by using strong analogies.

The “marketing confusion”

I think the EBR name is suboptimal (to put it nicely). First of all the term “edition” is confusing, because it is also used as a name for database editions (EE, SE, XE). And the R stands for “redefinition”, which is a poor description of adding a new code release to your application.

I admit I’m not creative enough to suggest a better name, but as an analyst I’m able to pinpoint a flaw when I see one.

Nowadays every “thing” or “technology” has a catchy name and a funny logo. The name usually is also an abbreviation, and one could think this is of minor importance. It is not. This is the name the feature/tool goes by, and this name and logo are used in slides, presentations and social media channels.

some examples 

Node JS packages: https://www.npmjs.com

Here are just a few of them. But every one of those JavaScript packages has a small but recognisable logo added.

ebnor_npm_packages

Architecture components for Spark: https://www.linkedin.com/pulse/hadoop-summit-2015-takeaway-lambda-architecture-laurent-bride

ebnor_hadoop_summit_logos

Just look at all those small but important looking pics in this simple architecture diagram. It makes you feel as if you missed something important, if you don’t know them all.

Problems

  • EBR is hard to pronounce and to spell out.
  • The term “edition” is already established in the Oracle namespace. And usually it stands close to “license costs”.
  • A nice enchanting logo is missing.

Suggestions

  • EBnoR / EbanoR = Edition based not only Redefinition
  • Edi
  • ebar
  • PRETI = plsql runtime environment and test integration
  • or something completely different = OSCD (could be in use already by some other organisation)

Solution

Oracle, get some logo wizards and marketing people to work on that!


beyond EBR

The EBR feature allows us to do more than the standard model intended by the Oracle development team.

The standard model is

  1. install a new application version in a new edition in production
  2. test if the deployment went well
  3. switch the current edition to the new edition
  4. after some short grace period drop the old edition

During this whole time the impact on connected users is minimal.

This is great!

But there is more…

Using EBR in certain ways can open up new possibilities and ease the upgrade pain.

New possibilities include

restore and run old code

Assume a problem is reported that existed in some older code version, and you suspect that its cause has meanwhile disappeared. Using EBR it is possible to quickly prove that assumption. Create a new edition in the environment where you want to reproduce the issue (usually DEV, but it might be some integration or QA test database). Fetch the old plsql code from your versioning system and install it in this new edition.

Run the test in both editions to see if the problem occurs with the old code but not with the new code.

After that test, simply throw away the whole edition (drop edition cascade).
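Sketched as a script (the edition and file names are made up):

-- create a throwaway child edition and switch to it
create edition bugcheck_v17;
alter session set edition = bugcheck_v17;

-- install the old plsql code fetched from the versioning system
@old_code/my_pkg_v17.pks
@old_code/my_pkg_v17.pkb

-- ... run the test in this edition and in the current edition ...

-- throw everything away again
alter session set edition = ora$base;
drop edition bugcheck_v17 cascade;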

develop and test in the same database

Build releases in ORA$BASE

=> test in edition TST$BETA

=> develop in edition DEV$ALPHA.
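A possible setup, sketched in SQL (user names are assumptions; the editions form a chain of children, and the users need the use privilege on them):

create edition tst$beta as child of ora$base;
create edition dev$alpha as child of tst$beta;

grant use on edition tst$beta to test_user;
grant use on edition dev$alpha to dev_user;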

This is something I discovered while actively developing and working with EBR. It is one of the best things to do. It might even save license costs, because you can eliminate a complete test database (although some test databases do not need additional licenses). It does not mean you can skimp on tests, but certain types of tests can profit hugely from EBR. Testers use one edition, developers use another (child) edition. Developers can actively analyse bugs that testers found without the need to move data from the test to the dev database. And developers can fix bugs without disturbing tests running in the same database.

Build the test environment automatically from your versioning system.

provide backward compatibility

This is often useful for API development (see also ThickDB | SmartDB paradigm).

APIs provide a specific interface (usually views and plsql packages). APIs expose a limited set of objects and methods for a specific task. APIs do not expose data directly. This is where EBR shines: all those objects are editionable objects!

Often we have multiple and different consumers using the same API. If the API functionality is enhanced, then we can provide the newer version of the API in a new edition. If we keep the old edition instead of dropping it almost immediately, consumers can still use their old code without the need to immediately switch to the new version.

As long as cross edition triggers can be avoided, there are no major problems with keeping multiple API versions for a longer period of time.

Tip: All editioned objects should be compiled in each new edition, even if there was no change. This actualizes a separate code version in each new edition.
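For example (hypothetical package name):

alter session set edition = release_v2;

-- recompiling the unchanged package actualizes it in release_v2,
-- so it no longer inherits the code from the parent edition
alter package my_api_pkg compile;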


Recommendation

the EBnoR (Edition Based not only Redefinition) approach

  • Start using EBR on day 1 in the project
  • Aim for low but not zero downtime
  • Avoid (complex) cross edition logic
  • Develop and test in the same database in different editions


Some final words

I had planned to publish this manifesto for quite a long time (several years now). Since I started to put down those thoughts, EBR has been enhanced. New recommendations have been made by Oracle plsql development. Other people, including the AskTom team, encountered similar things with EBR. Some of those ideas and recommendations match EBnoR. Still I expect EBnoR to be very controversial. This is intended.

For the sake of brevity I skipped a large part of the examples and direct code demos.

EBR also still has issues where I see no immediate solution. For example editions are database level objects. As a developer I would prefer them to be schema level objects. The problems connected to this are beyond the scope of even EBnoR.


Exotic SQL: Hints that can change results

In general hints are suggestions for the Oracle cost based optimizer (CBO) that can influence performance but will never change the result. The CBO will comply with the hint at all costs – if it is possible.

True? Well not always!

There is a very small set of hints which can change the result of a select or DML statement.

The CBO is an extremely complicated piece of software, so it can happen that there are bugs in it, which in turn lead to different results. This is not what I am considering here.

So here is my collection of hints that are able to change the output of a statement.

ignore_row_on_dupkey_index

This hint allows us to avoid unique key (UK) violations during an insert. Only those rows that do not violate the unique index are inserted; the other rows are silently ignored.
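A minimal sketch; the table emp_copy, its unique index on empno and the staging table emp_stage are assumptions:

-- rows that would violate the unique index on empno are silently skipped
insert /*+ ignore_row_on_dupkey_index(emp_copy(empno)) */
into emp_copy (empno, ename)
select empno, ename
from emp_stage;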

See this example by Richard Foote about the hint:
https://richardfoote.wordpress.com/2010/12/20/oracle11g-ignore_row_on_dupkey_index-hint-micro-cuts/

Similar hints are CHANGE_DUPKEY_ERROR_INDEX and RETRY_ON_ROW_CHANGE.

They are all called semantic hints, which has the side effect that they are not disabled by setting _optimizer_ignore_hints=TRUE.

https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/Comments.html#GUID-4664D3D8-6312-4C15-8E8F-4872DD7A44F8

For more examples see this older post from Rob van Wijk
http://rwijk.blogspot.de/2009/10/three-new-hints.html

driving_site

I consider the driving_site hint to be one of the “good” hints. There are many cases where we need it. It is good to set it, because it enables a behaviour that the optimizer otherwise can not choose.

However there is one strange special issue that I found.
I described this behaviour in an older blog post: https://wordpress.com/post/svenweller.wordpress.com/108

If we have a distributed query then SYSTIMESTAMP and also DBTIMEZONE are affected by the driving_site hint. They return the timezone of the local (no hint) or the remote (with hint) database. SYSDATE however was not affected.
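A sketch of such a distributed query; the db link remote_db and the table names are assumptions:

-- without the hint the statement runs locally and systimestamp shows the
-- local timezone; with the hint it ran at the remote site and systimestamp
-- showed the remote timezone
select /*+ driving_site(r) */ r.id, systimestamp
from orders@remote_db r
join local_status l on l.id = r.id;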

This effect happened on an older 10g database; I wasn't able to test whether the behaviour still exists in 12c.

A similar effect can be reached using the materialize hint.

opt_param('OPTIMIZER_SECURE_VIEW_MERGING','true|false')

The opt_param hint allows us to set certain instance parameters only for the duration of the running statement. Some of those parameters can potentially influence the outcome of queries.
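Used as a hint it looks like this sketch (view and function names are taken from Tom Kyte's example below):

-- the parameter is changed for this statement only, not for the session
select /*+ opt_param('optimizer_secure_view_merging','false') */ *
from a.v v1
where f(y) = 'ok to see';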

One of them is OPTIMIZER_SECURE_VIEW_MERGING
https://docs.oracle.com/en/database/oracle/oracle-database/12.2/refrn/OPTIMIZER_SECURE_VIEW_MERGING.html

An example from Tom Kyte:

the goal of this parameter [OPTIMIZER_SECURE_VIEW_MERGING] is to prevent a function owned by some user B from seeing data of some other user A that it should not see. For example, …

and after some setup…

b%ORA11GR2> set autotrace on explain
b%ORA11GR2> alter system set optimizer_secure_view_merging = true;

System altered.

b%ORA11GR2> select * from a.v v1 where f(y) = 'ok to see';

X Y
———- ——————————
1 ok to see

I see: ok to see

Execution Plan
———————————————————-
Plan hash value: 1931062764

—————————————————————————
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
—————————————————————————
| 0 | SELECT STATEMENT | | 10000 | 292K| 27206 (1)| 00:05:27 |
|* 1 | VIEW | V | 10000 | 292K| 27206 (1)| 00:05:27 |
|* 2 | TABLE ACCESS FULL| T | 10000 | 292K| 27206 (1)| 00:05:27 |
—————————————————————————

Predicate Information (identified by operation id):
—————————————————

1 - filter("F"("Y")='ok to see')
2 - filter("Y_F"("X")=1)

there it is clear that the Y_F(X)=1 is evaluated first and then f(y) is evaluated

b%ORA11GR2> alter system set optimizer_secure_view_merging = false;

System altered.

b%ORA11GR2> select * from a.v v1 where f(y) = 'ok to see';

X Y
———- ——————————
1 ok to see

I see: ok to see
I see: NOT ok to see

whoops, my function saw data that it should not

Execution Plan
———————————————————-
Plan hash value: 1601196873

————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————–
| 0 | SELECT STATEMENT | | 100 | 3000 | 27207 (1)| 00:05:27 |
|* 1 | TABLE ACCESS FULL| T | 100 | 3000 | 27207 (1)| 00:05:27 |
————————————————————————–

Predicate Information (identified by operation id):
—————————————————

1 - filter("F"("Y")='ok to see' AND "Y_F"("X")=1)

and now we can see the converse is true

b%ORA11GR2> alter system set optimizer_secure_view_merging = true;

System altered.

b%ORA11GR2> set autotrace off

There are also some side effects with VPD. See https://antognini.ch/2011/09/optimizer_secure_view_merging-and-vpd/

Also in some cases this parameter can result in errors.
Here is a report of such a case: https://support.esri.com/en/technical-article/000010620

Another example of a parameter that can influence result sets is RESULT_CACHE_REMOTE_EXPIRATION.
https://docs.oracle.com/cd/B28359_01/server.111/b28320/initparams208.htm#REFRN10294

The default is 0. Changing it to a positive integer allows us to get “stale” data from the result cache instead of fetching a fresh value from the remote source.
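A sketch, assuming for a test that the parameter is changed at session level:

-- accept result cache entries for remote objects that are
-- up to 10 minutes old instead of always fetching fresh data
alter session set result_cache_remote_expiration = 10;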

fresh_mv

This 12.2 hint applies to real-time materialized views. If the data is stale, then a query with this hint still uses the MV but computes the missing changes on top of it. Without the hint we would see the old data, with it we see the current data.

https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/Comments.html#GUID-5EF4198B-50B3-40D8-B12A-3D3115C69D9B

The FRESH_MV hint applies when querying a real-time materialized view. This hint instructs the optimizer to use on-query computation to fetch up-to-date data from the materialized view, even if the materialized view is stale.
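So the usage is as simple as this sketch (the real-time materialized view sales_rtmv is an assumption):

-- on-query computation delivers up-to-date results, even if sales_rtmv is stale
select /*+ fresh_mv */ customer_id, total_amount
from sales_rtmv;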

Similar, connected hints are no_rewrite and rewrite_or_error. All of those hints can change whether a (possibly stale) materialized view is used, which in turn can give a different result.

Disclaimer

This list does not claim to be complete. Also some of the mentioned behaviours might change in different database versions, especially if the behaviour is not documented and not intended.

There are more such hints. I have recorded some more, but couldn't consistently reproduce the issue. It might have been related to some bug or some other obscure circumstance that I can't remember.

Here is the current list for those “watch out” candidates:

  • first_rows
  • (no_)query_transformation
  • (no_)result_cache

The result cache in combination with functions that are declared deterministic, but are not truly deterministic, can easily produce wrong or different results. However the main culprit then is the incorrect usage of the deterministic pragma.

And some more strange things

The SELECT clause can also influence the number of rows returned. And not only via DISTINCT.

    with tbl as (select 1 val from dual union all  
                select 2 val from dual union all  
                select 3 val from dual )  
        SELECT  CASE  0  
                    WHEN  0  
                        THEN  'Aardvark'  
                    WHEN  SUM (val)  
                        THEN  'Baracuda'  
                END  AS c  
        FROM    tbl;  

Aardvark
Aardvark
Aardvark
3 rows selected.

    with tbl as (select 1 val from dual union all  
                select 2 val from dual union all  
                select 3 val from dual )  
        SELECT  CASE  6  
                    WHEN  0  
                        THEN  'Aardvark'  
                    WHEN  SUM (val)  
                        THEN  'Baracuda'  
                END  AS c  
        FROM    tbl;  

Baracuda
1 row selected.

The explanation seems to be (my assumption) that in the first query the CASE can be decided by the first WHEN branch alone, so SUM(val) is never evaluated and the statement is not treated as an aggregate query: it returns one row per input row. As soon as SUM(val) must be evaluated, the whole query becomes an aggregation and returns a single row.

Discovered by Frank Kulash and discussed and tested in this OTN thread:
https://community.oracle.com/thread/4097998?start=15&tstart=0

Exotic SQL: Bubble Sort with Model Clause

Introduction

This is a “just for fun” experiment. But I demonstrate some model clause effects on the way.

Task: Sort a pipe separated list using SQL.

For example this string '29|1|3004|3|2|24' has 6 elements separated by |. The elements should be sorted in numerical order so that the resulting string looks like this: '1|2|3|24|29|3004'.

Standard way

The “normal” way would be to tokenize the input string, put the tokens into rows, sort the rows and aggregate the sorted result to reassemble the string.

Here is one possible solution.


with inputdata(src)
     as (select '29|1|3004|3|2|24' from dual)
    ,num_rows (part, remains)
     as (select to_number(substr(i.src,1,coalesce(nullif(instr(i.src,'|'),0)-1,length(i.src)))) as part
               ,substr(src,instr(i.src,'|')+1) as remains
         from inputdata i
         UNION ALL
         select to_number(substr(r.remains,1,coalesce(nullif(instr(r.remains,'|'),0)-1,length(r.remains)))) as part
               ,substr(remains, nullif(instr(r.remains,'|'),0)+1) as remains
         from num_rows r
         where r.remains is not null
         )
    , combine (resultdata)
as (select listagg(to_char(part,'FM9999999999'),'|') within group (order by part) from num_rows)
select * from combine;

The num_rows subquery in this example is a recursive with clause that splits the input string into rows, one row for each token. The logic to split the string into tokens is based around substr. Essentially it just cuts off an element from the beginning of the string and keeps the remainder, until there are no elements left.
The combine subquery then uses listagg to sort the rows and convert them into a string again.

Standard SQL technologies that each decent developer should know about.

There are multiple ways to tokenize, to create the rows and to aggregate the rows again. But the general principle stays the same.

Here is a second example solution. This time using connect by to create rows and some regular expressions to count and split.

with inputdata(src) as (select '29|1|3004|3|2|24' from dual )
select listagg(token,'|') within group (order by to_number(token)) as res
from (
  select src, regexp_substr(src,'[^|]+', 1, level) token
  from inputdata
  connect by level <= regexp_count(src,'\|') + 1
  )
group by src;

Exotic way

I was wondering if we can use a totally different approach.

Instead of letting the database do the hard work with a built-in sort mechanism, we can implement one ourselves. Here comes bubble sort. It is one of the most inefficient sort mechanisms you can think of (unless the list is already sorted), but it is fairly easy to implement.

Since bubble sort is an iterative approach with a simple set of rules, the model clause immediately jumped to my mind.

So here is a solution based around a slightly optimised bubble sort mechanism (gnome sort). We will take it apart afterwards.
The solution here creates several rows, but this is just to see and check each step of the iteration. It is possible to do this using one row (dim=0) only, although it is much harder to develop and to understand.

with inputdata(src) as (select '29|1|3004|3|2|24' from dual)
select *
from inputdata
model dimension by (0 as dim)
measures (  src
          , 1 ele_pos1
          , 2 ele_pos2
          , cast (regexp_substr(src,'[^\|]+',1,1) as varchar2(100)) token1
          , cast (regexp_substr(src,'[^\|]+',1,2) as varchar2(100)) token2
          , regexp_count(src,'\|')+1 max_element
          , 2 hwm
          , cast (src as varchar2(500)) as res
          )
rules iterate (500)
      until (-- we can stop if the high water mark is at the last position and no switch needed
             hwm[iteration_number] >= max_element[0]
             and (ele_pos1[iteration_number] = 1 or to_number(token1[iteration_number]) < to_number(token2[iteration_number])))
( -- switch tokens if left token > right token
  res[iteration_number+1]
    = case when to_number(token1[iteration_number]) > to_number(token2[iteration_number]) then
      -- do the switch
      regexp_replace(
             regexp_replace(res[iteration_number],'[^\|]+',token2[iteration_number],1,ele_pos1[iteration_number])
                            ,'[^\|]+',token1[iteration_number],1,ele_pos2[iteration_number])
      else res[iteration_number] -- no switch needed
      end ,
  -- calculate next position for token1
  ele_pos1[iteration_number+1]
  = case
      when ele_pos1[iteration_number] = 1 then
        -- we reached first position, so go back to hwm
        hwm[iteration_number]
      when to_number(token1[iteration_number]) > to_number(token2[iteration_number]) then
        -- after a switch, move one position to the left and check there
        ele_pos1[iteration_number] - 1
      when ele_pos1[iteration_number] + 1 <= max_element[0]-1 then
        -- no more switch, so go to hwm and look for next element
         hwm[iteration_number]
       else
         -- just in case, don't do anything
         ele_pos1[iteration_number]
       end,
   -- calculate next position for token2
   ele_pos2[iteration_number+1]
    = case
       when ele_pos2[iteration_number] = 2 then
         --ele_pos2[iteration_number] + 1
         hwm[iteration_number] + 1
       when to_number(token1[iteration_number]) > to_number(token2[iteration_number]) then
        ele_pos2[iteration_number] - 1
      when ele_pos2[iteration_number] + 1 <= max_element[0] then
        hwm[iteration_number] + 1
      else
        ele_pos2[iteration_number]
      end,
  -- calculate high water mark
  hwm[iteration_number+1]
  = greatest(hwm[iteration_number],ele_pos2[iteration_number+1]),
  -- get token1 for new calculated position1
  token1[iteration_number+1]
  = regexp_substr(res[iteration_number+1],'[^\|]+',1,ele_pos1[iteration_number+1]),
  -- get token2 for new calculated position2
  token2[iteration_number+1]
  = regexp_substr(res[iteration_number+1],'[^\|]+',1,ele_pos2[iteration_number+1])
 );

Here is the result, but I show only the important column. Feel free to run the statement yourself (works on all supported database versions) and see the other helper columns.

Result (only Res column)
29|1|3004|3|2|24
1|29|3004|3|2|24
1|29|3004|3|2|24
1|29|3|3004|2|24
1|3|29|3004|2|24
1|3|29|3004|2|24
1|3|29|2|3004|24
1|3|2|29|3004|24
1|2|3|29|3004|24
1|2|3|29|3004|24
1|2|3|29|24|3004
1|2|3|24|29|3004
1|2|3|24|29|3004

Fairly easy SQL, isn't it?

65 lines of code instead of 8. So if your aim is to write obfuscated SQL then this certainly is the way to go.

If you want to understand it, skip to the “algorithm explained” chapter.

Rant

Now what is wrong with the model clause!? A simple algorithm like bubble sort looks this complex?

In the recent months I have written a number of model clause solutions for different problems. However none of them is easy. Apart from very specific cases, all model solutions are terrible to look at and terribly complex to build. In comparison it is much easier to do the same in Excel.

I attribute this to several things.

  1. The way to reference a cell is verbose. And when we look at the code, the brackets break the eye-scanning routine of a trained developer. If more than one dimension is involved it is even worse.
    Here are some examples:
    measure1[cv(),3] or resultdata[iteration_number+1]
    It is hard to see this as one cell reference.
    In Excel this would simply be: B$3 or C2.
  2. The way to reference a cell from the previous row is complex. It requires a working calculation for the cell address. We can do this by applying the row_number() analytic function or using iteration_number (for iterative models). But this requires some extra logic.
  3. There is no simple way to set several different measures at once. A rule can only be applied for one measure. This means that sometimes we need to repeat the same logic for a different measure.
    In the bubble sort example this happened for the calculation of the position of token 1 and the position of token 2. In this case it could be simplified a lot, because the position of token 2 is always exactly +1 from the position of token 1. In other cases it is not that simple. Additional measures can serve as a kind of variable to capture rules that then need to be applied to several other columns.

Excel Comparison

The model clause resembles Excel very much.

  • Dimensions are rows.
  • Measures are columns.
  • Rules are formulas

But why does the model clause feel so much more complex than some simple Excel cells?

Excel has a similar way to address cells. It is called the R1C1-style reference. It uses number coordinates to find a cell. But most of the time we use the A1-style reference. This is a little more intuitive and much shorter.

But more importantly the process to build formulas (=rules) is different. While building formulas we click on a source cell and its coordinates are used automatically. Excel then calculates the difference from the current cell to the referenced cell.

The main difference to Excel is that Excel has a separate formula for each cell, and a simple way to fill out a formula to the other cells. We don't need that copy mechanism with the model clause, because the rules are automatically applied to all relevant cells. But with the model clause it is hard to give a direct reference to a different row.

Also the different styles and the different ways to use absolute and relative cell references in Excel (e.g. B$2) are more convenient than doing the same in the model clause.

Pro tip: A good way to develop model rules is to build the formulas in Excel first, then copy the logic to SQL.

The model clause also has several advantages over excel. Among others are

  • Excel only has 2 dimensions and no decent concept of partitions.
  • Excel columns can not be named (although single cells and cell ranges can).
  • Model allows returning only the updated rows.
  • Model has an easy way to work with invalid cell references (IGNORE NAV).

Algorithm explained

Before we look at the statement, let’s look at the bubble sort logic first.

bubble sort logic

Start: 29|1|3004|3|2|24

I start with the first two tokens. 29 and 1. Token1 is always the left and token2 is always the right token.

Rule 1: If token1 is bigger than token2 we need to switch them.
Rule 2: If we are at position 1, we can go one step to the right.
Rule 3: If we don’t need to switch, then we can also go one step to the right.
Rule 4: If we did switch then go one step to the left.

Now lets see how the rules can be applied to the string.

Step 1: 1|29|3004|3|2|24 – Rule 1
Step 2: 1|29|3004|3|2|24 – Rule 2
Step 3: 1|29|3004|3|2|24 – Rule 3
Step 4: 1|29|3|3004|2|24 – Rule 1
Step 5: 1|3|29|3004|2|24 – Rules 4+1
Step 6: 1|3|29|3004|2|24 – Rule 3 (here I used an additional optimisation, by storing a high water mark and jumping as far right as the HWM allows)
Step 7: 1|3|29|2|3004|24 – Rule 1

Stop: We can finish the iteration if the HWM is to the far right and if no switch is needed anymore.

model SQL explained in detail

So how is this logic implemented in the model clause?

A token is found using a regular expression with the position of the token.

regexp_substr(src,'[^\|]+',1,4)

This finds the fourth token.

We start with a way to number our rows. The initial row is defined as 0.

dimension by (0 as dim)

0 is only used because later we address our rows using iteration_number. If we started with 1, we could potentially overwrite our initial data. Often 0 is a good starting row.

We then define several columns. And the initial value for those columns.

measures (  src
   , 1 ele_pos1
   , 2 ele_pos2
   , cast (regexp_substr(src,'[^\|]+',1,1) as varchar2(100)) token1
   , cast (regexp_substr(src,'[^\|]+',1,2) as varchar2(100)) token2
   , regexp_count(src,'\|')+1 max_element
   , 2 hwm
   , cast (src as varchar2(500)) as res
   )

The datatype and size of each column is automatically derived from its initial value. This is why cast is sometimes needed, when the column size should be bigger than what the initial value indicates.

Now we define that an iterative model is to be used.

rules iterate (500) until (stop criteria)

Iterative models are good for implementing procedural logic, or for tasks where it is unknown beforehand how large the calculation area needs to be.

It gives us a variable called iteration_number which can be used as a cell address.

Now lets look at a few rules.

-- switch tokens if left token > right token
res[iteration_number+1] 
    = case when to_number(token1[iteration_number]) > to_number(token2[iteration_number]) then
      -- do the switch
      regexp_replace(
             regexp_replace(res[iteration_number],'[^\|]+',token2[iteration_number],1,ele_pos1[iteration_number])
             ,'[^\|]+',token1[iteration_number],1,ele_pos2[iteration_number])
      else res[iteration_number] -- no switch needed
      end ,

The left hand side of the rule “res[iteration_number+1]” creates a new row for the res column. The right hand side of the rule references data from the previous row “token1[iteration_number]” to do the switch logic.

If data from the same row is needed, it is possible to use “token1[cv()]”. cv() stands for cell value of the current dimension. It is possible to calculate with those values. So instead of token1[iteration_number] we could also write “token1[cv()-1]” to fetch the value from the previous row. For iterative models I find it more convenient to stay consistent and use iteration_number instead.

This rule implements rule 1 (switch tokens) from the bubble sort logic. However it is not a 1:1 matching of rules. We can see this when looking at the next rule.

Instead the logic needs to be implemented for each column. To calculate the next position a similar case construct is needed.

-- calculate next position (for token1)
ele_pos1[iteration_number+1] 
       = case
         when ele_pos1[iteration_number] = 1 then
           -- we reached first position, so go back to hwm
           hwm[iteration_number]
         when to_number(token1[iteration_number]) > to_number(token2[iteration_number]) then
           -- after a switch, move one position to the left and check there
           ele_pos1[iteration_number] - 1
         when ele_pos1[iteration_number] + 1 <= max_element[0]-1 then
           -- no more switch, so go to hwm and look for next element
           hwm[iteration_number]
         else 
           -- just in case, don't do anything
           ele_pos1[iteration_number]
         end

Bubble sort rules 2-4 are implemented by a case construct.

As we can see in this example, the rules from the model clause do not match our business rules. Sometimes they do, but more often they do not.

conclusion

The model clause can open up new ways to solve SQL puzzles. Bubble sort is a good candidate for an iterative model.

For real world cases the model clause is hardly maintainable. Only if we find a solid way to match business rules to model clause rules do we have a good way to react to future changes of those business rules.

Sometimes the model clause can provide a performance advantage, because of the way the data is handled.

Model clause is a tool in our toolbox, although an extremely sophisticated and complex one. We need to train using this tool on a regular basis.

btw: Some of the constructs from the model clause can be rediscovered in other statements. For example the new analytic views in 12.2 also have a “dimension by” and a “measures” part. But the rules are missing (one can argue that attribute dimensions and hierarchies resemble the rules).

is {JSON}, or not is {JSON}, that is the question

Whether ’tis nobler in the mind to suffer
The [] and “” of outrageous json,
Or to take Arms against a Sea of troubles,
And by opposing {} end them: to select, to browse
No more; and by a browse, to say we end
the json array, and the thousand json objects
that Flesh is heir to? ‘Tis a consummation
devoutly to be wished. To select, to browse…

Intro

So I have this logging table where sometimes the message can be a json document. Most of the time it is an error message or some tracing information. But sometimes I want to show what data is currently inside a plsql collection. I wrote a small conversion function that returns a clob which should contain JSON. This json ends up as a message in my logging table.

Inside an apex application I added the possibility to show this JSON as a modal page. I use the JSONView Plugin, however most browsers now have a default JSON display capability.

I detect if the message contains json data by using the 12.1 IS JSON operator.

In case it is json, I provide a link to the modal page and print the data there.

case 
  when LOG_MESSAGE is json 
  then
    '<a href="'||
    apex_util.prepare_url('f?p='||v('APP_ID')||':123:'||v('APP_SESSION')
    ||'::::P123_LOG_ID:'||to_char(LOG_ID))
    ||'" title="show JSON data">{JSON}</a>'
else 
  '<pre>'||LOG_MESSAGE||'</pre>'
end as log_message_enhanced

The modal page uses the application/json mime type.

The result looks similar to this:

modal_json_ok

How to build such an apex page or how to convert a plsql collection to json is not part of this blog post. If you are interested in that, please leave a comment and I might consider publishing how to do so.

Problem

For certain documents the browser was not able to show the json. Instead it returned an error message and the json in text format. Here is an example.
modal_json_error

Sorry for the German Message, but I was not able to switch my Firefox to english mode easily. The error essentially says: “We could not parse the json, it seems the document does not comply to the specifications.”

So the database says it is json, but the browser says it is not!

What is going on? That is the question.

Solution

The reason for the behaviour is that Oracle's IS JSON check uses the lax json syntax by default. Lax json allows several things; among others it allows a trailing comma after the last member of an object. Exactly my issue.

Here is a basic SQL demonstration. Note the comma after “Larry” inside the json object.

select * from dual
where '[{"index":1,"name":"Larry",}]' 
      is json;
DUMMY
-----
X

So it is JSON. But only lax json.

Fortunately we can also check against the more strict json interpretation.

select * from dual
where '[{"index":1,"name":"Larry",}]' 
      is json (STRICT);
No rows selected.

Can you spot the difference? The STRICT keyword, including the parentheses, tells the database that the document needs to conform to the stricter specification.

This is of course documented: About Strict and Lax JSON Syntax

The main differences are

  • STRICT: each JSON field and each string value must be enclosed in double quotation marks (").
    LAX: An object literal can also be enclosed in single quotation marks (').
  • LAX: Case variations for keywords true, false, and null (for example, TRUE, True, TrUe, fALSe, NulL).
  • LAX: An extra comma (,) after the last element of an array or the last member of an object (for example, [a, b, c,], {a:b, c:d,}).
  • LAX: Numerals
    • with one or more leading zeros (for example, 0042.3).
    • Fractional numerals that lack 0 before the decimal point (for example, .14 instead of 0.14).
    • Numerals with no fractional part after the decimal point (for example, 342. or 1.e27).
    • A plus sign (+) preceding a numeral, meaning that the number is non-negative (for example, +1.3).

But there are more differences.
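For example, lax syntax also accepts unquoted field names and single-quoted strings (compare the {a:b, c:d,} example above). A quick check, following the same pattern as before:

select * from dual
where q'[{name:'Larry'}]' is json;
DUMMY
-----
X

select * from dual
where q'[{name:'Larry'}]' is json (strict);
No rows selected.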

Using the (STRICT) keyword solved my problem. Only those json documents that could be shown in the browser were linked; the others were rendered as normal text.

Conclusion

Be aware that there are slightly different JSON specifications available.

If you want to show json inside a browser, then make sure the json document conforms to the strict json specification.

If you want to use json inside javascript, then probably the lax version is better suited.