Some quick facts about sequence caches and gaps in IDs

  1. Oracle sequences generate key values without gaps (minor exception from the rule: scalable sequences).
  2. Surrogate keys can and will have gaps.
  3. Gaps exist because it is not worth the huge effort to avoid them.

The biggest misunderstanding is probably the difference between the value coming from a sequence and the primary key value inside a database column (ID column). We can use a sequence to set a surrogate key. Typically we do this by a database trigger or an identity column or even as a value directly during the insert.

“The history of sequence caches is a history of misunderstandings.”

somebody on the internet, 1963

Essentially it is a two step process. The new ID value is created by sequence.nextval and then used during the insert. Oracle provided the logic to get a new sequence value in a fast, performant and multi-session friendly way. The developer is responsible to make sure this value is not wasted.

The sequence cache is a performance instrument. A lost cache value means we will see gaps in future ID values. There is only a very limited number of activities that makes cached sequence values “disappear”. The majority of gaps originates from other activities, like failed inserts or deletes. Many developers seem to think, that such a gap is caused by the sequence cache. I find it important to separate what reason ultimately lead to a gap in an ID value.

A gap analysis can be the first step. The following select finds gaps and orders them by size. Put your own table and PK column into the “ids” subquery and find out how the gaps are distributed in that table.

-- gap analysis
with ids as (select region_id as id, lead(region_id) over (order by region_id) as nextid from regions)
   , gaps as (select id, nextid - id as gap_size from ids where nextid > id + 1)
select gap_size , count(*) gaps_found  
from gaps
group by gap_size
order by count(*) desc;

Example result

gap_sizegaps_found
205
32
22
191

Here the data shows a few small gaps. Either some insert didn’t make it or rows had been deleted. But there are also several gaps of size 19 and 20, which means the sequence cache was probably lost at least 6 times.

To make it clear, IDs are allowed to have gaps. However when an end user complains why he constantly sees gaps in some ID, then we should investigate what is going on. Frequent gaps should not be the expected normal behaviour – it is worth finding the root cause for such an issue.

So here is a (non complete) list of activities that do or do not influence the caching of sequences and the existence of gaps in an ID column.

action result
multiple sessionscache used
multiple sessions with one session slow to commitIDs not in order of commit => not a gap eventually, but looks like a gap until final commit
alter system flush shared_pool ;cache values lost => gap in ID
shutdown immediate;cache values not lost
shutdown abort;cache values lost => gap in ID
insert; commit;no gap
insert; rollback;gap in ID
failed insertgap in ID
delete; commit;gap in ID
switch sequence from cache to nocachecache values not lost
kill sessionpossible gap in ID
(values not used|commited are lost)
different RAC sessionscache values not lost
(but separate caches on each RAC node)
SGA size too smallcache values lost => gap in ID
(sequence cache ages out of shared pool per LRU list)
session sequencesnot useful for unique ID values => duplicate ID
scalable sequencescache values not lost => planned large gap in ID
(because of prefix)
alter system cancel SQL ‘SID, serial’;
(insert killed)
gap in ID
alter system disconnect session post_transaction;no gap
(unless sequence was fetched before the transaction started)
alter system disconnect session immediate; gap in ID
sequence restartpossible duplicate IDs
sequence interval > 1planned gap in ID

That a sequence ages out of the shared pool is in my experience the most common cause of lost sequence values that are noticed. A strong indication is when frequent gaps of 20 values exist.

Possible workarounds are

a) set the sequence to nocache if the sequence is used rarely
alter myseq nocache;
This might cost performance.

b) pin the sequence in the shared pool
dbms_shared_pool.keep('MYSCHEMA.MYSEQ','Q')
Pinning the sequence is useful when you have times with lots of concurrent activities followed by larger idle periods.
This costs a little memory.

c) increase SGA size – sometimes it helps to increase the restricted shared pool area.
This costs memory (and money) – not available for other non-database processes.

Conclusions

  • Small gaps in surrogate keys (ID columns) are natural, mostly because of delete statements or failed inserts.
  • The sequence object typically does not produce gaps.
  • Very frequent gaps of size 20 (default cache size) are not natural and might need investigation.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.