- Oracle sequences generate key values without gaps (minor exception from the rule: scalable sequences).
- Surrogate keys can and will have gaps.
- Gaps exist because it is not worth the huge effort to avoid them.
The biggest misunderstanding is probably the difference between the value coming from a sequence and the primary key value inside a database column (ID column). We can use a sequence to set a surrogate key. Typically we do this by a database trigger or an identity column or even as a value directly during the insert.
“The history of sequence caches is a history of misunderstandings.”somebody on the internet, 1963
Essentially it is a two step process. The new ID value is created by
sequence.nextval and then used during the
insert. Oracle provided the logic to get a new sequence value in a fast, performant and multi-session friendly way. The developer is responsible to make sure this value is not wasted.
The sequence cache is a performance instrument. A lost cache value means we will see gaps in future ID values. There is only a very limited number of activities that makes cached sequence values “disappear”. The majority of gaps originates from other activities, like failed inserts or deletes. Many developers seem to think, that such a gap is caused by the sequence cache. I find it important to separate what reason ultimately lead to a gap in an ID value.
A gap analysis can be the first step. The following select finds gaps and orders them by size. Put your own table and PK column into the “ids” subquery and find out how the gaps are distributed in that table.
-- gap analysis with ids as (select region_id as id, lead(region_id) over (order by region_id) as nextid from regions) , gaps as (select id, nextid - id as gap_size from ids where nextid > id + 1) select gap_size , count(*) gaps_found from gaps group by gap_size order by count(*) desc;
Here the data shows a few small gaps. Either some insert didn’t make it or rows had been deleted. But there are also several gaps of size 19 and 20, which means the sequence cache was probably lost at least 6 times.
To make it clear, IDs are allowed to have gaps. However when an end user complains why he constantly sees gaps in some ID, then we should investigate what is going on. Frequent gaps should not be the expected normal behaviour – it is worth finding the root cause for such an issue.
So here is a (non complete) list of activities that do or do not influence the caching of sequences and the existence of gaps in an ID column.
|multiple sessions||cache used|
|multiple sessions with one session slow to commit||IDs not in order of commit => not a gap eventually, but looks like a gap until final commit|
|alter system flush shared_pool ;||cache values lost => gap in ID|
|shutdown immediate;||cache values not lost|
|shutdown abort;||cache values lost => gap in ID|
|insert; commit;||no gap|
|insert; rollback;||gap in ID|
|failed insert||gap in ID|
|delete; commit;||gap in ID|
|switch sequence from cache to nocache||cache values not lost|
|kill session||possible gap in ID|
(values not used|commited are lost)
|different RAC sessions||cache values not lost|
(but separate caches on each RAC node)
|SGA size too small||cache values lost => gap in ID |
(sequence cache ages out of shared pool per LRU list)
|session sequences||not useful for unique ID values => duplicate ID|
|scalable sequences||cache values not lost => planned large gap in ID|
(because of prefix)
|alter system cancel SQL ‘SID, serial’;|
|gap in ID|
|alter system disconnect session post_transaction;||no gap|
(unless sequence was fetched before the transaction started)
|alter system disconnect session immediate;||gap in ID|
|sequence restart||possible duplicate IDs|
|sequence interval > 1||planned gap in ID|
That a sequence ages out of the shared pool is in my experience the most common cause of lost sequence values that are noticed. A strong indication is when frequent gaps of 20 values exist.
Possible workarounds are
a) set the sequence to
nocache if the sequence is used rarely
alter myseq nocache;
This might cost performance.
b) pin the sequence in the shared pool
Pinning the sequence is useful when you have times with lots of concurrent activities followed by larger idle periods.
This costs a little memory.
c) increase SGA size – sometimes it helps to increase the restricted shared pool area.
This costs memory (and money) – not available for other non-database processes.
- Small gaps in surrogate keys (ID columns) are natural, mostly because of delete statements or failed inserts.
- The sequence object typically does not produce gaps.
- Very frequent gaps of size 20 (default cache size) are not natural and might need investigation.