I found the following interesting issue by accident during an performance analysis using ASH and real time SQL monitoring.
We (@gassenmj and myself) saw that all connections from an application server connection pool were suddenly “hanging” at the same SQL statement. The single session did have wait times ranging from a few seconds up to 30 seconds and more. Additionally some CPU throttleling was going on.
After a couple of minutes the system healed itself and went back to normal behaviour. Those were the symptoms. Although that was not directly related to the performance analysis we had planned, we decided to investigate further.
Here follows a list of statements that helped to analyze the problem.
check session infos
select * from v$session; select * from v$session_wait --where sid=sys_context('userenv','sid') order by wait_time+seconds_in_wait desc;
We saw a high number of “cursor: pin S wait on X” events with a high wait time. At that time I didn’t know what the issue was, but it was obvious that there is some kind of problem. If you encounter a high number of sessions all with the same wait event, then you might want to investigate further. Especially if the system seems generally slow while this event happens.
check the active session history (ASH) report
If you are allowed to use ASH, then run the following selects for a quick check. ASH typically holds information only for the last 30 minutes. Then the data is consolidated into the AWR report. If possible I prefere ASH because it samples information every second.
-- ash info select ash.sample_time, ash.session_id, ash.session_serial#, ash.sql_id, ash.event, -- ash.p1, ash.p1text, ash.p2, ash.p2text, ash.p3, ash.p3text, ash.wait_time, ash.time_waited, ash.blocking_session, ash.blocking_session_serial#, ash.in_parse --, ash.in_hard_parse, ash.in_sql_execution from v$active_session_history ash where ash.event = 'cursor: pin S wait on X' order by ash.sql_id, ash.sample_time;
Eventually the data is moved into the dba_hist_active_sess_history view.
-- awr info select sql_id, sample_time, session_id, wait_time, time_waited, module, machine from dba_hist_active_sess_history where event = 'cursor: pin S wait on X' order by sql_id, sample_time;
We can consolidate the infomation over all sessions that seem to be connected to the issue. This summary statment also includes the session that seems to cause the issue.
-- ash summary with sum_by_session as ( select ash.sql_id, ash.session_id, trunc(ash.sample_time,'HH') timeframe, min(ash.sample_time) min_time, max(ash.sample_time) max_time, sum(ash.wait_time) + sum(ash.time_waited) total_wait, ash.event, ash.p1, ash.p1text, ash.p2, ash.p2text, ash.p3, ash.p3text, ash.in_hard_parse from v$active_session_history ash join v$active_session_history sqlids on sqlids.sql_id = ash.sql_id where (ash.event like 'cursor: pin S%' or ash.in_hard_parse = 'Y' ) and sqlids.event = 'cursor: pin S wait on X' group by ash.sql_id, ash.session_id, ash.event, ash.p1, ash.p1text, ash.p2, ash.p2text, ash.p3, ash.p3text, ash.in_hard_parse ,trunc(ash.sample_time,'HH') ) select s.sql_id, to_char(s.timeframe,'dd-Mon-RR HH24') timeframe, to_char(min(s.min_time),'HH24:MI:SS')||'-'||to_char(max(s.max_time),'HH24:MI:SS') timeperiod, round(sum(total_wait)/1000000,2) total_wait_in_s, s.event, s.p1, s.p1text, s.p2, s.p2text, s.p3, s.p3text, s.in_hard_parse, listagg(s.session_id,',') within group (order by s.session_id) as sids from sum_by_session s group by s.sql_id, s.event, s.p1, s.p1text, s.p2, s.p2text, s.p3, s.p3text, s.in_hard_parse, s.timeframe order by s.sql_id, s.in_hard_parse desc, s.timeframe;
Here is some sample output from a test database. It shows a list of sessions (column SQLIDs) that are waiting for cursor: pin S wait on X.
SQL_ID TIMEFRAME TIMEPERIOD TOTAL_WAIT_IN_S EVENT P1 P1TEXT P2 P2TEXT P3 P3TEXT IN_HARD_PARSE SIDS a7s0bjm8467xg 22-mai-18 04 04:21:10-04:21:10 0,46 cursor: pin S wait on X 3494059951 idn 674309865472 value 21474836480 where N 20,255,301 cq42gcnqfs5vd 22-mai-18 04 04:21:09-04:21:09 0,04 1 file# 31620 block# 1 blocks Y 247 cq42gcnqfs5vd 22-mai-18 04 04:21:10-04:21:10 0,03 1 file# 31323 block# 1 blocks Y 247 cq42gcnqfs5vd 22-mai-18 04 04:21:09-04:21:10 610,9 cursor: pin S wait on X 753669997 idn 1060856922112 value 21474836480 where N 18,51,70,73,127,169,191,229,326,379,390,426,436,456 3320xrzqy5t5c 22-mai-18 12 12:21:13-12:21:13 0 1 file# 22750 block# 1 blocks Y 269 3320xrzqy5t5c 22-mai-18 12 12:21:13-12:21:13 0,11 cursor: pin S wait on X 3991069868 idn 1155346202624 value 21474836480 where N 314 76fxznczqskp8 22-mai-18 00 00:21:07-00:21:07 0 1 file# 589 block# 1 blocks Y 15 76fxznczqskp8 22-mai-18 00 00:21:08-00:21:08 0,05 1 file# 31323 block# 1 blocks Y 15 76fxznczqskp8 22-mai-18 04 04:21:08-04:21:08 0,04 1 file# 507 block# 1 blocks Y 73 76fxznczqskp8 22-mai-18 00 00:21:07-00:21:08 404,66 cursor: pin S wait on X 1064061608 idn 64424509440 value 21474836480 where N 8,67,90,151,167,193,221,253,314,351,364,367,419,456 76fxznczqskp8 22-mai-18 04 04:21:08-04:21:08 419,54 cursor: pin S wait on X 1064061608 idn 313532612608 value 21474836480 where N 18,51,70,127,169,191,247,326,379,390,426,436,456
In this example we had many sessions with the “cursor: pin S wait on X” wait event for the cursor “cq42gcnqfs5vd”. Those sessions combined waited for more than 10 minutes! And we also see two lines for the session that created this cursor (hard_parse=Y). Two lines because the hard parse took considerable time and ASH samples every second. The P2 info changed during that (last two bytes changed). I assume this means that the hard parse was progressing.
In general the summary statement shows if this wait event happend recently and also that many sessions where waiting for this event.
Finding the problem statement
select * --sql_id, sql_text, sql_fulltext from v$sql where sql_id = 'cq42gcnqfs5vd';
select sql_id, module, sql_fulltext, cpu_time, elapsed_time, concurrency_wait_time, version_count, parse_calls, invalidations from v$sqlarea where sql_id = 'cq42gcnqfs5vd';
The column SQL_FULLTEXT will hold the statement. But this information might age out of the SGA. So the sooner after the issue you run this the better. The good news is that the hashed sql_id is stable. Meaning if the same statement is run later again, you will see the same sql_id. And since we are investigating an issue that happened because the exactly same statement was run several times at the same moment (many sessions where waiting), there is a good chance that we get lucky and see this command in v$sql.
Very useful are the event_histograms. It can give an indication how severe the problem is. If we see a recent LAST_UPDATE_TIME for one of the high numbers (>= 4096) then some action might be needed. Also the v$event_histogram view helps to monitor if the issue disappears or is lowered once it is resolved.
select * from v$event_histogram where event = 'cursor: pin S wait on X';
EVENT# EVENT WAIT_TIME_MILLI WAIT_COUNT LAST_UPDATE_TIME 282 cursor: pin S wait on X 1 393 22.05.18 07:09:40,837552 +02:00 282 cursor: pin S wait on X 2 72 06.05.18 20:21:14,069132 +02:00 282 cursor: pin S wait on X 4 302 19.05.18 20:52:37,773557 +02:00 282 cursor: pin S wait on X 8 2043 22.05.18 07:09:41,041724 +02:00 282 cursor: pin S wait on X 16 6586 22.05.18 01:07:19,804808 +02:00 282 cursor: pin S wait on X 32 14719 22.05.18 07:09:41,054201 +02:00 282 cursor: pin S wait on X 64 16058 22.05.18 12:21:14,725227 +02:00 282 cursor: pin S wait on X 128 7514 22.05.18 12:21:13,702598 +02:00 282 cursor: pin S wait on X 256 10496 22.05.18 07:09:40,366636 +02:00 282 cursor: pin S wait on X 512 11360 22.05.18 07:09:40,364821 +02:00 282 cursor: pin S wait on X 1024 2123 22.05.18 07:00:05,691792 +02:00 282 cursor: pin S wait on X 2048 2240 22.05.18 04:21:11,172316 +02:00 282 cursor: pin S wait on X 4096 284 03.05.18 12:22:09,853604 +02:00 282 cursor: pin S wait on X 8192 169 30.04.18 16:22:30,975458 +02:00 282 cursor: pin S wait on X 16384 180 30.04.18 16:22:30,985937 +02:00 282 cursor: pin S wait on X 32768 31 19.09.17 18:50:00,401702 +02:00 282 cursor: pin S wait on X 65536 2 04.05.16 22:09:14,572347 +02:00
I run this on May 22nd. We see mutex sleeps for up to 2 seconds (column wait_time_milli = 2048) at 4 a.m. And a high number of mutex sleeps for 1 second at around 7 a.m. Nothing too critical but so that it might be worth checking what is going on. More about mutex sleeps in the section that explains the problem.
We also see an entry for 16 seconds from 30th of April. This was the time before we fixed the problem. The event history shows that since that time, we never had such long sleeps anymore. Which proves that our fix is working.
What the event histogram does not show, is that the waits from May 22nd and from April 30th are from a different sql_ids. So it give us only a very crude monitoring.
Event detail and mutex analysis
select * from GV$MUTEX_SLEEP_HISTORY where blocking_session = 247;
This should show several sesssions during the problematic time frame. The location is the most interesting column. It will probably be this location
We can use this location to check out the general mutex sleep history.
select MUTEX_TYPE,LOCATION,REQUESTING_SESSION,BLOCKING_SESSION,SLEEP_TIMESTAMP from GV$MUTEX_SLEEP_HISTORY where location='kkslce [KKSCHLPIN2]'
And to get an impression about the severity of the issue, lets compare the event against other mutex sleeps.
select * from v$mutex_sleep --where location='kkslce [KKSCHLPIN2]' order by sleep time desc;
MUTEX_TYPE LOCATION SLEEPS WAIT_TIME Cursor Pin kkslce [KKSCHLPIN2] 1555183 0 Library Cache kglpndl1 95 116623 0 Library Cache kglpin1 4 107578 0 Library Cache kgllkdl1 85 105747 0 Library Cache kglhdgn2 106 83336 0 Library Cache kglllal3 111 45584 0 Library Cache kgllkc1 57 41674 0 Library Cache kglpnal1 90 37040 0 Library Cache kglget2 2 12203 0 Library Cache kgllldl2 112 11882 0 Cursor Pin kksfbc [KKSCHLFSP2] 11303 0 ...
A high number of sleeps here shows that we have some issue.
But enough about such statements. I will explain a little why they are relevant for the problem and also what kind of problem there is.
Some technical background information
There are several recommended publications that helped me to understand and solve the issue. See links to Tanel Poder and Jonathan Lewis articles in the link section at the end of this post.
About this wait event: doc about “cursor: pin S wait on X”
The doc explains a little bit what happens. This wait event occurs when a session requires a shared mutex pin, while another session holds an exclusive mutex pin. Mutex pins are essentially locks on memory objects.
It is a lock to create the execution plan for a statement. The mutex organizes access to a shared cursor. During a hard parse the cursor (=execution plan) is created. The hard parse sets the exclusive lock. Other sessions are checking if there is already a cursor for the SQL statement (based upon the hashed sql_id). This is done by a shared pin call (cursor pin S). However the first session (247 in my case) did not free up this ressource yet, because it is still hard parsing. That is why we see a cursor: pin S wait on X for all the other sessions.
The issue was caused by a select that was extremly expensive to (hard) parse but which was very fast to execute.
The hard parse needed several seconds (usually 9 but up to 24 seconds). Any consecutive statement call was a soft parse using bind variables and executed in way less than 1 second.
This statement was send by some java application at the exact same time to all sessions in the connection pool. So the exact same statement, but with different bind parameters needed to run.
Each database session did this:
- First it builds a hash (SQL_ID) for the sql statement.
- Then it searches in the SGA if an entry for this SQL_ID already exists in a memory lookup table.
- Since another session, at the same time, already tries to create an execution plan (hard parse). Therefore the later session waits for the outcome of this. This is a mutex wait.
- The mutex then goes to sleep for a short time.
- It wakes up and since the hard parse is still running, it goes to sleep again. This time a little longer.
The problem with mutex sleeps is twofold.
- the sleep times tend to increase exponentially. We can see this in the mutex_sleep_history and event_histogram views. It starts with 1 microsecond,
but very soon reaches an order of several seconds.
- When a mutex tried to access a latch and sleeps on it – the latch is not free for other sessions to grab it. This can throttle the CPU.
So the mutex sleeps get longer and longer while the hard parse is still running.
Eventually the next mutex sleep will be longer than what the original hard parse time was. So even if the blocking session finished, the waiting session still doesn’t execute because the mutex sleeps now for the next 8 or 16 seconds. This can cascade to other sessions. I have seen total wait times of more than 30 seconds because of this cascading effect.
The process is partially documented (shared pool check), although not in enough detail to cover this mutex problem scenario.
For more information about how mutex sleeps are coupled with CPU starvation, check out the first article in the links section by Tanel Poder.
Two ways to address the problem immediatly spring to mind:
a) Avoid calling the same statement so many times. Maybe do a single call where all the relevant binds are added as an in-clause. Or as a subquery.
b) Improve the performance of the hard parse.
Best idea is probably to do both. Decrease the number of calls and improve hard parse times.
But often we can influence only the database side, not the caller side. Here are some ways to make the hard parse faster. There are various ways like SQL profiles, pinning the cursor, etc. available. But if you are a developer those are usually not your first choises.
Note that using SQL Baselines will probably not help (see https://jonathanlewis.wordpress.com/2014/11/23/baselines/).
As a developer think about the SQL statement first and if there is a way to help the optimizer.
Ask yourself and try out
Are all the subqueries/views really needed to get the result? Often when adding views, there are joins to tables included that are not needed. While the optimizer can use a feature called table elimination, it still needs time to consider this. Rewrite the SQL and instead of joining the view, join only the needed the base tables.
If there are lookup tables that are joined to the main queries FROM clause consider to move them to the SELECT part. This will not work, if the lookup is part of the filtering. However many lookup tables have a multi-language component. Which in turn often means, they are considered child tables for the optimizer. So that the driving table can not be correctly identified anymore. Changing that will change how the optimizer thinks about the other tables (also 1 less table for the permutations count). Sometimes moveing the lookup logic to plsql is also an option. Those values will stay while the whole process is running, so fetch them once, then use the IDs for filtering.
If larger repeating subqueries exist, consider subquery factoring (WITH clause).
Check if you have unnessecary DISTINCT operations. Sometimes this prevents optimizer possibilities like combining UNION ALL queries (join factorization). Typically this will not help to reduce hard parse times, but I’ve seen one example with lots of joined tables in the UNION ALL query, where it happend.
Hints might be a final option to use. Especially the leading hint is a way to massivly decrease the number of permutations that the optimizer needs to consider. While hints should not be the first choice, here we have a scenario where the developer has better information than the CBO. An approach that I call “minified hinted outlines” can be used. Essentially create a plan and grab all the leading hints from the outline. Then put those leading hints into the statement. This will freeze the table order but still allows the optimizer to choose between access paths like index access vs. full table scan and between many other options. I blogged about it before: SQL parse time issue – use “minified hinted outlines”. It is a solid compromise to reduce hard parse time, without totally freezing the execution plan.
This is the hint we ended up with.
/* Do not remove the LEADING hints. The next statement has an extremly long hard parse time. The hint minimized the hard parse time. Execution of the statement is very fast. Without the hint we see major blocking issues (CURSOR: PIN S WAIT ON X) for multiple concurrent sessions */ select /*+ LEADING(@"SEL$9E38E278" "OLIS"@"SEL$11" "ERRORS"@"SEL$11" "EXTRAINFO"@"SEL$11" "WFI"@"SEL$2" "ILI"@"SEL$19" "PL"@"SEL$27" "TUBELOC"@"SEL$29" "RACK"@"SEL$31" "ICH"@"SEL$17" "SOL"@"SEL$25" "BAT"@"SEL$21" "CPD"@"SEL$23") LEADING(@"SEL$FC92CB92" "OSM"@"SEL$12" "O"@"SEL$12" "S"@"SEL$12" "LI"@"SEL$12") LEADING(@"SEL$5B04CACE" "SOA"@"SEL$13" "CE"@"SEL$13") LEADING(@"SEL$AB668856" "POS"@"SEL$8" "SOA"@"SEL$7" "CWFI"@"SEL$6" "LI"@"SEL$6") LEADING(@"SEL$6A1B73E7" "SOA"@"SEL$4" "CWFI"@"SEL$3" "LI"@"SEL$3") */ r.barcode ,r.orderid ...
Test and reproduce the issue
I tried to build a test script to reproduce such long hard parses. Essentially one needs to create a lot of joins with different filter conditions on all of the tables. Maybe add a couple of union all + distinct makes sure. And we could cheat slightly. By increasing the value of the hidden parameter _optimizer_max_permutations we increase the hard parse time if we join enough tables.
So far I didn’t have the time to completly rebuild the scenario. Mostly because I have multiple statements for a real customer, which I can use to test.
This is a very rare edge case and the first time I see an effect where using evenly distributed bind variables is slower than not using them. But it must be mentioned, that both cases are far from the optimal execution time.
Forcing the hard parse by not using bind variables would have resulted in an average parsing time of 7-9 seconds. Which is incredibly slow. However it would not have caused any mutex sleeps and therefore no CPU throtteling.