In January of 2011,
Oracle Support approached me about a Service Request they received from a customer. This customer had an APEX 4.0.2 application in production with more than 1,000 users. After they upgraded their Oracle database to version 11.2.0.2, the system ran fine for 2 days and then eventually "locked up", as they put it. They identified the problem as being related to library cache locks, and the offending statement they reported was:
begin wwv_flow_log.g_content_length := sys.htp.getcontentlength; end;
I reviewed the problem myself, reviewed it with a couple of other folks on the APEX development team, and we were mystified how this could possibly be related to library cache locks.
Fast forward to today, and I was contacted by a completely different customer who was running a benchmark with 100 simulated concurrent users (concurrent as in truly concurrent with no think time). Their conclusion after the benchmark was that there was library cache latch contention due to Application Express, and lo and behold, the statement they identified as being the source of library cache latch contention was:
begin wwv_flow_log.g_content_length := sys.htp.getcontentlength; end;
This was beyond coincidence that two different customers, in highly concurrent use of an APEX application, identify latch locks and/or latch contention issues with the very same line of the APEX source code.
The PL/SQL block in question was in the activity logging PL/SQL package of APEX. This is the package that writes to the activity log of Application Express, which tracks information like the application, page, user, elapsed time, etc. One of the changes in Application Express 4.0 was a new attribute in the activity log called content length. It was populated via a block like:
begin
execute immediate ('begin wwv_flow_log.g_content_length := sys.htp.getcontentlength; end;');
exception when others then
g_content_length := 0;
end;
I'm sure some will ask why this was done dynamically. Simply stated, because it could not be statically compiled. Most versions of the PL/SQL Web Toolkit will not contain the function getcontentlength() in the package specification. So for those installations that do contain this function, the dynamic execution will succeed. For those that don't (the vast majority), the parse of the PL/SQL block will fail, an exception will be raised, and the exception block will catch it and set a variable to 0. But why would something like this cause library cache latch contention?
I really had no idea. I found no bugs on this issue.
apex.oracle.com was on APEX 4.0.2 on 11.2.0.2 and this was never an issue. So I diagnosed this problem the only remaining way I knew how...I
asked Tom.
I knew in advance that Tom was going to rail about the
"when others", but I told him I wasn't the author of this code. He also pointed out that doing this dynamically would not result in optimal performance. Sounds good...but that's not the source of the latch contention. In a nutshell, Tom suspected that there is a change in the way the parse happens in 11.2, and a bad parse (i.e., failed parse) is more expensive in 11.2 than earlier database versions. Is it a database bug? Maybe - but it also could simply be changed behavior.
Rather than wait for this problem to be diagnosed as potentially a database bug and maybe some database patch set may contain a fix, I decided to work around this. This particular piece of code, for all practical purposes, has been commented out - no dynamic execution, no failed parse and exception processing, for each and every invocation of the activity logging procedures.
I believe this issue is quite important. For our customers on large, highly concurrent production sites on APEX 4.0, when they decide to upgrade to Database version 11.2.0.2, they may experience degradation in performance for the same application with the identical load. There is now a patch set exception (a "one-off patch") for this issue for APEX 4.0.2.00.07 on
My Oracle Support. Look for patch 12404581. The actual patch itself is not database version specific. Hopefully this blog post (and patch) will save some of our customers from future headaches.