Wednesday, December 14, 2011

High CPU Usage and APEX (again)

A couple months ago, I wrote about a customer who was experiencing High CPU Waits presumably from Oracle Application Express.  Well, earlier this week I was at another customer site with exactly the same problem.

This was a case of "the business users" using Oracle Application Express to build an application.  The application was built in short order and well received by the end users.  The application became popular, they expanded the application and grew the user base, and after moving it to a different and faster environment, they hit a wall.  The DBA's performed some analysis on the application, and based upon the Active Session History reports, they concluded that the statements using the most CPU were all from Application Express and of the form:

   rc__ number; 
   simple_list__ owa_util.vc_arr; 
    complex_list__ owa_util.vc_arr; 
owa.init_cgi_env(:n__, :nm__, :v__); 
htp.HTBUF_LEN := 63; 
null;  null; 
simple_list__(1) := 'sys.%'; 
simple_list__(2) := 'dbms\_%'; 
simple_list__(3) := 'utl\_%'; 
simple_list__(4) := 'owa\_%'; 
simple_list__(5) := 'owa.%'; 
simple_list__(6) := 'htp.%'; 
simple_list__(7) := 'htf.%'; 
simple_list__(8) := 'wpg_docload.%'; 
simple_list__(9) := 'ctxsys.%'; 
simple_list__(10) := 'mdsys.%'; 
if ((owa_match.match_pattern(p_string => 'f' /* */, p_simple_pattern => simple_list__ , p_complex_pattern => complex_list__ , p_use_special_chars => false))) then
    rc__ := 2; 
    null; null; f(p=>:p); 
if (wpg_docload.is_file_download) then
   rc__ := 1; 
   null;  null;   null; 
  rc__ := 0;  null;      null;   null;  commit; 
   owa.get_page(:data__, :ndata__); 
end if; 
end if; 
:rc__ := rc__; 

The problem was, as I tried to explain to the DBA's, this PL/SQL block merely reflected the execution of a page view in Application Express. The ASH report will aggregate the "top" sections by top-level call, but the problem really lies within these calls, with what the application developers coded in their application.  For all practical purposes, the overhead of the APEX engine is fixed, and anything else beyond that is what is coded inside of the application by the application developer.  The DBA's asked a very reasonable question, namely "what do we do next?"

We started the process by analyzing the activity log of Oracle Application Express for the particular time window, to identify what was consuming the vast majority of time.  And we were able to produce a nice listing of applications and pages, sorted by average execution time (really no differently than what I posted earlier).  But once we did this, the DBA's wanted to be able to drill down into what was executing on those pages so they could analyze it and tune it.  Given that the vast majority of this customer's primary application used SQL Reports and Interactive Reports, we joined the activity log with the APEX view which contained the SQL statements for the reporting regions on those pages (APEX_APPLICATION_PAGE_REGIONS).  We ended up with the following query:

select sum(elapsed_time), count(*), 
       to_char(avg(elapsed_time),'9999.99') avg_elap, 
       a.application_id||':'||a.page_id app_and_page,
  from apex_workspace_activity_log a
  join apex_application_page_regions b 
    on a.application_id = b.application_id and a.page_id = b.page_id
 where a.view_date between to_date('201112142100','RRRRMMDDHH24MI') 
                       and to_date('201112142105','RRRRMMDDHH24MI')
   and a.page_id is not null
   and b.source_type like '%Report%'
 group by a.application_id||':'||a.page_id, dbms_lob.substr(b.region_source,2000,1)
having count(*) > 10
 order by 3 desc

This query can also be issued directly from SQL*Plus by the DBA's.  If they connect as SYS or as a user who has been granted the database role APEX_ADMINISTRATOR_ROLE, in Application Express 4.1, they will be able to query across all workspaces and APEX applications on the instance.

Understand that there are many places in an application which could be causing performance issues.  There could be slow queries in an authorization scheme that is used in multiple places on a page.  There could be processes or validations or elements on page 0 or a slow page sentry or a host of other things that could be causing degraded performance and which would become apparent when debugging a page.  But to identify the report queries on pages and associate those with the activity log and slowest performing pages, the above query is a great place to start.


Kevan Gelling said...
This comment has been removed by the author.
Natalio GUada said...

We have the same problem, CPU is hitting the roof but whe using your query (Changing the dates) we get no rows. Our environment is using the runtime version.

Our DBAs are pointing to the follwing queries as culprits, but we have no idea how to lessen the impact.

1. begin f(p=>:1 ); commit; end;

2. select /*+ connect_by_filtering */ privilege#,level
from sysauth$ connect by grantee#=prior privilege# and privilege#>0 start with grantee#=:1 and privilege#>0


Joel R. Kallman said...

Hi Natalio,

I'm familiar with your SR which is being worked on by Oracle Support.

1) The first statement is the APEX engine being invoked. It is what is underneath this call (your queries and application logic) where the problem lies. I talked about this here:

2) This is a normal, internal query that is performed by the database, checking if the parsing user has privileges on a particular object. I see that Support has recommended running GATHER_FIXED_OBJECTS_STATS, which is probably sound advice.

3) I've not seen this one before. Are you using cached pages/regions? I would ensure that statistics are collected on the APEX_040100 schema.

Lastly, I see that Support may have recommended a high number for session_cached_cursors. I'm not so sure that's a good idea. See