The ability to trace a whole SQL session together with all wait events, SQL plans, bind variables and other instrumentation is one of the greatest features of the Oracle database. When SQL response times are slow, there is simply nothing better than seeing where that time is spent, broken down by the exact SQL statement. All of this information is collected when you enable tracing for a session; analysis with tkprof is quite straightforward, and many books have been written about the subject.
The biggest challenge I usually face is enabling SQL trace for just the right session, or finding the right trace file after enabling tracing globally. In environments where multiple applications use connection pools against a single database, this can get very frustrating. I also seem to have a hard time convincing application developers to simply build this into a debugging mode of their applications. After all, it would make things so much easier if they enabled tracing before entering the piece of code that ‘runs slow’, for just a single debugging session, and then disabled it afterwards.
But life is just not as simple as that, so the best I could come up with is to enable tracing for all sessions of a single application user. You may also be able to check for more parameters within the trigger code, such as the application server IP address or the service name used. Anyway, the first step is to set a client identifier and also to set TIMED_STATISTICS, in case it is not set globally.
CREATE OR REPLACE TRIGGER USERNAME.set_identifier AFTER logon ON USERNAME.schema
BEGIN
  DBMS_SESSION.SET_IDENTIFIER('TRACEME');
  EXECUTE IMMEDIATE 'ALTER SESSION SET TIMED_STATISTICS=TRUE';
END;
/
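To confirm the trigger did its job, the identifier can be checked in V$SESSION once an application session has logged in; the value 'TRACEME' matches the identifier set in the trigger above:

```sql
-- Sessions tagged by the logon trigger show up with the client identifier set
SELECT sid, serial#, username, client_identifier
  FROM v$session
 WHERE client_identifier = 'TRACEME';
```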
Now you can enable tracing for all sessions carrying this identifier through DBMS_MONITOR:
DBMS_MONITOR.client_id_trace_enable (client_id => 'TRACEME', waits => TRUE, binds => TRUE);
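Once the traced sessions have done their work, the resulting trace files can be consolidated and formatted. A sketch, assuming a typical diagnostic directory layout (the paths below are placeholders; trcsess and tkprof ship with the database):

```shell
# Placeholder path: adjust for your database and instance name
# (or use user_dump_dest on older releases)
cd $ORACLE_BASE/diag/rdbms/mydb/MYDB/trace
# trcsess merges all trace files belonging to the client identifier
trcsess output=traceme.trc clientid=TRACEME *.trc
# tkprof formats the merged trace, sorted by elapsed execution time
tkprof traceme.trc traceme.txt sys=no sort=exeela
```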
Deactivate it the same way, and once you are really done with this, drop the trigger as well:
DBMS_MONITOR.client_id_trace_disable (client_id => 'TRACEME');
drop trigger USERNAME.set_identifier;
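To double-check that no DBMS_MONITOR trace has been left behind, the dictionary view DBA_ENABLED_TRACES lists everything that is still active:

```sql
-- An empty result means no client-id, service or session traces remain enabled
SELECT trace_type, primary_id, waits, binds
  FROM dba_enabled_traces;
```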
Last week's analysis was fun, too. Development approached me with a weird problem: sometimes, SQL would run significantly longer than usual. They had checked everything they could within the application server, and things hinted at the database: the same SQL with the same bind variables sometimes took more than 10 seconds for a statement that usually returns in less than a second. I explained that this could not be an optimizer issue, since it was really the exact same statement, and even if it were re-parsed the optimizer would use the same statistics; something funky had to be going on.
We produced SQL trace files with the method described above and indeed noticed something: the ‘slow’ sessions had OPTIMIZER_MODE='FIRST_ROWS' set, while the better-performing sessions used the default 'ALL_ROWS'. Still, this was puzzling, since we could not understand why this parameter would differ for the same piece of code.
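Besides the trace files, the effective optimizer settings of a live session can also be inspected through V$SES_OPTIMIZER_ENV (the SID 123 below is a placeholder for the suspect session):

```sql
-- Shows the optimizer mode actually in effect for one session
SELECT name, value
  FROM v$ses_optimizer_env
 WHERE sid = 123
   AND name = 'optimizer_mode';
```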
A little more code inspection revealed what was going on: a batch job ran regularly from the same application server and set the parameter at session level. Under some circumstances the parameter was not reset after completion, and this ‘dirty’ session was returned to the connection pool. Sometimes an unlucky user would then get this session from the pool for regular work, which skewed SQL execution.
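The fix is to guarantee the parameter is reset before the connection goes back to the pool, even when the job fails. A minimal sketch, assuming a hypothetical batch procedure run_batch:

```sql
-- Hypothetical wrapper: always restore the default optimizer mode,
-- so the pooled session is never returned 'dirty'
BEGIN
  EXECUTE IMMEDIATE 'ALTER SESSION SET OPTIMIZER_MODE=FIRST_ROWS';
  run_batch;  -- hypothetical batch procedure
  EXECUTE IMMEDIATE 'ALTER SESSION SET OPTIMIZER_MODE=ALL_ROWS';
EXCEPTION
  WHEN OTHERS THEN
    EXECUTE IMMEDIATE 'ALTER SESSION SET OPTIMIZER_MODE=ALL_ROWS';
    RAISE;
END;
/
```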