Looking at Oracle’s performance with SQL Trace and TKPROF April 9, 2012
Posted by Mich Talebzadeh in Oracle.trackback
The SQL Trace facility and TKPROF (ToolKit Profiler) are two comprehensive and powerful performance diagnostic and summarising tools that can be used to identify the performance issues with Oracle code. Unlike AUTOTRACE that was covered here, SQL Trace covers PL/SQL blocks and can be used for all sorts of performance investigation.
There are a number of methods to trace oracle sessions so I am not going to cover them here. Rather, we will focus using SQL Trace within a session. For the examples here I will use DBMS_MONITOR package
Notes:
- DBMS_MONITOR package once installed, its execution is granted to DBA user account.
- The DBA user account can grant execution on the DBMS_MONITOR package to other users. For example grant execute on DBMS_MONITOR to sh;
- DBA user account can trace it for other sessions for example EXECUTE DBMS_MONITOR.SESSION_TRACE_ENABLE(session_id => 51, serial_num => 60, waits => TRUE);
Tagging and identifying trace files
Trace output is written to the database’s trace directory that can be identified by
SHOW PARAMETERS user_dump_dest
The default name for a trace files is in the format INSTANCE_PID_ora_TRACEID.trc where:
- INSTANCE is the name of the Oracle instance
- PID is the operating system process ID from V$PROCESS.SPID
- TRACEID is a character string of your choosing. This can be set via ALTER SESSION SET TRACEFILE_IDENTIFIER = “some identifer”
- trc is the file extension
SQL Trace for a simple query
Let us try the query that I used to illustrate SQL tracing with AUTOTRACE here. The code for creating table t1 and populating it is shown below
CREATE TABLE t1 AS
SELECT
ROWNUM k1,
DECODE(ROWNUM,1,1,0) c,
LPAD('A',255,'A') c3
FROM
DUAL
CONNECT BY
LEVELUSER,TABNAME=>'T1',CASCADE=>TRUE)
Note that column k1 is monolithically increasing number. Column c depends on k1. When k1 = 1 then it is otherwise it is zero. Column lpad is just padding. In this case for column c the first entry would be 1 and the rest would be zero. This will create a highly skewed column. We will then use DBMS_MONITOR.SESSION_TRACE_ENABLE procedure to diagonise the query (note waits=>true means that we want wait information to be displayed) and we tag it as test_t1_with_trace:
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'test_t1_with_trace'; EXEC DBMS_MONITOR.SESSION_TRACE_ENABLE ( waits=>true ); SELECT k1, c FROM t1 WHERE c <> 0;
Once the code is run the trace file will look something like below
-rw-r----- 1 oracle dba 13450 Apr 9 18:38 mydb_ora_18420_test_t1_with_trace.trc
It is a detailed file and large. One can use Ttkprof utility (lowercase in UNIX/Linux, located in $ORACLE_HOME/bin) to summarize the TRACE output. A simple output in this case excluding system calls will be sufficient
tkprof <TRACE_FILE> <OUTPUT_FILE> sys=no tkprof mydb_ora_18420_test_t1_with_trace.trc mydb_ora_18420_test_t1_with_trace.txt sys=no
And it returned
K1 C ---------- ---------- 1 1
You can use tkprof with sort option that sorts traced SQL statements in descending order of specified sort options. These are:
PRSCNT Number of times parsed PRSCPU CPU time spent parsing PRSELA Elapsed time spent parsing PRSDSK Number of physical reads from disk during parse PRSMIS Number of consistent mode block reads during parse PRSCU Number of current mode block reads during parse PRSMIS Number of library cache misses during parse EXECNT Number of executes EXECPU CPU time spent executing EXEELA Elapsed time spent executing EXEDSK Number of physical reads from disk during execute EXEQRY Number of consistent mode block reads during execute EXECU Number of current mode block reads during execute EXEROW Number of rows processed during execute EXEMIS Number of library cache misses during execute FCHCNT Number of fetches FCHCPU CPU time spent fetching FCHELA Elapsed time spent fetching FCHDSK Number of physical reads from disk during fetch FCHQRY Number of consistent mode block reads during fetch FCHCU Number of current mode block reads during fetch FCHROW Number of rows fetched
A simple example with sort would be
tkprof mydb_ora_18420_test_t1_with_trace.trc mydb_ora_18420_test_t1_with_trace.txt sys=no sort = "(PRSDSK, EXEDSK, FCHDSK)"
The interest would be in getting various matrices for the query
SELECT k1, c FROM t1 WHERE c 0 call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.01 0.08 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 0.17 0.35 3911 3930 0 1 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 0.18 0.44 3911 3930 0 1 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 97 Rows Row Source Operation ------- --------------------------------------------------- 1 VIEW index$_join$_001 (cr=3930 pr=3911 pw=0 time=0 us cost=6388 size=8 card=1) 1 HASH JOIN (cr=3930 pr=3911 pw=0 time=0 us) 1 INDEX FAST FULL SCAN IND_T1_C (cr=1826 pr=1818 pw=0 time=0 us cost=2276 size=8 card=1)(object id 87301) 1000000 INDEX FAST FULL SCAN IND_T1_K1 (cr=2104 pr=2093 pw=0 time=622503 us cost=2621 size=8 card=1)(object id 87300)
Misses in library cache during parse: 1 indicates that this was the first time we ran this query (after a reboot). The Row Source Operations provide the number of rows processed for each operation executed on the rows and additional row source information. There is an index fast full scan of the unique index IND_T1_K1 hashing it as build stream with 1,000,000 resultset, followed by index fast full scan of IND_T1_C returning one row that satifies the filter c != 0. The resultset from scanning this index is chosen as probe stream. Index fast full scan of IND_T1_K1 cost pr=2,093 physical block reads taking time=622,503 microseconds. The consistent reads were cr=2,104 (logical block reads). The optimizer estimated the costing for this operation to be cost=2,621 units returning size=8 bytes of data and cardinality card=1 rows. The scan of IND_T1_C cost 1,818 physical blocks reads taking 0 microseconds. The consistent reads were 1,826 logical block reads. The optimizer costing estimate was 2,276 units returning 8 bytes of data and 1 row.
It is important to note that the execution plan has a hierarchical structure, and parents I/O, cost, timing etc. is a sum of corresponding metrics for all its children plus the metrics of the operation itself. In the above, the hash join operation is the parent operation and the two index fast full scans are the children operations. The children statistics are rolled into the parent’s statistics. Turning to hash join operation we see a total of cr= 2104 + 1826 = 3930, pr= 2093 + 1818 = 3911. There was no physical block writes to disks pr=0 as expected. Previously we mentioned that the children statistics are rolled into the parent’s statistics. However, with default statistics=typical, the rowsource/plan execution statistics are disabled (I will cover this later). Hence that time=0 does not mean that no time was spent on hash join operation.
That one row was returned in one execution of the SQL statement. To retrieve that one row it cost 0.18 sec CPU time and 0.44 sec elapsed time. There were 3,930 queries (logical block I/Os) that included 3,911 physical block reads. Note the CPU time and elapsed time differences.
Further we will need to investigate the cause of this elapsed time. Oracle Wait Event section in TKPROF report shows
Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ SQL*Net message to client 2 0.00 0.00 Disk file operations I/O 3 0.00 0.00 db file sequential read 2 0.00 0.00 db file scattered read 62 0.00 0.12 SQL*Net message from client 2 0.00 0.00
To retrieve those records in total they were 62 db file scattered read Physical I/O taking 0.12 sec (note Physical I/O not blocks).
db file scattered read happens when multiple I/O blocks are read and scattered into the buffer cache. These buffer caches may or may not be contigious. This could be as a result of table scan or fast full scan of an index (as in this case). Oracle records the time using the wait event facility and gives the system call a standard name of db file scattered reads. There were no appreciable waits for other events.
Before moving to the second example, it is worthwhile to mention wait events in Oracle.
Oracle Wait events
When Oracle executes an SQL statement, it is not constantly executing. Sometimes it has to wait for a specific event to happen before it can proceed. For example, if Oracle (or the SQL statement) wants to read data, and the corresponding database block is not currently in the database buffer cache, Oracle waits for this block to be read from the underlying storage.
When a server process asks the operating system for data blocks (block I/O) and places them into the buffer cache, Oracle records the time using the wait event facility and gives the system call a standard name depending on the nature of block I/O. As a result, three wait events namely; db file sequential read, db file scattered read and direct path read are prominent dealing with block I/O, hence we will explain them here.
db file sequential read
db file sequential read is usually a single block IO, it is an indexed read and not a result of a full scan. It is also possible to see sequential reads for more than one block access. Single block I/Os go to the mid point insertion.
db file scattered read
db file scattered read happens when multiple I/O blocks are read and scattered into the buffer cache. These buffer caches may or may not be contigious. This could be as a result of table scan or fast full scan of an index. Oracle records the time using the wait event facility and gives the system call a standard name of db file scattered reads.
Direct path read
To read large tables Oracle deploys another strategy to have the server process read the blocks, and process them in its own PGA memory, as opposed to placing them into the buffer cache. This is deemed to be quicker for this one process because buffer cache management overheads like finding a free buffer and latching it does not occur. There is also another dimension to this. In a multi-user environment you ideally do not want to fill the buffer cache with blocks from large tables. For these reasons Oracle can deploy direct path read to satisfy the need of a given session.
The execution engine decides whether a scattered read (multi-block IO call with the results stored in Oracle’s buffer cache) or a direct path read (multi-block IO call with the results processed in the server process’s PGA memory) is performed.
SQL Trace for a PL/SQL block
In the example below we will use SQL Trace and tkprof output for diagonising the performance of a PL/SQL block. What we are trying to establish here is the waits caused by reading data from the underlying disk.
We start with a table of random data. We will call this table tdash and we ensure that this table has enough rows to provide meaningful test values. The structure of this table is as follows and is based on all_objects table:
CREATE TABLE "TDASH" ( "OWNER" VARCHAR2(30 BYTE) NOT NULL ENABLE, "OBJECT_NAME" VARCHAR2(30 BYTE) NOT NULL ENABLE, "SUBOBJECT_NAME" VARCHAR2(30 BYTE), "OBJECT_ID" NUMBER NOT NULL ENABLE, "DATA_OBJECT_ID" NUMBER, "OBJECT_TYPE" VARCHAR2(19 BYTE), "CREATED" DATE NOT NULL ENABLE, "LAST_DDL_TIME" DATE NOT NULL ENABLE, "TIMESTAMP" VARCHAR2(19 BYTE), "STATUS" VARCHAR2(7 BYTE), "TEMPORARY" VARCHAR2(1 BYTE), "GENERATED" VARCHAR2(1 BYTE), "SECONDARY" VARCHAR2(1 BYTE), "NAMESPACE" NUMBER NOT NULL ENABLE, "EDITION_NAME" VARCHAR2(30 BYTE), "PADDING1" VARCHAR2(4000 BYTE), "PADDING2" VARCHAR2(4000 BYTE) );
This table is populated with 1,729,204 rows. It is created on an 8K database and in order to ensure that each record in the table is larger than a block, two columns PADDING1 VARCHAR2(4000) and PADDING2 VARCHAR2(4000) were added and populated.
For the first part of the test, we create a primary key on object_id column of tdash
ALTER TABLE tdash ADD CONSTRAINT tdash_pk PRIMARY KEY (object_id);
To progress this further we will need a mechanism to read rows of tdash table into memory. It is also important that the way we do it is best done via a construct in memory. In other words, we need to loop over the index key values of table tdash, match them and read them into buffer cache. Ideally an associative array could help resolve this issue. So the approach would be to deploy an associative array to load rows from all_objects table into memory (to be precise into PGA) and then go through every row of tdash and read each row into buffer cache:
ALTER SESSION SET TRACEFILE_IDENTIFIER = 'test_with_tdash_tester';
DECLARE
type array is table of tdash%ROWTYPE index by binary_integer;
l_data array;
l_rec tdash%rowtype;
BEGIN
SELECT
a.*
,RPAD('*',4000,'*') AS PADDING1
,RPAD('*',4000,'*') AS PADDING2
BULK COLLECT INTO
l_data
FROM ALL_OBJECTS a;
DBMS_MONITOR.SESSION_TRACE_ENABLE ( WAITS=>true,
PLAN_STAT=>'ALL_EXECUTIONS' );
FOR rs IN 1 .. l_data.count
LOOP
BEGIN
SELECT * INTO l_rec FROM tdash WHERE object_id =
l_data(rs).object_id;
EXCEPTION
WHEN NO_DATA_FOUND THEN NULL;
END;
END LOOP;
END;
Note the enabling of trace and wait events:
DBMS_MONITOR.SESSION_TRACE_ENABLE ( WAITS=>true, PLAN_STAT=>'ALL_EXECUTIONS' );
Take note of PLAN_STAT=>’ALL_EXECUTIONS’ in the above line. As of 11g, the procedures in dbms_monitor accept an additional parameter: PLAN_STAT. This parameter is used to specify the frequency at which the row source statistics (i.e. information about execution plans) are written to trace files. The accepted values are the following (the default value is NULL):
- NEVER – no information about the execution plan is written in trace files.
- FIRST_EXECUTION (equivalent to NULL) – information about the execution plan is written just after the first execution.
- ALL_EXECUTIONS – information about the execution plan is written for every execution.
So for every execution in the loop the detailed statistics will be written to the trace file.
Back to the code, this is a random read via index scan on object_id column of tdash. The stats about the table and its primary key are as follows:
OBJECT OBJECT_ID SIZE/MB Blocks ROWS/BLOCK --------- ---------- ------- --------- ---------- TDASH 84459 15,240 1,943,824 0.9 TDASH_PK 84460 31 3,968 431.0
Tkprof will be used to summarize the TRACE output. Again a simple output in this case excluding system calls will be sufficient
TKPROF <TRACE_FILRE> <OUTPUT_FILE> SYS=NO
The interest would be in getting various matrices for the loop in the PL/SQL block. These are shown below:
SELECT * FROM TDASH WHERE OBJECT_ID = :B1 call count cpu elapsed disk query current rows ---- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.01 0.00 0 62 0 0 Execute 72186 0.31 1.04 0 40 0 0 Fetch 72186 0.67 131.70 81364 368663 0 71726 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 144373 0.99 132.74 81364 368765 0 71726 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 93 (recursive depth: 1) Rows Row Source Operation ------- --------------------------------------------------- 1 TABLE ACCESS BY INDEX ROWID TDASH (cr=5 pr=11 pw=0 time=0 us cost=3 size=4155 card=1) 1 INDEX UNIQUE SCAN TDASH_PK (cr=3 pr=6 pw=0 time=0 us cost=2 size=0 card=1)(object id 84460)
Misses in library cache during parse: 1 indicates that this was the first time we ran this query (after a reboot). The Row Source Operation output means a typical cost to retrieve one record from table tdash, using unique index scan was 11 physical blocks which included 6 physical blocks from the unique index on OBJECT_ID column. It is important to note that the execution plan has a hierarchical structure, and parents I/O, cost, timing etc. is a sum of corresponding metrics for all its children plus the metrics of the operation itself. In the above, the TABLE ACCESS BY INDEX ROWID operation is the parent operation and the INDEX UNIQUE SCANis the child operation. The child’s statistics are rolled into the parent’s statistics.
Those 71,726 rows were returned in 72,186 executions of the SQL statement. To retrieve those 71,726 rows it cost 0.99 sec CPU time and 132.74 sec elapsed time. There were 368,765 queries (logical block I/Os) that included 81,364 physical block reads. Note the CPU time and elapsed time differences
Further we will need to investigate the cause of this elapsed time. Oracle Wait Event section in tkprof report shows
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ Disk file operations I/O 3 0.00 0.00 db file sequential read 12471 0.15 72.45 db file scattered read 14587 0.11 56.53 ********************************************************************************
So to retrieve those records in total they were 12,471 db file sequential read Physical I/O taking 72.45 sec (note Physical I/O not blocks). Additionally, there were 14,587 db file scattered read Physical I/O taking 56.53 sec in total.
Filtering the TRACE file on the primary key Waits (object_id = 84460), we get 38 physical I/Os of which 21 were db file scattered read and the remainder was db file sequential read. For the table itself (object_id = 84459), the total waits were 27,022 of which 14,566 were due to db file scattered read and 12,456 were db file sequential read. Note the smaller number of physical I/Os attributed to the primary key
To make this clearer we will look at a typical db scattered file read for table in the TRACE file:
WAIT #13: nam='db file scattered read' ela= 8033 file#=11 block#=1386920 blocks=3 obj#=84459 tim=1328711960280846
What this is telling us is that it took ela = 8033 microseconds elapsed time to fetch blocks = 3 blocks in one physical I/O from the table.

Thank you. It was very helpful for me understanding the details of TKPROF report.