jump to navigation

Looking at Oracle’s performance with SQL Trace and TKPROF April 9, 2012

Posted by Mich Talebzadeh in Oracle.
trackback

The SQL Trace facility and TKPROF (ToolKit Profiler) are two comprehensive and powerful performance diagnostic and summarising tools that can be used to identify the performance issues with Oracle code. Unlike AUTOTRACE that was covered here, SQL Trace covers PL/SQL blocks and can be used for all sorts of performance investigation.

There are a number of methods to trace oracle sessions so I am not going to cover them here. Rather, we will focus using SQL Trace within a session. For the examples here I will use DBMS_MONITOR package

Notes:

  • DBMS_MONITOR package once installed, its execution is granted to DBA user account.
  • The DBA user account can grant execution on the DBMS_MONITOR package to other users. For example grant execute on DBMS_MONITOR to sh;
  • DBA user account can trace it for other sessions for example EXECUTE DBMS_MONITOR.SESSION_TRACE_ENABLE(session_id => 51, serial_num => 60, waits => TRUE);

Tagging and identifying trace files

Trace output is written to the database’s trace directory that can be identified by

SHOW PARAMETERS user_dump_dest

The default name for a trace files is in the format INSTANCE_PID_ora_TRACEID.trc where:

  • INSTANCE is the name of the Oracle instance
  • PID is the operating system process ID from V$PROCESS.SPID
  • TRACEID is a character string of your choosing. This can be set via ALTER SESSION SET TRACEFILE_IDENTIFIER = “some identifer”
  • trc is the file extension

SQL Trace for a simple query

Let us try the query that I used to illustrate SQL tracing with AUTOTRACE here. The code for creating table t1 and populating it is shown below

CREATE TABLE t1 AS
SELECT
ROWNUM k1,
DECODE(ROWNUM,1,1,0) c,
LPAD('A',255,'A') c3
FROM
DUAL
CONNECT BY
LEVELUSER,TABNAME=>'T1',CASCADE=>TRUE)

Note that column k1 is monolithically increasing number. Column c depends on k1. When k1 = 1 then it is otherwise it is zero. Column lpad is just padding. In this case for column c the first entry would be 1 and the rest would be zero. This will create a highly skewed column.  We will then use DBMS_MONITOR.SESSION_TRACE_ENABLE procedure to diagonise the query (note waits=>true  means that we want wait information to be displayed) and we tag it as test_t1_with_trace:

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'test_t1_with_trace';
EXEC DBMS_MONITOR.SESSION_TRACE_ENABLE ( waits=>true );
SELECT k1, c FROM t1 WHERE c <> 0;

Once the code is run the trace file will look something like below

-rw-r----- 1 oracle dba 13450 Apr  9 18:38 mydb_ora_18420_test_t1_with_trace.trc

It is a detailed file and large. One can use Ttkprof utility (lowercase in UNIX/Linux, located in $ORACLE_HOME/bin) to summarize the TRACE output. A simple output in this case excluding system calls will be sufficient

tkprof  <TRACE_FILE> <OUTPUT_FILE> sys=no
tkprof mydb_ora_18420_test_t1_with_trace.trc mydb_ora_18420_test_t1_with_trace.txt sys=no

And it returned

K1          C
---------- ----------
1          1

You can use tkprof with sort option that sorts traced SQL statements in descending order of specified sort options. These are:

PRSCNT  Number of times parsed
PRSCPU  CPU time spent parsing
PRSELA  Elapsed time spent parsing
PRSDSK  Number of physical reads from disk during parse
PRSMIS  Number of consistent mode block reads during parse
PRSCU   Number of current mode block reads during parse
PRSMIS  Number of library cache misses during parse
EXECNT  Number of executes
EXECPU  CPU time spent executing
EXEELA  Elapsed time spent executing
EXEDSK  Number of physical reads from disk during execute
EXEQRY  Number of consistent mode block reads during execute
EXECU   Number of current mode block reads during execute
EXEROW  Number of rows processed during execute
EXEMIS  Number of library cache misses during execute
FCHCNT  Number of fetches
FCHCPU  CPU time spent fetching
FCHELA  Elapsed time spent fetching
FCHDSK  Number of physical reads from disk during fetch
FCHQRY  Number of consistent mode block reads during fetch
FCHCU   Number of current mode block reads during fetch
FCHROW  Number of rows fetched

A simple example with sort would be

tkprof mydb_ora_18420_test_t1_with_trace.trc mydb_ora_18420_test_t1_with_trace.txt sys=no sort = "(PRSDSK, EXEDSK, FCHDSK)"

The interest would be in getting various matrices for the query

SELECT k1, c
FROM
t1 WHERE c 0
call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.08          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.17       0.35       3911       3930          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.18       0.44       3911       3930          0           1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 97

Rows     Row Source Operation
-------  ---------------------------------------------------
1  VIEW  index$_join$_001 (cr=3930 pr=3911 pw=0 time=0 us cost=6388 size=8 card=1)
1   HASH JOIN  (cr=3930 pr=3911 pw=0 time=0 us)
1    INDEX FAST FULL SCAN IND_T1_C (cr=1826 pr=1818 pw=0 time=0 us cost=2276 size=8 card=1)(object id 87301)
1000000    INDEX FAST FULL SCAN IND_T1_K1 (cr=2104 pr=2093 pw=0 time=622503 us cost=2621 size=8 card=1)(object id 87300)

Misses in library cache during parse: 1 indicates that this was the first time we ran this query (after a reboot). The Row Source Operations provide the number of rows processed for each operation executed on the rows and additional row source information. There is an index fast full scan of the unique index IND_T1_K1 hashing it as build stream with 1,000,000 resultset, followed by index fast full scan  of IND_T1_C returning one row that satifies the filter c != 0. The resultset from scanning this index is chosen as probe stream.  Index fast full scan of IND_T1_K1 cost pr=2,093 physical block reads taking time=622,503 microseconds. The consistent reads were cr=2,104 (logical block reads). The optimizer estimated the costing for this operation to be cost=2,621 units returning size=8 bytes of data and cardinality card=1 rows.  The scan of IND_T1_C cost 1,818 physical blocks reads taking 0 microseconds. The consistent reads were 1,826 logical block reads. The optimizer costing estimate was 2,276 units returning 8 bytes of data and 1 row.

It is important to note that the execution plan has a hierarchical structure, and parents I/O, cost, timing etc. is a sum of corresponding metrics for all its children plus the metrics of the operation itself. In the above, the hash join operation is the parent operation and the two index fast full scans are the children operations. The children statistics are rolled into the parent’s statistics. Turning to hash join operation we see a total of cr= 2104 + 1826 = 3930 pr= 2093 + 1818 = 3911. There was no physical block writes to disks pr=0 as expected. Previously we mentioned that the children statistics are rolled into the parent’s statistics. However, with default statistics=typical, the rowsource/plan execution statistics are disabled (I will cover this later). Hence that time=0 does not mean that  no time was spent on hash join operation.

That one row was returned in one execution of the SQL statement. To retrieve that one row it cost 0.18 sec CPU time and 0.44 sec elapsed time. There were 3,930 queries (logical block I/Os) that included 3,911 physical block reads. Note the CPU time and elapsed time differences.

Further we will need to investigate the cause of this elapsed time. Oracle Wait Event section in TKPROF report shows


Elapsed times include waiting on following events:

Event waited on                             Times   Max. Wait  Total Waited
----------------------------------------   Waited  ----------  ------------
SQL*Net message to client                       2        0.00          0.00
Disk file operations I/O                        3        0.00          0.00
db file sequential read                         2        0.00          0.00
db file scattered read                         62        0.00          0.12
SQL*Net message from client                     2        0.00          0.00

To retrieve those records in total they were 62 db file scattered read Physical I/O taking 0.12 sec (note Physical I/O not blocks).

db file scattered read happens when multiple I/O blocks are read and scattered into the buffer cache. These buffer caches may or may not be contigious. This could be as a result of table scan or fast full scan of an index (as in this case). Oracle records the time using the wait event facility and gives the system call a standard name of db file scattered reads. There were no appreciable waits  for other events.

Before moving to the second example, it is worthwhile to mention wait events in Oracle.

Oracle Wait events

When Oracle executes an SQL statement, it is not constantly executing. Sometimes it has to wait for a specific event to happen before it can proceed. For example, if Oracle (or the SQL statement) wants to read data, and the corresponding database block is not currently in the database buffer cache, Oracle waits for this block to be read from the underlying storage.

When a server process asks the operating system for data blocks (block I/O) and places them into the buffer cache, Oracle records the time using the wait event facility and gives the system call a standard name depending on the nature of block I/O.  As a result, three wait events namely; db file sequential read, db file scattered read and direct path read are prominent dealing with block I/O, hence we will explain them here.

db file sequential read

db file sequential read is usually a single block IO, it is an indexed read and not a result of a full scan.  It is also possible to see sequential reads for more than one block access. Single block I/Os go to the mid point insertion.

db file scattered read

db file scattered read happens when multiple I/O blocks are read and scattered into the buffer cache. These buffer caches may or may not be contigious. This could be as a result of table scan or fast full scan of an index. Oracle records the time using the wait event facility and gives the system call a standard name of db file scattered reads.

Direct path read

To read large tables Oracle deploys another strategy to have the server process read the blocks, and process them in its own PGA memory, as opposed to placing them into the buffer cache. This is deemed to be quicker for this one process because buffer cache management overheads like finding a free buffer and latching it does not occur. There is also another dimension to this. In a multi-user environment you ideally do not want to fill the buffer cache with blocks from large tables. For these reasons Oracle can deploy direct path read to satisfy the need of a given session.

The execution engine decides whether a scattered read (multi-block IO call with the results stored in Oracle’s buffer cache) or a direct path read (multi-block IO call with the results processed in the server process’s PGA memory) is performed.

 SQL Trace for a PL/SQL block

In the example below we will use SQL Trace and tkprof output for diagonising the performance of a PL/SQL block. What we are trying to establish here is the waits caused by reading data from the underlying disk.

We start with a table of random data.  We will call this table tdash and we ensure that this table has enough rows to provide meaningful test values. The structure of this table is as follows and is based on all_objects table:

CREATE TABLE "TDASH"
(
"OWNER"          VARCHAR2(30 BYTE) NOT NULL ENABLE,
"OBJECT_NAME"    VARCHAR2(30 BYTE) NOT NULL ENABLE,
"SUBOBJECT_NAME" VARCHAR2(30 BYTE),
"OBJECT_ID"      NUMBER NOT NULL ENABLE,
"DATA_OBJECT_ID" NUMBER,
"OBJECT_TYPE"    VARCHAR2(19 BYTE),
"CREATED" DATE NOT NULL ENABLE,
"LAST_DDL_TIME" DATE NOT NULL ENABLE,
"TIMESTAMP"    VARCHAR2(19 BYTE),
"STATUS"       VARCHAR2(7 BYTE),
"TEMPORARY"    VARCHAR2(1 BYTE),
"GENERATED"    VARCHAR2(1 BYTE),
"SECONDARY"    VARCHAR2(1 BYTE),
"NAMESPACE"    NUMBER NOT NULL ENABLE,
"EDITION_NAME" VARCHAR2(30 BYTE),
"PADDING1"     VARCHAR2(4000 BYTE),
"PADDING2"     VARCHAR2(4000 BYTE)
);

This table is populated with 1,729,204 rows.  It is created on an 8K database and in order to ensure that each  record in the table is larger than a block, two columns PADDING1 VARCHAR2(4000) and PADDING2 VARCHAR2(4000) were added  and populated.

For the first part of the test, we create a primary key on object_id column of tdash

ALTER TABLE tdash ADD CONSTRAINT tdash_pk PRIMARY KEY (object_id);

To progress this further we will need a mechanism to read rows of tdash table into memory. It is also important that the way we do it is best done via a construct in memory.  In other words, we need to loop over the index key values of table tdash, match them and read them into buffer cache. Ideally an associative array could help resolve this issue. So the approach would be to deploy an associative array to load rows from all_objects table into memory (to be precise into PGA) and then go through every row of tdash and read each row into buffer cache:

ALTER SESSION SET TRACEFILE_IDENTIFIER = 'test_with_tdash_tester';
DECLARE
type array is table of tdash%ROWTYPE index by binary_integer;
l_data array;
l_rec tdash%rowtype;
BEGIN
SELECT
a.*
,RPAD('*',4000,'*') AS PADDING1
,RPAD('*',4000,'*') AS PADDING2
BULK COLLECT INTO
l_data
FROM ALL_OBJECTS a;

DBMS_MONITOR.SESSION_TRACE_ENABLE ( WAITS=>true,
PLAN_STAT=>'ALL_EXECUTIONS' );

FOR rs IN 1 .. l_data.count
LOOP
BEGIN
SELECT * INTO l_rec FROM tdash WHERE object_id =
l_data(rs).object_id;
EXCEPTION
WHEN NO_DATA_FOUND THEN NULL;
END;
END LOOP;
END;

Note the enabling of trace and wait events:

DBMS_MONITOR.SESSION_TRACE_ENABLE ( WAITS=>true, PLAN_STAT=>'ALL_EXECUTIONS' );

Take note of PLAN_STAT=>’ALL_EXECUTIONS’ in the above line. As of 11g, the procedures in dbms_monitor  accept an additional parameter: PLAN_STAT.  This parameter is used to specify the frequency at which the row source statistics (i.e. information about execution plans) are written to trace files. The accepted values are the following (the default value is NULL):

  • NEVER – no information about the execution plan is written in trace files.
  • FIRST_EXECUTION (equivalent to NULL) – information about the execution plan is written just after the first execution.
  • ALL_EXECUTIONS – information about the execution plan is written for every execution.

So for every execution in the loop the detailed statistics will be written to the trace file.

Back to the code, this is a random read via index scan on object_id column of tdash. The stats about the table and its primary key are as follows:

OBJECT       OBJECT_ID       SIZE/MB   Blocks     ROWS/BLOCK
---------    ----------      -------   ---------  ----------
TDASH             84459      15,240    1,943,824        0.9
TDASH_PK          84460          31        3,968      431.0

Tkprof will be used to summarize the TRACE output. Again a simple output in this case excluding system calls will be sufficient

TKPROF <TRACE_FILRE> <OUTPUT_FILE> SYS=NO

The interest would be in getting various matrices for the loop in the PL/SQL block. These are shown below:

SELECT *
FROM
TDASH WHERE OBJECT_ID = :B1

call     count       cpu    elapsed       disk      query    current        rows
---- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.00          0         62          0           0
Execute  72186      0.31       1.04          0         40          0           0
Fetch    72186      0.67     131.70      81364     368663          0       71726
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   144373      0.99     132.74      81364     368765          0       71726

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 93     (recursive depth: 1)
Rows     Row Source Operation
-------  ---------------------------------------------------
1  TABLE ACCESS BY INDEX ROWID TDASH (cr=5 pr=11 pw=0 time=0 us cost=3 size=4155 card=1)
1   INDEX UNIQUE SCAN TDASH_PK (cr=3 pr=6 pw=0 time=0 us cost=2 size=0 card=1)(object id 84460)

Misses in library cache during parse: 1 indicates that this was the first time we ran this query (after a reboot). The Row Source Operation output means a typical cost to retrieve one record from table tdash, using unique index scan was 11 physical blocks which included 6 physical blocks from the unique index on OBJECT_ID column. It is important to note that the execution plan has a hierarchical structure, and parents I/O, cost, timing etc. is a sum of corresponding metrics for all its children plus the metrics of the operation itself. In the above, the TABLE ACCESS BY INDEX ROWID operation is the parent operation and the INDEX UNIQUE SCANis the child operation. The child’s statistics are rolled into the parent’s statistics.

Those 71,726 rows were returned in 72,186 executions of the SQL statement. To retrieve those 71,726 rows it cost 0.99 sec CPU time and 132.74 sec elapsed time. There were 368,765 queries (logical block I/Os) that included 81,364 physical block reads. Note the CPU time and elapsed time differences

Further we will need to investigate the cause of this elapsed time. Oracle Wait Event section in tkprof report shows

Elapsed times include waiting on following events:

Event waited on                             Times   Max. Wait  Total Waited
----------------------------------------   Waited  ----------  ------------
Disk file operations I/O                        3        0.00          0.00
db file sequential read                     12471        0.15         72.45
db file scattered read                      14587        0.11         56.53
********************************************************************************

So to retrieve those records in total they were 12,471 db file sequential read Physical I/O taking 72.45 sec (note Physical I/O not blocks). Additionally, there were 14,587 db file scattered read Physical I/O taking 56.53 sec in total.

Filtering the TRACE file on the primary key Waits (object_id = 84460), we get 38 physical I/Os of which 21 were db file scattered read and the remainder was db file sequential read. For the table itself (object_id = 84459), the total waits were 27,022 of which 14,566 were due to db file scattered read and 12,456 were db file sequential read. Note the smaller number of physical I/Os attributed to the primary key

To make this clearer we will look at a typical db scattered file read for table in the TRACE file:

WAIT #13: nam='db file scattered read' ela= 8033 file#=11 block#=1386920 blocks=3 obj#=84459 tim=1328711960280846

What this is telling us is that it took ela = 8033 microseconds elapsed time to fetch blocks = 3 blocks in one physical I/O from the table.

Comments»

1. Trevor Romeli - April 15, 2012

Thank you. It was very helpful for me understanding the details of TKPROF report.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: