jump to navigation

Index Join in Oracle and the equivalent Index Intersection in Sybase ASE September 1, 2012

Posted by Mich Talebzadeh in Oracle and Sybase.
trackback

Both Oracle and Sybase ASE provide multipole access paths to the optimizer. Many people have this perception that for a given table only one index can be used by the optimizer. Although this is a widely expected outcome and it holds true for most cases, under certain conditions where the optimizer decides that it does not need to access the base table, more than one index on the same table can be used to satisfy the query.

Consider a simple case of a select list returning the values for a column or columns based on columns used in a where caluse.  In olther words the select list is entirely satisfied by the columns in the where clause. Let us look at a simple example in each case

Oracle (11gR2)

First create the table:

create table #generator(id int not null)
go
CREATE TABLE index_joins AS
 SELECT
 ROWNUM AS c1,
 (ROWNUM + trunc(dbms_random.value(0,1000))) AS c2,
 LPAD('A',255,'A') AS c3
 FROM
 DUAL
 CONNECT BY
 LEVEL <=10000;
CREATE UNIQUE INDEX ind_c1 ON index_joins(c1);
CREATE INDEX ind_c2 on index_joins(c2);
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=USER,TABNAME='INDEX_JOINS',CASCADE=>TRUE)

Let us query this table to get columns c1 and c2

alter system flush buffer_cache;
alter system flush shared_pool;
set timing on
set autotrace on
SELECT c1, c2
FROM index_joins
where   c1 between 100 and 1000
and     c2 between 50 and 100;

The select list contains column c1 and c2. The index ind_c1 covers c1 column and the index ind_c2 is on c2. These two indexes on index_joins table satisfy the query. Hence there is no need for the optimizer to visit the base table. Let us look at the plan:

---------------------------------------------------------------------------------------
| Id  | Operation          | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                  |     4 |    36 |     6  (17)| 00:00:01 |
|*  1 |  VIEW              | index$_join$_001 |     4 |    36 |     6  (17)| 00:00:01 |
|*  2 |   HASH JOIN        |                  |       |       |            |          |
|*  3 |    INDEX RANGE SCAN| IND_C2           |     4 |    36 |     3  (34)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN| IND_C1           |     4 |    36 |     4  (25)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
 ---------------------------------------------------
 1 - filter("C2"<=100 AND "C1"<=1000 AND "C1">=100 AND "C2">=50)
 2 - access(ROWID=ROWID)
 3 - access("C2">=50 AND "C2"<=100)
 4 - access("C1">=100 AND "C1"<=1000)

Let us look at the plan outpout. The optimizer does a index range scan of the two indexes extracting the rowid and c1 from index ind_c1 and the rowid and c2 from ind_c2. Then the optimizer performs a hash join between threse two results set on the rowid values. It is doing this because anytime the two rowsources have a rowid in common, that is the interestion that we want and both c1 and c2 values will be sent back as the result of the query.

Sybase ASE (ASE 15.7 ESD 2)

First we create the equivalent table in Sybase

create table #generator(id int not null)
go
create table index_joins
(
c1 int not null
, c2 int not null
, c3 varchar(255)
)
lock datarows
go
CREATE UNIQUE INDEX ind_c1 ON index_joins(c1)
CREATE INDEX ind_c2 on index_joins(c2)
go
--
declare @i int
declare @rows int
set @rows = 100
set @i = 1
set nocount on
while @i <=@rows
begin
 insert into #generator values (@i)
 set @i = @i + 1
end
set nocount off
insert into
 index_joins
select
 rs.id
 , rs.id + abs(rs.random % 1000)
 , REPLICATE('A',255)
from    ( select
 @rows * ( g1.id - 1 ) + g2.id AS id
 , CAST(NEWID() AS VARBINARY)  AS random
 from
 #generator g1,
 #generator g2
 ) rs
go
update index statistics index_joins
go

So that is the equivalent table in Sybase.

Unlike Oracle, the optimizer in Sybase will not consider using two indexes on the same table. The only way one could use two different indexes for the same table in a query is to have an  index intersection plan. Index Intersection is a technique to enable ASE optimizer to use more than one index on a table to satisfy a given query. This should work similar to Oracle “index join” whereby each index satisfies the predicate on the query. The hash join between two indexes results in common Row IDs (RID) to be selected on intersection and base table rows accessed by the intersection RowIDs. In ASE this index interesection has to be enabled explicitly by the following set command:

set index_intersection on

In addition, you need to supply multiscan abstract plan operator m_scan. The query will look like the following code:

set switch on 3604
go
declare @dbname varchar(30)
set @dbname=db_name()
dbcc cachedataremove(@dbname)
go
set statement_cache off
go
set showplan on
set statistics time,io,plancost on
go
set index_intersection on
go
SELECT c1, c2
FROM index_joins
where   c1 between 100 and 1000
and     c2 between 50 and 100
plan '(m_scan index_joins)'
go

The output of the Lava tree operator shows:

==================== Lava Operator Tree ====================
                     Emit
                     (VA = 6)
                     r:0 er:1
                     cpu: 0
                /
              RIDJoin
              (VA = 5)
              r:0 er:1
              l:4 el:7
              p:0 ep:0
     /                                              \
    HashJoin                                     Restrict
    Inner Join                                   (0)(0)(0)(13)(0)
    (VA = 2)                                     (VA = 4)
    r:0 er:1                                     r:0 er:1
    l:4 el:5
    p:0 ep:0
    bufct: 128
   /                      \                        /
IndexScan               IndexScan               TableScan
ind_c2                  ind_c1                  index_joins
(VA = 0)                (VA = 1)                (VA = 3)
r:0 er:1                r:0 er:900              r:0 er:1
l:2 el:2                l:0 el:3                l:0 el:934
p:2 ep:2                p:0 ep:3                p:0 ep:5
============================================================<br />

The execution plan steps are described as follows:

  1. index scan of ind_c2 to get the RIDs of the qualifying rows for “index_joins.c2 between 50 and 100”
  2. index scan of ind_c1 to get the RIDs of the qualifying rows for “t.index_joins.c1 between 100 and 1000”
  3. does HashJoin between those two sets of RIDs to return the common RID
  4. The stream of unique RIDs is passed to the RID JOIN operator. This operator creates a worktable (aka Dynamic Index) and fills it with a single-column row with each RID.
  5. does a RID Scan on the table index_joins using those RIDs from the worktable and gets back the rows.

Note that there is a subtle difference in behaviour between Oracle and ASE. Oracle does not require to visit the base table. However, ASE will visit the base table to feth the result set for RIDs from the worktable created. This method will also work for cases where the result set includes other columns from the table. for example a select * from the base table will also use this access path in ASE.

Comments»

1. Amit - April 22, 2015

wow index intersection ..it is one of the lesser known additions of Sybase 15.. never realized it was there!


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: