Home » SQL & PL/SQL » SQL & PL/SQL » Query performance condundrum. (Oracle 12c Linux)
Query performance condundrum. [message #678198] Fri, 15 November 2019 16:13 Go to next message
uman2631
Messages: 16
Registered: November 2011
Junior Member
I have a table as shown below, where I have a key field, an effective date and a termination date (there are, in reality, many more fields, but these are the ones that matter).
I'm trying to find "bad" data, that is, records where the termination_dt < effective_dt.
The table contains hundreds of millions of records, and I am trying to avoid a full table scan.
I can add indexes, but I cannot add columns to the table.
Less than 1/10 of 1% of the table will meet the query condition, and I'm trying to think of a way to query this table that won't require a full table scan.

Since both termination_dt and effective_dt can be any date and cannot be predetermined, an index on either field doesn't really help. It seems like such a simple problem.

Any ideas?

CREATE TABLE foo
(mykey VARCHAR2(5), EFFECTIVE_DT DATE, TERMINATION_DT DATE);

INSERT INTO foo VALUES
('AAA','01-JAN-2019','01-JAN-2018');

SELECT * FROM foo where TERMINATION_DT < EFFECTIVE_DT;
Re: Query performance condundrum. [message #678200 is a reply to message #678198] Fri, 15 November 2019 16:55 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
Add 2 indexes, one index on each of the DATE columns.
SELECT MYKEY FROM FOO WHERE TERMINATION_DT, EFFECTIVE_DT in (SELECT TERMINATION_DT, EFFECTIVE_DT FROM foo where TERMINATION_DT < EFFECTIVE_DT)
Re: Query performance condundrum. [message #678206 is a reply to message #678198] Sat, 16 November 2019 01:41 Go to previous messageGo to next message
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
orclz>
orclz> create index fooi on foo(TERMINATION_DT - EFFECTIVE_DT);

Index created.

orclz> set autot on exp
orclz> SELECT * FROM foo where TERMINATION_DT - EFFECTIVE_DT < 0;

MYKEY EFFECTIVE_DT        TERMINATION_DT
----- ------------------- -------------------
AAA   0001-01-20:19:00:00 0001-01-20:18:00:00


Execution Plan
----------------------------------------------------------
Plan hash value: 2246344638

--------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |    20 |   700 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| FOO  |    20 |   700 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | FOOI |     4 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("TERMINATION_DT"-"EFFECTIVE_DT"<0)

orclz>
btw, I wish you would not say "record" and "field" when you mean "row" and "column".
Re: Query performance condundrum. [message #678207 is a reply to message #678206] Sat, 16 November 2019 07:31 Go to previous messageGo to next message
Solomon Yakobson
Messages: 3269
Registered: January 2010
Location: Connecticut, USA
Senior Member
If table is quite large and assuming TERMINATION_DT < EFFECTIVE_DT is a rare case we can save on index size:

SQL> create index foo_idx1
  2  on foo(case when TERMINATION_DT < EFFECTIVE_DT then 1 end);

Index created.

SQL> explain plan for
  2  SELECT * FROM foo where 1 = case when TERMINATION_DT < EFFECTIVE_DT then 1 end;

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------
Plan hash value: 359656133

------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |          |     1 |    25 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| FOO      |     1 |    25 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | FOO_IDX1 |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(CASE  WHEN "TERMINATION_DT"<"EFFECTIVE_DT" THEN 1 END =1)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

18 rows selected.

SQL> 
SY,
Re: Query performance condundrum. [message #678209 is a reply to message #678207] Sat, 16 November 2019 10:56 Go to previous messageGo to next message
uman2631
Messages: 16
Registered: November 2011
Junior Member
Some great ideas here, thank you!!
Re: Query performance condundrum. [message #678210 is a reply to message #678206] Sat, 16 November 2019 10:58 Go to previous messageGo to next message
uman2631
Messages: 16
Registered: November 2011
Junior Member
So far, this seems like the best idea. Thanks! (Oh, and at least I didn't say instance and attribute, eh?)
Re: Query performance condundrum. [message #678211 is a reply to message #678210] Sat, 16 November 2019 11:41 Go to previous message
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
I think "attribute" is fine, as long as we use it with tuple and relation rather than row and table. "Instance" I never understood. Isn't there also some junk about nulls and uniqueness with that terminology? If Ted Codd (RIP) had not invented the relational database, none of us would have jobs. But he did make it complicated.
Previous Topic: Return TYpe (tablet) as output parameters.
Next Topic: Record Plus Prior Record - Single Row
Goto Forum:
  


Current Time: Thu Mar 28 04:48:00 CDT 2024