Friday, January 23, 2009

SQL Tuning & Troubleshooting

Recently, an interesting performance issue cropped up at one of my client sites. I thought I would share the issue, the troubleshooting, and the takeaways with you.

It was reported that a job that used to run exceptionally well was now taking a long time, which was surprising considering that it is a simple "DELETE" statement. The interesting part was that the "COMMIT" issued after the "DELETE" was what took the time.

Below is a screenshot.

08:35:30 SQL> DELETE FROM apc.prep
08:35:38 2 WHERE event_id = '122608T09' AND lead_or_assoc = 'A';
0 rows deleted.
Elapsed: 00:00:00.03
08:35:40 SQL> commit;
Commit complete.
Elapsed: 00:00:32.89

That is 32 seconds to commit a delete of essentially "0" rows! Something was up.

Tanel Poder wrote a while ago about the discipline of methodical, common-sense tuning and troubleshooting, and even has a nice presentation on the topic. This is one of the classic examples.

Instead of diving into I/O issues, redo logs, log file sync waits, and so on, I decided to try a simple delete/commit on a sample table. It worked just fine. So I concluded that something about this particular object was the root cause.
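The sanity check itself can be sketched roughly like this (the scratch table and values here are hypothetical, just to illustrate the idea): if a plain delete/commit on an unrelated table is fast, the problem is specific to the original object, not the database as a whole.

```sql
SQL> set timing on
SQL> create table scratch_t (x number);   -- hypothetical throwaway table
SQL> insert into scratch_t values (1);
SQL> delete from scratch_t where x = 1;
SQL> commit;                              -- if this returns instantly, the
                                          -- instance-wide suspects are cleared
```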

I looked to see whether there were any triggers on the object, and found one. Great; perhaps this was the reason for the slowdown. Upon looking at the code for the trigger...

CREATE OR REPLACE TRIGGER APC.prep_sheet_audit
BEFORE INSERT OR UPDATE
ON apc.prep
FOR EACH ROW
BEGIN
   IF INSERTING
   THEN
      :NEW.added_by           := SUBSTR (USER, 1, 10);
      :NEW.date_added         := SYSDATE;
      :NEW.date_last_modified := SYSDATE;
      :NEW.last_modified_by   := SUBSTR (USER, 1, 10);
      :NEW.status_flag        := 'A';
   END IF;
END;
/

It only applies to INSERTs, and we are doing a DELETE. So I decided to take a 10046 trace to see what was happening.
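For reference, a 10046 trace at level 12 (binds and waits) can be enabled for the session along these lines; the tracefile identifier is optional but makes the trace file easy to find afterwards:

```sql
ALTER SESSION SET tracefile_identifier = 'slow_commit';
ALTER SESSION SET events '10046 trace name context forever, level 12';

DELETE FROM apc.prep
 WHERE event_id = '122608T09' AND lead_or_assoc = 'A';
COMMIT;

ALTER SESSION SET events '10046 trace name context off';
```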

Upon reviewing the tkprof'd trace, I came across some interesting statements.

delete from "APC"."VW_APC_LEAD_STYLE"

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      7.07      15.39       1464       8703     515264      120497
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      7.07      15.39       1464       8703     515264      120497

INSERT /*+ */ INTO "APC"."VW_APC_LEAD_STYLE"("EVENT_ID","MARKET_CODE",
"PAGE_ID","PAGE_SECTION_CODE","STYLE_NUM","LEAD_OR_ASSOC"
) SELECT
<.....sniped .....>
"PREP_SHEET"."DESCRIPTION",MAX("PREP_SHEET"."BLOCK_NUMBER") FROM
"APC"."PREP_SHEET" "PREP_SHEET" WHERE "PREP_SHEET"."LEAD_OR_ASSOC"<>'A'
GROUP BY "PREP_SHEET"."EVENT_ID","PREP_SHEET"."MARKET_CODE",
"PREP_SHEET"."PAGE_ID","PREP_SHEET"."PAGE_SECTION_CODE",
<...>

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.02       0.00          0         34          0           0
Execute      1      7.68      17.72      63327      64462     388792      120497
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      7.70      17.72      63327      64496     388792      120497

The combined elapsed time of these two statements (15.39 s + 17.72 s, roughly 33 seconds) accounts for nearly all of the commit time.

That lines up with the commit timing we observed (pasted again below):

08:35:30 SQL> DELETE FROM apc.prep
08:35:38 2 WHERE event_id = '122608T09' AND lead_or_assoc = 'A';
0 rows deleted.
Elapsed: 00:00:00.03
08:35:40 SQL> commit;
Commit complete.
Elapsed: 00:00:32.89

So these were the culprits. But a view (judging by the VW prefix in the name)? What is this? Where did it come from?

The delete followed by an insert suggested that this may not be an ordinary view, but rather a materialized view.

Upon checking DBA_OBJECTS:

SQL> select owner, object_name, object_id, object_type from dba_objects where object_name='VW_APC_LEAD_STYLE';

OWNER      OBJECT_NAME           OBJECT_ID OBJECT_TYPE
---------- ------------------- ---------- -------------------
APC        VW_APC_LEAD_STYLE        50243 TABLE
APC        VW_APC_LEAD_STYLE        50245 MATERIALIZED VIEW
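The refresh behavior can also be confirmed straight from the data dictionary; a query along these lines against DBA_MVIEWS shows the refresh mode (COMMIT vs. DEMAND) and method:

```sql
SELECT owner, mview_name, refresh_mode, refresh_method
  FROM dba_mviews
 WHERE mview_name = 'VW_APC_LEAD_STYLE';
```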

With the existence of a materialized view confirmed, the last piece of the puzzle was to make sure that the materialized view was built on the base table "PREP_SHEET". Though this almost certainly had to be the case, we needed proof.

So, upon checking the DDL for the materialized view:

CREATE MATERIALIZED VIEW APC.VW_APC_LEAD_STYLE
BUILD IMMEDIATE
USING INDEX
TABLESPACE APC_DATA
REFRESH COMPLETE ON COMMIT
WITH PRIMARY KEY
AS
select EVENT_ID,
<....sniped...>
max(BLOCK_NUMBER) as BLOCK_NUMBER
from apc.prep_sheet
where LEAD_OR_ASSOC != 'A'
GROUP BY
<......sniped.....>
DESCRIPTION;
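(For reference, that DDL can be pulled straight from the dictionary with DBMS_METADATA, along these lines:)

```sql
SELECT DBMS_METADATA.GET_DDL('MATERIALIZED_VIEW', 'VW_APC_LEAD_STYLE', 'APC')
  FROM dual;
```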

This proved that the reason the commit took all that time with no rows deleted was not a database performance problem, but objects that had been created behind the scenes.

Later on, this issue started throwing ORA-00603 errors like the one below.

ksedmp: internal or fatal error
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [4080], [1], [131], [], [], [], [], []
Current SQL statement for this session:
COMMIT

Checking the alert.log confirmed it:

Errors in file /dkha3090/ora01/app/oracle/admin/apcu/udump/apcu_ora_10641626.trc:
ORA-00600: internal error code, arguments: [4080], [1], [131], [], [], [], [], []
Mon Jan 19 16:54:37 2009
Following on-commit snapshots not refreshed :
APC.VW_APC_LEAD_STYLE
Error 600 trapped in 2PC on transaction 4.31.61701. Cleaning up.
Error stack returned to user:
ORA-00600: internal error code, arguments: [4080], [1], [131], [], [], [], [], []
Mon Jan 19 16:54:37 2009
Errors in file /dkha3090/ora01/app/oracle/admin/apcu/udump/apcu_ora_10641626.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [4080], [1], [131], [], [], [], [], []
Mon Jan 19 16:54:41 2009
DISTRIB TRAN APCU.WORLD.b6b63999.4.31.61701
is local tran 4.31.61701 (hex=04.1f.f105)
insert pending collecting tran, scn=83987547675 (hex=13.8e0c461b)
Mon Jan 19 16:54:42 2009
DISTRIB TRAN APCU.WORLD.b6b63999.4.31.61701
is local tran 4.31.61701 (hex=04.1f.f105))
delete pending collecting tran, scn=83987547675 (hex=13.8e0c461b)

which led us to Oracle Bug 1385495.

With the materialized view changed to REFRESH ON DEMAND, the problem was resolved.
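The fix itself is a one-liner; a sketch of the change (the view then has to be refreshed explicitly, for example from a scheduled job):

```sql
ALTER MATERIALIZED VIEW apc.vw_apc_lead_style REFRESH COMPLETE ON DEMAND;

-- subsequently, refresh on your own schedule ('C' = complete refresh):
EXEC DBMS_MVIEW.REFRESH('APC.VW_APC_LEAD_STYLE', 'C');
```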

Takeaways:

1. A simple, logical approach to performance troubleshooting always works.

2. Naming conventions are very important. Generally, VW is used to refer to views, though this is not written in stone. A prefix of MVIEW would have made it instantly recognizable as a materialized view rather than a view, though in this case it was identified quickly anyway.

3. Someone once said that a 10046 trace is a treasure trove. That is still true.

Wednesday, January 14, 2009

OBI Repository - Incremental Change Maintenance

I recently came across an interesting challenge: incrementally merging changes into the OBI repository. Since this is an ongoing production support and maintenance process, the scenario is all too common: migrate changes from DEV -> PROD (I’m skipping the QA/TEST environment for brevity). I’m sure you get the point.

PROD is live and well. Changes are being made to the development environment as part of ongoing maintenance and enhancements, and the DEV environment now needs to be rolled into production without disrupting the production repository.

I browsed around to see whether this had been addressed in forums and blogs. Though I came across a couple of articles, each had some issue or other. One suggested a way to add entirely new information; in another, the delta information resulted in duplicate entries of previously existing information, requiring a manual cleanup, which is far from ideal, especially if your repository (rpd) is big and complex.

I found a tip in the OBIEE Forums where Madan has an elegant solution that works, though Oracle warns that the option is deprecated. That solution has been explained with nice screenshots here.

I try to address the problem using the supported “merge repository” feature; hopefully you will find this helpful. I tested it in a fairly complex setting, but for illustration purposes I took a very simple example scenario and have shown it below.

We are well aware of the “MERGE” feature in the OBI Administration tool, and of the Original, Modified, and Current repositories and what they all stand for. In my opinion, these terms are confusing and do not clearly convey what they really mean. For instance, is the Original repository the one I am “currently” using in production, or is that the Current repository? Does the Current repository refer to the freshly updated DEV repository, or is that the Original repository because it has all the updates? Or shall we say the “true Original”, since the changes not yet reflected in PROD disqualify it from being the “Original”? I am going to coin new names for the repositories in question that I think will be instantly recognizable.

Current Repository – hereafter referred to as “Target Repository”: the repository that is actively being used in production. In other words, the target repository is the one that needs to be incrementally updated.

Original Repository – hereafter referred to as “Updated Repository”: the repository (usually in DEV/QA) that has all the release updates that should eventually be applied to PRODUCTION (i.e., the Target Repository).

Modified Repository – hereafter referred to as “Copy of Updated Repository”: a plain copy of the Updated Repository. Using a copy of the Updated Repository as the modified repository is what actually makes sense.

In my example, I take the simplest case of a target repository where a table “SINGLE_TABLE_TEST” exists with one column (COL1). A change has to be made by adding a DESCRIPTION column to the table. Once this has been done in the Updated Repository, the Target Repository needs to be updated.

Open the Target Repository in Offline mode.

This screenshot indicates that the Target Repository currently has just COL1 as part of the SINGLE_TABLE_TEST structure.

Select the Merge option from the File menu and open the “Updated Repository” (sh_obiee_usage_track – Copy.rpd in my example).

In the Merge Repositories window, select the copy of the Updated repository for the “Modified Repository” option. Once you are done, you will see the screen below.

This screen shows the pieces in all three repository copies. In the middle window, the default description will be “Deleted from Current”. Select “Current” as the decision option.

Once you click “Merge”, you will see that the merged repository has just the incremental updates added to the target repository. By backing up the target repository and renaming the merged repository to the target’s name, we now have the incremental updates in production. In my testing, even the session and environment variables remained unchanged.

If you have any updates/comments/links to enhance this note please feel free to let me know.

Friday, January 9, 2009

RMAN Improvements Over Versions

Back in the days of 9iR2, RMAN was getting very popular, but there were features where end-user requirements were not met to their fullest extent. Over the subsequent releases, Oracle has improved on these issues, and RMAN has become one of my favorite tools.

One of the biggest improvements in RMAN came in Oracle 10g with the introduction of block change tracking. However, another feature left us longing for more: the "KEEP FOREVER" feature, which lets us keep a backup forever (or for as long as we desire), so to speak.

In Oracle 9i (9.2.0.7, to be precise), I tried to use this feature to retain my Oracle ERP 11i backups until critical project phases were complete, say CRP II or UAT. In 9i, when the retention policy was set to a number of redundant copies and one copy of the database backup was meant to be kept forever, we ran into an issue where the newest nth backup would not actually happen.
In other words, if redundancy was set to, say, 5, and the first backup was "kept forever", then the 6th backup was not actually retained for restore. Instead, the backup commands completed successfully without the physical backup taking place. The only option, as I recall, was to keep incrementing the redundancy until the "keep forever" backup was due to expire, then do catalog maintenance to clean up subsequent backups before returning to the original redundancy setting. To say the least, this was not very helpful.
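As a sketch of that 9i scenario (the tag name here is hypothetical): the long-term copy is exempt from the retention policy, and in 9.2.0.7 the combination below was where the trouble started, with the next regular backup after the retention window filled up reporting success without actually being retained for restore.

```sql
RMAN> CONFIGURE RETENTION POLICY TO REDUNDANCY 5;
RMAN> BACKUP DATABASE TAG 'erp_crp2' KEEP FOREVER LOGS;
```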

Then in Oracle 10g, I was delighted to see this issue resolved, but one last nagging issue remained with long-term backups: the requirement that the backup be in a consistent state in order for the archived logs to be deleted. If the backup was taken in an inconsistent state, Oracle kept every single archived log for recovery. I used to wonder why this was enforced, since an inconsistent backup, when used for a restore, will be recovered to a point in time or SCN and opened with RESETLOGS. In that case there is no need to keep "all" the archives right up to this minute.

Finally, this critical issue has been answered in Oracle 11g.

Oracle 11g has introduced a new feature for RMAN backups, the "restore point", wherein a restore point tracking the last SCN of a long-term inconsistent backup is maintained. Gone is the "keep forever logs|nologs" syntax; instead we use "keep forever restore point XXX". This gives us the ability to delete all the archived logs generated after the long-term backup, saving a lot of space that would otherwise be needed to keep them all.

So with 11g, we can now take a long-term inconsistent backup with no storage overhead for all the archived logs generated afterwards. That's great!

Below is the RMAN output for a 10g and an 11g "keep forever" backup, respectively.


10g Keep Forever Backup

RMAN> backup full database tag='full_keep_forever' keep forever logs;

Starting backup at 04-DEC-08
using channel ORA_DISK_1
using channel ORA_DISK_2
backup will never be obsolete
archived logs required to recover from this backup will expire when this backup expires
skipping datafile 6; already backed up 2 time(s)
channel ORA_DISK_1: starting full datafile backupset
channel ORA_DISK_1: specifying datafile(s) in backupset
input datafile fno=00004 name=C:\ORACLE\PRODUCT\ORADATA\O10203\USERS01.DBF
input datafile fno=00008 name=C:\ORACLE\PRODUCT\ORADATA\O10203\RMAN_TEST_TBS01.DBF
input datafile fno=00005 name=C:\ORACLE\PRODUCT\ORADATA\O10203\USERS02.DBF
channel ORA_DISK_1: starting piece 1 at 04-DEC-08
channel ORA_DISK_2: starting full datafile backupset
channel ORA_DISK_2: specifying datafile(s) in backupset
input datafile fno=00001 name=C:\ORACLE\PRODUCT\ORADATA\O10203\SYSTEM01.DBF
input datafile fno=00002 name=C:\ORACLE\PRODUCT\ORADATA\O10203\UNDOTBS01.DBF
input datafile fno=00003 name=C:\ORACLE\PRODUCT\ORADATA\O10203\SYSAUX01.DBF
input datafile fno=00007 name=C:\ORACLE\PRODUCT\ORADATA\O10203\NEW_RW_TBS01.DBF
channel ORA_DISK_2: starting piece 1 at 04-DEC-08
channel ORA_DISK_2: finished piece 1 at 04-DEC-08
piece handle=F:\RMANBACKUP\E3K1DTCM_1_1 tag=FULL_KEEP_FOREVER comment=NONE
channel ORA_DISK_2: backup set complete, elapsed time: 00:03:21
channel ORA_DISK_1: finished piece 1 at 04-DEC-08
piece handle=F:\RMANBACKUP\E2K1DTCK_1_1 tag=FULL_KEEP_FOREVER comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:03:38
Finished backup at 04-DEC-08

Starting Control File and SPFILE Autobackup at 04-DEC-08
piece handle=C:\ORACLE\PRODUCT\DB102\DATABASE\C-4123959065-20081204-01 comment=NONE
Finished Control File and SPFILE Autobackup at 04-DEC-08

11g Keep Forever Backup

RMAN> backup full database tag='full_keep_forever' keep forever restore point abc;

Starting backup at 04-DEC-08
current log archived

using channel ORA_DISK_1
backup will never be obsolete
archived logs required to recover from this backup will be backed up
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00002 name=C:\ORACLE\PRODUCT\ORADATA\ORA11\SYSAUX01.DBF
input datafile file number=00001 name=C:\ORACLE\PRODUCT\ORADATA\ORA11\SYSTEM01.DBF
input datafile file number=00003 name=C:\ORACLE\PRODUCT\ORADATA\ORA11\UNDOTBS01.DBF
input datafile file number=00004 name=C:\ORACLE\PRODUCT\ORADATA\ORA11\USERS01.DBF
input datafile file number=00005 name=C:\ORACLE\PRODUCT\ORADATA\ORA11\EISCAT01.DBF
input datafile file number=00006 name=C:\ORACLE\PRODUCT\ORADATA\ORA11\HYPUSER01.DBF
input datafile file number=00007 name=C:\ORACLE\PRODUCT\ORADATA\ORA11\PROD_SRC01.DBF
channel ORA_DISK_1: starting piece 1 at 04-DEC-08
channel ORA_DISK_1: finished piece 1 at 04-DEC-08
piece handle=C:\ORACLE\PRODUCT\11G\DB_1\DATABASE\11K1DTMI_1_1 tag=FULL_KEEP_FOREVER comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:02:58


current log archived
using channel ORA_DISK_1
backup will never be obsolete
archived logs required to recover from this backup will be backed up
channel ORA_DISK_1: starting archived log backup set
channel ORA_DISK_1: specifying archived log(s) in backup set
input archived log thread=1 sequence=191 RECID=31 STAMP=672593814
channel ORA_DISK_1: starting piece 1 at 04-DEC-08
channel ORA_DISK_1: finished piece 1 at 04-DEC-08
piece handle=C:\ORACLE\PRODUCT\11G\DB_1\DATABASE\12K1DTSQ_1_1 tag=FULL_KEEP_FOREVER comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01

using channel ORA_DISK_1
backup will never be obsolete
archived logs required to recover from this backup will be backed up
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current SPFILE in backup set
channel ORA_DISK_1: starting piece 1 at 04-DEC-08
channel ORA_DISK_1: finished piece 1 at 04-DEC-08
piece handle=C:\ORACLE\PRODUCT\11G\DB_1\DATABASE\13K1DTT9_1_1 tag=FULL_KEEP_FOREVER comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01

using channel ORA_DISK_1
backup will never be obsolete
archived logs required to recover from this backup will be backed up
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
channel ORA_DISK_1: starting piece 1 at 04-DEC-08
channel ORA_DISK_1: finished piece 1 at 04-DEC-08
piece handle=C:\ORACLE\PRODUCT\11G\DB_1\DATABASE\14K1DTTN_1_1 tag=FULL_KEEP_FOREVER comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 04-DEC-08

RMAN>