Saturday, August 14, 2010

Primary-Physical Standby with different db_name

Recently, I came across a posting in Oracle Forums regarding a primary-physical standby configuration with primary and standby having different db_names. While there were many responses, questions and followups, the basic question remains.

Why would anyone want to have a standby with a different name than that of the primary? however, I wanted to test this curious scenario to see what happens.

The test was done in a 10.2.0.5 version. ***Some obvious feedbacks from the various commands are deleted to keep the output concise***

Let's start with the primary with the original db_name.


SQL> startup mount pfile='/tmp/initnxyz812.ora';
ORACLE instance started.
Database mounted.
SQL> select name,dbid,database_role,open_mode from v$database;

NAME            DBID DATABASE_ROLE    OPEN_MODE
--------- ---------- ---------------- ----------
ASML2     1922246803 PRIMARY          MOUNTED

[oracle10@linux2 10.2]$ nid target=sys/sys dbname=TEST1 setname=Y;

Connected to database ASML2 (DBID=1922246803)
Connected to server version 10.2.0
Control Files in database:
    +DATARAC10G/asm/controlfile/current.276.720729679
Change database name of database ASML2 to TEST1? (Y/[N]) => Y
Proceeding with operation
Changing database name from ASML2 to TEST1
    Control File +DATARAC10G/asm/controlfile/current.276.720729679 - modified
    Datafile +DATARAC10G/asm/datafile/system.288.720728637 - wrote new name
    Datafile +DATARAC10G/asm/datafile/undotbs1.289.720728605 - wrote new name
    Datafile +DATARAC10G/asm/datafile/sysaux.dbf.290.720728559 - wrote new name
    Datafile +DATARAC10G/asm/datafile/users01.dbf.291.720728283 - wrote new name
    Datafile +DATARAC10G/asm/datafile/example.287.720728703 - wrote new name
    Datafile +DATARAC10G/asml2/datafile/fm_auto.280.724896241 - wrote new name
    Datafile +DATARAC10G/asml2/datafile/fm_manual.279.724896573 - wrote new name
    Datafile +DATARAC10G/redl2/tempfile/temp.264.718373311 - wrote new name
    Control File +DATARAC10G/asm/controlfile/current.276.720729679 - wrote new name
    Instance shut down

Now, let's start the database and run some DMLs on primary and capture relevant information.


SQL> select max(sequence#) from v$archived_log;

MAX(SEQUENCE#)
--------------
           240

SQL> select * from before_all_changes;

      COL1
----------
       100
       200
       300
       400

SQL> select name,dbid,open_mode,database_role from v$database;

NAME            DBID OPEN_MODE  DATABASE_ROLE
--------- ---------- ---------- ----------------
TEST1     1922246803 READ WRITE PRIMARY

SQL> select name,value from v$parameter
  2  where name in ('instance_name',db_name','log_archive_dest_2',
  3  'log_archive_dest_state_2');

SQL> col name for a20
SQL> col value for a25
SQL> /

NAME                 VALUE
-------------------- -------------------------
log_archive_dest_2   SERVICE=NABC812
log_archive_dest_sta enable
te_2
instance_name        NXYZ812
db_name              TEST1

SQL> insert into before_all_changes values (500);

1 row created.

SQL> commit;

Commit complete.

SQL> alter system switch logfile;

System altered.

SQL> select max(sequence#) from v$archived_log;

MAX(SEQUENCE#)
--------------
           241




On the Physical Standby Database:

SQL> startup mount pfile='/tmp/initnabc812.ora';
ORACLE instance started.

SQL> select sequence#,archived,applied,deleted,status from v$archived_log
  2  where sequence# >=238;

 SEQUENCE# ARC APP DEL S
---------- --- --- --- -
       238 YES YES NO  A
       239 YES YES NO  A
       240 YES YES NO  A
       241 YES NO  NO  A

SQL> select name,dbid,open_mode,database_role from v$database;

NAME            DBID OPEN_MODE  DATABASE_ROLE
--------- ---------- ---------- ----------------
ASML2     1922246803 MOUNTED    PHYSICAL STANDBY


SQL> select name,value from v$parameter
  2  where name in ('instance_name','db_name','fal_server','fal_client');

NAME                 VALUE
-------------------- --------------------
fal_client           NABC812
fal_server           NXYZ812
instance_name        NABC812
db_name              ASML2

SQL> host ps -ef| grep mrp
oracle10  9190  7030  0 09:16 pts/1    00:00:00 /bin/bash -c ps -ef| grep mrp
oracle10  9192  9190  0 09:16 pts/1    00:00:00 grep mrp

SQL> alter database recover managed standby database disconnect from session;

Database altered.


SQL> select sequence#,archived,applied,deleted,status from v$archived_log
  2  where sequence#>=238;

 SEQUENCE# ARC APP DEL S
---------- --- --- --- -
       238 YES YES NO  A
       239 YES YES NO  A
       240 YES YES NO  A
       241 YES YES NO  A

SQL> alter database recover managed standby database cancel;

Database altered.

SQL> alter database open read only;

Database altered.

SQL> select * from before_all_changes;

      COL1
----------
       100
       200
       300
       400
       500

So though the controlfile, datafile and redolog files has the db_name, it seems the DBID plays a vital role than the db_name. Even creation of new datafiles/tablespaces etc works!!

Friday, May 21, 2010

MAX_UTILIZATION_LIMIT - A Very useful option in 11gR2

Resource allocation and management in large SMP systems has always been a challenge especially in large data warehouse systems. At the client sites that I have been to, the challenge will be in the form of critical ETL jobs that does massive data processing while trickle-down feeds that gets incrementally loaded in the staging tables for subsequent processing.

While Oracle manages CPU resources efficiently using database resource manager APIs, there are cases when this does not satisfy the real world requirements.

For instance, the CPU resource allocation pretty much goes by allocation based on how much resource is currently available at the time a process started. Once the process starts it is impossible to take away resources from the job for redistribution.

Imagine a scenario where a slew of jobs from a low priority group kicks off when the system is totally idle. Based on the way resource management is handled in Oracle, all the available CPU resources will be divided among the processes. Now while these processes are consuming CPU, if a job from a high priority group kicks off, the resources are not immediately available and the high priority job has to work with what is available. Potentially, if the low priority jobs are many and as a whole consumes a significant amount of CPU, then this directly will impact the high priority job.

Oracle Database 11gR2 provides a new parameter "MAX_UTILIZATION_LIMIT" for the resource manager APIs that let's us set an "absolute" maximum CPU that a group and it's associated processes can use. This is really significant. With this now we can guarantee some low priority jobs will not eat up CPU cycles while more important jobs need to take a hit.

Below is a sample.

Some portion of the code was re-used from Jonathan Lewis's kill_cpu.

The test was conducted in a completely idle system running 11gR2/OEL5/ASM 11R2.

In the first case, the plan with 2 groups with "max_utilization" resource allocation of 50/50 was created and a simple create table statement was executed. The tkprof output,

create table tab_stingy_cpu_new_50_50 as
  select a.name,a.type,a.line,a.text,b.*
from dba_source a, dba_objects b
 where rownum <=1000000

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.12       0.18          0          0          0           0
Execute      1      7.04      27.22      10554      12559      27033     1000000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      7.16      27.41      10554      12559      27033     1000000


The next sample shows, the elapsed time change when the max_utilization_limit is set to a higly imbalanced (1,99) fro the same respective groups.

exec dbms_resource_manager.create_pending_area;
exec dbms_resource_manager.update_plan_directive('STINGY_CPU','STINGY_CPU_CON_GROUP',NEW_MAX_UTILIZATION_LIMIT=>1);
exec dbms_resource_manager.update_plan_directive('STINGY_CPU','OTHER_GROUPS',NEW_MAX_UTILIZATION_LIMIT=>99);
exec dbms_resource_manager.validate_pending_area;
exec dbms_resource_manager.submit_pending_area


SQL> select a.sid,a.resource_consumer_group,
b.MAX_UTILIZATION_LIMIT from v$session a, DBA_RSRC_PLAN_DIRECTIVES  b where plan='STINGY_CPU'
and a.resource_consumer_group=b.group_or_subplan
and a.sid=(select sys_context('USERENV','SID') from dual)  2    3    4 
  5  /

       SID RESOURCE_CONSUMER_GROUP          MAX_UTILIZATION_LIMIT
---------- -------------------------------- ---------------------
       146 STINGY_CPU_CON_GROUP                                 1

create table tab_stingy_cpu_new_1_99 as
  select a.name,a.type,a.line,a.text,b.*
from dba_source a, dba_objects b
 where rownum <=1000000

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.12       6.17          0          0          0           0
Execute      1      7.03     305.25      10554      12559      27033     1000000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      7.16     311.43      10554      12559      27033     1000000



Amazingly, to my surprise the CPU time is exactly the same (7.16 seconds). I confess, I did not expect this :) However, if you notice the elapsed time is 311 seconds. This is because, the max_utilization_limit enforces that the process does not take too much CPU though the machine is complete idle.

This in my opinion is an amazing feature from Data Warehousing systems perspective where I can be confident with my ETL load and Materialized view refresh timings.

After all, consistency is more important than speed when it comes to explaining performance to end users.

All comments welcome.

Friday, May 7, 2010

UNIX Group Permissions when Installing 10g on 11g ASM

The following was a lesson learned with proper group permissions when interacting with the new and improved 11gR2 ASM. I thought perhaps, this might be useful for someone who is facing this scenario in the future. Hope Google, provides this as a possible link if you ever search this problem.
I tried to install Oracle 10g database on 11.2 ASM. I already have an 11.2 RAC DB running fine on this 11.2 ASM (also RAC) instance.
Upon completing the DB binary install (10.2.0.1), I tried to create the database using DBCA. At the point where it asks to select the type of the file, When I selected ASM: The following message pops up.




It erroed saying that the ASM instance was not running. Obviously, it was. So, It was time to look under the covers.

First place was to look at the trace.log in $OH/cfgtoollogs/dbca
trace.log (final few lines ) were...

[AWT-EventQueue-0] [22:31:45:302] [ASMInstance.discoverASMInstance:369] Found existing ASM -- Sid: +ASM2 OracleHome: /u01/app/11.2.0/grid
[AWT-EventQueue-0] [22:31:45:303] [ASMInstance.discoverASMInstance:374] ORATAB_FILE /etc/oratab
[AWT-EventQueue-0] [22:31:45:305] [ASMInstance.discoverASMInstance:376]discoverOSMInstance:bExists=true
[AWT-EventQueue-0] [22:31:45:310] [ASMInstance.initialize:339]initialize:m_bASMPresent=true
[AWT-EventQueue-0] [22:31:46:925] [ASMInstance.validateASM:475] Validating SI-ASM...
[AWT-EventQueue-0] [22:31:46:952] [SQLEngine.initialize:244] Execing SQLPLUS/SVRMGR process...
[AWT-EventQueue-0] [22:31:46:973] [SQLEngine.initialize:272] m_bReaderStarted: false
[AWT-EventQueue-0] [22:31:46:975] [SQLEngine.initialize:276] Starting Reader Thread...
[AWT-EventQueue-0] [22:31:47:33] [ASMInstance.detectASMInstance:775]detectOSMInstance:m_bASMPresent=true
[AWT-EventQueue-0] [22:31:47:34] [ASMInstance.detectASMInstance:776]detectOSMInstance:m_bASMRunning=false
[AWT-EventQueue-0] [22:31:47:147] [SQLEngine.done:1960] Done called

Here is the problem. It just stops abruptly instead of listing the ASM disk groups to proceed further with the install. So something is amiss.
[AWT-EventQueue-0] [22:31:47:148] [SQLEngine.reInitialize:605] Reinitializing SQLEngine...
[AWT-EventQueue-0] [22:31:47:151] [SQLEngine.initialize:244] Execing SQLPLUS/SVRMGR process...

Looking into metalink, notes: 948456.1

Point #1 was not the reason for my error, as the cluster in linux2 (my node) is pinned.
[root@linux2 ~]# olsnodes -t -n
linux1 1 Unpinned
linux2 2 Pinned

I looked at Point#2 and this was the exact error message I received. So I decided to take that route and did an upgrade to 10.2.0.5. I could not find out whether (8288940) is included in 10.2.0.5. It was a leap of faith.

After Upgrade, I started DBCA and came to the same error. So I decided to compare the o/s level permissions between the 11.2 oracle id (oracle) and 10.2 oracle id (oracle10).
I saw a difference.

[oracle@linux2 ~]$ id <-- This is the oracle db software owner id for 11gR2.
uid=1101(oracle) gid=1000(oinstall) groups=502(dba),503(oper),1000(oinstall),1201(asmdba)

[oracle10@linux2 ~]$ id <-- This is the Oracle s/w owner for 10g
uid=1104(oracle10) gid=1000(oinstall) groups=1000(oinstall),1200(asmadmin),1203(dba10)


As you can see, the asmdba group is missing for Oracle10.
Oracle 11.2 provides new security restrictions to manage and maintain ASM. Gone are the days when the "sys" user of the database can do everything on ASM. Specifically, oracle provides (asmadmin, asmdba, and asmoper). You can read more about the roles and privileges.

It states:

Members of the (OSASM group), a.k.a asmadmin can:

use SQL to connect to an Oracle ASM instance as SYSASM using operating system authentication. The SYSASM privileges permit mounting and dismounting disk groups, and other storage administration tasks.

Whereas,
Members of the ASM Database Administrator group (OSDBA for ASM) are granted read and write access to files managed by Oracle ASM. The grid infrastructure installation owner and all Oracle Database software owners must be a member of this group, and all users with OSDBA membership on databases that have access to the files managed by Oracle ASM must be members of the OSDBA group for ASM.

Translation:

Though one might think (I certainly did) "asmadmin" is some sort of "admin" group that would have a superset of privileges, in fact, this is not true. It is more important to have the oracle software owners to be part of "asmdba" group than "asmadmin” group.

Upon adding the “asmdba” group to “oracle10” id, I restarted the install. This time the DBCA went past the problem area.
















This time the trace.log has the following.

94 [AWT-EventQueue-0] [23:9:41:819] [ASMInstance.discoverASMInstance:369] Found existing ASM -- Sid: +ASM2 OracleHome: /u01 /app/11.2.0/grid
95 [AWT-EventQueue-0] [23:9:41:820] [ASMInstance.discoverASMInstance:374]ORATAB_FILE /etc/oratab
96 [AWT-EventQueue-0] [23:9:41:821] [ASMInstance.discoverASMInstance:376]discoverOSMInstance:bExists=true
97 [AWT-EventQueue-0] [23:9:41:825] [ASMInstance.initialize:339]initialize:m_bASMPresent=true
98 [AWT-EventQueue-0] [23:9:42:721] [ASMInstance.validateASM:475] Validating SI-ASM...
99 [AWT-EventQueue-0] [23:9:42:742] [SQLEngine.initialize:244] Execing SQLPLUS/SVRMGR process...
100 [AWT-EventQueue-0] [23:9:42:757] [SQLEngine.initialize:272] m_bReaderStarted: false
101 [AWT-EventQueue-0] [23:9:42:758] [SQLEngine.initialize:276] Starting Reader Thread...
102 [AWT-EventQueue-0] [23:9:42:809] [ASMInstance.detectASMInstance:775]detectOSMInstance:m_bASMPresent=true
103 [AWT-EventQueue-0] [23:9:42:810] [ASMInstance.detectASMInstance:776]detectOSMInstance:m_bASMRunning=false
104 [AWT-EventQueue-0] [23:9:42:919] [ASMInstance.detectASMInstance:803] OSM Instance is running
105 [AWT-EventQueue-0] [23:9:42:990] [ASMInstance.detectASMInstance:880] ASM version = 11.2.0.1.0
106 [AWT-EventQueue-0] [23:9:43:206] [InitParamAttributes.loadDBParams:4123]Checking if SPFILE is used
107 [AWT-EventQueue-0] [23:9:43:263] [InitParamAttributes.loadDBParams:4134]spParams = [Ljava.lang.String;@1e2a069
108 [AWT-EventQueue-0] [23:9:43:265] [ASMParameters.loadASMParameters:441]useSPFile=true
109 [AWT-EventQueue-0] [23:9:43:265] [StorageOptionsPage.validate:569] ASM is present and started.
110 [AWT-EventQueue-0] [23:9:43:347] [DiskGroupsPanel.initialize:262] Loading Data in the Table..
111 [AWT-EventQueue-0] [23:9:43:348] [DiskGroupDataSource.loadData:252] Load the data from the datasource
112 [AWT-EventQueue-0] [23:9:43:422] [ASMInstance.loadDiskGroups:1811] sql to be executed:=select NAME||'|'||round(TOTAL_MB)||'|'||round(USABLE_FILE_MB)||'|'||nvl(TYPE, 'DBCA_NULL')||'|'||STATE from v$asm_diskgroup order by NAME
113 [AWT-EventQueue-0] [23:9:43:611] [ASMInstance.loadDiskGroups:1815] Loading the diskgroup from schema....
114 [AWT-EventQueue-0] [23:9:43:612] [ASMInstance.loadDiskGroups:1823] diskgroup: CRS|3069|2673|EXTERN|MOUNTED
115 [AWT-EventQueue-0] [23:9:43:615] [ASMInstance.loadDiskGroups:1823] diskgroup: DATADG|30719|28615|EXTERN|MOUNTED
116 [AWT-EventQueue-0] [23:9:43:616] [ASMInstance.loadDiskGroups:1823] diskgroup: DATARAC10G|30719|30624|EXTERN|MOUNTED
117 [AWT-EventQueue-0] [23:9:43:617] [ASMInstance.loadDiskGroups:1823] diskgroup: FRADG|30719|30192|EXTERN|MOUNTED
118 [AWT-EventQueue-0] [23:9:43:688] [ASMInstance.setBestDiskGroup:1432] sql to be executed:=select name from v$asm_diskgrou p where free_mb= (select max(free_mb) from v$asm_diskgroup)

Just to ensure "asmdba" is the only critical group needed, I took out "asmadmin" and tried the install once again.

[oracle10@linux2 ~]$ id
uid=1104(oracle10) gid=1000(oinstall) groups=1000(oinstall),1201(asmdba),1203(dba10)

It was able to recognize the disks.

So, please make sure that the oracle binary owners are part of the "asmdba" group when creating databases on 11.2 ASM.
















Friday, April 23, 2010

11gR2 Oracle Local Registry

Until Oracle Database 11gR1, the RAC configurations consisted of just one registry when running Oracle Clusterware. Shortly called OCR, Oracle Cluster Registry, maintained the cluster level resource information, privileges etc. To be precise, the OCR maintained information about 2 sets of node level resources, namely, the Oracle Clusterware Components (CRS, CSS, evm) as well as Cluster resources (DB, Listener etc).

In 11gR2, Oracle has introduced a new registry to maintain the clusterware resources (css, crs,evm,gip and more) in a new registry called Oracle Local Registry (OLR). The OCR still exists, but maintains only the cluster resources.

Why this distinction?

Before we get into this, we should see some of the improvements in Oracle 11gR2 RAC infrastructure. Until 11gR2, the CRS resources namely the OCR components and the voting disks were maintained in RAW or shared file systems. With the new 11gR2, the Oracle clusterware related files can be maintained in Oracle ASM (Automatic Storage Management). A feature that was introduced with Oracle 10g DB release. This ability to host OCR and Voting disks in ASM poses an interesting situation.

In order for the cluster resources to be up, the ASM needs to be up. If ASM needs to be up, the clusterware components should be functional. By having all the CRS and cluster resource information stored in OCR, this contradicting situation cannot be resolved unless somehow the cluster sepcific components detail is separately maintained from other resources/services.

As a solution, Oracle has come up with a new approach; the Oracle Local Registry. The Oracle Local registry maintains the node specific information and gets created with Oracle Clusterware installation of OCR. Since this maintains node specific resources, the clusterware components (crs,css,ctss,evm,gip, and asm) can be made available, with ASM being made available, this makes the OCR and voting disks access possible which eventually opens up the various cluster resources and components.

Without OLR, the clusterware resources will not start which in turn will not start the dependent components.

NOTE: OLR has to be *manually* backed up. There is no support for an automatic backup of OLR.

A quick (dirty) peek at the olr shows the resources that are being maintained.

[root@linux2 linux2]# strings -a backup_20100422_152429.olr| grep ! | sort -u| grep -v type
!DAEMON_TRACING_LEVELS
ora!asm
ora!crsd
ora!cssd
ora!cssdmonitor
ora!ctssd
ora!diskmon
ora!drivers!acfs
ora!evmd
ora!gipcd
ora!gpnpd
ora!mdnsd


A similar peek at the OCR backup file shows:

[root@linux2 racnode-cluster]# strings -a backup_20100423_091503.ocr| grep -v type | grep ora!
ora!GREEN
ora!DATARAC10G!dg
ora!OLDSTYLE!lsnr
ora!LTEST!lsnr
ora!green!db
ora!LSNRGRID!lsnr
ora!LISTENERCUSTOM!lsnr
ora!LISTENER!lsnr
dora!FRADG!dg
Fora!DATADG!dg
Aora!linux2!vip
ora!oc4j
ora!LISTENER_SCAN1!lsnr
ora!scan1!vip
ora!registry!acfs
ora!CRS!dg
iora!asm
dora!eons
ora!ons
ora!gsd
ora!linux1!vip
ora!net1!network

Sunday, March 7, 2010

SCAN - Single Client Access Name in Oracle 11gr2

SCAN (Single Client Access Name) is a new feature in Oracle 11gR2. With this, Oracle is taking the High-Availability Plug and Play to the next level. Though this new feature has some advantages in a grid infrastructure setup, there seems to be some penalty in connection initiation when using the SCAN method.

Details about SCAN and it's workings can be gotten from Metalink.

History

To take a walk down the memory lane; Oracle 10g RAC provided the ability to use VIP (Virtual IP) to have a faster response to network related issues. Before this, database connections having network related disconnect issues, could indicate to the end-user about the state of the connection based on TCP timeout, as late as 10 minutes. This was overcome with the Oracle VIP wherein a virtual ip from the same subnet as the public IP was configured which would failover to one of the surviving instances instantaneously and send an error message to the client. This concept has been extended from a “virtual-ip from node-level” to a virtual-IP at cluster level. The Oracle VIP is still valid and being used in 11g. The SCAN (single client access name), provides a cluster level abstraction. The SCAN is configured to listen to incoming connections on a specific port. All cluster resources registers with this resource including the database listeners that are configured to the instances. SCAN configuration seems to be mandatory during Oracle 11gR2 install; however, it can be disabled afterwards if preferred.

The idea behind this feature is to make cluster node management easier and seamless to the underlying applications. In the past, node addition or removal meant editing listener entries and tnsnames entries in all the nodes that are part of the cluster. With virtualizing the connection information using SCAN this step is completely eliminated.

With SCAN, a simple entry for the SCAN VIP(s) and PORT# in TNSNAMES.ora entries is just enough making an exhaustive configuration not necessary.

What’s New

1. The remote_listener will always be set to the SCAN Name and Port.
2. The load balancing, failover parameters etc that were configured before is not needed as the SCAN listener by default takes care of these functions.

Configuration Setup

Below, a sample setup to show the workings of SCAN is provided. In the configuration, I have purposely used multiple variations to emphasize the point that it works rather seamlessly and the amount of setup and configuration required are very minimal.

The test database runs on a 2-node cluster/11gR2 with default setup for SCAN listener.

[grid@linux1 admin]$ srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node linux1
[grid@linux1 admin]$ srvctl config scan_listener
SCAN Listener LISTENER_SCAN1 exists. Port: TCP:1521

Two listeners are created one on the grid infrastructure and the other on the Oracle database homes.

[oracle@linux1 ~]$ ps -ef grep inheritgrep -v grep awk '{print $8" " $9}'
/u01/app/11.2.0/grid/bin/tnslsnr LTEST
/u01/app/11.2.0/grid/bin/tnslsnr LISTENER_SCAN1

[oracle@linux2 admin]$ ps -ef grep inheritgrep -v grep awk '{print $8" " $9}'
/u01/app/oracle/product/11.2.0/dbhome_1/bin/tnslsnr LISTENERCUSTOM

[grid@linux1 admin]$ cat endpoints_listener.ora grep LTEST
LTEST_LINUX1=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=linux1-vip)(PORT=1529))
(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.153)(PORT=1529)(IP=FIRST))))# line added by Agent

[oracle@linux2 admin]$ cat endpoints_listener.ora grep LISTENER
LISTENERCUSTOM_LINUX2=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=linux2-vip)(PORT=1522))(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.1.155)(PORT=1522)(IP=FIRST)))) # line added by Agent

SQL> set head off
SQL> col name fold_a
SQL> select name,value from v$parameter where
2 name in ('local_listener','remote_listener','instance_name');

instance_name
GREEN2

local_listener
(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=linux2-vip)(PORT=1522))))

remote_listener
racnode-cluster-scan:1521

SQL> select name,value from v$parameter where
2 name in ('local_listener','remote_listener','instance_name');

instance_name
GREEN1

local_listener
(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=linux1-vip)(PORT=1529))))

remote_listener
racnode-cluster-scan:1521

With this setup, I make connections on a quiet (sandbox) database.

[oracle@linux1 ~]$ strace -o ltest1_t -af -tt sqlplus system/sys@green @param1 instance_name

Some Log Details sniped…

NAME VALUE SES SYS ISMOD ISADJ
----------------------------------- --------------- ----- --------- ---------- -----
instance_name GREEN1 FALSE FALSE FALSE FALSE

21:53:42.615196 connect(9, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("192.168.1.187")}, 16) = -1 EINPROGRESS (Operation now in progress)
21:53:42.616490 getsockname(9, {sa_family=AF_INET, sin_port=htons(62112), sin_addr=inet_addr("192.168.1.187")}, [16]) = 0
21:53:42.623002 connect(9, {sa_family=AF_INET, sin_port=htons(1529), sin_addr=inet_addr("192.168.1.201")}, 16) = -1 EINPROGRESS (Operation now in progress)

Another attempt to the second instance, shows

NAME VALUE SES SYS ISMOD ISADJ
----------------------------------- --------------- ----- --------- ---------- -----
instance_name GREEN2 FALSE FALSE FALSE FALSE

[oracle@linux1 ~]$ cat ltest2_t grep inet
21:58:20.645723 connect(9, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("192.168.1.187")}, 16) = -1 EINPROGRESS (Operation now in progress)
21:58:20.646914 getsockname(9, {sa_family=AF_INET, sin_port=htons(23662), sin_addr=inet_addr("192.168.1.187")}, [16]) = 0
21:58:20.654415 connect(9, {sa_family=AF_INET, sin_port=htons(1522), sin_addr=inet_addr("192.168.1.202")}, 16) = -1 EINPROGRESS (Operation now in progress)

If you notice, from the connection to the “GREEN1” instance, this is the local node from where the test was initiated, so the connection sort of comes back to the host. It takes 7.8 milliseconds (623002-615196)/1000 and for the connection that has to go to a remote instance, in our case “GREEN2”, it is 8.6 milliseconds.

Just to prove that there is an overhead with SCAN, I ran another test with the 10g-like setup.

[oracle@linux1 ~]$ strace -o oldstyl1 -af -tt sqlplus system/sys@oldstyle_linux1 @param1 instance_name
NAME VALUE SES SYS ISMOD ISADJ
----------------------------------- --------------- ----- --------- ---------- -----
instance_name GREEN1 FALSE FALSE FALSE FALSE

[oracle@linux1 ~]$ cat oldstyl1 grep inet
16:40:02.555231 bind(6, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
16:40:02.666356 connect(9, {sa_family=AF_INET, sin_port=htons(1528), sin_addr=inet_addr("192.168.1.201")}, 16) = -1 EINPROGRESS (Operation now in progress)

For the remote node connection, it takes 6.5milliseconds.

[oracle@linux1 ~]$ strace -o oldstyl2 -af -tt sqlplus system/sys@oldstyle_linux1 @param1 instance_name

NAME VALUE SES SYS ISMOD ISADJ
----------------------------------- --------------- ----- --------- ---------- -----
instance_name GREEN2 FALSE FALSE FALSE FALSE

[oracle@linux1 ~]$ cat oldstyl2 grep inet
16:40:15.188398 connect(9, {sa_family=AF_INET, sin_port=htons(1528), sin_addr=inet_addr("192.168.1.201")}, 16) = -1 EINPROGRESS (Operation now in progress)
16:40:15.189370 getsockname(9, {sa_family=AF_INET, sin_port=htons(41163), sin_addr=inet_addr("192.168.1.201")}, [16]) = 0
16:40:15.194980 connect(9, {sa_family=AF_INET, sin_port=htons(1528), sin_addr=inet_addr("192.168.1.202")}, 16) = -1 EINPROGRESS (Operation now in progress).

It seems, while SCAN provides some advantages with node management at the cluster level, it seems, there is a connect-time penalty to every single connection even the connection ends up in the originating node.