Posted By Ritesh Chhajer on Monday, November 08, 2010


We'll upgrade a 10.2.0.4 RAC database to 11.2.0.2.
This has been tested on RHEL4U8 x86-64 as well as RHEL5U3 x86-64.

Prerequisites:
1. SCAN:
As a basic prerequisite for any Oracle 11g release, ensure your SCAN names have been configured and registered with either DNS or GNS. You can refer to my previous post on Oracle 11gR2 Features for more details on SCAN. Keep the SCAN names handy. In this example, we are using DNS.
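A quick sanity check (using the SCAN name test-server-clu configured later in this post) is to confirm that DNS returns the expected addresses, typically three, served round-robin:
[oracle@test-server1]~% nslookup test-server-clu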

2. Multicasting
With 11.2.0.2, Oracle has introduced the redundant interconnect, which requires multicast to be enabled on the private network. Read ML Note: 1212703.1, which has a sample program that can be used to validate whether multicasting is enabled on the server. It's always a good idea to validate this beforehand.
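ML Note: 1212703.1 ships the sample program as mcasttest.pl; assuming eth1 is the private interface (as it is later in this post), the invocation looks roughly like:
[oracle@test-server1]~% perl mcasttest.pl -n test-server1,test-server2 -i eth1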

3. Software:
Download the software from https://updates.oracle.com/download/10098816.html
Starting with 11.2.0.2, Oracle database patch sets are full installations, and Oracle's recommended approach is an out-of-place upgrade, which makes sense as it is less risky and offers an easier, faster fallback. You can read more in ML Note: 1189783.1.

4. Patch
Ensure you have the latest OPatch (patch 6880880) for 11.2. You can refer to ML Note: 274526.1 on how to download and install OPatch.
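To confirm which OPatch version a home carries, you can run the standard version check, e.g. against the new grid home:
[oracle@test-server1]~% /oragrid/product/11.2.0.2/OPatch/opatch version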
Note: If you are upgrading from 11.2.0.1 to 11.2.0.2, you'll need to apply patch 9655006 to the 11.2.0.1 GI home before upgrading to 11.2.0.2. Refer to ML Note: 10036834.8 for more details.
I would recommend Amit Bansal's blog, where he has given all the steps for applying this patch as well as screenshots for upgrading from 11.2.0.1 to 11.2.0.2.

5. Backups
As with any upgrade, take a database backup. I also prefer to save a list of services along with their preferred and available nodes so that, in case of any issues with the OCR, I have the necessary information pertinent to services. Remaining configuration like instances and listeners can always be set up again.
Example:
[oracle@test-server1]~% srvctl config service -d testdb > /oracle/11gupgrade_log/service_config.txt
[oracle@test-server1]~% srvctl status service -d testdb > /oracle/11gupgrade_log/service_status.txt
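
A minimal RMAN sketch for the backup itself (backup destination and retention are left to your configuration):
[oracle@test-server1]~% rman target /
RMAN> backup database plus archivelog;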

6. NTP
Since we are using Network Time Protocol (NTP) to synchronize time across all the servers in the cluster, a mandatory requirement with 11gR2 is to enable the slewing option by adding the '-x' argument in the NTP configuration file as seen below:
[root@test-server1 ~]# cat /etc/sysconfig/ntpd
# Drop root to id 'ntp:ntp' by default.
OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"

# Set to 'yes' to sync hw clock after successful ntpdate
SYNC_HWCLOCK=no

# Additional options for ntpdate
NTPDATE_OPTIONS=""

[root@test-server1 ~]# service ntpd stop
Shutting down ntpd: [ OK ]

[root@test-server1 ~]# service ntpd start
ntpd: Synchronizing with time server: [ OK ]
Starting ntpd: [ OK ]

[root@test-server1 ~]# service ntpd status
ntpd (pid 32062) is running...

You can also refer to ML Note: 1056693.1 to address cluvfy errors like PRVF-5436 and PRVF-9652.

7. Pre-Upgrade Utility SQL:
Download utlu112i.sql from ML Note: 884522.1 and execute it on your existing database.
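Example (the spool destination is just this post's log directory):
SQL> spool /oracle/11gupgrade_log/utlu112i.log
SQL> @utlu112i.sql
SQL> spool off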

8. DB Upgrade/Migrate Diagnostic SQL:
The script needs to be run as the SYS user, both before the upgrade on the source database and after the upgrade on the upgraded database. This helps determine the status of the database before and after the upgrade. Download the script from ML Note: 556610.1 and save it as dbupgdiag.sql.
While running the script, give the directory path where the log file should be generated as an argument.
Example:
SQL> @dbupgdiag.sql /oracle/11gupgrade_log

9. Cluster Verification Utility:
Run cluvfy from the 11g software home and take corrective actions as needed.
Example:
[oracle@test-server1]~% ./runcluvfy.sh stage -pre crsinst -n test-server1,test-server2 -verbose > /oracle/11gupgrade_log/cluvfy.log

11.2.0.2 Grid Upgrade:
I prefer to do all installations and upgrades silently, as it saves time. Copy the default response file from the 11.2.0.2 grid software home, edit the parameters below, and leave the rest at their default values. In this case, SCAN is registered with DNS and GNS is not being used; hence autoConfigureClusterNodeVIP is set to false.
Sample grid.rsp:
ORACLE_HOSTNAME=test-server1
INVENTORY_LOCATION=/oracle/oraInventory
SELECTED_LANGUAGES=en
oracle.install.option=UPGRADE
ORACLE_BASE=/oracle
ORACLE_HOME=/oragrid/product/11.2.0.2
oracle.install.asm.OSDBA=dba
oracle.install.asm.OSOPER=dba
oracle.install.asm.OSASM=dba
oracle.install.crs.config.gpnp.scanName=test-server-clu
oracle.install.crs.config.gpnp.scanPort=1521
oracle.install.crs.config.clusterName=crs
oracle.install.crs.config.autoConfigureClusterNodeVIP=false
oracle.install.crs.upgrade.clusterNodes=test-server1,test-server2

Now that the response file is prepared, we are good to kick off the installer.
[oracle@test-server1]~% ./runInstaller -silent -waitforcompletion -responseFile /home/oracle/software/grid/grid.rsp -force -ignoreSysPrereqs

Once the binaries are installed on both hosts, it will prompt us to run rootupgrade.sh as the root user, which actually performs the upgrade. When rootupgrade.sh runs, it stops the 10g CRS and starts the 11g CRS. This can be done in a rolling fashion. Downtime starts only at the rootupgrade.sh step; until then it's just an 11g software installation.

The CRS active version gets updated only after all nodes have been upgraded. Until then, new options that come with 11g crsctl won't work; only 10g-compliant commands will. Before proceeding with rootupgrade.sh, run runfixup.sh as the root user on both nodes. This is a new 11g feature that fixes the minimum required OS parameter settings for Oracle.

[root@test-server1 tmp]# cd /tmp/CVU_11.2.0.2.0_oracle
[root@test-server1 tmp]# ./runfixup.sh

Similarly, do it for the rest of the nodes.

You can verify /etc/security/limits.conf and /etc/sysctl.conf to see if they've been updated.

As the root user, execute rootupgrade.sh now:
[root@test-server1 ~]# /oragrid/product/11.2.0.2/rootupgrade.sh
[root@test-server2 ~]# /oragrid/product/11.2.0.2/rootupgrade.sh

Note: It's very important to run them sequentially.
In case rootupgrade.sh fails for some reason, follow ML Note: 969254.1 for further troubleshooting.

Once rootupgrade.sh has been executed on all the nodes, you can verify:
[oracle@test-server1]~% crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [11.2.0.2.0]
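
During a rolling upgrade, you can also check the per-node software version, which reflects the installed binaries even before the active version moves:
[oracle@test-server1]~% crsctl query crs softwareversion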

Run the Cluster Verification Utility post upgrade to verify everything is fine; then you are good to proceed with the RDBMS upgrade.
[oracle@test-server1]~% ./runcluvfy.sh stage -post crsinst -n test-server1,test-server2 -verbose > /oracle/11gupgrade_log/cluvfy_post.log

11.2.0.2 RDBMS Upgrade:
We'll install the binaries using a silent installation. Copy the default response file from the Oracle 11.2.0.2 software, edit the following parameters as below, and leave the other parameters at their default values. We are just installing the software and will do the DB upgrade using the manual method; hence oracle.install.option=INSTALL_DB_SWONLY.
Sample db.rsp:
oracle.install.option=INSTALL_DB_SWONLY
ORACLE_HOSTNAME=test-server1
UNIX_GROUP_NAME=dba
INVENTORY_LOCATION=/oracle/oraInventory
SELECTED_LANGUAGES=en
ORACLE_HOME=/oracle/product/11.2.0.2
ORACLE_BASE=/oracle
oracle.install.db.InstallEdition=EE
oracle.install.db.DBA_GROUP=dba
oracle.install.db.OPER_GROUP=dba
oracle.install.db.CLUSTER_NODES=test-server1,test-server2
oracle.install.db.config.starterdb.type=GENERAL_PURPOSE
oracle.install.db.config.starterdb.characterSet=AL32UTF8
oracle.install.db.config.starterdb.memoryOption=false
SECURITY_UPDATES_VIA_MYORACLESUPPORT=false
DECLINE_SECURITY_UPDATES=true

[oracle@test-server1]~% ./runInstaller -silent -waitforcompletion -responseFile /home/oracle/software/database/db.rsp -force

At the end of the installation, it will prompt us to run root.sh on both nodes.
Before that, run runfixup.sh as the root user on all the nodes:

[root@test-server1 ~]# cd /tmp/CVU_11.2.0.2.0_oracle
[root@test-server1 tmp]# ./runfixup.sh

Similarly, do it for the rest of the nodes.
Now, run root.sh:
[root@test-server1 ~]# /oracle/product/11.2.0.2/root.sh
[root@test-server2 ~]# /oracle/product/11.2.0.2/root.sh

Manual DB Upgrade:
We have all the binaries ready for the upgrade. Until now the database has been up and running; the actual downtime starts here. Source the 10g-specific profile file and shut down the database. Keep the new init files ready with the following changes:
*.cluster_database=false
*.compatible='11.2.0'
*.diagnostic_dest=/oracle

I also prefer to keep *.sec_case_sensitive_logon=FALSE so that existing users are not affected by the password case sensitivity introduced with 11g.

Note: If you want to keep the option of running catdwgrd.sql open, don't change the compatible parameter yet. Change it after catupgrd.sql has completed successfully.

Comment out the following parameters, as they are deprecated with 11g:
#*.audit_file_dest
#*.background_dump_dest
#*.core_dump_dest
#*.user_dump_dest

All diagnostic files, like the alert log and trace files, will now be located under the directory specified by DIAGNOSTIC_DEST. With 11g, you can also use the adrci utility to view the logs.
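For example, to tail the alert log through adrci (the home path shown is what you'd expect for this post's testdb database and TESTDB1 instance; confirm yours with "show homes"):
[oracle@test-server1]~% adrci
adrci> show homes
adrci> set homepath diag/rdbms/testdb/TESTDB1
adrci> show alert -tail 50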

Remember to copy the important files from the old 10g home to the new 11g home (see the sketch after this list):
1. init/spfile
2. password file
3. tnsnames.ora
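
A minimal sketch, assuming the home paths and SID used in this post (your file names may differ):
[oracle@test-server1]~% cp /oracle/product/10.2/dbs/initTESTDB1.ora /oracle/product/11.2.0.2/dbs/
[oracle@test-server1]~% cp /oracle/product/10.2/dbs/orapwTESTDB1 /oracle/product/11.2.0.2/dbs/
[oracle@test-server1]~% cp /oracle/product/10.2/network/admin/tnsnames.ora /oracle/product/11.2.0.2/network/admin/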

Now source the 11g specific profile file, and start the upgrade process.
[oracle@test-server1]~% echo $ORACLE_HOME
/oracle/product/11.2.0.2

[oracle@test-server1]~% echo $ORACLE_SID
TESTDB1

SQL> startup upgrade
SQL> spool upgrade.log
SQL> set time on timing on echo on
SQL> @?/rdbms/admin/catupgrd.sql
SQL> spool off

Source the 10g home profile and remove the DB from srvctl:
[oracle@test-server1]~% /oracle/product/10.2/bin/srvctl remove database -d testdb

Source the 11g home profile and add the DB to srvctl:
[oracle@test-server1]~% srvctl add database -d testdb -o /oracle/product/11.2.0.2
[oracle@test-server1]~% srvctl add instance -d testdb -i testdb1 -n test-server1
[oracle@test-server1]~% srvctl add instance -d testdb -i testdb2 -n test-server2

Change the cluster_database parameter to TRUE and start the DB using srvctl.
[oracle@test-server1]~% srvctl start database -d testdb
[oracle@test-server1]~% srvctl status database -d testdb
Instance testdb1 is running on node test-server1
Instance testdb2 is running on node test-server2

We had saved a backup of the services in /oracle/11gupgrade_log/service_config.txt.
Use that to add the services back to 11g CRS now.
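Example (the service name testdb_svc is hypothetical; -r lists the preferred instances and -a the available ones):
[oracle@test-server1]~% srvctl add service -d testdb -s testdb_svc -r testdb1 -a testdb2
[oracle@test-server1]~% srvctl start service -d testdb -s testdb_svc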

Run the post upgrade SQLs:
SQL> spool postupgrade.log
SQL> @?/rdbms/admin/utlu112s.sql --Reports updated registry version and upgrade time
SQL> @?/rdbms/admin/catuppst.sql --Migrates AWR/ADDM data from 10g to 11g dictionary
SQL> @?/rdbms/admin/utlrp.sql --Recompiles invalids
SQL> select * from registry$history; --Verifies registry
SQL> spool off

Run the diagnostic SQL to verify post upgrade.

SQL> @dbupgdiag.sql /oracle/11gupgrade_log

You can verify the status of all resources using:
crsctl status resource -t
crsctl status resource -t -init

Post Upgrade Checks:
Since we did a manual DB upgrade, remember to check the following files on all nodes post upgrade to ascertain that the new homes are reflected correctly:
1. /etc/oratab
2. inventory.xml

Besides, any home-grown shell script that references the Oracle home will need to be updated.
Also, verify Grid Control, where you'll have to re-configure the targets to ensure the new home names are identified.
You may have to restart the agent as part of it.

XDB and ACL:
Starting with Oracle 11g, so-called "fine-grained access" has been implemented to limit the use of packages like UTL_SMTP and UTL_HTTP that connect over the network to other services (a mail server, for example), so several UTL_* packages require additional permissions to be granted for network access. If your application uses these packages, you need to create an ACL and assign it to a network host using DBMS_NETWORK_ACL_ADMIN.

The ACL is an XML file which lists the permissions given to user(s). This XML is stored in Oracle XML DB. 
For XML DB creation, refer to ML Note: 742014.1.
Example:
SQL> spool catqm.log
SQL> @?/rdbms/admin/catqm.sql xdb sysaux temp
SQL> spool off
xdb - schema name
sysaux - tablespace (you can have a separate tablespace created if needed)
temp - temporary tablespace
When prompted for arg 4, press Enter, or answer "YES" or "NO" explicitly. By default it will choose SecureFiles, which corresponds to "YES".
The verification SQL xdbusagecheck.sql, which verifies that the XDB installation is valid, can be found in ML Note: 733667.1.
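
Once XDB is in place, a minimal ACL sketch (the principal APPUSER and the mail host are hypothetical; adjust both to your application):
SQL> BEGIN
       -- Create the ACL and grant the 'connect' privilege to the application user
       DBMS_NETWORK_ACL_ADMIN.CREATE_ACL(
         acl         => 'mail_access.xml',
         description => 'Network access for UTL_SMTP',
         principal   => 'APPUSER',
         is_grant    => TRUE,
         privilege   => 'connect');
       -- Bind the ACL to the mail server host and port
       DBMS_NETWORK_ACL_ADMIN.ASSIGN_ACL(
         acl        => 'mail_access.xml',
         host       => 'mailserver.example.com',
         lower_port => 25,
         upper_port => 25);
       COMMIT;
     END;
     /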

CBO Statistics:
You may want to refer to ML Note: 465787.1 for managing CBO statistics post upgrade.
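The note covers the strategy in detail; two calls commonly run after an upgrade to refresh dictionary and fixed-object statistics are:
SQL> EXEC DBMS_STATS.GATHER_DICTIONARY_STATS;
SQL> EXEC DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;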

In the next post, I'll share the new features that are pertinent only to 11.2.0.2.
A general overview of 11g new features has been covered in one of my previous posts.

Addendum:
I got a few requests to share the issues encountered during the upgrade, so here they are along with their resolution steps.

1. Inconsistent inventory
No errors were reported during the CRS upgrade. However, when trying to upgrade the RDBMS, cluvfy was throwing all kinds of errors like:
Check: Time zone consistency
CRS configuration file "/oracle/product/crs/crs/install/s_crsconfig_test-server1_env.txt" missing on node "test-server1".
Result: Time zone consistency check failed

ERROR:
PRVF-5319 : Oracle Cluster Synchronization Services do not appear to be online.

Though CSS was up, cluvfy was still showing it as down. For the time zone consistency check, it was looking for an env file in the old CRS home. That's when I suspected there could be an issue with the inventory.

[oracle@test-server1]~% cat /etc/oraInst.loc
inventory_loc=/oracle/oraInventory
inst_group=dba

[oracle@test-server1]~% cat /oracle/oraInventory/ContentsXML/inventory.xml|grep CRS
<HOME NAME="OraCrs10g_home" LOC="/oracle/product/crs" TYPE="O" IDX="1" CRS="true">

For some reason, Oracle did not update the inventory correctly.
The entry with CRS="true" should point to the 11g CRS home; instead, it was still pointing to the old 10g CRS home.

So I fixed it after taking a backup of the existing one. This is how the correct inventory should look:
[oracle@test-server1]~% cat /oracle/oraInventory/ContentsXML/inventory.xml|grep CRS
<HOME NAME="Ora11g_gridinfrahome1" LOC="/oragrid/product/11.2" TYPE="O" IDX="3" CRS="true">

Please note that manually updating the inventory is not a good idea (albeit I've been habituated to doing that).
A better and cleaner way is to use the detachHome and attachHome options of runInstaller.
See the example below:
./runInstaller -detachHome -silent -local ORACLE_HOME=/oracle/product/crs

./runInstaller -silent -local -ignoreSysPrereqs -attachHome ORACLE_HOME=/oragrid/product/11.2 ORACLE_HOME_NAME=Ora11g_gridinfrahome1 LOCAL_NODE=test-server1 CLUSTER_NODES=test-server1,test-server2 CRS=true

2. rootupgrade.sh failure
Failed to start Oracle Clusterware stack
Failed to start Cluster Synchorinisation Service in clustered mode at /oragrid/product/11.2.0.2/crs/install/crsconfig_lib.pm line 1016.
/oragrid/product/11.2.0.2/perl/bin/perl -I/oragrid/product/11.2.0.2/perl/lib -I/oragrid/product/11.2.0.2/crs/install /oragrid/product/11.2.0.2/crs/install/rootcrs.pl execution failed

Here the cause was human error: rootupgrade.sh was kicked off on the 2nd node before it had completed on node 1.

Following was done to fix it:

From node1: /oragrid/product/11.2.0.2/crs/install/rootcrs.pl -deconfig -force
From the last node: /oragrid/product/11.2.0.2/crs/install/rootcrs.pl -deconfig -force -lastnode

This wiped out the OCR and voting disk.
We touched those files and changed their permissions appropriately.
We then ran root.sh from the 10g home, which started the 10g CRS, and after that ran rootupgrade.sh from the 11g home, which was successful.
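
The file recreation looked roughly like this, using the OCR/voting file paths and ownership from issue 3 below (the cssfile location is an assumption):
[root@test-server1 ~]# touch /u05/cludata/ocrfile /u05/cludata/cssfile
[root@test-server1 ~]# chown root:dba /u05/cludata/ocrfile
[root@test-server1 ~]# chown oracle:dba /u05/cludata/cssfile
[root@test-server1 ~]# chmod 640 /u05/cludata/ocrfile /u05/cludata/cssfile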

However, I later realized that there is a proper document published by Oracle Support that addresses what to do if rootupgrade.sh fails (ML Note: 969254.1).

3. OCR/Voting Disk Permission
PRVF-5431 : Oracle Cluster Voting Disk configuration check failed

[oracle@test-server1]% ls -ltr
total 20024
-rw-r----- 1 root   dba 272756736 Oct 14 13:22 ocrfile
-rw-r--r-- 1 oracle dba  10240000 Oct 14 14:19 cssfile

The existing permission is 644; it needs to be changed to 640.
[oracle@test-server1]% chmod 640 cssfile

Similarly, you may find a permission error for the OCR like:
PRVF-4178 : Check for OCR location "/u05/cludata/ocrfile" failed:
Permissions of file "/u05/cludata/ocrfile" did not match the expected value. [Expected = "0640" ; Found = "0644"]

The fix is simply to change the permission to 640 using chmod.
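For example, as root (ocrfile is owned by root, per the listing above):
[root@test-server1 ~]# chmod 640 /u05/cludata/ocrfile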

4. Private interface eth1 not coming up post node reboot
Well, this one is not really specific to the Oracle upgrade, but I'm putting it across so that if someone faces the same issue, they can resolve it on their own without waiting on a sysadmin. As part of setting up hugepages, one of the hosts was rebooted during the upgrade, and post reboot, CRS was not coming up.

a) Checked /var/log/messages which confirmed CRS was not able to start

b) Checked ifconfig which showed the private interface eth1 is not up

c) Run "ifconfig -a|grep HW" to get the MAC address of the private interface
Example:
[root@test-server1 ~]# ifconfig -a|grep HW
eth0      Link encap:Ethernet  HWaddr XX:XX:XX:XX:XX:XX
eth0:1    Link encap:Ethernet  HWaddr XX:XX:XX:XX:XX:XX
eth1      Link encap:Ethernet  HWaddr XX:XX:XX:XX:XX:XY

d) Added the MAC address of eth1 to /etc/sysconfig/network-scripts/ifcfg-eth1
Example:
[root@test-server1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
ONBOOT=yes
BOOTPROTO=static
IPADDR=x.x.x.x
NETMASK=xxx.xxx.xxx.x
HWADDR=XX:XX:XX:XX:XX:XY

e) Brought up the interface as root:
[root@test-server1 ~]# ifup eth1

Once the private interconnect was up, CRS came up fine.

Hope this helps.

Rants & Raves

  • Mary Tuesday, November 23, 2010 at 3:45 PM

    I'm seeing messages like "CRS-2409:The clock on host is not synchronous with the mean cluster time" in the logs. Just curious to know if it's a cause for concern.

    • Ritzy Tuesday, November 23, 2010 at 4:15 PM
      CTSSD
      Hey Mary,

      Nice to see you back in action.

      CTSSD is the Cluster Time Synchronization Service daemon, which is reporting the time difference.
      However, CTSSD can't do much, as it runs in observer mode. You can verify that using "crsctl status resource ora.ctssd -init -t".

      NTPD is the Network Time Protocol daemon, which is really responsible for time management within the cluster. You may want to check with your sysadmins whether the hardware clock time is in sync across the cluster nodes. You can also check the status of NTP using "service ntpd status" and verify that "/etc/sysconfig/ntpd" has "-x" in OPTIONS.

      • Bhasker Sakkari Wednesday, June 01, 2011 at 8:31 PM
        Upgrade RAC with Dataguard
        Hi Ritesh, thanks for your post. We need some more info from you. We patched our RAC 11.2.0.1 database to 11.2.0.2 on a test server, but our production setup also has a RAC to single-instance Data Guard configuration. We are following Oracle's upgrade guide, but it doesn't say much about upgrading a RAC database with Data Guard present. Please advise with more detailed steps.
        Please share detailed steps to follow; we need it ASAP.
        Thank you for your time.

        Thanks & Regards
        Bhasker Sakkari.

        • Ritzy Saturday, June 04, 2011 at 9:14 PM

          The upgrade process is pretty much the same for RAC with Data Guard.

      Disclaimer:
      This posting is provided "AS IS" with no warranties, and confers no rights. You assume all risk for your use.

      This posting has nothing to do with my present or past employer.