
Configure Oracle Grid Infrastructure for a Cluster … failed

November 2, 2013

Central inventory not updated after installing Grid Infrastructure, because eons fails to start as part of root.sh

root.sh on node1 failed while starting nodeapps
===============================================================

CRS-2672: Attempting to start ‘ora.asm’ on ‘mgracsolsrv64bit1’
CRS-2676: Start of ‘ora.asm’ on ‘mgracsolsrv64bit1’ succeeded
CRS-2672: Attempting to start ‘ora.OCRVOTE.dg’ on ‘mgracsolsrv64bit1’
CRS-2676: Start of ‘ora.OCRVOTE.dg’ on ‘mgracsolsrv64bit1’ succeeded
/u01/app/11.2.0.1/grid/bin/srvctl start nodeapps -n mgracsolsrv64bit1 … failed
Configure Oracle Grid Infrastructure for a Cluster … failed
mgracsolsrv64bit1:[root]$

******* Further findings: the logs show that resource ora.eons failed to start, due to Bug 9879177 *******

cat /u01/app/11.2.0.1/grid/cfgtoollogs/crsconfig/rootcrs_mgracsolsrv64bit1.log
=====================================================================================
2013-10-30 12:12:25: Running as user grid: /u01/app/11.2.0.1/grid/bin/srvctl add scan_listener -p 1521
2013-10-30 12:12:25:   Invoking “/u01/app/11.2.0.1/grid/bin/srvctl add scan_listener -p 1521” as user “grid”
2013-10-30 12:12:33: add scan listener … success
2013-10-30 12:12:34: Running as user grid: /u01/app/11.2.0.1/grid/bin/srvctl add oc4j
2013-10-30 12:12:34:   Invoking “/u01/app/11.2.0.1/grid/bin/srvctl add oc4j” as user “grid”
2013-10-30 12:12:40: J2EE (OC4J) Container Resource Add … passed …
2013-10-30 12:12:40: starting nodeapps…                                <**** nodeapps start begins here ****>
2013-10-30 12:12:40: DHCP_flag=0
2013-10-30 12:12:40: nodes_to_start=mgracsolsrv64bit1
2013-10-30 12:14:32: exit value of start nodeapps/vip is 2
2013-10-30 12:14:33: output for start nodeapps is  PRCR-1013 : Failed to start resource ora.eons PRCR-1064 : Failed to start resource ora.eons on node mgracsolsrv64bit1 CRS-5016: Process “/u01/app/11.2.0.1/grid/bin/scrctl” spawned by agent “/u01/app/11.2.0.1/grid/bin/oraagent.bin” for action “start” failed: details at “(:CLSN00010:)” in “/u01/app/11.2.0.1/grid/log/mgracsolsrv64bit1/agent/crsd/oraagent_grid/oraagent_grid.log” CRS-2674: Start of ‘ora.eons’ on ‘mgracsolsrv64bit1’ failed
2013-10-30 12:14:33: output of startnodeapp after removing already started mesgs is PRCR-1013 : Failed to start resource ora.eons PRCR-1064 : Failed to start resource ora.eons on node mgracsolsrv64bit1 CRS-5016: Process “/u01/app/11.2.0.1/grid/bin/scrctl” spawned by agent “/u01/app/11.2.0.1/grid/bin/oraagent.bin” for action “start” failed: details at “(:CLSN00010:)” in “/u01/app/11.2.0.1/grid/log/mgracsolsrv64bit1/agent/crsd/oraagent_grid/oraagent_grid.log” CRS-2674: Start of ‘ora.eons’ on ‘mgracsolsrv64bit1’ failed
2013-10-30 12:14:33: /u01/app/11.2.0.1/grid/bin/srvctl start nodeapps -n mgracsolsrv64bit1 … failed
2013-10-30 12:14:35: Configure Oracle Grid Infrastructure for a Cluster … failed
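
When the rootcrs log is long, it can help to pull out just the failed steps and the Oracle message codes before reading the details. The following is only a rough sketch of that idea: the function name `summarize_rootcrs_log` is my own, and it assumes that failed steps end with the word "failed" and that the codes of interest start with PRCR- or CRS-, as in the excerpt above.

```python
import re

# Matches Oracle message codes such as PRCR-1013 or CRS-2674 (assumed prefixes).
ERROR_CODE = re.compile(r"\b(?:PRCR|CRS)-\d{4,5}\b")


def summarize_rootcrs_log(text):
    """Return (failed_steps, error_codes) found in rootcrs log text.

    failed_steps: lines whose last word is "failed".
    error_codes:  distinct PRCR-/CRS- codes in order of first appearance.
    """
    failed_steps = []
    codes = []
    for line in text.splitlines():
        stripped = line.strip()
        # Collect every line that ends with the word "failed".
        if stripped.endswith("failed"):
            failed_steps.append(stripped)
        # Record each message code once, keeping first-seen order.
        for code in ERROR_CODE.findall(line):
            if code not in codes:
                codes.append(code)
    return failed_steps, codes
```

Feeding it the contents of rootcrs_<hostname>.log gives a short list to start from; the agent log named in the CRS-5016 message is still where the real detail is.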

cat /u01/app/11.2.0.1/grid/log/mgracsolsrv64bit1/agent/crsd/oraagent_grid/oraagent_grid.l01
cat /u01/app/11.2.0.1/grid/log/mgracsolsrv64bit1/agent/crsd/oraagent_grid/oraagent_grid.log
———————————————————————————————————————————————————–
2013-10-30 12:14:06.100: [    AGFW][7] Command: start for resource: ora.eons mgracsolsrv64bit1 1 completed with status: SUCCESS
2013-10-30 12:14:06.101: [    AGFW][9] Agent sending reply for: RESOURCE_START[ora.eons mgracsolsrv64bit1 1] ID 4098:1218
2013-10-30 12:14:06.104: [    AGFW][20] Executing command: check for resource: ora.eons mgracsolsrv64bit1 1
2013-10-30 12:14:06.512: [ora.eons][20] [check] Utils::getCrsHome crsHome /u01/app/11.2.0.1/grid
2013-10-30 12:14:06.585: [ora.eons][20] [check] Failed to read eONS daemon check port number
2013-10-30 12:14:06.618: [ora.eons][20] [check] Filling Environment Map


2013-10-30 12:14:23.985: [ USRTHRD][27] Thread:[EonsSub EONS] run: waiting on connecting to event server: ons_subscriber_status() returns WAITING
2013-10-30 12:14:24.068: [ USRTHRD][23] Thread:[EonsSub ONS] run: waiting on connecting to event server: ons_subscriber_status() returns WAITING
2013-10-30 12:14:25.019: [ USRTHRD][27] Thread:[EonsSub EONS] run: waiting on connecting to event server: ons_subscriber_status() returns WAITING
2013-10-30 12:14:25.020: [ora.eons][20] [check] execCmd ret = 1
2013-10-30 12:14:25.045: [    AGFW][20] check for resource: ora.eons mgracsolsrv64bit1 1 completed with status: OFFLINE
2013-10-30 12:14:25.045: [    AGFW][9] ora.eons mgracsolsrv64bit1 1 state changed from: STARTING to: OFFLINE
2013-10-30 12:14:25.046: [    AGFW][9] Agent sending last reply for: RESOURCE_START[ora.eons mgracsolsrv64bit1 1] ID 4098:1218
2013-10-30 12:14:25.068: [ USRTHRD][23] Thread:[EonsSub ONS] run: waiting on connecting to event server: ons_subscriber_status() returns WAITING
2013-10-30 12:14:26.022: [ USRTHRD][27] Thread:[EonsSub EONS] run: waiting on connecting to event server: ons_subscriber_status() returns WAITING
2013-10-30 12:14:26.071: [ USRTHRD][23] Thread:[EonsSub ONS] run: waiting on connecting to event server: ons_subscriber_status() returns WAITING
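
The key line in the agent log above is the AGFW state change from STARTING to OFFLINE. A small sketch for pulling such transitions out of an oraagent log is below. The regular expression is written against the line format seen in this excerpt only, so treat the pattern and the helper name `state_changes` as assumptions rather than a documented log format.

```python
import re

# Pattern derived from the AGFW "state changed" line shown in this post.
STATE_CHANGE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*AGFW\]\[\d+\] "
    r"(?P<resource>\S+) (?P<node>\S+) \d+ state changed from: "
    r"(?P<old>\w+) to: (?P<new>\w+)"
)


def state_changes(text):
    """Return (timestamp, resource, old_state, new_state) tuples in log order."""
    changes = []
    for line in text.splitlines():
        match = STATE_CHANGE.match(line.strip())
        if match:
            changes.append(
                (match["ts"], match["resource"], match["old"], match["new"])
            )
    return changes
```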

cat /u01/app/11.2.0.1/grid/log/mgracsolsrv64bit1/crsd/crsd.log
==========================================================================================
2013-10-30 12:14:05.777: [    AGFW][62] Received the reply to the message: RESOURCE_START[ora.eons mgracsolsrv64bit1 1] ID 4098:1218 from the agent /u01/app/11.2.0.1/grid/bin/oraagent_grid
2013-10-30 12:14:05.987: [    AGFW][62] Agfw Proxy Server sending the reply to PE for message:RESOURCE_START[ora.eons mgracsolsrv64bit1 1] ID 4098:1213
2013-10-30 12:14:06.103: [    AGFW][62] Received the reply to the message: RESOURCE_START[ora.eons mgracsolsrv64bit1 1] ID 4098:1218 from the agent /u01/app/11.2.0.1/grid/bin/oraagent_grid
2013-10-30 12:14:06.123: [    AGFW][62] Agfw Proxy Server sending the reply to PE for message:RESOURCE_START[ora.eons mgracsolsrv64bit1 1] ID 4098:1213
2013-10-30 12:14:06.183: [   CRSPE][67] Received reply to action [Start] message ID: 1213
2013-10-30 12:14:06.183: [   CRSPE][67] Got agent-specific msg: CRS-5016: Process “/u01/app/11.2.0.1/grid/bin/scrctl” spawned by agent “/u01/app/11.2.0.1/grid/bin/oraagent.bin” for action “start” failed: details at “(:CLSN00010:)” in “/u01/app/11.2.0.1/grid/log/mgracsolsrv64bit1/agent/crsd/oraagent_grid/oraagent_grid.log”

2013-10-30 12:14:06.487: [   CRSPE][67] Received reply to action [Start] message ID: 1213
2013-10-30 12:14:06.487: [UiServer][69] Container [ Name: ORDER
MESSAGE:
TextMessage[CRS-5016: Process “/u01/app/11.2.0.1/grid/bin/scrctl” spawned by agent “/u01/app/11.2.0.1/grid/bin/oraagent.bin” for action “start” failed: details at “(:CLSN00010:)” in “/u01/app/11.2.0.1/grid/log/mgracsolsrv64bit1/agent/crsd/oraagent_grid/oraagent_grid.log”]
MSGTYPE:
TextMessage[1]
OBJID:
TextMessage[ora.eons]
WAIT:
TextMessage[0]
]
2013-10-30 12:14:19.998: [UiServer][71] S(4c792c0): set Properties ( grid,4c5e260)
2013-10-30 12:14:20.009: [UiServer][69] processMessage called
2013-10-30 12:14:20.009: [UiServer][69] Sending message to PE. ctx= 4c3f970

….
….

2013-10-30 12:14:25.094: [    AGFW][62] Agfw Proxy Server sending the last reply to PE for message:RESOURCE_START[ora.eons mgracsolsrv64bit1 1] ID 4098:1213
2013-10-30 12:14:25.094: [   CRSPE][67] Received reply to action [Start] message ID: 1213
2013-10-30 12:14:25.222: [   CRSPE][67] CRS-2674: Start of ‘ora.eons’ on ‘mgracsolsrv64bit1’ failed

2013-10-30 12:14:25.237: [UiServer][69] Container [ Name: ORDER
MESSAGE:
        TextMessage[CRS-2674: Start of ‘ora.eons’ on ‘mgracsolsrv64bit1’ failed]
MSGTYPE:
TextMessage[1]
OBJID:
TextMessage[ora.eons]
WAIT:
TextMessage[0]
]
2013-10-30 12:14:25.292: [   CRSPE][67] Sequencer for [ora.eons mgracsolsrv64bit1 1] has completed with error: CRS-0215: Could not start resource ‘ora.eons’.

2013-10-30 12:14:25.398: [   CRSPE][67] PE Command [ Start Resource : 4b268b0 ] has completed
..

Issue:
======================================================
Start of resource “ora.eons” failed due to a known bug (Bug 9879177, 9765069). Because not all resources came online, the “srvctl start nodeapps” call made by root.sh failed.

Then the Oracle installer fails while running Cluvfy
======================================================
INFO: Started Plugin named: Oracle Cluster Verification Utility
INFO: Found associated job
INFO: Starting ‘Oracle Cluster Verification Utility’
INFO: Starting ‘Oracle Cluster Verification Utility’
INFO: Performing post-checks for cluster services setup
INFO: Checking node reachability…
INFO: Node reachability check passed from node “mgracsolsrv64bit1”
INFO: Checking user equivalence…
INFO: User equivalence check passed for user “grid”
WARNING:
INFO: ERROR:
INFO: CRS is not installed on any of the nodes
INFO: Verification cannot proceed
INFO: Post-check for cluster services setup was unsuccessful on all the nodes.
INFO:
INFO: Completed Plugin named: Oracle Cluster Verification Utility
INFO: Oracle Cluster Verification Utility failed.

A manual Cluvfy check fails as below
================================================
./runcluvfy.sh stage -post crsinst -n mgracsolsrv64bit1,mgracsolsrv64bit2 -verbose | tee /export/home/grid/cvulogs/node1.post-crs-check.txt


ERROR:
CRS is not installed on any of the nodes
Verification cannot proceed

Findings:
1) Cluvfy reports “CRS is not installed” because the inventory was not updated by root.sh, which failed to start nodeapps due to the bug (Bug 9879177, 9765069).
2) While configuring GI, start nodeapps failed and root.sh exited without performing the post-GI-configuration activities, such as marking the home as CRS in the inventory and running UpdateNodeList.

It should have gone on to the steps below, which were not run after the GI config failure
=============================================================================================
The steps below are from a successful run of an earlier setup
——————————————————————————-
Configure Oracle Grid Infrastructure for a Cluster … succeeded
Updating inventory properties for clusterware
Starting Oracle Universal Installer…

Checking swap space: must be greater than 500 MB.   Actual 19128 MB    Passed
The inventory pointer is located at /var/opt/oracle/oraInst.loc
The inventory is located at /u01/app/oraInventory
‘UpdateNodeList’ was successful.

The inventory.xml was not updated correctly and is missing the CRS="true" parameter.
===========================================================================================

mgracsolsrv64bit1:/u01/app/oraInventory/ContentsXML: more inventory.xml
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2009, Oracle. All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
<SAVED_WITH>11.2.0.1.0</SAVED_WITH>
<MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11.2.0.1/grid" TYPE="O" IDX="1">    ******* Here is the issue: the CRS="true" parameter is missing *********
<NODE_LIST>
<NODE NAME="mgracsolsrv64bit1"/>
<NODE NAME="mgracsolsrv64bit2"/>
</NODE_LIST>
</HOME>
</HOME_LIST>
</INVENTORY>
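
Rather than eyeballing the file, the missing flag can also be checked with a few lines of Python. This is a minimal sketch using the standard library XML parser. The helper name `find_crs_home` is hypothetical, and it assumes the file is the real inventory.xml without any hand-written annotations in it. How Cluvfy itself locates the CRS home is not shown in this post, so this only mirrors the symptom.

```python
import xml.etree.ElementTree as ET


def find_crs_home(inventory_xml):
    """Return the LOC of the first HOME flagged CRS="true", or None if there is none."""
    root = ET.fromstring(inventory_xml)
    for home in root.iter("HOME"):
        # The attribute is simply absent when the inventory was never updated.
        if home.get("CRS", "").lower() == "true":
            return home.get("LOC")
    return None
```

For example, `find_crs_home(open("/u01/app/oraInventory/ContentsXML/inventory.xml").read())` would return None for the inventory in the state shown above, since no home carries the CRS attribute.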

Do not modify inventory.xml manually; instead run runInstaller with -updateNodeList as below
===============================================================================================
mgracsolsrv64bit1:/u01/app/oraInventory/ContentsXML: $ORACLE_HOME/oui/bin/runInstaller -silent -ignoreSysPrereqs -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES={mgracsolsrv64bit1,mgracsolsrv64bit2}" CRS=true
Starting Oracle Universal Installer…

Checking swap space: must be greater than 500 MB.   Actual 5803 MB    Passed
The inventory pointer is located at /var/opt/oracle/oraInst.loc
The inventory is located at /u01/app/oraInventory
SEVERE:Remote ‘UpdateNodeList’ failed on nodes: ‘mgracsolsrv64bit2‘. Refer to ‘/u01/app/oraInventory/logs/UpdateNodeList2013-10-30_02-03-27PM.log’ for details.
You can manually re-run the following command on the failed nodes after the installation:
/u01/app/11.2.0.1/grid/oui/bin/runInstaller -updateNodeList -noClusterEnabled ORACLE_HOME=/u01/app/11.2.0.1/grid CLUSTER_NODES=mgracsolsrv64bit1,mgracsolsrv64bit2 CRS=true  “INVENTORY_LOCATION=/u01/app/oraInventory” LOCAL_NODE=<node on which command is to be run>.
Please refer ‘UpdateNodeList’ logs under central inventory of remote nodes where failure occurred for more details.
mgracsolsrv64bit1:/u01/app/oraInventory/ContentsXML:

Next, checked the central inventory on node2
=============================================================

mgracsolsrv64bit2:/export/home/grid: cd /u01/app/oraInventory/ContentsXML ******** (directory does not exist, so the update above failed for node2) ********
ksh: /u01/app/oraInventory/ContentsXML:  not found

mgracsolsrv64bit2:/export/home/grid: cd /u01/app/oraInventory/

mgracsolsrv64bit2:/u01/app/oraInventory: ls -ltrh
total 6
drwxrwx---   2 grid     oinstall     512 Oct 30 11:46 logs
-rwxrwx---   1 grid     oinstall    1.6K Oct 30 12:20 orainstRoot.sh

******* Only 2 entries are seen above ******

As seen above, the central inventory is missing on node2
============================================================
I copied the central inventory from node1.
——————————————————————-
mgracsolsrv64bit1:/u01/app/oraInventory: ls -ltrh
total 16
-rw-rw----   1 grid     oinstall     293 Oct 30 10:52 oraInstaller.properties
drwxrwx---   2 grid     oinstall     512 Oct 30 10:52 oui
drwxrwx---   2 grid     oinstall     512 Oct 30 11:33 ContentsXML
-rw-rw----   1 grid     oinstall      38 Oct 30 11:33 install.platform
-rwxrwx---   1 grid     oinstall    1.6K Oct 30 11:46 orainstRoot.sh
-rw-rw----   1 grid     oinstall      56 Oct 30 11:46 oraInst.loc
drwxrwx---   2 grid     oinstall     512 Oct 30 13:42 logs

mgracsolsrv64bit1:/u01/app/oraInventory: cd /u01/app/oraInventory/ContentsXML
mgracsolsrv64bit1:/u01/app/oraInventory/ContentsXML: ls -ltrh
total 6
-rw-rw----   1 grid     oinstall     521 Oct 30 11:33 inventory.xml
-rw-rw----   1 grid     oinstall     270 Oct 30 11:33 comps.xml
-rw-rw----   1 grid     oinstall     270 Oct 30 11:33 libs.xml
mgracsolsrv64bit1:/u01/app/oraInventory/ContentsXML:

mgracsolsrv64bit1:/u01/app/oraInventory/ContentsXML: cd /u01/app/oraInventory/
tar -cvf inv.tar *
scp inv.tar mgracsolsrv64bit2:/u01/app/oraInventory/

Untar on node2:
————————————————-
mgracsolsrv64bit2:/u01/app/oraInventory: tar -xvf inv.tar
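
After the copy, it is worth confirming that node2 now has the same ContentsXML files that node1 had. A quick sketch of such a check follows. The expected file list is taken from the node1 listing in this post and is an assumption about what a usable central inventory needs, not an official list; the name `missing_inventory_files` is my own.

```python
from pathlib import Path

# Files seen under ContentsXML on node1 in this post (assumed to be the required set).
EXPECTED_FILES = (
    "ContentsXML/inventory.xml",
    "ContentsXML/comps.xml",
    "ContentsXML/libs.xml",
)


def missing_inventory_files(inventory_dir):
    """Return the expected inventory files that are absent under inventory_dir."""
    base = Path(inventory_dir)
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]
```

Running `missing_inventory_files("/u01/app/oraInventory")` on node2 before the copy would have listed all three files; an empty list afterwards means the directory is at least structurally in place.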

As suggested by the earlier runInstaller error, I executed updateNodeList on node2 manually
=======================================================================
mgracsolsrv64bit2:/u01/app/oraInventory/ContentsXML: /u01/app/11.2.0.1/grid/oui/bin/runInstaller -updateNodeList -noClusterEnabled ORACLE_HOME=/u01/app/11.2.0.1/grid CLUSTER_NODES=mgracsolsrv64bit1,mgracsolsrv64bit2 CRS=true  "INVENTORY_LOCATION=/u01/app/oraInventory" LOCAL_NODE=mgracsolsrv64bit2
Starting Oracle Universal Installer…

Checking swap space: must be greater than 500 MB.   Actual 5717 MB    Passed
The inventory pointer is located at /var/opt/oracle/oraInst.loc
The inventory is located at /u01/app/oraInventory
‘UpdateNodeList’ was successful.
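
If the same fix has to be repeated on several nodes, the local command can be assembled per node instead of being retyped. The sketch below only rebuilds the argument list that the installer itself suggested in its error message above; it does not add any options. The function name `update_node_list_command` and the check that the local node is part of the cluster list are my own additions.

```python
def update_node_list_command(oracle_home, nodes, local_node, inventory_location):
    """Build the local-only runInstaller -updateNodeList argument list for one node."""
    if local_node not in nodes:
        raise ValueError(f"{local_node} is not in the cluster node list")
    return [
        f"{oracle_home}/oui/bin/runInstaller",
        "-updateNodeList",
        "-noClusterEnabled",  # update only the local inventory, as in the installer's hint
        f"ORACLE_HOME={oracle_home}",
        f"CLUSTER_NODES={','.join(nodes)}",
        "CRS=true",
        f"INVENTORY_LOCATION={inventory_location}",
        f"LOCAL_NODE={local_node}",
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` when logged in on that node as the Grid software owner.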

Now check inventory.xml on node1; it looks good.
==========================================================================
mgracsolsrv64bit1:/u01/app/oraInventory/ContentsXML: more inventory.xml
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2009, Oracle. All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
<SAVED_WITH>11.2.0.1.0</SAVED_WITH>
<MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11.2.0.1/grid" TYPE="O" IDX="1" CRS="true">    *******> Now we can see the parameter CRS="true" <**********
<NODE_LIST>
<NODE NAME="mgracsolsrv64bit1"/>
<NODE NAME="mgracsolsrv64bit2"/>
</NODE_LIST>
</HOME>
</HOME_LIST>
</INVENTORY>

Now check inventory.xml on node2; it looks good as well.
==========================================================================
mgracsolsrv64bit2:[root]$ more inventory.xml
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2009, Oracle. All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
<SAVED_WITH>11.2.0.1.0</SAVED_WITH>
<MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11.2.0.1/grid" TYPE="O" IDX="1" CRS="true">
<NODE_LIST>
<NODE NAME="mgracsolsrv64bit1"/>
<NODE NAME="mgracsolsrv64bit2"/>
</NODE_LIST>
</HOME>
</HOME_LIST>
</INVENTORY>

Now Cluvfy passes for CRS:
================================================================

./runcluvfy.sh stage -post crsinst -n mgracsolsrv64bit1,mgracsolsrv64bit2 -verbose | tee /export/home/grid/cvulogs/node1.post-crs-check.txt

Checking Cluster manager integrity…

Checking CSS daemon…                                       (earlier cluvfy failed at this step)

Node Name                             Status
————————————  ————————
mgracsolsrv64bit2                     running
mgracsolsrv64bit1                     running

Oracle Cluster Synchronization Services appear to be online.

Cluster manager integrity check passed
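
For repeated checks, the node status table in the saved Cluvfy output can be turned into a dictionary and tested in a script. This is a rough sketch that assumes the two-column "Node Name / Status" layout shown above; other Cluvfy tables have more columns and would need a different parser. The helper name `parse_node_status` is hypothetical.

```python
def parse_node_status(output):
    """Map node name to status from the first two-column 'Node Name  Status' table."""
    statuses = {}
    in_table = False
    for line in output.splitlines():
        fields = line.split()
        if line.strip().startswith("Node Name"):
            in_table = True
            continue
        if not in_table:
            continue
        if not fields:
            # A blank line after at least one row ends the table.
            if statuses:
                in_table = False
            continue
        # Skip the separator row; keep rows of the form "<node> <status>".
        if len(fields) == 2 and fields[0][0].isalnum():
            statuses[fields[0]] = fields[1]
    return statuses
```

With the output saved by the tee command above, `all(s == "running" for s in parse_node_status(text).values())` gives a simple pass/fail for the CSS daemon check.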

********************************************************************************************************************************


From → Oracle, RAC
