Installed and configured 11.1.0.6 clusterware, but once the server reboots, clusterware never comes up [message #664541] Fri, 21 July 2017 02:41
juniordbanewbie
Messages: 250
Registered: April 2014
Senior Member
Dear Sir/Madam,

I've installed and configured 11.1.0.6 clusterware on SUSE 10 SP3, but when I reboot the server, CRS never comes up.

So I decided to remove the installation, following How to Proceed From a Failed 10g or 11.1 Oracle Clusterware (CRS) Installation (Doc ID 239998.1), even though the installation itself had gone well.

After reinstalling and rebooting, crsd still does not come up.

It does not help that none of the logs show why clusterware fails to start:

oracle@suse103-11106-ee-rac1:~> ls -l /u01/app/11.1.0/crs/log/suse103-11106-ee-rac1/alertsuse103-11106-ee-rac1.log
-rw-rw-r-- 1 oracle oinstall 1693 2017-07-21 13:38 /u01/app/11.1.0/crs/log/suse103-11106-ee-rac1/alertsuse103-11106-ee-rac1.log
oracle@suse103-11106-ee-rac1:~> date
Fri Jul 21 15:31:41 SGT 2017

The latest log is:

/u01/app/11.1.0/crs/log/suse103-11106-ee-rac1/cssd/cssdOUT.log

s0clssscGetIPMIIP: No such file or directory
s0clssscGetIPMIIP: No IMPI IPaddr found
setsid: failed with -1/1
s0clssscSetUser: calling getpwnam_r for user oracle
s0clssscSetUser: info for user oracle complete
07/21/17 13:37:52: CSSD starting
07/21/17 15:05:43: CSSD handling signal 15
07/21/17 15:05:43: CSSD killed

from https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=343659692161193&id=240001.1&_afrWindowMode=0&_adf.ctrl-state=cbc0awel3_118

Troubleshooting 10g or 11.1 Oracle Clusterware Root.sh Problems (Doc ID 240001.1)

	


Note: This document is for 10g and 11.1, for 11.2 root.sh issues, see:
Note: 1053970.1 "Troubleshooting 11.2 Grid Infrastructure Installation Root.sh Issues"

Symptom(s)
~~~~~~~~~~

The CRS stack does not come up while running root.sh after installing CRS 
(Cluster Ready Services):

You may see the startup timing out or failing.  Example:

	Successfully accumulated necessary OCR keys.
	Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
	node :   
	node 1: opcbhp1 int-opcbhp3 opcbhp1
	node 2: opcbhp2 int-opcbhp4 opcbhp2
	Creating OCR keys for user 'root', privgrp 'sys'..
	Operation successful.
	Now formatting voting device: /dev/usupport_vg/rV10B_vote.dbf
	Successful in setting block0 for voting disk.
	Format complete.
	Adding daemons to inittab
	Preparing Oracle Cluster Ready Services (CRS):
	Expecting the CRS daemons to be up within 600 seconds.
	Failure at final check of Oracle CRS stack.

Or you may see one of the daemons core dump:

	Expecting the CRS daemons to be up within 600 seconds.
	4714 Abort - core dumped 

Or you may get another error.


Change(s)
~~~~~~~~~~

Installing CRS (Cluster Ready Services)


Cause
~~~~~~~

Usually a problem in the configuration.


Fix
~~~~

1. Check and make sure you have public and private node names defined and that
these node names are pingable from each node of the cluster.

2. Verify that the OCR file and Voting file are readable and writable by the
Oracle user and the root user.  The permissions that CRS uses for these files
are:

Pre Install:

	OCR    - root:oinstall - 640
	Voting - oracle:oinstall - 660



Post Install:

	OCR    - root:oinstall   - 640
	Voting - oracle:oinstall - 644


from http://docs.oracle.com/cd/B28359_01/install.111/b28263/storage.htm#CWLIN268

Quote:


# OCR disks
sda1:root:oinstall:0640
sdb2:root:oinstall:0640
# Voting disks
sda2:crs:oinstall:0640
sdb3:crs:oinstall:0640
sdc1:crs:oinstall:0640
Which udev rule is correct?

My udev rules are in the file /etc/udev/rules.d/51-oracle.permissions:

sdc1:root:oinstall:0640
sde1:oracle:oinstall:0644

Is the above correct? Do I need to configure anything else?

sdc1 is the OCR while sde1 is the voting disk.

Many thanks for helping to resolve how to start CRS after rebooting the server.

[Updated on: Fri, 21 July 2017 02:54]


Re: Installed and configured 11.1.0.6 clusterware, but once the server reboots, clusterware never comes up [message #664625 is a reply to message #664541] Tue, 25 July 2017 03:30
trantuananh24hg
Messages: 744
Registered: January 2007
Location: Ha Noi, Viet Nam
Senior Member
Please post the error information from the crsd and cssd logs here.
Also, is this a newly built Clusterware installation? Is there any active RAC database on it?
Re: Installed and configured 11.1.0.6 clusterware, but once the server reboots, clusterware never comes up [message #664648 is a reply to message #664625] Wed, 26 July 2017 06:18
juniordbanewbie
Messages: 250
Registered: April 2014
Senior Member
Content of /u01/app/11.1.0/crs/log/ora55-11106-ee-rac1/client/clsc1.log:

Quote:

Oracle Database 11g CRS Release 11.1.0.6.0 - Production Copyright 1996, 2007 Oracle. All rights reserved.
2017-07-25 15:00:56.756: [ OCROSD][1630988016]utopen:7:failed to open OCR file/disk /dev/sdc1 , errno=13, os err string=Permission denied
2017-07-25 15:00:56.756: [ OCRRAW][1630988016]proprinit: Could not open raw device
2017-07-25 15:00:56.756: [ default][1630988016]a_init:7!: Backend init unsuccessful : [26]
2017-07-25 15:00:58.397: [ COMMCRS][1630988016]clsc_connect: (0x1cb7be0) no listener at (ADDRESS=(PROTOCOL=IPC)(KEY=CRSD_UI_SOCKET))

2017-07-25 15:01:00.041: [ COMMCRS][1630988016]clsc_connect: (0x1b372b0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))



It is quite obvious that the oracle user cannot read the OCR, which is /dev/sdc1.

from https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=366363028638879&parent=SrDetailText&sourceId=3-15404501511&id=1310396.1&_afrWindowMode=0&_adf.ctrl-state=sasb303gn_79

CRS cannot startup after node reboot, with errors opening raw OCR/VD on EMC, eg. "utopen:7:Failed To Open Ocr File/Disk" (Doc ID 1310396.1)

Quote:


2. The oracle software owner cannot read from the EMC Luns for OCR/Vote disks :
$ dd if=/dev/rdsk/c25t0d4s1 of=/dev/null bs=1024 count=2000
=> THIS FAILS
But, the permissions are correct on the /dev/rdsk devices.

3. Root user CAN read from the EMC luns for OCR/Vote disks
# dd if=/dev/rdsk/c25t0d4s1 of=/dev/null bs=1024 count=2000
=> THIS SUCCEEDS

4. After root user has read from the disk with the above command, THEN the oracle software owner can read from it
# dd if=/dev/rdsk/c25t0d4s1 of=/dev/null bs=1024 count=2000
# exit
$ dd if=/dev/rdsk/c25t0d4s1 of=/dev/null bs=1024 count=2000
=> THIS SUCCEEDS!
Exactly the same symptoms.

from https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=366541218454171&id=1528148.1&displayIndex=10&_afrWindowMode=0&_adf.ctrl-state=sasb303gn_237#FIX

How To Setup Partitioned Linux Block Devices Using UDEV (Non-ASMLIB) And Assign Them To ASM? (Doc ID 1528148.1)

Quote:

/sbin/scsi_id -g -u -s %p
Unfortunately my scsi_id output is empty, and I do not know why.

Thanks a lot for any solution offered.
Re: Installed and configured 11.1.0.6 clusterware, but once the server reboots, clusterware never comes up [message #664666 is a reply to message #664648] Wed, 26 July 2017 20:03
John Watson
Messages: 8922
Registered: January 2010
Location: Global Village
Senior Member
I do not think that you will find any useful advice, here or anywhere else, on this problem. You are using release 11.1 of the clusterware. I did a lot of work with that - years ago. Grid Infrastructure was released with 11.2, I think 8 years ago. Seriously different architecture. No-one is using your old 11.1 clusterware any more. There can be sensible reasons for using old releases of the database, but not for using old releases of the clusterware.

11.1.0.6 is not even the terminal release, which was 11.1.0.7.
Re: Installed and configured 11.1.0.6 clusterware, but once the server reboots, clusterware never comes up [message #664695 is a reply to message #664541] Fri, 28 July 2017 11:29
scottyyu
Messages: 7
Registered: July 2017
Junior Member
I think I encountered this issue before. Is this a single-node RAC or a multi-node RAC?

scott
Re: Installed and configured 11.1.0.6 clusterware, but once the server reboots, clusterware never comes up [message #664711 is a reply to message #664695] Sun, 30 July 2017 08:27
juniordbanewbie
Messages: 250
Registered: April 2014
Senior Member
Single node.
Re: Installed and configured 11.1.0.6 clusterware, but once the server reboots, clusterware never comes up [message #664712 is a reply to message #664711] Sun, 30 July 2017 13:01
scottyyu
Messages: 7
Registered: July 2017
Junior Member
I had a similar problem a year ago.

The following is what Oracle Support asked me to collect. It turned out that after the Linux SA bounced the server, CRS came up.

See the following (and the wrapper sketch after the checklist):

Hi,

We need to verify the network and the connectivity.

1). Spool the output of the following from each node (FROM ALL NODES) to a separate txt file:
$CLUSTERWARE_HOME/bin/oifcfg iflist -p -n
$CLUSTERWARE_HOME/bin/oifcfg getif -global
$CLUSTERWARE_HOME/bin/gpnptool get
ifconfig -a
netstat -rn
# service iptables status
# chkconfig --list iptables

2). Execute the following commands to verify node connectivity:
./cluvfy comp nodecon -n <racnode1>,<racnode2>,<racnode3>
./cluvfy comp nodereach -n <racnode1>,<racnode2>,<racnode3> -verbose
./cluvfy stage -post crsinst -n <racnode1>,<racnode2>,<racnode3>
3). * I would suggest downloading the latest version of cluvfy from OTN
http://www.oracle.com/technetwork/products/clustering/downloads/
then select Cluster Verification Utility
* set the environment variables CV_HOME to point to the cvu home, CV_JDKHOME to point to the JDK home and an optional CV_DESTLOC pointing to a writeable area on all nodes (e.g
* cd $GRID_HOME/bin
* script /tmp/cluvfy.log ### run the following as oracle software owners id
* cluvfy stage -post crsinst -n all -verbose
* exit

4). Provide the /var/log/messages files from all nodes.

Re: Installed and configured 11.1.0.6 clusterware, but once the server reboots, clusterware never comes up [message #668124 is a reply to message #664712] Thu, 08 February 2018 02:43
juniordbanewbie
Messages: 250
Registered: April 2014
Senior Member
Dear Scott,

Thanks for your solution.

Basically I was quite lucky that I did not need to install the old clusterware on VirtualBox to test.