After rebooting an Oracle RAC server, or in similar situations, Oracle Cluster sometimes cannot start at all ;)
That doesn't happen often, but what if it does?
This post just shows my approach to solving it. Once we know what is actually wrong, we can find many articles on the Internet about the fix ;)
When Oracle Cluster has not started, we will not find anything in ORA_CRS_HOME/log/HOSTNAME/* to help. So, look in the operating system logs (*.info) instead.
# /u01/oracle/product/crs/bin/crsctl check crs
Failure 1 contacting Cluster Synchronization Services daemon
Cannot communicate with Cluster Ready Services
Cannot communicate with Event Manager
# /u01/oracle/product/crs/bin/crs_stat
CRS-0184: Cannot communicate with the CRS daemon.
Start the investigation by checking the init processes ;)
# ps -aef | grep "init\."
root 4124 1 0 12:08 ? 00:00:00 /bin/sh /etc/init.d/init.evmd run
root 4125 1 0 12:08 ? 00:00:00 /bin/sh /etc/init.d/init.cssd fatal
root 4126 1 0 12:08 ? 00:00:00 /bin/sh /etc/init.d/init.crsd run
root 4710 4124 0 12:08 ? 00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
root 5031 4125 0 12:08 ? 00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
root 5289 4126 0 12:08 ? 00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
This shows that some processes are still checking whether the cluster can start (the "init.cssd startcheck" children).
If your CRS is disabled, you will find only:
# ps -aef | grep "init\."
root 4166 1 0 12:33 ? 00:00:00 /bin/sh /etc/init.d/init.evmd run
root 4167 1 0 12:33 ? 00:00:00 /bin/sh /etc/init.d/init.cssd fatal
root 4168 1 0 12:33 ? 00:00:00 /bin/sh /etc/init.d/init.crsd run
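The two listings above can be told apart with a small script. This is only a sketch: the function name crs_init_state and the three state labels are my own, not anything Oracle ships. It reads "ps -aef" output on stdin and looks for the same init.* command lines shown above.

```shell
#!/bin/sh
# Hypothetical helper: classify the cluster init state from `ps -aef` output on stdin.
# Labels (checking / wrappers-only / missing) are illustrative names only.
crs_init_state() {
    awk '
        # The three respawned wrapper scripts from /etc/inittab
        /\/etc\/init\.d\/init\.(evmd|cssd|crsd) (run|fatal)/ { wrappers++ }
        # Children that are still waiting for dependencies
        /\/etc\/init\.d\/init\.cssd startcheck/ { checks++ }
        END {
            if (checks > 0) {
                print "checking"
            } else if (wrappers > 0) {
                print "wrappers-only"
            } else {
                print "missing"
            }
        }'
}
```

Run it as "ps -aef | crs_init_state". "checking" matches the first listing, "wrappers-only" matches the second, and "missing" means no init.* scripts are running at all.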
Once you are sure the cluster has a problem, check the messages log (/var/log/messages on Linux):
Apr 3 12:11:59 oratest01 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.5031.
Apr 3 12:11:59 oratest01 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.5289.
After that, check the /tmp/crsctl.* files:
# cat /tmp/crsctl.5289
Oracle Cluster Registry initialization failed accessing Oracle Cluster Registry device: PROC-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]
If your CRS is disabled, you will not find anything in the /var/log/messages file.
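Going from the messages log to the diagnostics files can be scripted. A minimal sketch, assuming the log lines look exactly like the two shown above; the function names crs_diag_files and show_crs_diag are made up for this post.

```shell
#!/bin/sh
# Hypothetical helper: list the /tmp/crsctl.* files named in a messages log.
# $1: log file to scan (defaults to /var/log/messages)
crs_diag_files() {
    # Keep only the path after "Diagnostics in", dropping the trailing period.
    sed -n 's|.*Diagnostics in \(/tmp/crsctl\.[0-9][0-9]*\)\.*$|\1|p' \
        "${1:-/var/log/messages}" | sort -u
}

# Hypothetical helper: print every diagnostics file that is still readable.
show_crs_diag() {
    for f in $(crs_diag_files "$1"); do
        echo "== $f"
        if [ -r "$f" ]; then
            cat "$f"
        else
            echo "(not readable)"
        fi
    done
}
```

Run "show_crs_diag" as root with no argument to scan /var/log/messages, or pass another log file as the first argument.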
Example: a problem with the OCR file
# cat /etc/oracle/ocr.loc
ocrconfig_loc=/dev/raw/raw11
ocrmirrorconfig_loc=/dev/raw/raw12
local_only=FALSE
This shows that the OCR is on raw devices ;) So check the rawdevices service:
# /etc/init.d/rawdevices status
Nothing is shown, so start rawdevices:
# /etc/init.d/rawdevices start
Assigning devices:
.
.
.
/dev/raw/raw11 --> /dev/loop1
/dev/raw/raw11: bound to major 7, minor 1
/dev/raw/raw12 --> /dev/loop2
/dev/raw/raw12: bound to major 7, minor 2
.
.
.
# /etc/init.d/rawdevices status
.
.
.
/dev/raw/raw11: bound to major 7, minor 1
/dev/raw/raw12: bound to major 7, minor 2
.
.
.
That should resolve this case ;)
# /u01/oracle/product/crs/bin/crsctl check crs
Cluster Synchronization Services appears healthy
Cluster Ready Services appears healthy
Event Manager appears healthy
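The OCR check in this example can also be sketched as a script. It reads the two *_loc keys from ocr.loc and tests whether each path exists, which is the same condition behind the "No such file or directory" error above. The function names are my own, and the test only proves the device node exists, not that the raw binding is correct.

```shell
#!/bin/sh
# Hypothetical helper: print the OCR locations configured in ocr.loc.
# $1: path to ocr.loc (defaults to /etc/oracle/ocr.loc)
ocr_locations() {
    sed -n -e 's/^ocrconfig_loc=//p' \
           -e 's/^ocrmirrorconfig_loc=//p' "${1:-/etc/oracle/ocr.loc}"
}

# Hypothetical helper: report OK/MISSING per location; returns 1 if any is missing.
check_ocr_devices() {
    rc=0
    for dev in $(ocr_locations "$1"); do
        if [ -e "$dev" ]; then
            echo "OK      $dev"
        else
            echo "MISSING $dev"
            rc=1
        fi
    done
    return $rc
}
```

If "check_ocr_devices" reports MISSING for a /dev/raw/* path, the rawdevices service is the first thing to look at, as in this example.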
>>> What does this mean? I am trying to show what to do, and how to think about it, when Oracle Cluster cannot start:
- Check the init processes
$ ps -aef | grep "init\."
.
.
.
If you don't find any cluster processes, make sure the cluster scripts are in the /etc/inittab file:
>>>
h1:35:respawn:/etc/init.d/init.evmd run >/dev/null 2>&1
h2:35:respawn:/etc/init.d/init.cssd fatal >/dev/null 2>&1
h3:35:respawn:/etc/init.d/init.crsd run >/dev/null 2>&1
>>>
- Check for errors in the messages log (/var/log/messages) and in /tmp/crsctl.*
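The inittab part of this checklist is easy to automate. A small sketch, assuming the entries look like the h1/h2/h3 lines above; missing_inittab_entries is a name I made up.

```shell
#!/bin/sh
# Hypothetical helper: print each cluster init script that has no active
# (uncommented) respawn entry in inittab.
# $1: inittab path (defaults to /etc/inittab)
missing_inittab_entries() {
    for d in evmd cssd crsd; do
        grep -q "^[^#].*:respawn:/etc/init\.d/init\.$d " "${1:-/etc/inittab}" \
            || echo "init.$d"
    done
}
```

No output from "missing_inittab_entries" means all three entries are present. Any name printed is a script that init will never start.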
Another case ;) to illustrate the idea:
# /u01/oracle/product/crs/bin/crsctl check crs
Failure 1 contacting Cluster Synchronization Services daemon
Cannot communicate with Cluster Ready Services
Cannot communicate with Event Manager
# /u01/oracle/product/crs/bin/crs_stat
CRS-0184: Cannot communicate with the CRS daemon.
# ps -aef | grep "init\."
root 4166 1 0 12:33 ? 00:00:00 /bin/sh /etc/init.d/init.evmd run
root 4167 1 0 12:33 ? 00:00:00 /bin/sh /etc/init.d/init.cssd fatal
root 4168 1 0 12:33 ? 00:00:00 /bin/sh /etc/init.d/init.crsd run
root 9579 4166 0 12:46 ? 00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
root 10566 4167 0 12:46 ? 00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
root 10585 4168 0 12:46 ? 00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
Oracle Cluster was checking whether it could start. So check, check, and check again! Finally, check the messages log:
Apr 3 12:47:32 oratest01 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.10566.
Apr 3 12:47:32 oratest01 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.10585.
We found "logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.10585", and then:
# cat /tmp/crsctl.10585
Failed 3 to bind listening endpoint: (ADDRESS=(PROTOCOL=tcp)(HOST=oratest01-priv))
That means there was a problem with "oratest01-priv" (the interconnect): the hostname or ..., so check it and resolve it ;)
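One quick check for this case: pull the HOST out of the diagnostics file and see whether that name is in the hosts file. This is a sketch under an assumption, namely that the private interconnect name is defined in /etc/hosts (common on RAC nodes, but your site may use DNS instead). The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract the HOST=... value from a "Failed ... to bind
# listening endpoint" line. Reads the given files, or stdin if none.
endpoint_host() {
    sed -n 's/.*(HOST=\([^)]*\)).*/\1/p' "$@"
}

# Hypothetical helper: print the address mapped to a name in a hosts file.
# $1: hostname, $2: hosts file (defaults to /etc/hosts). Returns 1 if not found.
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }    # ignore comments
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; found = 1 } }
        END { exit (found ? 0 : 1) }
    ' "${2:-/etc/hosts}"
}
```

For example, "hosts_lookup $(endpoint_host /tmp/crsctl.10585)" prints the address if the name is defined. If it prints nothing and returns 1, fix the hosts entry first. If the name is there, go on to check that the interface really has that address.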
It's a good habit for a DBA to check for errors in the messages log whenever the cluster cannot start or a node has rebooted.
A DBA should know the operating system too. That knowledge really helps ;)