VCS Cluster not starting.
Hello All,
I am having difficulties trying to get VCS started on this system.
I have attached what I have got so far. I would appreciate any comments or suggestions on where to go from here.
Thank you
The hostnames in main.cf correspond to those of the servers.
hastatus -sum
VCS ERROR V-16-1-10600 Cannot connect to VCS engine
VCS WARNING V-16-1-11046 Local system not available
hasys -state
VCS ERROR V-16-1-10600 Cannot connect to VCS engine
hastop -all -force
VCS ERROR V-16-1-10600 Cannot connect to VCS engine
hastart / hastart -onenode
dmesg: Exiting: Another copy of VCS may be running
engine_A.log
2013/10/22 15:16:43 VCS NOTICE V-16-1-11051 VCS engine join version=4.1000
2013/10/22 15:16:43 VCS NOTICE V-16-1-11052 VCS engine pstamp=4.1 03/03/05-14:58:00
2013/10/22 15:16:43 VCS NOTICE V-16-1-10114 Opening GAB library
2013/10/22 15:16:43 VCS NOTICE V-16-1-10619 'HAD' starting on: db1
2013/10/22 15:16:45 VCS INFO V-16-1-10125 GAB timeout set to 15000 ms
2013/10/22 15:17:00 VCS CRITICAL V-16-1-11306 Did not receive cluster membership, manual intervention may be needed for seeding
#gabconfig -a
GAB Port Memberships
===============================================================
#lltstat -nvv
LLT node information:
    Node                 State    Link  Status  Address
   * 0 db1               OPEN
                                  bge1   UP      00:03:BA:15
                                  bge2   UP      00:03:BA:15
     1 db2               CONNWAIT
                                  bge1   DOWN
                                  bge2   DOWN
bash-2.05$ lltconfig
LLT is running
ps -ef | grep had
root 826 1 0 15:16:43 ? 0:00 /opt/VRTSvcs/bin/had
root 836 1 0 15:16:45 ? 0:00 /opt/VRTSvcs/bin/hashadow
If only one of the two nodes is reachable over LLT (see your lltstat -nvv output, where db1 is OPEN but db2 is in CONNWAIT with both links DOWN), had will start but GAB will not seed the cluster until both nodes are visible. That is why engine_A.log reports "Did not receive cluster membership" and gabconfig -a shows no port memberships.
This is done so that, in a heartbeat disconnection or split-brain scenario, you do not end up with two separate clusters starting.
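To spot this condition quickly, something like the following could be used. It is only a rough sketch: the function name llt_nodes_not_open is made up for illustration, and it assumes the node lines of lltstat -nvv look like the ones pasted above (an optional "*", a node ID, a node name, then the state).

```shell
#!/bin/sh
# Hypothetical helper: print "name state" for every named LLT node that is
# not OPEN. Reads `lltstat -nvv` output on stdin.
# Assumes node lines look like "* 0 db1 OPEN" or "1 db2 CONNWAIT".
llt_nodes_not_open() {
    awk '
        # Drop leading blanks and the "*" that marks the local node.
        { sub(/^[ \t]*\*?[ \t]*/, "") }
        # Node lines start with a numeric ID; requiring three fields skips
        # link lines (e.g. "bge1 DOWN") and node slots that have no name.
        $1 ~ /^[0-9]+$/ && NF >= 3 && $3 != "OPEN" { print $2, $3 }
    '
}

# Usage on a cluster node:
#   lltstat -nvv | llt_nodes_not_open
```

With the output posted above it would report db2 as CONNWAIT, which is the node whose heartbeat links need looking at.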
If this is a known condition, you can seed the cluster manually with the command
# gabconfig -c -x
This overrides the number of nodes needed to seed the cluster, so only run it if you are certain the other node does not already have a running cluster. You should also diagnose why the other node's heartbeat links are not visible to LLT.
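After a manual seed, gabconfig -a should start listing port a (GAB itself) and, once had has joined, port h. A small check along these lines could confirm that. Treat it as a sketch: gab_port_has_membership is a made-up name, and it assumes membership lines of the form "Port a gen a36e0003 membership 01", where the gen value and membership digits shown here are purely illustrative.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if `gabconfig -a` output on stdin lists a
# membership for the given port letter, exit 1 otherwise.
# Assumes lines of the form: Port <letter> gen <id> membership <nodes>
gab_port_has_membership() {
    port=$1
    awk -v p="$port" '
        $1 == "Port" && $2 == p && $5 == "membership" { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Usage on a cluster node:
#   gabconfig -a | gab_port_has_membership a && echo "GAB is seeded"
#   gabconfig -a | gab_port_has_membership h && echo "had has joined"
```

Against the empty "GAB Port Memberships" listing in the original post, both checks would fail, which matches the unseeded state described above.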