VCS Cluster not starting.
Hi
I am facing a problem while trying to start VCS.
From the log:
==============================================================
tail /var/VRTSvcs/log/engine_A.log
2014/01/13 21:39:14 VCS NOTICE V-16-1-11050 VCS engine version=5.1
2014/01/13 21:39:14 VCS NOTICE V-16-1-11051 VCS engine join version=5.1.00.0
2014/01/13 21:39:14 VCS NOTICE V-16-1-11052 VCS engine pstamp=Veritas-5.1-10/06/09-14:37:00
2014/01/13 21:39:14 VCS INFO V-16-1-10196 Cluster logger started
2014/01/13 21:39:14 VCS NOTICE V-16-1-10114 Opening GAB library
2014/01/13 21:39:14 VCS NOTICE V-16-1-10619 ‘HAD’ starting on: nsscls01
2014/01/13 21:39:16 VCS INFO V-16-1-10125 GAB timeout set to 30000 ms
2014/01/13 21:39:16 VCS NOTICE V-16-1-11057 GAB registration monitoring timeout set to 200000 ms
2014/01/13 21:39:16 VCS NOTICE V-16-1-11059 GAB registration monitoring action set to log system message
2014/01/13 21:39:31 VCS CRITICAL V-16-1-11306 Did not receive cluster membership, manual intervention may be needed for seeding
=============================================================================================
root@nsscls01# hastatus -sum
VCS ERROR V-16-1-10600 Cannot connect to VCS engine
VCS WARNING V-16-1-11046 Local system not available
Please advise how I can start VCS.
Hello,
Clearly the issue is at the LLT layer. As you can see from the lltstat output you posted, node 1 says that LLT on node 2 is down, and node 2 says that LLT on node 1 is down. There are a couple of possibilities here:
1. The physical connection itself may have issues. You can use tools like dlpiping or lltping to check the status of the LLT links; these tools are helpful because LLT works at the MAC layer. Alternatively, you can plumb some IPs on both sides and test with ping, e.g. plumb 1.1.1.1 on nxge1 on node 1 and 1.1.1.2 on nxge1 on node 2, then ping between them to confirm connectivity.
Link for dlpiping:
http://sfdoccentral.symantec.com/sf/5.0MP3/aix/manpages/vcs/man1/dlpiping.html
2. If connectivity checks out, try starting all the components manually on node 1:
# /etc/init.d/llt start
# /etc/init.d/gab start
I have observed that sometimes LLT status is not shown correctly unless GAB is started correctly. Once these are started, check the "gabconfig -a" output again. If GAB starts and shows membership with the other node, you will need to start I/O fencing:
# /etc/init.d/vxfen start
After this, you should be able to run "hastart" to start VCS.
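To make the "gabconfig -a" check after each step a bit easier to read, here is a minimal sketch of a shell helper. It assumes the usual "Port <x> gen <id> membership <nodes>" line format and the standard port letters (a = GAB, b = I/O fencing, h = HAD). The function name gab_port_summary is hypothetical, not part of VCS, and on Solaris you may need nawk or /usr/xpg4/bin/awk instead of the default awk.

```shell
# Hypothetical helper (not shipped with VCS): reads `gabconfig -a` output
# on stdin and reports whether ports a (GAB), b (I/O fencing) and h (HAD)
# currently show a membership line.
gab_port_summary() {
    awk '
        # Assumed line format: Port a gen a36e0003 membership 01
        $1 == "Port" && $3 == "gen" && $5 == "membership" { seen[$2] = $6 }
        END {
            n = split("a b h", ports, " ")
            for (i = 1; i <= n; i++) {
                p = ports[i]
                if (p in seen)
                    printf "port %s: membership %s\n", p, seen[p]
                else
                    printf "port %s: no membership\n", p
            }
        }
    '
}
```

Run it as "gabconfig -a | gab_port_summary" after starting each component. Port a should appear once GAB is up and seeded, port b once fencing is started, and port h once hastart has brought HAD up.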
G