Archive

Posts Tagged ‘vcs’

veritas vcs 5.1 on solaris 5.10 changes of restarting procedure

July 26th, 2012

With VCS 5.1 on Solaris 10, starting and stopping VCS is no longer controlled by the /etc/rc*.d/S* scripts; the services are under SMF control.
In addition, the files under /etc/default (gab, llt, vcs, vxfen, etc.) contain start/stop variables that must be set to 1 if VCS was set up manually.
For example:

VCS_START=1
VCS_STOP=1

More interestingly, on a one-node VCS cluster the SMF service for VCS is not system/vcs:default but system/vcs-onenode:default.
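
A minimal sketch of checking those flags, assuming the flag file is /etc/default/vcs and that it uses plain KEY=1 lines as in the example above. The function names flag_is_set and check_vcs_defaults are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helpers, not part of VCS.

# Succeed if FILE contains a line starting with KEY=1.
# Usage: flag_is_set FILE KEY
flag_is_set() {
    grep "^$2=1" "$1" >/dev/null 2>&1
}

# Report the state of VCS_START and VCS_STOP in DIR/vcs.
# DIR defaults to /etc/default (an assumption based on the text above).
check_vcs_defaults() {
    dir=${1:-/etc/default}
    for key in VCS_START VCS_STOP; do
        if flag_is_set "$dir/vcs" "$key"; then
            echo "$key=1"
        else
            echo "$key not set to 1"
        fi
    done
}
```

On a live node, check_vcs_defaults with no argument reads /etc/default/vcs, and `svcs system/vcs:default` (or `svcs system/vcs-onenode:default` on a one-node cluster) shows the SMF state of the service.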

Categories: HA, HA & HPC, IT Architecture Tags:

vcs commands hang consistently

June 8th, 2012

Today we encountered an issue where Veritas VCS commands hung consistently. Commands such as haconf -dump -makero just sat there for so long that we had to terminate them from the console. Tracing system calls and signals with truss (on Solaris) or strace (on Linux) produced the following output:

test# truss haconf -dump -makero

execve("/opt/VRTSvcs/bin/haconf", 0xFFBEF21C, 0xFFBEF22C) argc = 3
resolvepath("/usr/lib/ld.so.1", "/usr/lib/ld.so.1", 1023) = 16
open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT

open("//.vcspwd", O_RDONLY) Err#2 ENOENT
getuid() = 0 [0]
getuid() = 0 [0]
so_socket(1, 2, 0, "", 1) = 4
fcntl(4, F_GETFD, 0x00000004) = 0
fcntl(4, F_SETFD, 0x00000001) = 0
connect(4, 0xFFBE7E1E, 110, 1) = 0
fstat64(4, 0xFFBE7AF8) = 0
getsockopt(4, 65535, 8192, 0xFFBE7BF8, 0xFFBE7BF4, 0) = 0
setsockopt(4, 65535, 8192, 0xFFBE7BF8, 4, 0) = 0
fcntl(4, F_SETFL, 0x00000084) = 0
brk(0x000F6F28) = 0
brk(0x000F8F28) = 0
poll(0xFFBE8A60, 1, 0) = 1
send(4, " G\0\0\0 $\0\0\t15\0\0\0".., 57, 0) = 57
poll(0xFFBE8AA0, 1, -1) = 1
poll(0xFFBE68B8, 0, 0) = 0
recv(4, " G\0\0\0 $\0\0\r02\0\0\0".., 8192, 0) = 55
poll(0xFFBE8B10, 1, 0) = 1
send(4, " G\0\0\0 $\0\0\f 1\0\0\0".., 58, 0) = 58
poll(0xFFBE8B50, 1, -1) = 1
poll(0xFFBE6968, 0, 0) = 0
recv(4, " G\0\0\0 $\0\0\r02\0\0\0".., 8192, 0) = 49
getpid() = 10386 [10385]
poll(0xFFBE99B8, 1, 0) = 1
send(4, " G\0\0\0 $\0\0\f A\0\0\0".., 130, 0) = 130
poll(0xFFBE99F8, 1, -1) = 1
poll(0xFFBE7810, 0, 0) = 0
recv(4, " G\0\0\0 $\0\0\r02\0\0\0".., 8192, 0) = 62
fstat64(4, 0xFFBE9BB0) = 0
getsockopt(4, 65535, 8192, 0xFFBE9CB0, 0xFFBE9CAC, 0) = 0
setsockopt(4, 65535, 8192, 0xFFBE9CB0, 4, 0) = 0
fcntl(4, F_SETFL, 0x00000084) = 0
getuid() = 0 [0]
door_info(3, 0xFFBE78C8) = 0
door_call(3, 0xFFBE78B0) = 0
open("//.vcspwd", O_RDONLY) Err#2 ENOENT
poll(0xFFBEE370, 1, 0) = 1
send(4, " G\0\0\0 $\0\0\t13\0\0\0".., 42, 0) = 42
poll(0xFFBEE3B0, 1, -1) (sleeping...)
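
Because a hung command otherwise has to be killed by hand, one option is to run it under a watchdog. The sketch below is only an illustration and not part of the original fix; run_with_timeout is a hypothetical helper that relies on nothing but standard shell job control:

```shell
#!/bin/sh
# Hypothetical helper: run a command, but kill it if it is still
# running after SECS seconds.
# Usage: run_with_timeout SECS command [args...]
run_with_timeout() {
    secs=$1
    shift
    # Start the real command in the background and remember its PID.
    "$@" &
    cmd_pid=$!
    # Watchdog: after SECS seconds, send SIGTERM to the command if it
    # is still around. Output is discarded so it never holds a pipe open.
    ( sleep "$secs"; kill "$cmd_pid" 2>/dev/null ) >/dev/null 2>&1 &
    watchdog_pid=$!
    # Wait for the command; a non-zero status also covers the
    # "killed by the watchdog" case.
    wait "$cmd_pid"
    status=$?
    # The command is done, so the watchdog is no longer needed.
    kill "$watchdog_pid" 2>/dev/null
    return "$status"
}
```

For example, `run_with_timeout 60 /opt/VRTSvcs/bin/haconf -dump -makero` would give up after 60 seconds and return a non-zero status instead of blocking the console.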

After some searching online, we found the following solution to this odd problem:

1. Stop VCS on all nodes in the cluster by manually killing both the had and hashadow processes on each node.
# ps -ef | grep had
root 27656 1 0 10:24:02 ? 0:00 /opt/VRTSvcs/bin/hashadow
root 27533 1 0 10:22:01 ? 0:02 /opt/VRTSvcs/bin/had -restart

# kill 27656 27533
GAB: Port h closed

2. Unconfigure GAB and LLT.
# gabconfig -U
GAB: Port a closed
GAB unavailable

# lltconfig -U
lltconfig: this will attempt to stop and reset LLT. Confirm (y/n)? y

3. Unload the GAB and LLT modules.
# modinfo | grep gab
100 60ea8000 38e9b 136 1 gab (GAB device)

# modunload -i 100
GAB unavailable

# modinfo | grep llt
84 60c6a000 fd74 137 1 llt (Low Latency Transport device)
# modunload -i 84
LLT Protocol unavailable

4. Restart LLT.
# /etc/rc2.d/S70llt start
Starting LLT
LLT Protocol available

5. Restart GAB.
# /etc/gabtab
GAB available
GAB: Port a registration waiting for seed port membership

6. Restart VCS:
# hastart -force
# VCS: starting on: <node_name>
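
Steps 1 and 3 above pick PIDs and module ids out of command output by eye. A small sketch of doing that lookup in the shell, assuming the `ps -ef` and `modinfo` column layouts shown above; vcs_pids and module_id are hypothetical helper names:

```shell
#!/bin/sh
# Hypothetical helpers; both read command output on stdin.
# On Solaris, use nawk or /usr/xpg4/bin/awk, since the old /usr/bin/awk
# does not support -v.

# Print the PID (column 2 of `ps -ef`) of every had or hashadow process.
# Matching on the full binary path avoids picking up the grep itself.
vcs_pids() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i == "/opt/VRTSvcs/bin/had" || $i == "/opt/VRTSvcs/bin/hashadow") {
                print $2
                break
            }
    }'
}

# Print the module id (column 1 of `modinfo`) for the named module.
# Column 6 holds the module name, e.g. gab or llt.
# Usage: modinfo | module_id gab
module_id() {
    awk -v name="$1" '$6 == name { print $1; exit }'
}
```

With these, step 1 becomes `kill` followed by the output of `ps -ef | vcs_pids`, and step 3 becomes ``modunload -i `modinfo | module_id gab` `` (and the same for llt).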

Categories: HA, HA & HPC, IT Architecture Tags:

impact of restart vxconfigd on solaris and linux – VxVM Configuration Daemon

May 30th, 2012

Stopping and restarting the VxVM configuration daemon, vxconfigd, may cause your VxVA, VMSA and/or VEA sessions to exit. It may also cause a momentary stoppage of any VxVM configuration actions. This should not harm any data; however, it may cause some configuration operations (e.g. moving subdisks, plex resynchronization) to abort unexpectedly. Any VxVM configuration changes should therefore be completed before restarting vxconfigd.

If you are using EMC PowerPath devices with Veritas Volume Manager, you must run the EMC command(s) 'powervxvm setup' (or 'safevxvm setup') and/or 'powervxvm online' (or 'safevxvm online') if the restart terminates abnormally. Also, if VCS service groups are running on the host, restarting vxconfigd may cause a failover to occur, so it is better to freeze the service groups first. For details, see: http://www.doxer.org/learn-linux/differences-between-freezing-vcs-system-and-freezing-service-group/

Categories: HA, HA & HPC Tags:

vcs service group and resource attributes dictionary page

May 22nd, 2012

Here are all the Veritas VCS service group and resource attributes with their explanations, as a crib sheet/cheatsheet (this is actually the content of the file /etc/VRTSvcs/conf/attributes/cluster_attrs.xml).


Categories: HA, HA & HPC Tags:

differences between freezing vcs system and freezing service group

May 16th, 2012

In Veritas VCS, freezing a system prevents service groups from coming online on that system when they fail over from another node in the cluster. It does not, however, prevent faults from failing a service group that is already online on the system.

To prevent VCS from intervening on faults caused by expected changes (even if the symptoms are unexpected), we usually freeze the service group instead. This prevents any online, clean or restart operation from kicking in when a fault is detected.

After making your changes to VCS, check that resources are not autodisabled and make sure the configuration is made read-only again.

Here are the steps to freeze service group(s) in VCS, where $i is the name of each service group to freeze:
/opt/VRTS/bin/haconf -makerw
mkdir /var/tmp/veritas_config_backup_`date +%F`
cp -R /etc/VRTSvcs /var/tmp/veritas_config_backup_`date +%F`
/opt/VRTS/bin/hagrp -freeze $i -persistent
/opt/VRTS/bin/haconf -dump -makero
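
The same steps can be wrapped in a loop over several service groups. This is a sketch under assumptions: freeze_groups, run and the DRY_RUN switch are hypothetical names added for illustration, and the group names are passed in by the caller. Only the commands already listed above are executed:

```shell
#!/bin/sh
# Hypothetical wrapper: with DRY_RUN=1 each command is printed
# instead of executed, so the sequence can be reviewed first.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Freeze every service group named on the command line.
# Usage: freeze_groups GROUP [GROUP...]
freeze_groups() {
    backup_dir=/var/tmp/veritas_config_backup_`date +%F`
    # Open the configuration for writing and back it up first.
    run /opt/VRTS/bin/haconf -makerw
    run mkdir "$backup_dir"
    run cp -R /etc/VRTSvcs "$backup_dir"
    # Persistently freeze each group.
    for i in "$@"; do
        run /opt/VRTS/bin/hagrp -freeze "$i" -persistent
    done
    # Write the configuration out and make it read-only again.
    run /opt/VRTS/bin/haconf -dump -makero
}
```

Running `DRY_RUN=1 freeze_groups grp1 grp2` first (grp1 and grp2 being placeholder group names) prints the command sequence without touching the cluster.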

Categories: HA, HA & HPC Tags: