Archive for May, 2012

differences between Server Connection Time, Server Response Time, Page Load Time and Page Download Time

May 31st, 2012

Here’s an excerpt from Google Analytics:

Avg. Server Connection Time (sec): 0.12 The average amount of time (in seconds) spent in establishing TCP connection for this page.

Avg. Server Response Time (sec): 0.80 The average amount of time (in seconds) your server takes to respond to a user request, including the network time from user’s location to your server.

Avg. Page Load Time (sec): 7.85 Avg. Page Load Time is the average amount of time (in seconds) it takes for pages from the sample set to load, from initiation of the pageview (e.g. click a page link) to load completion in the browser. If you see zero (0) as a value, please refer to the Site Speed article.

Avg. Page Download Time (sec): 2.08 The average amount of time (in seconds) to download this page.
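As a rough cross-check outside Google Analytics, curl can report similar phases for a single request. The sketch below is only an illustration under assumptions: the function name timing_breakdown is made up for this example, and mapping curl's time_connect, time_starttransfer and time_total variables onto the Analytics metrics is approximate (curl's connect time includes the DNS lookup, and curl fetches one resource without rendering the page, so it has no equivalent of Page Load Time).

```shell
#!/bin/sh
# timing_breakdown: hypothetical helper (not part of Google Analytics or curl).
# Takes three cumulative timestamps in seconds, as printed by
#   curl -o /dev/null -s -w '%{time_connect} %{time_starttransfer} %{time_total}\n' URL
# and prints the time spent in each phase.
timing_breakdown() {
    awk -v c="$1" -v s="$2" -v t="$3" 'BEGIN {
        printf "connect=%.2f response=%.2f download=%.2f\n", c, s - c, t - s
    }'
}

# Example with made-up cumulative values (no network access needed):
timing_breakdown 0.12 0.92 3.00
# prints: connect=0.12 response=0.80 download=2.08

# To measure a real URL, something like this should work:
# timing_breakdown $(curl -o /dev/null -s -w '%{time_connect} %{time_starttransfer} %{time_total}' http://www.example.com/)
```

The function only does the subtraction, so it can be fed numbers from any tool that reports cumulative timestamps.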

For example, my site is like this:

[Image: Server Response Time]

PS:

1. For more on how to use Google Analytics Site Speed, see http://support.google.com/analytics/bin/answer.py?hl=en-us&topic=1120718&answer=1205784

2. If you want to break down your site’s loading time by resource type (js/css/html/cgi/php), Firebug is your friend. The following two links explain how to use it:

http://www.softwareishard.com/blog/firebug/firebug-net-panel-timings/

http://www.softwareishard.com/blog/firebug/page-load-analysis-using-firebug/

Categories: IT Architecture Tags:

difference between SCSI, iSCSI, FCP, FCoE, FCIP, NFS, CIFS, DAS, NAS, SAN and iFCP

May 30th, 2012

Here are some differences between SCSI, iSCSI, FCP, FCoE, FCIP, NFS, CIFS, DAS, NAS and SAN (excerpted from the Internet):

Most storage networks use the SCSI protocol for communication between servers and disk drive devices. A mapping layer to other protocols is used to form a network: Fibre Channel Protocol (FCP), the most prominent one, is a mapping of SCSI over Fibre Channel; Fibre Channel over Ethernet (FCoE) carries Fibre Channel frames over Ethernet; and iSCSI is a mapping of SCSI over TCP/IP.

 

A storage area network (SAN) is a dedicated network that provides access to consolidated, block level data storage. SANs are primarily used to make storage devices, such as disk arrays, tape libraries, and optical jukeboxes, accessible to servers so that the devices appear like locally attached devices to the operating system.

Historically, data centers first created “islands” of SCSI disk arrays as direct-attached storage (DAS), each dedicated to an application, and visible as a number of “virtual hard drives” (i.e. LUNs). Operating systems maintain their own file systems on their own dedicated, non-shared LUNs, as though they were local to themselves. If multiple systems were simply to attempt to share a LUN, these would interfere with each other and quickly corrupt the data. Any planned sharing of data on different computers within a LUN requires advanced solutions, such as SAN file systems or clustered computing.

Despite such issues, SANs help to increase storage capacity utilization, since multiple servers consolidate their private storage space onto the disk arrays. Sharing storage usually simplifies storage administration and adds flexibility since cables and storage devices do not have to be physically moved to shift storage from one server to another. SANs also tend to enable more effective disaster recovery processes. A SAN could span a distant location containing a secondary storage array. This enables storage replication either implemented by disk array controllers, by server software, or by specialized SAN devices.
Since IP WANs are often the least costly method of long-distance transport, the Fibre Channel over IP (FCIP) and iSCSI protocols have been developed to allow SAN extension over IP networks. The traditional physical SCSI layer could only support a few meters of distance – not nearly enough to ensure business continuance in a disaster.

More about FCIP is here: http://en.wikipedia.org/wiki/Fibre_Channel_over_IP (FCIP tunnels Fibre Channel frames over IP, so the FC protocol is still used at both ends)

A competing technology to FCIP is known as iFCP. It uses routing instead of tunneling to enable connectivity of Fibre Channel networks over IP.

IP SAN uses TCP as a transport mechanism for storage over Ethernet, and iSCSI encapsulates SCSI commands into TCP packets, thus enabling the transport of I/O block data over IP networks.

Network-attached storage (NAS), in contrast to SAN, uses file-based protocols such as NFS or SMB/CIFS where it is clear that the storage is remote, and computers request a portion of an abstract file rather than a disk block. The key difference between direct-attached storage (DAS) and NAS is that DAS is simply an extension to an existing server and is not necessarily networked. NAS is designed as an easy and self-contained solution for sharing files over the network.

 

FCoE works with standard Ethernet cards, cables and switches to handle Fibre Channel traffic at the data link layer, using Ethernet frames to encapsulate, route, and transport FC frames across an Ethernet network from one switch with Fibre Channel ports and attached devices to another, similarly equipped switch.

 

With iSCSI, when an end user or application sends a request, the operating system generates the appropriate SCSI commands and data request, which then go through encapsulation and, if necessary, encryption procedures. A packet header is added before the resulting IP packets are transmitted over an Ethernet connection. When a packet is received, it is decrypted (if it was encrypted before transmission), and disassembled, separating the SCSI commands and request. The SCSI commands are sent on to the SCSI controller, and from there to the SCSI storage device. Because iSCSI is bi-directional, the protocol can also be used to return data in response to the original request.

 

Fibre Channel is more flexible than the traditional physical SCSI layer; devices can be as far as ten kilometers (about six miles) apart if optical fiber is used as the physical medium. Optical fiber is not required for shorter distances, however, because Fibre Channel also works using coaxial cable and ordinary telephone twisted pair.

 

Network File System (NFS) is a distributed file system protocol originally developed by Sun Microsystems in 1984, allowing a user on a client computer to access files over a network in a manner similar to how local storage is accessed. CIFS, by contrast, is its Windows-based counterpart for file sharing.

Categories: NAS, SAN, Storage Tags:

impact of restarting vxconfigd on solaris and linux – VxVM Configuration Daemon

May 30th, 2012

Stopping and restarting the VxVM Configuration Daemon (vxconfigd) may cause your VxVA, VMSA and/or VEA session to exit. It may also cause a momentary stoppage of any VxVM configuration actions. This should not harm any data; however, it may cause some configuration operations (e.g. moving subdisks, plex resynchronization) to abort unexpectedly. Complete any VxVM configuration changes before you restart the daemon.

If you are using EMC PowerPath devices with Veritas Volume Manager, you must run the EMC command(s) 'powervxvm setup' (or 'safevxvm setup') and/or 'powervxvm online' (or 'safevxvm online') if the restart terminates abnormally. Also, if VCS service groups are running on the host, restarting vxconfigd may cause a failover, so freeze the service groups before doing this. You can refer to the following for details: http://www.doxer.org/learn-linux/differences-between-freezing-vcs-system-and-freezing-service-group/
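The order of operations described above can be sketched as a dry run. This is an illustration under assumptions, not a tested procedure: the function name plan_vxconfigd_restart and the group names are made up, and whether a temporary freeze (hagrp -freeze without -persistent) is enough depends on your cluster. The function only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# plan_vxconfigd_restart: hypothetical helper that only PRINTS commands.
# Arguments: the VCS service group names to freeze around the restart.
plan_vxconfigd_restart() {
    for grp in "$@"; do
        echo "hagrp -freeze $grp"      # temporary freeze of the service group
    done
    echo "vxconfigd -k"                # kill the running vxconfigd and start a new one
    for grp in "$@"; do
        echo "hagrp -unfreeze $grp"    # release the freeze afterwards
    done
}

# Example with made-up group names:
plan_vxconfigd_restart app_sg db_sg
```

Printing instead of executing keeps the sketch safe to run anywhere; once the list looks right, the commands can be run by hand as root on the node.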

Categories: HA, HA & HPC Tags:

check lun0 is the first mapped LUN before rescan-scsi-bus.sh(sg3_utils) on centos linux

May 26th, 2012

rescan-scsi-bus.sh from the sg3_utils package scans all the SCSI buses on the system and updates the SCSI layer to reflect new devices on the bus. For this to work, however, LUN0 must be the first mapped logical unit. Here’s an excerpt from the Wikipedia page:

LUN 0: There is one LUN which is required to exist in every target: zero. The logical unit with LUN zero is special in that it must implement a few specific commands, most notably Report LUNs, which is how an initiator can find out all the other LUNs in the target. But LUN zero need not provide any other services, such as a storage volume.

To confirm LUN0 is the first mapped LUN, run the following check. It uses syminq and symcfg, which come with EMC Solutions Enabler (SYMCLI) for Symmetrix arrays rather than with Symantec Storage Foundation:

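# list each unique Symmetrix ID / FA director / port seen by this host
syminq -pdevfile | awk '!/^#/ {print $1,$4,$5}' | sort -n | uniq | while read _sym _FA _port
do
    # report the port if no device is mapped at LUN address 000
    if [[ -z "$(symcfg -sid $_sym -fa $_FA -p $_port -addr list | awk '$NF=="000"')" ]]
    then
        echo "Sym $_sym, FA $_FA:$_port"
    fi
done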
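The loop prints every Symmetrix FA port that has no device mapped at address 000, so no output means LUN0 is mapped on every port. For such a port, symcfg -sid <sym> -fa <FA> -p <port> -addr list shows a line ending in LUN 000, as in the following output, and you can go on to run rescan-scsi-bus.sh to scan for the new LUN: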

Symmetrix ID: 000287890217

Director                Device Name                    Attr  Address
----------------------  -----------------------------  ----  --------------
Ident   Symbolic  Port  Sym   Physical                       VBUS  TID  LUN
------  --------  ----  ----  -----------------------        ----  ---  ---

FA-4A   04A       0     0000  c1t600604844A56CA43d0s*  VCM   0     00   000
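On the Linux host itself, a similar check can be sketched without SYMCLI by looking at the [host:channel:target:lun] tuples that lsscsi prints. This is an assumption-based illustration, not an equivalent of the check above: the function name targets_without_lun0 is made up, and lsscsi only shows LUNs the kernel has already discovered, so it cannot see what the array has mapped but the host has not scanned.

```shell
#!/bin/sh
# targets_without_lun0: hypothetical filter; reads lsscsi-style lines on stdin
# ("[host:channel:target:lun]  type  vendor ...") and prints every
# host:channel:target that has devices but none at LUN 0.
targets_without_lun0() {
    awk '
        /^\[[0-9]+:[0-9]+:[0-9]+:[0-9]+\]/ {
            # strip the surrounding brackets and split the tuple on ":"
            split(substr($1, 2, length($1) - 2), hctl, ":")
            key = hctl[1] ":" hctl[2] ":" hctl[3]
            seen[key] = 1
            if (hctl[4] == 0) has_lun0[key] = 1
        }
        END {
            for (key in seen)
                if (!(key in has_lun0)) print key
        }
    ' | sort
}

# Example with made-up lsscsi output:
targets_without_lun0 <<'EOF'
[1:0:0:0]    disk    EMC      SYMMETRIX        5671  /dev/sdb
[1:0:0:1]    disk    EMC      SYMMETRIX        5671  /dev/sdc
[2:0:1:3]    disk    EMC      SYMMETRIX        5671  /dev/sdd
EOF
# prints: 2:0:1
```

On a real host the input would come from `lsscsi | targets_without_lun0`; an empty result means every discovered target exposes a LUN 0.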

PS:

For more information on what a Logical Unit Number (LUN) is, you may refer to:

http://en.wikipedia.org/wiki/Logical_Unit_Number

Categories: SAN, Storage Tags:

solaris format disk label Changing a disk label (EFI / SMI)

May 24th, 2012

I had inserted a drive into a V440 and after running devfsadm, I ran format on the disk. I was presented with the following partition table:

partition> p
Current partition table (original):
Total disk sectors available: 143358320 + 16384 (reserved sectors)

Part  Tag         Flag  First Sector  Size     Last Sector
0     usr         wm    34            68.36GB  143358320
1     unassigned  wm    0             0        0
2     unassigned  wm    0             0        0
3     unassigned  wm    0             0        0
4     unassigned  wm    0             0        0
5     unassigned  wm    0             0        0
6     unassigned  wm    0             0        0
8     reserved    wm    143358321     8.00MB   143374704

This disk was used in a zfs pool and, as a result, uses an EFI label. The more familiar label that is used is an SMI label (8 slices; numbered 0-7 with slice 2 being the whole disk). The advantage of the EFI label is that it supports LUNs over 1TB in size and prevents overlapping partitions by providing a whole-disk device called cxtydz rather than using cxtydzs2.
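The sizes in the EFI partition table above follow from the sector numbers, assuming the usual 512-byte sector (an assumption, since format does not print the sector size here). A minimal sketch of the arithmetic, with sectors_to_size being a name made up for this example:

```shell
#!/bin/sh
# sectors_to_size: hypothetical helper, not part of the Solaris format utility.
# Converts an inclusive sector range into GB and MB (powers of 1024),
# assuming 512-byte sectors.
sectors_to_size() {
    awk -v first="$1" -v last="$2" 'BEGIN {
        bytes = (last - first + 1) * 512
        printf "%.2fGB %.2fMB\n", bytes / (1024 * 1024 * 1024), bytes / (1024 * 1024)
    }'
}

sectors_to_size 34 143358320          # slice 0: prints 68.36GB 69999.16MB
sectors_to_size 143358321 143374704   # slice 8: prints 0.01GB 8.00MB
```

Slice 8 spans 16384 sectors, which matches the "16384 (reserved sectors)" figure in the header line of the table.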

However, I want to use this disk for UFS partitions. This means I need to put the SMI label back on the device. Here’s how it’s done:

# format -e

partition> label
[0] SMI Label
[1] EFI Label
Specify Label type[1]: 0
Warning: This disk has an EFI label. Changing to SMI label will erase all
current partitions.
Continue? y
Auto configuration via format.dat[no]?
Auto configuration via generic SCSI-2[no]?
partition> q

format> q
#

Running format again will show that the SMI label was placed back onto the disk:

partition> p
Current partition table (original):
Total disk cylinders available: 14087 + 2 (reserved cylinders)

Part  Tag         Flag  Cylinders    Size      Blocks
0     root        wm    0 - 25       129.19MB  (26/0/0)     264576
1     swap        wu    26 - 51      129.19MB  (26/0/0)     264576
2     backup      wu    0 - 14086    68.35GB   (14087/0/0)  143349312
3     unassigned  wm    0            0         (0/0/0)      0
4     unassigned  wm    0            0         (0/0/0)      0
5     unassigned  wm    0            0         (0/0/0)      0
6     usr         wm    52 - 14086   68.10GB   (14035/0/0)  142820160
7     unassigned  wm    0            0         (0/0/0)      0

partition>

PS:
  1. Keep in mind that changing disk labels will destroy any data on the disk.
  2. Here’s more info about EFI & SMI disk label -  http://docs.oracle.com/cd/E19082-01/819-2723/disksconcepts-14/index.html
  3. More on UEFI and BIOS - http://en.wikipedia.org/wiki/Unified_Extensible_Firmware_Interface
Categories: Storage Tags:

modify sudoers_debug in ldap.conf to debug sudo on linux and solaris

May 22nd, 2012

If you run into a problem when running a sudo command, it helps to have debug information printed to the console. To show detailed debug information when running sudo, modify /etc/ldap.conf (for both Solaris LDAP and Linux LDAP):

# verbose sudoers matching from ldap
sudoers_debug 2

Setting sudoers_debug to 1 shows moderate debugging; setting it to 2 also shows the results of the matches themselves. For example, with sudoers_debug set to 2, running a sudo command produces output like the following:

$ sudo -i
LDAP Config Summary
===================
uri ldaps://testLdapServer/
ldap_version 3
sudoers_base ou=SUDOers,dc=doxer,dc=org
binddn cn=proxyAgent,ou=profile,dc=doxer,dc=org
bindpw password
bind_timelimit 120000
timelimit 120
ssl on
tls_cacertdir /etc/openldap/cacerts
===================
sudo: ldap_initialize(ld, ldaps://testLdapServer/)
sudo: ldap_set_option: debug -> 0
sudo: ldap_set_option: ldap_version -> 3
sudo: ldap_set_option: tls_cacertdir -> /etc/openldap/cacerts
sudo: ldap_set_option: timelimit -> 120
sudo: ldap_set_option(LDAP_OPT_NETWORK_TIMEOUT, 120)
sudo: ldap_set_option(LDAP_OPT_X_TLS, LDAP_OPT_X_TLS_HARD)
sudo: ldap_simple_bind_s() ok
sudo: found:cn=defaults,ou=SUDOers,dc=doxer,dc=org
sudo: ldap sudoOption: 'ignore_local_sudoers'
sudo: ldap search '(|(sudoUser=liandy)(sudoUser=%linuxsupport)(sudoUser=%linux)(sudoUser=ALL))'
sudo: found:cn=LDAPpwchange,ou=sudoers,dc=doxer,dc=org
sudo: ldap sudoHost 'server01' ... not
sudo: ldap sudoHost 'server02' ... not
sudo: ldap search 'sudoUser=+*'
sudo: found:cn=test-su,ou=SUDOers,dc=doxer,dc=org
sudo: ldap sudoUser netgroup '+sysadmin-ng' ... not
sudo: found:cn=dba-su,ou=SUDOers,dc=doxer,dc=org
sudo: ldap sudoUser netgroup '+dba-ng' ... not
sudo: ldap sudoUser netgroup 'test01' ... not
sudo: ldap sudoUser netgroup 'test02' ... not
sudo: found:cn=Linux-Team-root,ou=SUDOers,dc=doxer,dc=org
sudo: ldap sudoUser netgroup '+linuxadmins' ... MATCH!
sudo: ldap sudoHost 'ALL' ... MATCH!
sudo: ldap sudoCommand 'ALL' ... MATCH!
sudo: Perfect Matched!
sudo: ldap sudoOption: '!authenticate'
sudo: user_matches=-1
sudo: host_matches=-1
sudo: sudo_ldap_check(0)=0x422

From the debugging output above, you can see that the account being authenticated belongs to the linuxadmins netgroup, and that this netgroup is listed as a sudoUser of the Linux-Team-root SUDOers entry. Because Linux-Team-root has sudoCommand “ALL”, sudoHost “ALL” and the sudoOption “!authenticate”, the user gets root access with no password prompt.

Now let’s go through a failed authentication to see the debugging information:

$ sudo hastatus -sum
LDAP Config Summary
===================
host testLdapServer
port 389
ldap_version 3
sudoers_base ou=SUDOers,dc=doxer,dc=org
binddn (anonymous)
bindpw (anonymous)
===================
ldap_init(testLdapServer,389)
ldap_set_option(LDAP_OPT_PROTOCOL_VERSION,0x03)
ldap_bind() ok
found:cn=defaults,ou=SUDOers,dc=doxer,dc=org
ldap sudoOption: 'ignore_local_sudoers'
ldap search '(|(sudoUser=liandy)(sudoUser=%normaluser)(sudoUser=%normaluser)(sudoUser=%patop)(sudoUser=ALL))'
ldap search 'sudoUser=+*'
found:cn=test-su,ou=SUDOers,dc=doxer,dc=org
ldap sudoUser netgroup '+sysadmin-ng' ... not
found:cn=tstwas-su,ou=SUDOers,dc=doxer,dc=org
ldap sudoUser netgroup '+linux-team-ng' ... not
found:cn=normal-su,ou=SUDOers,dc=doxer,dc=org
ldap sudoUser netgroup '+normaluser-ng' ... MATCH!
ldap sudoHost 'all' ... MATCH!
ldap sudoCommand '/opt/OV/bin/OpC/opcagt -start' ... not
ldap sudoCommand '/opt/OV/bin/OpC/opcagt -status' ... not
ldap sudoCommand '/opt/OV/bin/OpC/opcagt -stop' ... not
ldap sudoCommand '/opt/OV/bin/OpC/opcagt -kill' ... not
user_matches=-1
host_matches=-1
sudo_ldap_check(0)=0x04
Password:

Here the user being authenticated matches the “normal-su” SUDOers entry, and the host matches its sudoHost. But none of the sudoCommand values in that entry matches “hastatus -sum”, so the LDAP lookup does not authorize the command (the log ends with sudo_ldap_check(0)=0x04 rather than the 0x422 seen in the successful run) and sudo prompts for a password.
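With sudoers_debug 2 the log gets long, so it can help to reduce it to the checks that matched. The filter below is a sketch under assumptions: the function name sudo_ldap_matches is made up, and it relies only on the line shapes visible in the two logs above ("found:cn=..." followed by "ldap sudoUser/sudoHost/sudoCommand ... MATCH!", with or without a leading "sudo: ").

```shell
#!/bin/sh
# sudo_ldap_matches: hypothetical filter for sudoers_debug 2 output on stdin.
# Prints "<sudoers entry cn> <attribute>" for every check that ended in MATCH!.
sudo_ldap_matches() {
    awk '
        { sub(/^sudo: /, "") }                 # some logs prefix each line with "sudo: "
        /^found:cn=/ {
            entry = $0
            sub(/^found:cn=/, "", entry)       # keep only the cn value ...
            sub(/,.*/, "", entry)              # ... up to the first comma
            next
        }
        /^ldap sudo(User|Host|Command) .*MATCH!/ { print entry, $2 }
    '
}

# Example with a few lines in the style of the logs above:
sudo_ldap_matches <<'EOF'
sudo: found:cn=dba-su,ou=SUDOers,dc=doxer,dc=org
sudo: ldap sudoUser netgroup '+dba-ng' ... not
sudo: found:cn=Linux-Team-root,ou=SUDOers,dc=doxer,dc=org
sudo: ldap sudoUser netgroup '+linuxadmins' ... MATCH!
sudo: ldap sudoHost 'ALL' ... MATCH!
sudo: ldap sudoCommand 'ALL' ... MATCH!
EOF
```

An entry that shows sudoUser and sudoHost but no sudoCommand, as normal-su does in the failed run, points to a missing command rule. When piping real sudo output into the filter, the debug lines may arrive on stderr, in which case a 2>&1 redirect is needed.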

vcs service group and resource attributes dictionary page

May 22nd, 2012

Here are all the Veritas VCS service group and resource attributes with their explanations, as a crib sheet/cheat sheet (this is actually the content of the file /etc/VRTSvcs/conf/attributes/cluster_attrs.xml):

Administrators Contains list of users with Administrator privileges.
 AllowNativeCliUsers If user does not have root privileges, and if this attribute is set to 0 (false), user is prompted for a password when issuing ha-xxx commands. If this attribute is set to 1(true), the user is not prompted; instead, VCS validates OS user’s login against VCS’ list of user IDs and assigns appropriate privileges. Default = 0(false).
 ClusterLocation Specifies the location of the cluster.
 ClusterName Arbitrary string containing the name of cluster.
 ClusterOwner This attribute is used for VCS notification; specifically, VCS sends notifications to persons designated in this attribute when something goes wrong with the cluster.
 CompareRSM Indicates if VCS engine is to verify that Replicated State Machine is consistent. This can be set by using the hadebug command.
 CounterInterval Intervals counted by the attribute GlobalCounter indicating approximately how often a broadcast will happen that will cause the GlobalCounter attribute to increase. The default value of the GlobalCounter increment can be modified by changing CounterInterval. If you increase this attribute to exceed five seconds, consider increasing the default value of the ShutdownTimeout attribute.
 DumpingMembership Indicates that the engine is writing to disk.
 EngineClass Indicates the scheduling class for the VCS engine (had).
 EnginePriority Indicates the priority in which had runs. This attribute has no effect in Windows environments.
 GlobalCounter This counter increases incrementally by one for each counter interval. It increases when the broadcast is received. VCS uses the GlobalCounter attribute to measure the time it takes to shut down a system. By default, the GlobalCounter attribute is updated every five seconds. This default value, combined with the 60-second default value of ShutdownTimeout, means if system goes down within twelve increments of GlobalCounter, it is treated as a fault. The default value of GlobalCounter increment can be modified by changing the CounterInterval attribute.
 GroupLimit Maximum number of service groups.
 HacliUserLevel This attribute has three, case-sensitive values:
 LockMemory Controls the locking of VCS engine pages in memory. This attribute has three values: ALL: Locks all current and future pages. CURRENT: Locks current pages.
 LogSize Size of the log file. Minimum value: 64 KB. Maximum value: 128 MB.
 MajorVersion Major version of system’s join protocol.
 MinorVersion Minor version of system’s join protocol.
 Notifier Indicates the status of the notifier in the cluster; specifically:
 Operators Contains list of users with Operator privileges.
 ProcessClass Indicates the scheduling class for had processes (for example, triggers).
 ProcessPriority Indicates the priority of had processes (for example, triggers). This attribute has no effect in Windows environments.
 PrintMsg If set to 1 (true), enables logging TagM messages in engine log.
 ReadOnly Indicates the mode of cluster configuration.
 ResourceLimit Maximum number of resources.
 SourceFile File from which the configuration was read.
 TypeLimit Maximum number of resource types.
 UserNames List of VCS user names.
 VCSMode Denotes the mode for which VCS is licensed, including VCS, VCS_QUICKSTART, and VCS_OPS.
 LinkMonitoring Enables link monitoring.
 NotifyList Stores notification list consisting of recipient’s email addresses, separated by spaces.
 MaxFactor For internal use only.
 LoadSampling For internal use only.
 Factor For internal use only.
 Stewards Specifies the IP address/hostname of systems running the steward process.
 ClusterAddress Specifies the cluster’s virtual IP address (used by a remote cluster when connecting to the local cluster).
 ClusterUUID Indicates the unique cluster identification assigned to the cluster by the Availability Manager.
 ClusState Indicates the current state of the cluster
 AutoStartTimeout If the local cluster cannot communicate with one or more remote clusters, this attribute specifies the number of seconds the VCS engine waits before initiating the AutoStart process for an AutoStart global service group.
 PanicOnNoMem For internal use only.
 UseFence Indicates whether the cluster uses SCSI III I/O fencing.
 VCSFeatures Indicates which VCS features are enabled.
 ClusterTime The number of seconds since January 1, 1970. This is defined by the lowest node in running state.
 WACPort The TCP port on which the WAC (Wide Area Connector) process on the local cluster listens for connection from remote clusters. The attribute can take a value from 0 to 65535.
 SecureClus Indicates whether the cluster is secured. VCS runs in the Secure mode using VxSS; all public network communications use SSL. VCS users belong to the platform user base and VCS does not store user passwords. This value cannot be changed while the cluster is running.
 AvailableCapacity Available Capacity = Capacity – Current System Load
 Capacity Value expressing total load capacity of system. This value is relative to other systems in the cluster and does not reflect any real value associated with the system. Default=100
 ConfigBlockCount Number of 512-byte blocks in configuration when the system joined the cluster.
 ConfigCheckSum Sixteen-bit checksum of configuration identifying when the system joined the cluster.
 ConfigDiskState State of configuration on the disk when the system joined the cluster.
 ConfigFile Directory containing the configuration files.
 ConfigModDate Last modification date of configuration when the system joined the cluster.
 CurrentLimits System-maintained calculation of current value of Limits. CurrentLimits = Limits – (additive value of all service group Prerequisites).
 CPUUsage Indicates the CPUUsage of the system in the form of CPU percentage utilization. This attribute’s value is valid if the Enabled value in CPUUsageMonitoring attribute equals 1. This value is updated when there is a change of 5 percent since the last indicated value.
 CPUUsageMonitoring Monitors the system’s CPU usage using various factors. The default value for this attribute is CPUUsageMonitoring = {Enabled = 0, NotifyThreshold = 0, NotifyTimeLimit = 0, ActionThreshold = 0, ActionTimeLimit = 0, Action = NONE}
 DiskHbStatus Indicates status of communication disks on the system.
 DynamicLoad System-maintained value of current dynamic load. The value is set external to VCS with the hasys -load command.
 Frozen Indicates if service groups can be brought online or taken offline on the system. Groups cannot be brought online or taken offline if the attribute value is 1 (true).
 GUIIPAddr Determines the local IP address that VCS uses to accept connections. Incoming connections over other IP addresses are dropped. If GUIIPAddr is not set, the default behavior is to accept external connections over all configured local IP addresses.
 Limits An unordered set of name=value pairs denoting specific resources available on a system. Names are arbitrary and are set by the administrator for any value; names are not obtained from the system. The format for Limits is: Limits() = { Name=Value, Name2=Value2 }.
 Location Denotes the location of the system.
 LinkHbStatus Indicates status of private network links on any system.
 LoadTimeCounter System-maintained internal counter of how many seconds the system load has been above LoadWarningLevel. This value resets to zero anytime system load drops below the value in LoadWarningLevel.
 LoadTimeThreshold Indicates length of time a system must remain at or above LoadWarningLevel before the loadwarning trigger is fired. Default = 600 seconds.
 LoadWarningLevel Defines, as a percentage of system Capacity, the level at which load has reached a critical limit. Default = 80 percent.
 MajorVersion Major version of system’s join protocol.
 MinorVersion Minor version of system’s join protocol.
 NodeId System ID specified in “/etc/llttab”.
 OnGrpCnt Number of groups that are online, or about to go online.
 ShutdownTimeout Determines whether to treat system reboot as a fault for service groups running on the system. On many systems, when a reboot occurs the processes are killed first, then the system goes down. When the VCS engine is killed, service groups that include the failed system in their SystemList attributes are autodisabled. However, if the system goes down within the number of seconds designated in ShutdownTimeout, service groups previously online on the failed system are treated as faulted and failed over. If you do not want to treat the system reboot as a fault, set the value for this attribute to 0. Default = 120 seconds
 SourceFile File from which the configuration was read.
 SysInfo Provides platform-specific information, including the name, version, and release of the operating system, the name of the system on which it is running, and the hardware type.
 SystemOwner This attribute is used for VCS email notification and logging. VCS sends email notification to the person designated in this attribute when an event occurs related to the system.
 SysState System state such as running, faulted, exited.
 TFrozen Indicates if a group can be brought online or taken offline on the system.
 TRSE Indicates in seconds the time to Regular State Exit. Time is calculated as the duration between the events of VCS losing port h membership and of VCS losing port a membership of GAB.
 UpDownState This attribute has four values: DOWN, UP BUT NOT IN CLUSTER MEMBERSHIP, UP AND IN JEOPARDY, UP.
 UserInt Stores a system’s integer value.
 UserStr Stores a system’s String value.
 DiskHbDown Indicates if communication disks are down on any system. Enabled by the LinkMonitoring attribute.
 LinkHbDown Indicates if private network links are down on any system. Enabled by the LinkMonitoring attribute.
 Load Normalized value of system load used to compare systems in load balancing. Value is determined by dividing the raw values of LoadRaw by the values of Factor.
 LoadRaw List of load-calculation criteria and their associated raw values over the last five seconds.
 SysName The name of the system. The name must begin with a letter and must only contain letters, numbers, dashes (-), and underscores (_).
 SystemLocation Denotes the location of the system.
 LLTNodeId Displays the node ID (as defined in the llthosts.txt) for the node.
 ConfigInfoCnt For internal use only.
 AgentsStopped The attribute is set to 1 for a system when all agents running on that system are stopped.
 NoAutoDisable When set to 0, this attribute autodisables service groups when the VCS engine is taken down. Groups remain autodisabled until the engine is brought up (regular membership). Setting this attribute to 1 bypasses the autodisable feature.
 LicenseKey LicenseKey
 VCSFeatures Indicates which VCS features are enabled.
 LicenseType Indicates the license type of the base VCS key used by the system. Possible values are:
 VCSMode Denotes the mode for which VCS is licensed, including VCS, Traffic Director, and VCS_OPS.
 CPUBinding Binds the HAD process to the specified CPU.
 EngineRestarted Indicates whether the VCS engine (HAD) was restarted by the hashadow process on a node in the cluster. The value 1 indicates that the engine was restarted; 0 indicates it was not restarted.
 ConnectorState Indicates the state of the wide-area connector (WAC). If 0, WAC is not running. If 1, WAC is running and communicating with the VCS engine.
 ActiveCount Number of resources in a service group that are active (online or waiting to go online). When the number of resources drops to zero, the service group is considered offline.
 Administrators List of VCS users with privileges to administer the group.
 AutoDisabled Indicates that VCS does not know the status of a service group (or specified system for parallel service groups). This is due to: 1) Group not probed (on specified system for parallel groups) in the SystemList attribute. 2) VCS engine is not running on a node designated in the SystemList attribute, but the node is visible.
 AutoFailOver Indicates whether VCS initiates an automatic failover if the service group faults.
 AutoRestart Restarts a service group after a faulted persistent resource becomes online. This attribute applies to persistent resources only. Default value is 1(true).
 AutoStart Designates whether a service group is automatically started when VCS is started. Default value is 1(true).
 AutoStartIfPartial Indicates whether to initiate bringing a service group online if the group is probed and discovered to be in a PARTIAL state when VCS is started. Default =1.
 AutoStartList List of systems on which, under specific conditions, the service group will be started with  VCS (usually at system boot). For example, if a system is a member of a failover service group’s AutoStartList attribute, and if it is not already running on another system in the cluster, the group is brought online when the system is started.
 AutoStartPolicy Sets the policy VCS uses to determine on which system to bring a service group online if multiple systems are available.
 CurrentCount Number of systems on which the service group is active.
 Enabled Indicates if a group can be failed over or brought online. If any of the local values are disabled, the group is disabled. Default value is 1(true).
 Evacuate Indicates if VCS initiates an automatic failover when user issues hastop -local -evacuate. Default value is 1(true).
 Evacuating Indicates the node ID from which the service group is being evacuated.
 Failover Indicates service group is in the process of failing over.
 FailOverPolicy Sets the policy VCS uses to determine which system a group fails over to if multiple systems exist. The values are Priority (default), Load, RoundRobin.
 FromQ Indicates the system name from which the service is failing over. This attribute is specified when service group failover is a direct consequence of the group event, such as a resource fault within the group or a group switch.
 Frozen Disables all actions, including autostart, online and offline, and failover, except for monitor actions performed by agents. Default value is 0(false).
 GroupOwner This attribute is used for VCS email notification and logging. VCS sends email notification to the person designated in this attribute when an event occurs related to the service group.
 IntentOnline Indicates whether to keep service groups online or offline. It is set to 1 by VCS if an attempt has been made, successful or not, to online the service group. For failover groups, this attribute is set to 0 by VCS when the group is taken offline. For parallel groups, it is set to 0 for the system when the group is taken offline or when the group faults and can fail over to another system.
 LastSuccess For internal use only.
 Load Integer value expressing total system load this group will put on a system.
 ManualOps Indicates if manual operations are allowed on the service group.
 MigrateQ Indicates the system from which the service group is migrating. This attribute is specified when group failover is an indirect consequence, such as system shutdown, another group faulted and is linked to this group, etc.
 NumRetries Indicates the number of times attempts are made to bring a service group online. This attribute is used only if the attribute OnlineRetryLimit is set for the service group.
 OnlineRetryInterval Indicates the interval, in seconds, during which a service group that has successfully restarted on the same system and faults again should be failed over, even if the attribute OnlineRetryLimit is non-zero. This prevents a group from continuously faulting and restarting on the same system. Default=0
 OnlineRetryLimit If non-zero, specifies the number of times the VCS engine tries to restart a faulted service group on the same system on which the group faulted, before it gives up and tries to fail over the group to another system. Default = 0.
 Operators List of VCS users with privileges to operate the group. A Group Operator can only perform online/offline and temporary freeze/unfreeze operations pertaining to a specific group.
 Parallel This indicates if service group is failover (0), parallel (1) or hybrid (2).
 PathCount Number of resources in path not yet taken offline. When this number drops to zero, the engine may take the entire service group offline if critical fault has occurred.
 PreOnline Indicates that the VCS engine should not online a service group in response to a manual group online, group autostart, or group failover. The engine should instead call a user-defined script that checks for external conditions before bringing the group online. Default value is 0(false).
 PreOnlining Indicates that VCS engine invoked the preonline script; however, the script has not yet returned with group online.
 Prerequisites An unordered set of name=value pairs denoting specific resources required by a service group. If prerequisites are not met, the group cannot go online. The format for Prerequisites is: Prerequisites() = { Name=Value, name2=value2 }. Names used in setting Prerequisites are arbitrary and not obtained from the system. Coordinate name=value pairs listed in Prerequisites with the same name=value pairs in Limits().
 Priority Enables users to designate and prioritize the service group. VCS does not interpret the value; rather, this attribute enables the user to configure the priority of a service group and the sequence of actions required in response to a particular event. Default=0
 PrintTree Indicates whether or not the resource dependency tree is written to the configuration file.
 Probed Indicates whether all enabled resources in the group have been detected by their respective agents.
 ProbesPending The number of resources that remain to be detected by the agent on each system.
 Responding Indicates VCS engine is responding to a failover event and is in the process of bringing the service group online or failing over the node.
 SourceFile File from which the configuration was read.
 State Group state on each system. Group states are OFFLINE, ONLINE, FAULTED, PARTIAL, STARTING, STOPPING, OFFLINE | FAULTED, OFFLINE | STARTED, PARTIAL | FAULTED, PARTIAL | STARTING, PARTIAL | STOPPING, ONLINE | STOPPING.
 SystemList List of systems on which the service group is configured to run and their priorities. Lower numbers indicate a preference for the system as a failover target.
 SystemZones Indicates the virtual sublists within the SystemList attribute that grant priority in failing over. Values are string/integer pairs. The string key is the name of a system in the SystemList attribute, and the integer is the number of the zone. Systems with the same zone number are members of the same zone. If a service group faults on one system in a zone, it is granted priority to fail over to another system within the same zone, despite the policy granted by the FailOverPolicy attribute.
 Tag Identifies special-purpose service groups created for specific VCS products.
 TargetCount Indicates the number of target systems on which the service group should be brought online.
 TFrozen Indicates if service groups can be brought online on the system. Groups cannot be brought online if the attribute value is 1(true). Default value is 0 (false).
 ToQ Indicates the node name to which the service is failing over. This attribute is specified when service group failover is a direct consequence of the group event, such as a resource fault within the group or a group switch.
 TriggerResStateChange Determines whether or not to invoke the resstatechange trigger if resource state changes.
 UserIntGlobal Use this attribute for any purpose. It is not used by VCS.
 UserStrGlobal Use this attribute for any purpose. It is not used by VCS.
 TypeDependencies Creates a dependency between resource types specified in the service group list, and all instances of the respective resource type.
 UserIntLocal Use this attribute for any purpose. It is not used by VCS.
 UserStrLocal Use this attribute for any purpose. It is not used by VCS.
 Dependencies Creates a dependency between resource types specified in the service group list, and all instances of the respective resource type.
 PostOffline Setting this attribute to 1 executes the PostOffline event trigger on the system where the group went offline from a partial or fully online state.
 PostOnline Setting this attribute to 1 executes the PostOnline event trigger on the system where the group went online from a partial or fully offline state.
 ExtMonApp For internal use only.
 ExtMonArgs For internal use only.
 PreOffline For internal use only.
 PreOfflining For internal use only.
 Restart For internal use only.
 TriggerEvent For internal use only.
 ClusterList Specifies the list of clusters on which the service group is configured to run.
 Authority Indicates whether or not the local cluster is allowed to bring the service group online. If set to 0, it is not; if set to 1, it is.
 ClusterFailOverPolicy Determines how a global service group behaves when a cluster faults.
 ManageFaults Specifies if VCS manages resource failures within the service group by calling clean entry point for
 FaultPropagation Specifies if VCS should propagate the fault up to parent resources and take the entire service group
 PreonlineTimeout Defines the maximum amount of time the preonline script takes to run the command hagrp -online -nopre for the group. Note that HAD uses this timeout during evacuation only.
 DeferAutoStart Indicates whether HAD defers the auto-start of a local group in case the global cluster is not fully connected.
 VCSi3Info Enables VCS service groups to be mapped to VERITAS i3 applications. This attribute is managed solely by the i3 product and should not be set or modified by the user.
 AgentClass Indicates the scheduling class for the VCS agent process.
 AgentFailedOn A list of systems on which the agent for the resource type has failed.
 AgentPriority Indicates the priority in which the agent process runs. This attribute has no effect for windows environment. Default = 0.
 AgentReplyTimeout The number of seconds the engine waits to receive a heartbeat from the agent before restarting the agent. Default = 130 seconds.
 AgentStartTimeout The number of seconds after starting the agent that the engine waits for the initial agent “handshake” before restarting the agent. Default = 60 seconds.
 ArgList An ordered list of attributes whose values are passed to the open, close, online, offline, monitor, and clean entry points.
 AttrChangedTimeout Maximum time (in seconds) within which the attr_changed entry point must complete or be terminated. Default = 60 seconds.
 CleanTimeout Maximum time (in seconds) within which the clean entry point must complete or else be terminated. Default = 60 seconds.
 CloseTimeout Maximum time (in seconds) within which the close entry point must complete or else be terminated. Default = 60 seconds.
 ConfInterval When a resource has remained online for the specified time (in seconds), previous faults and restart attempts are ignored by the agent.
 FaultOnMonitorTimeouts When a monitor times out as many times as the value specified, the corresponding resource is brought down by calling the clean entry point. The resource is then marked FAULTED, or it is restarted, depending on the value set in the RestartLimit attribute. When FaultOnMonitorTimeouts is set to 0, monitor failures are not considered indicative of a resource fault. A low value may lead to spurious resource faults, especially on heavily loaded systems.
 LogFileSize Specifies the size (in bytes) of the agent log file. Minimum value is 65536 bytes. Maximum value is 134217728 bytes (128MB). Default = 33554432 (32MB)
 MonitorInterval Duration (in seconds) between two consecutive monitor calls for an ONLINE or transitioning resource. A lower value could impact performance if many resources of the same type exist. A higher value could delay detection of a faulted resource.
 MonitorTimeout Maximum time (in seconds) within which the monitor entry point must complete or else be terminated. Default = 60 seconds
 OfflineMonitorInterval Duration (in seconds) between two consecutive monitor calls for an OFFLINE resource. If set to 0, OFFLINE resources are not monitored.
 NumThreads Number of threads used within the agent process for managing resources. This number does not include the three threads used for other internal purposes. Increasing it to a significantly large value can degrade system performance. Decreasing it to 1 prevents multiple threads. Default = 10.
 OfflineTimeout Maximum time (in seconds) within which the offline entry point must complete or else be terminated. Default = 300 seconds
 OnlineRetryLimit Number of times to retry online, if the attempt to online a resource is unsuccessful. This parameter is meaningful only if clean is implemented. Default = 0.
 OnlineTimeout Maximum time (in seconds) within which the online entry point must complete or else be terminated. Default = 300 seconds
 OnlineWaitLimit Number of monitor intervals to wait after completing the online procedure, and before the resource becomes online. Default = 2.
 OpenTimeout Maximum time (in seconds) within which the open entry point must complete or else be terminated. Default = 60 seconds.
 Operations Indicates valid operations of resources of the resource type. Values are OnOnly (can online only), OnOff (can online and offline), None (cannot online or offline).
 RestartLimit Number of times to retry bringing a resource online when it is taken offline unexpectedly and before VCS declares it FAULTED. Default = 0
 ScriptClass Indicates the scheduling class of the script processes (for example, online) created by the agent. This attribute has no effect for windows environment.
 ScriptPriority Indicates the priority of the script processes created by the agent. This attribute has no effect for windows environment. Default = 0.
 SourceFile File from which the configuration was read.
 ToleranceLimit Number of times the monitor entry point should return OFFLINE before declaring the resource FAULTED. A large value could delay detection of a genuinely faulted resource. Default = 0
 MonitorIfOffline Indicates whether resources are monitored when offline (value 1), or not (value 0).
 Type File system type, such as vxfs, ufs, etc.
 RestartLimits The number of times the agent should try to restart the resources.
 FireDrill Specifies whether or not fire drill is enabled for resource type. If set to 1, fire drill is enabled. If set to 0, it is disabled.
 LogDbg Indicates the debug severities enabled for the resource type or agent framework. Debug severities used by the agent entry points are in the range of DBG_1 to DBG_21. The debug messages from the agent framework are logged with the severities DBG_AGINFO, DBG_AGDEBUG and DBG_AGTRACE, representing the least to most verbose.
 MonitorStatsParam Designates the values governing the monitor interval. Valid keys include:
 InfoInterval Determines when info entry point is invoked by the agent framework. If set to 0, the entry point is not invoked. Set this attribute to a non-zero value to invoke the entry point periodically.
 InfoTimeout Timeout value for info entry point. If entry point does not complete by the designated time, the agent framework cancels the entry point’s thread.
 ActionTimeout Timeout value for action entrypoint. Default is 40s
 SupportedActions Valid action tokens for this resource type. Default is an
 LogLevel LogLevel
 LogTags LogTags
 ArgListValues List of arguments passed to the resource’s agent on each system. This attribute is resource- and system-specific, meaning that the list of values passed to the agent depends on which system and which resource they are for.
 AutoStart Indicates that the resource is brought online when the service group is brought online. Default value is 1(true).
 ConfidenceLevel Indicates the level of confidence in an online resource. Values range from 0 – 100. Note that some VCS agents may not take advantage of this attribute and may always set it to 0. Set the level to 100 if the attribute is not used.
 Critical Indicates that the service group is faulted when the resource, or any resource it depends on, faults. Default value is 1(true).
 Enabled Indicates agents monitor the resource. If a resource is created dynamically while VCS is running, you must enable the resource before VCS monitors it. When Enabled is set to 0(false), it implies a disabled resource. VCS will not bring a disabled resource, nor its children online, even if the children are enabled. If you specify the resource in main.cf prior to starting VCS, the default value for this attribute is 0(false).
 Flags Additional information relating to the state of a resource. Possible values are : RESTARTING, STATUS UNKNOWN, MONITOR TIMEDOUT, UNABLE TO OFFLINE and ADMIN WAIT.
 Group String name of the service group to which the resource belongs.
 LastOnline Indicates the system name on which the resource was last online. This attribute is automatically set by the VCS engine (had).
 MonitorOnly Indicates if the resource can be brought online or taken offline. If set to 0(false), the resource can be brought online or taken offline. If set to 1(true), the resource can be monitored only. Default value is 0(false).
 IState Indicates internal state of a resource. In addition to the State attribute, this attribute shows to which state the resource is transitioning. Possible values are : NOT WAITING, WAITING TO GO ONLINE, WAITING FOR CHILDREN ONLINE, WAITING TO GO OFFLINE, WAITING TO GO OFFLINE (propagate), WAITING TO GO ONLINE (reverse), WAITING TO GO OFFLINE (reverse/propagate).
 Path The number of parent resources in the path up to the top of the resource graph. This attribute is used when an online resource faults.
 Probed Indicates whether the resource has been detected by the agent.
 ResourceOwner This attribute is used for VCS email notification and logging. VCS sends email notification to the person designated in this attribute when an event occurs related to the resource. VCS also logs the owner name when an event occurs. If ResourceOwner is not specified in main.cf, the default value is “unknown.”
 Signaled Indicates whether a resource has been traversed. Used when bringing a service group online or taking it offline.
 Start Indicates whether a resource was started (the process of bringing it online was initiated) on a system.
 State Resource state on each system. Possible values are : ONLINE, OFFLINE, FAULTED, ONLINE | STATE UNKNOWN, ONLINE | MONITOR TIMEDOUT, ONLINE | UNABLE TO OFFLINE, OFFLINE | STATE UNKNOWN, FAULTED | RESTARTING. A faulted resource is physically offline, though unintentionally.
 AgentDebug A flag that defines whether the agent logs additional debug messages. The value 1(true) indicates that the agent will log additional debug messages. The value 0(false) indicates that it will not. Default value is 0(false).
 TriggerEvent For internal use only.
 ResourceInfo This attribute has three predefined keys: State (values are Valid, Invalid, or Stale); Msg (output of the info entry point captured on stdout by the agent framework); TS (timestamp indicating when the ResourceInfo attribute was updated by the agent framework). Defaults: State = Valid, Msg = “”, TS = “”.
 ComputeStats Indicates to the agent framework whether or not to calculate monitor time statistics for the resource. By default this is set to FALSE.
 MonitorTimeStats The valid keys for this attribute are: Average, TS. Average is the average time taken by the monitor entry point over the last “Frequency” number of monitor cycles. TS is the timestamp of when the engine last updated the Average for the resource. Default values are:
 Name For internal use only.
 Enabled Indicates if SNMP traps are enabled.
 IPAddr IP address of the host where the SNMP Manager resides.
 Port Port of SNMP server.
 SourceFile File from which the configuration was read.
 TrapList List of traps and their descriptions.
 Clusterlist List of clusters whose health is determined by this heartbeat.
 AgentState State of the heartbeat agent.
 State This is the state of the heartbeat. This state is used to determine the health of the remote cluster.
 AYAInterval This is the ‘Are You Alive Interval’. This is the interval after which the local cluster heartbeats the remote cluster.
 InitTimeout Timeout value for the ‘init’ entry point.
 StartTimeout Timeout value for the ‘start’ entrypoint.
 CleanTimeout This is the timeout value for the ‘clean’ entry point.
 StopTimeout This is the timeout value for the Stop entry point.
 AYATimeout This is the timeout value for the aya entry point.
 AYARetryLimit Number of times to call the aya entry point before giving up.
 Arguments Extra generic information that can be passed to the heartbeat agent.
 LogDbg This is used for log messages.

PS:
1.You can download cluster_attrs.xml here for more information on vcs service group and resource attributes, such as whether an attribute is editable/important/mustconfigure, its displayname, etc.

vcs-cluster_attrs.zip

2.Some vcs attributes are not listed here because they’re dedicated to applications such as oracle. The attribute configuration file for such an agent can be imported, as detailed for example in this article: http://sfdoccentral.symantec.com/sf/5.0/solaris64/html/vcs_agents_oracle/ch_vha_oracle_configagent9.html
Categories: HA, HA & HPC Tags:

awstats installation and configuration guide on linux centos

May 21st, 2012 1 comment

Here’s a howto/guide about awstats installation and configuration on linux:

yum -y install awstats

Here are the main things installed:

/var/www/awstats
/etc/awstats
/etc/cron.hourly/00awstats
/etc/httpd/conf.d/awstats.conf
/usr/bin/awstats_buildstaticpages.pl
/usr/bin/awstats_exportlib.pl
/usr/bin/awstats_updateall.pl
/usr/bin/logresolvemerge.pl
/usr/bin/maillogconvert.pl
/usr/bin/urlaliasbuilder.pl

mv /etc/awstats/awstats.localhost.localdomain.conf /etc/awstats/awstats.mysite.conf
vi /etc/awstats/awstats.mysite.conf

LogFile="/usr/bin/logresolvemerge.pl /var/log/httpd/*-access.log|"

#there's a way to add gzipped log file for analyzing
#LogFile="gzip -d </var/log/apache/access.log.gz|"
LogType=W #W is for analyzing web log files
LogFormat=1 #or use a custom log format if you don't use the combined log format
SiteDomain="www.yoursite.com"
AllowToUpdateStatsFromBrowser=1

cd /var/www/awstats/

chmod -R 755 /var/log/httpd #The web server user needs search (x) permission on the log directory and read permission on the log files. Without them, you’ll encounter the error message below when you click “Update now” in the browser:

awstats Couldn’t open server log file xxxx: Permission denied

perl ./awstats.pl -config=mysite -update #or update from browser. or through logrotate(http://awstats.sourceforge.net/docs/awstats_faq.html#ROTATE) or through crontab(http://awstats.sourceforge.net/docs/awstats_faq.html#CRONTAB)
perl ./awstats.pl -config=mysite -output -staticlinks > awstats.mysite.html
Now visit http://www.yoursite.com/awstats-html/awstats.mysite.html, or view the dynamic report at http://www.yoursite.com/awstats/awstats.pl?config=mysite (a specific report looks like http://www.yoursite.com/awstats/awstats.pl?month=MM&year=YYYY&output=unknownos). Dynamic reports are generated in real time from the statistics database. If this is slow or puts too much load on your server, consider generating static reports instead.
Here’s the httpd configuration file for awstats:

[root@doxer awstats]# cat /etc/httpd/conf.d/awstats.conf
Alias /awstats/icon/ /var/www/awstats/icon/
Alias /awstats-html/ /var/www/awstats/
ScriptAlias /awstats/ /var/www/awstats/
<Directory /var/www/awstats/>
AllowOverride All
DirectoryIndex awstats.pl
Options ExecCGI
Order allow,deny
Allow from all
</Directory>
#Alias /css/ /var/www/awstats/css/
#Alias /js/ /var/www/awstats/js/

NB:

1.If you encounter 500 internal server error, this article may be useful for you to troubleshoot http://www.doxer.org/learn-linux/resolved-awstats-500-internal-server-error-after-installation-on-centos-linux/

2.For more info, you can refer to official site here http://awstats.sourceforge.net/docs/index.html

resolved awstats 500 internal server error after installation on centos linux

May 21st, 2012 No comments

After installing awstats on centos according to the official installation guide, the dynamic view in the browser rendered fine, i.e. http://www.mysite.com/awstats/awstats.pl?config=mysite worked and I could see the statistics with no problem. However, when I tried to view the static html page generated by perl ./awstats.pl -config=mysite -output -staticlinks > awstats.mysite.html, I got a 500 internal server error when visiting http://www.mysite.com/awstats/awstats.mysite.html.

This is quite weird, because the pages that return a 500 internal server error are usually dynamically generated ones such as php pages or perl cgi scripts. Here only the static html page gave the 500 internal server error, while the dynamically generated pages rendered fine. I tried moving the html file to another virtualhost and it rendered without the 500 internal server error: the statistics looked good and all icons were ok.

The configuration of awstats in httpd conf file was like this:

[root@doxer awstats]# cat /etc/httpd/conf.d/awstats.conf
Alias /awstats/icon/ /var/www/awstats/icon/
ScriptAlias /awstats/ /var/www/awstats/
<Directory /var/www/awstats/>
AllowOverride All
DirectoryIndex awstats.pl
Options ExecCGI
Order allow,deny
Allow from all
</Directory>
#Alias /css/ /var/www/awstats/css/
#Alias /js/ /var/www/awstats/js/

Pay attention to the ScriptAlias line (marked in red in the original post). It took me a whole morning to find the root cause (there was no useful error log for this awstats 500 internal server error). As detailed in the httpd documentation:

The ScriptAlias directive has the same behavior as the Alias directive, except that in addition it marks the target directory as containing CGI scripts that will be processed by mod_cgi’s cgi-script handler.

So every file under the directory that a ScriptAlias directive points to is treated as a CGI script. The static html file was placed under a directory that should contain only CGI scripts, so httpd tried to run it as a CGI script and returned a 500 internal server error.

To fix this awstats 500 internal server error, add an Alias line that maps a separate URL prefix (/awstats-html/) to the same directory, so that the configuration file looks like this:

[root@doxer awstats]# cat /etc/httpd/conf.d/awstats.conf
Alias /awstats/icon/ /var/www/awstats/icon/
Alias /awstats-html/ /var/www/awstats/
ScriptAlias /awstats/ /var/www/awstats/
<Directory /var/www/awstats/>
AllowOverride All
DirectoryIndex awstats.pl
Options ExecCGI
Order allow,deny
Allow from all
</Directory>
#Alias /css/ /var/www/awstats/css/
#Alias /js/ /var/www/awstats/js/

After this change and a reload of httpd, you should be able to view the awstats static html file with no problem (use http://www.mysite.com/awstats-html/awstats.mysite.html instead of http://www.mysite.com/awstats/awstats.mysite.html).

NB:

Here’s an article about awstats installation on linux howto:  http://www.doxer.org/learn-linux/awstats-installation-steps-on-linux-centos/

re-ip on solaris server howto – change ip netmask defaultrouter gateway

May 18th, 2012 No comments

To change ip/netmask/defaultrouter/gateway on a solaris 10 or solaris 9 server permanently, you need to take care of the files below:

/etc/hosts -> /etc/inet/hosts
/etc/hostname.<tags of your interface>
/etc/inet/netmasks
/etc/defaultrouter

Let’s assume that the new ip address is 101.139.1.151, the new netmask is 255.255.254.0, the new gateway is 101.139.1.254 and the new broadcast address is 101.139.1.255. Here are the steps:
1)change /etc/hosts (or /etc/inet/hosts; they are the same file):
101.139.1.151 <tag for your server’s ip address>

2)change defaultrouter in /etc/defaultrouter, so that the file contains the new gateway address:
101.139.1.254

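3)change /etc/hostname.<tags of your interface> (this step may not be needed: it only applies if the file contains the ip address itself rather than a hostname that is resolved through /etc/hosts):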

4)change netmask in /etc/inet/netmasks:
You first need to calculate the network address from the given ip address (101.139.1.151) and netmask (255.255.254.0). You can calculate it by hand (refer to this article http://www.doxer.org/learn-linux/basic-knowledge-for-netmask-hexadecimal-decimal-binary-netmask-cidr-calculator/), but I prefer to use ipcalc:
[root@doxer~]# ipcalc -pnbm 101.139.1.151 255.255.254.0
NETMASK=255.255.254.0
PREFIX=23
BROADCAST=101.139.1.255
NETWORK=101.139.0.0

So from the output, you’d know that the network address is 101.139.0.0. Then add a line to /etc/inet/netmasks with format <network address> <netmask address>:
101.139.0.0 255.255.254.0
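
If ipcalc is not at hand, the same network address can be worked out with plain shell arithmetic: AND every octet of the ip address with the matching octet of the netmask. Here is a minimal sketch of that idea, not a replacement for ipcalc. The function names network_address and netmasks_line are made up for this illustration, and it assumes a POSIX shell and dotted-decimal input.

```shell
#!/bin/sh
# network_address IP NETMASK
# Prints the network address: each IP octet ANDed with the matching mask octet.
# (Hypothetical helper; runs in a subshell so IFS and variables do not leak.)
network_address() (
    ip=$1
    mask=$2
    IFS=.
    set -- $ip $mask    # $1..$4 = IP octets, $5..$8 = netmask octets
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

# netmasks_line IP NETMASK
# Prints a line in the "<network address> <netmask address>" format
# used by /etc/inet/netmasks.
netmasks_line() {
    echo "$(network_address "$1" "$2") $2"
}
```

Calling netmasks_line 101.139.1.151 255.255.254.0 prints 101.139.0.0 255.255.254.0, the same line as above.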

PS:
If you need to change ip/netmask temporarily using ifconfig on solaris, use the following command:
ifconfig qfe1 101.139.1.151 netmask 255.255.254.0 broadcast + up

5)Now reboot your server and then use ifconfig -a and netstat -rnv to confirm everything is working as expected.

PS:
If you encounter the errors below when booting solaris, there may be a problem with the network configuration on your host. Consider booting into single user mode and changing the network configuration as detailed in this article.

Setting /dev/arp arp_cleanup_interval to 60000
Setting /dev/ip ip_forward_directed_broadcasts to 0
Setting /dev/ip ip_forward_src_routed to 0
Setting /dev/ip ip_ignore_redirect to 1
Setting /dev/ip ip_respond_to_address_mask_broadcast to 0
Setting /dev/ip ip_respond_to_echo_broadcast to 0
Setting /dev/ip ip_respond_to_timestamp to 0
Setting /dev/ip ip_respond_to_timestamp_broadcast to 0
Setting /dev/ip ip_send_redirects to 0
Setting /dev/ip ip_strict_dst_multihoming to 1
Setting /dev/ip ip_def_ttl to 255
Setting /dev/tcp tcp_conn_req_max_q0 to 4096
Setting /dev/tcp tcp_conn_req_max_q to 1024
Setting /dev/tcp tcp_smallest_anon_port to 32768
Setting /dev/tcp tcp_largest_anon_port to 65535
Setting /dev/udp udp_smallest_anon_port to 32768
Setting /dev/udp udp_largest_anon_port to 65535
Setting /dev/tcp tcp_smallest_nonpriv_port to 1024
Setting /dev/udp udp_smallest_nonpriv_port to 1024
Setting /dev/ip ip_ire_arp_interval to 60000
Setting /dev/tcp tcp_extra_priv_ports_add to 6112
Setting /dev/tcp tcp_rev_src_routes to 0

Categories: Networking Security, Unix Tags:

what is fence or fencing device

May 16th, 2012 No comments

To understand what a fencing device is, you first need to know something about the split-brain condition. Read here for info: http://linux-ha.org/wiki/Split_Brain

Here’s something about what a fence device is:

Fencing is the disconnection of a node from shared storage. Fencing cuts off I/O from shared storage, thus ensuring data integrity. A fence device is a hardware device that can be used to cut a node off from shared storage. This can be accomplished in a variety of ways: powering off the node via a remote power switch, disabling a Fibre Channel switch port, or revoking a host’s SCSI 3 reservations. A fence agent is a software program that connects to a fence device in order to ask the fence device to cut off access to a node’s shared storage (via powering off the node or removing access to the shared storage by other means).

To check whether a LUN has SCSI-3 Persistent Reservation enabled (here on an EMC Symmetrix array, using symdev), run the following:

root@doxer# symdev -sid 369 show 2040|grep SCSI
SCSI-3 Persistent Reserve: Enabled

And here’s an article about I/O fencing using SCSI-3 Persistent Reservations in the configuration of SF Oracle RAC: http://sfdoccentral.symantec.com/sf/5.0/solaris64/html/sf_rac_install/sfrac_intro13.html

Categories: HA & HPC, Hardware, NAS, SAN, Storage Tags:

differences between freezing vcs system and freezing service group

May 16th, 2012 No comments

In veritas vcs, freezing a system prevents service groups from coming online on that system when they fail over from another node in the cluster. It does not prevent faults from failing any service group that is already online on the system.

To prevent veritas from intervening on faults caused by expected changes (even if the symptoms are unexpected), we usually freeze the service group. This prevents any online, clean or restart operation from kicking in when a fault is detected.

After your modification on vcs, you need to check that resources are not autodisabled and make sure that the config is made read-only (ro) again.

Here are the steps to freeze a service group in vcs ($i is the name of the service group; repeat the hagrp line for each group you want to freeze):
/opt/VRTS/bin/haconf -makerw
mkdir /var/tmp/veritas_config_backup_`date +%F`
cp -R /etc/VRTSvcs /var/tmp/veritas_config_backup_`date +%F`
/opt/VRTS/bin/hagrp -freeze $i -persistent
/opt/VRTS/bin/haconf -dump -makero
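
When several service groups have to be frozen, the hagrp line above can be wrapped in a loop. The following is only a sketch under a few assumptions: the helper names run and freeze_groups and the variables HA_BIN and DRY_RUN are invented for this example, and by default it only prints the commands (dry run) so you can review them before running anything against the cluster.

```shell
#!/bin/sh
# Directory holding the VCS binaries (same path as in the steps above).
HA_BIN=${HA_BIN:-/opt/VRTS/bin}

# run CMD ARGS...
# Prints the command when DRY_RUN is 1 (the default here), otherwise executes it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# freeze_groups GROUP...
# Opens the config read-write, freezes every named group persistently,
# then dumps the config and makes it read-only again.
freeze_groups() {
    run "$HA_BIN/haconf" -makerw
    for i in "$@"; do
        run "$HA_BIN/hagrp" -freeze "$i" -persistent
    done
    run "$HA_BIN/haconf" -dump -makero
}
```

For example, freeze_groups sg_a sg_b (sg_a and sg_b are placeholder group names) prints the four commands it would run. Setting DRY_RUN=0 executes them instead.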

Categories: HA, HA & HPC Tags: ,

extend lvm

May 10th, 2012 No comments

We added a new hard disk and need to add it to an existing VG.

Take sdb as the newly added disk.

Step1: Format new disk and label as LVM disk
fdisk /dev/sdb
n -> p -> 1 -> return twice -> t -> 8e -> w

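Step2: (Skip this) Do not create a filesystem on the new partition. It becomes a PV in the next step, and the filesystem that lives on the logical volume is grown in Step6, so the original mkfs.ext3 /dev/sdb1 is not needed.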

Step3: Create as PV
pvcreate /dev/sdb1

Step4: Extend existing VG to new disk
vgextend VolGroup00 /dev/sdb1
Then you will have more free space in the VG. Check with:
vgs

Step5: Extend the Vol. Use one of the two forms: set the new total size, or add space with a leading +
lvextend -L total_space_for_vol_g /dev/VolGroup00/LogVol00
lvextend -L +space_number_to_add_g /dev/VolGroup00/LogVol00

Step6: Online resize the Vol
resize2fs /dev/VolGroup00/LogVol00

Categories: Storage Tags:

change ldap client to bind to another ldap server

May 10th, 2012 No comments

If you want to change a ldap client (linux) to bind to another ldap server, here are the basic steps:

1.update /etc/ldap.conf to change where sudoers is authenticating (note that in this setup /etc/ldap.conf only controls sudoers)
From:
uri ldap://ldapserver1/ ldap://ldapserver2/
To:
uri ldap://ldapserver2/ ldap://ldapserver1/

2.update /etc/openldap/ldap.conf to change where logins are authenticating
From:
uri ldap://ldapserver1/ ldap://ldapserver2/
To:
uri ldap://ldapserver2/ ldap://ldapserver1/

3.restart nscd
/etc/init.d/nscd restart #or invalidate the cached tables, e.g. nscd -i passwd and nscd -i group
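
The From/To edit in steps 1 and 2 is the same in both files: the first two servers on the uri line trade places. As a rough sketch, this can be done with awk. The function name swap_ldap_uris is hypothetical, and it writes the result to stdout rather than editing the file in place, so you can review it before replacing the original.

```shell
#!/bin/sh
# swap_ldap_uris FILE
# Prints FILE with the first two servers on each "uri" line swapped.
# All other lines are printed unchanged. (Hypothetical helper.)
swap_ldap_uris() {
    awk '$1 == "uri" && NF >= 3 { t = $2; $2 = $3; $3 = t } { print }' "$1"
}
```

A possible way to use it: swap_ldap_uris /etc/ldap.conf > /tmp/ldap.conf.new, compare the two files with diff, then copy the new file into place.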

NB:

For a ldap client running solaris, you’ll need to know about the commands ldap_cachemgr and ldapclient and how they work.

Retrieve the contents of a hard disk

May 8th, 2012 No comments

A disk may report unrecoverable hardware read errors; for example, /var/log/kern.log may show:

Oct 30 11:22:52 ipbox kernel: hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Oct 30 11:22:52 ipbox kernel: hdc: dma_intr: error=0x40 { UncorrectableError }, LBAsect=11012416, sector=11012416
Oct 30 11:22:52 ipbox kernel: end_request: I/O error, dev hdc, sector 11012416
Oct 30 11:22:52 ipbox kernel: Buffer I/O error on device hdc, logical block 1376552
You can attempt to recover whatever is still readable with:

dd if=/dev/hdc1 of=/home/ipbox/hdc1.dmp conv=noerror,sync
fsck.ext3 -f /home/ipbox/hdc1.dmp
mount -o loop,ro -t ext3 /home/ipbox/hdc1.dmp /mnt
The first command dumps the entire partition into a file; it does not stop on errors and writes zeros in place of unreadable sectors. The second command tries to repair the filesystem contained in the saved image, and the third command mounts the image file read-only as a filesystem. The example assumes an ext3 filesystem.

ddrescue

The utility ddrescue (Debian package of the same name) can replace dd in more difficult cases. The program keeps a log file of the read operations:

ddrescue --no-split /dev/sdb /tmp/sdb.img /tmp/sdb.log
To retry reading only the sectors that had trouble:

ddrescue --direct --max-retries=3 /dev/sdb /tmp/sdb.img /tmp/sdb.log
A further parameter to force rereading is --retrim.

Categories: Hardware, Storage Tags:

basic knowledge for netmask hexadecimal decimal binary netmask cidr calculator

May 3rd, 2012 No comments

Firstly, let’s get familiar with the hexadecimal/decimal/binary forms of a netmask on linux/windows, such as FF.FF.FF.FE, 255.255.255.254 and 11111111.11111111.11111111.11111110, which are all identical.

F (hexadecimal) equals 15 (decimal) and 1111 (binary); E (hexadecimal) equals 14 (decimal) and 1110 (binary). Convert every F to 1111 and E to 1110, and FF.FF.FF.FE turns out to be 11111111.11111111.11111111.11111110. As 11111111 (binary) equals 255 (decimal) and 11111110 (binary) equals 254 (decimal), 11111111.11111111.11111111.11111110 is 255.255.255.254. As there are 31 bits for the network and only 1 bit for the host, the CIDR notation for FF.FF.FF.FE is xxx.xxx.xxx.xxx/31.
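
The conversion above can be checked with a few lines of shell. This is a minimal sketch: the function names hex_to_dotted and prefix_length are made up for this illustration, and prefix_length simply counts the 1 bits, so it assumes a valid netmask (all 1 bits first, then all 0 bits).

```shell
#!/bin/sh
# hex_to_dotted HEXMASK
# Converts a dotted hexadecimal netmask such as FF.FF.FF.FE to dotted decimal.
# (Hypothetical helper; runs in a subshell so IFS does not leak.)
hex_to_dotted() (
    IFS=.
    set -- $1           # $1..$4 = the four hexadecimal octets
    echo "$(( 0x$1 )).$(( 0x$2 )).$(( 0x$3 )).$(( 0x$4 ))"
)

# prefix_length DOTTEDMASK
# Counts the 1 bits of a dotted decimal netmask, which is the CIDR prefix.
prefix_length() (
    IFS=.
    set -- $1
    bits=0
    for octet in "$@"; do
        while [ "$octet" -gt 0 ]; do
            bits=$(( bits + (octet & 1) ))   # add the lowest bit
            octet=$(( octet >> 1 ))          # then shift it away
        done
    done
    echo "$bits"
)
```

With the mask from the text, hex_to_dotted FF.FF.FF.FE prints 255.255.255.254 and prefix_length 255.255.255.254 prints 31.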

Then, let’s talk about the relationship between IP address, netmask, network address, broadcast address, and max hosts in one subnet

Given an IP address and a netmask, we can calculate the network address, the broadcast address, and the maximum number of hosts in that subnet. For example, if the IP is 192.168.1.28 and the netmask is 255.255.255.240, then 256-240=16, which means each subnet holds 16 addresses (at most 14 usable hosts, since the first address is the network address and the last is the broadcast address). 192.168.1.28 falls in the range 192.168.1.16 ~ 192.168.1.31, so its network address is 192.168.1.16 and its broadcast address is 192.168.1.31. (The network address is always the first address of the subnet, and its last octet must be a whole-number multiple of the block size, here 16.)
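As a quick cross-check of the arithmetic, here is a small Python sketch that does the last-octet calculation by hand and then compares it with the standard ipaddress module. The manual part assumes the first three octets of the netmask are 255, as in this example. Note that 2 of the 16 addresses in the block are the network and broadcast addresses, so 14 remain for hosts.

```python
import ipaddress

ip = "192.168.1.28"
netmask = "255.255.255.240"

# Manual arithmetic on the last octet (only valid when the mask is 255.255.255.x).
block_size = 256 - int(netmask.split(".")[3])                              # 256 - 240
manual_network_octet = int(ip.split(".")[3]) // block_size * block_size   # round 28 down
manual_broadcast_octet = manual_network_octet + block_size - 1

# The same result from the standard library.
net = ipaddress.ip_interface(f"{ip}/{netmask}").network
usable_hosts = net.num_addresses - 2   # minus network and broadcast addresses

print("network:  ", net.network_address)
print("broadcast:", net.broadcast_address)
print("prefix:    /%d" % net.prefixlen)
print("usable hosts:", usable_hosts)
```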
NB:
  • We can confirm the example above with the help of ipcalc (which is installed by default on RHEL / CentOS / Fedora Linux as part of the initscripts package):

doxer@doxer ~ $ ipcalc -c 192.168.1.28/255.255.255.240 #or you can use ipcalc -pnbm 192.168.1.28 255.255.255.240
Address: 192.168.1.28 11000000.10101000.00000001.0001 1100
Netmask: 255.255.255.240 = 28 11111111.11111111.11111111.1111 0000
Wildcard: 0.0.0.15 00000000.00000000.00000000.0000 1111
=>
Network: 192.168.1.16/28 11000000.10101000.00000001.0001 0000
HostMin: 192.168.1.17 11000000.10101000.00000001.0001 0001
HostMax: 192.168.1.30 11000000.10101000.00000001.0001 1110
Broadcast: 192.168.1.31 11000000.10101000.00000001.0001 1111
Hosts/Net: 14 Class C, Private Internet

  • Here’s a URL with a table of common SubNet Mask <-> Hex SubNet Mask <-> CIDR <-> Bit Mask <-> Quantity in Range values:

http://www.shorewall.com.au/contrib/IPSubNetMask.html

Categories: Network, Networking Security Tags:

ilom or alom ip address reassignment howto

May 3rd, 2012 No comments

Here are the steps to reassign the IP address of an ilom or alom system console (out-of-band access):

  • Log on to the destination host’s system console, either through the system’s console port address or by jumping from the KVM connected to the host
  • After logging on to the system console, run showsc to check the current settings before starting, for example on my host:

doxer_con> showsc
Advanced Lights Out Manager CMT v1.1.8

parameter value
--------- -----
if_network true
if_modem false
if_emailalerts true
netsc_dhcp false
netsc_ipaddr 192.168.52.164
netsc_ipnetmask 255.255.255.0
netsc_ipgateway 192.168.52.254
mgt_mailhost 172.20.2.231
mgt_mailalert(1) [email protected] 2
sc_customerinfo doxer
sc_escapechars #.
sc_powerondelay true
sc_powerstatememory false
sc_clipasswdecho true
sc_cliprompt doxer_con
sc_clitimeout 0
sc_clieventlevel 3
sc_backupuserdata true
diag_trigger power-on-reset error-reset
diag_verbosity normal
diag_level min
diag_mode normal
sys_autorunonerror false
ser_baudrate 9600
ser_parity none
ser_stopbits 1
ser_data 8
netsc_enetaddr 00:14:4f:7e:24:59
sys_enetaddr 00:14:4f:7e:24:50
doxer_con>

  • Now set the new values according to your needs:

setsc netsc_ipaddr
setsc netsc_ipnetmask
setsc netsc_ipgateway
setsc if_connection ssh

  • Confirm everything is what you want with showsc
  • Now reset the system controller with resetsc -y to make the changes take effect
  • Once the system controller has rebooted, check that you can ssh to it and log in as usual
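Before typing the new values into the console, it can help to sanity-check them offline. Below is a minimal Python sketch that verifies the gateway sits in the same subnet as the new address and then prints the three setsc lines from the steps above. The helper name build_setsc_commands and the 192.0.2.x addresses are hypothetical placeholders; substitute your own values.

```python
import ipaddress


def build_setsc_commands(ip, netmask, gateway):
    """Return the setsc lines for the given values, or raise if they disagree."""
    network = ipaddress.ip_interface(f"{ip}/{netmask}").network
    if ipaddress.ip_address(gateway) not in network:
        raise ValueError(f"gateway {gateway} is not inside {network}")
    return [
        f"setsc netsc_ipaddr {ip}",
        f"setsc netsc_ipnetmask {netmask}",
        f"setsc netsc_ipgateway {gateway}",
    ]


# Hypothetical example values from the documentation address range.
commands = build_setsc_commands("192.0.2.10", "255.255.255.0", "192.0.2.1")
for line in commands:
    print(line)
```

A mismatched gateway is an easy way to lose out-of-band access after resetsc, so catching it before the reset is cheaper than a trip to the serial port.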

NB:

For more info about alom/ilom/openboot prom commands, please read the post “alom/ilom/openboot prom commands help”.

Categories: Hardware, Servers, Unix Tags:

requiretty in sudoers file will break functioning of accounts without tty

May 2nd, 2012 No comments

Excerpted from /etc/sudoers:

Defaults requiretty

#
# Refuse to run if unable to disable echo on the tty. This setting should also be
# changed in order to be able to use sudo without a tty. See requiretty above.
#

This means that if you have created an account without a tty and you want that user to have the privileges to run some sudo commands, this setting (Defaults requiretty) will prevent the account from executing those sudo commands.

To fix this, you can do the following:

  1. Disable “Defaults requiretty” in the /etc/sudoers file
  2. Change the lookup order in nsswitch.conf to “ldap files” rather than “files ldap”
  3. Better yet, don’t enable local sudoers at all
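To find out whether a sudoers file still has requiretty switched on, a small Python sketch like the following can scan its text. It is a simplified check under stated assumptions: it looks only at uncommented Defaults lines, ignores include directives and line continuations, and treats “!requiretty” as the disabled form. The function name and the account name batchuser are made up for the example.

```python
import re


def active_requiretty_lines(sudoers_text):
    """Return 1-based numbers of uncommented Defaults lines that enable requiretty."""
    hits = []
    for number, line in enumerate(sudoers_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#") or not stripped.startswith("Defaults"):
            continue
        # "!requiretty" turns the flag off, so only match the un-negated word
        if re.search(r"(?<!!)\brequiretty\b", stripped):
            hits.append(number)
    return hits


sample = """\
Defaults requiretty
# Defaults requiretty
Defaults:batchuser !requiretty
"""
found = active_requiretty_lines(sample)
print(found)
```

The third sample line shows a narrower alternative to step 1: sudoers allows per-user Defaults, so requiretty can be negated for just the tty-less account instead of being removed globally.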