Too many cron jobs and crond processes running

February 17th, 2012

I ran into a problem where a large number of crond processes (spawned for cron jobs) were running on the OS:

root@localhost# ps auxww|grep cron
vare 543 0.0 0.0 141148 5904 ? S 01:43 0:00 crond
root 4085 0.0 0.0 72944 976 ? Ss 2010 1:13 crond
vare 4522 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 5446 0.0 0.0 141148 5904 ? S 02:43 0:00 crond
vare 9202 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 10245 0.0 0.0 141148 5908 ? S 03:43 0:00 crond
vare 13989 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 15487 0.0 0.0 141148 5908 ? S 04:43 0:00 crond
vare 18796 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 20448 0.0 0.0 141148 5908 ? S 05:43 0:00 crond
root 23168 0.0 0.0 6024 596 pts/0 S+ 06:15 0:00 grep cron
vare 23474 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 27183 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 28358 0.0 0.0 141148 5904 ? S 00:43 0:00 crond
vare 32032 0.0 0.0 141148 5904 ? S Feb16 0:00 crond

.....(and more)

Now let's see which cron jobs are configured for user vare:
root@localhost# crontab -u vare -l
# run the VERA Deploy routine
43 * * * * cd /share/scripts > /dev/null 2>&1 ; sleep 5 ; /share/scripts/Application/VARE/Deploy > /dev/null 2>&1

After checking the script /share/scripts/Application/VARE/Deploy, I could see that the job first changes directory to an NFS mount point (i.e. cd /share/scripts) and then runs some checks (i.e. /share/scripts/Application/VARE/Deploy). But there was a problem with the NFS mount, so each run hung while accessing the mount point and never exited. As such, the number of crond processes kept increasing, one more for every hourly run.

The fix for this specific problem (specific, because you have to check your own script) is to first kill the hung crond processes, then bounce autofs, and then restart crond.
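
To keep a hung job from piling up again, the cron command can be wrapped in a lock and a time limit. The sketch below is not part of the original setup: it assumes a Linux host with flock (util-linux) and timeout (GNU coreutils) installed, and the helper name run_guarded is made up for illustration.

```shell
#!/bin/sh
# run_guarded LOCKFILE SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND only if no earlier run still holds
# LOCKFILE, and asks timeout to terminate it after SECONDS.
run_guarded() {
    lock=$1
    limit=$2
    shift 2
    # flock -n gives up immediately (exit status 1) when the lock is
    # already held, so a stuck run blocks new ones instead of letting
    # them accumulate.
    flock -n "$lock" timeout "$limit" "$@"
}
```

In a crontab the same idea can be written inline, e.g. flock -n /var/tmp/deploy.lock timeout 600 /share/scripts/Application/VARE/Deploy (lock path and limit are placeholders). Note that timeout cannot kill a process that is blocked in uninterruptible I/O on a hard NFS mount; in that case the lock is still what stops new copies from stacking up.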


httpd installation upgrade tips

February 17th, 2012

1.different files after configure, make, make install

Firstly, read the following link to get the elementary knowledge about configure/make/make install

Here's the comparison after/before running configure:

[root@test ~]# diff configure.after configure.before
< Makefile
< config.log
< config.nice
< config.status
< modules.c

This means that configure generated 5 files: Makefile, config.log, config.nice, config.status and modules.c.

Here's the comparison after/before running make:

[root@test ~]# diff make.after configure.after
< buildmark.o
< httpd
< modules.lo
< modules.o

This means buildmark.o, httpd, modules.lo, modules.o were generated after make command.
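
The post does not show how the configure.before/configure.after files were produced. One plausible way, sketched here under that assumption, is to save a sorted directory listing before and after each step and compare the two. The helper names snapshot and new_files are made up for illustration.

```shell
#!/bin/sh
# snapshot DIR: print a sorted listing of the top level of DIR.
snapshot() {
    ls -1A "$1" | LC_ALL=C sort
}

# new_files BEFORE AFTER: print names that appear in AFTER but not in BEFORE.
new_files() {
    # comm -13 hides lines unique to BEFORE and lines common to both files
    LC_ALL=C comm -13 "$1" "$2"
}

# Typical use inside the unpacked source tree:
#   snapshot . > /tmp/configure.before
#   ./configure
#   snapshot . > /tmp/configure.after
#   new_files /tmp/configure.before /tmp/configure.after
```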


2.config.nice(with --prefix option to put the new one to some other place)

If you installed httpd by compiling the source package (i.e. download/unpack/configure/make/make install) and you haven't removed the unpacked source tree (especially the config.nice file in it), then there's a bit of magic available when you want to upgrade httpd: config.nice! It is a script that records the options you passed to configure, so running it in the new source tree repeats the same configuration.

Read this article to easily upgrade httpd with all your selected options before.
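
As a rough sketch of that upgrade flow: copy config.nice from the old source tree into the new one, run it, then build. Extra arguments given to config.nice are appended to the recorded configure options, which is how --prefix can send the new build elsewhere. The function name upgrade_httpd and its arguments are hypothetical, and the directories are whatever your old and new source trees are.

```shell
#!/bin/sh
# upgrade_httpd OLD_SRC NEW_SRC [PREFIX]
# Hypothetical helper: reuse the configure options recorded in
# OLD_SRC/config.nice to build the new source tree in NEW_SRC.
upgrade_httpd() {
    old_src=$1
    new_src=$2
    prefix=${3:-}
    if [ ! -f "$old_src/config.nice" ]; then
        echo "no config.nice found in $old_src" >&2
        return 1
    fi
    cp "$old_src/config.nice" "$new_src/" || return 1
    (
        cd "$new_src" || exit 1
        if [ -n "$prefix" ]; then
            # the extra argument is appended to the recorded options
            ./config.nice --prefix="$prefix"
        else
            ./config.nice
        fi && make && make install
    )
}
```

Installing under a different prefix first lets you test the new build while the old one keeps running.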


add oracle under vcs control howto – using veritas vxvm filesystem

February 14th, 2012

In this example, I'm gonna add oracle and oracle listener under vcs control.

haconf -makerw #change the VCS configuration to read-write mode

hagrp -add SG_myoracle #add service group

hagrp -modify SG_myoracle SystemList  host3 0 host4 1 host2 2 host1 3

hagrp -modify SG_myoracle AutoStartList  host1 host2 host3 host4 #List of systems on which, under specific conditions, the service group will be started with VCS (usually at system boot). For example, if a system is a member of a failover service group's AutoStartList attribute, and if the service group is not already running on another system in the cluster, the group is brought online when the system is started.

hagrp -modify SG_myoracle SourceFile "./"

hares -add dg_myoracle DiskGroup SG_myoracle #add disk group
hares -modify dg_myoracle Critical 0
hares -modify dg_myoracle DiskGroup myoracle
hares -modify dg_myoracle PanicSystemOnDGLoss 0
hares -modify dg_myoracle StartVolumes 1
hares -modify dg_myoracle StopVolumes 1
hares -modify dg_myoracle MonitorReservation 0
hares -modify dg_myoracle tempUseFence INVALID
hares -modify dg_myoracle DiskGroupType private
hares -modify dg_myoracle Enabled 1
hares -add vip_myoracle IP SG_myoracle #add vip
hares -modify vip_myoracle Critical 0
hares -local vip_myoracle Device
hares -modify vip_myoracle Device bond0 -sys host1
hares -modify vip_myoracle Device bond0 -sys host2
hares -modify vip_myoracle Device bond0 -sys host3
hares -modify vip_myoracle Device bond0 -sys host4
hares -modify vip_myoracle Address ""
hares -modify vip_myoracle NetMask ""
hares -modify vip_myoracle Enabled 1
hares -add mnt_myoracle Mount SG_myoracle #add mount point resource
hares -modify mnt_myoracle Critical 0
hares -modify mnt_myoracle MountPoint "/myoracle"
hares -modify mnt_myoracle BlockDevice "/dev/vx/dsk/myoracle/myoracleroot"
hares -modify mnt_myoracle FSType vxfs
hares -modify mnt_myoracle MountOpt largefiles
hares -modify mnt_myoracle FsckOpt "%-y"
hares -modify mnt_myoracle SnapUmount 0
hares -modify mnt_myoracle CkptUmount 1
hares -modify mnt_myoracle SecondLevelMonitor 0
hares -modify mnt_myoracle SecondLevelTimeout 30
hares -modify mnt_myoracle VxFSMountLock 0
hares -modify mnt_myoracle Enabled 1
hares -add mnt_myoracle-ora Mount SG_myoracle #add another mount point resource
hares -modify mnt_myoracle-ora Critical 0
hares -modify mnt_myoracle-ora MountPoint "/myoracle/ora"
hares -modify mnt_myoracle-ora BlockDevice "/dev/vx/dsk/myoracle/myoracle-ora"
hares -modify mnt_myoracle-ora FSType vxfs
hares -modify mnt_myoracle-ora MountOpt largefiles
hares -modify mnt_myoracle-ora FsckOpt "%-y"
hares -modify mnt_myoracle-ora SnapUmount 0
hares -modify mnt_myoracle-ora CkptUmount 1
hares -modify mnt_myoracle-ora SecondLevelMonitor 0
hares -modify mnt_myoracle-ora SecondLevelTimeout 30
hares -modify mnt_myoracle-ora VxFSMountLock 0
hares -modify mnt_myoracle-ora Enabled 1
hares -add lsnr_myoracle Netlsnr SG_myoracle #add listener resource
hares -modify lsnr_myoracle Critical 0
hares -modify lsnr_myoracle Owner oracle
hares -modify lsnr_myoracle Home "/ora/product/"
hares -modify lsnr_myoracle TnsAdmin "/myoracle/ora/admin/etc"
hares -modify lsnr_myoracle Listener LISTENER_myoracle
hares -modify lsnr_myoracle MonScript "./bin/Netlsnr/"
hares -modify lsnr_myoracle AgentDebug 0
hares -modify lsnr_myoracle Enabled 1

hares -add myoracle Oracle SG_myoracle #add oracle resource
hares -modify myoracle Critical 0
hares -modify myoracle Sid myoracle
hares -modify myoracle Owner oracle
hares -modify myoracle Home "/ora/product/"
hares -modify myoracle Pfile "/myoracle/ora/admin/myoracle/pfile/initmyoracle.ora"
hares -modify myoracle StartUpOpt STARTUP
hares -modify myoracle ShutDownOpt IMMEDIATE
hares -modify myoracle AutoEndBkup 1
hares -modify myoracle MonScript "./bin/Oracle/"
hares -modify myoracle AgentDebug 0
hares -modify myoracle Enabled 1

hares -add proxy_mnic_myoracle Proxy SG_myoracle #add proxy resource
hares -modify proxy_mnic_myoracle Critical 0
hares -modify proxy_mnic_myoracle TargetResName mnic
hares -modify proxy_mnic_myoracle Enabled 1

#Now do the dependencies (hares -link parent child: the parent resource depends on the child)

hares -link mnt_myoracle dg_myoracle

hares -link mnt_myoracle-ora mnt_myoracle

hares -link myoracle mnt_myoracle-ora

hares -link vip_myoracle proxy_mnic_myoracle

hares -link lsnr_myoracle vip_myoracle

hares -link lsnr_myoracle mnt_myoracle-ora

haconf -dump -makero #Write the configuration to disk and remove the designation stale. -makero changes the VCS mode to read-only.
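
The per-system Device lines above repeat once per host, so with more hosts it may be easier to generate them. This is only a sketch: print_device_cmds is a made-up helper, and it prints the commands for review instead of running them, so it can be tried on a machine without VCS.

```shell
#!/bin/sh
# print_device_cmds RESOURCE DEVICE HOST...
# Prints (does not run) the commands that localize the Device attribute
# of RESOURCE and then set it on each host.
print_device_cmds() {
    res=$1
    dev=$2
    shift 2
    echo "hares -local $res Device"
    for host in "$@"; do
        echo "hares -modify $res Device $dev -sys $host"
    done
}

print_device_cmds vip_myoracle bond0 host1 host2 host3 host4
```

Once the printed lines look right, they can be pasted into the session between haconf -makerw and haconf -dump -makero.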


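If your system already has other service groups configured, then hacf -cftocmd is your friend: run against the configuration directory, it converts the existing main.cf into the equivalent command lines (main.cmd), which you can use as a template.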


luxadm forcelip/display on solaris 10

February 3rd, 2012

Now let's talk about luxadm forcelip/display on solaris. Pay attention to the commented lines. This article is a little long and all about cXtXdXsX device names, so be patient. :D
testhost:root root # vxprint -ht|grep dm #list the disks (dm records) as VxVM sees them:
dm emc333263A c1t5006048452A70F7Cd231s2 auto 65536 212055808 -
dm emc3330DA8 c1t5006048452A70F7Cd232s2 auto 65536 17609728 -
dm emc3332646 c1t5006048452A70F7Cd230s2 auto 65536 70640128 -

testhost:root root # luxadm probe #this will probe for SAN disks and their multiple paths
No Network Array enclosures found in /dev/es
Found Fibre Channel device(s):
Node WWN:5006048452a70f7c Device Type:Disk device
Logical Path:/dev/rdsk/c1t5006048452A70F7Cd230s2 #the OS disk's wwn
Node WWN:5006048452a70f7c Device Type:Disk device
Logical Path:/dev/rdsk/c1t5006048452A70F7Cd231s2
Node WWN:5006048452a70f7c Device Type:Disk device
Logical Path:/dev/rdsk/c1t5006048452A70F7Cd232s2
Node WWN:5006048452a70f43 Device Type:Disk device
Logical Path:/dev/rdsk/c3t5006048452A70F43d230s2
Node WWN:5006048452a70f43 Device Type:Disk device
Logical Path:/dev/rdsk/c3t5006048452A70F43d231s2
Node WWN:5006048452a70f43 Device Type:Disk device
Logical Path:/dev/rdsk/c3t5006048452A70F43d232s2

From the output of luxadm probe, we know that there are two controllers, c1 and c3. We can confirm this with cfgadm:

testhost:root root # cfgadm -la|grep fabric
c1 fc-fabric connected configured unknown
c3 fc-fabric connected configured unknown
testhost:root root # fcinfo hba-port -l
HBA Port WWN: 210000e08b18da4f #this is the wwn for hba
OS Device Name: /dev/cfg/c1 #device name for the hba
Manufacturer: QLogic Corp.
Model: 375-3102-xx
Firmware Version: 03.03.28
FCode/BIOS Version: fcode: 1.13;
Type: N-port
State: online
Supported Speeds: 1Gb 2Gb
Current Speed: 2Gb
Node WWN: 200000e08b18da4f
Link Error Statistics:
Link Failure Count: 0
Loss of Sync Count: 0
Loss of Signal Count: 0
Primitive Seq Protocol Error Count: 0
Invalid Tx Word Count: 0
Invalid CRC Count: 0
HBA Port WWN: 210000e08b18024f
OS Device Name: /dev/cfg/c3
Manufacturer: QLogic Corp.
Model: 375-3102-xx
Firmware Version: 03.03.28
FCode/BIOS Version: fcode: 1.13;
Type: N-port
State: online
Supported Speeds: 1Gb 2Gb
Current Speed: 2Gb
Node WWN: 200000e08b18024f
Link Error Statistics:
Link Failure Count: 0
Loss of Sync Count: 1
Loss of Signal Count: 1
Primitive Seq Protocol Error Count: 0
Invalid Tx Word Count: 0
Invalid CRC Count: 0

To display information on remote targets (including the storage manufacturer, the storage product type, WWPNs, and all of the SCSI targets that have been presented to the host):
testhost:root root # fcinfo remote-port -slp 210000e08b18024f #which luns are seen by hba 210000e08b18024f?
Remote Port WWN: 5006048452a70f43
Active FC4 Types: SCSI
SCSI Target: yes
Node WWN: 5006048452a70f43
Link Error Statistics:
Link Failure Count: 0
Loss of Sync Count: 1
Loss of Signal Count: 0
Primitive Seq Protocol Error Count: 0
Invalid Tx Word Count: 255
Invalid CRC Count: 0
LUN: 230
Vendor: EMC
OS Device Name: /dev/rdsk/c3t5006048452A70F43d230s2
LUN: 231
Vendor: EMC
OS Device Name: /dev/rdsk/c3t5006048452A70F43d231s2
LUN: 232
Vendor: EMC
OS Device Name: /dev/rdsk/c3t5006048452A70F43d232s2
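
Since everything here hinges on reading cXtXdXsX names, a small sketch that splits such a name into its parts may help. It assumes the fabric-style names shown in this article, where the target is an uppercase WWN; parse_ctds is a made-up helper name.

```shell
#!/bin/sh
# parse_ctds NAME: split a cXtWWNdLUNsSLICE device name into its parts.
# Prints nothing if NAME does not have that shape.
parse_ctds() {
    printf '%s\n' "$1" | LC_ALL=C sed -n \
        's/^c\([0-9][0-9]*\)t\([0-9A-F][0-9A-F]*\)d\([0-9][0-9]*\)s\([0-9][0-9]*\)$/controller=c\1 target=\2 lun=\3 slice=\4/p'
}

parse_ctds c1t5006048452A70F7Cd231s2
parse_ctds c3t5006048452A70F43d231s2
```

The two names differ in controller and target port WWN but share the same LUN number, which matches the fcinfo output above where LUN 231 maps to the d231 device.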

To display WWN data for a target device or host bus adapter on the specified fibre channel port:
testhost:root root # luxadm -e port
/devices/pci@1e,600000/SUNW,qlc@3/fp@0,0:devctl CONNECTED
/devices/pci@1d,700000/SUNW,qlc@1/fp@0,0:devctl CONNECTED
testhost:root root # luxadm -e dump_map /devices/pci@1e,600000/SUNW,qlc@3/fp@0,0:devctl
Pos Port_ID Hard_Addr Port WWN Node WWN Type
0 10300 0 5006048452a70f7c 5006048452a70f7c 0x0 (Disk device)
1 15500 0 210000e08b18da4f 200000e08b18da4f 0x1f (Unknown Type,Host Bus Adapter)
Here's the multipath info:
testhost:root root # vxdmpadm getctlr all
c1 /pci@1e,600000/SUNW,qlc@3/fp@0,0 QLogic Corp. 21:00:00:e0:8b:18:da:4f
c3 /pci@1d,700000/SUNW,qlc@1/fp@0,0 QLogic Corp. 21:00:00:e0:8b:18:02:4f
c0 /pci@1c,600000/scsi@2 - -
Here's the multipath info for a specific disk(c1t5006048452A70F7Cd231s2):

testhost:root root # vxdisk list c1t5006048452A70F7Cd231s2

Device: c1t5006048452A70F7Cd231s2
devicetag: c1t5006048452A70F7Cd231
type: auto
hostid: testhost
disk: name=emc333263A id=1277720253.8.testhost
group: name=tpdbrdbd01root-dg id=1277720279.10.testhost
info: format=cdsdisk,privoffset=256,pubslice=2,privslice=2
flags: online ready private autoconfig autoimport imported
pubpaths: block=/dev/vx/dmp/c1t5006048452A70F7Cd231s2 char=/dev/vx/rdmp/c1t5006048452A70F7Cd231s2
guid: {5da11fa8-1dd2-11b2-ab51-0003ba89d76a}
udid: EMC%5FSYMMETRIX%5F000290102333%5F33!G+000
site: -
version: 3.1
iosize: min=512 (bytes) max=2048 (blocks)
public: slice=2 offset=65792 len=212055808 disk_offset=0
private: slice=2 offset=256 len=65536 disk_offset=0
update: time=1277829173 seqno=0.11
ssb: actual_seqno=0.0
headers: 0 240
configs: count=1 len=48144
logs: count=1 len=7296
Defined regions:
config priv 000048-000239[000192]: copy=01 offset=000000 enabled
config priv 000256-048207[047952]: copy=01 offset=000192 enabled
log priv 048208-055503[007296]: copy=01 offset=000000 enabled
lockrgn priv 055504-055647[000144]: part=00 offset=000000
Multipathing information:
numpaths: 2
c1t5006048452A70F7Cd231s2 state=enabled
c3t5006048452A70F43d231s2 state=enabled

To read more info:
1.Add and Configure LUNs in Solaris
2.man page for luxadm
3.man page for fcinfo
4.Command cheat sheet:
/usr/sbin/cfgadm -la |grep fabric #solaris, check Fibre Channel controller status
fcinfo hba-port -l #check hba information, like Qlogic, Emulex
/usr/sbin/lpfc/lputil #Emulex HBAs are not seen in cfgadm -al output. Emulex uses the "lpfc" driver, so you manipulate them via /usr/sbin/lpfc/lputil
luxadm -e port #check whether hba cards are connected, this will show the physical path
luxadm -e forcelip c2 #forcelip of one entire controller
cfgadm -c configure c2::5006048452a72687 #configure lun
cfgadm -c configure c2 #configure the whole controller, it does not affect previously configured LUNs
devfsadm -c disk #scan disks in solaris
symcfg disco #update sym db on this host.
luxadm probe #check FC disks allocated to this host

ntp offset – use ntpdate to manually sync local time with ntp server

February 2nd, 2012

Here's the outline for resolution:

1.check whether ntpd is running on the problematic host;

2.stop ntpd;

3.use ntpdate to manually sync with ntp server

4.start up ntpd.

Here are the detailed commands:

root@testhost# service ntpd stop
Shutting down ntpd: [ OK ]
root@testhost# ps -ef|grep ntp
root 9805 9542 0 01:53 pts/0 00:00:00 grep ntp
root@testhost# cat /etc/ntp.conf
tinker panic 0
server timehost1 prefer
server timehost2
server timehost3
driftfile /var/lib/ntp/drift

# Prohibit general access to this service.
#restrict default ignore #this is to allow ntpd to use the new ntpd server

# Permit the cluster node listed to synchronise with this time service.
# Do not permit those systems to modify the configuration of this service.
# Allow this host to be used as a timesource

# Permit all loopback interface access.
root@testhost# ntpdate -u timehost1
2 Feb 01:55:20 ntpdate[9824]: step time server offset 59.998407 sec
root@testhost# service ntpd start
Starting ntpd: [ OK ]
root@testhost# ps -ef|grep ntp
ntp 9999 1 0 01:55 ? 00:00:00 ntpd -u ntp:ntp -p /var/run/ -g
root 10017 9542 0 01:55 pts/0 00:00:00 grep ntp
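
To decide whether the manual step is needed at all, the offset can be pulled out of the ntpdate output and compared with a limit. This is a sketch only: the helper names are made up, and the 0.128 s limit is an assumption based on ntpd's default step threshold, so pick whatever limit suits your environment.

```shell
#!/bin/sh
# offset_from_ntpdate: read ntpdate output on stdin and print the number
# that follows the word "offset".
offset_from_ntpdate() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "offset") { print $(i + 1); exit } }'
}

# offset_exceeds LIMIT OFFSET: succeed if |OFFSET| is greater than LIMIT seconds.
offset_exceeds() {
    awk -v limit="$1" -v off="$2" 'BEGIN {
        if (off < 0) off = -off
        exit !(off > limit)
    }'
}

# Sample line copied from the session above
line="2 Feb 01:55:20 ntpdate[9824]: step time server offset 59.998407 sec"
off=$(printf '%s\n' "$line" | offset_from_ntpdate)
if offset_exceeds 0.128 "$off"; then
    echo "offset ${off}s is large, a manual step may be needed"
fi
```

On a live host the input would come from ntpdate in query mode, e.g. ntpdate -q timehost1 | offset_from_ntpdate, which reports the offset without setting the clock.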


  • If your host is a Virtual Machine running on the Xen hypervisor, you may find that ntpdate or ntpd fails to synchronize time with the time server. That's because the VM syncs with the hypervisor by default. So if you want to sync your VM's time with the time server, there are two methods:

1. Synchronize time on XEN hypervisor and the VM will then sync with it automatically;

2. If you just want to sync the VM's time without changing the XEN hypervisor's time setting, then do the following on the VM (this setting does not apply to hardware virtualized guests):

echo 1 > /proc/sys/xen/independent_wallclock #or in /etc/sysctl.conf, xen.independent_wallclock=1

For VM, you should also note below(Wallclock Time Skew Problems):

Additional parameters may be needed in the boot loader (grub.conf) configuration file for certain operating system variants after the guest is installed. Specifically, for optimal clock accuracy, Linux guest boot parameters should be specified to ensure that the pit clock source is utilized. Adding clock=pit nohpet nopmtimer for most guests will result in the selection of pit as the clock source for the guest. Published templates for Oracle VM include these additional parameters.

Proper maintenance of virtual time can be tricky. The various parameters provide tuning for virtual time management and supplement, but do not replace, the need for an ntp time service running within the guest. Ensure that the ntpd service is running and that the /etc/ntp.conf configuration file is pointing to valid time servers.

After this step, you can now sync time for the VM without impacting others.

  • You can use ntpq -p to check the status of remote time servers.

[root@test-host ~]# ntpq -p
remote refid st t when poll reach delay offset jitter
*LOCAL(0) .LOCL. 5 l 66 64 377 0.000 0.000 0.001
 .INIT. 16 u - 512 0 0.000 0.000 0.000
 .INIT. 16 u - 512 0 0.000 0.000 0.000

In the "remote" column, the tally character in front of the last two sources is a blank space. This means that these two sources are discarded: they failed the sanity checks and have never been synced to.

For more details about output of ntpq -p, you can read the explanation here.
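
The tally characters can also be decoded by a small filter. The sketch below covers only the common codes and treats everything else as "other"; classify_peers is a made-up name, and it expects the usual ntpq -p layout with a header line and a separator line of = signs.

```shell
#!/bin/sh
# classify_peers: read "ntpq -p" output on stdin and label each source by
# the tally character in the first column.
classify_peers() {
    awk '
    /^ *remote / || /^=+$/ { next }     # skip header and separator lines
    NF > 0 {
        tally = substr($0, 1, 1)
        name = $1
        if (tally != " ") name = substr($1, 2)   # strip the tally character
        if      (tally == "*") state = "selected sync source"
        else if (tally == "+") state = "candidate"
        else if (tally == "-") state = "outlier"
        else if (tally == "x") state = "falseticker"
        else if (tally == " ") state = "discarded"
        else                   state = "other (" tally ")"
        print state ": " name
    }'
}

# Usage on a live host: ntpq -p | classify_peers
```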

  • For ntpd firewall issue, if you want to use ntpd, then you need to fix your network/firewall/NAT so that ntpd has full unrestricted access to UDP port 123 in both directions. You may also set up a cronjob to bounce ntpd every hour so that ntpd is forced to sync with the time server:

echo "`tr -cd 1-5 </dev/urandom | head -c 2` */1 * * * /sbin/service ntpd restart" >> /var/spool/cron/root #you can of course manually add the cronjob via crontab -e -u root

or

echo "`tr -cd 1-5 </dev/urandom | head -c 2` */1 * * * root /sbin/service ntpd restart" >> /etc/crontab

  • Many people have difficulties with using RESTRICT. They want to set themselves up to be as secure as possible, so they create an extremely limited default RESTRICT line in their /etc/ntp.conf file, and then they find that they can't talk to anyone. If you're having problems with your server, in order to do proper debugging, you should turn off all RESTRICT lines in your /etc/ntp.conf file, and otherwise simplify the configuration as much as possible, so that you can make sure that the basic functions are working correctly. Once you get the basics working, try turning the various features back on, one by one. Here are some tips on the ntp restrict keyword, which controls access to ntpd.
  • To get started with ntp, read this guide (it explains server, peer and stratum; a pool is a list of servers). And here is the full documentation about ntp.
  • You can run ntpstat to check ntp status too:

# ntpstat
synchronised to NTP server ( at stratum 3
time correct to within 12 ms
polling server every 128 s

  • linux 'leap-second' bug (link here) - this applies if the system has been up since June 30th, 2012 and is running ntp. One way to fix it is (as root from the command line):
    date -s "`date`"

The full procedure:
1. Observe CPU usage through "top".
2. Clean up orphaned java/FF processes.
3. Run this command as root (sudo su -): date -s "`date`"
4. Observe usage through "top" again. It should fall drastically (to near zero).

using timex to check whether performance degradation caused by OS or VxVM

February 1st, 2012

To find out whether a performance degradation comes from the OS or from VxVM, compare the time the operating system takes to access the disks with the time Volume Manager takes to access them. Both should be about the same, since both commands force a read of the disk header information. If one of them is markedly greater, it indicates a problem in that area.

#echo | timex /usr/sbin/format #the echo avoids the prompt for user input. Use time instead of timex on linux
real          13.03
user           0.10
sys            1.49
#timex vxdisk -o alldgs list
real           2.65
user           0.00
sys            0.00
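
To put a number on "markedly greater", the two real values can be extracted and divided. A minimal sketch, assuming the POSIX-style "real N" output shown above (on Linux, time -p prints the same shape); the helper names are made up, and the sample figures are the ones from this run.

```shell
#!/bin/sh
# real_seconds: read timex/time output on stdin and print the "real" value.
real_seconds() {
    awk '$1 == "real" { print $2; exit }'
}

# slower_by A B: print how many times larger A is than B, to one decimal.
slower_by() {
    awk -v a="$1" -v b="$2" 'BEGIN { if (b > 0) printf "%.1f\n", a / b }'
}

os=$(printf 'real 13.03\nuser 0.10\nsys 1.49\n' | real_seconds)
vx=$(printf 'real 2.65\nuser 0.00\nsys 0.00\n' | real_seconds)
echo "format took $(slower_by "$os" "$vx")x as long as vxdisk"
```

timex writes its report to stderr, so on a live host the report has to be redirected before piping it into real_seconds.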


start/stop syslogd on solaris 10 or solaris 8/9

January 21st, 2012

Here's how to do it:

On Solaris 8 and 9 (SunOS 5.8 and 5.9), at the command prompt, enter /etc/init.d/syslog stop, followed by /etc/init.d/syslog start.

On Solaris 10 (SunOS 5.10), at the command prompt, enter svcadm disable svc:/system/system-log && svcadm enable svc:/system/system-log.

Now you can check with ps -ef|grep syslogd. To configure syslogd, edit /etc/syslog.conf. syslogd writes its log to /var/log/syslog (and, by default, to /var/adm/messages as well).
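
If the same script has to work across releases, the choice can be keyed on the release string. This is only a sketch: syslog_restart_cmd is a made-up helper that prints the matching command from above for a given uname -r value and leaves running it to you.

```shell
#!/bin/sh
# syslog_restart_cmd RELEASE: print the restart command for a SunOS
# release string as reported by "uname -r" (5.8, 5.9 or 5.10).
syslog_restart_cmd() {
    case $1 in
        5.8|5.9)
            echo "/etc/init.d/syslog stop && /etc/init.d/syslog start" ;;
        5.10)
            echo "svcadm disable svc:/system/system-log && svcadm enable svc:/system/system-log" ;;
        *)
            echo "unsupported release: $1" >&2
            return 1 ;;
    esac
}

syslog_restart_cmd 5.10
```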

replace a broken disk under solaris svm control

January 12th, 2012

Firstly, you need to detach the submirror that needs to be replaced (save a copy of the output of metastat -i/-p, metadb -i and df -k before doing these steps):

metadetach d0 d10 #d10 is c0t0d0 in this context

If you get an error like:

attempt an operation on a submirror that has erred components

then you'll need to force the detach with -f:

metadetach -f d0 d10

Now do a check that all SVM objects have been removed from the failing disk:

metastat -p | grep c0t0d0
metadb | grep c0t0d0

Insert the new disk now.

Now configure the new disk (this step may not be needed if the disk already shows up in the output of metastat -i):

cfgadm -c configure c1::dsk/c0t0d0

Verify that the disk is shown as "configured" in the output of cfgadm -al.

Copy the disk label (VTOC) from c1t0d0 (the good one) to c1t1d0 (the replaced one). This step may not be needed if format shows that the new disk already has the expected partitions:

root on testserver:/var/tmp # prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2

And use format -> partition to check the partitioning.

You can check device alias through eeprom:
root on testserver:/var/tmp # eeprom | grep devalias
nvramrc=devalias rootdisk /ssm@0,0/pci@18,700000/scsi@2/disk@0,0
devalias rootmirror /ssm@0,0/pci@18,700000/scsi@2/disk@1,0

To see mapping between physical device path and device name, use command format:
root on testserver:/var/tmp # format
0. c1t0d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
1. c1t1d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>

Finally, clear the corrupted submirror, re-initialize it and attach it again:
metaclear d10 #may not be needed if d10 is still there in the output of metastat -i
metainit -f d10 1 1 c0t0d0s0 #may not be needed if d10 is still there in the output of metastat -i
metattach d0 d10 #to see the resync progress, run metastat -i|grep progress
metastat d0
metastat -p
metastat -i
metadb -i #if the metadb replicas are not on at least two physical disks, you may need to create replicas on the new disk using metadb -a -c 3 c0t0d0s7
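
To check the "at least two physical disks" condition quickly, the distinct disks that hold replicas can be listed. A sketch under an assumption: it expects metadb output where the last column of each replica line is a /dev/dsk/cXtXdXsX path, and replica_disks is a made-up helper name.

```shell
#!/bin/sh
# replica_disks: read metadb output on stdin and print each disk that
# holds at least one state database replica, one disk per line.
replica_disks() {
    awk '$NF ~ /^\/dev\/dsk\// {
        disk = $NF
        sub(/^\/dev\/dsk\//, "", disk)   # drop the directory part
        sub(/s[0-9]+$/, "", disk)        # drop the slice suffix
        seen[disk] = 1
    }
    END { for (d in seen) print d }' | LC_ALL=C sort
}

# Usage on a live host: metadb | replica_disks
```

If only one disk is printed, the replicas are not spread over two disks and the metadb -a step above applies.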


alom/ilom/openboot prom commands help

January 11th, 2012
1.Alom Available commands - This is help output from sun T2000 Alom(System Console)
Power and Reset control commands:
  powercycle [-y] [-f]
  poweroff [-y] [-f]
  poweron [-c] [FRU]
  reset [-y] [-c]
Console commands:
  break [-D] [-y] [-c]
  console [-f]
  consolehistory [-b lines|-e lines|-v] [-g lines] [boot|run]
Boot control commands:
  bootmode [normal|reset_nvram|bootscript="string"]
  setkeyswitch [-y] <normal|stby|diag|locked>
Locator LED commands:
  setlocator [on|off]
Status and Fault commands:
  clearfault <UUID>
  disablecomponent [asr-key]
  enablecomponent [asr-key]
  removefru [-y] <FRU>
  setfru -c [data]
  showcomponent [asr-key]
  showfaults [-v]
  showfru [-g lines] [-s|-d] [FRU]
  showlogs [-b lines|-e lines|-v] [-g lines] [-p logtype[r|p]]
  shownetwork [-v]
  showplatform [-v]
ALOM Configuration commands:
  setdate <[mmdd]HHMM | mmddHHMM[cc]yy][.SS]>
  setsc [param] [value]
  showhost [version]
  showsc [-v] [param]
ALOM Administrative commands:
  flashupdate <-s IPaddr -f pathname> [-v]
  help [command]
  resetsc [-y]
  restartssh [-y]
  setdefaults [-y] [-a]
  ssh-keygen [-l|-r] <-t {rsa|dsa}>
  showusers [-g lines]
  useradd <username>
  userdel [-y] <username>
  userpassword <username>
  userperm <username> [c][u][a][r]
  usershow [username]
2.ilom Available commands - This is help output from sun Fire E2900
addcodlicense      -- add a cod license
bootmode           -- configure the way Solaris boots at the next reboot
break              -- send break to the Solaris console
console            -- connect to the Solaris console
deletecodlicense   -- delete a cod license
disablecomponent   -- add a component to the blacklist
enablecomponent    -- remove a component from the blacklist
flashupdate        -- update firmware
forcepci           -- force pci mode
help               -- show help for a command or list of commands
history            -- show command history
inventory          -- show seprom contents of a FRU/system
logout             -- logout from this connection
password           -- set the system controller (LOM) access password
poweroff           -- power off system or components
poweron            -- power on system or components
reset              -- reset the Solaris system
resetsc            -- reset the system controller (LOM)
restartssh         -- restart SSH server (SSH must be enabled)
setalarm           -- set the alarm leds
setdate            -- set the date and time for the system
setescape          -- set system controller (LOM) escape sequence
seteventreporting  -- set event reporting
setlocator         -- set the system locator led
setls              -- set FRU location status
setupnetwork       -- setup system controller (LOM) network settings
setupsc            -- configure the system controller (LOM)
showalarm          -- show state of system alarms leds
showboards         -- show board information
showchs            -- show component health status
showcodlicense     -- show COD licenses
showcodusage       -- show COD resource usage
showcomponent      -- show state of a component
showdate           -- show the current date and time for the system
showenvironment    -- show environmental information
showerrorbuffer    -- show the contents of the error buffer
showescape         -- show system controller (LOM) escape sequence
showeventreporting -- show status of event reporting
showfault          -- show state of system fault led
showhostname       -- show hostname
showlocator        -- show state of system locator led
showlogs           -- show the logs
showmodel          -- show the platform model
shownetwork        -- show system controller (LOM) network settings and MAC addresses
showresetstate     -- show CPU registers after reset
showsc             -- show system controller (LOM) version and uptime
shutdown           -- shutdown solaris and take to standby mode
ssh-keygen         -- generate SSH host keys or show SSH host key fingerprint
testboard          -- test a CPU/Memory board
show /System/BIOS system_bios_version
show /System/Networking/Ethernet_NICs/Ethernet_NIC_0
show /System/Networking/Ethernet_NICs/Ethernet_NIC_1
show /System/Networking/Ethernet_NICs/Ethernet_NIC_2
show /System/Networking/Ethernet_NICs/Ethernet_NIC_3
set /System/BIOS/config dump_uri=scp://root:pass1@

#### Upgrade iLOM

show /System/BIOS system_bios_version
load -script -source
load -script -source
show /System/BIOS system_bios_version
set /SP/services/kvms/host_storage_device/remote server_URI=nfs://
set /SP/services/kvms/host_storage_device/ mode=remote
start /SYS -script
reset /SYS -script
set /System/BIOS/config restore_options=all load_uri=scp://
set /System/BIOS/config restore_options=all load_uri=scp://
set /System/BIOS/config restore_options=all load_uri=scp://
set /SP/services/kvms/host_storage_device/ mode=disabled
reset /SYS -script
show /System/Open_Problems

set /SYS/MB clear_fault_action=true -script
For oracle ilom:
stop -script -force /SYS #if stop /SYS won't work
start /HOST/console #if you found server stuck in ilom GUI, then try this from serial console. It may show disk corruption. 
reset -script -force /SP
start /SP/faultmgmt/shell  #and then you can do: fmadm faulty -a/fmadm repair
There are cases where the ilom hangs and needs to be restarted. Since the node itself is inaccessible, ipmitool cannot be run locally to reset the ilom, so it must be done from a remote node like this:
ipmitool -H <ip address of problematic db node> -U root -P mypassword1 mc reset cold
Here's the full help message:

-> help
The help command is used to view information about commands and targets

Usage: help [-o|-output terse|verbose] [<command>|legal|targets|<target>|<target> <property>]

Special characters used in the help command are
[] encloses optional keywords or options
<> encloses a description of the keyword
(If <> is not present, an actual keyword is indicated)
| indicates a choice of keywords or options

help <target> displays description if this target and its properties
help <target> <property> displays description of this property of this target
help targets displays a list of targets
help legal displays the product legal notice

Commands are:

-> help reset
The reset command is used to reset a target.

Usage: reset [-script] [<target>]

Available options for this command:
-script : do not prompt for yes/no confirmation and act as if yes was specified

-> help targets

Target Meaning

/ Hierarchy Root
/HOST Host Information
/HOST/console Redirection of console stream to SP, to check for console history(diag info), use "show /HOST/console/history". #if you found server stuck in ilom GUI, then try this from serial console. It may show disk corruption. 
/HOST/diag SP/HOST/diag Configuration #set /HOST generate_host_nmi=true will panic the OS and save the core from current memory for support to analyze. The node fails over to the other node immediately, but saving a big memory core can take 15-30 mins. Before doing an NMI (Non-Maskable Interrupt) panic, look for any single disk that is 100% busy in the analytics tab. From our experience, a single bad disk can hang the bus and cause the issue. (From the Oracle ILOM web interface, click Host Management > Diagnostics, and then click Generate NMI.)

-> cd /HOST
-> show
        generate_host_nmi = (Cannot show property)
-> set generate_host_nmi=true
Set 'generate_host_nmi' to 'true'

To set next boot device to cdrom in CLI:

-> cd /HOST

-> show


boot_device = default
generate_host_nmi = (Cannot show property)


-> set boot_device=cdrom
Set 'boot_device' to 'cdrom'

/HOST0/console Redirection of console stream to SP #start  /HOST0/console is like start /SP/console
/STORAGE Storage information
/STORAGE/raid Contains all RAID related information
/SYS Sensors, Indicators, and FRU Information #e.g. show/start/stop<-force>/reset /SYS, hard reset/reboot the server. show /SYS can see server type, serial number, power state. stop /SYS -h to print help message
/SP Service Processor #reset /SP resets the ILOM. Use it if you hit a "disconnected" issue on the ilom console or the ilom web UI is not responding correctly. This should not impact the running OS.
/SP/alertmgmt Alert rule management
/SP/alertmgmt/rules Alert rules node
/SP/cli Command line interface

show /SP/cli
set /SP/cli timeout=360
set /SP/cli timeout=0
/SP/clients Clients that connect to external services
/SP/clients/activedirectory Active Directory sub-directory
/SP/clients/activedirectory/admingroups administrator groups sub-directory
/SP/clients/activedirectory/alternateservers alternate servers sub-directory
/SP/clients/activedirectory/alternateservers/1/cert cert directory
/SP/clients/activedirectory/alternateservers/2/cert cert directory
/SP/clients/activedirectory/alternateservers/3/cert cert directory
/SP/clients/activedirectory/alternateservers/4/cert cert directory
/SP/clients/activedirectory/alternateservers/5/cert cert directory
/SP/clients/activedirectory/cert cert sub-directory
/SP/clients/activedirectory/customgroups custom groups sub-directory
/SP/clients/activedirectory/dnslocatorqueries DNS service record sub-directory
/SP/clients/activedirectory/opergroups operator groups sub-directory
/SP/clients/activedirectory/userdomains user domain sub-directory
/SP/clients/dns DNS resolution configuration
/SP/clients/ldap LDAP Client Properties
/SP/clients/ldapssl LDAP/SSL sub-directory
/SP/clients/ldapssl/admingroups administrator groups sub-directory
/SP/clients/ldapssl/alternateservers alternate servers sub-directory
/SP/clients/ldapssl/alternateservers/1/cert cert directory
/SP/clients/ldapssl/alternateservers/2/cert cert directory
/SP/clients/ldapssl/alternateservers/3/cert cert directory
/SP/clients/ldapssl/alternateservers/4/cert cert directory
/SP/clients/ldapssl/alternateservers/5/cert cert directory
/SP/clients/ldapssl/cert cert sub-directory
/SP/clients/ldapssl/customgroups custom groups sub-directory
/SP/clients/ldapssl/opergroups operator groups sub-directory
/SP/clients/ldapssl/optionalUserMapping userMapping(optional) sub-directory
/SP/clients/ldapssl/userdomains user domain sub-directory
/SP/clients/ntp NTP configuration
/SP/clients/ntp/server NTP server configuration

set /SP/clock usentpserver=enabled

set /SP/clients/ntp/server/1 address=
set /SP/clients/ntp/server/2 address=
show /SP/clients/ntp/server/1 address
show /SP/clients/ntp/server/2 address
show /SP/clock usentpserver
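
The NTP commands above leave the server addresses blank. As a rough sketch, the shell function below prints the same kind of sequence with the addresses filled in, so the output can be pasted at the "->" prompt. ilom_ntp_cmds is a hypothetical helper name, and the 192.0.2.x addresses in the usage line are documentation placeholders, not real servers.

```shell
# Hypothetical helper: print the ILOM commands that configure two NTP servers,
# enable NTP on the SP clock, and then display the resulting setting.
ilom_ntp_cmds() {
    if [ "$#" -ne 2 ]; then
        echo "usage: ilom_ntp_cmds <server1> <server2>" >&2
        return 1
    fi
    printf 'set /SP/clients/ntp/server/1 address=%s\n' "$1"
    printf 'set /SP/clients/ntp/server/2 address=%s\n' "$2"
    printf 'set /SP/clock usentpserver=enabled\n'
    printf 'show /SP/clock usentpserver\n'
}

# Usage (placeholder addresses):
#   ilom_ntp_cmds 192.0.2.1 192.0.2.2
```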
/SP/clients/radius RADIUS Client Properties
/SP/clients/smtp SMTP Server
/SP/clients/syslog Syslogd remote logging
/SP/clients/syslog/1 Syslogd remote logging server 1
/SP/clients/syslog/2 Syslogd remote logging server 2
/SP/clock Clock management #show /SP/clock/ datetime
/SP/config Config Backup / Restore settings
/SP/diag SP/Host Diagnositics Configuration
/SP/diag/snapshot Take snapshot of system for diagnostic purposes
/SP/faultmgmt FRUs with faults #show /SP/faultmgmt
/SP/faultmgmt/shell Fault management captive shell
/SP/firmware Firmware Base TARGET
/SP/logs Log events
/SP/logs/event Designations for event log
/SP/logs/event/list Designations for event log #show /SP/logs/event/list, get reset/shutdown/ilom software upgrade history

show /SP/logs/event/list
set /SP/logs/event clear=true
/SP/network External network interface
/SP/network/interconnect USB Ethernet Submenu
/SP/network/ipv6 IPv6 Information
/SP/policy Policy Configuration
/SP/serial Serial interfaces
/SP/serial/external External serial interface
/SP/serial/host Host-to-SP serial interface
/SP/serial/portsharing Serial port sharing switch control

show /SP/serial/host/ speed
show /SP/serial/external/ speed
set /SP/serial/host/ speed=9600
set /SP/serial/host/ pendingspeed=9600 commitpending=true
set /SP/serial/external/ speed=9600
set /SP/serial/external/ pendingspeed=9600 #commitpending=true
/SP/services Available services
/SP/services/http HTTP service
/SP/services/https HTTPS service
/SP/services/https/ssl HTTPS SSL Certficate Settings
/SP/services/https/ssl/custom_cert Custom SSL Certficate Settings
/SP/services/https/ssl/custom_key Custom SSL Private Key Settings
/SP/services/https/ssl/default_cert Default SSL Certficate Settings
/SP/services/ipmi Management of the IPMI service
/SP/services/kvms Management of the KVMS service
/SP/services/servicetag Servicetag configuration
/SP/services/snmp SNMP agent service configuration
/SP/services/snmp/communities snmp communities
/SP/services/snmp/users SNMP users
/SP/services/ssh Secure shell
/SP/services/ssh/keys Keys for secure shell
/SP/services/ssh/keys/dsa DSA key for secure shell
/SP/services/ssh/keys/rsa RSA key for secure shell
/SP/services/sso Single Sign-on Configuration
/SP/services/wsman Management of the WSMAN service
/SP/sessions Session description #show /SP/sessions
/SP/users User description #e.g. show /SP/users/root and then check the "role" part. set /SP/users/admin password=newpass

-> create /SP/users/ilomroot password=<your password>
-> set /SP/users/ilomroot role=aucro #aucro is equivalent to the Administrator (administrator) profile

# -> set /SP/users/lroot password=password #to reset password

-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell (y/n)? y

faultmgmtsp> help

Built-in commands:
echo - Display information to user.
Typical use: echo $?
help - Produces this help.
Use 'help <command>' for more information about an external command.
exit - Exit this shell.

External commands:
fmadm - Administers the fault management service
fmdump - Displays contents of the fault and ereport/error logs
fmstat - Displays statistics on fault management operations
etcd - ereport injector

You can open another terminal in iLOM by the following steps:

First, click "Keyboard" -> "Left Alt Key", then press F2. You'll get a terminal with the "sh-3.2#" prompt. Click "Keyboard" -> "Left Alt Key" again to release the key. You can now issue commands like "fdisk" to change a partition's system id (Linux/SWAP/EFI GPT/VMware VMFS etc). To return to the first console, click "Keyboard" -> "Left Alt Key" and then press F1 (don't forget to uncheck "Left Alt Key" afterwards).


Here's more about oracle ilom commands -

3.KVM Available commands - this is the help output from a Raritan KVM
?                 clear         connect
console_cmd       disconnect    exit
grep              help          list_interfaces
list_nodes        list_ports    listdevices
listinterfaces    listnodes     listports
ls                more          ssh
3.XSCF alom - this is the help output from the Fujitsu ilom

SCF> help
date Show date.
env-monitor Show system environment.
exit Exit XSCF Shell.
help Show help of shell command.
hangup Kill XSCF telnet connections.
lan-config Show LAN configuration.
logtest Save Test Log to check setting.
net-status Show SCF-LAN status.
nodeled Show and Control Check LED status.
por por,Power On Reset.
power-on Power on.
power-off Power off.
rci-config Show RCI configuration.
request Panic request.
send-break Send Break Signal to TTYA console.
set-console-device Set console device [serial | lan]
set-shell-command Change shell keyword.
show-access-logs Show the access logs.
show-config Show system configuration.
show-connections Show XSCF network connection status.(Telnet SSH)
show-console-device Show console device setting as TTYA Port.
show-console-logs Show console messages.
show-error-logs Show error logs.
show-event-logs Show event logs.
show-ipl-logs Show IPL,Initial Program Loading, messages.
show-mail-report Show Mail Report configuration.
show-panic-logs Show Panic messages.
show-power-logs Show power logs.
show-remcs Show REMCS configuration.
show-shell-command Show shell keyword.
show-status Show system error status.
shutdown Shutdown request.
thermal-history Show recorded thermal history.
version Show version.
who Who is on the XSCF system.
xir xir,eXternally Initiated Reset.


1.console -d 0. Also, when you're in "SCF>", type "exit" to go back to the console or OK mode.

2.Type ~. to go back to XSCF

3.To log in to the Fujitsu console via SMC:

  • Log in to DCM and find out the SMC for the partition
  • ssh as root to the SMC system
  • Find out the actual name of the partition by viewing either the /etc/hosts file or /etc/FJSVscstargets
  • Run the following command to connect to the partition console: /opt/FJSVcsl/bin/get_console -w -n <partition_name>
If you want to get to the OK prompt of a Fujitsu partition:
  • Press ctrl+] to get the telnet prompt
  • From the telnet prompt, type "send break" to get the OK prompt
  • To check the XSCF log files, use fmdump -m and fmdump -v

4.Some HP-DL boxes have a DNS name of the form hostname-rib. You can run nslookup -qt=all <hostname> and then try https://<dns>. You can also try https://<hostname>:2381 or telnet hostname_con. Also, you can try ssh <hostname-rib> and then run "start /SP/console" to get serial console access.

5.If the server is from the Sun Fire series, you may be interested in the commands in /usr/platform/`uname -i`/sbin/{scadm, eeprom, fruadm, prtdiag, trapstat, wrsmconf, wrsmstat}
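
To see which of these tools actually exist on a given box, a small loop is enough. A minimal sketch: platform_tools is a hypothetical helper name, and the optional directory argument is there only so the function can be tried outside a Sun platform.

```shell
# Hypothetical helper: list which of the platform tools named above are
# present and executable. With no argument it looks in the platform sbin
# directory; an explicit directory can be passed instead.
platform_tools() {
    dir="${1:-/usr/platform/$(uname -i)/sbin}"
    for tool in scadm eeprom fruadm prtdiag trapstat wrsmconf wrsmstat; do
        if [ -x "$dir/$tool" ]; then
            echo "$dir/$tool"
        fi
    done
    return 0
}

# Usage:
#   platform_tools                 # check the platform sbin directory
#   platform_tools /some/other/dir # check another directory
```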

4.OpenBoot parameters and commands(this part is from url

About Openboot :
The firmware in Sun’s boot PROM is called OpenBoot. Its main features are initial program loading and debugging features that assist kernel debugging. OpenBoot supports plug-in device drivers, which are written in the Forth language. This plug-in feature allows Sun or third-party vendors to develop new boot devices without making any changes to the boot PROM.

Accessing OpenBoot
The OpenBoot console can be accessed by any of the following means. Be careful not to do this on a live system, as you might end up rebooting the server.

1. Rebooting the system. If auto-boot is not set to true, the rebooted system returns to the OK> prompt, which is the OpenBoot prompt.

2. Pressing the keys L1 and A, or STOP and A, at the same time will bring you to the OpenBoot system. You will see the display
Type b (boot), c (continued), or n (new command mode)
Typing b boots the operating system. Typing c resumes the execution of a halted program. Typing n gets you to the Forth monitor, and the prompt will change to ok.

OpenBoot Parameters & commands

The following two tables list the OpenBoot parameters and commands.

Display all variables and current values.
setenv <variable> <value> - Set variable to the given value.
set-default <variable> - Reset the value of variable to the factory default.
Reset variable values to the factory defaults.
System directly boots without stopping at OK> after power on.
Command passed on to auto boot if true.
File for booting Solaris; the default is an empty string. This variable contains the default boot arguments that are used when OpenBoot is not in diagnostic mode.
boot-device=disk net - Device to boot from; multiple devices can be specified, separated by spaces. Other devices will be selected if the first device fails.


Tests the UTP Ethernet port link and flashes error messages if there is no network link.
Use the system’s MAC address instead of the network card’s MAC address.
Boot file for diagnostic mode. This variable contains the default diagnostic mode boot arguments.
Booting device in diagnostic mode.
If true, the system runs in diagnostic mode.
Level for diagnostics information; can be min, max and minus. There may be additional platform-specific values. If set to off, POST is not called. The default value is platform-dependent.
Input device used at power-on (keyboard, ttya, or ttyb).
Keyboard click sound.
For custom keyboards.
Output device used at power-on (screen, ttya, or ttyb).
Controls the behavior of the terminal emulator. The value false causes the terminal emulator to stop interpreting ANSI escape sequences, resulting in echoing them to the output device.
Columns and rows of the display screen.


SCSI bus address of host adapter, range 0-7. Used in shared SCSI storage environments.
PS: If you want to change SCSI initiator ID on a PCI(Peripheral Component Interconnect) adapter/controller, you can refer to the following:
Order to probe PCI and SBus buses for devices.
If true, execute commands in NVRAMRC during system start-up. Defaults to false.
Displays contents of NVRAM.
Firmware security level (options: none, command, or full). If set to command or full, the system will prompt for the PROM security password.
Security password setting when security mode is command or full.
Number of bad security logins.
password - Set the security password.




This command shows the following system hardware information: model, architecture, processor, keyboard, OpenBoot version, serial no., Ethernet address and host id.
test floppy - test floppy disk drive
test net - test network loopbacks
test scsi - test SCSI interface
test-all - test all devices that have a self test
Show ticks of real-time clock
Monitor network broadcast packets
Monitor broadcast packets on all net interfaces
Show attached SCSI devices
Show attached SCSI devices for all host adapters - internal & external.


boot - boot kernel from default device. Factory default is to boot from DISK if present, otherwise from NET.
boot net - boot kernel from network
boot cdrom - boot kernel from CD-ROM
boot disk1:h - boot from disk1 partition h
boot tape - boot default file from tape
boot disk myunix -as - boot myunix from disk with flags "-as"
ok cd /pci@1f,4000/scsi@3
ok .properties
ok ls
f00809d8 tape
f007ecdc disk
ok .speed
CPU Speed : 200.00MHz
UPA Speed : 100.00MHz
PCI Bus A : 66Mhz
PCI Bus B : 33Mhz


commands at OK prompt.
nvedit - Start nvramrc line editor using a temporary edit buffer
use-nvramrc? - If this variable is true, the contents of nvramrc are executed automatically. Set using the setenv command
nvrun - Execute the contents of the nvedit edit buffer
nvstore - Save the contents of the nvedit buffer into NVRAM
nvrecover - Recover nvramrc after a set-defaults
nvalias <name> <path> - Edit nvramrc to include devalias called ‘name’
nvunalias <name> - Edit nvramrc to remove devalias called ‘name’
Key Sequences
These commands are disabled if the PROM security is on. Also, if your system has full security enabled, you cannot apply any of the suggested commands unless you have the password to get to the ok prompt.
Stop - Bypass POST. This command does not depend on security-mode. (Note: some systems bypass POST as a default; in such cases, use Stop-D to start POST.)
Stop-D - Enter diagnostic mode (set diag-switch? to true).
Stop-F - Enter Forth on TTYA instead of probing. Use exit to continue with the initialization sequence. Useful if hardware is broken.
Reset NVRAM contents to default values.

ldap.conf and ldap_client_file not the same

January 2nd, 2012

You may find it weird that an LDAP client has both ldap.conf and ldap_client_file, and that the two files refer to different LDAP servers.

In short, this is because the default OpenLDAP client configuration file is /etc/ldap.conf, while a typical Solaris LDAP client keeps the information about which server(s) to contact and what authentication method to use in /var/ldap/ldap_client_file.
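
To confirm which servers each file points at, the two can be compared side by side. A minimal sketch, assuming ldap.conf lists its servers on host or uri lines and ldap_client_file uses an NS_LDAP_SERVERS= line with a comma-separated list; both function names are hypothetical.

```shell
# Print the server entries from an OpenLDAP-style ldap.conf.
# Assumption: servers appear on "host" or "uri" lines (keyword is case-insensitive).
ldap_conf_servers() {
    awk 'tolower($1) == "host" || tolower($1) == "uri" {
             for (i = 2; i <= NF; i++) print $i
         }' "$1"
}

# Print the servers from a Solaris ldap_client_file.
# Assumption: they are on a line of the form "NS_LDAP_SERVERS= a, b".
ldap_client_file_servers() {
    sed -n 's/^NS_LDAP_SERVERS=[[:space:]]*//p' "$1" \
        | tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}

# Usage:
#   ldap_conf_servers /etc/ldap.conf
#   ldap_client_file_servers /var/ldap/ldap_client_file
```

If the two lists differ, the OpenLDAP tools and the Solaris name service client are talking to different servers, which is the situation described above.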

Here's more information: