Archive

Archive for the ‘Unix’ Category

ssh passwordless login with private key

June 7th, 2018 Comments off

On Server Side:

su - username

cd .ssh/

cat id_rsa.pub >> authorized_keys #if there is no id_rsa/id_rsa.pub, then generate them using "ssh-keygen -t rsa". When prompt for password, leave it empty

On Server Side:

Make sure "RSAAuthentication yes", "PubkeyAuthentication yes" is there in /etc/ssh/sshd_config (restart ssh if modified)

Make sure .ssh is 700, authorized_keys is 600

Copy id_rsa to client side, rename it as "private.key"

On client side:

chmod 600 private.key

ssh -i private.key username@server

Categories: IT Architecture, Linux, Systems, Unix Tags:

resolved – ipmitool Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory

January 16th, 2018 Comments off
If you met below error on physical servers (not VMs, as VM do not support IPMI)

    [root@localhost ~]# ipmitool
    Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory

Then firstly you need make sure your server systemboard supports IPMI. 
Old system-boards might not support IPMI technology.

    [root@localhost ~]# dmidecode | grep -A 6 -i ipmi
    IPMI Device Information
        Interface Type: KCS (Keyboard Control Style)
        Specification Version: 1.5
        I2C Slave Address: 0x10
        NV Storage Device: Not Present
        Base Address: 0x0000000000000CA8 (I/O) #if not all zeros, then it supports IPMI
        Register Spacing: 32-bit Boundaries

If it's supported, then you need enable IPMI related modules:

    modprobe ipmi_devintf
    modprobe ipmi_si

Then add it to /etc/modules to have them loaded automatically:

    ipmi_devintf
    ipmi_si

To start IPMI:
    
    /etc/init.d/ipmi start
    /etc/init.d/ipmi status

PS:
    1. If there's no ipmitool command, try install it by "yum install -y OpenIPMI ipmitool"
    2. You may need add more modules

        [root@localhost ~]# modprobe ipmi_devintf
        [root@localhost ~]# modprobe ipmi_si
        [root@localhost ~]# modprobe ipmi_watchdog
        [root@localhost ~]# modprobe ipmi_poweroff
        [root@localhost ~]# modprobe ipmi_msghandler
Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

VM shutdown stuck in “mount: you must specify the filesystem type, please stand by while rebooting the system”

November 16th, 2016 Comments off

When you issue "shutdown" or "reboot" on linux box and found "mount: you must specify the filesystem type, please stand by while rebooting the system":

Then one possible reason is that you have specified wrong mount options for nfs shares in /etc/fstab. For example, for nfsv3, please make sure to use below nfs options when you mount shares:

<share name> <mount dir> nfs rsize=32768,wsize=32768,hard,nolock,timeo=14,noacl,intr,mountvers=3,vers=3 0 0

And using below option will make VM shutdown stuck in "mount: you must specify the filesystem type". DO NOT use below:

<share name> <mount dir> nfs vers=3,rsize=32768,wsize=32768,hard,nolock,timeo=14,noacl,intr 0 0

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

TCP wrappers /etc/hosts.allow /etc/hosts.deny

June 2nd, 2016 Comments off

A simple example on linux box:

[root@test ~]# cat /etc/hosts.allow
sshd : ALL EXCEPT host1.example.com
snmpd : ALL EXCEPT host1.example.com
ALL : localhost

[root@test ~]# cat /etc/hosts.deny
ALL:ALL

And here's explaining:

Service "sshd/snmpd" will accept connections from all hosts except host1.example.com. All services will accept connections from localhost. Other services will deny connections from all hosts.

 

Categories: IT Architecture, Linux, Systems, Unix Tags:

resolved – net/core/dev.c:1894 skb_gso_segment+0x298/0x370()

April 19th, 2016 Comments off

Today on one of our servers, there were a lot of errors in /var/log/messages like below:

║Apr 14 21:50:25 test01 kernel: WARNING: at net/core/dev.c:1894
║skb_gso_segment+0x298/0x370()
║Apr 14 21:50:25 test01 kernel: Hardware name: SUN FIRE X4170 M3
║Apr 14 21:50:25 test01 kernel: : caps=(0x60014803, 0x0) len=255
║data_len=215 ip_summed=1
║Apr 14 21:50:25 test01 kernel: Modules linked in: dm_nfs nfs fscache
║auth_rpcgss nfs_acl xen_blkback xen_netback xen_gntdev xen_evtchn lockd
║ @ sunrpc 8021q garp bridge stp llc bonding be2iscsi iscsi_boot_sysfs ib_iser
║ @ rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic uio
║ @ ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp dm_round_robin libiscsi
║ @ dm_multipath scsi_transport_iscsi xenfs xen_privcmd dm_mirror video sbs sbshc
║acpi_memhotplug acpi_ipmi ipmi_msghandler parport_pc lp parport sr_mod cdrom
║ixgbe hwmon dca snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq
║snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore
║snd_page_alloc iTCO_wdt iTCO_vendor_support pcspkr ghes i2c_i801 hed i2c_core
║dm_region_hash dm_log dm_mod usb_storage ahci libahci sg shpchp megaraid_sas
║sd_mod crc_t10dif ext3 jbd mbcache
║Apr 14 21:50:25 test01 kernel: Pid: 0, comm: swapper Tainted: G W
║ 2.6.39-400.264.4.el5uek #1
║Apr 14 21:50:25 test01 kernel: Call Trace:
║Apr 14 21:50:25 test01 kernel: <IRQ> [<ffffffff8143dab8>] ?
║skb_gso_segment+0x298/0x370
║Apr 14 21:50:25 test01 kernel: [<ffffffff8106f300>]
║warn_slowpath_common+0x90/0xc0
║Apr 14 21:50:25 test01 kernel: [<ffffffff8106f42e>]
║warn_slowpath_fmt+0x6e/0x70
║Apr 14 21:50:25 test01 kernel: [<ffffffff810d73a7>] ?
║irq_to_desc+0x17/0x20
║Apr 14 21:50:25 test01 kernel: [<ffffffff812faf0c>] ?
║notify_remote_via_irq+0x2c/0x40
║Apr 14 21:50:25 test01 kernel: [<ffffffff8100a820>] ?
║xen_clocksource_read+0x20/0x30
║Apr 14 21:50:25 test01 kernel: [<ffffffff812faf4c>] ?
║xen_send_IPI_one+0x2c/0x40
║Apr 14 21:50:25 test01 kernel: [<ffffffff81011f10>] ?
║xen_smp_send_reschedule+0x10/0x20
║Apr 14 21:50:25 test01 kernel: [<ffffffff81056e0b>] ?
║ttwu_queue_remote+0x4b/0x60
║Apr 14 21:50:25 test01 kernel: [<ffffffff81509a7e>] ?
║_raw_spin_unlock_irqrestore+0x1e/0x30
║Apr 14 21:50:25 test01 kernel: [<ffffffff8143dab8>]
║skb_gso_segment+0x298/0x370
║Apr 14 21:50:25 test01 kernel: [<ffffffff8143dba6>]
║dev_gso_segment+0x16/0x50
║Apr 14 21:50:25 test01 kernel: [<ffffffff8143dfb5>]
║dev_hard_start_xmit+0x3d5/0x530
║Apr 14 21:50:25 test01 kernel: [<ffffffff8145a074>]
║sch_direct_xmit+0xc4/0x1d0
║Apr 14 21:50:25 test01 kernel: [<ffffffff8143e811>]
║dev_queue_xmit+0x161/0x410
║Apr 14 21:50:25 test01 kernel: [<ffffffff815099de>] ?
║_raw_spin_lock+0xe/0x20
║Apr 14 21:50:25 test01 kernel: [<ffffffffa045820c>]
║br_dev_queue_push_xmit+0x6c/0xa0 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffff81076e77>] ?
║local_bh_enable+0x27/0xa0
║Apr 14 21:50:25 test01 kernel: [<ffffffffa045e7ba>]
║br_nf_dev_queue_xmit+0x2a/0x90 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa045f668>]
║br_nf_post_routing+0x1f8/0x2e0 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffff81467428>]
║nf_iterate+0x78/0x90
║Apr 14 21:50:25 test01 kernel: [<ffffffff8146777c>]
║nf_hook_slow+0x7c/0x130
║Apr 14 21:50:25 test01 kernel: [<ffffffffa04581a0>] ?
║br_forward_finish+0x70/0x70 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa04581a0>] ?
║br_forward_finish+0x70/0x70 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa0458130>] ?
║br_flood_deliver+0x20/0x20 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa0458186>]
║br_forward_finish+0x56/0x70 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa045eba4>]
║br_nf_forward_finish+0xb4/0x180 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa045f36f>]
║br_nf_forward_ip+0x26f/0x370 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffff81467428>]
║nf_iterate+0x78/0x90
║Apr 14 21:50:25 test01 kernel: [<ffffffff8146777c>]
║nf_hook_slow+0x7c/0x130
║Apr 14 21:50:25 test01 kernel: [<ffffffffa0458130>] ?
║br_flood_deliver+0x20/0x20 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffff81467428>] ?
║nf_iterate+0x78/0x90
║Apr 14 21:50:25 test01 kernel: [<ffffffffa0458130>] ?
║br_flood_deliver+0x20/0x20 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa04582c8>]
║__br_forward+0x88/0xc0 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa0458356>]
║br_forward+0x56/0x60 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa04591fc>]
║br_handle_frame_finish+0x1ac/0x240 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffffa045ee1b>]
║br_nf_pre_routing_finish+0x1ab/0x350 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffff8115bfe9>] ?
║kmem_cache_alloc_trace+0xc9/0x1a0
║Apr 14 21:50:25 test01 kernel: [<ffffffffa045fc55>]
║br_nf_pre_routing+0x305/0x370 [bridge]
║Apr 14 21:50:25 test01 kernel: [<ffffffff8100122a>] ?
║xen_hypercall_xen_version+0xa/0x20
║Apr 14 21:50:25 test01 kernel: [<ffffffff81467428>]
║nf_iterate+0x78/0x90
║Apr 14 21:50:25 test01 kernel: [<ffffffff8146777c>]
║nf_hook_slow+0x7c/0x130

To fix this, we should disable LRO(large receive offload) first:

for i in eth0 eth1 eth2 eth3;do /sbin/ethtool -K $i lro off;done

And if the NICs are of Intel 10G, the we should disable GRO(generic receive offload) too:

for i in eth0 eth1 eth2 eth3;do /sbin/ethtool -K $i gro off;done

Here's the command to disable both of LRO/GRO:

for i in eth0 eth1 eth2 eth3;do /sbin/ethtool -K $i gro off;/sbin/ethtool -K $i lro off;done

 

resolved – /etc/rc.local not executed on boot in linux

November 11th, 2015 Comments off

When you find your scripts in /etc/rc.local not executed along with system boots, then one possibility is that the previous subsys script takes too long to execute, as /etc/rc.local is usually the last one to execute, i.e. S99local. To prove which is the culprit subsys that gets stuck, you can edit /etc/rc.d/rc(which is from /etc/inittab):

[root@host1 tmp] vi /etc/rc.d/rc
# Now run the START scripts.
for i in /etc/rc$runlevel.d/S* ; do
        check_runlevel "$i" || continue

        # Check if the subsystem is already up.
        subsys=${i#/etc/rc$runlevel.d/S??}
        [ -f /var/lock/subsys/$subsys -o -f /var/lock/subsys/$subsys.init ] \
                && continue

        # If we're in confirmation mode, get user confirmation
        if [ -f /var/run/confirm ]; then
                confirm $subsys
                test $? = 1 && continue
        fi

        update_boot_stage "$subsys"
        # Bring the subsystem up.
        if [ "$subsys" = "halt" -o "$subsys" = "reboot" ]; then
                export LC_ALL=C
                exec $i start
        fi
        if LC_ALL=C egrep -q "^..*init.d/functions" $i \
                        || [ "$subsys" = "single" -o "$subsys" = "local" ]; then
                echo $i>>/var/tmp/process.txt
                $i start
                echo $i>>/var/tmp/process_end.txt
        else
                echo $i>>/var/tmp/process_self.txt
                action $"Starting $subsys: " $i start
                echo $i>>/var/tmp/process_self_end.txt
        fi
done

Then you can reboot the system, and check files /var/tmp/{process.txt,process_end.txt,process_self.txt,process_self_end.txt}. In one of the host, I found below entries:

[root@host1 tmp]# tail process.txt
/etc/rc3.d/S85gpm
/etc/rc3.d/S90crond
/etc/rc3.d/S90xfs
/etc/rc3.d/S91vncserver
/etc/rc3.d/S95anacron
/etc/rc3.d/S95atd
/etc/rc3.d/S95emagent_public
/etc/rc3.d/S97rhnsd
/etc/rc3.d/S98avahi-daemon
/etc/rc3.d/S98gcstartup

[root@host1 tmp]# tail process_end.txt
/etc/rc3.d/S85gpm
/etc/rc3.d/S90crond
/etc/rc3.d/S90xfs
/etc/rc3.d/S91vncserver
/etc/rc3.d/S95anacron
/etc/rc3.d/S95atd
/etc/rc3.d/S95emagent_public
/etc/rc3.d/S97rhnsd
/etc/rc3.d/S98avahi-daemon

So from here, we can see /etc/rc3.d/S98gcstartup tried start, but it took too long to finish. To make sure scripts in /etc/rc.local get executed and also the stuck script /etc/rc3.d/S98gcstartup get executed also, we can do this:

[root@host1 tmp]# mv /etc/rc3.d/S98gcstartup /etc/rc3.d/s98gcstartup
[root@host1 tmp]# vi /etc/rc.local

#!/bin/sh

touch /var/lock/subsys/local

#put your scripts here - begin

#put your scripts here - end

#put the stuck script here and make sure it's the last line
/etc/rc3.d/s98gcstartup start

After this, reboot the host and check whether scripts in /etc/rc.local got executed.

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

resolved – sar -d failed with Requested activities not available in file

September 11th, 2015 Comments off

Today when I tried to get report for activity for each block device using "sar -d", error "Requested activities not available in file" prompted:

[root@test01 ~]# sar -f /var/log/sa/sa11 -d
Requested activities not available in file

To fix this, I did the following:

[root@test01 ~]# cat /etc/cron.d/sysstat
# run system activity accounting tool every 10 minutes
*/10 * * * * root /usr/lib64/sa/sa1 -d 1 1 #add -d. It was */10 * * * * root /usr/lib64/sa/sa1 1 1
# generate a daily summary of process accounting at 23:53
53 23 * * * root /usr/lib64/sa/sa2 -A

Later, move /var/log/sa/sa11 and run sa1 with "-d" to generate a new one:

[root@test01 ~]# mv /var/log/sa/sa11{,.bak}
[root@test01 ~]# /usr/lib64/sa/sa1 -d 1 1 #this generated /var/log/sa/sa11
[root@test01 ~]# /usr/lib64/sa/sa1 -d 1 1 #this put data into /var/log/sa/sa11

After this, the disk activity data could be retrieved:

[root@test01 ~]# sar -f /var/log/sa/sa11 -d
Linux 2.6.18-238.0.0.0.1.el5xen (slc03nsv) 09/11/15

09:26:04 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
09:26:22 dev202-0 10.39 0.00 133.63 12.86 0.00 0.06 0.06 0.07
09:26:22 dev202-1 10.39 0.00 133.63 12.86 0.00 0.06 0.06 0.07
09:26:22 dev202-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: dev202-0 10.39 0.00 133.63 12.86 0.00 0.06 0.06 0.07
Average: dev202-1 10.39 0.00 133.63 12.86 0.00 0.06 0.06 0.07
Average: dev202-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

For column "DEV", you can check mapping in /dev/*(dev202-2 is /dev/xvda2):

[root@test01 ~]# ls -l /dev/xvda2
brw-r----- 1 root disk 202, 2 Jan 26 2015 /dev/xvda2

Or you can add "-p" to sar which is simper

[root@test01 ~]# sar -f /var/log/sa/sa11 -d -p
Linux 2.6.18-238.0.0.0.1.el5xen (slc03nsv) 09/11/15

09:26:04 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
09:26:22 xvda 10.39 0.00 133.63 12.86 0.00 0.06 0.06 0.07
09:26:22 root 10.39 0.00 133.63 12.86 0.00 0.06 0.06 0.07
09:26:22 xvda2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: xvda 10.39 0.00 133.63 12.86 0.00 0.06 0.06 0.07
Average: root 10.39 0.00 133.63 12.86 0.00 0.06 0.06 0.07
Average: xvda2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

PS:

Here is more info about sysstat in Linux.

Categories: IT Architecture, Linux, Systems, Unix Tags:

resolved – nfsv4 Warning: rpc.idmapd appears not to be running. All uids will be mapped to the nobody uid

August 31st, 2015 Comments off

Today when we tried to mount a nfs share as NFSv4(mount -t nfs4 testnas:/export/testshare01 /media), the following message prompted:

Warning: rpc.idmapd appears not to be running.
All uids will be mapped to the nobody uid.

And I had a check of file permissions under the mount point, they were owned by nobody as indicated:

[root@testvm~]# ls -l /u01/local
total 8
drwxr-xr-x 2 nobody nobody 2 Dec 18 2014 SMKIT
drwxr-xr-x 4 nobody nobody 4 Dec 19 2014 ServiceManager
drwxr-xr-x 4 nobody nobody 4 Mar 31 08:47 ServiceManager.15.1.5
drwxr-xr-x 4 nobody nobody 4 May 13 06:55 ServiceManager.15.1.6

However, as I checked, rpcidmapd was running:

[root@testvm ~]# /etc/init.d/rpcidmapd status
rpc.idmapd (pid 11263) is running...

After some checking, I found it's caused by low nfs version and some missed nfs4 packages on the OEL5 boxes. You can do below to fix this:

yum -y update nfs-utils nfs-utils-lib nfs-utils-lib-devel sblim-cmpi-nfsv4 nfs4-acl-tools
/etc/init.d/nfs restart
/etc/init.d/rpcidmapd restart

If you are using Oracle SUN ZFS appliance, then please make sure to set on ZFS side anonymous user mapping to "root" and also Custom NFSv4 identity domain to the one in your env(e.g. example.com) to avoid NFS clients nobody owner issue.

resolved – yum Error performing checksum Trying other mirror and finally No more mirrors to try

August 27th, 2015 Comments off

Today when I was installing one package in Linux, below error prompted:

[root@testhost yum.repos.d]# yum list --disablerepo=* --enablerepo=yumpaas
Loaded plugins: rhnplugin, security
This system is not registered with ULN.
ULN support will be disabled.
yumpaas | 2.9 kB 00:00
yumpaas/primary_db | 30 kB 00:00
http://yumrepo.example.com/paas_oel5/repodata/b8e385ebfdd7bed69b7619e63cd82475c8bacc529db7b8c145609b64646d918a-primary.sqlite.bz2: [Errno -3] Error performing checksum
Trying other mirror.
yumpaas/primary_db | 30 kB 00:00
http://yumrepo.example.com/paas_oel5/repodata/b8e385ebfdd7bed69b7619e63cd82475c8bacc529db7b8c145609b64646d918a-primary.sqlite.bz2: [Errno -3] Error performing checksum
Trying other mirror.
Error: failure: repodata/b8e385ebfdd7bed69b7619e63cd82475c8bacc529db7b8c145609b64646d918a-primary.sqlite.bz2 from yumpaas: [Errno 256] No more mirrors to try.

The repo "yumpaas" is hosted on OEL 6 VM, which by default use sha2 for checksum. However, for OEL5 VMs(the VM running yum), yum uses sha1 by default. So I've WA this by install python-hashlib to extend yum's capability to handle sha2(python-hashlib from external repo EPEL).

[root@testhost yum.repos.d]# yum install python-hashlib

And after this, the problematic repo can be used. But to resolve this issue permanently without WA on OEL5 VMs, we should recreate the repo with sha1 for checksum algorithm(createrepo -s sha1).

resolved – passwd: User not known to the underlying authentication module

August 12th, 2015 Comments off

Today we met the following error when tried to change one user's password:

[root@test ~]# echo 2cool|passwd --stdin test
Changing password for user test.
passwd: User not known to the underlying authentication module

And after some searching work, we found it's caused by /etc/shadow file missing:

[root@test ~]# ls -l /etc/shadow
ls: /etc/shadow: No such file or directory

To generate the /etc/shadow file, use pwconv command:

[root@test ~]# pwconv

[root@test ~]# ls -l /etc/shadow
-r-------- 1 root root 1254 Aug 11 12:13 /etc/shadow

After this, we can reset password without issue:

[root@test ~]# echo mypass|passwd --stdin test
Changing password for user test.
passwd: all authentication tokens updated successfully.

Categories: IT Architecture, Linux, Systems, Unix Tags:

remove usb disk from LVM

July 29th, 2015 Comments off

On some servers, USB stick may become part of the LVM volume group. The usb is more prone to fail, which will cause big issues when they start. Also they work at a different speed than the drives, and this also causes performance issues.

For example, on one server, you can see below:

[root@test ~]# vgs
  VG      #PV #LV #SN Attr   VSize VFree
  DomUVol   2   1   0 wz--n- 3.77T    0

[root@test~]# pvs
  PV         VG      Fmt  Attr PSize PFree
  /dev/sdb1  DomUVol lvm2 a--  3.76T    0 #all PE allocated
  /dev/sdc1  DomUVol lvm2 a--  3.59G    0 #this is usb device

[root@test~]# lvs
  LV      VG      Attr   LSize Origin Snap%  Move Log Copy%  Convert
  scratch DomUVol -wi-ao 3.77T

[root@test ~]# df -h /scratch
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/DomUVol-scratch
                      3.7T  257G  3.3T   8% /scratch

[root@test~]# pvdisplay /dev/sdc1
  --- Physical volume ---
  PV Name               /dev/sdc1
  VG Name               DomUVol
  PV Size               3.61 GB / not usable 14.61 MB
  Allocatable           yes (but full)
  PE Size (KByte)       32768
  Total PE              115
  Free PE               0
  Allocated PE          115 #so physical extents are allocated on this usb device
  PV UUID               a8a0P5-AlCz-Cu5e-acC2-ldEQ-NPCn-kc4Du0

As you can see from above, as the USB device has PE allocated, so to remove that(pvreduce), we need first move PE from it to other PV. We can see the other PV has all space allocated(can also confirm from vgs above):

[root@test~]# pvdisplay /dev/sdb1
  --- Physical volume ---
  PV Name               /dev/sdb1
  VG Name               DomUVol
  PV Size               3.76 TB / not usable 30.22 MB
  Allocatable           yes (but full)
  PE Size (KByte)       32768
  Total PE              123360
  Free PE               0
  Allocated PE          123360
  PV UUID               5IyCgh-JsiV-EnpO-XKj4-yxNq-pRjI-d7LKGy

Here are the steps for taking usb device out of VG:

umount /scratch
#fsck -y /dev/mapper/DomUVol-scratch
lvreduce --size -5G /dev/mapper/DomUVol-scratch
resize2fs /dev/mapper/DomUVol-scratch

[root@test ~]# vgs
  VG      #PV #LV #SN Attr   VSize VFree
  DomUVol   2   1   0 wz--n- 3.77T 5.00G

[root@test ~]# pvs
  PV         VG      Fmt  Attr PSize PFree
  /dev/sdb1  DomUVol lvm2 a--  3.76T 1.41G
  /dev/sdc1  DomUVol lvm2 a--  3.59G 3.59G #PEs on the usb device are all freed, if not, use pvmove /dev/sdc1. More info is here about pvmove

[root@test ~]# pvdisplay /dev/sdc1
  --- Physical volume ---
  PV Name               /dev/sdc1
  VG Name               DomUVol
  PV Size               3.61 GB / not usable 14.61 MB
  Allocatable           yes
  PE Size (KByte)       32768
  Total PE              115
  Free PE               115
  Allocated PE          0
  PV UUID               a8a0P5-AlCz-Cu5e-acC2-ldEQ-NPCn-kc4Du0

[root@test ~]# vgreduce DomUVol /dev/sdc1
  Removed "/dev/sdc1" from volume group "DomUVol"

[root@test ~]# pvs
  PV         VG      Fmt  Attr PSize PFree
  /dev/sdb1  DomUVol lvm2 a--  3.76T 1.41G
  /dev/sdc1          lvm2 a--  3.61G 3.61G #VG column is empty for the usb device, this confirms the usb device is taken out of VG. You can run pvremove /dev/sdc1 to remove the pv.

PS:

  1. If you want to shrink lvm volume(lvreduce) on /, then you'll need go to linux rescue mode. Select "Skip" when the system prompts for the option of mounting / to /mnt/sysimage. Run "lvm vgchange -a y" first and then the other steps are more or less the same as above, but you'll need type "lvm" before any lvm command, such as "lvm lvs", "lvm pvs", "lvm lvreduce" etc.
  2. You can refer to this article about using vgcfgrestore to restore vg config from /etc/lvm/archive/.
  3. If having enough space on other PV, then use pvmove, refer to http://www.tldp.org/HOWTO/LVM-HOWTO/removeadisk.html

resolved – yum install Error: Protected multilib versions

June 30th, 2015 Comments off

Today when I tried to install firefox.i686 on Linux using yum, the following error occurred:

Protected multilib versions: librsvg2-2.26.0-14.el6.i686 != librsvg2-2.26.0-5.el6_1.1.x86_64
Error: Protected multilib versions: devhelp-2.28.1-6.el6.i686 != devhelp-2.28.1-3.el6.x86_64
Error: Protected multilib versions: ImageMagick-6.5.4.7-7.el6_5.i686 != ImageMagick-6.5.4.7-6.el6_2.x86_64
Error: Protected multilib versions: vte-0.25.1-9.el6.i686 != vte-0.25.1-8.el6_4.x86_64
Error: Protected multilib versions: polkit-gnome-0.96-4.el6.i686 != polkit-gnome-0.96-3.el6.x86_64
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest

To resolve this, just run yum update <package names>, then the problem will go away.

Categories: IT Architecture, Linux, Systems, Unix Tags:

resolved – Checking for glibc-devel-2.12-1.7-i686; Not found. Failed

May 5th, 2015 Comments off

Today when I tried to install Oracle EM Cloud Control 12c, below error prompted when pre-checking:

pre-check failed

So from above, we can see that it's complaining about missing package "glibc-devel-2.12-1.7-i686"(Checking for glibc-devel-2.12-1.7-i686; Not found. Failed). And I found there were glibc related packages on the system:

[root@testvm ~]# rpm -qa|grep glibc
glibc-common-2.12-1.149.el6_6.7.x86_64
glibc-devel-2.12-1.149.el6_6.7.x86_64
glibc-headers-2.12-1.149.el6_6.7.x86_64
glibc-2.12-1.149.el6_6.7.x86_64

But they were all x86_64 version, not the missing i686 one. So I determined to install i686 ones:

[root@testvm]# yum install -y glibc.i686 glibc-devel.i686 glibc-static.i686

After this, and press "Rerun", the check succeeded.

Categories: IT Architecture, Linux, Systems, Unix Tags:

resolved – file filelists.xml.gz [Errno 5] OSError: [Errno 2] No such file or directory [Errno 256] No more mirrors to try

April 8th, 2015 Comments off

Today below error prompted when running yum install some packages in linux:

file://localhost/tmp/common1/x86_64/redhat/50/base/ga/Server/repodata/filelists.xml.gz: [Errno 5] OSError: [Errno 2] No such file or directory: '/tmp/common1/x86_64/redhat/50/base/ga/Server/repodata/filelists.xml.gz'
Trying other mirror.
Error: failure: repodata/filelists.xml.gz from base: [Errno 256] No more mirrors to try.
You could try running: package-cleanup --problems
package-cleanup --dupes
rpm -Va --nofiles --nodigest

After some checking(yum clean all, download repo to /etc/yum.repos.d, etc), I finally found it's caused by the following entries in /etc/yum.conf:

[base]
name=Red Hat Linux - Base
baseurl=file://localhost/tmp/common1/x86_64/redhat/50/base/ga/Server

After I commented them, yum install can work now.

 

Categories: IT Architecture, Linux, Systems, Unix Tags:

resolved – ext3: No journal on filesystem on disk

March 23rd, 2015 Comments off

Today I met below error when trying to mount a disk:

[root@testvm ~]# mount /scratch
mount: wrong fs type, bad option, bad superblock on /dev/xvdb1,
missing codepage or other error
In some cases useful info is found in syslog - try
dmesg | tail or so

First I ran fsck -y /dev/xvdb1, but after it's done, the issue was still there(sometimes fsck -y /dev/xvdb1 could resolve this though). So as it suggested, I ran a dmesg | tail:

[root@testvm scratch]# dmesg | tail
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
NFSD: starting 90-second grace period
ext3: No journal on filesystem on xvdb1
ext3: No journal on filesystem on xvdb1
ext3: No journal on filesystem on xvdb1
ext3: No journal on filesystem on xvdb1

So from here we can see that the root cause for mounting failure was "ext3: No journal on filesystem on xvdb1". I first ran "fsck -y /dev/xvdb1", and try mount again. But the issue was still there. So I tried with adding ext3 journal on that disk:

[root@testvm qgomsdc1]# tune2fs -j /dev/xvdb1
tune2fs 1.39 (29-May-2006)
Creating journal inode:

done
This filesystem will be automatically checked every 20 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.

After this, the mount succeeded.

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

resolved – su: cannot set user id: Resource temporarily unavailable

January 12th, 2015 Comments off

When i try to log on as user "test", error occurred:

su: cannot set user id: Resource temporarily unavailable

I had a check of limits.conf:

[root@testvm ~]# cat /etc/security/limits.conf|egrep -v '^$|^#'
oracle   soft   nofile    131072
oracle   hard   nofile    131072
oracle   soft   nproc    131072
oracle   hard   nproc    131072
oracle   soft   core    unlimited
oracle   hard   core    unlimited
oracle   soft   memlock    50000000
oracle   hard   memlock    50000000
@svrtech    soft    memlock         500000
@svrtech    hard    memlock         500000
*   soft   nofile    131072
*   hard   nofile    131072
*   soft   nproc    131072
*   hard   nproc    131072
*   soft   core    unlimited
*   hard   core    unlimited
*   soft   memlock    50000000
*   hard   memlock    50000000

Then I had a check of the number of processes/threads with the maximum number of processes to see whether it's coming over the line:

[root@c9qa131-slcn03vmf0293 ~]# ps -eLF | grep test | wc -l
1026

So it's not exceeding. Then I had a check of open files:

[root@testvm ~]# lsof | grep aime | wc -l

6059

It's not exceeding 131072 either, then why the error "su: cannot set user id: Resource temporarily unavailable" was there? Actually the culprit was in file /etc/security/limits.d/90-nproc.conf:

[root@testvm ~]# cat /etc/security/limits.d/90-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.

* soft nproc 1024
root soft nproc unlimited

After I modified 1024 to 131072, the issue gone away immediately.

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

resolved – cssh installation on linux server

December 29th, 2014 Comments off

ClusterSSH can be used if you need controls a number of xterm windows via a single graphical console window, and you want to run commands interactively on multiple servers over an ssh connection. This guide will show the process to install clusterssh on a linux box from tarball.

At the very first, you should download cssh tarball App-ClusterSSH-4.03_04.tar.gz from sourceforge. You may need export proxy settings if it's needed in your env:

export https_proxy=http://my-proxy.example.com:80/
export http_proxy=http://my-proxy.example.com:80/
export ftp_proxy=http://my-proxy.example.com:80/

After the proxy setting, you can now get the package:

wget 'http://sourceforge.net/projects/clusterssh/files/latest/download'
tar zxvf App-ClusterSSH-4.03_04.tar.gz
cd App-ClusterSSH-4.03_04
cat README

Before installing, let's install some prerequisites packages:

yum install gcc libX11-devel gnome* -y
yum groupinstall "X Window System" -y
yum groupinstall "GNOME Desktop Environment" -y
yum groupinstall "Graphical Internet" -y
yum groupinstall "Graphics" -y

Now run "perl Build.PL" as indicated by README:

[root@centos-32bits App-ClusterSSH-4.03_04]# perl Build.PL
Can't locate Module/Build.pm in @INC (@INC contains: /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl /usr/lib/perl5/5.8.8/i386-linux-thread-multi /usr/lib/perl5/5.8.8 .) at Build.PL line 5.
BEGIN failed--compilation aborted at Build.PL line 5.

As it challenged, you need install Module::Build.pm first. Let's use cpan to install that module.

Run "cpan" and enter "follow" when below info occurred:

Policy on building prerequisites (follow, ask or ignore)? [ask] follow

If you had already ran cpan before, then you can configure the policy as below:

cpan> o conf prerequisites_policy follow
cpan> o conf commit

Now Let's install Module::Build:

cpan> install Module::Build

After the installation, let's run "perl Build.PL" again:

[root@centos-32bits App-ClusterSSH-4.03_04]# perl Build.PL
Checking prerequisites...
  requires:
    !  Exception::Class is not installed
    !  Tk is not installed
    !  Try::Tiny is not installed
    !  X11::Protocol is not installed
  build_requires:
    !  CPAN::Changes is not installed
    !  File::Slurp is not installed
    !  File::Which is not installed
    !  Readonly is not installed
    !  Test::Differences is not installed
    !  Test::DistManifest is not installed
    !  Test::PerlTidy is not installed
    !  Test::Pod is not installed
    !  Test::Pod::Coverage is not installed
    !  Test::Trap is not installed

ERRORS/WARNINGS FOUND IN PREREQUISITES.  You may wish to install the versions
of the modules indicated above before proceeding with this installation

Run 'Build installdeps' to install missing prerequisites.

Created MYMETA.yml and MYMETA.json
Creating new 'Build' script for 'App-ClusterSSH' version '4.03_04'

As the output says, run "./Build installdeps" to install the missing packages. Make sure you're in GUI env(through vncserver maybe), as "perl Build.PL" has a step to test GUI.

[root@centos-32bits App-ClusterSSH-4.03_04]# ./Build installdeps

......

Running Mkbootstrap for Tk::Xlib ()
chmod 644 "Xlib.bs"
"/usr/bin/perl" "/usr/lib/perl5/5.8.8/ExtUtils/xsubpp" -typemap "/usr/lib/perl5/5.8.8/ExtUtils/typemap" -typemap "/root/.cpan/build/Tk-804.032/Tk/typemap" Xlib.xs > Xlib.xsc && mv Xlib.xsc Xlib.c
make[1]: *** No rule to make target `pTk/tkInt.h', needed by `Xlib.o'. Stop.
make[1]: Leaving directory `/root/.cpan/build/Tk-804.032/Xlib'
make: *** [subdirs] Error 2
/usr/bin/make -- NOT OK
Running make test
Can't test without successful make
Running make install
make had returned bad status, install seems impossible

Errors again, we can see it's complaining something about TK related thing. To resolve this, I manully installed the latest perl-tk module as below:

wget --no-check-certificate 'https://github.com/eserte/perl-tk/archive/master.zip'
unzip master
cd perl-tk-master
perl Makefile.PL
make
make install

After this, let's run "./Build installdeps" and "perl Build.PL" again which all went through good:

[root@centos-32bits App-ClusterSSH-4.03_04]# ./Build installdeps

[root@centos-32bits App-ClusterSSH-4.03_04]# perl Build.PL

And let's run ./Build now:

[root@centos-32bits App-ClusterSSH-4.03_04]# ./Build
Building App-ClusterSSH
Generating: ccon
Generating: crsh
Generating: cssh
Generating: ctel

And now "./Build install" which is the last step:

[root@centos-32bits App-ClusterSSH-4.03_04]# ./Build install

After installation, let's have a test:

[root@centos-32bits App-ClusterSSH-4.03_04]# echo 'svr testserver1 testserver2' > /etc/clusters

Now run 'cssh svr', and you'll get the charm!

clusterssh

PS: 

If you met error like below:

Can't connect to display `unix:1': No such file or directory at /usr/local/share/perl5/X11/Protocol.pm line 2264.

And you are connecting to vnc session like below:

root 3291 1 0 07:36 ? 00:00:02 /usr/bin/Xvnc :1 -desktop Yue-test:1 (root) -auth /root/.Xauthority -geometry 1600x900 -rfbwait 30000 -rfbauth /root/.vnc/passwd -rfbport 5901 -fp catalogue:/etc/X11/fontpath.d -pn

Then make sure to do below:

export DISPLAY=localhost:1.0

Categories: Clouding, IT Architecture, Linux, Systems, Unix Tags:

resolved – openssl error:0D0C50A1:asn1 encoding routines:ASN1_item_verify:unknown message digest algorithm

December 17th, 2014 Comments off

Today when I tried using curl to get url info, error occurred like below:

[root@centos-doxer ~]# curl -i --user username:password -H "Content-Type: application/json" -X POST --data @/u01/shared/addcredential.json https://testserver.example.com/actions -v

* About to connect() to testserver.example.com port 443

*   Trying 10.242.11.201... connected

* Connected to testserver.example.com (10.242.11.201) port 443

* successfully set certificate verify locations:

*   CAfile: /etc/pki/tls/certs/ca-bundle.crt

  CApath: none

* SSLv2, Client hello (1):

SSLv3, TLS handshake, Server hello (2):

SSLv3, TLS handshake, CERT (11):

SSLv3, TLS alert, Server hello (2):

error:0D0C50A1:asn1 encoding routines:ASN1_item_verify:unknown message digest algorithm

* Closing connection #0

After some searching, I found that it's caused by the current version of openssl(openssl-0.9.8e) does not support SHA256 Signature Algorithm. To resolve this, there are two ways:

1. add -k parameter to curl to ignore the SSL error

2. update openssl to "OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008", just try "yum update openssl".

# openssl version
OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008

3. upgrade openssl to at least openssl-0.9.8o. Here's the way to upgrade openssl: --- this may not be needed, try method 2 above

wget --no-check-certificate 'https://www.openssl.org/source/old/0.9.x/openssl-0.9.8o.tar.gz'
tar zxvf openssl-0.9.8o.tar.gz
cd openssl-0.9.8o
./config --prefix=/usr --openssldir=/usr/openssl
make
make test
make install

After this, run openssl version to confirm:

[root@centos-doxer openssl-0.9.8o]# /usr/bin/openssl version
OpenSSL 0.9.8o 01 Jun 2010

PS:

  • If you installed openssl from rpm package, then you'll find the openssl version is still the old one even after you install the new package. This is expected so don't rely too much on rpm:

[root@centos-doxer openssl-0.9.8o]# /usr/bin/openssl version
OpenSSL 0.9.8o 01 Jun 2010

Even after rebuilding rpm DB(rpm --rebuilddb), it's still the old version:

[root@centos-doxer openssl-0.9.8o]# rpm -qf /usr/bin/openssl
openssl-0.9.8e-26.el5_9.1
openssl-0.9.8e-26.el5_9.1

[root@centos-doxer openssl-0.9.8o]# rpm -qa|grep openssl
openssl-0.9.8e-26.el5_9.1
openssl-devel-0.9.8e-26.el5_9.1
openssl-0.9.8e-26.el5_9.1
openssl-devel-0.9.8e-26.el5_9.1

  • If you met error "Unknown SSL protocol error in connection to xxx" (curl) or "write:errno=104" (openssl) on OEL5, then it's due to curl/openssl comes with OEL5 does not support TLSv1.2 (you can run "curl --help|grep -i tlsv1.2" to confirm). You need manually compile & install curl/openssl to fix this. More info here. (you can softlink /usr/local/bin/curl to /usr/bin/curl, but can leave openssl as it is)
    • OpenSSL to /usr/local/ssl (won't overwirte default openssl)
      • wget --no-check-certificate 'https://www.openssl.org/source/openssl-1.0.2l.tar.gz'
        tar zxvf openssl-1.0.2l.tar.gz
        cd openssl-1.0.2l
        export PATH=/usr/bin:/u01/Allshared/JAVA7/jdk1.7.0_80/bin:/usr/sbin:/usr/bin:/usr/sbin:/usr/bin:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin #to avoid /usr/local readonly error (autofs,perl "/usr/local/bin/perl5: error while loading shared libraries: libdb.so.3: cannot open shared object file: No such file or directory", then export PATH to exclude /usr/local/bin/)
        ./config -fpic shared && make && make install
        echo "/usr/local/ssl/lib" >> /etc/ld.so.conf
        ldconfig
        ls -l /usr/bin/openssl*
        openssl version
      • You will need specify openssl lib path if you compile other packages (such as python):
        • ./configure --prefix=/usr/local/ --enable-shared LDFLAGS="-L/usr/local/ssl/lib" #if you are using non-standard Lib path
  • Curl to /usr/local/curl/bin/curl
    • wget --no-check-certificate 'http://curl.haxx.se/download/curl-7.53.1.tar.gz'
      tar zxvf curl-7.53.1.tar.gz
      cd curl-7.53.1.tar.gz
      ./configure --prefix=/usr/local/curl --with-ssl=/usr/local/ssl LDFLAGS="-L/usr/local/ssl/lib" --disable-ldap
      ls -l /usr/bin/curl*
      mv /usr/bin/curl /usr/bin/curl.old; ln -s /usr/local/curl/bin/curl /usr/bin/curl
      curl -V
  • You will need specify openssl lib path if you compile other packages:
    • ./configure --prefix=/usr/local/ --enable-shared LDFLAGS="-L/usr/local/ssl/lib" #if you are using non-standard Lib path

output analysis of linux last command

December 9th, 2014 Comments off

Here's the output of "last|less" on my linux host:

root     pts/9        remote.example   Tue Dec  9 14:51   still logged in
testuser pts/2        :3               Tue Dec  9 14:49   still logged in
aime     pts/1        :2               Tue Dec  9 14:49   still logged in
root     pts/0        :1               Tue Dec  9 14:49   still logged in
testuser pts/13       remote.example   Tue Dec  9 10:48 - 10:52  (00:02)
reboot   system boot  2.6.23           Tue Dec  9 10:11          (04:39)
root     pts/11       10.182.120.179   Thu Dec  4 17:14 - 17:20  (00:06)
root     pts/11       10.182.120.179   Thu Dec  4 17:14 - 17:14  (00:00)
root     pts/10       10.182.120.179   Thu Dec  4 15:55 - 15:55  (00:00)
testuser pts/14       :3.0             Tue Dec  2 15:44 - 15:46  (00:01)
testuser pts/12       :3.0             Tue Dec  2 15:44 - 15:46  (00:01)
testuser pts/13       :3.0             Tue Dec  2 15:44 - 15:46  (00:01)
testuser pts/15       :3.0             Tue Dec  2 15:44 - 15:46  (00:01)
testuser pts/11       :3.0             Tue Dec  2 15:44 - 15:46  (00:01)
testuser pts/16       :3.0             Tue Dec  2 15:44 - 15:46  (00:01)
root     pts/10       10.182.120.179   Tue Dec  2 11:20 - 11:20  (00:00)
root     pts/7        10.182.120.179   Tue Dec  2 10:15 - down  (6+07:39)
root     pts/6        10.182.120.179   Tue Dec  2 10:15 - 17:55 (6+07:39)
root     pts/5        10.182.120.179   Tue Dec  2 10:15 - 17:55 (6+07:39)
root     pts/4        10.182.120.179   Tue Dec  2 10:15 - 17:55 (6+07:39)
root     pts/3        10.182.120.179   Tue Dec  2 10:15 - 17:55 (6+07:39)
root     pts/2        :1               Tue Dec  2 10:00 - down  (6+07:55)
aime     pts/1        :2               Tue Dec  2 10:00 - down  (6+07:55)
testuser pts/0        :3               Tue Dec  2 10:00 - down  (6+07:55)
reboot   system boot  2.6.23           Tue Dec  2 09:58         (6+07:56)

Here's some analysis:

  • User "reboot" is a pseudo-user for system reboot. Entries between two reboots are users who log on the system during two reboots. For info about login shells(.bash_profile) and interactive non-login shells(.bashrc), you can refer to here.
  • Here're columns meanings:

Column 1: User logged on

Column 2: The tty name after logging on

Column 3: Remote IP or hostname from which the user logged on. You can see ":1", ":2", ":3", that's vnc port number which vncserver are rendering against.

Column 4: Begin/End time of the session. If "still logged in", then means the user is still logged on; if there's value in parenthesis, then that's the total time of the logged on. For the latest "reboot"(red line 1), means the uptime till now; For the second "reboot"(red line 2), means the uptime between two reboots. Note however that this time is not always accurate, for example after system crash and unusual restart sequence. last calculates it as time between it and next reboot/shutdown.

 

Categories: IT Architecture, Linux, Systems, Unix Tags:

resolved – switching from Unbreakable Enterprise Kernel Release 2(UEKR2) to UEKR3 on Oracle Linux 6

November 24th, 2014 Comments off

As we can see from here, the available kernels include the following 3 for Oracle Linux 6:

3.8.13 Unbreakable Enterprise Kernel Release 3 (x86_64 only)
2.6.39 Unbreakable Enterprise Kernel Release 2**
2.6.32 (Red Hat compatible kernel)

On one of our OEL6 VM, we found that it's using UEKR2:

[root@testbox aime]# cat /etc/issue
Oracle Linux Server release 6.4
Kernel \r on an \m

[root@testbox aime]# uname -r
2.6.39-400.211.1.el6uek.x86_64

So how can we switch the kernel to UEKR3(3.8)?

If your linux version is 6.4, first do a "yum update -y" to upgrade to 6.5 and uppper, and then reboot the host, and follow steps below.

[root@testbox aime]# ls -l /etc/grub.conf
lrwxrwxrwx. 1 root root 22 Aug 21 18:24 /etc/grub.conf -> ../boot/grub/grub.conf

[root@testbox aime]# yum update -y

If your linux version is 6.5 and upper, you'll find /etc/grub.conf and /boot/grub/grub.conf are different files(for yum update one. If your host is OEL6.5 when installed, then /etc/grub.conf should be softlink too):

[root@testbox ~]# ls -l /etc/grub.conf
-rw------- 1 root root 2356 Oct 20 05:26 /etc/grub.conf

[root@testbox ~]# ls -l /boot/grub/grub.conf
-rw------- 1 root root 1585 Nov 23 21:46 /boot/grub/grub.conf

In /etc/grub.conf, you'll see entry like below:

title Oracle Linux Server Unbreakable Enterprise Kernel (3.8.13-44.1.3.el6uek.x86_64)
root (hd0,0)
kernel /vmlinuz-3.8.13-44.1.3.el6uek.x86_64 ro root=/dev/mapper/vg01-lv_root rd_LVM_LV=vg01/lv_root rd_NO_LUKS rd_LVM_LV=vg01/lv_swap LANG=en_US.UTF-8 KEYTABLE=us console=hvc0 rd_NO_MD SYSFONT=latarcyrheb-sun16 rd_NO_DM rhgb quiet
initrd /initramfs-3.8.13-44.1.3.el6uek.x86_64.img

What you'll need to do is just copying the entries above from /etc/grub.conf to /boot/grub/grub.conf(make sure /boot/grub/grub.conf not be a softlink, else you may met error "Boot loader didn't return any data"), and then reboot the VM.

After rebooting, you'll find the kernel is now at UEKR3(3.8).

PS:

If you find the VM is OEL6.5 and /etc/grub.conf is a softlink to /boot/grub/grub.conf, then you could do the following to upgrade kernel to UEKR3:

1. add the following lines to /etc/yum.repos.d/public-yum-ol6.repo:

[public_ol6_UEKR3]
name=UEKR3 for Oracle Linux 6 ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/UEKR3/latest/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=1

2. List and install UEKR3:

[root@testbox aime]# yum list|grep kernel-uek|grep public_ol6_UEKR3
kernel-uek.x86_64 3.8.13-44.1.5.el6uek public_ol6_UEKR3
kernel-uek-debug.x86_64 3.8.13-44.1.5.el6uek public_ol6_UEKR3
kernel-uek-debug-devel.x86_64 3.8.13-44.1.5.el6uek public_ol6_UEKR3
kernel-uek-devel.x86_64 3.8.13-44.1.5.el6uek public_ol6_UEKR3
kernel-uek-doc.noarch 3.8.13-44.1.5.el6uek public_ol6_UEKR3
kernel-uek-firmware.noarch 3.8.13-44.1.5.el6uek public_ol6_UEKR3
kernel-uek-headers.x86_64 3.8.13-26.2.4.el6uek public_ol6_UEKR3

[root@testbox aime]# yum install -y kernel-uek* --disablerepo=* --enablerepo=public_ol6_UEKR3

3. Reboot

resolved – auditd STDERR: Error deleting rule Error sending enable request (Operation not permitted)

September 19th, 2014 Comments off

Today when I try to restart auditd, the following error message prompted:

[2014-09-18T19:26:41+00:00] ERROR: service[auditd] (cookbook-devops-kernelaudit::default line 14) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
---- Begin output of /sbin/service auditd restart ----
STDOUT: Stopping auditd: [  OK  ]
Starting auditd: [FAILED]
STDERR: Error deleting rule (Operation not permitted)
Error sending enable request (Operation not permitted)
---- End output of /sbin/service auditd restart ----
Ran /sbin/service auditd restart returned 1

After some reading of manpage auditd, I realized that when audit "enabled" was set to 2(locked), any attempt to change the configuration in this mode will be audited and denied. And that maybe the reason of "STDERR: Error deleting rule (Operation not permitted)", "Error sending enable request (Operation not permitted)". Here's from man page of auditctl:

-e [0..2] Set enabled flag. When 0 is passed, this can be used to temporarily disable auditing. When 1 is passed as an argument, it will enable auditing. To lock the audit configuration so that it can't be changed, pass a 2 as the argument. Locking the configuration is intended to be the last command in audit.rules for anyone wishing this feature to be active. Any attempt to change the configuration in this mode will be audited and denied. The configuration can only be changed by rebooting the machine.

You can run auditctl -s to check the current setting:

[root@centos-doxer ~]# auditctl -s
AUDIT_STATUS: enabled=1 flag=1 pid=3154 rate_limit=0 backlog_limit=320 lost=0 backlog=0

And you can run auditctl -e <0|1|2> to change this attribute on the fly, or you can add -e <0|1|2> in /etc/audit/audit.rules. Please note after you modify this, a reboot is a must to make this into effect.

PS:

Here's more about linux audit.

resolved – Permission denied even after chmod 777 world readable writable

September 19th, 2014 Comments off

Several team members asked me that when they want to change to some directories or read some files ,the system reported error "Permission denied". Even after setting world writable(chmod 777), the error was still there:

-bash-3.2$ cd /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs
-bash: cd: /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs: Permission denied

-bash-3.2$ cat /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out
cat: /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out: Permission denied

-bash-3.2$ ls -l /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out
-rwxrwxrwx 1 oracle oinstall 1100961066 Sep 19 07:37 /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out

In summary, if you want to read some file(e.g. wls_sdi1.out) under some directory(e.g. /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs), then except for "read bit" set on that file(chmod +r wls_sdi1.out), it's also needed that all parent directories of that file(/u01, /u01/local, /u01/local/wls, ......, /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs) have both "read bit" & "execute bit" set(you can check it by ls -ld <dir name>):

chmod +r wls_sdi1.out #first set "read bit" on the file
chmod +r /u01; chmod +x /u01; chmod +r /u01/local; chmod +x /u01/local; <...skipped...>chmod +r /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs; chmod +x /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs; #then set both "read bit" & "execute bit" on all parent directories

And at last, if you can log on as the file owner, then everything will be smooth. For /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out, it's owned by oracle user. So you can try log on as oracle user and do the operations.

PS:

  • -rw-rw-r--+ extended security information, + means ACL set (getfacl, setfacl). If it's NFS and 777 not working, then make sure no_root_squash enabled. For ZFS share, make sure to set on ZFS (User/Group & Anonymous user mapping -> root/root, Permission & Root directory Access rwx/rwx/rwx), and /etc/fstab "rsize=32768,wsize=32768,hard,nolock,timeo=14,noacl,intr,mountvers=3,vers=3" (may need reboot to fix permission issue)
  • -rw-rw-r--. means SELinux context set, to remove it, try find / -exec setfattr -h -x security.selinux {} \;
Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

crontab cronjob failed with date single apostrophe date +%d-%b-%Y-%H-%M on linux

August 4th, 2014 Comments off

I tried to creat one linux cronjob today, and want to note down date & time when the job was running, and here's the content:

echo '10 10 * * 1 root cd /var/log/ovm-manager/;tar zcvf oc4j.log.`date +%m-%d-%y`.tar.gz oc4j.log;echo "">/var/log/ovm-manager/oc4j.log' > /etc/cron.d/oc4j

However, this entry failed to run, and when check log in /var/log/cron:

Aug 4 06:24:01 testhost crond[1825]: (root) RELOAD (cron/root)
Aug 4 06:24:01 testhost crond[1825]: (root.bak) ORPHAN (no passwd entry)
Aug 4 06:25:01 testhost crond[28376]: (root) CMD (cd /var/log/ovm-manager/;tar zcvf oc4j.log.`date +)

So, the command was intercepted and that's the reason for the failure.

Eventually, I figured out that cron treats the % character specially (it is turned into a newline in the command). You must precede all % characters with a \ in a crontab file, which tells cron to just put a % in the command. And here's the updated version:

echo '10 10 * * 1 root cd /var/log/ovm-manager/;tar zcvf oc4j.log.`date +\%m-\%d-\%y`.tar.gz oc4j.log;echo "">/var/log/ovm-manager/oc4j.log' > /etc/cron.d/oc4j

This time, the job got ran successfully:

Aug 4 06:31:01 testhost crond[1825]: (root) RELOAD (cron/root)
Aug 4 06:31:01 testhost crond[1825]: (root.bak) ORPHAN (no passwd entry)
Aug 4 06:31:01 testhost crond[28503]: (root) CMD (cd /var/log/ovm-manager/;tar zcvf oc4j.log.`date +%m-%d-%y`.tar.gz oc4j.log;echo "">/var/log/ovm-manager/oc4j.log)

PS:

  1. You should use in cron - "su - oracle -c <cmd>" if the script bound to special user
  2. More on here http://stackoverflow.com/questions/1486088/cron-fails-on-single-apostrophe
Categories: IT Architecture, Linux, Systems, Unix Tags:

linux process accounting set up

July 8th, 2014 Comments off

Ensure package psacct is installed and make it boot with system:

rpm -qa|grep -i psacct
chkconfig psacct on
service psacct start

Here're some useful commands

[root@qg-dc2-tas_sdi ~]# ac -p #Display time totals for each user
emcadm 0.00
test1 2.57
aime 37.04
oracle 32819.22
root 12886.86
testuser 1.47
total 45747.15

[root@qg-dc2-tas_sdi ~]# lastcomm testuser #Display command executed by user testuser
top testuser pts/5 0.02 secs Fri Jul 4 03:59
df testuser pts/5 0.00 secs Fri Jul 4 03:59

[root@qg-dc2-tas_sdi ~]# lastcomm top #Search the accounting logs by command name
top testuser pts/5 0.03 secs Fri Jul 4 04:02

[root@qg-dc2-tas_sdi ~]# lastcomm pts/5 #Search the accounting logs by terminal name pts/5
top testuser pts/5 0.03 secs Fri Jul 4 04:02
sleep X testuser pts/5 0.00 secs Fri Jul 4 04:02

[root@qg-dc2-tas_sdi ~]# sa |head #Use sa command to print summarizes information(e.g. the number of times the command was called and the system resources used) about previously executed commands.
332 73.36re 0.03cp 8022k
33 8.76re 0.02cp 7121k ***other*
14 0.02re 0.01cp 26025k perl
7 0.00re 0.00cp 16328k ps
49 0.00re 0.00cp 2620k find
42 0.00re 0.00cp 13982k grep
32 0.00re 0.00cp 952k tmpwatch
11 0.01re 0.00cp 13456k sh
11 0.00re 0.00cp 2179k makewhatis*
8 0.01re 0.00cp 2683k sort

[root@qg-dc2-tas_sdi ~]# sa -u |grep testuser #Display output per-user
testuser 0.00 cpu 14726k mem sleep
testuser 0.03 cpu 4248k mem top
testuser 0.00 cpu 22544k mem sshd *
testuser 0.00 cpu 4170k mem id
testuser 0.00 cpu 2586k mem hostname

[root@qg-dc2-tas_sdi ~]# sa -m | grep testuser #Display the number of processes and number of CPU minutes on a per-user basis
testuser 22 8.18re 0.00cp 7654k

Categories: IT Architecture, Linux, Systems, Unix Tags:

Enable NIS client on linux host

July 2nd, 2014 1 comment

After you set up NIS server, you need set up NIS client. Here's the steps for enabling NIS client on linux box.

Ensure required packages are installed

rpm -qa|egrep 'yp-tools|ypbind|portmap'

Edit /etc/sysconfig/network

NISDOMAIN=example.com

Edit /etc/yp.conf
domain example.com server 10.229.169.88
domain example.com server 10.229.192.99

Set NIS domain-name

domainname example.com
ypdomainname example.com

Set /etc/nsswitch.conf

passwd: files nis
shadow: files nis
group: files nis
hosts: files dns nis
bootparams: nisplus [NOTFOUND=return] files
ethers: files
netmasks: files
networks: files
protocols: files
rpc: files
services: files
netgroup: nisplus
publickey: nisplus
automount: files nisplus
aliases: files nisplus
sudoers: files nis

Make sure the portmap service is running:

service portmap start

chkconfig portmap on

Start ypbind service:

service ypbind start
chkconfig ypbind on

Test it out:

rpcinfo -u localhost ypbind

ypcat passwd|egrep 'username'

If you want to set up sudo privileges for NIS users, then you can refer to this article resolved – /etc/sudoers: syntax error near line 10

PS:

  • If there's firewall between Linux NIS clients and NIS servers, then you should not startup ypbind(chkconfig ypbind off; service ypbind stop), if you startup ypbind, then the box will try to connect to NIS servers without stopping. Your linux box will get stuck and will take a long time for you to log on even as root. This is rule of thumb.
  • Here's about ypwhich:

#a list of available maps

[root@testvm ~]# ypwhich -m

networks.byaddr dcsun2
netgroup.byuser dcsun2
ethers.byaddr dcsun2-new
ypservers dcsun2
hosts.byaddr dcsun2
auto.utils ap101nis
mail.byaddr dcsun2
passwd.byname dcsun2
passwd.byuid dcsun2
protocols.bynumber dcsun2-new
stnis st-yp
group.bygid dcsun2
networks.byname dcsun2
tnsnames dcsun2-new
ethers.byname dcsun2-new
netmasks.byaddr st-yp
hosts.byname dcsun2
auto_home st-yp
auto_home_slc st-yp
printers.conf.byname st-yp
mail.aliases dcsun2
protocols.byname dcsun2-new
bootparams dcsun2-new
rpc.bynumber dcsun2-new
timezone.byname st-yp-refresh
netid.byname dcsun2
group.byname dcsun2
auto_ade st-yp-refresh
netgroup dcsun2
publickey.byname dcsun2-new
netgroup.byhost dcsun2
services.byname dcsun2-new
auto_home_appsdev st-yp

#Display the map nickname translation table, in /var/yp/nicknames

[root@testvm ~]# ypwhich -x

Use "ethers" for map "ethers.byname"
Use "aliases" for map "mail.aliases"
Use "services" for map "services.byname"
Use "protocols" for map "protocols.bynumber"
Use "hosts" for map "hosts.byname"
Use "networks" for map "networks.byaddr"
Use "group" for map "group.byname"
Use "passwd" for map "passwd.byname"

  • TBC
Categories: IT Architecture, Linux, Systems, Unix Tags:

resolved – /etc/sudoers: syntax error near line 10

July 2nd, 2014 Comments off

When using /usr/sbin/visudo, after modification, errors occurred(you can run visudo -c for manually check):

>>> /etc/sudoers: syntax error near line 10 <<<

Here's line 10:

User_Alias Users_SDITAS = username1, username2

Then I changed it as following:

User_Alias USERS_SDITAS = username1, username2

And now everything is ok. So this means that the alias name must all be uppercase.

PS:
1. Here's the explanation about User_Alias Users_SDITAS = username1, username2

The first part is the user,
The second is the terminal from where the user can use sudo command,
The third part is which users he may act as,
The last one, is which commands he may run when using sudo.
For example, root ALL=(ALL) ALL, means the root user can execute from ALL terminals, acting as ALL (any) users, and run ALL (any) command. And USERS_SDITAS ALL=(oracle) NOPASSWD:SETENV: CMD_MIGRATIONDC1DC3 means users in group USERS_SDITAS can execute from ALL terminals, acting as oracle user, and run commands in group CMD_MIGRATIONDC1DC3. (sudo -E -u oracle <command>, -E will pass invoking users env variables to target user if SETENV tag is added to sudo commands in /etc/sudoers. You'll get error message "sudo: sorry, you are not allowed to preserve the environment" if you did not add SETENV tag in /etc/sudoers. You can run sudo -l or sudo -ll to get a list of privilege commands for you or for others if you run sudo -l -U <username> )

2. One sample of /etc/sudoers configuration in linux(use visudo to edit, as visudo can check for errors after modification. You may need set "echo 'export PATH=/usr/bin:$PATH' >> /etc/profile" in some circumstances so that sudo will be /usr/bin/sudo):

Defaults logfile=/var/log/sudo.log

Defaults always_set_home #switched to target user's home directory when running sudo. Note that HOME is already set when the the env_reset option is enabled, so always_set_home is only effective for configurations where either env_reset is disabled(Defaults !env_reset) or HOME is present in the env_keep list(Defaults env_keep += HOME). This flag is off by default.
Host_Alias HOSTS_MIGRATIONDC1DC3 = slcn06vmf0012, slcn06vmf0013
Cmnd_Alias CMD_MIGRATIONDC1DC3 = /u01/local/wls/user_projects/domains/base_domain/bin/tasctl, /u01/shared/wls/Oracle_SDI1/sdictl/sdictl.sh
User_Alias USERS_SDITAS =username1, username2
USERS_SDITAS ALL=(ALL) NOPASSWD: /bin/su - oracle #users in USERS_SDITAS group can now sudo su - oracle without asking for a password
oracle ALL=(ALL) NOPASSWD:SETENV: CMD_MIGRATIONDC1DC3 #oracle user can run all commands in commands group CMD_MIGRATIONDC1DC3.

3. To check  whether some NIS users are using/bin/false shell(means they can not log on the host by ssh), use the following commands:

ypcat passwd|awk -F: '{if($1 ~ /^username1$|^username2$/) print}'|grep false

 4. To disallow some commands, you can check below(for disabling su to root):

  username='user1'
password=`date +%s | sha256sum | base64 | head -c 9 ; echo`
echo "Username is $username and Password is $password"
useradd $username;mkdir -p /home/$username;chown $username:$username /home/$username;chmod 755   /home/$username;echo $password|passwd --stdin $username
set +H
echo "$username ALL=(ALL) NOPASSWD:SETENV:ALL, !/usr/bin/passwd, !/bin/su root , !/bin/su - root, !/bin/su -, !/bin/bash, !/bin/sh, !/bin/tcsh, !/bin/csh, !/bin/ksh" >> /etc/sudoers
visudo -c

Categories: IT Architecture, Linux, Systems, Unix Tags: ,

Resolved – rm cannot remove some files with error message “Device or resource busy”

June 11th, 2014 1 comment

If you meet problem when remove one file on linux with below error message:

[root@test-host ~]# rm -rf /u01/shared/*
rm: cannot remove `/u01/shared/WLS/oracle_common/soa/modules/oracle.soa.mgmt_11.1.1/.nfs0000000000004abf00000001': Device or resource busy
rm: cannot remove `/u01/shared/WLS/oracle_common/modules/oracle.jrf_11.1.1/.nfs0000000000005c7a00000002': Device or resource busy
rm: cannot remove `/u01/shared/WLS/OracleHome/soa/modules/oracle.soa.fabric_11.1.1/.nfs0000000000006bcf00000003': Device or resource busy

Then it means that some progresses were still referring to these files. You have to stop these processes before remove these files. You can use linux command lsof to find the processes using specific files:

[root@test-host ~]# lsof |grep nfs0000000000004abf00000001
java 2956 emcadm mem REG 0,21 1095768 19135 /u01/shared/WLS/oracle_common/soa/modules/oracle.soa.mgmt_11.1.1/.nfs0000000000004abf00000001 (slce49sn-nas:/export/C9QA123_DC1/tas_central_shared)
java 2956 emcadm 88r REG 0,21 1095768 19135 /u01/shared/WLS/oracle_common/soa/modules/oracle.soa.mgmt_11.1.1/.nfs0000000000004abf00000001 (slce49sn-nas:/export/C9QA123_DC1/tas_central_shared)

So from here you can see that processe with PID 2956 is still using file /u01/shared/WLS/oracle_common/soa/modules/oracle.soa.mgmt_11.1.1/.nfs0000000000004abf00000001.

However, some systems have no lsof installed by default. Then you can install it or by using the alternative one "fuser":

[root@test-host ~]# fuser -cu /u01/shared/WLS/oracle_common
/u01/shared/WLS/oracle_common: 2956m(emcadm) 7358c(aime)

Then you can see also that progresses with PIDs 2956 and 7358 are referring to the directory /u01/shared/WLS/oracle_common.

so you'll need stop the process first by killing it(or stop it using the processes own stop() method if defined):

kill -9 2956

After that, you can try remove the files again, should be ok this time.

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

avoid putty ssh connection sever or disconnect

January 17th, 2014 2 comments

After sometime, ssh will disconnect itself. If you want to avoid this, you can try run the following command:

while [ 1 ];do echo hi;sleep 60;done &

This will print message "hi" every 60 seconds on the standard output.

PS:

You can also set some parameters in /etc/ssh/sshd_config, you can refer to http://www.doxer.org/make-ssh-on-linux-not-to-disconnect-after-some-certain-time/

hpux tips

June 30th, 2013 Comments off
tusc #like truss or strace
lsdev
swapinfo -tm #memory usage
/var/adm/syslog/syslog.log, /etc/shutdownlog,
/etc/rc.log
/var/adm/syslog/syslog.log #like /var/log/messages
/etc/shutdownlog
/var/adm/crash/crash.X
/etc/rc.config.d/netconf #the interfaces which are started at boot up
/opt/fcms/bin/fcdutil /dev/fcd0 #HP-ux HBA and driver info
swlist -l product | grep "Fibre Channel Driver" #HP-UX
/usr/sbin/swlist and swinstall
bdf #HP, report number of free disk blocks
Extend the logical volume: lvextend –L <new LV size in MB> /dev/vgxx/lvolXX
Extend the filesystem to use the space added: fsadm –F vxfs –b <new size in 1 KB sectors> <mount point>
/opt/fcms/bin/fcdutil /dev/fcd0 | grep "World Wide Name"
dlmsetconf, dlmcfgmgr
hrdconf #HP
model #find out what model of machine we're on, like 9000/800/rp4440
/opt/ignite/bin/print_manifest #To display system information and configuration(model, memory, CPU, Storage, partitions, I/O devices, software installed, kernel parameters, ip address)
/usr/sbin/lanscan #lists all network adapters, deprecated(after this, use ifconfig <name of NIC> to check details). related commands lanadmin, linkloop, lan, nwmgr<this is recommended>
lanadmin -x 0 #PPA number
ioscan #HP, scan I/O system, scan newly added disks, check processor type etc
ioscan -f #all devices
ioscan –fknC fc #list HBA devices
/opt/fcms/bin/fcmsutil #Fibre Channel Mass Storage Utility Command, fcmsutil /dev/tdX -> Display HBA details
/sbin/rc3.d #run levels, All the scripts should take the appropriate action depending on the argument given. The stop script for a subsystem should be in the rc directory one run level below its start script, e.g. if the start script is in rc3.d then the stop script should be in rc2.d
/stand   #kernel and kernel configuration files
/usr/bin/bdf -l #FS, like df -k
/usr/sbin/sam #system admin tool
/usr/sbin/fsadm #linux belongs to lvm2(JBOD - Raid0)
/usr/sbin/{cstm,xstm,mstm} #Support Tools Manager,
/sbin/ipf #rules in /etc/opt/ipf/ipf.conf, ipf –Fa –f /etc/opt/ipf/ipf.conf to re-read rules file
/usr/lbin/modprpw #To unlock the account (if TCB is used) use: /usr/lbin/modprpw -l -k <loginid>
/opt/perf/bin/extract #performance monitoring
/etc/pam.conf
/etc/nsswitch.conf
/etc/opt/ldapux/ldapux_client.conf
/opt/ldapux/config/setup
nsquery passwd liandy
A Guide to HP-UX Document Collections HP documents
Categories: IT Architecture, Systems, Unix Tags:

AIX tips

June 30th, 2013 Comments off
fuser -cuxk /oracle #kill all the process using filesystem /oracle
procstack #show current stack of a process
bootlist -m normal -o          # Lists the current bootlist
bootlist -m normal cd0 hdisk0  # To set cd0 and hdisk0 as first and second boot devices
bootlist -m service cd0 rmt0   # To change the bootlist for service mode
alog -L
alog -o -t boot
alog -L -t boot #find out the properties of boot log file
##Device Configuration Database(Predefined, Customized)
01. Available  - Device is ready and can be used
02. Defined    - Device is unavailable
03. Unknown    - Undefined
04. Stopped    - Configured but unavailable
lsdev
      -C  to list customized database
      -P  to list predefined database
      -c (class)
      -t (type)
      -s (subtype)
To list all customised devices ie installed
 # lsdev -C
To list all the Hard Drives in a system
 # lsdev -Cc disk
To list all the adapters in a sytem
 # lsdev -Cc adapter
lscfg -v  #list all installed devices in detail
lscfg -vpl fcs0<ent0> #find out the WWN, FRU #, firmware level of fibre adapter fcs0
entstat -d ent0 #link status, link speed and mac address and statistics of an Ethernet adapter ent0
##Setting multiple IP address for a single network card
 # ifconfig lo0 alias 195.60.60.1
 # ifconfig en0 alias <IPadress> netmask <net_mask>
/etc/rc.net, /etc/rc.tcpip #make the above permanent
lsattr -El ent0 -a media_speed -R #find out the possible media_speed values for ethernet card ent0
lsattr -El mem0 #find out the effective attribute of a device "mem0"
lsattr -El sys0 #list the defaults in the pre-defined db for device ent0
To change the maximum number of processes allowed per user
Find out the valid range of values using lsattr command
 # lsattr -l sys0 -a maxuproc -R
 40...131072 (+1)
Change the maxuproc value using chdev command
 # chdev -l sys0 -a maxuproc=10000
rmdev -l (device) -d #delete the device
To delete a static route manually
Syntax:- chdev -l inet0 -a delroute=<net>,<destination_address>,<Gate_way_address>,<Subnet_mask>
 # chdev -l inet0 -a delroute='net','0.0.0.0','172.26.160.2'
To change the IP address of an interface manually
 # chdev -l en0 -a netaddr=192.168.123.1 -a netmask=255.255.255.0 -a state=up
To set the IP address initially
 # mktcpip -h <hostname> -a <ipaddress> -m <subnet_mask> -i <if_name> -n <NameServer_address>
   -d <domain_name> -g <gateway_address> -A no
##add device to system
To define a tape device
 # mkdev -d -c tape -t 8mm -s scsi -p scsi0  -w 5,0
To make the predefined rmt0 tape to available status
 # mkdev -l rmt0
##configure new devices using cfgmgr
cfgmgr -l fcs0 #configure detected devices attached to the fcs0 adapter
cfgmgr -i /tmp/drivers #cfgmgr -i /tmp/drivers
getconf -a
prtconf -c/m/s
bootinfo -K
mksysb/savevg/restore
sysdumpdev/sysdumpstart/snap/kdb
swapon/swapoff/lsps/chps/mkps/rmps, /etc/swapspaces, /etc/filesystems
LVM - lsvg/lspv/mkvg/mklv/logform/crfs/chfs/extendvg/mklvcopy/syncvf/bosboot/synclvodm/chpv
NIM - Network Installation Management
/etc/netsvc.conf #Name resolution order
##no command is used to change the network tuning parameters. ioo for IO tuning(aio, asynchronous IO), vmo for virtual memory manager parameters
To list the current network parameters / network options
 # no  -a
To enable IP forwarding
 # no -o "ipforwarding=1"
To make ipforwarding=1 permanent now and after reboot
 # no -p -o ipforwarding=1
###/etc/tunables/xxx, tuncheck/tunsave/tunrestore/tundefault
startsrc/lssrc, iptrace, tcpdump #The startsrc command sends the System Resource Controller (SRC) a request to start a subsystem or a group of subsystems, or to pass on a packet to the subsystem that starts a subserver.
##ODM, object data manager
/etc/objrepos
/usr/lib/objrepos
/usr/share/lib/objrepos
odmget CuDv #list all records with an Object Class CuDv
odmget -q "name=sys0 and attribute=maxuproc" CuAt
svmon -G #memory. pin(frames that cannot be swapped), pg space(paging space, ie swap)
pagesize
svmon -P 13548 -i 1 2 #monitor memory leak by  looking for processes whose working segment continually grows
trcon/filemon/trcstop #Most Active Logical/physical Volumes, most active Files
rmss #a means to simulate different sizes of real memory that are smaller than your actual machine
netpmon #network monitoring
##package management
oslevel -r/s/l xxxx/-g/-rq
###PATH
/etc/objrepos
/usr/lib/objrepos
/usr/share/lib/objrepos
lslpp -l [software name]
lslpp -f <fileset name> #display the names of all the files of fileset
lslpp -w /usr/sbin/nfsd #which fileset a file belongs to
##NFS
service portmap start
service nfs start
showmount -e localhost
/var/lib/nfs/
/etc/exports #/backup/downloads *(sync,ro,root_squash,wdelay), exportfs -a, exprotfs *:/backup/downloads
mount -fv -t nfs <xx> <dir> #check ports used
lslpp -ha #installation history of filesets
To list all installable software in media /dev/cd0
 installp [-L|-l] -d /dev/cd0
To cleanup all failed installtion
 installp -C
To install bos.net software (apply and commit) package with all pre-requisites from directory /tmp/net
 installp -acgx -d /tmp/net bos.net
To commit teh applied updates
 installp -cgx all
To remove bos.net package
 installp -ug bos.net
To find out whether a Fix is installed or not
 # instfix -i -k <APAR Number>
To list all the fixes that are installed on your system
 # instfix -i -v
To list filesets which are lesser than the specified maintenance level
 # instfix -ciqk 5100-04_AIX_ML | grep ":-:"
To install all filesets associated with fix Ix38794 from the tape
 # instfix  -k Ix38794  -d /dev/rmt0
To Display the entire list of fixes present on the media
 # instfix -T -d /dev/cd0
To confirm the AIX preventive maintenance level on your system
 # instfix -i | grep ML
 All filesets for 5.0.0.0_AIX_ML were found.
 All filesets for 5.1.0.0_AIX_ML were found.
 All filesets for 5.1.0.0_AIX_ML were found.
 All filesets for 5100-01_AIX_ML were found.
 All filesets for 5100-02_AIX_ML were found.
Updating the software to the latest level
01. Using smit
    # smit update_all
02. To update all filesets in a system using command line
    a. Create the list of filesets installed
       # lslpp -Lc | awk -F: '{print $2}'| tail -n +2 > /tmp/lslpp
    b. Update the softwares using installp command
       # installp -agxYd /dev/cd0 -e /tmp/<exclude_list> -f /tmp/lslpp
Another way of updating all the filesets
 # /usr/lib/instl/sm_inst installp_cmd  -acgNXY -d <localtion_of_updates> -f '_update_all'
For not committing and saving all replaced files
 # /usr/lib/instl/sm_inst installp_cmd  -agX -d <localtion_of_updates> -f '_update_all'
To list all the installed efixes on a system
 # emgr -l
To install a efix IY93496.070302.epkg.Z in /mnt directory
 # emgr -e /mnt/IY93496.070302.epkg.Z
inutoc
The inutoc command creates the .toc file in Directory. If a .toc file already exists, it is recreated with new information. The inutoc command adds table of contents entries in the .toc file for every installation image in Directory.
The installp command and the bffcreate command call this command automatically upon the creation or use of an installation image in a directory without a .toc file
To create a .toc file for the /tmp/images directory, enter:
 # inutoc /tmp/images
bffcreate
The bffcreate command creates an installation image file in backup file format (bff) to support software installation operations. It creates an installation image file from an installation image file on the specified installation media
To create an installation image file from the bos.net software package on the tape in the /dev/rmt0 tape drive and use /var/tmp as the working directory, type:
 # bffcreate  -d /dev/rmt0.1 -w /var/tmp bos.net
##security
/etc/security/environ
/etc/security/group
/etc/security/lastlog
/etc/security/limits
/etc/security/login.cfg
/usr/lib/security/mkuser.default
/etc/security/passwd
/etc/security/portlog
/etc/security/user
/etc/security/failedlogin
/etc/security/ldap/ldap.cnf
chsec -f /etc/security/limits -s joe -a cpu=3600 #change the CPU time limit of user joe to 1 hour
chuser rlogin=true smith #enable user smith to access this system remotely
pwdadm -c user1 # To reset the ADMCHG flag for the user user1<forces the user to change the password the next time a login command or an su command is given for the user>
who -a /etc/security/failedlogin # read failed login attempts
##LPAR and HMC<logical partition and hardware management console>
lsslot/hmcshutdown/chsysstate/lsrsrc/smtctl/vtmenu/mkvterm/rmvterm/lssysconff/lssysconn/ssysconn/lssyscfg/lsled/chled/lparstat -i/chhmc/mkvdev
##HACMP
HACMP Daemon
01. clstrmgr
02. clinfo
03. clmuxpd
04. cllockd
cllsgrp/clshowres/clRGmove/clfindres/clRGinfo/cldump/varyonvg/importvg/chvg/cllsappmon/clclear
##Storage
fcstat -D fcs0 | grep Attention #To find out the fiber channel link status
lsvpcfg #List all Vpath devices and their states
dpovgfix vg00 #fixe a DPO Vpath Volume group that has mixed vpath and hdisk volumes
###EMC powerpath
To configure all the emc hdisks, run emc_cfgmgr script. This script invokes the AIX cfgmgr tool to probe each adapter bus separately
To remove the Symettrix hdisks
 # lsdev -CtSYMM* -Fname | xargs -n1 rmdev -dl
To remove hdisks corresponding to CLARiiON devices
 # lsdev -CtCLAR* -Fname | xargs -n1 rmdev -dl
To probe all emc disks
 # inq
To set up multipathing to the root device
 # pprootdev on
To Remove all hdiskpower devices
 # lsdev -Ct power -c disk -F name | xargs -n1 rmdev -l
To find out which hdiskpower device contains hdsik132
 # powermt display dev=hdisk132
###HP Autopath
dlnkmgr view -drv
dlmrmdev #remove all the DLM drivers
dlmpr -a -c #To clear the SCSI reserves on the disks
/usr/DynamicLinkManager/drv/dlmfdrv.conf
###MPIO
To list all the paths which are in Enabled status
 # lspath -s ena -Fname -p fscsi0
 # chpath -s ena -l hdisk0
 paths Enabled
To list all available disks and their paths
  # lspath | sort +1
To list all disks which paths are in failed state
 # lspath -s failed
To list all disks which paths are in Defined state
 # lspath -s defined
To remove a path
 rmpath -dl <disk_name> -p <parent> -w <connection>
 rmpath -dl hdisk3 -p fscsi0 -w 5005076801105daf,1000000000000
Categories: IT Architecture, Systems, Unix Tags: