
Archive for the ‘Kernel’ Category

VM shutdown stuck in “mount: you must specify the filesystem type, please stand by while rebooting the system”

November 16th, 2016 Comments off

When you issue "shutdown" or "reboot" on a Linux box and it gets stuck at "mount: you must specify the filesystem type, please stand by while rebooting the system":


Then one possible reason is that you have specified the wrong mount options for NFS shares in /etc/fstab. For example, for NFSv3, make sure to use the following NFS options when you mount shares:

<share name> <mount dir> nfs rsize=32768,wsize=32768,hard,nolock,timeo=14,noacl,intr,mountvers=3,vers=3 0 0

Using the options below instead will make the VM shutdown get stuck in "mount: you must specify the filesystem type". DO NOT use the following:

<share name> <mount dir> nfs vers=3,rsize=32768,wsize=32768,hard,nolock,timeo=14,noacl,intr 0 0
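To double-check which options an NFS share actually got mounted with (and confirm mountvers=3 is in effect), you can list the active NFS mounts:

nfsstat -m #shows each NFS mount point and its effective options
mount -t nfs #a less verbose alternative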

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

resolved – net/core/dev.c:1894 skb_gso_segment+0x298/0x370()

April 19th, 2016 Comments off

Today on one of our servers, there were a lot of errors in /var/log/messages like below:

Apr 14 21:50:25 test01 kernel: WARNING: at net/core/dev.c:1894 skb_gso_segment+0x298/0x370()
Apr 14 21:50:25 test01 kernel: Hardware name: SUN FIRE X4170 M3
Apr 14 21:50:25 test01 kernel: : caps=(0x60014803, 0x0) len=255 data_len=215 ip_summed=1
Apr 14 21:50:25 test01 kernel: Modules linked in: dm_nfs nfs fscache auth_rpcgss nfs_acl xen_blkback xen_netback xen_gntdev xen_evtchn lockd sunrpc 8021q garp bridge stp llc bonding be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp dm_round_robin libiscsi dm_multipath scsi_transport_iscsi xenfs xen_privcmd dm_mirror video sbs sbshc acpi_memhotplug acpi_ipmi ipmi_msghandler parport_pc lp parport sr_mod cdrom ixgbe hwmon dca snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc iTCO_wdt iTCO_vendor_support pcspkr ghes i2c_i801 hed i2c_core dm_region_hash dm_log dm_mod usb_storage ahci libahci sg shpchp megaraid_sas sd_mod crc_t10dif ext3 jbd mbcache
Apr 14 21:50:25 test01 kernel: Pid: 0, comm: swapper Tainted: G W 2.6.39-400.264.4.el5uek #1
Apr 14 21:50:25 test01 kernel: Call Trace:
Apr 14 21:50:25 test01 kernel: <IRQ> [<ffffffff8143dab8>] ? skb_gso_segment+0x298/0x370
Apr 14 21:50:25 test01 kernel: [<ffffffff8106f300>] warn_slowpath_common+0x90/0xc0
Apr 14 21:50:25 test01 kernel: [<ffffffff8106f42e>] warn_slowpath_fmt+0x6e/0x70
Apr 14 21:50:25 test01 kernel: [<ffffffff810d73a7>] ? irq_to_desc+0x17/0x20
Apr 14 21:50:25 test01 kernel: [<ffffffff812faf0c>] ? notify_remote_via_irq+0x2c/0x40
Apr 14 21:50:25 test01 kernel: [<ffffffff8100a820>] ? xen_clocksource_read+0x20/0x30
Apr 14 21:50:25 test01 kernel: [<ffffffff812faf4c>] ? xen_send_IPI_one+0x2c/0x40
Apr 14 21:50:25 test01 kernel: [<ffffffff81011f10>] ? xen_smp_send_reschedule+0x10/0x20
Apr 14 21:50:25 test01 kernel: [<ffffffff81056e0b>] ? ttwu_queue_remote+0x4b/0x60
Apr 14 21:50:25 test01 kernel: [<ffffffff81509a7e>] ? _raw_spin_unlock_irqrestore+0x1e/0x30
Apr 14 21:50:25 test01 kernel: [<ffffffff8143dab8>] skb_gso_segment+0x298/0x370
Apr 14 21:50:25 test01 kernel: [<ffffffff8143dba6>] dev_gso_segment+0x16/0x50
Apr 14 21:50:25 test01 kernel: [<ffffffff8143dfb5>] dev_hard_start_xmit+0x3d5/0x530
Apr 14 21:50:25 test01 kernel: [<ffffffff8145a074>] sch_direct_xmit+0xc4/0x1d0
Apr 14 21:50:25 test01 kernel: [<ffffffff8143e811>] dev_queue_xmit+0x161/0x410
Apr 14 21:50:25 test01 kernel: [<ffffffff815099de>] ? _raw_spin_lock+0xe/0x20
Apr 14 21:50:25 test01 kernel: [<ffffffffa045820c>] br_dev_queue_push_xmit+0x6c/0xa0 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffff81076e77>] ? local_bh_enable+0x27/0xa0
Apr 14 21:50:25 test01 kernel: [<ffffffffa045e7ba>] br_nf_dev_queue_xmit+0x2a/0x90 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffffa045f668>] br_nf_post_routing+0x1f8/0x2e0 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffff81467428>] nf_iterate+0x78/0x90
Apr 14 21:50:25 test01 kernel: [<ffffffff8146777c>] nf_hook_slow+0x7c/0x130
Apr 14 21:50:25 test01 kernel: [<ffffffffa04581a0>] ? br_forward_finish+0x70/0x70 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffffa04581a0>] ? br_forward_finish+0x70/0x70 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffffa0458130>] ? br_flood_deliver+0x20/0x20 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffffa0458186>] br_forward_finish+0x56/0x70 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffffa045eba4>] br_nf_forward_finish+0xb4/0x180 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffffa045f36f>] br_nf_forward_ip+0x26f/0x370 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffff81467428>] nf_iterate+0x78/0x90
Apr 14 21:50:25 test01 kernel: [<ffffffff8146777c>] nf_hook_slow+0x7c/0x130
Apr 14 21:50:25 test01 kernel: [<ffffffffa0458130>] ? br_flood_deliver+0x20/0x20 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffff81467428>] ? nf_iterate+0x78/0x90
Apr 14 21:50:25 test01 kernel: [<ffffffffa0458130>] ? br_flood_deliver+0x20/0x20 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffffa04582c8>] __br_forward+0x88/0xc0 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffffa0458356>] br_forward+0x56/0x60 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffffa04591fc>] br_handle_frame_finish+0x1ac/0x240 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffffa045ee1b>] br_nf_pre_routing_finish+0x1ab/0x350 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffff8115bfe9>] ? kmem_cache_alloc_trace+0xc9/0x1a0
Apr 14 21:50:25 test01 kernel: [<ffffffffa045fc55>] br_nf_pre_routing+0x305/0x370 [bridge]
Apr 14 21:50:25 test01 kernel: [<ffffffff8100122a>] ? xen_hypercall_xen_version+0xa/0x20
Apr 14 21:50:25 test01 kernel: [<ffffffff81467428>] nf_iterate+0x78/0x90
Apr 14 21:50:25 test01 kernel: [<ffffffff8146777c>] nf_hook_slow+0x7c/0x130

To fix this, we should disable LRO (large receive offload) first:

for i in eth0 eth1 eth2 eth3;do /sbin/ethtool -K $i lro off;done

And if the NICs are Intel 10G, then we should disable GRO (generic receive offload) too:

for i in eth0 eth1 eth2 eth3;do /sbin/ethtool -K $i gro off;done

Here's the command to disable both LRO and GRO:

for i in eth0 eth1 eth2 eth3;do /sbin/ethtool -K $i gro off;/sbin/ethtool -K $i lro off;done
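Note that ethtool settings are lost on reboot. As a minimal sketch (assuming the same eth0-eth3 interface names), you could persist the change by appending the loop to /etc/rc.local:

#in /etc/rc.local - disable LRO/GRO on every boot
for i in eth0 eth1 eth2 eth3;do /sbin/ethtool -K $i gro off;/sbin/ethtool -K $i lro off;done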

 

resolved – /etc/rc.local not executed on boot in linux

November 11th, 2015 Comments off

When you find that the scripts in /etc/rc.local are not executed when the system boots, one possibility is that a previous subsys script takes too long to execute, as /etc/rc.local is usually the last one to run, i.e. S99local. To prove which subsys script gets stuck, you can edit /etc/rc.d/rc (which is invoked from /etc/inittab) and add the echo lines shown below as markers:

[root@host1 tmp] vi /etc/rc.d/rc
# Now run the START scripts.
for i in /etc/rc$runlevel.d/S* ; do
        check_runlevel "$i" || continue

        # Check if the subsystem is already up.
        subsys=${i#/etc/rc$runlevel.d/S??}
        [ -f /var/lock/subsys/$subsys -o -f /var/lock/subsys/$subsys.init ] \
                && continue

        # If we're in confirmation mode, get user confirmation
        if [ -f /var/run/confirm ]; then
                confirm $subsys
                test $? = 1 && continue
        fi

        update_boot_stage "$subsys"
        # Bring the subsystem up.
        if [ "$subsys" = "halt" -o "$subsys" = "reboot" ]; then
                export LC_ALL=C
                exec $i start
        fi
        if LC_ALL=C egrep -q "^..*init.d/functions" $i \
                        || [ "$subsys" = "single" -o "$subsys" = "local" ]; then
                echo $i>>/var/tmp/process.txt
                $i start
                echo $i>>/var/tmp/process_end.txt
        else
                echo $i>>/var/tmp/process_self.txt
                action $"Starting $subsys: " $i start
                echo $i>>/var/tmp/process_self_end.txt
        fi
done
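If you also want to measure how long each script takes, a variant of the same instrumentation (a sketch, not what I originally ran) is to prefix the markers with timestamps:

echo "$(date '+%T') $i">>/var/tmp/process.txt
$i start
echo "$(date '+%T') $i">>/var/tmp/process_end.txt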

Then you can reboot the system and check the files /var/tmp/{process.txt,process_end.txt,process_self.txt,process_self_end.txt}. On one of the hosts, I found the entries below:

[root@host1 tmp]# tail process.txt
/etc/rc3.d/S85gpm
/etc/rc3.d/S90crond
/etc/rc3.d/S90xfs
/etc/rc3.d/S91vncserver
/etc/rc3.d/S95anacron
/etc/rc3.d/S95atd
/etc/rc3.d/S95emagent_public
/etc/rc3.d/S97rhnsd
/etc/rc3.d/S98avahi-daemon
/etc/rc3.d/S98gcstartup

[root@host1 tmp]# tail process_end.txt
/etc/rc3.d/S85gpm
/etc/rc3.d/S90crond
/etc/rc3.d/S90xfs
/etc/rc3.d/S91vncserver
/etc/rc3.d/S95anacron
/etc/rc3.d/S95atd
/etc/rc3.d/S95emagent_public
/etc/rc3.d/S97rhnsd
/etc/rc3.d/S98avahi-daemon

So from here we can see that /etc/rc3.d/S98gcstartup started but took too long to finish. To make sure the scripts in /etc/rc.local get executed, and the stuck script /etc/rc3.d/S98gcstartup still gets executed as well, we can do this (renaming it to a lowercase s stops /etc/rc.d/rc from running it, since rc only picks up scripts matching S*):

[root@host1 tmp]# mv /etc/rc3.d/S98gcstartup /etc/rc3.d/s98gcstartup
[root@host1 tmp]# vi /etc/rc.local

#!/bin/sh

touch /var/lock/subsys/local

#put your scripts here - begin

#put your scripts here - end

#put the stuck script here and make sure it's the last line
/etc/rc3.d/s98gcstartup start

After this, reboot the host and check whether the scripts in /etc/rc.local got executed.

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

resolved – ext3: No journal on filesystem on disk

March 23rd, 2015 Comments off

Today I met below error when trying to mount a disk:

[root@testvm ~]# mount /scratch
mount: wrong fs type, bad option, bad superblock on /dev/xvdb1,
missing codepage or other error
In some cases useful info is found in syslog - try
dmesg | tail or so

First I ran fsck -y /dev/xvdb1, but after it finished the issue was still there (sometimes fsck -y /dev/xvdb1 alone can resolve this, though). So, as the message suggested, I ran dmesg | tail:

[root@testvm scratch]# dmesg | tail
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
NFSD: starting 90-second grace period
ext3: No journal on filesystem on xvdb1
ext3: No journal on filesystem on xvdb1
ext3: No journal on filesystem on xvdb1
ext3: No journal on filesystem on xvdb1

So from here we can see that the root cause of the mount failure was "ext3: No journal on filesystem on xvdb1". Since fsck hadn't helped, I tried adding an ext3 journal to that disk:

[root@testvm qgomsdc1]# tune2fs -j /dev/xvdb1
tune2fs 1.39 (29-May-2006)
Creating journal inode:

done
This filesystem will be automatically checked every 20 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.

After this, the mount succeeded.
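To confirm the journal was really added, you can list the filesystem features and look for has_journal in the output:

tune2fs -l /dev/xvdb1 | grep -i features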

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

resolved – su: cannot set user id: Resource temporarily unavailable

January 12th, 2015 Comments off

When I tried to log on as user "test", an error occurred:

su: cannot set user id: Resource temporarily unavailable

I had a check of limits.conf:

[root@testvm ~]# cat /etc/security/limits.conf|egrep -v '^$|^#'
oracle   soft   nofile    131072
oracle   hard   nofile    131072
oracle   soft   nproc    131072
oracle   hard   nproc    131072
oracle   soft   core    unlimited
oracle   hard   core    unlimited
oracle   soft   memlock    50000000
oracle   hard   memlock    50000000
@svrtech    soft    memlock         500000
@svrtech    hard    memlock         500000
*   soft   nofile    131072
*   hard   nofile    131072
*   soft   nproc    131072
*   hard   nproc    131072
*   soft   core    unlimited
*   hard   core    unlimited
*   soft   memlock    50000000
*   hard   memlock    50000000

Then I compared the number of processes/threads against the maximum number of processes, to see whether it had crossed the line:

[root@c9qa131-slcn03vmf0293 ~]# ps -eLF | grep test | wc -l
1026

So it was not exceeding that. Then I checked the open files:

[root@testvm ~]# lsof | grep aime | wc -l
6059

It wasn't exceeding 131072 either, so why was the error "su: cannot set user id: Resource temporarily unavailable" there? Actually the culprit was the file /etc/security/limits.d/90-nproc.conf, which overrides the nproc settings in /etc/security/limits.conf:

[root@testvm ~]# cat /etc/security/limits.d/90-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.

* soft nproc 1024
root soft nproc unlimited

After I modified 1024 to 131072 there, the issue went away immediately.
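To verify the new limit is actually in effect, check it from a fresh session of the affected user:

su - test -c 'ulimit -u' #should now print 131072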

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

resolved – switching from Unbreakable Enterprise Kernel Release 2(UEKR2) to UEKR3 on Oracle Linux 6

November 24th, 2014 Comments off

As we can see from here, the available kernels include the following 3 for Oracle Linux 6:

3.8.13 Unbreakable Enterprise Kernel Release 3 (x86_64 only)
2.6.39 Unbreakable Enterprise Kernel Release 2**
2.6.32 (Red Hat compatible kernel)

On one of our OEL6 VMs, we found that it was using UEKR2:

[root@testbox aime]# cat /etc/issue
Oracle Linux Server release 6.4
Kernel \r on an \m

[root@testbox aime]# uname -r
2.6.39-400.211.1.el6uek.x86_64

So how can we switch the kernel to UEKR3 (3.8)?

If your Linux version is 6.4, first do a "yum update -y" to upgrade to 6.5 or above, then reboot the host and follow the steps below.

[root@testbox aime]# ls -l /etc/grub.conf
lrwxrwxrwx. 1 root root 22 Aug 21 18:24 /etc/grub.conf -> ../boot/grub/grub.conf

[root@testbox aime]# yum update -y

If your Linux version is 6.5 or above, you'll find that /etc/grub.conf and /boot/grub/grub.conf are different files (for hosts upgraded via yum update; if your host was OEL6.5 at install time, then /etc/grub.conf should be a softlink too):

[root@testbox ~]# ls -l /etc/grub.conf
-rw------- 1 root root 2356 Oct 20 05:26 /etc/grub.conf

[root@testbox ~]# ls -l /boot/grub/grub.conf
-rw------- 1 root root 1585 Nov 23 21:46 /boot/grub/grub.conf

In /etc/grub.conf, you'll see an entry like below:

title Oracle Linux Server Unbreakable Enterprise Kernel (3.8.13-44.1.3.el6uek.x86_64)
root (hd0,0)
kernel /vmlinuz-3.8.13-44.1.3.el6uek.x86_64 ro root=/dev/mapper/vg01-lv_root rd_LVM_LV=vg01/lv_root rd_NO_LUKS rd_LVM_LV=vg01/lv_swap LANG=en_US.UTF-8 KEYTABLE=us console=hvc0 rd_NO_MD SYSFONT=latarcyrheb-sun16 rd_NO_DM rhgb quiet
initrd /initramfs-3.8.13-44.1.3.el6uek.x86_64.img

What you need to do is just copy the entry above from /etc/grub.conf to /boot/grub/grub.conf (make sure /boot/grub/grub.conf is not a softlink, or you may meet the error "Boot loader didn't return any data"), and then reboot the VM.

After rebooting, you'll find the kernel is now at UEKR3(3.8).

PS:

If you find the VM is OEL6.5 and /etc/grub.conf is a softlink to /boot/grub/grub.conf, then you can do the following to upgrade the kernel to UEKR3:

1. add the following lines to /etc/yum.repos.d/public-yum-ol6.repo:

[public_ol6_UEKR3]
name=UEKR3 for Oracle Linux 6 ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/UEKR3/latest/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=1

2. List and install UEKR3:

[root@testbox aime]# yum list|grep kernel-uek|grep public_ol6_UEKR3
kernel-uek.x86_64 3.8.13-44.1.5.el6uek public_ol6_UEKR3
kernel-uek-debug.x86_64 3.8.13-44.1.5.el6uek public_ol6_UEKR3
kernel-uek-debug-devel.x86_64 3.8.13-44.1.5.el6uek public_ol6_UEKR3
kernel-uek-devel.x86_64 3.8.13-44.1.5.el6uek public_ol6_UEKR3
kernel-uek-doc.noarch 3.8.13-44.1.5.el6uek public_ol6_UEKR3
kernel-uek-firmware.noarch 3.8.13-44.1.5.el6uek public_ol6_UEKR3
kernel-uek-headers.x86_64 3.8.13-26.2.4.el6uek public_ol6_UEKR3

[root@testbox aime]# yum install -y kernel-uek* --disablerepo=* --enablerepo=public_ol6_UEKR3

3. Reboot
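4. After the reboot, confirm the running kernel:

uname -r #should now print a 3.8.13-*.el6uek.x86_64 version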

resolved – /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory

November 7th, 2014 Comments off

In one of our scripts, an error prompted when we ran it today:

[root@testhost01 ~]# su - user1
[user1@testhost01 ~]$ /home/testuser/run_as_root 'su'
-bash: /usr/local/packages/aime/ias/run_as_root: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory

From the output, we can see that it's complaining about not finding the file /lib/ld-linux.so.2:

[user1@testhost01 ~]$ ls -l /lib/ld-linux.so.2
ls: cannot access /lib/ld-linux.so.2: No such file or directory

I then checked on another host and found that /lib/ld-linux.so.2 belongs to the package glibc:

[root@centos-doxer ~]# ls -l /lib/ld-linux.so.2
lrwxrwxrwx 1 root root 9 May 9 2013 /lib/ld-linux.so.2 -> ld-2.5.so
[root@centos-doxer ~]# rpm -qf /lib/ld-linux.so.2
glibc-2.5-107.el5_9.4

However, on the problematic host, glibc was installed:

[root@testhost01 user1]# rpm -qa|grep glibc
glibc-headers-2.12-1.149.el6.x86_64
glibc-common-2.12-1.149.el6.x86_64
glibc-devel-2.12-1.149.el6.x86_64
glibc-2.12-1.149.el6.x86_64

I then tried making a soft link from /lib64/ld-2.12.so to /lib/ld-linux.so.2:

[root@testhost01 ~]# ln -s /lib64/ld-2.12.so /lib/ld-linux.so.2
[root@testhost01 ~]# su - user1
[user1@testhost01 ~]$ /usr/local/packages/aime/ias/run_as_root su
-bash: /usr/local/packages/aime/ias/run_as_root: Accessing a corrupted shared library

Hmmm, so now it complained about a corrupted shared library. Maybe we need the 32-bit glibc? So I removed the softlink and installed glibc.i686:

rm -rf /lib/ld-linux.so.2
yum -y install glibc.i686

After the installation, I found /lib/ld-linux.so.2 was already there:

[root@testhost01 user1]# ls -l /lib/ld-linux.so.2
lrwxrwxrwx 1 root root 10 Nov 7 03:46 /lib/ld-linux.so.2 -> ld-2.12.so
[root@testhost01 user1]# rpm -qf /lib/ld-linux.so.2
glibc-2.12-1.149.el6.i686

And when I ran the command again, it returned ok:

[root@testhost01 user1]# su - user1
[user1@testhost01 ~]$ /home/testuser/run_as_root 'su'
[root@testhost01 user1]#

So from this we can see that the issue was caused by /usr/local/packages/aime/ias/run_as_root being a 32-bit binary, which needs the 32-bit glibc.
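To tell in advance whether a binary needs the 32-bit loader, check its ELF class with file; a 32-bit binary is reported as "ELF 32-bit LSB executable":

file /usr/local/packages/aime/ias/run_as_root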

Categories: IT Architecture, Kernel, Linux, Systems Tags:

resolved – auditd STDERR: Error deleting rule Error sending enable request (Operation not permitted)

September 19th, 2014 Comments off

Today when I tried to restart auditd, the following error message prompted:

[2014-09-18T19:26:41+00:00] ERROR: service[auditd] (cookbook-devops-kernelaudit::default line 14) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
---- Begin output of /sbin/service auditd restart ----
STDOUT: Stopping auditd: [  OK  ]
Starting auditd: [FAILED]
STDERR: Error deleting rule (Operation not permitted)
Error sending enable request (Operation not permitted)
---- End output of /sbin/service auditd restart ----
Ran /sbin/service auditd restart returned 1

After some reading of the auditd manpage, I realized that when the audit "enabled" flag is set to 2 (locked), any attempt to change the configuration in this mode will be audited and denied. That must be the reason for "STDERR: Error deleting rule (Operation not permitted)" and "Error sending enable request (Operation not permitted)". Here's what the man page of auditctl says:

-e [0..2] Set enabled flag. When 0 is passed, this can be used to temporarily disable auditing. When 1 is passed as an argument, it will enable auditing. To lock the audit configuration so that it can't be changed, pass a 2 as the argument. Locking the configuration is intended to be the last command in audit.rules for anyone wishing this feature to be active. Any attempt to change the configuration in this mode will be audited and denied. The configuration can only be changed by rebooting the machine.

You can run auditctl -s to check the current setting:

[root@centos-doxer ~]# auditctl -s
AUDIT_STATUS: enabled=1 flag=1 pid=3154 rate_limit=0 backlog_limit=320 lost=0 backlog=0

And you can run auditctl -e <0|1|2> to change this attribute on the fly, or you can set -e <0|1|2> in /etc/audit/audit.rules. Please note that once the configuration is locked, a reboot is a must to make any modification take effect.
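So for a failure like the one above, a quick check is to look at the -e flag at the end of the rules file; change 2 to 1 there and reboot before restarting auditd:

grep '^-e' /etc/audit/audit.rules #-e 2 means the configuration is locked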

PS:

Here's more about linux audit.

resolved – Permission denied even after chmod 777 world readable writable

September 19th, 2014 Comments off

Several team members asked me why the system reported "Permission denied" when they wanted to change to some directories or read some files. Even after the files were set world writable (chmod 777), the error was still there:

-bash-3.2$ cd /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs
-bash: cd: /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs: Permission denied

-bash-3.2$ cat /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out
cat: /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out: Permission denied

-bash-3.2$ ls -l /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out
-rwxrwxrwx 1 oracle oinstall 1100961066 Sep 19 07:37 /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out

In summary, if you want to read some file (e.g. wls_sdi1.out) under some directory (e.g. /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs), then besides the "read bit" on the file itself (chmod +r wls_sdi1.out), every parent directory of that file (/u01, /u01/local, ......, /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs) needs the "execute bit" set so the path can be traversed; the "read bit" on a directory is only needed to list its contents. That's why chmod 777 on the file alone doesn't help. You can check each directory with ls -ld <dir name>:

chmod +r wls_sdi1.out #first set "read bit" on the file
chmod +r /u01; chmod +x /u01; chmod +r /u01/local; chmod +x /u01/local; <...skipped...>chmod +r /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs; chmod +x /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs; #then set both "read bit" & "execute bit" on all parent directories
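Rather than walking up the tree by hand, namei can print the owner and permissions of every component of a path in one shot (assuming your util-linux version supports -l; older ones at least have -m for the mode bits):

namei -l /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out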

And lastly, if you can log on as the file owner, everything will be smooth. /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out is owned by the oracle user, so you can try logging on as oracle and doing the operations.

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

Resolved – rm cannot remove some files with error message “Device or resource busy”

June 11th, 2014 1 comment

If you meet a problem when removing files on Linux with the below error message:

[root@test-host ~]# rm -rf /u01/shared/*
rm: cannot remove `/u01/shared/WLS/oracle_common/soa/modules/oracle.soa.mgmt_11.1.1/.nfs0000000000004abf00000001': Device or resource busy
rm: cannot remove `/u01/shared/WLS/oracle_common/modules/oracle.jrf_11.1.1/.nfs0000000000005c7a00000002': Device or resource busy
rm: cannot remove `/u01/shared/WLS/OracleHome/soa/modules/oracle.soa.fabric_11.1.1/.nfs0000000000006bcf00000003': Device or resource busy

Then it means that some processes are still referring to these files (the .nfsXXXX files are created by the NFS client when a file is deleted while a process still holds it open). You have to stop those processes before removing the files. You can use the Linux command lsof to find the processes using a specific file:

[root@test-host ~]# lsof |grep nfs0000000000004abf00000001
java 2956 emcadm mem REG 0,21 1095768 19135 /u01/shared/WLS/oracle_common/soa/modules/oracle.soa.mgmt_11.1.1/.nfs0000000000004abf00000001 (slce49sn-nas:/export/C9QA123_DC1/tas_central_shared)
java 2956 emcadm 88r REG 0,21 1095768 19135 /u01/shared/WLS/oracle_common/soa/modules/oracle.soa.mgmt_11.1.1/.nfs0000000000004abf00000001 (slce49sn-nas:/export/C9QA123_DC1/tas_central_shared)

So from here you can see that the process with PID 2956 is still using the file /u01/shared/WLS/oracle_common/soa/modules/oracle.soa.mgmt_11.1.1/.nfs0000000000004abf00000001.

However, some systems have no lsof installed by default. Then you can install it, or use the alternative, "fuser":

[root@test-host ~]# fuser -cu /u01/shared/WLS/oracle_common
/u01/shared/WLS/oracle_common: 2956m(emcadm) 7358c(aime)

Then you can see here too that the processes with PIDs 2956 and 7358 are referring to the directory /u01/shared/WLS/oracle_common.

So you'll need to stop the process first by killing it (or stop it gracefully using the process's own stop method if one is defined):

kill -9 2956

After that, you can try removing the files again; it should be ok this time.
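As a shortcut, fuser can also do the killing itself; be careful though, as this sends SIGKILL to every process using that filesystem:

fuser -ck /u01/shared/WLS/oracle_common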

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

resolved – /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory

April 1st, 2014 Comments off

When I ran a perl command today, I met the problem below:

[root@test01 bin]# /usr/local/bin/perl5.8
-bash: /usr/local/bin/perl5.8: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory

Now let's check which package /lib/ld-linux.so.2 belongs to on a good Linux box:

[root@test02 ~]# rpm -qf /lib/ld-linux.so.2
glibc-2.5-118.el5_10.2

So here's the resolution to the issue (installing the 32-bit glibc is what provides /lib/ld-linux.so.2):

[root@test01 bin]# yum install -y glibc.x86_64 glibc.i686 glibc-devel.i686 glibc-devel.x86_64 glibc-headers.x86_64

Categories: IT Architecture, Kernel, Linux, Systems Tags:

resolved – ssh Read from socket failed: Connection reset by peer and Write failed: Broken pipe

March 13th, 2014 Comments off

If you meet the following errors when you ssh to a Linux box:

Read from socket failed: Connection reset by peer

Write failed: Broken pipe

then one possibility is that the Linux box's filesystem is corrupted. In my case there was also output to stdout:

EXT3-fs error ext3_lookup: deleted inode referenced

To resolve this, you need to make Linux go to single user mode and run fsck -y <filesystem>. You can get the corrupted filesystem names during boot:

[/sbin/fsck.ext3 (1) -- /usr] fsck.ext3 -a /dev/xvda2
/usr contains a file system with errors, check forced.
/usr: Directory inode 378101, block 0, offset 0: directory corrupted

/usr: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)

[/sbin/fsck.ext3 (1) -- /oem] fsck.ext3 -a /dev/xvda5
/oem: recovering journal
/oem: clean, 8253/1048576 files, 202701/1048233 blocks
[/sbin/fsck.ext3 (1) -- /u01] fsck.ext3 -a /dev/xvdb
u01: clean, 36575/14548992 files, 2122736/29081600 blocks
[FAILED]

So in this case, I did fsck -y /dev/xvda2 && fsck -y /dev/xvda5, then rebooted the host, and everything went well.

PS:

If two VMs are booted up on two hypervisors and those VMs share the same filesystem (like NFS), then even after you fsck -y one FS and boot up the VM, the FS will corrupt again soon, because other copies of the VM are still using that FS. So you need to first make sure that only one copy of the VM is running across the hypervisors of the same server pool.

Categories: IT Architecture, Kernel, Linux, Systems Tags:

resolved – mount clntudp_create: RPC: Program not registered

December 2nd, 2013 Comments off

When I did a showmount -e localhost, an error occurred:

[root@centos-doxer ~]# showmount -e localhost
mount clntudp_create: RPC: Program not registered

So I checked which RPC program number showmount was using:

[root@centos-doxer ~]# grep showmount /etc/rpc
mountd 100005 mount showmount

As this indicates, we should start the mountd daemon to make showmount -e localhost work. And mountd is part of NFS, so I started up nfs:

[root@centos-doxer ~]# /etc/init.d/nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS daemon: [ OK ]
Starting NFS mountd: [ OK ]

Now that mountd was running, showmount -e localhost worked.
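You can confirm that mountd is now registered with the portmapper by querying it:

rpcinfo -p localhost | grep mountd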

 

resolved – kernel panic not syncing: Fatal exception Pid: comm: not Tainted

November 13th, 2013 Comments off

We were installing IDM OAM today and the Linux server panicked every time we ran the startup script. The server panic info was like this:

Pid: 4286, comm: emdctl Not tainted 2.6.32-300.29.1.el5uek #1
Process emdctl (pid: 4286, threadinfo ffff88075bf20000, task ffff88073d0ac480)
Stack:
ffff88075bf21958 ffffffffa02b1769 ffff88075bf21948 ffff8807cdcce500
<0> ffff88075bf95cc8 ffff88075bf95ee0 ffff88075bf21998 ffffffffa01fd5c6
<0> ffffffffa02b1732 ffff8807bc2543f0 ffff88075bf95cc8 ffff8807bc2543f0
Call Trace:
[<ffffffffa02b1769>] nfs3_xdr_writeargs+0x37/0x7a [nfs]
[<ffffffffa01fd5c6>] rpcauth_wrap_req+0x7f/0x8b [sunrpc]
[<ffffffffa02b1732>] ? nfs3_xdr_writeargs+0x0/0x7a [nfs]
[<ffffffffa01f612a>] call_transmit+0x199/0x21e [sunrpc]
[<ffffffffa01fc8ba>] __rpc_execute+0x85/0x270 [sunrpc]
[<ffffffffa01fcae2>] rpc_execute+0x26/0x2a [sunrpc]
[<ffffffffa01f5546>] rpc_run_task+0x57/0x5f [sunrpc]
[<ffffffffa02abd86>] nfs_write_rpcsetup+0x20b/0x22d [nfs]
[<ffffffffa02ad1e8>] nfs_flush_one+0x97/0xc3 [nfs]
[<ffffffffa02a86b4>] nfs_pageio_doio+0x37/0x60 [nfs]
[<ffffffffa02a87c5>] nfs_pageio_complete+0xe/0x10 [nfs]
[<ffffffffa02ac264>] nfs_writepages+0xa7/0xe4 [nfs]
[<ffffffffa02ad151>] ? nfs_flush_one+0x0/0xc3 [nfs]
[<ffffffffa02acd2e>] nfs_write_mapping+0x63/0x9e [nfs]
[<ffffffff810f02fe>] ? __pmd_alloc+0x5d/0xaf
[<ffffffffa02acd9c>] nfs_wb_all+0x17/0x19 [nfs]
[<ffffffffa029f6f7>] nfs_do_fsync+0x21/0x4a [nfs]
[<ffffffffa029fc9c>] nfs_file_flush+0x67/0x70 [nfs]
[<ffffffff81117025>] filp_close+0x46/0x77
[<ffffffff81059e6b>] put_files_struct+0x7c/0xd0
[<ffffffff81059ef9>] exit_files+0x3a/0x3f
[<ffffffff8105b240>] do_exit+0x248/0x699
[<ffffffff8100e6a1>] ? xen_force_evtchn_callback+0xd/0xf
[<ffffffff8106898a>] ? freezing+0x13/0x15
[<ffffffff8105b731>] sys_exit_group+0x0/0x1b
[<ffffffff8106bd03>] get_signal_to_deliver+0x303/0x328
[<ffffffff8101120a>] do_notify_resume+0x90/0x6d7
[<ffffffff81459f06>] ? kretprobe_table_unlock+0x1c/0x1e
[<ffffffff8145ac6f>] ? kprobe_flush_task+0x71/0x7c
[<ffffffff8103164c>] ? paravirt_end_context_switch+0x17/0x31
[<ffffffff81123e8f>] ? path_put+0x22/0x27
[<ffffffff8101207e>] int_signal+0x12/0x17
Code: 55 48 89 e5 0f 1f 44 00 00 48 8b 06 0f c8 89 07 48 8b 46 08 0f c8 89 47 04 c9 48 8d 47 08 c3 55 48 89 e5 0f 1f 44 00 00 48 0f ce <48> 89 37 c9 48 8d 47 08 c3 55 48 89 e5 53 0f 1f 44 00 00 f6 06
RIP [<ffffffffa02b03c3>] xdr_encode_hyper+0xc/0x15 [nfs]
RSP <ffff88075bf21928>
---[ end trace 04ad5382f19cf8ad ]---
Kernel panic - not syncing: Fatal exception
Pid: 4286, comm: emdctl Tainted: G D 2.6.32-300.29.1.el5uek #1
Call Trace:
[<ffffffff810579a2>] panic+0xa5/0x162
[<ffffffff81450075>] ? threshold_create_device+0x242/0x2cf
[<ffffffff8100ed2f>] ? xen_restore_fl_direct_end+0x0/0x1
[<ffffffff814574b0>] ? _spin_unlock_irqrestore+0x16/0x18
[<ffffffff810580f5>] ? release_console_sem+0x194/0x19d
[<ffffffff810583be>] ? console_unblank+0x6a/0x6f
[<ffffffff8105766f>] ? print_oops_end_marker+0x23/0x25
[<ffffffff814583a6>] oops_end+0xb7/0xc7
[<ffffffff8101565d>] die+0x5a/0x63
[<ffffffff81457c7c>] do_trap+0x115/0x124
[<ffffffff81013731>] do_alignment_check+0x99/0xa2
[<ffffffff81012cb5>] alignment_check+0x25/0x30
[<ffffffffa02b03c3>] ? xdr_encode_hyper+0xc/0x15 [nfs]
[<ffffffffa02b06be>] ? xdr_encode_fhandle+0x15/0x17 [nfs]
[<ffffffffa02b1769>] nfs3_xdr_writeargs+0x37/0x7a [nfs]
[<ffffffffa01fd5c6>] rpcauth_wrap_req+0x7f/0x8b [sunrpc]
[<ffffffffa02b1732>] ? nfs3_xdr_writeargs+0x0/0x7a [nfs]
[<ffffffffa01f612a>] call_transmit+0x199/0x21e [sunrpc]
[<ffffffffa01fc8ba>] __rpc_execute+0x85/0x270 [sunrpc]
[<ffffffffa01fcae2>] rpc_execute+0x26/0x2a [sunrpc]
[<ffffffffa01f5546>] rpc_run_task+0x57/0x5f [sunrpc]
[<ffffffffa02abd86>] nfs_write_rpcsetup+0x20b/0x22d [nfs]
[<ffffffffa02ad1e8>] nfs_flush_one+0x97/0xc3 [nfs]
[<ffffffffa02a86b4>] nfs_pageio_doio+0x37/0x60 [nfs]
[<ffffffffa02a87c5>] nfs_pageio_complete+0xe/0x10 [nfs]
[<ffffffffa02ac264>] nfs_writepages+0xa7/0xe4 [nfs]
[<ffffffffa02ad151>] ? nfs_flush_one+0x0/0xc3 [nfs]
[<ffffffffa02acd2e>] nfs_write_mapping+0x63/0x9e [nfs]
[<ffffffff810f02fe>] ? __pmd_alloc+0x5d/0xaf
[<ffffffffa02acd9c>] nfs_wb_all+0x17/0x19 [nfs]
[<ffffffffa029f6f7>] nfs_do_fsync+0x21/0x4a [nfs]
[<ffffffffa029fc9c>] nfs_file_flush+0x67/0x70 [nfs]
[<ffffffff81117025>] filp_close+0x46/0x77
[<ffffffff81059e6b>] put_files_struct+0x7c/0xd0
[<ffffffff81059ef9>] exit_files+0x3a/0x3f
[<ffffffff8105b240>] do_exit+0x248/0x699
[<ffffffff8100e6a1>] ? xen_force_evtchn_callback+0xd/0xf
[<ffffffff8106898a>] ? freezing+0x13/0x15
[<ffffffff8105b731>] sys_exit_group+0x0/0x1b
[<ffffffff8106bd03>] get_signal_to_deliver+0x303/0x328
[<ffffffff8101120a>] do_notify_resume+0x90/0x6d7
[<ffffffff81459f06>] ? kretprobe_table_unlock+0x1c/0x1e
[<ffffffff8145ac6f>] ? kprobe_flush_task+0x71/0x7c
[<ffffffff8103164c>] ? paravirt_end_context_switch+0x17/0x31
[<ffffffff81123e8f>] ? path_put+0x22/0x27
[<ffffffff8101207e>] int_signal+0x12/0x17

We tried a lot (application coredump, kdump etc.) but still got no solution, until we noticed that there were a lot of NFS related entries in the kernel panic info (the nfs/sunrpc frames in the call traces above).

As our Linux server was not supposed to be using NFS or autofs, we upgraded the NFS client (nfs-utils) and disabled autofs:

yum update nfs-utils

chkconfig autofs off

After this, the startup for IDM succeeded, and no server panic occurred anymore!

linux kernel module installation

June 30th, 2013 Comments off
yum install kernel-devel #only needed if you just want to build modules against the running kernel

#to compile a full kernel:
cp /boot/config-xxx .config #then run "make oldconfig" (asks only about new features; "make help" for help info)
cp /usr/src/kernels/linux-2.6.23/.config /boot/config-2.6.23
mkdir -p /usr/share/doc/linux-2.6.23
cp -r Documentation/* /usr/share/doc/linux-2.6.23/
grep -r read_bytes /usr/share/doc/linux-2.6.23/
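For the module-only path, a minimal out-of-tree module build sketch (hello.c and the Makefile are hypothetical names, assuming kernel-devel for the running kernel is installed):

#Makefile (kbuild) for an out-of-tree module named hello
obj-m := hello.o
default:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(CURDIR) modules

#then build, load, verify, unload:
make
insmod ./hello.ko
dmesg | tail #the module's printk output should show up here
rmmod hello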
Categories: IT Architecture, Kernel, Systems Tags:

application core dump and kernel core dump(kdump) in linux

May 30th, 2013 Comments off

---App core dump
vi /etc/profile

ulimit -c unlimited >/dev/null 2>&1

vi /etc/sysctl.conf

kernel.core_uses_pid = 1
kernel.core_pattern = /var/tmp/app-core-%e-%s-%u-%g-%p-%t
fs.suid_dumpable = 2

echo "DAEMON_COREFILE_LIMIT='unlimited'" >> /etc/sysconfig/init #enable debugging for all apps. For specific app, e.g /etc/init.d/httpd(DAEMON_COREFILE_LIMIT='unlimited')
sysctl -p

After this, you can run kill -s SIGSEGV <pid> to generate a Linux application core dump (sends a Segmentation fault signal)

---Kernel core dump

You can also check here for kdump on Oracle Linux. And here is about the way kdump works.

yum install system-config-kdump kexec-tools crash kernel-debuginfo kernel-debuginfo-common
vi /boot/grub/grub.conf

crashkernel=0M-2G:128M,2G-6G:256M,6G-8G:512M,8G-:768M nmi_watchdog=1 #Or crashkernel=256M@16M, tells the system kernel to reserve 256 MB of memory starting at physical address 0x01000000 (16MB) for the dump-capture kernel, crashkernel=256M

vi /etc/kdump.conf #more on http://linux.die.net/man/5/kdump.conf

ext3 /dev/mapper/VolGroup00-LogVol00
core_collector makedumpfile -c -d 31
path /var/log/dumps

vi /etc/sysctl.conf

kernel.unknown_nmi_panic=1 #should be there by default
fs.suid_dumpable = 2
kernel.core_uses_pid = 1 #should be there by default
kernel.panic_on_unrecovered_nmi=1
vm.panic_on_oom=1

vi /etc/security/limits.conf #can not exceed values defined in /etc/security/limits.conf though

* soft core unlimited
* hard core unlimited

vi /etc/profile

ulimit -c unlimited >/dev/null 2>&1

sysctl -p
chkconfig kdump on
reboot

dmesg|egrep -i 'crash|dump' #sometimes memory reservation may fail with message "crashkernel reservation failed - memory is in use"
cat /proc/iomem | grep -i crash

cat /proc/meminfo|grep Slab #The total amount of memory, in kilobytes, used by the kernel to cache data structures for its own use
cat /proc/cmdline #check this after configuring crash kernel ram

/etc/init.d/kdump status
/etc/init.d/kdump restart
sync;sync;sync
echo 1 > /proc/sys/kernel/sysrq
echo "c" > /proc/sysrq-trigger #will reboot, to test the kdump

echo "b" > /proc/sysrq-trigger #will reboot system, useful when no memory can be allocated and you can reboot by this

ls -lrth /var/log/dumps/
---Analyse coredump file
crash /usr/lib/debug/lib/modules/2.6.18-194.17.4.el5/vmlinux /var/crash/127.0.0.1-2011-03-16-12\:23\:06/vmcore
objdump -a ./ls #binutils, file format elf64-x86-64
gdb /path/to/application /path/to/corefile
strace

PS:

For kdump kernel, if "KDUMP_KERNELVER" in /etc/sysconfig/kdump is not specified, then the init script will try to find a kdump kernel with the same version number as the running kernel. You can specify this parameter as "2.6.18-407.0.0.0.1.el5PAE" for example.

hwclock and 11 minute mode in linux

May 28th, 2013 Comments off

From manpage of hwclock:

You should be aware of another way that the Hardware Clock is kept synchronized in some systems. The Linux kernel has a mode wherein it copies the System Time to the Hardware Clock every 11 minutes. This is a good mode to use when you are using something sophisticated like ntp to keep your System Time synchronized. (ntp is a way to keep your System Time synchronized either to a time server somewhere on the network or to a radio clock hooked up to your system. See RFC 1305).

This mode (we'll call it "11 minute mode") is off until something turns it on. The ntp daemon xntpd is one thing that turns it on. You can turn it off by running anything, including hwclock --hctosys, that sets the System Time the old fashioned way.

To see if it is on or off, use the command adjtimex --print and look at the value of "status". If the "64" bit of this number (expressed in binary) is equal to 0, 11 minute mode is on. Otherwise, it is off.

If your system runs with 11 minute mode on, don't use hwclock --adjust or hwclock --hctosys. You'll just make a mess. It is acceptable to use hwclock --hctosys at startup time to get a reasonable System Time until your system is able to set the System Time from the external source and start 11 minute mode.

To test whether 11 minute mode is on or not, you can use the following script (yum install adjtimex first; note that the and() function requires gawk):

[root@centos-doxer ~]# cat 11minmode
#!/bin/bash

adjtimex --print | awk '/status/ {
if ( and($2, 64) == 0 ) {
print "11 minute mode is enabled"
} else {
print "11 minute mode is disabled"
}
}'

Actually, we'd better use ntp to do the time syncing on the OS, and then sync the system time to the hardware clock with hwclock --systohc in a cron job.
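For example, a crontab entry like the one below (a sketch; make sure 11 minute mode is off first) would push the system time to the hardware clock once a day:

5 0 * * * /sbin/hwclock --systohc #sync system time to hardware clock at 00:05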

Categories: IT Architecture, Kernel, Linux, Systems Tags:

resolved – differences between zfs ARC L2ARC ZIL

January 31st, 2013 Comments off
  • ARC

The ZFS ARC (adaptive replacement cache) is a very fast cache located in the server's memory.

For example, our ZFS server with 12GB of RAM has 11GB dedicated to ARC, which means our ZFS server will be able to cache 11GB of the most accessed data. Any read requests for data in the cache can be served directly from the ARC memory cache instead of hitting the much slower hard drives. This creates a noticeable performance boost for data that is accessed frequently.

  • L2ARC

As a general rule, you want to install as much RAM into the server as you can to make the ARC as big as possible. At some point, adding more memory is just cost prohibitive. That is where the L2ARC becomes important. The L2ARC is the second level adaptive replacement cache. The L2ARC is often called “cache drives” in the ZFS systems.

L2ARC is a new layer between Disk and the cache (ARC) in main memory for ZFS. It uses dedicated storage devices to hold cached data. The main role of this cache is to boost the performance of random read workloads. The intended L2ARC devices include 10K/15K RPM disks like short-stroked disks, solid state disks (SSD), and other media with substantially faster read latency than disk.

  • ZIL

The ZIL (ZFS Intent Log) exists to improve the performance of synchronous writes. A synchronous write is much slower than an asynchronous write, but it's more reliable. Essentially, the intent log of a file system is nothing more than insurance against power failures, a to-do list if you will, that keeps track of the stuff that needs to be updated on disk, even if the power fails (or something else happens that prevents the system from updating its disks).

To get better performance, use separate disks (SSDs) for the ZIL, e.g. zpool add pool log c2d0.
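A hedged sketch of adding both kinds of devices to an existing pool (the device names are made up; note that log devices can be mirrored, while cache devices cannot):

zpool add pool log mirror c2d0 c2d1 #mirrored SSDs for the ZIL
zpool add pool cache c2d2 c2d3 #SSDs for the L2ARC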

Now I'll give you a real example of ZFS ZIL/L2ARC/ARC on a SUN ZFS 7320 head:

test-zfs# zpool iostat -v exalogic
capacity operations bandwidth
pool alloc free read write read write
------------------------- ----- ----- ----- ----- ----- -----
exalogic 6.78T 17.7T 53 1.56K 991K 25.1M
mirror 772G 1.96T 6 133 111K 2.07M
c0t5000CCA01A5FDCACd0 - - 3 36 57.6K 2.07M #these are the physical disks
c0t5000CCA01A6F5CF4d0 - - 2 35 57.7K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A6F5D00d0 - - 2 36 56.2K 2.07M
c0t5000CCA01A6F64F4d0 - - 2 35 57.3K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A76A7B8d0 - - 2 36 56.3K 2.07M
c0t5000CCA01A746CCCd0 - - 2 36 56.8K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A749A88d0 - - 2 35 56.7K 2.07M
c0t5000CCA01A759E90d0 - - 2 35 56.1K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A767FDCd0 - - 2 35 56.1K 2.07M
c0t5000CCA01A782A40d0 - - 2 35 57.1K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A782D10d0 - - 2 35 57.2K 2.07M
c0t5000CCA01A7465F8d0 - - 2 35 56.3K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A7597FCd0 - - 2 35 57.6K 2.07M
c0t5000CCA01A7828F4d0 - - 2 35 56.2K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A7829ACd0 - - 2 35 57.1K 2.07M
c0t5000CCA01A78278Cd0 - - 2 35 57.4K 2.07M
mirror 772G 1.96T 6 133 111K 2.07M
c0t5000CCA01A736000d0 - - 3 35 57.3K 2.07M
c0t5000CCA01A738000d0 - - 2 35 57.3K 2.07M
c0t5000A72030061B82d0 224M 67.8G 0 98 1 1.62M #ZIL(SSD write cache, ZFS Intent Log)
c0t5000A72030061C70d0 224M 67.8G 0 98 1 1.62M
c0t5000A72030062135d0 223M 67.8G 0 98 1 1.62M
c0t5000A72030062146d0 224M 67.8G 0 98 1 1.62M
cache - - - - - -
c2t2d0 334G 143G 15 6 217K 652K #L2ARC(SSD cache drives)
c2t3d0 332G 145G 15 6 215K 649K
c2t4d0 333G 144G 11 6 169K 651K
c2t5d0 333G 144G 13 6 192K 650K
c2t2d0 - - 0 0 0 0
c2t3d0 - - 0 0 0 0
c2t4d0 - - 0 0 0 0
c2t5d0 - - 0 0 0 0

And as for ARC:

test-zfs:> status memory show
Memory:
Cache 63.4G bytes #ARC
Unused 17.3G bytes
Mgmt 561M bytes
Other 491M bytes
Kernel 14.3G bytes

resolved – bnx2i dev eth0 does not support iscsi

September 19th, 2012 Comments off

A weird incident occurred on a Linux box. The box stopped responding to ping or ssh, although according to ifconfig and the /proc/net/bonding/bond0 file the system said it was running ok. After some googling, I found that the issue might be related to the NIC driver. I tried bringing the NICs down and up one by one, but got errors:

Bringing up loopback interface bond0: bnx2i: dev eth0 does not support iscsi

bnx2i: iSCSI not supported, dev=eth0

bonding: no command found in slaves file for bond bond0. Use +ifname or -ifname

At last, I tried restarting the whole network, i.e. /etc/init.d/network restart. And that did the trick: the networking was then running ok and I could ping/ssh to the box without problem.

resolved – semget failed with status 28 failed oracle database starting up

August 2nd, 2012 Comments off

Today we met a problem with semaphores and were unable to start the Oracle instances. Here's the error message:

ORA-27154: post/wait create failed
ORA-27300: OS system dependent operation:semget failed with status: 28
ORA-27301: OS failure message: No space left on device
ORA-27302: failure occurred at: sskgpcreates

So it turns out the max number of semaphore arrays had been reached:
#check limits of all IPC
root@doxer# ipcs -al

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 67108864
max total shared memory (kbytes) = 17179869184
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 1024000
max ops per semop call = 100
semaphore max value = 32767

------ Messages: Limits --------
max queues system wide = 16
max size of message (bytes) = 65536
default max size of queue (bytes) = 65536

#check summary of semaphores
root@doxer# ipcs -su

------ Semaphore Status --------
used arrays = 127
allocated semaphores = 16890

To resolve this, we need to increase the max number of semaphore arrays (SEMMNI, the 4th field of kernel.sem):

root@doxer# cat /proc/sys/kernel/sem
250 1024000 100 128
                 ^---needs to be increased
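For example, to double SEMMNI and make the change persistent (a sketch; the right sizing depends on your workload):

echo "kernel.sem = 250 1024000 100 256" >> /etc/sysctl.conf
sysctl -p #the four fields are SEMMSL SEMMNS SEMOPM SEMMNI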

PS:

Here's an example with toilets that describes differences between mutex and semaphore LOL http://koti.mbnet.fi/niclasw/MutexSemaphore.html

Resolved – bash /usr/bin/find Arg list too long

July 3rd, 2012 Comments off

Have you ever met an error like the following?

root@doxer# find /PRD/*/connectors/A01/QP*/*/logFiles/* -prune -name "*.log" -mtime +7 -type f |wc -l

bash: /usr/bin/find: Arg list too long

0

The cause of the issue is the kernel limit on the argument list that can be passed to find (as well as ls and other utils): the shell expands the wildcards before find even runs, and ARG_MAX defines the maximum length of arguments for a new process. You can get its value using the command:

root@doxer# getconf ARG_MAX
1048320

To quickly fix this, you can move into the directory first, so there is less for the shell to expand (replace * with subdir_NAME):

cd /PRD/subdir_NAME/connectors/A01/QP*/*/logFiles/;find . -prune -name "*.log" -mtime +7 -type f |wc -l

11382
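Alternatively (a sketch assuming the same directory layout), let find do the traversal itself, so the shell never expands the globs at all:

find /PRD -path '*/connectors/A01/QP*/*/logFiles/*' -name "*.log" -mtime +7 -type f | wc -l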

PS:

  1. you can get all configuration values with getconf -a.
  2. For more solutions about the error "bash: /usr/bin/find: Arg list too long", you can refer to http://www.in-ulm.de/~mascheck/various/argmax/
Categories: IT Architecture, Kernel, Linux, Systems Tags:

ORA-00600 internal error caused by /tmp swap full

June 22nd, 2012 Comments off

Today we encountered a problem where Oracle stopped functioning. After some checking, the error turned out to be caused by /tmp running out of space. This was also confirmed by the OS logs:

Jun 20 17:43:59 tmpfs: [ID 518458 kern.warning] WARNING: /tmp: File system full, swap space limit exceeded

Oracle uses /tmp to compile PL/SQL code, so if there's no space it is unable to compile/execute, which causes functions/procedures/packages and triggers to time out. The same is also described in Oracle note ID 1389623.1.

So in order to prevent further occurrences of this error, we should increase /tmp on the system to at least 4GB.

There is an Oracle parameter to change the default location of these temporary files (_ncomp_shared_objects_dir), but it's not a dynamic parameter. And while there is a way to resize a tmpfs filesystem online, it's somewhat risky. So the best approach is to first bring down the Oracle DB on this host, then modify /etc/vfstab, and then reboot the whole system. This protects our data against the risk of corruption or loss, at the cost of some outage time.

So finally, here are the steps. Amend the line in /etc/vfstab from:

swap - /tmp tmpfs - yes size=512m

To:

swap - /tmp tmpfs - yes size=4096m

Reboot the machine and bring up the Oracle DB.
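After the reboot, a quick sanity check that the new size took effect:

df -h /tmp #the total size should now show 4.0G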

solaris kernel bug – ack replied before sync/ack valid outbound packets dropped

April 9th, 2012 Comments off

If you are intermittently getting the error "ldapserver.test.com:389; socket closed.", then after some tcpdumping of the network traffic you may find the following incorrect package exchange chain:

testhost1 -- > testhost2 (SYN)
testhost1 < -- testhost2 (ACK) -- on this point should be sent SYN ACK package
testhost1 -- > testhost2 (RST) - respectively in case when it didn't receive SYN/ACK – client initiate reset TCP connection

Actually this is a Solaris kernel bug; more info can be found at the link below. The workaround is running this:

ndd -set /dev/ip ip_ire_arp_interval 999999999

After this, the packet drops went down to 1 per week per host.

More info about this kernel bug can be found here http://wesunsolve.net/bugid/id/6942436

hostname is different between linux and solaris

February 21st, 2012 Comments off

1. For Linux, -a is an option of the hostname command:
-a, --alias
Display the alias name of the host (if used).
For example:
[root@linux ~]# hostname -a
linux localhost.localdomain localhost
[root@linux ~]# grep linux /etc/hosts
127.0.0.1 linux.doxer.org linux localhost.localdomain localhost

2. For Solaris:

But for Solaris there's no -a option, which means that if you run hostname -a on a Solaris box, you're actually setting the hostname to "-a", which in turn will cause many problems, especially with LDAP.
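If that has already happened, a quick recovery sketch on Solaris (assuming /etc/nodename still holds the correct name):

hostname $(cat /etc/nodename) #set the running hostname back to the configured one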

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

Too many cron jobs and crond processes running

February 17th, 2012 Comments off

I faced a problem where a ton of crond processes (cron jobs, or crontab entries) were running on the OS:

root@localhost# ps auxww|grep cron
vare 543 0.0 0.0 141148 5904 ? S 01:43 0:00 crond
root 4085 0.0 0.0 72944 976 ? Ss 2010 1:13 crond
vare 4522 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 5446 0.0 0.0 141148 5904 ? S 02:43 0:00 crond
vare 9202 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 10245 0.0 0.0 141148 5908 ? S 03:43 0:00 crond
vare 13989 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 15487 0.0 0.0 141148 5908 ? S 04:43 0:00 crond
vare 18796 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 20448 0.0 0.0 141148 5908 ? S 05:43 0:00 crond
root 23168 0.0 0.0 6024 596 pts/0 S+ 06:15 0:00 grep cron
vare 23474 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 27183 0.0 0.0 141148 5904 ? S Feb16 0:00 crond
vare 28358 0.0 0.0 141148 5904 ? S 00:43 0:00 crond
vare 32032 0.0 0.0 141148 5904 ? S Feb16 0:00 crond

.....(and more)

Now let's see what cron jobs are running for user vare:
root@localhost# crontab -u vare -l
# run the VERA Deploy routine
43 * * * * cd /share/scripts > /dev/null 2>&1 ; sleep 5 ; /share/scripts/Application/VARE/Deploy > /dev/null 2>&1

After checking the script /share/scripts/Application/VARE/Deploy, I can see that it changes directory to an NFS mount point (i.e. cd /share/scripts) and then does some checks (i.e. /share/scripts/Application/VARE/Deploy). But as there was a problem while changing to the NFS mount point, the script hung there and didn't quit normally. As a result, the number of crond processes kept increasing.

The way to solve this specific problem (specific meaning you have to check your own script) is to first kill the hung crond processes, then bounce autofs, and then restart crond.
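To keep a hung run from piling up more crond processes in the first place, one option (a sketch using util-linux's flock) is to skip a run while the previous one still holds the lock:

43 * * * * flock -n /var/tmp/vare-deploy.lock -c 'cd /share/scripts && /share/scripts/Application/VARE/Deploy' > /dev/null 2>&1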

 

using timex to check whether performance degradation caused by OS or VxVM

February 1st, 2012 Comments off

To tell whether performance degradation is caused by the OS or by Volume Manager, we can compare the time the operating system takes to access the disks with the time Volume Manager takes to access the disks. The two should be about the same, since both commands force a read of disk header information; if one of them is markedly greater, that indicates a problem in that area.

#echo | timex /usr/sbin/format #to avoid prompt for user input. Use time instead of timex for linux
real          13.03
user           0.10
sys            1.49

#timex vxdisk -o alldgs list
real           2.65
user           0.00
sys            0.00

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

Linux hostname domainname dnsdomainname nisdomainname ypdomainname

December 20th, 2011 Comments off

Here's just an excerpt from online man page of "domainname":

NAME
hostname - show or set the system's host name
domainname - show or set the system's NIS/YP domain name
dnsdomainname - show the system's DNS domain name
nisdomainname - show or set system's NIS/YP domain name
ypdomainname - show or set the system's NIS/YP domain name

hostname will print the name of the system as returned by the gethostname(2) function.

domainname, nisdomainname, ypdomainname will print the name of the system as returned by the getdomainname(2) function. This is also known as the YP/NIS domain name of the system.

dnsdomainname will print the domain part of the FQDN (Fully Qualified Domain Name). The complete FQDN of the system is returned with hostname --fqdn.

Sometimes you may find a weird thing: you can use LDAP authentication to log on to a client, but you cannot sudo to root. In that case, run domainname to check whether it's set to (none). If it is, set the domain name using the domainname command.

Extending tmpfs’ed /tmp on Solaris 10(and linux) without reboot

November 3rd, 2011 Comments off

Thanks to Eugene.

If you need to extend a tmpfs /tmp on a Solaris 10 global zone (works with zones too, but needs adjustments) and don't want to undertake a reboot, here's a tried and working solution.

PLEASE BE CAREFUL, ONE ERROR HERE WILL KILL THE LIVE KERNEL!

echo "$(echo $(echo ::fsinfo | mdb -k | grep /tmp | head -1 | awk '{print $1}')::print vfs_t vfs_data \| ::print -ta struct tmount tm_anonmax | mdb -k | awk '{print $1}')/Z 0x20000" | mdb -kw

Note the 0x20000. This number means the new size will be 1GB. It is calculated like this: as an example, 0x10000 in hex is 65536, or 64k. The size is set in pages, and each page is 8k, so the resulting allocation is 64k pages * 8k = 512MB. 0x20000 is 1GB, 0x40000 is 2GB, etc.

If the server has zones, you will see more then one entry in ::fsinfo, and you need to feed exact struct address to mdb. This way you can change /tmp size for individual zones, but this can only be done from global zone.

Same approach can probably be applied to older Solaris releases but will definitely need adjustments. Oh, and in case you care, on Linux it's as simple as "mount -o remount,size=1G /tmp" :)

 PS:

You may try "mount -t tmpfs -o remount,size=7g shmfs /dev/shm" or "mount -t tmpfs -o remount,size=7g tmpfs /dev/shm" on Linux platform.

Categories: IT Architecture, Kernel, Systems, Unix Tags:

Error extras/ata_id/ata_id.c:42:23: fatal error: linux/bsg.h: No such file or directory when compile LFS udev-166

September 21st, 2011 1 comment

For "Linux From Scratch - Version 6.8", on part "6.60. Udev-166", after configuration on udev, we need compile the package using make, but later, I met the error message like this:

"extras/ata_id/ata_id.c:42:23: fatal error: linux/bsg.h: No such file or directory"

After checking line 42 of ata_id.c under extras/ata_id in the udev-166 source, I can see:

"#include <linux/bsg.h>"

As there's no bsg.h under $LFS/usr/include/linux, I was sure that this error was caused by a missing C header file. Checking with:

root:/sources/udev-166# /lib/libc.so.6

I can see that Glibc was 2.13, and GCC was 4.5.2:

"
GNU C Library stable release version 2.13, by Roland McGrath et al.
Copyright (C) 2011 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.5.2.
Compiled on a Linux 2.6.22 system on 2011-09-04.
Available extensions:
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<http://www.gnu.org/software/libc/bugs.html>.
"

After some searching on Google, I concluded that this was caused by the kernel headers version (the toolchain was built on Linux 2.6.22, as shown above, and bsg.h only appeared in later kernel headers) rather than by GCC itself. I was not going to rebuild the toolchain for this, so I just fetched the header file and put it under $LFS/usr/include/linux/bsg.h. You can go to http://lxr.free-electrons.com/source/include/linux/bsg.h?v=2.6.25 to download the header file.

After copy & paste & chmod, I ran make again, and it succeeded.

Categories: IT Architecture, Kernel, Linux, Systems Tags:

correctable ecc event detected by cpu0/1/3/4

July 20th, 2011 Comments off

If you receive these kinds of alerts on Solaris, it means your server has a memory DIMM issue. Please check with hrdconf/madmin (HP-UX) or prtconf (Sun Solaris) to see the error message.

For more information about ECC memory, you can refer to the following article: http://en.wikipedia.org/wiki/ECC_memory