Posts Tagged ‘linux’

resolved – /lib/ bad ELF interpreter: No such file or directory

November 7th, 2014 Comments off

One of our scripts reported an error when we ran it today:

[root@testhost01 ~]# su - user1
[user1@testhost01 ~]$ /home/testuser/run_as_root 'su'
-bash: /usr/local/packages/aime/ias/run_as_root: /lib/ bad ELF interpreter: No such file or directory

From the output, we can see that it is complaining about not finding the file /lib/

[user1@testhost01 ~]$ ls -l /lib/
ls: cannot access /lib/ No such file or directory

I then checked on another host and found /lib/ belonged to package glibc:

[root@centos-doxer ~]# ls -l /lib/
lrwxrwxrwx 1 root root 9 May 9 2013 /lib/ ->
[root@centos-doxer ~]# rpm -qf /lib/

However, on the problematic host, glibc was installed:

[root@testhost01 user1]# rpm -qa|grep glibc

I then tried making a soft link from /lib64/ to /lib/

[root@testhost01 ~]# ln -s /lib64/ /lib/
[root@testhost01 ~]# su - user1
[user1@testhost01 ~]$ /usr/local/packages/aime/ias/run_as_root su
-bash: /usr/local/packages/aime/ias/run_as_root: Accessing a corrupted shared library

Hmmm, so now it complained about a corrupted shared library. Maybe we need the 32-bit glibc? So I removed the soft link, and then installed glibc.i686:

rm -rf /lib/
yum -y install glibc.i686

After installation, I found /lib/ was there already:

[root@testhost01 user1]# ls -l /lib/
lrwxrwxrwx 1 root root 10 Nov 7 03:46 /lib/ ->
[root@testhost01 user1]# rpm -qf /lib/

And when I ran the command again, it succeeded:

[root@testhost01 user1]# su - user1
[user1@testhost01 ~]$ /home/testuser/run_as_root 'su'
[root@testhost01 user1]#

So from this, we can see that the issue was caused by /usr/local/packages/aime/ias/run_as_root being a 32-bit binary, which needs the 32-bit glibc.
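A quick way to confirm this kind of mismatch is to look at the binary's ELF header. The sketch below is only an illustration: elf_class is a hypothetical helper (not part of the original troubleshooting) that reads the EI_CLASS byte with od. Where they are installed, file <binary> or readelf -l <binary> report the same thing, plus the interpreter the binary requests.

```shell
# Hypothetical helper: report whether a file is a 32-bit or 64-bit ELF binary.
elf_class() {
    local magic class
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not ELF"
        return 1
    fi
    # The fifth byte (EI_CLASS) is 1 for 32-bit and 2 for 64-bit
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown"; return 1 ;;
    esac
}
```

On the problematic host, elf_class /usr/local/packages/aime/ias/run_as_root would be expected to print 32-bit, which matches the fix of installing glibc.i686.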

Categories: IT Architecture, Kernel, Linux, Systems Tags:

resolved – auditd STDERR: Error deleting rule Error sending enable request (Operation not permitted)

September 19th, 2014 Comments off

Today when I tried to restart auditd, the following error message came up:

[2014-09-18T19:26:41+00:00] ERROR: service[auditd] (cookbook-devops-kernelaudit::default line 14) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
---- Begin output of /sbin/service auditd restart ----
STDOUT: Stopping auditd: [  OK  ]
Starting auditd: [FAILED]
STDERR: Error deleting rule (Operation not permitted)
Error sending enable request (Operation not permitted)
---- End output of /sbin/service auditd restart ----
Ran /sbin/service auditd restart returned 1

After some reading of the auditctl man page, I realized that when the audit "enabled" flag is set to 2 (locked), any attempt to change the configuration is audited and denied. That may be the reason for "STDERR: Error deleting rule (Operation not permitted)" and "Error sending enable request (Operation not permitted)". Here's the relevant part of the auditctl man page:

-e [0..2] Set enabled flag. When 0 is passed, this can be used to temporarily disable auditing. When 1 is passed as an argument, it will enable auditing. To lock the audit configuration so that it can't be changed, pass a 2 as the argument. Locking the configuration is intended to be the last command in audit.rules for anyone wishing this feature to be active. Any attempt to change the configuration in this mode will be audited and denied. The configuration can only be changed by rebooting the machine.

You can run auditctl -s to check the current setting:

[root@centos-doxer ~]# auditctl -s
AUDIT_STATUS: enabled=1 flag=1 pid=3154 rate_limit=0 backlog_limit=320 lost=0 backlog=0

You can run auditctl -e <0|1|2> to change this flag on the fly, or add -e <0|1|2> to /etc/audit/audit.rules. Please note that once the flag is 2 (locked), it can no longer be changed on the fly: you have to edit /etc/audit/audit.rules and reboot for the new value to take effect.
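To check the flag from a script before attempting a restart, something like the sketch below could be used. audit_enabled_flag is a hypothetical helper; it assumes the status text contains either enabled=N (the layout shown above) or enabled N on its own line (which, as far as I know, is what newer auditctl versions print).

```shell
# Hypothetical helper: read `auditctl -s` output on stdin, print the enabled flag.
audit_enabled_flag() {
    # Accept both "enabled=1" and "enabled 1" layouts; keep the first match only
    sed -n 's/.*enabled[= ]\([0-9]\).*/\1/p' | head -n 1
}
```

Typical use would be auditctl -s | audit_enabled_flag; a result of 2 means the configuration is locked and a restart of auditd is expected to fail until the next reboot.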


Here's more about linux audit.

resolved – Permission denied even after chmod 777 world readable writable

September 19th, 2014 Comments off

Several team members told me that when they tried to change into some directories or read some files, the system reported "Permission denied". Even after making the file world writable (chmod 777), the error was still there:

-bash-3.2$ cd /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs
-bash: cd: /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs: Permission denied

-bash-3.2$ cat /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out
cat: /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out: Permission denied

-bash-3.2$ ls -l /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out
-rwxrwxrwx 1 oracle oinstall 1100961066 Sep 19 07:37 /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out

In summary, to read a file (e.g. wls_sdi1.out) under some directory (e.g. /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs), the "read bit" on the file itself (chmod +r wls_sdi1.out) is not enough. Every parent directory of that file (/u01, /u01/local, ......, /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs) also needs the "execute bit" set so that it can be traversed, plus the "read bit" if you want to list its contents (you can check with ls -ld <dir name>):

chmod +r wls_sdi1.out #first set "read bit" on the file
chmod +r /u01; chmod +x /u01; chmod +r /u01/local; chmod +x /u01/local; <...skipped...>chmod +r /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs; chmod +x /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs; #then set both "read bit" & "execute bit" on all parent directories

Finally, if you can log on as the file owner, things are usually simpler. /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out is owned by the oracle user, so you can try logging on as oracle and doing the operations there.
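Checking every parent directory by hand is tedious, so here is a small sketch that prints the permissions of each path component in one go. show_path_perms is a hypothetical helper name and it assumes GNU stat and readlink; namei -l <path> from util-linux gives similar output where it is available.

```shell
# Hypothetical helper: print mode, owner and name for a path and all its parents.
show_path_perms() {
    local p
    p=$(readlink -f "$1") || return 1
    while :; do
        # %A = permission string, %U:%G = owner and group, %n = file name
        stat -c '%A %U:%G %n' "$p"
        if [ "$p" = "/" ]; then
            break
        fi
        p=$(dirname "$p")
    done
}
```

Any directory in the output that lacks x for the user in question is where the "Permission denied" comes from.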


  • -rw-rw-r--+ extended security information, + means ACL set (getfacl, setfacl). If it's NFS and 777 not working, then make sure no_root_squash enabled. For ZFS share, make sure to set on ZFS (User/Group & Anonymous user mapping -> root/root, Permission & Root directory Access rwx/rwx/rwx), and /etc/fstab "rsize=32768,wsize=32768,hard,nolock,timeo=14,noacl,intr,mountvers=3,vers=3" (may need reboot to fix permission issue)
  • -rw-rw-r--. means SELinux context set, to remove it, try find / -exec setfattr -h -x security.selinux {} \;
Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

yum install specified version of packages

July 15th, 2014 Comments off

Assume that you want to install one specified version of package, say glibc-2.5-118.el5_10.2.x86_64:

[root@centos-doxer ~]# yum list|grep glibc
glibc.i686 2.5-107.el5_9.4 installed
glibc.x86_64 2.5-107.el5_9.4 installed
glibc-common.x86_64 2.5-107.el5_9.4 installed
glibc-devel.i386 2.5-107.el5_9.4 installed
glibc-devel.x86_64 2.5-107.el5_9.4 installed
glibc-headers.x86_64 2.5-107.el5_9.4 installed
compat-glibc.i386 1:2.3.4-2.26 el5_latest
compat-glibc.x86_64 1:2.3.4-2.26 el5_latest
compat-glibc-headers.x86_64 1:2.3.4-2.26 el5_latest
glibc.i686 2.5-118.el5_10.2 el5_latest
glibc.x86_64 2.5-118.el5_10.2 el5_latest
glibc-common.x86_64 2.5-118.el5_10.2 el5_latest
glibc-devel.i386 2.5-118.el5_10.2 el5_latest
glibc-devel.x86_64 2.5-118.el5_10.2 el5_latest
glibc-headers.x86_64 2.5-118.el5_10.2 el5_latest
glibc-utils.x86_64 2.5-118.el5_10.2 el5_latest

Then you should run yum install glibc-2.5-118.el5_10.2.x86_64. The format of the command is yum install <packagename>-<version>.<platform, such as x86_64>.
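If you need to do this for several packages, the install arguments can be built from the yum list lines themselves. This is just a sketch: yum_list_to_spec is a hypothetical helper that assumes the three-column layout shown above (name.arch, version, repo) and drops any leading epoch such as 1:.

```shell
# Hypothetical helper: turn "name.arch version repo" lines into name-version.arch.
yum_list_to_spec() {
    awk 'NF >= 2 && $1 ~ /\./ {
        name = $1; arch = $1
        sub(/\.[^.]*$/, "", name)   # strip ".arch" from the package name
        sub(/.*\./, "", arch)       # keep only the part after the last dot
        ver = $2
        sub(/^[0-9]+:/, "", ver)    # drop an epoch prefix like "1:"
        print name "-" ver "." arch
    }'
}
```

For example, yum list | grep 2.5-118 | yum_list_to_spec would print one argument per line that can be passed to yum install.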

Categories: IT Architecture, Linux, Systems Tags:

linux process accounting set up

July 8th, 2014 Comments off

Ensure the psacct package is installed and make it start at boot:

rpm -qa|grep -i psacct
chkconfig psacct on
service psacct start

Here are some useful commands:

[root@qg-dc2-tas_sdi ~]# ac -p #Display time totals for each user
emcadm 0.00
test1 2.57
aime 37.04
oracle 32819.22
root 12886.86
testuser 1.47
total 45747.15

[root@qg-dc2-tas_sdi ~]# lastcomm testuser #Display commands executed by user testuser
top testuser pts/5 0.02 secs Fri Jul 4 03:59
df testuser pts/5 0.00 secs Fri Jul 4 03:59

[root@qg-dc2-tas_sdi ~]# lastcomm top #Search the accounting logs by command name
top testuser pts/5 0.03 secs Fri Jul 4 04:02

[root@qg-dc2-tas_sdi ~]# lastcomm pts/5 #Search the accounting logs by terminal name pts/5
top testuser pts/5 0.03 secs Fri Jul 4 04:02
sleep X testuser pts/5 0.00 secs Fri Jul 4 04:02

[root@qg-dc2-tas_sdi ~]# sa |head #Use sa to print summary information (e.g. the number of times each command was called and the system resources used) about previously executed commands.
332 73.36re 0.03cp 8022k
33 8.76re 0.02cp 7121k ***other*
14 0.02re 0.01cp 26025k perl
7 0.00re 0.00cp 16328k ps
49 0.00re 0.00cp 2620k find
42 0.00re 0.00cp 13982k grep
32 0.00re 0.00cp 952k tmpwatch
11 0.01re 0.00cp 13456k sh
11 0.00re 0.00cp 2179k makewhatis*
8 0.01re 0.00cp 2683k sort

[root@qg-dc2-tas_sdi ~]# sa -u |grep testuser #Display output per-user
testuser 0.00 cpu 14726k mem sleep
testuser 0.03 cpu 4248k mem top
testuser 0.00 cpu 22544k mem sshd *
testuser 0.00 cpu 4170k mem id
testuser 0.00 cpu 2586k mem hostname

[root@qg-dc2-tas_sdi ~]# sa -m | grep testuser #Display the number of processes and number of CPU minutes on a per-user basis
testuser 22 8.18re 0.00cp 7654k
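The accounting output is easy to post-process. As a rough sketch, the hypothetical helper below ranks users by connect time from ac -p output; it assumes the two-column user/hours layout shown above and skips the total line.

```shell
# Hypothetical helper: read `ac -p` output on stdin, print the top N users by hours.
top_login_users() {
    awk '$1 != "total" && NF == 2 { print $1, $2 }' |
        LC_ALL=C sort -k2,2nr |
        head -n "${1:-3}"
}
```

Typical use would be ac -p | top_login_users 5.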

Categories: IT Architecture, Linux, Systems, Unix Tags:

Resolved – Your boot partition is on a disk using the GPT partitioning scheme but this machine cannot boot using GPT

June 12th, 2014 1 comment

Today when I tried to install Oracle VM Server on one server, the following error occurred:

Your boot partition is on a disk using the GPT partitioning scheme but this machine cannot boot using GPT. This can happen if there is not enough space on your hard drive(s) for the installation.

So to go on with the installation, I had to find a way to erase the GPT partition table on the drive.

To do this, the first step is to boot into linux rescue mode from the CDROM (another way: while installing OVS, use Alt-F2 to switch from the installer to a different terminal screen, then use fdisk from the command line to manually repartition the disk using a dos partition table):


Later, checking with fdisk -l, I could see that /dev/sda was the only disk whose GPT label needed erasing. So I used dd if=/dev/zero of=/dev/sda bs=512 count=1 to erase the partition table in the first sector.
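Since dd against a real disk is destructive, it can be worth rehearsing the command on a scratch image file first. The sketch below does exactly that; zero_first_sector is a hypothetical helper, and conv=notrunc is added so that an image file is not truncated (it makes no difference on a block device). Keep in mind that GPT also keeps its header in the second sector and a backup copy at the end of the disk, so tools such as wipefs -a or sgdisk --zap-all may be a more thorough option where they are available.

```shell
# Hypothetical helper: overwrite only the first 512-byte sector of a file or device.
zero_first_sector() {
    # conv=notrunc leaves everything after the first sector untouched
    dd if=/dev/zero of="$1" bs=512 count=1 conv=notrunc 2>/dev/null
}
```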




After this, I ran fdisk -l again and saw that the partition table was now gone:


Then restart the OVS server installation. When the following message is shown, select "No":


And select "Yes" when the message below is shown, so that a new partition table can be created:


The steps after this were the normal ones, and the installation went smoothly.


  • You can disable the internal USB device in the BIOS:

To keep such devices from being used as system disks during installation, you can do the following:



  • If the disk is larger than 2T, there's no way to soft convert from GPT to MBR, so you'll need to decrease the disk size from the BIOS during boot. Here's an example of using the LSI Raid Controller MegaRAID BIOS Config Utility Drive 252 to reconfigure disks from the Oracle iLOM GUI console (you can check here for MegaCLI). (tips - the first sector contains the MBR <446 bytes> and the partition table <64 bytes>, which holds up to 3 primary partitions and 1 extended partition. The extended partition can contain many logical partitions)

Alt + A to enable/disable shortcut select.

When shortcut is disabled - TAB moves between items, and Enter works as expected.

When shortcut is enabled - the Space key acts as Enter.

If you want to go into BIOS, press F2.

Press Ctrl + H to go to MegaRaid WebBIOS (in newer versions, press Ctrl + R). Make sure progress is 100% for a newly created VD.



3-clear configuration

4-add configuration

5-manual configuration










Below is for new version(use small strip size for DB):



Resolved – rm cannot remove some files with error message “Device or resource busy”

June 11th, 2014 1 comment

If you hit the error below when removing files on linux:

[root@test-host ~]# rm -rf /u01/shared/*
rm: cannot remove `/u01/shared/WLS/oracle_common/soa/modules/oracle.soa.mgmt_11.1.1/.nfs0000000000004abf00000001': Device or resource busy
rm: cannot remove `/u01/shared/WLS/oracle_common/modules/oracle.jrf_11.1.1/.nfs0000000000005c7a00000002': Device or resource busy
rm: cannot remove `/u01/shared/WLS/OracleHome/soa/modules/oracle.soa.fabric_11.1.1/.nfs0000000000006bcf00000003': Device or resource busy

Then it means that some processes are still referring to these files. You have to stop these processes before removing the files. You can use the linux command lsof to find the processes using specific files:

[root@test-host ~]# lsof |grep nfs0000000000004abf00000001
java 2956 emcadm mem REG 0,21 1095768 19135 /u01/shared/WLS/oracle_common/soa/modules/oracle.soa.mgmt_11.1.1/.nfs0000000000004abf00000001 (slce49sn-nas:/export/C9QA123_DC1/tas_central_shared)
java 2956 emcadm 88r REG 0,21 1095768 19135 /u01/shared/WLS/oracle_common/soa/modules/oracle.soa.mgmt_11.1.1/.nfs0000000000004abf00000001 (slce49sn-nas:/export/C9QA123_DC1/tas_central_shared)

So from here you can see that the process with PID 2956 is still using the file /u01/shared/WLS/oracle_common/soa/modules/oracle.soa.mgmt_11.1.1/.nfs0000000000004abf00000001.

However, some systems do not have lsof installed by default. Then you can install it, or use the alternative "fuser":

[root@test-host ~]# fuser -cu /u01/shared/WLS/oracle_common
/u01/shared/WLS/oracle_common: 2956m(emcadm) 7358c(aime)

Here you can also see that the processes with PIDs 2956 and 7358 are referring to the directory /u01/shared/WLS/oracle_common.

So you'll need to stop the process first, by killing it (or by using the process's own stop() method if defined):

kill -9 2956

After that, try removing the files again; it should work this time.
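If neither lsof nor fuser is installed, the same information can be dug out of /proc. The sketch below is a hypothetical helper (pids_using_file) that only looks at open file descriptors; it does not see memory-mapped files like the mem entry in the lsof output above, and without root it only sees your own processes.

```shell
# Hypothetical helper: print PIDs that have the given file open, by scanning /proc.
pids_using_file() {
    local target fd pid
    target=$(readlink -f "$1") || return 1
    for fd in /proc/[0-9]*/fd/*; do
        # Each fd entry is a symlink to the file it refers to
        [ "$(readlink "$fd" 2>/dev/null)" = "$target" ] || continue
        pid=${fd#/proc/}
        echo "${pid%%/*}"
    done | sort -un
}
```

Before reaching for kill -9, it is usually worth trying a plain kill <PID> first so the process gets a chance to close its files cleanly.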

Categories: IT Architecture, Kernel, Linux, Systems, Unix Tags:

Resolved – input_userauth_request: invalid user root

May 15th, 2014 2 comments

Today I tried to ssh to one linux box but it failed, and /var/log/secure gave the following messages:

May 15 04:05:07 testbox sshd[22925]: User root from not allowed because not listed in AllowUsers
May 15 04:05:07 testbox sshd[22928]: input_userauth_request: invalid user root
May 15 04:05:07 testbox unix_chkpwd[22929]: password check failed for user (root)
May 15 04:05:07 testbox sshd[22925]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost= user=root
May 15 04:05:09 testbox sshd[22925]: Failed password for invalid user root from port 50362 ssh2
May 15 04:05:10 testbox unix_chkpwd[22930]: password check failed for user (root)
May 15 04:05:11 testbox sshd[22928]: Connection closed by

Then I checked /etc/ssh/sshd_config and modified the following:

[root@testbox ~]# egrep 'PermitRoot|AllowUser' /etc/ssh/sshd_config
PermitRootLogin yes #change this to yes
#AllowUsers testuser #comment out this

Then I restarted sshd (service sshd restart), and ssh worked.
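To see quickly whether AllowUsers is what blocks a login, a check like the sketch below can help. sshd_allowusers_check is a hypothetical helper and a simplification: it only does exact name matches on active AllowUsers lines and ignores user@host patterns, wildcards, and the other Allow/Deny directives.

```shell
# Hypothetical helper: print "allowed" or "denied" for a user based on AllowUsers only.
sshd_allowusers_check() {
    local user=$1 cfg=${2:-/etc/ssh/sshd_config} lines
    # Active (uncommented) AllowUsers lines; the keyword is case-insensitive
    lines=$(grep -Ei '^[[:space:]]*AllowUsers[[:space:]]' "$cfg" || true)
    if [ -z "$lines" ]; then
        echo "allowed"    # no allow-list configured at all
    elif printf '%s\n' "$lines" |
        awk -v u="$user" '{ for (i = 2; i <= NF; i++) if ($i == u) found = 1 }
                          END { exit !found }'; then
        echo "allowed"
    else
        echo "denied"
    fi
}
```

After editing the file, running sshd -t before the restart checks the configuration for syntax errors.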

Categories: IT Architecture, Linux, Systems Tags: ,

linux tips

April 10th, 2014 Comments off
Linux system tips
ls -lu(access time, like cat file) -lt(modification time, like vi, ls -l defaults to use this) -lc(change time, chmod), stat ./aa.txt <UTC>
ctrl +z #bg and stopped
%1     #fg and running
%1 & #bg and running
man dd > dd.txt #or "man dd | col -b > dd.txt"
cat > listbkup.rman << EOF
pgrep -flu oracle  # processes owned by the user oracle
cat a|grep '\W$' #regular expression of grep, more regular expressions here
#character end with
if [ `cat ${path1}/all_assemblies_BE.list | grep -c "$BEname$"` -gt 1 ]; then
cat ${path1}/all_assemblies_BE.list | grep "$BEname$" >> ${path}/all_assemblies_BE_filter.list
echo $BEname >> ${path1}/Confliction_BE.list
fi
watch free -m #refresh every 2 seconds
pmap -x 30420 #memory mapping.
dmesg -n 1 #prevents all messages, except panic messages, from appearing on the console
head /etc/rsyslog.conf #do not log certain logs
:msg, contains, "Unable to register client with session bus" ~
:msg, contains, "error getting session bus" ~

./configure --prefix=/usr/local/ --enable-shared LDFLAGS="-L/usr/local/ssl/lib" #if you are using non-standard Lib path

mount --bind #more info here
blockdev --getbsz /dev/xvda1 #get blocksize of FS. Here is all about DM multipath(no_path_retry queue in /etc/multipath.conf, this means queueing should not stop until the path is fixed. This can avoid kernel panic for RAC LUNs when there is short network outage). Here is more about some multipath configs using blockdev (multipath -ll; multipath -v 3; ls -ld /sys/block/dm-4/{slaves, holders}/{dm-11,dm-13}, ls /sys/block/dm-4/slaves/sdg/. sysfs has /sys/block/$device/{holders,slaves} directories to represent which devices depend on which others. From here, we can see dm-11/dm-13 are partitions created on top of dm-4, and dm-4 holds /dev/sdg). Sometime when you met error "/dev/sda3 is apparently in use by the system; will not make a filesystem here!", then try to mkfs on "/dev/mapper/xxxx"
[root@test ~]# multipath -l
3600605b00a2f47e0000c5e820a4351fe dm-0 LSI,MR9361-8i
size=1.1T features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=0 status=active
  `- 0:2:2:0 sdc 8:32  active undef running

[root@test ~]# blkid|grep 3600605b00a2f47e0000c5e820a4351fe1
/dev/mapper/3600605b00a2f47e0000c5e820a4351fep1: UUID="da2359ef-815f-455d-9402-d28031129afd" TYPE="ext3" SEC_TYPE="ext2"

[root@test ~]# ls /sys/block/dm-0/holders/
dm-17  dm-18

[root@test ~]# cat /etc/fstab|grep da2359ef-815f-455d-9402-d28031129afd
UUID=da2359ef-815f-455d-9402-d28031129afd /extend2 ext3 defaults 0 0

[root@test ~]# dmsetup ls
3600605b00a2f47e0000c5e820a4351fe       (252, 0)
3600605b00a2f47e0000c5e820a4351fep1     (252, 17)
3600605b00a2f47e0000c5e820a4351fep9     (252, 18)

[root@test ~]# fdisk -l /dev/sdc

    WARNING: GPT (GUID Partition Table) detected on '/dev/sdc'! The util fdisk doesn't support GPT. Use GNU Parted.

    Disk /dev/sdc: 1198.9 GB, 1198999470080 bytes
    255 heads, 63 sectors/track, 145770 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

       Device Boot      Start         End      Blocks   Id  System
    /dev/sdc1               1      145771  1170897919+  ee  EFI GPT

[root@test ~]# parted /dev/sdc print #more info about parted is here

    Model: LSI MR9361-8i (scsi)
    Disk /dev/sdc: 1199GB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt

    Number  Start   End     Size    File system  Name     Flags
     1      131kB   1199GB  1199GB  ext3         zfs
     9      1199GB  1199GB  8389kB               solaris

    Information: Don't forget to update /etc/fstab, if necessary.

To split an 8T disk into 8*1T partitions, we need to use a GPT partition table, with parted as the partitioning tool (more about parted is here).

[root@server1 kernel]# parted /dev/sdb #more info about parted is here
GNU Parted 2.1
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt
Warning: The existing disk label on /dev/sdb will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? Yes
(parted) mkpart primary 0GB 1024GB #primary/extended/logical, 'extended/logical' are specific to msdos partition table. 
(parted) mkpart primary 1024GB 2048GB
(parted) mkpart primary 2048GB 3072GB
(parted) mkpart primary 3072GB 4096GB
(parted) mkpart primary 4096GB 5120GB
(parted) mkpart primary 5120GB 6144GB
(parted) mkpart primary 6144GB 7168GB
(parted) mkpart primary 7168GB 8192GB
(parted) print
Model: LSI MR9361-8i (scsi)
Disk /dev/sdb: 8192GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number Start End Size File system Name Flags
 1 1049kB 1024GB 1024GB primary
 2 1024GB 2048GB 1024GB primary
 3 2048GB 3072GB 1024GB primary
 4 3072GB 4096GB 1024GB primary
 5 4096GB 5120GB 1024GB primary
 6 5120GB 6144GB 1024GB primary
 7 6144GB 7168GB 1024GB primary
 8 7168GB 8192GB 1024GB primary

(parted) help
 align-check TYPE N check partition N for TYPE(min|opt) alignment
 check NUMBER do a simple check on the file system
 cp [FROM-DEVICE] FROM-NUMBER TO-NUMBER copy file system to another partition
 help [COMMAND] print general help, or help on COMMAND
 mklabel,mktable LABEL-TYPE create a new disklabel (partition table)
 mkfs NUMBER FS-TYPE make a FS-TYPE file system on partition NUMBER
 mkpart PART-TYPE [FS-TYPE] START END make a partition
 mkpartfs PART-TYPE FS-TYPE START END make a partition with a file system
 move NUMBER START END move partition NUMBER
 name NUMBER NAME name partition NUMBER as NAME
 print [devices|free|list,all|NUMBER] display the partition table, available devices, free space, all found partitions, or a particular partition
 quit exit program
 rescue START END rescue a lost partition near START and END
 resize NUMBER START END resize partition NUMBER and its file system
 rm NUMBER delete partition NUMBER
 select DEVICE choose the device to edit
 set NUMBER FLAG STATE change the FLAG on partition NUMBER
 toggle [NUMBER [FLAG]] toggle the state of FLAG on partition NUMBER
 unit UNIT set the default unit to UNIT
 version display the version number and copyright information of GNU Parted

(parted) quit
Information: You may need to update /etc/fstab.
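The interactive session above can also be scripted, since parted has a -s (script) mode. The sketch below only prints the commands instead of running them, so they can be reviewed first; gen_parted_cmds is a hypothetical helper and the device name and sizes are just the ones from the example above.

```shell
# Hypothetical helper: print (not run) parted commands that create N equal partitions.
gen_parted_cmds() {
    local dev=$1 count=$2 size_gb=$3 i=0
    echo "parted -s $dev mklabel gpt"
    while [ "$i" -lt "$count" ]; do
        echo "parted -s $dev mkpart primary $((i * size_gb))GB $(((i + 1) * size_gb))GB"
        i=$((i + 1))
    done
}
```

For example, gen_parted_cmds /dev/sdb 8 1024 prints one mklabel line and eight mkpart lines; piping the output to sh would actually apply them, which destroys the existing partition table.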
dumpe2fs /dev/xvda1 |grep 'Block size'
parted -l #ext3 or ext4
partprobe /dev/sda
time dd if=/dev/zero of=/media/test01.img bs=1M count=3000 conv=fsync
awk '/abc/{flag=1;next}/def/{flag=0}flag' file #print lines between two matches
sed -ne '/ovm -uadmin/,//p' #print lines after match
sed '/^\s*$/d' #remove empty lines
cat a|tac|sed '/storage on the peer/I,+2 d'|tac #to remove 2 lines above match
sed -i.bak2014 's/HISTSIZE=1000/HISTSIZE=100000/' /etc/profile
echo 71_testhost_emgc | sed -re 's/.*_(testhost.*)_.*/\1/g' #output testhost
sed -re 's/User_Alias USERS_SDITAS.*/&, xiaozliu/g' a.txt #'&' refers to matched content
sed -i "s/xvda,w'/xvda,w',\n/g" $i/vm.cfg #this will replace System.img,xvda,w'] with System.img,xvda,w',
#To append after the pattern
sed -i '/pattern/a \
line1 \
line2' inputfile
sed -i '/System.img,xvda,w/a \"'file:/var/ovs/mount/157836ABB7C647088D8EF009923FE2E7/running_pool/$i/u02.img,xvdb,w',"' $i/vm.cfg
#To prepend the lines before
sed -i '/pattern/i \
line1 \
line2' inputfile
echo 'export HISTTIMEFORMAT="%h/%d - %H:%M:%S"' >> /etc/profile
ovm svr ls|sort -rn -k 4 #sort by column 4
cat a|sort -t . -k 3,3n -k 4,4n|less #linux IP sort
awk '{print $NF,$0}' a| sort -nr | cut -f2- -d' ' #sort by last column
echo test_1_2_3|awk -F_ '{ print substr($0, index($0,$2)) }' #print column 2 to last column
cat a1|sort|uniq -c |sort #SUS
[root@test-doxer monitor]# cat /etc/prun_doxer/modules/monitor/result_rac.txt | grep ':'
- more than 90% - testhost0058:/scratch 215G 185G 20G 91% /net/testhost0058/scratch
testhost0056 - more than 90% - testhost0058:/scratch 215G 185G 20G 91% /net/testhost0058/scratch
[root@test-doxer monitor]# cat /etc/prun_doxer/modules/monitor/result_rac.txt | grep ':' | sort -u -k7,7 | sort -rnk 11
- more than 90% - testhost0058:/scratch 215G 185G 20G 91% /net/testhost0058/scratch
ovm svr ls|uniq -f3 #skip the first three columns, this will list only 1 server per pool
for i in <all OVMs>;do ( $i &);done #instead of using nohup &
ovm vm ls|egrep "`echo testhost{0\|,1\|,2\|,3\|,4}|tr -d '[:space:]'`"
xm info|grep nr_cpu;xm list|egrep -v 'Domain-0|VCPUs'|awk '{ SUM += $4} END { print SUM }' #get cpu allocated in xen hypervisor(OVS)
cat a|awk '{print $5}'|tr '\n' ' '
awk '{print NF" "FILENAME;exit}' file.txt #get column count of file along with filenames
ovm -uadmin -ppassword svrp info -s testpool|egrep 'CPUs|Memory'|awk -F ': ' 'BEGIN{count=0;}{num[count]=$2;print $0;count++;}; END {print num[1]/num[0]*100,(num[2]-num[3])/num[2]*100}' #percent of used cpu, percent of used memory
Total Number of CPUs: 128
Allocated VCPUs: 50
Total Memory(MB): 1048532
Free Memory(MB): 652236
free -m | awk '/Mem:/ { total=$2 } /buffers\/cache/ { free=$4 } END { print free"\t"total"\t"free/total*100}'  #percent of free memory
if [ "$1" == "all" ];then
for i in `ovm -uadmin -ppassword svrp ls|sed 1d|awk '{print $1}'`;do echo $i;ovm -uadmin -ppassword svrp info -s $i |egrep 'CPUs|Memory'|awk -F ': ' 'BEGIN{count=0;}{num[count]=$2;print $0;count++;}; END {print num[1]/num[0]*100,(num[2]-num[3])/num[2]*100}';echo;sleep 5;done
else
echo $pool;ovm -uadmin -ppassword svrp info -s $1 |egrep 'CPUs|Memory'|awk -F ': ' 'BEGIN{count=0;}{num[count]=$2;print $0;count++;}; END {print num[1]/num[0]*100,(num[2]-num[3])/num[2]*100}';echo;
fi
getopt #getopts is builtin
date -d '1970-1-1 1276059000 sec utc'
date -d '2010-09-11 23:20' +%s
df -Ph #no new line, print in one line
find . -name '*txt'|xargs tar cvvf a.tar
find . -maxdepth 1
for i in `find /usr/sbin/ -type f ! -perm -u+x`;do chmod +x $i;done #files that have no execute permission for owner
cd /u01/app/oracle/admin/andy/adump; find . -type f -name "*.aud" -mtime +5 -print -exec rm -rf {} \;
find ./test -type f -mtime +2 -print #files modified two days ago, older
find ./test -type f -mtime -2 -print #files modified in recent two days, recent
cd /u01/app/oracle/admin/andy/adump; find . -type f -name "*.aud" -mtime +5 -print -exec rm {} \;
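cd /u01/app/oracle/diag/rdbms/andy/andy2/trace/; find . -type f \( -name "*.trc" -or -name "*.trm" \) -mtime +5 -print -exec rm {} \; #parentheses are needed, otherwise -type f and the actions do not apply to both name patterns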
rm -rf /u01/app/oracle/diag/rdbms/andy/andy2/alert/log_*xml
find ./* -prune -print #-prune,do not cascade
find . -fprint file #put result to file
tar tvf a.tar  --wildcards "*ipp*" #globbing patterns
unzip -l #list contents of zip
zip -r -P "yourpassword" /backup/myback-`date +%Y%m%d-%H`.zip /ISO/scripts
tar xvf bfiles.tar --wildcards --no-anchored 'b*'
tar --show-defaults
tar cvf a.tar --totals *.txt #show speed
tar cvf /media/test/1645_bak/latest2.tar latest/ --remove-files
tar --append --file=collection.tar rock #add rock to collection.tar
tar --update -v -f collection.tar blues folk rock classical #only append new or updated ones, not replace
tar --delete --file=collection.tar blues #not on tapes
tar -c -f archive.tar --mode='a+rw'
tar -C sourcedir -cf - . | tar -C targetdir -xf - #copy directories
tar xvf spacereport.html.tar -C /var/tmp/; rm spacereport.html.tar
tar -c -f jams.tar grape prune -C food cherry #-C changes directory, so file cherry is taken from under the food directory
find . -size -400 -print > small-files
find /test -name "*~" -exec rm -f {} \; #remove linux EDIT's backup files
tar czvf kernel.tgz linux-2.6.29
tar zxvf kernel.tgz
tar -cf src.tar --exclude='*.o' src #multiple --exclude can be specified
expr 5 - 1
rpm2cpio ./ash-1.0.1-1.x86_64.rpm |cpio -ivd
eval $cmd
exec menu.viewcards #same to .
find . -maxdepth 1 -type f -print0 | xargs -0 -I {} cp {} /etc #-I {} puts each filename where {} appears, just like find -exec. find -print0 together with xargs -0 separates names with NUL instead of newline, so spaces in filenames are safe
ls | xargs -t -i mv {} {}.old #mv source should exclude /,or unexpected errors may occur
mv --strip-trailing-slashes source destination
ls |xargs file /dev/fd/0 #replace -
find . -type d |xargs -i du -sh {} |awk '$1 ~ /G/'
ovm svr ls|awk '$NF ~ /QA_GA_DC2$/'
ypcat passwd|awk -F: '{if($1 ~ /^user1$|^user2$/) print}'|grep false
timeserver=`netstat -rn|awk '{if($1=="") print $2}'`
echo $((16#`dd if=/dev/urandom bs=1 count=4 2>/dev/null | od -A n -t x4|sed s/[^1-9a-fA-F]//g`)) %60 | bc > /tmp/Min
echo $((16#`dd if=/dev/urandom bs=1 count=4 2>/dev/null | od -A n -t x4|sed s/[^1-9a-fA-F]//g`)) %24 | bc > /tmp/Hour
echo `cat aa.txt|tr "\n" "+"|sed 's/.$//'`|bc #calculate numbers in aa.txt
syminq -pdevfile |awk '!/^#/ {print $1,$4,$5}' #ignore lines started with #
sed -i '/virt[0-9]\{5\}/!d' /var/tmp/*.status #only show SDI names.
sed 's/.$//' #remove last character
ls -l -I "*out*" #not include out
for i in `ls -I shared -I oracle`;do du -sh $i;done #exclude shared and oracle directories
find . -type f -name "*20120606" -exec rm {} \; #do not need rm -rf. find . -type f -exec bash -c "ls -l '{}'" \;
ps -ef|grep init|sed -n '1p' #ps -ef --forest, show process hierarchy
ps -eLf #more in "Thread Display" of ps man page
pstree -aAhlup [ PID | USER ]
cut -d ' ' -f1,3 /etc/mtab #first and third fields
seq 15 21 #print 15 to 21
seq -s" " 15 21 #or echo {15..21}. use space as separator
(echo n; echo p; echo 1; echo;echo;echo t; echo 83; echo w) | fdisk /dev/xvdb #here is more about fdisk output. To determine whether partitions are correctly aligned, use fdisk -lu to find the starting sectors of the partitions. Ensure that these are a multiple of 8 (512 byte sectors), which aligns them at 4KB, the OCFS2 block size. Many recent operating systems often start the first partition in sector 2048, aligning it to 1MB, which works well for most storage RAID stripe sizes.
updatedb -e "/u01/local /u01/shared /sftpstaging" #skip paths, also in /etc/updatedb.conf
symbolic link/hard link (more info is here)

symbolic link

points to another file by name, its contents are the name of the real file
can be on different filesystem

hard link

A hardlink isn't a pointer to a file, it's a directory entry (a file) pointing to the same inode. Even if you change the name of the other file, a hardlink still points to the file. A file will only be deleted from disk when the last link to its inode is gone (you rm'd or unlink'd the last link)

only works for files, not directories

The file you link to must actually exist and be in the same filesystem where you are trying to create the link

look at every file and compare their inode number to find the other name(s) that have the same inode number

how many names a file has(link count) from the output of ls -l


[root@oel5-doxer ~]# ln b b.h

[root@oel5-doxer ~]# ls -l b b.h
-rw-r--r-- 2 root root 14 Nov 2 02:51 b
-rw-r--r-- 2 root root 14 Nov 2 02:51 b.h

[root@oel5-doxer ~]# rm b
rm: remove regular file `b'? y

[root@oel5-doxer ~]# ls -l b.h
-rw-r--r-- 1 root root 14 Nov 2 02:51 b.h

Inodes are always unique, but unique per partition. To uniquely identify a file, you need the inode and the device (the disk partition). root has inode #2(e.g. /, /boot, ls -i). data blocks contain the contents of the file. The inode contains the following pieces of information:

Mode/permission (protection)
Owner ID
Group ID
Size of file
Number of hard links to the file
Time last accessed
Time last modified
Time inode last modified
The name of the file is in the directory(directory is a file too), the directory is just a table that contains the filenames in the directory, and the matching inode.
[root@oel5-doxer boot]# ls -ld . ..
drwxr-xr-x 4 root root 1024 Aug 20 2012 . #4 hardlinks to /boot, minus ./.. under /boot, so it means there are two sub-directories under /boot/
drwxr-xr-x 44 root root 4096 Dec 9 08:02 .. #44 hardlinks to /, minus ./.. under /, so it means there are 42 sub-directories under /.
 uname -r|grep "el5" -i -q;if [ $? = '0' ];then rpm -Uvh python-ctypes-1.0.2-3.el5.x86_64.rpm;rpm -Uvh iotop-0.4.3-4.0.1.el5.noarch.rpm;else rpm -Uvh iotop-0.3.2-7.el6.noarch.rpm;fi
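The hard link notes above can be checked without reading ls -l by eye. This is a small sketch with two hypothetical helpers; it assumes GNU stat and GNU find (for -samefile).

```shell
# Hypothetical helper: print "<link count> <inode number>" for a file.
link_info() {
    # %h = number of hard links, %i = inode number
    stat -c '%h %i' "$1"
}

# Hypothetical helper: list every name under a directory that shares the file's inode.
same_inode_names() {
    # -xdev stays on one filesystem, since inodes are only unique per filesystem
    find "$2" -xdev -samefile "$1"
}
```

For the b and b.h example above, link_info b and link_info b.h print the same inode with a link count of 2, and after rm b the count for b.h drops to 1.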


Categories: IT Architecture, Linux, Systems Tags:

resolved – /lib/ bad ELF interpreter: No such file or directory

April 1st, 2014 Comments off

When I ran a perl command today, I hit the problem below:

[root@test01 bin]# /usr/local/bin/perl5.8
-bash: /usr/local/bin/perl5.8: /lib/ bad ELF interpreter: No such file or directory

Now let's check which package /lib/ belongs to on a good linux box:

[root@test02 ~]# rpm -qf /lib/

So here's the resolution to the issue:

[root@test01 bin]# yum install -y glibc.x86_64 glibc.i686 glibc-devel.i686 glibc-devel.x86_64 glibc-headers.x86_64

Categories: IT Architecture, Kernel, Linux, Systems Tags:

resolved – sudo: sorry, you must have a tty to run sudo

April 1st, 2014 4 comments

The error message below sometimes occurs when you run sudo <command>:

sudo: sorry, you must have a tty to run sudo

To resolve this, you may comment out "Defaults requiretty" in /etc/sudoers (edit it by running visudo).

However, sometimes it's not convenient or even not possible to modify /etc/sudoers. In that case you can consider the following:

echo -e "<password>\n"|sudo -S <sudo command>

For -S parameter of sudo, you may refer to sudo man page:

-S  The -S (stdin) option causes sudo to read the password from the standard input instead of the terminal device. The password must be followed by a newline character.

So here -S bypasses the tty (terminal device) and reads the password from the standard input, which means we can pipe the password to sudo.


From the comments, you may also try the following:

1. Comment out Defaults requiretty in /etc/sudoers

2. Defaults:[username] !requiretty #change [username]

3. You can use ssh -t to force pseudo-tty allocation. e.g. ssh -t user1@hostname1 "sudo df -h"

4. If you hit the error "PTY allocation request failed on channel 0" when connecting over SSH, then you can increase the maximum number of ptys:

sysctl -a|grep -i pty

kernel.pty.max = 4096
kernel.pty.nr = 237

vi /etc/sysctl.conf #kernel.pty.max = 10000

sysctl -p;sysctl -a|grep pty

set vnc not asking for OS account password

March 18th, 2014 Comments off

As you may know, vncpasswd (from the vnc-server package) sets the password that users must enter when connecting with a VNC client (such as tightvnc). When you connect to vnc-server, it asks for this password:

After you connect to the host using VNC, you may find that the remote server asks again, this time for the OS password (the one set by passwd):

In some cases you may not want the second prompt. Here's the way to cancel this behavior:




Categories: IT Architecture, Linux, Systems Tags: ,

resolved – ssh Read from socket failed: Connection reset by peer and Write failed: Broken pipe

March 13th, 2014 Comments off

If you hit the following errors when you ssh to a linux box:

Read from socket failed: Connection reset by peer

Write failed: Broken pipe

Then one possibility is that the linux box's filesystem is corrupted. In my case there was output like this on stdout:

EXT3-fs error ext3_lookup: deleted inode referenced

To resolve this, you need to boot linux into single user mode and run fsck -y <filesystem>. You can get the names of the corrupted filesystems from the boot messages:

[/sbin/fsck.ext3 (1) -- /usr] fsck.ext3 -a /dev/xvda2
/usr contains a file system with errors, check forced.
/usr: Directory inode 378101, block 0, offset 0: directory corrupted

(i.e., without -a or -p options)

[/sbin/fsck.ext3 (1) -- /oem] fsck.ext3 -a /dev/xvda5
/oem: recovering journal
/oem: clean, 8253/1048576 files, 202701/1048233 blocks
[/sbin/fsck.ext3 (1) -- /u01] fsck.ext3 -a /dev/xvdb
u01: clean, 36575/14548992 files, 2122736/29081600 blocks

So in this case, I ran fsck -y /dev/xvda2 && fsck -y /dev/xvda5, then rebooted the host, and everything went well.

If two copies of a VM are booted on two hypervisors and share the same filesystem (for example over NFS), then after you fsck -y the filesystem and boot the VM, the filesystem will soon be corrupted again, because another copy of the VM is still using it. So first make sure that only one copy of the VM is running on the hypervisors of the same server pool.

Categories: IT Architecture, Kernel, Linux, Systems Tags:

psftp through a proxy

March 5th, 2014 Comments off

As you may know, we can set a proxy in putty for ssh connections to a remote host, as shown below:

And if you want to copy files from the remote site to your local box, you can use putty's psftp.exe. There are many options for psftp.exe:

C:\Users\test>d:\PuTTY\psftp.exe -h
PuTTY Secure File Transfer (SFTP) client
Release 0.62
Usage: psftp [options] [user@]host
-V print version information and exit
-pgpfp print PGP key fingerprints and exit
-b file use specified batchfile
-bc output batchfile commands
-be don't stop batchfile processing if errors
-v show verbose messages
-load sessname Load settings from saved session
-l user connect with specified username
-P port connect to specified port
-pw passw login with specified password
-1 -2 force use of particular SSH protocol version
-4 -6 force use of IPv4 or IPv6
-C enable compression
-i key private key file for authentication
-noagent disable use of Pageant
-agent enable use of Pageant
-batch disable all interactive prompts

Although putty.exe has a proxy setting, psftp.exe has no proxy option of its own! So what should you do if you want to copy files back to your local box, a firewall blocks you from doing this directly, and you must use a proxy?

As you may notice, there's a "-load sessname" option in psftp.exe:

-load sessname Load settings from saved session

This option means that if you have a session saved in putty.exe, you can use psftp.exe -load <session name> to copy files from the remote site. For example, suppose you saved a session named mysession in putty.exe with the proxy configured in it; then you can use "psftp.exe -load mysession" to copy files from the remote site (no need for username/password, as you must have entered them in the putty.exe session):

C:\Users\test>d:\PuTTY\psftp.exe -load mysession
Using username "root".
Remote working directory is /root
psftp> ls
Listing directory /root
drwx------ 3 ec2-user ec2-user 4096 Mar 4 09:27 .
drwxr-xr-x 3 root root 4096 Dec 10 23:47 ..
-rw------- 1 ec2-user ec2-user 388 Mar 5 05:07 .bash_history
-rw-r--r-- 1 ec2-user ec2-user 18 Sep 4 18:23 .bash_logout
-rw-r--r-- 1 ec2-user ec2-user 176 Sep 4 18:23 .bash_profile
-rw-r--r-- 1 ec2-user ec2-user 124 Sep 4 18:23 .bashrc
drwx------ 2 ec2-user ec2-user 4096 Mar 4 09:21 .ssh
psftp> help
! run a local command
bye finish your SFTP session
cd change your remote working directory
chmod change file permissions and modes
close finish your SFTP session but do not quit PSFTP
del delete files on the remote server
dir list remote files
exit finish your SFTP session
get download a file from the server to your local machine
help give help
lcd change local working directory
lpwd print local working directory
ls list remote files
mget download multiple files at once
mkdir create directories on the remote server
mput upload multiple files at once
mv move or rename file(s) on the remote server
open connect to a host
put upload a file from your local machine to the server
pwd print your remote working directory
quit finish your SFTP session
reget continue downloading files
ren move or rename file(s) on the remote server
reput continue uploading files
rm delete files on the remote server
rmdir remove directories on the remote server

Now you can get/put files as usual.

If you do not need a proxy to connect to the remote site, then you can use the psftp.exe CLI to get remote files directly. For example:

d:\PuTTY\psftp.exe root@ -i d:\PuTTY\aws.ppk -b d:\PuTTY\script.scr -bc -be -v

And d:\PuTTY\script.scr is the batch script with the put/get commands:

cd /backup
lcd c:\
mget *.tar.gz


A note on FTP data channel active/passive transfer modes: passive is preferred when there's a firewall in front of the clients, and port ranges need to be defined on the server side if there's also a firewall on the server side.

Categories: IT Architecture, Linux, Systems Tags: ,

avoid putty ssh connection sever or disconnect

January 17th, 2014 2 comments

After some idle time, the ssh connection will be dropped. If you want to avoid this, you can try running the following command:

while [ 1 ];do echo hi;sleep 60;done &

This will print message "hi" every 60 seconds on the standard output.


You can also set some parameters in /etc/ssh/sshd_config.

make sudo asking for no password on linux

November 1st, 2013 Comments off

Assume you have a user named 'test' who belongs to the 'admin' group. You want user test to be able to sudo to root without linux prompting for a password. Here's how you can do it:

cp /etc/sudoers{,.bak}
sed -i '/%admin/ s/^/# /' /etc/sudoers
echo '%admin ALL=(ALL) NOPASSWD: ALL' >> /etc/sudoers
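
Because a syntax error in /etc/sudoers can lock everyone out of sudo, it may be worth rehearsing the edit on a copy first. The sketch below runs the same sed and echo against a scratch file, assuming GNU sed for -i; the scratch path and its starting content are made up for illustration:

```shell
# Scratch copy standing in for /etc/sudoers (content is hypothetical).
scratch=$(mktemp)
printf '%s\n' 'root ALL=(ALL) ALL' '%admin ALL=(ALL) ALL' > "$scratch"

# Same two steps as above, pointed at the scratch file.
sed -i '/%admin/ s/^/# /' "$scratch"                   # comment out the old %admin rule
echo '%admin ALL=(ALL) NOPASSWD: ALL' >> "$scratch"    # append the passwordless rule

cat "$scratch"
```

If visudo is available, running visudo -c -f on the edited copy is one way to check the syntax before the real file is touched.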


make tee to copy stdin as well as stderr & prevent ESC output of script

October 30th, 2013 Comments off
  • Make tee to copy stdin as well as stderr

As the tee manpage says:

read from standard input and write to standard output and files

So if your script writes error messages to stderr, they will not be copied and written to the file.

Here's one workaround for this:

./ 2>&1 | tee -a log

Or you can use the more complicated one:

command > >(tee stdout.log) 2> >(tee stderr.log >&2)
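
The difference is easy to see with a small test. This is a minimal sketch in which emit is a made-up stand-in for a script that writes to both streams, and the log is a scratch file from mktemp:

```shell
# emit is a hypothetical stand-in for a script that writes to both streams.
emit() {
    echo "normal output"
    echo "error output" >&2
}

log=$(mktemp)

# Without 2>&1 only stdout travels down the pipe to tee.
# (stderr is silenced here just to keep the demo quiet.)
{ emit | tee -a "$log" >/dev/null; } 2>/dev/null
stdout_only=$(grep -c "output" "$log")

: > "$log"                                # empty the log

# With 2>&1 stderr is merged into stdout before the pipe, so tee sees both.
emit 2>&1 | tee -a "$log" >/dev/null
both=$(grep -c "output" "$log")

echo "lines logged without 2>&1: $stdout_only, with 2>&1: $both"
```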

  • Prevent ESC output of script

script literally captures every type of output that was sent to the screen. If you have colored or bold output, this shows up as esc characters within the output file. These characters can significantly clutter the output and are not usually useful. If you set the TERM environmental variable to dumb (using setenv TERM dumb for csh-based shells and export TERM=dumb for sh-based shells), applications will not output the escape characters. This provides a more readable output.

In addition, the timing information provided by script clutters the output. Although it can be useful to have automatically generated timing information, it may be easier to not use script’s timing, and instead just time the important commands with the time command mentioned in the previous chapter.


  1. Here's the full version
  2. Some of the content of this article is excerpted from <Optimizing Linux® Performance: A Hands-On Guide to Linux® Performance Tools>.

make label for swap device using mkswap and blkid

August 6th, 2013 Comments off

If you want to label a swap partition in linux, you should not use e2label for this purpose, as e2label changes the label on an ext2/ext3/ext4 filesystem, and swap is not one of those.

If you use e2label for this, you will get the following error messages:

[root@node2 ~]# e2label /dev/xvda3 SWAP-VM
e2label: Bad magic number in super-block while trying to open /dev/xvda3
Couldn't find valid filesystem superblock.

We should use mkswap instead, as mkswap has an -L option:

-L label
Specify a label, to allow swapon by label. (Only for new style swap areas.)

So let's see example below:

[root@node2 ~]# mkswap -L SWAP-VM /dev/xvda3
Setting up swapspace version 1, size = 2335973 kB
LABEL=SWAP-VM, no uuid

[root@node2 ~]# blkid
/dev/xvda1: LABEL="/boot" UUID="6c5ad2ad-bdf5-4349-96a4-efc9c3a1213a" TYPE="ext3"
/dev/xvda2: LABEL="/" UUID="76bf0aaa-a58e-44cb-92d5-098357c9c397" TYPE="ext3"
/dev/xvdb1: LABEL="VOL1" TYPE="oracleasm"
/dev/xvdc1: LABEL="VOL2" TYPE="oracleasm"
/dev/xvdd1: LABEL="VOL3" TYPE="oracleasm"
/dev/xvde1: LABEL="VOL4" TYPE="oracleasm"
/dev/xvda3: LABEL="SWAP-VM" TYPE="swap"

[root@node2 ~]# swapon /dev/xvda3

[root@node2 ~]# swapon -s
Filename Type Size Used Priority
/dev/xvda3 partition 2281220 0 -1

So now we can add swap to /etc/fstab using LABEL=SWAP-VM:

LABEL=SWAP-VM           swap                    swap    defaults        0 0

linux – how to find which process is doing the most io

July 30th, 2013 Comments off

find /proc/ -maxdepth 3 -type f -name io -exec egrep -H 'read_bytes|write_bytes' {} \;|sort -rnk 2|head

Then you can ps auxww|grep <pid> to see what processes are doing most of the IO.
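
The per-process counters can also be summed so that each PID appears once. Here is a minimal sketch, assuming /proc/<pid>/io is readable (reading other users' processes normally needs root); top_io and its optional directory argument are made up for illustration:

```shell
# top_io [DIR]: print "total_bytes pid" for each process, busiest first.
# DIR defaults to /proc; the argument exists only so the function can be
# tried against a copy. (top_io is a hypothetical helper.)
top_io() {
    root=${1:-/proc}
    for f in "$root"/[0-9]*/io; do
        [ -r "$f" ] || continue           # skip processes we cannot read
        pid=${f%/io}
        pid=${pid##*/}
        # Anchored patterns so cancelled_write_bytes is not counted.
        awk -v pid="$pid" '
            /^read_bytes:/  { total += $2 }
            /^write_bytes:/ { total += $2 }
            END { printf "%.0f %s\n", total, pid }' "$f" 2>/dev/null
    done | sort -rn | head
}
```

Running top_io with no arguments lists up to ten PIDs ordered by read_bytes plus write_bytes; the PIDs can then be passed to ps as described above.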

Here's a better way to find out which process caused the disk to spin up.

/etc/init.d/syslog stop
echo 1 > /proc/sys/vm/block_dump
dmesg | egrep "READ|WRITE|dirtied"|awk -F: '{print $1}'| sort | uniq -c | sort -rn | head

The output will be like the following:

1423 kjournald
1075 pdflush
209 indexer
3 cronolog
1 rnald
1 mysqld

After this, make sure to restore /proc/sys/vm/block_dump and bring syslog back up:

echo 0 > /proc/sys/vm/block_dump

/etc/init.d/syslog start

Categories: IT Architecture, Linux, Systems Tags: ,

resolved – yum returned Segmentation fault error on centos

January 6th, 2013 Comments off

The following error messages occurred while running yum list or yum update on a centos/rhel host:

[root@test-centos ~]# yum list
Loaded plugins: rhnplugin, security
This system is not registered with ULN.
ULN support will be disabled.
Segmentation fault

As always, I did a strace on this:

[root@test-centos ~]# strace yum list
open("/var/cache/yum/el5_ga_base/primary.xml.gz", O_RDONLY) = 6
lseek(6, 0, SEEK_CUR) = 0
read(6, "\37\213\10\10\0\0\0\0\2\377/u01/basecamp/www/el5_"..., 8192) = 8192
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

And here are the error messages from /var/log/messages:

[root@test-centos ~]# tail /var/log/messages
Jan 6 07:07:44 test-centos kernel: yum[5951]: segfault at 3500000000 ip 000000350cc79e0a sp 00007fff05633b78 error 4 in[350cc00000+14e000]

After some googling, I found that the yum "Segmentation fault" was caused by a conflict between zlib and yum. To resolve this problem, we need to use the older version of zlib. Here are the detailed steps:

[root@test-centos ~]# cd /usr/lib
[root@test-centos lib]# ls -l libz*
-rw-r--r-- 1 root root 125206 Jul 9 07:40 libz.a
lrwxrwxrwx 1 root root 22 Aug 2 07:10 -> /usr/lib/
lrwxrwxrwx 1 root root 22 Aug 2 07:10 -> /usr/lib/
-rwxr-xr-x 1 root root 75028 Jun 7 2007
-rwxr-xr-x 1 root root 99161 Jul 9 07:40

[root@test-centos lib]# rm
rm: remove symbolic link `'? y
rm: remove symbolic link `'? y
[root@test-centos lib]# ln -s
[root@test-centos lib]# ln -s

After these steps, you should now be able to run yum commands without any issue.

Also, after using yum, you should switch zlib back to the newer version. Here are the steps:

[root@test-centos ~]# cd /usr/lib
[root@test-centos lib]# rm
rm: remove symbolic link `'? y
rm: remove symbolic link `'? y
[root@test-centos lib]# ln -s
[root@test-centos lib]# ln -s


  • public-yum-ol7.repo has an issue that can lead to this error; you can remove it and try epel instead. To check, run "strace yum list" and search the output for "No such file".
  • Sometimes you need to update the "yum" package itself to resolve this: download it from a yum repo and use rpm to upgrade it.
Categories: IT Architecture, Linux, Systems Tags:

ldap auto_home error – Could not chdir to home directory /home/xxx: No such file or directory

October 10th, 2012 Comments off

If you can log on to the host but the home directory fails to mount with the following error message:

Could not chdir to home directory /home/xxx: No such file or directory

Then one method you can try is this:

  1. Ensure the home directory for your username exists on the exporting NFS server.
  2. Append a line like the following to /etc/auto_home on the host: <username> <NFS server>:/export/home/&  #this assumes the exported home directories are under /export/home; your environment may vary
  3. Finally, ensure automount is running on the host and then try to log on again. Your home directory should now be mounted.
Categories: IT Architecture, Linux, Systems Tags:

resolved – bnx2i dev eth0 does not support iscsi

September 19th, 2012 Comments off

A weird incident occurred on a linux box. The box stopped responding to ping and ssh, although ifconfig and the /proc/net/bonding/bond0 file both said it was running ok. After some googling, I found that the issue might be related to the NIC driver. I tried bringing the NICs down and up one by one, but got these errors:

Bringing up loopback interface bond0: bnx2i: dev eth0 does not support iscsi

bnx2i: iSCSI not supported, dev=eth0

bonding: no command found in slaves file for bond bond0. Use +ifname or -ifname

At last, I tried restarting the whole network, i.e. /etc/init.d/network restart. That did the trick: networking was then running ok and I could ping/ssh to the box without problem.

resolved – passwd permission denied even for root on solaris

July 14th, 2012 Comments off

When I tried resetting a local user's password on a solaris host, I met the following error message:

root@doxer # passwd <username>
New Password:
Re-enter new Password:
Permission denied

This was very weird as I was logged on as root when doing this operation:

root@doxer # id
uid=0(root) gid=1(other)

After some searching I found that this happens because passwd by default tries to reset the LDAP password if the host uses ldap for authentication. Here's an excerpt from /etc/nsswitch.conf:

passwd: compat
passwd_compat: ldap

To resolve this, you need to designate which authentication mechanism to use for resetting the password (here we should use files, as this user is a local one):

passwd -r files <username>


Here's more about the NIS passwd map (from the book Managing NFS and NIS):

Earlier, we introduced the concept of replaced files and appended files. Now, we'll discuss how to work with these files. First, let's review: these are important concepts, so repetition is helpful. If a map replaces the local file, the file is ignored once NIS is running. Aside from making sure that misplaced optimism doesn't lead you to delete the files that were distributed with your system, there's nothing interesting that you can do with these replaced files. We won't have anything further to say about them.

Conversely, local files that are appended to by NIS maps are always consulted first, even if NIS is running. The password file is a good example of a file augmented by NIS. You may want to give some users access to one or two machines, and not include them in the NIS password map. The solution to this problem is to put these users into the local passwd file, but not into the master passwd file on the master server. The local password file is always read before getpwuid( ) goes to an NIS server. Password-file reading routines find locally defined users as well as those in the NIS map, and the search order of "local, then NIS" allows local password file entries to override values in the NIS map. Similarly, the local aliases file can be used to override entries in the NIS mail aliases map, setting up machine-specific expansion of one or more aliases.

Categories: IT Architecture, Linux, Systems Tags:

Resolved – bash /usr/bin/find Arg list too long

July 3rd, 2012 Comments off

Have you ever met an error like the following?

root@doxer# find /PRD/*/connectors/A01/QP*/*/logFiles/* -prune -name "*.log" -mtime +7 -type f |wc -l

bash: /usr/bin/find: Arg list too long


The cause is the kernel limit on the total size of the arguments that can be passed to find (as well as ls and other utilities): the shell expands the * wildcards into a very long list of paths before find ever runs. ARG_MAX defines the maximum length of the arguments for a new process. You can get its value with this command:

root@doxer# getconf ARG_MAX

To quickly fix this, you can cd into the directory and run find from there (replace * with subdir_NAME):

cd /PRD/subdir_NAME/connectors/A01/QP*/*/logFiles/;find . -prune -name "*.log" -mtime +7 -type f |wc -l
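
Another way around the limit is to let find do the pattern matching itself, so the shell never expands the wildcards. The sketch below builds a small scratch tree (made-up names standing in for /PRD) and uses a quoted -path pattern. Note that in -path a * also matches /, so the pattern is somewhat looser than the shell glob.

```shell
# Hypothetical scratch tree standing in for /PRD.
root=$(mktemp -d)
mkdir -p "$root/app1/connectors/A01/QP1/x/logFiles" \
         "$root/app2/connectors/A01/QP2/y/logFiles"
touch "$root/app1/connectors/A01/QP1/x/logFiles/a.log" \
      "$root/app2/connectors/A01/QP2/y/logFiles/b.log" \
      "$root/app2/connectors/A01/QP2/y/logFiles/c.txt"

# The pattern is quoted, so find (not the shell) does the matching and the
# argument list stays short no matter how many files exist.
# -mtime +7 is left out only because the scratch files are brand new.
count=$(find "$root" -path "$root/*/connectors/A01/QP*/*/logFiles/*" \
             -name '*.log' -type f | wc -l | tr -d ' ')

echo "matching log files: $count"
```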



  1. you can get all configuration values with getconf -a.
  2. For more solutions about the error "bash: /usr/bin/find: Arg list too long", you can refer to
Categories: IT Architecture, Kernel, Linux, Systems Tags:

trap bash shell script explanation and example

July 2nd, 2012 Comments off

If you want to print some information to standard output when the user presses ctrl+c during a bash script, or you want to print something when the script completes, then you should consider using trap.

Here's an example which prints a message when the user presses ctrl+c (SIGINT is signal number 2):

trap "echo 'you typed ctrl+c'" 2
sleep 5
And if you want to print something when the script ends, you can use the following as an example:

trap "echo 'script ends'" 0
sleep 5
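
A common use of both traps together is cleaning up temporary files. Here is a minimal sketch using the signal names INT and EXIT instead of the numbers 2 and 0; the scratch directory and file name are made up for illustration:

```shell
workdir=$(mktemp -d)      # scratch directory (hypothetical example resource)

(
    # EXIT fires however the subshell ends; INT is the same signal as 2 above.
    trap 'rm -rf "$workdir"' EXIT
    trap 'echo "interrupted, cleaning up" >&2; exit 130' INT

    touch "$workdir/scratch.tmp"
    echo "working in $workdir"
)

[ -d "$workdir" ] || echo "scratch directory was removed by the EXIT trap"
```

The INT handler only prints a message and exits; the actual cleanup lives in the EXIT trap, so it runs on both normal completion and ctrl+c.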


useful sed single line examples when clearing embedded trojans or embedded links

June 7th, 2012 Comments off

When somebody has maliciously embedded links/trojans in your site, the first thing you will want to do is clear them out. sed is a useful line-based stream editor, so it is a natural choice for the cleaning job.

Usually, the embedded code is several lines of html like the following:

<div class="trojans">
<a href="">malicious site's name</a>
</div>

To clear these html codes, you can use the following sed line:

sed  '/<div class=\"trojans\">/,/<\/div>/d' injected.htm

But usually the injected files are spread across several directories or even your whole website's directory. You can combine using find and sed together to clean these annoying trojans:

find /var/www/html/ -type f \( -name '*.htm' -o -name '*.html' -o -name '*.php' \) -exec sed -i.bak '/<div class="trojans">/,/<\/div>/d' {} \;

Please note that I use -i.bak to back up each file before it is modified. (You should also back up your data before cleaning trojans!)
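
It is worth trying the expression on a scratch file before running it across the site. Here is a minimal sketch; the page content is made up for illustration:

```shell
page=$(mktemp)            # scratch file standing in for an injected page
cat > "$page" <<'EOF'
<p>real content</p>
<div class="trojans">
<a href="">injected link</a>
</div>
<p>more real content</p>
EOF

# Delete from the opening div through the first closing </div> after it;
# -i.bak keeps the untouched original as "$page.bak".
sed -i.bak '/<div class="trojans">/,/<\/div>/d' "$page"

cat "$page"
```

The range ends at the first line containing </div>, so an injected block with nested divs would need a more careful pattern.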


For more info about sed examples/tutorials, you may refer to the following two resources:



requiretty in sudoers file will break functioning of accounts without tty

May 2nd, 2012 Comments off

Excerpted from /etc/sudoers:

Defaults requiretty

# Refuse to run if unable to disable echo on the tty. This setting should also be
# changed in order to be able to use sudo without a tty. See requiretty above.

This means that if you have created an account without a tty, and you want that user to have privileges for some sudo commands, this setting (Defaults requiretty) will prevent the account from executing those sudo commands.

To fix this, you can do the following:

  1. disable "Defaults requiretty" in /etc/sudoers file
  2. Change nsswitch.conf to be ldap files rather than files ldap
  3. Better yet don’t enable local sudoers