resolved – Permission denied even after chmod 777 world readable writable

September 19th, 2014

Several team members asked me why the system reported "Permission denied" when they tried to change into certain directories or read certain files. The error persisted even after the file was made world-writable (chmod 777):

-bash-3.2$ cd /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs
-bash: cd: /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs: Permission denied

-bash-3.2$ cat /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out
cat: /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out: Permission denied

-bash-3.2$ ls -l /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out
-rwxrwxrwx 1 oracle oinstall 1100961066 Sep 19 07:37 /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out

In summary, if you want to read a file (e.g. wls_sdi1.out) in a directory (e.g. /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs), the "read bit" on that file (chmod +r wls_sdi1.out) is not enough. Every parent directory of that file (/u01, /u01/local, /u01/local/config, ......, /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs) must also have the "execute bit" (search permission) set for you, otherwise the path cannot be traversed. The "read bit" on a directory is only needed to list its contents (e.g. with ls), but it is usually set together with the "execute bit". You can check both with ls -ld <dir name>:

chmod +r wls_sdi1.out #first set "read bit" on the file
chmod +r /u01; chmod +x /u01; chmod +r /u01/local; chmod +x /u01/local; <...skipped...>chmod +r /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs; chmod +x /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs; #then set "execute bit" (required) and "read bit" (needed for listing) on all parent directories
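
Checking every parent directory by hand is tedious, so here is a minimal sketch of a helper that walks up the path and reports each directory the current user cannot traverse. The function name check_path_access is made up for this post, and it only tests access for the user who runs it:

```shell
# check_path_access FILE
# Hypothetical helper: reports every ancestor directory of FILE that the
# current user cannot search (missing "execute bit"), then checks whether
# FILE itself is readable. Returns 0 only if nothing was reported.
check_path_access() {
    target=$1
    status=0
    dir=$(dirname -- "$target")
    while :; do
        if [ ! -x "$dir" ]; then
            echo "no search (x) permission: $dir"
            status=1
        fi
        # Stop once the top of the path has been checked.
        case $dir in
            /|.) break ;;
        esac
        dir=$(dirname -- "$dir")
    done
    # Note: this test also fails when FILE does not exist at all.
    if [ ! -r "$target" ]; then
        echo "no read permission: $target"
        status=1
    fi
    return "$status"
}
```

For example, check_path_access /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out, run as the user who gets "Permission denied". Keep in mind that root can traverse any directory, so running it as root will not reveal the problem.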

Finally, if you can log on as the file owner, things usually go smoothly, since the owner normally has access to the parent directories as well. /u01/local/config/m_domains/tasdc1_domain/servers/wls_sdi1/logs/wls_sdi1.out is owned by the oracle user, so you can try logging on as oracle and doing the operations there.


arping in linux for getting MAC address and update ARP caches by broadcast

August 27th, 2014

Suppose we want to know the MAC address of 10.182.120.210. We can log on to a Linux host in the same subnet as 10.182.120.210, e.g. 10.182.120.188:

[root@centos-doxer ~]#arping -U -c 3 -I bond0 -s 10.182.120.188 10.182.120.210
ARPING 10.182.120.210 from 10.182.120.188 bond0
Unicast reply from 10.182.120.210 [00:21:CC:B7:1F:EB] 1.397ms
Unicast reply from 10.182.120.210 [00:21:CC:B7:1F:EB] 1.378ms
Sent 3 probes (1 broadcast(s))
Received 2 response(s)

So 00:21:CC:B7:1F:EB is the MAC address of 10.182.120.210. The replies also show that IP address 10.182.120.210 is currently in use on the local network.
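
If you only want the MAC address, the reply lines can be filtered. This is a rough sketch that assumes the bracketed "[MAC]" reply format shown in the output above; the function name extract_macs is made up for illustration:

```shell
# extract_macs
# Hypothetical helper: reads arping output on stdin and prints each
# distinct MAC address found inside [...] once.
extract_macs() {
    sed -n 's/.*\[\([0-9A-Fa-f:]*\)\].*/\1/p' | sort -u
}
```

Usage would look like arping -c 3 -I bond0 10.182.120.210 | extract_macs. If more than one MAC address is printed for a single IP address, that may indicate the IP address is configured on more than one machine.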

Another use of arping is to update ARP caches. One scenario: you assign a new machine an IP address that is already in use, and then you can no longer log on to the old machine via that IP address. Even after you shut down the new machine, you may still be unable to access the old machine, because neighbours may still cache the wrong MAC address. Here's the resolution:

Suppose we have configured NIC eth0 of the new machine with IP address 192.168.0.2, which is already used by an old machine. Log on to the new machine and run the following commands:

arping -A -I eth0 192.168.0.2 #send gratuitous ARP replies for 192.168.0.2 (-A takes no argument; add -c 3 to stop after 3 packets)
arping -U -s 192.168.0.2 -I eth0 192.168.0.1 #this is sending ARP broadcast, and 192.168.0.1 is the gateway address.
/sbin/arping -I eth0 -c 3 -s 192.168.0.2 192.168.0.3 #update neighbours' ARP caches

resolved – IOError: [Errno 2] No such file or directory when creating VMs on Oracle VM Server

August 25th, 2014

Today when I tried to create and power on one VM in an Oracle VM Server pool, there was an error message like below:

Start - /OVS/running_pool/vm_test
PowerOn Failed : Result - failed:<Exception: return=>failed:<Exception: failed:<IOError: [Errno 2] No such file or directory: '/var/ovs/mount/85255944BDF24F62831E1C6E7101CF7A/running_pool/vm_test/vm.cfg'>

I logged on to one OVS server and found the path was there. Later I logged on to all OVS servers in that server pool and found that one OVS server did not have the storage repo. So I removed that OVS server from the pool, planning to add it back and create the VM again. But this time, the following error messages appeared when I tried to add the OVS server back:

2014-08-21 02:52:52.962 WARNING failed:errcode=50006, errmsg=Do 'clusterm_init_root_sr' on servers ('testhost1') failed.
StackTrace:
File "/opt/ovs-agent-2.3/OVSSiteCluster.py", line 651, in _cluster_setup
_check(ret)
File "/opt/ovs-agent-2.3/OVSXCluster.py", line 340, in _check
raise OVSException(error=ret["error"])

2014-08-21 02:52:52.962 NOTIFICATION Failed setup cluster for agent 2.2.0...
2014-08-21 02:52:52.963 ERROR Cluster Setup when adding server
2014-08-21 02:52:52.970 ERROR [Server Pool Management][Server Pool][test_pool]:During adding servers ([testhost1]) to server pool (test_pool), Cluster setup failed: (OVM-1011 OVM Manager communication with host_master for operation HA Setup for Oracle VM Agent 2.2.0 failed:
errcode=50006, errmsg=Do 'clusterm_init_root_sr' on servers ('testhost1') failed.

From here, I realized that this error occurred because the storage repo could not be created on OVS server testhost1. So I logged on to testhost1 for a check. As the storage repo was an NFS share, I tried showmount -e <nfs server> and found it was not working. Then I ran a traceroute to <nfs server>, and it was not going through either.

From another host, showmount -e <nfs server> worked. So the problem was on OVS server testhost1. After more debugging, I found that one NIC on the host was not pingable. Later I checked the switch and found the NIC was unplugged. I plugged in the NIC, tried again to add the OVS server back and create the VM, and all went smoothly.
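
An unplugged NIC can often be spotted from the host itself before walking to the switch. The following is only a sketch: it assumes the kernel exposes /sys/class/net/<interface>/carrier (1 means link detected, 0 means no link), and the function name link_states is made up for illustration:

```shell
# link_states [SYSDIR]
# Hypothetical helper: prints the carrier (link) state of every network
# interface found under SYSDIR, which defaults to /sys/class/net.
link_states() {
    sysdir=${1:-/sys/class/net}
    for dev in "$sysdir"/*; do
        # Skip entries that are not interfaces (no carrier file).
        [ -f "$dev/carrier" ] || continue
        name=${dev##*/}
        if state=$(cat "$dev/carrier" 2>/dev/null); then
            case $state in
                1) echo "$name: link up" ;;
                0) echo "$name: no link (cable unplugged?)" ;;
                *) echo "$name: unknown carrier value '$state'" ;;
            esac
        else
            # Reading carrier fails while the interface is administratively down.
            echo "$name: carrier unreadable (interface down?)"
        fi
    done
}
```

Running link_states with no argument on testhost1 would list each NIC with its link state. ethtool <interface> reports the same information in its "Link detected" line.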

PS:

Suppose you want to know which NFS clients have mounted a share from the NFS server. On any client that has access to the NFS server, do the following:

[root@centos-doxer ~]# showmount -a nfs-server.example.com|grep to_be
10.182.120.188:/export/IDM_BR/share01_to_be_removed
test02.example:/export/IDM_BR/share01_to_be_removed

-a or --all

List both the client hostname or IP address and mounted directory in host:dir format. This info should not be considered reliable.

crontab cronjob failed with date single apostrophe date +%d-%b-%Y-%H-%M on linux

August 4th, 2014

I tried to create one Linux cronjob today and wanted to note down the date when the job ran. Here's the content:

echo '10 10 * * 1 root cd /var/log/ovm-manager/;tar zcvf oc4j.log.`date +%m-%d-%y`.tar.gz oc4j.log;echo "">/var/log/ovm-manager/oc4j.log' > /etc/cron.d/oc4j

However, this entry failed to run. Checking the log in /var/log/cron showed:

Aug 4 06:24:01 testhost crond[1825]: (root) RELOAD (cron/root)
Aug 4 06:24:01 testhost crond[1825]: (root.bak) ORPHAN (no passwd entry)
Aug 4 06:25:01 testhost crond[28376]: (root) CMD (cd /var/log/ovm-manager/;tar zcvf oc4j.log.`date +)

So the command was cut off at the first % character, and that's the reason for the failure.

Eventually, I figured out that cron treats the % character specially: an unescaped % is turned into a newline, and everything after the first % is passed to the command as standard input. You must precede every % character with a \ in a crontab file, which tells cron to put a literal % in the command. Here's the updated version:

echo '10 10 * * 1 root cd /var/log/ovm-manager/;tar zcvf oc4j.log.`date +\%m-\%d-\%y`.tar.gz oc4j.log;echo "">/var/log/ovm-manager/oc4j.log' > /etc/cron.d/oc4j

This time, the job ran successfully:

Aug 4 06:31:01 testhost crond[1825]: (root) RELOAD (cron/root)
Aug 4 06:31:01 testhost crond[1825]: (root.bak) ORPHAN (no passwd entry)
Aug 4 06:31:01 testhost crond[28503]: (root) CMD (cd /var/log/ovm-manager/;tar zcvf oc4j.log.`date +%m-%d-%y`.tar.gz oc4j.log;echo "">/var/log/ovm-manager/oc4j.log)
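
To avoid forgetting a backslash, the escaping can be done mechanically before the line is written to the crontab file. A minimal sketch, with a made-up function name cron_escape, assuming the input does not already contain escaped % characters (they would be escaped twice):

```shell
# cron_escape COMMAND
# Hypothetical helper: prints COMMAND with every % replaced by \% so that
# cron passes a literal % to the shell.
cron_escape() {
    printf '%s\n' "$1" | sed 's/%/\\%/g'
}
```

For example, cron_escape 'date +%m-%d-%y' prints date +\%m-\%d-\%y. Another common approach is to put the commands into a script file and call only that script from the crontab, since % has no special meaning inside the script.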

PS:

More on here http://stackoverflow.com/questions/1486088/cron-fails-on-single-apostrophe


resolved – Kernel panic – not syncing: Attempted to kill init

July 29th, 2014

Today when I tried to power on one VM hosted on a Xen server, the following error messages appeared:

Write protecting the kernel read-only data: 6784k
Kernel panic - not syncing: Attempted to kill init! [failed one]
Pid: 1, comm: init Not tainted 2.6.32-300.29.1.el5uek #1
Call Trace:
[<ffffffff810579a2>] panic+0xa5/0x162
[<ffffffff8109b997>] ? atomic_add_unless+0x2e/0x47
[<ffffffff8109bdf9>] ? __put_css_set+0x29/0x179
[<ffffffff8145744c>] ? _write_lock_irq+0x10/0x20
[<ffffffff81062a65>] ? exit_ptrace+0xa7/0x118
[<ffffffff8105b076>] do_exit+0x7e/0x699
[<ffffffff8105b731>] sys_exit_group+0x0/0x1b
[<ffffffff8105b748>] sys_exit_group+0x17/0x1b
[<ffffffff81011db2>] system_call_fastpath+0x16/0x1b

This was quite weird, as the VM had booted fine the day before:

Write protecting the kernel read-only data: 6784k
blkfront: xvda: barriers enabled (tag) [normal one]
xvda: detected capacity change from 0 to 15126289920
xvda: xvda1 xvda2 xvda3
blkfront: xvdb: barriers enabled (tag)
xvdb: detected capacity change from 0 to 16777216000
xvdb: xvdb1
Setting capacity to 32768000
xvdb: detected capacity change from 0 to 16777216000
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: Disabled at runtime.
type=1404 audit(1406281405.511:2): selinux=0 auid=4294967295 ses=4294967295

After some checking, I found that this OVS server was hosting more than 40 VMs and VCPUs were tight. So I turned off some unused VMs, and then the issue was resolved.
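
To get a rough idea of how tight the VCPUs are, the VCPUs column of xm list can be summed. This sketch assumes the usual column layout of xm list (Name, ID, Mem, VCPUs, State, Time(s)) and domain names without spaces; the function name sum_vcpus is made up for illustration:

```shell
# sum_vcpus
# Hypothetical helper: reads "xm list" output on stdin, skips the header
# line and prints the total of the 4th column (VCPUs), Domain-0 included.
sum_vcpus() {
    awk 'NR > 1 { total += $4 } END { print total + 0 }'
}
```

Usage would be xm list | sum_vcpus, and the result can be compared with the nr_cpus value reported by xm info. A total far above the number of physical CPUs does not have to be a problem by itself, but it is a hint of where to look.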

yum install specified version of packages

July 15th, 2014

Assume that you want to install one specified version of a package, say glibc-2.5-118.el5_10.2.x86_64:

[root@centos-doxer ~]# yum list|grep glibc
glibc.i686 2.5-107.el5_9.4 installed
glibc.x86_64 2.5-107.el5_9.4 installed
glibc-common.x86_64 2.5-107.el5_9.4 installed
glibc-devel.i386 2.5-107.el5_9.4 installed
glibc-devel.x86_64 2.5-107.el5_9.4 installed
glibc-headers.x86_64 2.5-107.el5_9.4 installed
compat-glibc.i386 1:2.3.4-2.26 el5_latest
compat-glibc.x86_64 1:2.3.4-2.26 el5_latest
compat-glibc-headers.x86_64 1:2.3.4-2.26 el5_latest
glibc.i686 2.5-118.el5_10.2 el5_latest
glibc.x86_64 2.5-118.el5_10.2 el5_latest
glibc-common.x86_64 2.5-118.el5_10.2 el5_latest
glibc-devel.i386 2.5-118.el5_10.2 el5_latest
glibc-devel.x86_64 2.5-118.el5_10.2 el5_latest
glibc-headers.x86_64 2.5-118.el5_10.2 el5_latest
glibc-utils.x86_64 2.5-118.el5_10.2 el5_latest

Then you should execute yum install glibc-2.5-118.el5_10.2.x86_64. The format of this command is yum install <packagename>-<version>.<platform, such as x86_64>.
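
The argument for yum install can be built from a yum list line instead of being typed by hand. A small sketch, assuming the three-column "name.arch version repo" layout shown above; the function name yum_spec is made up for illustration:

```shell
# yum_spec
# Hypothetical helper: reads "yum list" lines (name.arch version repo) on
# stdin and prints name-version.arch for each line.
yum_spec() {
    awk '{
        n = split($1, parts, ".")
        arch = parts[n]                                   # text after the last dot
        name = substr($1, 1, length($1) - length(arch) - 1)
        print name "-" $2 "." arch
    }'
}
```

For example, echo 'glibc.x86_64 2.5-118.el5_10.2 el5_latest' | yum_spec prints glibc-2.5-118.el5_10.2.x86_64, which can then be passed to yum install.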
