Author Archive

checking MTU or Jumbo Frame settings with ping

February 14th, 2014 No comments

You may set your linux box's MTU to a jumbo-frame size of 9000 bytes or larger, but if the switch your box is connected to does not have jumbo frames enabled, your linux box may run into problems when sending & receiving packets.

So how can we get an idea of whether jumbo frames are enabled on the switch or on the linux box?

Of course you can log on the switch and check, but we can also verify this from the linux box that connects to the switch.

On the linux box, you can see the MTU setting of each interface using ifconfig:

[root@centos-doxer ~]# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 08:00:27:3F:C5:08
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
RX packets:50502 errors:0 dropped:0 overruns:0 frame:0
TX packets:4579 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:9835512 (9.3 MiB) TX bytes:1787223 (1.7 MiB)
Base address:0xd010 Memory:f0000000-f0020000

As stated above, the 9000 here doesn't mean that jumbo frames actually work end to end from your box to the switch, as you can verify with the command below:

[root@testbox ~]# ping -c 2 -M do -s 1472 testbox2
PING ( 1472(1500) bytes of data. #so here 1500 bytes go through the network
1480 bytes from ( icmp_seq=1 ttl=252 time=0.319 ms
1480 bytes from ( icmp_seq=2 ttl=252 time=0.372 ms

--- ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.319/0.345/0.372/0.032 ms
[root@testbox ~]#
[root@testbox ~]#
[root@testbox ~]# ping -c 2 -M do -s 1473 testbox2
PING ( 1473(1501) bytes of data. #so here 1501 bytes can not go through. From here we can see that MTU for this box is 1500, although ifconfig says it's 9000
From ( icmp_seq=1 Frag needed and DF set (mtu = 1500)
From ( icmp_seq=1 Frag needed and DF set (mtu = 1500)

--- ping statistics ---
0 packets transmitted, 0 received, +2 errors
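
The payload sizes above come from simple header arithmetic: an unfragmented ICMP probe payload is the MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header. A small sketch (the host is a placeholder):

```shell
# largest ping payload that fits a given MTU without fragmentation:
# payload = MTU - 20 (IPv4 header) - 8 (ICMP header)
mtu=1500
payload=$((mtu - 20 - 8))
echo "ping -c 2 -M do -s $payload <host>"

# for a jumbo-frame MTU of 9000 the probe size is:
jumbo_payload=$((9000 - 28))
echo "ping -c 2 -M do -s $jumbo_payload <host>"
```

So the 1472-byte probe in the output above is exactly the boundary case for a 1500-byte path MTU.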

Also, if the switch is a Cisco one, you can verify whether the switch port connecting the server has jumbo frames enabled by sniffing a CDP (Cisco Discovery Protocol) packet. Here's one example:

-bash-4.1# tcpdump -i eth0 -nn -v -c 1 ether[20:2] == 0x2000 #ether[20:2] == 0x2000 means capture only packets that have a 2 byte value of hex 2000 starting at byte 20
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
03:44:14.221022 CDPv2, ttl: 180s, checksum: 692 (unverified), length 287
Device-ID (0x01), length: 46 bytes: ''
Address (0x02), length: 13 bytes: IPv4 (1)
Port-ID (0x03), length: 16 bytes: 'Ethernet111/1/12'
Capability (0x04), length: 4 bytes: (0x00000228): L2 Switch, IGMP snooping
Version String (0x05), length: 66 bytes:
Cisco Nexus Operating System (NX-OS) Software, Version 5.2(1)N1(4)
Platform (0x06), length: 11 bytes: 'N5K-C5548UP'
Native VLAN ID (0x0a), length: 2 bytes: 123
AVVID trust bitmap (0x12), length: 1 byte: 0x00
AVVID untrusted ports CoS (0x13), length: 1 byte: 0x00
Duplex (0x0b), length: 1 byte: full
MTU (0x11), length: 4 bytes: 1500 bytes #so here MTU size was set to 1500 bytes
System Name (0x14), length: 18 bytes: 'ucf-c1z3-swi-5k01b'
System Object ID (not decoded) (0x15), length: 14 bytes:
0x0000: 060c 2b06 0104 0109 0c03 0103 883c
Management Addresses (0x16), length: 13 bytes: IPv4 (1)
Physical Location (0x17), length: 13 bytes: 0x00/snmplocation
1 packets captured
1 packets received by filter
0 packets dropped by kernel
110 packets dropped by interface


  1. As for the "-M do" parameter of ping, you may refer to man ping for more info; the manpage also covers DF (don't fragment) and Path MTU Discovery.
  2. There are more tcpdump tips in the tcpdump manpage.
  3. Maximum packet size is the MTU plus the data-link header length. Packets are not always transmitted at the maximum packet size, as we can see from the output of iptraf -z eth0.
  4. Here's more about MTU:

The link layer, which is typically Ethernet, sends information into the network as a series of frames. Even though the layers above may have pieces of information much larger than the frame size, the link layer breaks everything up into frames (whose payload encloses an IP packet carrying TCP/UDP/ICMP) to send them over the network. The maximum size of data in a frame is known as the maximum transmission unit (MTU). You can use network configuration tools such as ip or ifconfig to set the MTU.

The size of the MTU has a direct impact on the efficiency of the network. Each frame in the link layer has a small header, so using a large MTU increases the ratio of user data to overhead (header). When using a large MTU, however, each frame of data has a higher chance of being corrupted or dropped. For clean physical links, a high MTU usually leads to better performance because it requires less overhead; for noisy links, however, a smaller MTU may actually enhance performance because less data has to be re-sent when a single frame is corrupted.
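
To make the overhead ratio concrete, here is a rough sketch assuming 18 bytes of Ethernet framing per frame and 40 bytes of IP+TCP headers inside the MTU (both figures are simplifying assumptions):

```shell
for mtu in 1500 9000; do
  payload=$((mtu - 40))              # user data left after IP+TCP headers
  total=$((mtu + 18))                # on-wire frame size with Ethernet framing
  eff=$((payload * 10000 / total))   # efficiency in hundredths of a percent
  echo "MTU $mtu: ~$((eff / 100)).$((eff % 100))% of each frame is user data"
done
```

The jump from a 1500-byte MTU to a 9000-byte one raises the user-data share of each frame by roughly three percentage points, which is exactly the "less overhead on clean links" argument above.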

Here's one image of layers of network frames:



Oracle VM operations – poweron, poweroff, status, stat -r

January 27th, 2014 No comments

Here's the script:

#!/usr/bin/perl
# Note: the OVM must be in running status before running poweroff or poweron
use Net::SSH::Perl;

$host = $ARGV[0];
$operation = $ARGV[1];
$user = 'root';
$password = 'password';

if($host eq "help") {
    print "$0 OVM-name status|poweron|poweroff|stat-r\n";
    exit;
}

$ssh = Net::SSH::Perl->new($host);
$ssh->login($user, $password);

($stdout,$stderr,$exit) = $ssh->cmd("ovm -uadmin -pwelcome1 vm ls|grep -v VM_test");

if($operation eq "status") {
    print $stdout;
} elsif($operation eq "poweroff") {
    for (split /\n/, $stdout) {
        next if $_ =~ /Server_Pool|OVM|Powered/;    # skip headers and VMs already powered off
        if($_ =~ /(.*?)\s+([0-9]{1,})\s+([0-9]{1,})\s+([0-9]{1,})\s+([a-zA-Z]{1,})\s+(.*)/){
            $ssh->cmd("ovm -uadmin -pwelcome1 vm poweroff -n $1 -s $6");
            sleep 12;
        }
    }
} elsif($operation eq "poweron") {
    for (split /\n/, $stdout) {
        next if $_ =~ /Server_Pool|OVM|Running/;    # skip headers and VMs already running
        if($_ =~ /(.*?)\s+([0-9]{1,})\s+([0-9]{1,})\s+([0-9]{1,})\s+([a-zA-Z]{1,})\s+Off(.*)/){
            $ssh->cmd("ovm -uadmin -pwelcome1 vm poweron -n $1 -s $6");
            sleep 20;
        }
    }
} elsif($operation eq "stat-r") {
    for (split /\n/, $stdout) {
        if($_ =~ /(.*?)\s+([0-9]{1,})\s+([0-9]{1,})\s+([0-9]{1,})\s+(Shutting\sDown|Initializing)\s+(.*)/){
            $ssh->cmd("ovm -uadmin -pwelcome1 vm stat -r -n $1 -s $6");
            sleep 1;
        }
    }
}

You can use the following to make the script run in parallel:

for i in <all OVMs>;do (./ $i status &);done

avoid putty ssh connection being severed or disconnected

January 17th, 2014 2 comments

After some time, ssh will disconnect itself. If you want to avoid this, you can try running the following command:

while [ 1 ];do echo hi;sleep 60;done &

This will print message "hi" every 60 seconds on the standard output.
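
Alternatively, if the disconnects are caused by idle timeouts, the ssh client itself can send keepalives. This is a client-side configuration sketch (user and host are placeholders):

```shell
# send an application-level keepalive every 60 seconds;
# give up only after 3 unanswered probes
ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=3 user@host
```

The same two options can be put in ~/.ssh/config so every session gets them, and they require no change on the server side.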


You can also set some parameters in /etc/ssh/sshd_config; see man sshd_config for details.

“Include snapshots” made NFS shares from ZFS appliance shrinking

January 17th, 2014 No comments

Today I met one weird issue when checking an NFS share mounted from a ZFS appliance. As the space on that filesystem was getting low, I removed some files, but the filesystem's total size reported on the client kept getting smaller too! Shouldn't the free space get larger while the size stays unchanged?

After some debugging, I found that this was caused by the ZFS appliance share's "Include snapshots" property. When I unchecked "Include snapshots", the issue was gone!


Categories: Hardware, NAS, Storage Tags:

resolved – ESXi Failed to lock the file

January 13th, 2014 No comments

When I was powering on one VM in ESXi, an error occurred:

An error was received from the ESX host while powering on VM doxer-test.
Cannot open the disk '/vmfs/volumes/4726d591-9c3bdf6c/doxer-test/doxer-test_1.vmdk' or one of the snapshot disks it depends on.
Failed to lock the file

And also:

unable to access file since it is locked

This apparently was caused by some storage issue. I googled first and found that most of the posts were telling stories about ESXi's working mechanism; I tried some of them, but with no luck.

Then I remembered that our datastore was using NFS/ZFS, and NFS is known for file-lock issues. So I mounted the nfs share the datastore was using and removed one file named lck-c30d000000000000. After this, the VM booted up successfully! (Alternatively, we can log on the ESXi host and remove the lock file there.)

install java jdk on linux

January 7th, 2014 No comments

Here are the steps if you want to install java on linux:

wget <path to jre-7u25-linux-x64.rpm> -P /tmp
rpm -ivh /tmp/jre-7u25-linux-x64.rpm
mkdir -p /root/.mozilla/plugins
rm -f /root/.mozilla/plugins/
ln -s /usr/java/jre1.7.0_25/lib/amd64/ /root/.mozilla/plugins/
ll /root/.mozilla/plugins/

add another root user and set password

January 7th, 2014 No comments

In linux, do the following to add another root user and set password:

mkdir -p /home/root2
useradd -u 0 -o -g root -G root -s /bin/bash -d /home/root2 root2
echo password | passwd --stdin root2
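
Since root2 shares UID 0, both names resolve to the superuser. A quick way to spot every UID-0 account is to parse passwd-format data; a sample is inlined here so the snippet is self-contained (in practice read /etc/passwd):

```shell
passwd_sample='root:x:0:0:root:/root:/bin/bash
root2:x:0:0::/home/root2:/bin/bash
andy:x:500:500::/home/andy:/bin/bash'

# field 3 is the numeric UID; print every account whose UID is 0
uid0=$(printf '%s\n' "$passwd_sample" | awk -F: '$3 == 0 {print $1}')
echo "$uid0"
```

This is also a handy audit check: an unexpected second UID-0 account is a classic sign of a compromised box.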

Categories: IT Architecture, Linux, Systems Tags:

self defined timeout for telnet on Linux

December 26th, 2013 No comments

telnet's default timeout value is relatively high, so you may want to lower it to something like 5 seconds. Here's one way we can fulfill this:


command="telnet <host> 1521"   # host is a placeholder; port 1521 as in the original
waitfor=5
output=/var/tmp/telnet.out     # placeholder output file

$command >> $output &
commandpid=$!
( sleep $waitfor ; kill -9 $commandpid > /dev/null 2>&1 ) &
sleeppid=$!
wait $commandpid > /dev/null 2>&1
kill $sleeppid > /dev/null 2>&1

Also, we can use expect and set a timeout for it. When telnet is driven by expect, we can implement a telnet timeout through expect's timeout value:


set timeout 30
spawn telnet <host> 1521

send "<put telnet command here>\r"
expect eof
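
On systems with GNU coreutils, the timeout command does the same job as the wrapper above in one line; it kills the command and exits with status 124 when the limit is hit:

```shell
# kill "sleep 5" after 2 seconds; a timed-out telnet check would look like:
#   timeout 5 telnet <host> 1521
timeout 2 sleep 5
rc=$?
echo "exit code: $rc"
```

An exit code of 124 therefore means "the port never answered within the limit", which is usually all a connectivity check needs to know.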

Add static routes in linux which will survive reboot and network bouncing

December 24th, 2013 No comments

We can see that in linux, the file /etc/sysconfig/static-routes is invoked by /etc/init.d/network:

[root@test-linux ~]# grep static-routes /etc/init.d/network
# Add non interface-specific static-routes.
if [ -f /etc/sysconfig/static-routes ]; then
grep "^any" /etc/sysconfig/static-routes | while read ignore args ; do

So we can add rules in /etc/sysconfig/static-routes to let network routes survive reboot and network bouncing. The format of /etc/sysconfig/static-routes is like:

any net <network> netmask <netmask> gw <gateway>
any net <network> netmask <netmask> gw <gateway>

To make a route take effect immediately, you can use route add:

route add -net <network> netmask <netmask> gw <gateway>

But remember that to change the default gateway, we need to modify /etc/sysconfig/network (modify the GATEWAY= line).

After the modification, bounce the network using service network restart to put the changes into effect.
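
On distributions with iproute2, the same route can be added with the ip command; the values below are placeholders in the same spirit as above:

```shell
# equivalent of "route add -net <network> netmask <netmask> gw <gateway>":
# ip route add <network>/<prefix> via <gateway>

# and to verify the routing table afterwards:
# ip route show
```

Note that routes added this way are just as volatile as route add ones; persistence still comes from /etc/sysconfig/static-routes (or per-interface route-ethN files).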

remove duplicate images using fdupes and expect in linux

December 13th, 2013 No comments

I've got several thousand pictures, but most of them had several exact copies of themselves. So at first I removed the duplicates by hand.

Later, I remembered that in linux we have md5sum, which gives the same string for files with exactly the same contents. Then I tried to write some program, and that took me quite a while.

I searched google and found that linux has fdupes, which can do the job very well. fdupes finds duplicate files based on file size and md5 value, and if you give it the -d parameter it will prompt you to preserve one copy (or all copies) of the duplicates and remove the others. You can read more about fdupes in its manpage.

As all the pictures were on a windows machine, I installed cygwin along with fdupes and expect. Then I wrote a small script to preserve only one copy of each duplicate picture (without expect, you would have to enter your choice of preserving one copy or all copies by hand each time, as fdupes has no option for automatically preserving just one copy). Here's my program:

$ cat fdupes.expect
set timeout 1000000
spawn /home/andy/
expect "preserve files" {
send "1\r";exp_continue
}

$ cat /home/andy/
fdupes.exe -d /cygdrive/d/pictures #yup, my pictures are all on this directory on windows, i.e. d:\pictures

After this, you can just run fdupes.expect, and it will reserve only one copy and remove other duplicates for you.
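
The md5sum idea mentioned above can also be sketched directly in shell, without fdupes. This builds a throwaway directory, then prints only the files whose checksum occurs more than once:

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo "same content"  > a.jpg
echo "same content"  > b.jpg
echo "other content" > c.jpg

md5sum *.jpg | sort > sums                 # "checksum  filename" lines, sorted by checksum
awk '{print $1}' sums | uniq -d > dup.md5  # checksums seen more than once
dups=$(grep -F -f dup.md5 sums | awk '{print $2}')
echo "$dups"
```

fdupes is still the better tool for real use (it compares sizes first and byte-verifies matches), but this shows the core grouping trick in a few lines.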

Common storage multi path Path-Management Software

December 12th, 2013 No comments
Vendor           Path-Management Software
Hewlett-Packard  AutoPath, SecurePath
Microsoft        MPIO
Hitachi          Dynamic Link Manager
EMC              PowerPath
IBM              RDAC, MultiPath Driver
VERITAS          Dynamic Multipathing (DMP)

resolved – mount clntudp_create: RPC: Program not registered

December 2nd, 2013 No comments

When I did a showmount -e localhost, an error occurred:

[root@centos-doxer ~]# showmount -e localhost
mount clntudp_create: RPC: Program not registered

So I checked which RPC program number showmount was using:

[root@centos-doxer ~]# grep showmount /etc/rpc
mountd 100005 mount showmount

As this indicated, we should start the mountd daemon to make showmount -e localhost work. And mountd is part of nfs, so I started up nfs:

[root@centos-doxer ~]# /etc/init.d/nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS daemon: [ OK ]
Starting NFS mountd: [ OK ]

Now that mountd was running, showmount -e localhost worked.


quick configuration of python httpd server

November 28th, 2013 No comments

Let's assume that you want to copy files from server A to server B, and you have found that no scp is available, but wget is there for use. Then you can run one python command on server A and use wget to download the files from it.

Here's the steps:

On server A:

cd <directory of files you want to copy>

python -m SimpleHTTPServer #notice the output of this command, for example, "Serving HTTP on port 8000 ..."

Now you can open browser and visit http://<hostname of server A>:8000. You will notice files are there now.

On server B:

wget http://<hostname of server A>:8000/<files to copy>

After you've copied files, you can press control+c to terminate that python http server on Server A. (Or you can press ctrl+z, and then bg %<job id> to make that python httpd server run in background)
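
Note that on python 3 the module is named http.server instead of SimpleHTTPServer. A self-contained sketch that starts a server, fetches a file back, and shuts it down (port 8123 is an arbitrary choice, and python3 is assumed to be installed):

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo "hello from server A" > file.txt

# python 3 equivalent of "python -m SimpleHTTPServer"
python3 -m http.server 8123 >/dev/null 2>&1 &
srv=$!
sleep 1   # give the server a moment to bind the port

# fetch the file back, as wget on server B would
got=$(python3 -c "import urllib.request; print(urllib.request.urlopen('http://127.0.0.1:8123/file.txt').read().decode().strip())")
kill $srv
echo "$got"
```

The served directory is whatever the server's working directory is, so cd to the files you want to share before starting it, just as in the python 2 steps above.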

Categories: IT Architecture, Programming Tags:

VLAN in windows hyper-v

November 26th, 2013 No comments

Briefly, a virtual LAN (VLAN) can be regarded as a broadcast domain. It operates on OSI layer 2. The exact protocol definition is known as 802.1Q. Each network packet belonging to a VLAN has an identifier. This is just a number between 0 and 4095, with both 0 and 4095 reserved for other uses. Let's assume a VLAN with an identifier of 10. A NIC configured with the VLAN ID of 10 will pick up network packets with the same ID and will ignore all other IDs. The point of VLANs is that switches and routers enabled for 802.1Q can present VLANs to different switch ports in the network. In other words, where a normal IP subnet is limited to a set of ports on a physical switch, a subnet defined in a VLAN can be present on any switch port, if so configured, of course.

Getting back to the VLAN functionality in Hyper-V: both virtual switches and virtual NICs can detect and use VLAN IDs. Both can accept and reject network packets based on VLAN ID, which means that the VM does not have to do it itself. The use of VLANs enables Hyper-V to participate in more advanced network designs. One limitation in the current implementation is that a virtual switch can have just one VLAN ID, although that should not matter too much in practice. The default setting is to accept all VLAN IDs.

Difference between Computer Configuration settings and User Configuration settings in Active Directory Policy Editor

November 22nd, 2013 No comments
  • Computer Configuration settings are applied to computer accounts at startup and during the background refresh interval.
  • User Configuration settings are applied to user accounts at logon and during the background refresh interval.

resolved – sshd: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost= user=

November 20th, 2013 No comments

Today when I tried to log on one linux server with a normal account, errors were found in /var/log/secure:

Nov 20 07:43:39 test_linux sshd[11200]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost= user=testuser
Nov 20 07:43:39 test_linux sshd[11200]: pam_ldap: error trying to bind (Invalid credentials)
Nov 20 07:43:42 test_linux sshd[11200]: nss_ldap: failed to bind to LDAP server ldaps:// Invalid credentials
Nov 20 07:43:42 test_linux sshd[11200]: nss_ldap: failed to bind to LDAP server ldap:// Invalid credentials
Nov 20 07:43:42 test_linux sshd[11200]: nss_ldap: could not search LDAP server - Server is unavailable
Nov 20 07:43:42 test_linux sshd[11200]: nss_ldap: failed to bind to LDAP server ldaps:// Invalid credentials
Nov 20 07:43:43 test_linux sshd[11200]: nss_ldap: failed to bind to LDAP server ldap:// Invalid credentials
Nov 20 07:43:43 test_linux sshd[11200]: nss_ldap: could not search LDAP server - Server is unavailable
Nov 20 07:43:55 test_linux sshd[11200]: pam_ldap: error trying to bind (Invalid credentials)
Nov 20 07:43:55 test_linux sshd[11200]: Failed password for testuser from port 34243 ssh2
Nov 20 07:43:55 test_linux sshd[11201]: fatal: Access denied for user testuser by PAM account configuration

After some attempts on linux PAM (sshd, system-auth), I still got nothing. Later, I compared /etc/ldap.conf with the one on another box, and found the configuration on the problematic host was not right.

I copied over the correct ldap.conf and tried logging on again, and the issue was resolved.


You can read more about linux PAM in its documentation (I recommend reading the System Administrators' Guide, as that may be the only one linux administrators need to reach for. You can also get detailed info on some commonly used PAM modules such as pam_cracklib.)

resolved – how to show all results in one page when searching your wordpress blog

November 13th, 2013 No comments

Assume that you have your own wordpress blog, and you note down everything you met in daily work.

Now you have some trouble again in work, and remember that you've noted a similar issue before. So you try searching your wordpress blog with a keyword such as "trouble". wordpress returns 30 pages of results, each page with 10 articles. Now you scroll and click "next page" a lot, and that really frustrates you. What if you had all the search results in one page? Then you would just need to scroll, with no waiting for the next, next, next page to load. (You may worry that the page load time will disappoint other guys searching your blog, but this proves to be little to worry about, as nobody will search your blog except yourself. Believe me, buddy!)

Here goes the way to fulfill this functionality:

  1. Go to wordpress admin page, then click "Appearance" -> "Editor";
  2. Click archive.php in the right to edit this file(search.php refers to archive.php, so you should edit archive.php);
  3. Search for "have_posts()", and add one line above that line. The line to be added is: <?php query_posts($query_string . '&showposts=30'); ?> You may change 30 here to any number you want; as you guessed, this is the number of results shown per page.
  4. Save the change and try searching again. You'll notice the change.


  1. Note that every time you upgrade wordpress or your wordpress theme, you may need to do the above steps again;
  2. The idea is from

resolved – kernel panic not syncing: Fatal exception Pid: comm: not Tainted

November 13th, 2013 No comments

We were installing IDM OAM today, and the linux server panicked every time we ran the startup script. The panic info was like this:

Pid: 4286, comm: emdctl Not tainted 2.6.32-300.29.1.el5uek #1
Process emdctl (pid: 4286, threadinfo ffff88075bf20000, task ffff88073d0ac480)
ffff88075bf21958 ffffffffa02b1769 ffff88075bf21948 ffff8807cdcce500
<0> ffff88075bf95cc8 ffff88075bf95ee0 ffff88075bf21998 ffffffffa01fd5c6
<0> ffffffffa02b1732 ffff8807bc2543f0 ffff88075bf95cc8 ffff8807bc2543f0
Call Trace:
[<ffffffffa02b1769>] nfs3_xdr_writeargs+0x37/0x7a [nfs]
[<ffffffffa01fd5c6>] rpcauth_wrap_req+0x7f/0x8b [sunrpc]
[<ffffffffa02b1732>] ? nfs3_xdr_writeargs+0x0/0x7a [nfs]
[<ffffffffa01f612a>] call_transmit+0x199/0x21e [sunrpc]
[<ffffffffa01fc8ba>] __rpc_execute+0x85/0x270 [sunrpc]
[<ffffffffa01fcae2>] rpc_execute+0x26/0x2a [sunrpc]
[<ffffffffa01f5546>] rpc_run_task+0x57/0x5f [sunrpc]
[<ffffffffa02abd86>] nfs_write_rpcsetup+0x20b/0x22d [nfs]
[<ffffffffa02ad1e8>] nfs_flush_one+0x97/0xc3 [nfs]
[<ffffffffa02a86b4>] nfs_pageio_doio+0x37/0x60 [nfs]
[<ffffffffa02a87c5>] nfs_pageio_complete+0xe/0x10 [nfs]
[<ffffffffa02ac264>] nfs_writepages+0xa7/0xe4 [nfs]
[<ffffffffa02ad151>] ? nfs_flush_one+0x0/0xc3 [nfs]
[<ffffffffa02acd2e>] nfs_write_mapping+0x63/0x9e [nfs]
[<ffffffff810f02fe>] ? __pmd_alloc+0x5d/0xaf
[<ffffffffa02acd9c>] nfs_wb_all+0x17/0x19 [nfs]
[<ffffffffa029f6f7>] nfs_do_fsync+0x21/0x4a [nfs]
[<ffffffffa029fc9c>] nfs_file_flush+0x67/0x70 [nfs]
[<ffffffff81117025>] filp_close+0x46/0x77
[<ffffffff81059e6b>] put_files_struct+0x7c/0xd0
[<ffffffff81059ef9>] exit_files+0x3a/0x3f
[<ffffffff8105b240>] do_exit+0x248/0x699
[<ffffffff8100e6a1>] ? xen_force_evtchn_callback+0xd/0xf
[<ffffffff8106898a>] ? freezing+0x13/0x15
[<ffffffff8105b731>] sys_exit_group+0x0/0x1b
[<ffffffff8106bd03>] get_signal_to_deliver+0x303/0x328
[<ffffffff8101120a>] do_notify_resume+0x90/0x6d7
[<ffffffff81459f06>] ? kretprobe_table_unlock+0x1c/0x1e
[<ffffffff8145ac6f>] ? kprobe_flush_task+0x71/0x7c
[<ffffffff8103164c>] ? paravirt_end_context_switch+0x17/0x31
[<ffffffff81123e8f>] ? path_put+0x22/0x27
[<ffffffff8101207e>] int_signal+0x12/0x17
Code: 55 48 89 e5 0f 1f 44 00 00 48 8b 06 0f c8 89 07 48 8b 46 08 0f c8 89 47 04 c9 48 8d 47 08 c3 55 48 89 e5 0f 1f 44 00 00 48 0f ce <48> 89 37 c9 48 8d 47 08 c3 55 48 89 e5 53 0f 1f 44 00 00 f6 06
RIP [<ffffffffa02b03c3>] xdr_encode_hyper+0xc/0x15 [nfs]
RSP <ffff88075bf21928>
---[ end trace 04ad5382f19cf8ad ]---
Kernel panic - not syncing: Fatal exception
Pid: 4286, comm: emdctl Tainted: G D 2.6.32-300.29.1.el5uek #1
Call Trace:
[<ffffffff810579a2>] panic+0xa5/0x162
[<ffffffff81450075>] ? threshold_create_device+0x242/0x2cf
[<ffffffff8100ed2f>] ? xen_restore_fl_direct_end+0x0/0x1
[<ffffffff814574b0>] ? _spin_unlock_irqrestore+0x16/0x18
[<ffffffff810580f5>] ? release_console_sem+0x194/0x19d
[<ffffffff810583be>] ? console_unblank+0x6a/0x6f
[<ffffffff8105766f>] ? print_oops_end_marker+0x23/0x25
[<ffffffff814583a6>] oops_end+0xb7/0xc7
[<ffffffff8101565d>] die+0x5a/0x63
[<ffffffff81457c7c>] do_trap+0x115/0x124
[<ffffffff81013731>] do_alignment_check+0x99/0xa2
[<ffffffff81012cb5>] alignment_check+0x25/0x30
[<ffffffffa02b03c3>] ? xdr_encode_hyper+0xc/0x15 [nfs]
[<ffffffffa02b06be>] ? xdr_encode_fhandle+0x15/0x17 [nfs]
[<ffffffffa02b1769>] nfs3_xdr_writeargs+0x37/0x7a [nfs]
[<ffffffffa01fd5c6>] rpcauth_wrap_req+0x7f/0x8b [sunrpc]
[<ffffffffa02b1732>] ? nfs3_xdr_writeargs+0x0/0x7a [nfs]
[<ffffffffa01f612a>] call_transmit+0x199/0x21e [sunrpc]
[<ffffffffa01fc8ba>] __rpc_execute+0x85/0x270 [sunrpc]
[<ffffffffa01fcae2>] rpc_execute+0x26/0x2a [sunrpc]
[<ffffffffa01f5546>] rpc_run_task+0x57/0x5f [sunrpc]
[<ffffffffa02abd86>] nfs_write_rpcsetup+0x20b/0x22d [nfs]
[<ffffffffa02ad1e8>] nfs_flush_one+0x97/0xc3 [nfs]
[<ffffffffa02a86b4>] nfs_pageio_doio+0x37/0x60 [nfs]
[<ffffffffa02a87c5>] nfs_pageio_complete+0xe/0x10 [nfs]
[<ffffffffa02ac264>] nfs_writepages+0xa7/0xe4 [nfs]
[<ffffffffa02ad151>] ? nfs_flush_one+0x0/0xc3 [nfs]
[<ffffffffa02acd2e>] nfs_write_mapping+0x63/0x9e [nfs]
[<ffffffff810f02fe>] ? __pmd_alloc+0x5d/0xaf
[<ffffffffa02acd9c>] nfs_wb_all+0x17/0x19 [nfs]
[<ffffffffa029f6f7>] nfs_do_fsync+0x21/0x4a [nfs]
[<ffffffffa029fc9c>] nfs_file_flush+0x67/0x70 [nfs]
[<ffffffff81117025>] filp_close+0x46/0x77
[<ffffffff81059e6b>] put_files_struct+0x7c/0xd0
[<ffffffff81059ef9>] exit_files+0x3a/0x3f
[<ffffffff8105b240>] do_exit+0x248/0x699
[<ffffffff8100e6a1>] ? xen_force_evtchn_callback+0xd/0xf
[<ffffffff8106898a>] ? freezing+0x13/0x15
[<ffffffff8105b731>] sys_exit_group+0x0/0x1b
[<ffffffff8106bd03>] get_signal_to_deliver+0x303/0x328
[<ffffffff8101120a>] do_notify_resume+0x90/0x6d7
[<ffffffff81459f06>] ? kretprobe_table_unlock+0x1c/0x1e
[<ffffffff8145ac6f>] ? kprobe_flush_task+0x71/0x7c
[<ffffffff8103164c>] ? paravirt_end_context_switch+0x17/0x31
[<ffffffff81123e8f>] ? path_put+0x22/0x27
[<ffffffff8101207e>] int_signal+0x12/0x17

We tried a lot (application coredump, kdump, etc.) but still got no solution, until we noticed that there were a lot of nfs-related messages in the kernel panic info (the [nfs] frames above).

As our linux server was not using NFS or autofs, we tried upgrading the nfs client (nfs-utils) and disabling autofs:

yum update nfs-utils

chkconfig autofs off

After this, the startup of IDM succeeded, and no server panic occurred anymore!

make ssh on linux not to disconnect after some certain time

November 1st, 2013 No comments

You connect to a linux box through ssh, and sometimes you find that the session "hangs" or just disconnects. That's the ssh configuration on the server making this happen.

You can do the following to make the disconnection time long enough that you get across this annoying issue:

cp /etc/ssh/sshd_config{,.bak30}
sed -i '/ClientAliveInterval/ s/^/# /' /etc/ssh/sshd_config
sed -i '/ClientAliveCountMax/ s/^/# /' /etc/ssh/sshd_config
echo 'ClientAliveInterval 30' >> /etc/ssh/sshd_config
echo 'TCPKeepAlive yes' >> /etc/ssh/sshd_config
echo 'ClientAliveCountMax 99999' >> /etc/ssh/sshd_config
/etc/init.d/sshd restart


Categories: IT Architecture, Linux, Systems Tags:

make sudo asking for no password on linux

November 1st, 2013 No comments

Assume that you have a user named 'test' who belongs to the 'admin' group. You want user test to be able to sudo to root without linux prompting for a password. Here's the way you can do it:

cp /etc/sudoers{,.bak}
sed -i '/%admin/ s/^/# /' /etc/sudoers
echo '%admin ALL=(ALL) NOPASSWD: ALL' >> /etc/sudoers


disable linux strong password policy

November 1st, 2013 No comments

You may have enabled a strong password policy on linux, and of course you can disable it. Here's the way if you want to disable it:

cp /etc/pam.d/system-auth{,.bak}
sed -i '/ s/^/# /' /etc/pam.d/system-auth
sed -i 's/use_authtok//' /etc/pam.d/system-auth
echo "password" | passwd --stdin username


  1. To enable strong password for linux, you can have a try on this
  2. You can read more about linux pam here

make tee to copy stdin as well as stderr & prevent ESC output of script

October 30th, 2013 No comments
  • Make tee to copy stdin as well as stderr

As said by manpage of tee:

read from standard input and write to standard output and files

So if you have error messages in your script, the error messages will not be copied and written to the file.

Here's one workaround for this:

./ 2>&1 | tee -a log

Or you can use the more complicated one:

command > >(tee stdout.log) 2> >(tee stderr.log >&2)
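
A quick self-contained check that the 2>&1 redirection above really sends stderr through tee:

```shell
logfile=$(mktemp)
# a stand-in for a script: one line to stdout, one to stderr
( echo "to stdout"; echo "to stderr" >&2 ) 2>&1 | tee -a "$logfile" > /dev/null
cat "$logfile"
```

Both lines land in the log file; without the 2>&1, only "to stdout" would, because the pipe carries stdout alone.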

  • Prevent ESC output of script

script literally captures every type of output that was sent to the screen. If you have colored or bold output, this shows up as esc characters within the output file. These characters can significantly clutter the output and are not usually useful. If you set the TERM environmental variable to dumb (using setenv TERM dumb for csh-based shells and export TERM=dumb for sh-based shells), applications will not output the escape characters. This provides a more readable output.

In addition, the timing information provided by script clutters the output. Although it can be useful to have automatically generated timing information, it may be easier to not use script’s timing, and instead just time the important commands with the time command mentioned in the previous chapter.


  1. Here's the full version
  2. Some contents of this article is excerpted from <Optimizing Linux® Performance: A Hands-On Guide to Linux® Performance Tools>.

use batch script to start up & shutdown Virtualbox VMs

October 28th, 2013 No comments

I wake up before 8 every weekday and want to power on two VMs in virtualbox named "xp" and "win2008". So I can write a script and put it in the "startup" folder, and these two VMs will start up with the system automatically:

@echo off
date /t | find "Mon" && goto 1
date /t | find "Tue" && goto 1
date /t | find "Wed" && goto 1
date /t | find "Thu" && goto 1
date /t | find "Fri" && goto 1
exit

:1
if %time:~0,2% leq 8 (
c:\VirtualBox\VBoxManage startvm win2008 --type gui
c:\VirtualBox\VBoxManage startvm xp --type gui
) else exit

And I also want to shutdown these two VMs in one run:

c:\VirtualBox\VBoxManage controlvm win2008 acpipowerbutton
c:\VirtualBox\VBoxManage controlvm xp acpipowerbutton


You may also consider group policy (gpedit.msc -> Computer Configuration -> Windows Settings -> Scripts -> Shutdown) in windows, so that when you shut down your pc, all VMs will be turned off automatically if you have a GPO for shutdown.


F5 big-ip LTM iRULE to redirect http requests to https

October 25th, 2013 No comments

Here's the irule script:

when HTTP_REQUEST {
    HTTP::redirect "https://[HTTP::host][HTTP::uri]"
}


1. You can read more in the F5 LTM docs <select a version of big ip software from the left side first>

2. Here's one diagram that shows a logical configuration example of the F5 solution for Oracle Database, Applications, Middleware, Servers and Storage:


oracle database tips – sql

October 16th, 2013 No comments


set timing on
set autotrace on explain; #explain automatically
set lines<ize> 80; #linesize
set wrap off #no wrap
set pagesize 24
set feedback off --show or not show * rows selected
set heading off --show or not show column name
set verify off --(not)show substitution before/after
set trimspool off #strip whitespaces for columns
set echo on #(not)show commands executed

set define on/off #enable/disable ampersand substitution

col file_name format a60; column
; #show last command
/ #redo last command

define fil= '/var/tmp/a.txt'
prompt Spooling to &fil
spool &fil
select * from tab;
spool off

sqlplus 'sys/password@devdb as sysdba' #first export ORACLE_HOME(no trailing /) and set tnsnames.ora
sqlplus /nolog #no initial connection
sqlplus>conn sys/password@sid as sysdba

user_ #current user's, user_tables
all_ #current user's, include granted
dba_ #all, dba_users, dba_indexes, dba_tables, dba_objects
dictionary #data dictionary's data dictionary
v$fixed_table #dynamic view's dynamic view
v$process #oracle server processes(does not include client side info)
select pid,serial#,spid,program from v$process; #pid, serial# is PID allocated from Oracle Software; spid is OS pid; program is OS process name mapped to Oracle Service process
select p.spid,s.sid,s.serial#,s.username,s.status,s.last_call_et,p.program,p.terminal,logon_time,module,s.osuser from V$process p,V$session s where s.paddr = p.addr and s.status = 'ACTIVE' and s.username like '%SYS%'; #all the active sessions for sys user
v$session #client info. paddr is v$process's addr(server process addr, such as 00000000C4505F10); server is dedicated or shared.
select machine,osuser,program,process,command,server from v$session where machine = 'centos-doxer'; #process is client's OS PID(on centos-doxer for example)

v$segment_statistics #Segment Statistics views. when parameter STATISTICS_LEVEL is typical/all. Three statistics report global cache activity at the object level, and you can use these to quickly pinpoint the segments in your database that are experiencing the highest levels of inter-instance activity: gc cr blocks received, gc current blocks received, and gc buffer busy.

show parameter db_create_online_log_dest; #if not set, then control files/online redo logs will multiplexed to the FRA and DATA disk group; if set, then the control files and online redo logs will be created in that location instead. Specifying multiple different destinations provides greater fault tolerance
SQL> show parameter db_create_file_dest; #+DATA

select component,current_size,min_size,max_size from V$MEMORY_DYNAMIC_COMPONENTS where current_size <> 0 #size of PGA/SGA/shared pool/large pool
SQL> alter system set control_files='xxx','xxx','xxx' scope=spfile #next reboot, show parameter control_files;
SQL> select * from v$controlfile; -- control file information

SQL>create public synonym part_master for inv.part_master; #other users can use it (if granted privileges); also works across DBs through a db link
SQL>drop public synonym part_master;
show parameter spfile; #oracle/grid
show parameter dump_dest;
show parameter asm_power_limit #power of the RBAL rebalance process (1 to 11); higher values are best used during off-peak hours such as night
SQL> alter system set asm_power_limit=5; #run as the grid user (su - grid)
SQL>alter tablespace tbs1 read only #drop table test; still works, because drop only changes the name in the dictionary
SQL>alter user hr quota 500m/unlimited on <tablespace>;
SQL>alter session/system set … scope #spfile(next reboot)/memory(only current)/both(this is default)
SQL>create pfile='xxx' from spfile;
SQL>create spfile from pfile;
SQL>alter session set current_schema = hr;
SQL>select username from dba_users; #all users
SQL>select distinct owner from dba_objects; #schemas available

SQL>select username,ACCOUNT_STATUS from dba_users;
SQL>select * from global_name; #current database
SQL>select table_name from dba_all_tables where owner = 'HR';

SQL>select to_char(WMSYS.WM_CONCAT(dba_users.username)) user_list from dba_users where expiry_date-sysdate<15 and username not in (select username from dba_users where expiry_date<sysdate); #expire in 15 days

SQL>select table_name from user_tables;

SQL> select OWNER,TABLE_NAME from dba_all_tables t where t.TABLE_NAME like 'OVS_OS_RESOURCE%'; #like '_VSOS_RESOURCE%'
SQL>select * from tab; #all tables and views for current user
SQL>select * from cat; #all tables and sequences for current user. cat is synonym of user_catalog

SQL>create sequence cust_id_seq start with 104501;
SQL>select cust_id_seq.nextval from dual; #104501, 104502

SQL>show user
SQL>create user "newuser" profile "default" identified by "newpassword" password expire default tablespace "users" temporary tablespace "temp" quota unlimited on temp quota unlimited on users account unlock;
SQL>grant "connect" to "newuser"; #connect is role, others are like DBA, resource, Imp_Full_Database, Select_catalog_role

create role "testrole"; #drop role xxx
grant connect to "testrole"
grant execute any procedure to "testrole"
grant select on "table_name" to "testrole"
#profile, password management
SQL> select * from v$pwfile_users; #list users with SYSDBA, SYSOPER or SYSASM privileges

create profile "newpolicy" failed_login_attempts 3 password_lock_time unlimited password_life_time 60 password_reuse_time 120 password_verify_function complex_password; #complex_password is a user-defined function
alter user newuser profile newpolicy;

SQL>alter user xxx account unlock; #alter user xxx account unlock identified by xxx;
SQL>password <account>;
SQL>GRANT SELECT ON to test; #with grant option,
SQL>grant all privileges to test; #object privileges(select, insert, update, delete on a single named object)/system privileges(alter system/create user/create table, dba_sys_privs)
SQL>revoke create table from joe #not cascading
SQL>select * from session_privs;
SQL>select * from session_roles;

SQL>select owner, object_name, object_type from dba_objects where status ^= 'VALID';

select job_id from jobs where job_id like 'SA_%';
select job_id from jobs where job_id like 'SA\_%' escape '\';
select last_name from employees where commission_pct is NULL; # =NULL is wrong

SQL>create table tab1 tablespace tbs1 as select * from tab2;
SQL>truncate table emp; #DDL, cannot be rolled back or committed, but it's quicker than delete
SQL>create index id on obj(object_name);
SQL>alter table obj move; #moving the table invalidates its indexes
SQL>alter index id rebuild; #so rebuild the index afterwards
SQL>create tablespace tbs1 datafile '+DATA' size 100m EXTENT MANAGEMENT LOCAL uniform size 1m;#OMF, oracle managed files. use uniform size to specify extent size(or use AUTOALLOCATE for system management)
SQL>alter database datafile '/u01/oradata/ora11g/example01.dbf' autoextend on next 50M maxsize 5000M;
SQL> alter database datafile '+DATA/PROD/DATAFILE/users.259.679156903' resize 10G;

SQL>create database link p6 connect to hr identified by oracle_4U using 'p6'; select * from jobs@p6; #query across different hosts, cross query

SQL>delete from OVS.OVS_SITE where SITE_NAME='test'; #where ... and/or ...; where ... != ...; where year not between 1936 and 1967; where year not in ('1987', '1989')
select employee_id, last_name, phone_number from employees where last_name = '&LASTNAME' or employee_id = &EMPNO; #ampersand(&) substitution
select first_name, last_name from employees where last_name like '%&&SEARCH%' and first_name like '%&SEARCH%'; #ampersand substitution of the same item twice; && defines the variable so its value is reused
select first_name, job_id, &&col from employees where job_id in ('MK_MAN','SA_MAN') order by &col; #ampersand substitutes the query column, for example, 'salary' column
undefine col;
define; #can get info about db version, editor, user-defined variables, etc
define col=salary
SQL>select 'a''s yes' from dual; #double the single quote ('') to escape it
SQL>select q'<a's yes>' as "test Q" from dual; # (round brackets), {curly braces}, [square brackets], <angle brackets>, use q'<>' to escape
SQL> select employee_id,end_date - start_date as days from job_history;
SQL> select last_name from employees where hire_date < '2009/12/31';
NLS_DATE_FORMAT=yyyy/mm/dd hh24:mi:ss; NLS_LANG=american_america.AL32UTF8 -> National Language Support.
select * from V$NLS_PARAMETERS; for querying the current setting.
alter session set NLS_NUMERIC_CHARACTERS=', '; #, as decimal point and use space as separator for thousands
SQL> alter session set NLS_DATE_FORMAT='yyyy/mm/dd hh24:mi:ss'; #time format
SQL> select first_name, hire_date from employees where hire_date between '1994/12/10' and '2009/12/31';
select INSTANCE_NUMBER||','||INSTANCE_NAME||','||HOST_NAME from v$instance; #|| for concatenation, output will be like 1,devdb1,

alter table emp modify (comm number(4,2) default 0.05);
SQL>insert into timestamp_test values (sysdate); #insert into ... values ('xxx');

SQL> select region_name from regions natural join countries where country_name='Canada'; #ANSI natural join(no need to specify the column for joining). other ANSI join includes join table2 using/join table2 on xxx=xxx/left, right, full outer join table2 on xxx=xxx/cross join
select e.employee_id, e.department_id EMP_DEPT_ID, d.department_id DEPT_DEPT_ID, d.department_name from departments d left outer join employees e on (d.DEPARTMENT_ID=e.DEPARTMENT_ID) where d.department_name like 'P%'; #left outer join, will return all values of the left table(departments). for the right table(employees), empty values will be returned
select e.last_name, d.department_name from departments d right outer join employees e on (e.department_id=d.department_id) where e.last_name like 'G%'; #all employees will be returned, even include ones without department id
SQL>select, a.birth_year,a.father, b.birth_year from family a, family b where a.father =; #self join
select Mum, Dad, Child from family F1 join Family F2 on ( join family F3 on (;
select count(*) from regions cross join countries; #cross join for multiplying, 25*4=100 rows

SQL>select ... from ... where prod_list_price = (select max(prod_list_price) from products); #subquery
SQL>select substr(prod_category,1,22) Category, substr(prod_name,1,39) Product,prod_list_price List from products p where prod_list_price > (select avg(prod_list_price) from products where p.prod_category = prod_category) order by substr(prod_category,1,22), prod_list_price desc; #correlated subquery. The main query retrieves the Category, Product, and List Price details for each product that is greater than the average list price of all products within its category. The subquery's product category is joined with the main query's product category through the main query's table alias.
select E.job_ID current_job, last_name||' can earn twice their salary by changing jobs to: '||J.job_id options, E.salary current_salary, J.max_salary potential_max_salary from employees E join jobs J on (2*E.salary < J.max_salary) where E.salary>5000 order by last_name; #non-equijoins
SQL>select * from x union select * from y; #without duplicates
SQL>select * from x union all select * from y; #with duplicates
SQL>select * from x intersect select * from y; #rows that exist in both tables
SQL>select * from x minus select * from y; #rows only in x table

not null/unique/primary key/foreign key/check/ #types of integrity constraints; deferrable/not deferrable
create table check_constraint_example (col1 char(1) constraint check_col1 check (col1 in ('B','G','N'))); #check
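The other constraint types follow the same inline pattern; for instance, a deferrable foreign key might look like this (table and constraint names are made up for illustration):

```sql
create table parent_example (id number primary key);
create table child_example (
  parent_id number constraint fk_child_parent references parent_example (id)
    deferrable initially immediate
);
```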

Partitioned table #with each partition implemented in a different tablespace. #data partitioning. alter table <abc> drop partition <xyz>
create table sales (
prod_id number not null,
quantity_sold number,
amount_sold number,
total_amount AS (quantity_sold * amount_sold), #virtual column
time_id date not null)
storage (initial 65536 minextents 1 maxextents 2147483645)
partition by range (time_id) #type of partitioning<range/list/hash/composite(subpartition)/reference<foreign key>/virtual column-based/interval> and the partition key
#partition by list (cov_type_cd) (partition cov_life values ('term 64','UL','ADB') tablespace cov_gib_ts, .....)
(partition sales_200701 values less than (to_date('2007-02-01','YYYY-MM-DD')) tablespace sales_ts_200501,
partition sales_200702 values less than (to_date('2007-03-01','YYYY-MM-DD')) tablespace sales_ts_200502,
partition sales_max values less than (maxvalue) tablespace sales_ts_max);
#local partitioned indexes
create index sales_idx_l1 on sales (time_id) local (partition sales_idx_200501 tablespace sales_ts_idx_200501,......) #oracle manages the index partitions from now on (you cannot add/remove individual index partitions manually)
#Global Partitioned Indexes: if the data is partitioned by monthly ranges, you can define a global index partitioned by product. Even when partition pruning cannot occur for the data partitions due to a query's predicates, index partition pruning may still be possible with the global partitioned index. Rebuild the index after dropping a partition.
Prefixed and Nonprefixed Partition Indexes #prefixed performs better. An index is prefixed when the leftmost column of the index key is the same as the leftmost column of the index partition key; the index partitions to be scanned can then be based on the pruned data partitions.

Clustered tables: two tables with a close link between them can have their data stored together in a single block to improve join operations.
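A minimal sketch of such a cluster (all names are illustrative):

```sql
create cluster order_clu (order_id number) size 512;
create index order_clu_idx on cluster order_clu;
create table orders_c (order_id number, order_date date) cluster order_clu (order_id);
create table order_items_c (order_id number, item_id number) cluster order_clu (order_id);
```

Rows from both tables sharing the same order_id are then stored in the same block, so joining them avoids extra I/O.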

#Data compression
create table ..... compress;
alter table sales move compress; #compresses existing rows (use "move nocompress" to uncompress)

#Index Key Compression
create index comm_sr_prod_idx on commission (sales_rep_id, prod_id) compress 1; # index compression cannot be used on a unique index that has only one attribute. 1 is the number of prefixing (leading) columns to compress. In this case, you used a value of 1 to indicate that duplicate values of the first column, sales_rep_id, are to be removed.

#Parallel Processing
alter session enable parallel dml; #parallelism is on by default for DDL(create/alter) and queries, but disabled by default for DML(insert/update/delete/merge)
create table ...... parallel;
select /*+ parallel (commission, 4) */ prod_id, sum(comm_amt), count(*) from commission group by prod_id; #parallel hint, 4 is the degree of parallelism(default is the number of cpus * PARALLEL_THREADS_PER_CPU)
#query rewrite & materialized view
create materialized view comm_prod_mv tablespace comm_prod_mv_ts storage (initial 50k next 50k) refresh complete next sysdate + 7 enable query rewrite as select sales_rep_id, prod_id, comm_date, count(*), sum(comm_amt) from commission group by sales_rep_id, prod_id, comm_date;


set serverout on size 10000 #enable display of dbms_output server messages
l_counter number :=0; --or with precision/scale: l_counter number(12,2) := 0;
l_today date :=sysdate; --L_start_date := to_date('09-JUN-91:13:01 ', 'DD-MON-YY:HH24:MI');
l_name varchar2(50);
l_seq_val number :=cust_id_seq.nextval; --assigns the next sequence
l_counter :=l_counter+1;
l_name := 'LUCYTHECAT';
dbms_output.put_line (l_today);
/ #run the procedure
show errors;
edit #edit above procedure

v_val varchar2(80);
v_val2 varchar2(60);
cursor get_data is
select VAL from test;
open get_data;
fetch get_data into v_val;
dbms_output.put_line(v_val); --print only the first row
v_val2 := CASE v_val
when 'abc' then 'is abc'
when 'def' then 'is def'
else 'neither abc nor def'
END; --case ... END
dbms_output.put_line (v_val2);
close get_data;
when no_data_found then
raise_application_error (-20052,'sorry no data found'); --user-defined error codes: -20000 to -20999
when others then
raise_application_error (-20999,'something has gone wrong');

SQL> declare
cursor get_data is
select VAL from test;
for i in get_data LOOP --no need to open/close the cursor
dbms_output.put_line(i.VAL); --i is a row from the get_data cursor
end LOOP;

IF l_bank_balance >= 0 then EXIT;
L_decision := 'ACCOUNT OVERDRAWN';
WHILE l_sales_total < 100000 LOOP
Select sales_amount into l_sale_amount from daily_sales;
l_sales_total := l_sales_total + l_sale_amount;
l_counter number;
i number;
select count(*) into i from employee;
FOR l_counter IN 1 .. 10

##custom Exception
l_counter number := 0;
l_name employee.employee_name%type;
cursor get_employee_name is
select employee_name
from employee;
excep_old_friend exception;
excep_never_met_them exception;
open get_employee_name;
fetch get_employee_name into l_name;
if l_name = 'CARL DUDLEY' then
raise excep_old_friend;
else --ELSIF ... then ...
raise excep_never_met_them;
end if;
close get_employee_name;
when excep_old_friend then
dbms_output.put_line('I know this person');
when excep_never_met_them then
dbms_output.put_line('I do not know this person');
when others then
dbms_output.put_line('Oracle Error: ' || sqlcode);
dbms_output.put_line('Oracle Error message is: ' || sqlerrm);

create or replace view emp_hq as select * from employee_master where loc_id = '2'; #view. x$, v$, gv$ (gv$ has an extra inst_id column) #stored objects
Triggers cannot exist independently of an Oracle Database 11g table. They are associated with one and only one table and, if the table is dropped, so is the trigger. #triggers
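A minimal sketch of such a table-bound trigger (table and column names are examples):

```sql
create or replace trigger emp_sal_trg
after update of salary on employees
for each row
  dbms_output.put_line('salary: ' || :old.salary || ' -> ' || :new.salary);
end;
/
```

Dropping the employees table would silently drop emp_sal_trg along with it.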
SQL> create or replace procedure printall #procedures
(first_me in varchar)
cursor get_data is
select VAL from test where val like first_me || '%';
for i in get_data LOOP
dbms_output.put_line(i.VAL);
end LOOP;
SQL> execute printall ('def') #or just exec printall ('def')
SQL>describe printall
create or replace function GetProductTaxIn #functions
(in_product_id number)
return number
v_price number;
cursor get_price_tax is
select nvl(round(prod_list_price * 1.15,2),0) #nvl: replace null with the second parameter
from products
where prod_id = in_product_id;
open get_price_tax;
fetch get_price_tax into v_price;
close get_price_tax;
return v_price;
when others then
v_price := 0;
return v_price;
select prod_id, prod_list_price, GetProductTaxIn(Prod_id) from products;
select prod_id, prod_list_price, GetProductTaxIn(Prod_id) from products where GetProductTaxIn(Prod_id) >= 500;

SQL>select owner, object_name, object_type from dba_objects where status ^= 'VALID'; #find invalid objects such as packages (packages roll functions and procedures together)

Oracle Functions

select supplier_id, NVL2(supplier_desc, supplier_name<if not null>, supplier_name2<if null>) from suppliers;
select NULLIF(12, 12) from dual; #null
select NULLIF(12, 22) from dual; #12
SQL> select coalesce(NULL,'not null',NULL,'not') from dual; #returns the first non-null
#Aggregation Functions
select ... group by rollup(c.cust_gender, b.channel_class, to_char(a.time_id, 'yyyy-mm')); # generates subtotals for attributes specified in the group by clause, plus another row representing the grand total.
select ... group by cube(c.cust_gender, b.channel_class, to_char(a.time_id, 'yyyy-mm')); #more detailed subtotals
select count(*), department_id from employees group by department_id order by department_id; #group functions -> avg, sum, min, max, count and distinct keyword
SQL> select count(distinct commission_pct) from employees;
SQL> select count(*), count(distinct nvl(department_id,0)), count(distinct nvl(job_id,0)) from employees; #107 employees,19 jobs, 12 departments(1 is null)
SQL> select sum(2) from employees; #107 rows will return 214
SQL> select sum(sysdate-hire_date)/365.25 "total years" from employees; #years of all employees in total
SQL>select avg(2) from employees; #2, adds the number 2 across 107 rows and divides the total by the number of rows
SQL> select avg(salary) from employees;
SQL> select avg(sum(commission_pct)) from employees where nvl(department_id,0) in (40,80,0) group by department_id; # AVG(SUM(COMMISSION_PCT)) = (0.15 + 7.65) /2 = 3.9; group functions can be nested at most two levels deep
select max(salary), count(*) from employees group by department_id order by department_id; #ones with highest salary in each department
select to_char(end_date,'YYYY') "year", count(*) "Number of employees" from job_history group by to_char(end_date,'YYYY') order by count(*) desc; #any item in the select list that is not a group function must be a grouping attribute of the GROUP BY clause
select department_id, job_id, sum(commission_pct) from employees where commission_pct is not null group by department_id, job_id; #group again by job_id in each department
select department_id, count(*) from job_history where department_id in (50,60,80,110) group by department_id having count(*)>1;
#Analysis Functions
select prod_id, sum(quantity_sold), rank () over (order by sum(quantity_sold) desc) as rank, dense_rank () over (order by sum(quantity_sold) desc) as dense_rank from sales where to_char(time_id, 'yyyy-mm') = '2001-06' group by prod_id; #if tie, rank will leave a gap(from 5 directly to 7), but dense_rank will be like 5 5 6

SQL>select TO_CHAR(123.02335,'FM9999999.0099') FROM DUAL; #FM removes leading and trailing spaces/zeros from the number; this mask allows seven digits to the left of the decimal point and, to the right, at least two and at most four digits, rounding anything beyond the fourth decimal digit.
SELECT TO_CHAR(1234567, '999,999,999') FROM dual; #1,234,567. you can use FM999,999,999 for striping zeros/spaces
SQL> select TO_NUMBER('$94 567,00', 'L999G999D00', 'NLS_CURRENCY =''$'' NLS_NUMERIC_CHARACTERS='', ''') as result from dual; #pick up numbers from string
select to_char(sysdate,'Month')||' is a special time' from dual; #September. mask is on above
select first_name from employees where to_char(hire_date,'FMDAY')='SATURDAY'; #fm, remove blanks.
SQL> select to_date('25-DEC', 'DD-MON') from dual; #25-DEC-13
SQL> select to_date('25-DEC-10', 'fxDD-MON-YYYY') from dual; #fx is for exact matching. Since '10' is YY rather than YYYY, this clause reports an error; use 2010 instead and it will work.
SQL>select next_day('03-NOV-1987','Wednesday') from dual; #04-NOV-87

SQL> select to_char(00001,'0999999')||' is a special number' from dual; #0000001
SQL> select to_char(00001,'999999')||' is a special number' from dual; #1
Oracle numeric format masks
Date format masks
Date format masks for time

Categories: Databases, IT Architecture, Oracle DB Tags:

change oracle to use ASM for archive log destination

October 15th, 2013 No comments

Now our DB is using the filesystem for storing archive logs:


First, let's check available DGs:

[oracle@test03 ~]$ export ORACLE_HOME=/u01/app/
[oracle@test03 ~]$ export ORACLE_SID=+ASM1
[oracle@test03 ~]$ $ORACLE_HOME/bin/sqlplus / as sysasm
SQL> select name,state from v$asm_diskgroup ;

NAME                           STATE
------------------------------ -----------
DA_SLCM07 MOUNTED #for archive log
DBFS_DG MOUNTED #for mounting RAM disk from DB
RE_SLCM07 MOUNTED #for backup

Now we can change archive log destination:

[oracle@test03 ~]$ export ORACLE_HOME=/u01/app/oracle/product/
[oracle@test03 ~]$ export ORACLE_SID=oimdb31
[oracle@test03 ~]$ $ORACLE_HOME/bin/sqlplus / as sysdba
SQL> archive log list;
Database log mode Archive Mode
Automatic archival Enabled
Archive destination /u01/app/oracle/product/
Oldest online log sequence 6
Next log sequence to archive 8
Current log sequence 8

SQL> alter system set log_archive_dest_1='location=+DA_SLCM07' scope=both; #on RAC you need to change this only once
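To confirm the change took effect, a quick check (a sketch; the logfile switch simply forces a new archive log to be written to the new destination):

```sql
SQL> show parameter log_archive_dest_1;
SQL> alter system switch logfile;
SQL> archive log list;
```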

After this, you can remove the old archive logs from the filesystem.

tcp flags explanation in detail – SYN ACK FIN RST URG PSH and iptables for SYN flood

October 11th, 2013 No comments

This is from wikipedia:

To establish a connection, TCP uses a three-way handshake. Before a client attempts to connect with a server, the server must first bind to and listen at a port to open it up for connections: this is called a passive open. Once the passive open is established, a client may initiate an active open. To establish a connection, the three-way (or 3-step) handshake occurs:

  1. SYN: The active open is performed by the client sending a SYN to the server. The client sets the segment's sequence number to a random value A.
  2. SYN-ACK: In response, the server replies with a SYN-ACK. The acknowledgment number is set to one more than the received sequence number i.e. A+1, and the sequence number that the server chooses for the packet is another random number, B.
  3. ACK: Finally, the client sends an ACK back to the server. The sequence number is set to the received acknowledgement value i.e. A+1, and the acknowledgement number is set to one more than the received sequence number i.e. B+1.

At this point, both the client and server have received an acknowledgment of the connection. The steps 1, 2 establish the connection parameter (sequence number) for one direction and it is acknowledged. The steps 2, 3 establish the connection parameter (sequence number) for the other direction and it is acknowledged. With these, a full-duplex communication is established.
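The handshake happens inside the kernel; from an application's point of view it is hidden behind listen()/connect()/accept(). A minimal sketch in Python:

```python
import socket

# passive open: bind and listen (port 0 = let the OS pick a free port)
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("", 0))
srv.listen(1)
port = srv.getsockname()[1]

# active open: connect() sends SYN, waits for the SYN-ACK, replies with ACK
cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(("", port))

# accept() returns a socket for the now fully established connection
conn, addr = srv.accept()
conn.sendall(b"hello")
data = cli.recv(5)
cli.close(); conn.close(); srv.close()
```

By the time connect() returns, all three steps above have completed.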

You can read pdf document here

H3C's implementation of a SYN flood solution

Using iptables to resolve the SYN flood issue

You may also consider using tcpkill to kill half-open sessions (use ss -s, netstat -s, or tcptrack to see a connection summary; half-open connections show up in the SYN_RECV state)
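One commonly seen iptables approach is to rate-limit inbound SYNs with the limit module (a sketch only; the chain name and thresholds here are illustrative and need tuning per site):

```shell
iptables -N SYN_FLOOD
iptables -A INPUT -p tcp --syn -j SYN_FLOOD
iptables -A SYN_FLOOD -m limit --limit 25/second --limit-burst 50 -j RETURN
iptables -A SYN_FLOOD -j DROP
```

SYNs within the limit RETURN to normal processing; the excess is dropped.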

Output from netstat -atun:

The reason for waiting is that packets may arrive out of order or be retransmitted after the connection has been closed. CLOSE_WAIT indicates that the other side of the connection has closed the connection. TIME_WAIT indicates that this side has closed the connection. The connection is being kept around so that any delayed packets can be matched to the connection and handled appropriately.

more on about FIN_wait (one error: 2MSL<Maximum Segment Lifetime>=120s, not 2ms)

All about tcp socket states:

And here's more about tcp connection(internet socket) states:

hadoop installation on centos linux – pseudodistributed mode

September 18th, 2013 No comments

First, install JDK and set JAVA_HOME:

yum install jdk-1.6.0_30-fcs
export JAVA_HOME=/usr/java/jdk1.6.0_30

Now install hadoop rpm:

rpm -Uvh hadoop-1.2.1-1.x86_64.rpm

run hadoop version to verify that hadoop was successfully installed:

[root@node3 hadoop]# hadoop version
Hadoop 1.2.1
Subversion -r 1503152
Compiled by mattf on Mon Jul 22 15:27:42 PDT 2013
From source with checksum 6923c86528809c4e7e6f493b6b413a9a
This command was run using /usr/share/hadoop/hadoop-core-1.2.1.jar

After this, let's configure hadoop to run in pseudodistributed mode:

[root@node3 hadoop]# cat /etc/hadoop/core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name></name>
    <value>hdfs://localhost/</value>
  </property>
</configuration>
[root@node3 hadoop]# cat /etc/hadoop/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
[root@node3 hadoop]# cat /etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:8021</value>
  </property>
</configuration>

We need to configure password-less ssh to localhost if we're running in pseudodistributed mode. This mainly means you can "ssh localhost" without a password (generate a key with ssh-keygen -t rsa and append the public key to authorized_keys). After the ssh configuration, let's format the HDFS filesystem:
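The usual one-time setup looks like this (a sketch assuming OpenSSH defaults; the keygen step is skipped if a key already exists):

```shell
mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
# generate a passphrase-less key pair if one does not exist yet
[ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
# authorize the key for logins to this host
cat "$HOME/.ssh/" >> "$HOME/.ssh/authorized_keys"
chmod 600 "$HOME/.ssh/authorized_keys"
```

After this, `ssh localhost` should succeed without prompting for a password.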

hadoop namenode -format

Now we can start daemons:

PS: I found that some hadoop related scripts under /usr/sbin are not shipped with execute permission, so you may run the following to fix this:

find /usr/sbin/ -type f ! -perm -u+x -exec chmod u+x {} +

That's all for hadoop installation on linux. You can now visit http://<ip of node>:50030/jobtracker.jsp and http://<ip of node>:50070/dfshealth.jsp to see status of hadoop jobtracker/namenode respectively.


Hadoop: The Definitive Guide is a good book about hadoop.

Categories: Clouding, IT Architecture Tags:

SAN Terminology

September 13th, 2013 No comments
SCSI Target
A SCSI Target is a storage system end-point that provides a service of processing SCSI commands and I/O requests from an initiator. A SCSI Target is created by the storage system's administrator, and is identified by unique addressing methods. A SCSI Target, once configured, consists of zero or more logical units.
SCSI Initiator
A SCSI Initiator is an application or production system end-point that is capable of initiating a SCSI session, sending SCSI commands and I/O requests. SCSI Initiators are also identified by unique addressing methods (See SCSI Targets).
Logical Unit
A Logical Unit is a term used to describe a component in a storage system. Uniquely numbered, this creates what is referred to as a Logical Unit Number, or LUN. A storage system, being highly configurable, may contain many LUNs. These LUNs, when associated with one or more SCSI Targets, form a unique SCSI device, a device that can be accessed by one or more SCSI Initiators.
iSCSI
Internet SCSI, a protocol for sharing SCSI based storage over IP networks.
iSER
iSCSI Extension for RDMA, a protocol that maps the iSCSI protocol over a network that provides RDMA services (i.e. InfiniBand). The iSER protocol is transparently selected by the iSCSI subsystem, based on the presence of correctly configured IB hardware. In the CLI and BUI, all iSER-capable components (targets and initiators) are managed as iSCSI components.
FC
Fibre Channel, a protocol for sharing SCSI based storage over a storage area network (SAN), consisting of fiber-optic cables, FC switches and HBAs.
SRP
SCSI RDMA Protocol, a protocol for sharing SCSI based storage over a network that provides RDMA services (i.e. InfiniBand).
IQN
An iSCSI qualified name, the unique identifier of a device in an iSCSI network. iSCSI uses the form iqn.yyyy-mm.naming-authority:unique-name for IQNs. For example, the appliance may use an IQN beginning iqn.1986-03.com.sun to identify one of its iSCSI targets. This name shows that this is an iSCSI device built by a company registered in March of 1986. The naming authority is just the DNS name of the company reversed, in this case, "com.sun". Everything following is a unique ID that Sun uses to identify the target.
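The structure can be illustrated with a toy parser (the IQN used here is a made-up example following the iqn.yyyy-mm.naming-authority:unique-id form):

```python
def parse_iqn(iqn):
    # iqn.<yyyy-mm>.<reversed-domain>:<unique-id>
    prefix, date, rest = iqn.split(".", 2)
    if prefix != "iqn":
        raise ValueError("not an IQN")
    authority, _, unique = rest.partition(":")
    return {"registered": date, "naming_authority": authority, "unique_id": unique}

print(parse_iqn("iqn.1986-03.com.sun:02:example-target"))
```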
Target portal
When using the iSCSI protocol, the target portal refers to the unique combination of an IP address and TCP port number by which an initiator can contact a target.
Target portal group
When using the iSCSI protocol, a target portal group is a collection of target portals. Target portal groups are managed transparently; each network interface has a corresponding target portal group with that interface's active addresses. Binding a target to an interface advertises that iSCSI target using the portal group associated with that interface.
CHAP
Challenge-handshake authentication protocol, a security protocol which can authenticate a target to an initiator, an initiator to a target, or both.
RADIUS
A system for using a centralized server to perform CHAP authentication on behalf of storage nodes.
Target group
A set of targets. LUNs are exported over all the targets in one specific target group.
Initiator group
A set of initiators. When an initiator group is associated with a LUN, only initiators from that group may access the LUN.
Categories: Hardware, SAN, Storage Tags: ,

subversion svn management tips

August 29th, 2013 No comments

[ovmroot@test31 ~]# cd /opt/svn/conf/
[ovmroot@test31 conf]# ls -lrth *.conf
-rwxrwxrwx 1 ovmroot root 526 May 3 2012 svnmailer.conf
-rwxr-xr-x 1 ovmroot root 472 May 3 2012 mailer.conf
-rwxrwxrwx 1 ovmroot root 944 Apr 26 02:33 htpasswd.conf #generate passwords with htpasswd, e.g. htpasswd -b htpasswd.conf hu.zhou TestPass
-rwxrwxrwx 1 ovmroot root 749 Apr 26 02:33 pwd.conf #password for svn protocol
-rwxrwxrwx 1 ovmroot root 501 Apr 26 02:33 authz.conf #permission settings, svnlook tree /opt/svndata/c9lab/|grep SVN_project_name


[ovmroot@test31 conf]# cat authz.conf
[groups]
admin = root,admin
users = testuser1, hu.zhou

@admin = rw
@users = rw
guest =

hu.zhou = rw
testuser1 = r #this is read only


[ovmroot@test31 conf]# cat pwd.conf
root = testpass
admin = testpass
guest = guest

testuser1 = Test
hu.zhou = TestPass
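These files take effect once the repository's svnserve.conf points at them; a typical wiring looks like this (a sketch, assuming access over the svn protocol via svnserve):

```ini
# conf/svnserve.conf (fragment)
[general]
anon-access = none
auth-access = write
password-db = pwd.conf
authz-db = authz.conf
```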

Categories: IT Architecture Tags: