Archive

Author Archive

Symantec NetBackup 7.1 reference manual (commands) download

May 30th, 2011 No comments

Extend a VxVM filesystem connected to SAN fibre channel storage

May 28th, 2011 No comments

Firstly, please refer to http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Online_Storage_Reconfiguration_Guide/scanning-storage-interconnects.html for some pre-checks, such as memory usage, sync, etc.
Then we scan for new disks on each HBA, one at a time (issue_lip; scenario: fabric on Linux with Emulex HBAs). We must check that the DMP nodes are ALL in the ENABLED state before moving on to the next HBA, because scanning for new disks is expected to temporarily disable the paths on that controller - so move on only once the paths are confirmed enabled again. Throughout the whole procedure, a running tail -f /var/log/messages helps.
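To make that "all ENABLED" check quick to repeat between HBAs, a small helper along these lines can be used (a sketch; the function name count_disabled is mine, and it simply counts non-ENABLED subpath lines in vxdmpadm getsubpaths output):

```shell
# count_disabled: read "vxdmpadm getsubpaths" output on stdin and print
# how many subpaths are NOT in an ENABLED state (0 means safe to move on).
count_disabled() {
    awk '$2 ~ /^DISABLED/ { n++ } END { print n + 0 }'
}

# Typical use, per dmpnode (device name is an example):
#   vxdmpadm getsubpaths dmpnodename=sda | count_disabled
```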

1) Checked the messages file for any existing issues. Identified, and eliminated as a concern, the I/O error messages which had been occurring for some time:
May 29 04:11:05 testserver01 kernel: end_request: I/O error, dev sdft, sector 0^M
May 29 04:11:05 testserver01 kernel: Buffer I/O error on device sdft, logical block 0^M
May 29 04:11:05 testserver01 kernel: end_request: I/O error, dev sdft, sector 0^M
/var/log/messages.1:May 27 22:48:04 testserver01 kernel: end_request: I/O error, dev sdjt, sector 8
/var/log/messages.1:May 27 22:48:04 testserver01 kernel: Buffer I/O error on device sdjt3, logical block 1
2) Saved some output for comparison later:
syminq -pdevfile > /var/tmp/syminq-pdevfile-prior
We expected the new device to be hyper 2C27, so I looked for this device in the output and, as expected, did not find it.
vxdisk -o alldgs list > /var/tmp/vxdisk-oalldgs.list
3) Checked the current state of the VxVM devices' DMP subpaths:
for disk in `vxdisk -q list | awk '{print $1}'`
do
echo $disk
vxdmpadm getsubpaths dmpnodename=${disk}
done | tee -a /var/tmp/getsubpaths.out
Checked the output to make sure all disks except the root disk (expected) had two enabled paths:
NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
================================================================================
cciss/c0d0   ENABLED(A)   -          c360       OTHER_DISKS  OTHER_DISKS
sda
NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
================================================================================
sda          ENABLED(A)   -          c0         EMC          EMC0             -
sddc         ENABLED(A)   -          c1         EMC          EMC0             -
sdae
NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
================================================================================
sdeh         ENABLED(A)   -          c1         EMC          EMC0             -
sdp          ENABLED(A)   -          c0         EMC          EMC0             -
sdaf
NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
================================================================================
sdej         ENABLED(A)   -          c1         EMC          EMC0             -
sdq          ENABLED(A)   -          c0         EMC          EMC0             -
sdai
NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
================================================================================
sdai         ENABLED(A)   -          c0         EMC          EMC0             -
sdfr         ENABLED(A)   -          c1         EMC          EMC0             -
... etc
Ran other commands such as:
vxdg list (to ensure all disk groups were enabled)
df -k (to confirm no existing filesystem problems)

NOTE:

You can get your <dmpnodename> by running:
# vxdisk path | grep emcC5D3
Note that it is listed as DANAME (not SUBPATH).
4) Tried to scan the first SCSI bus:
root@testserver01# pwd
/sys/class/scsi_host/host0
root@testserver01# echo '- - -' > scan
Waited to see if the scan would detect any new devices, monitoring the messages log for anything relating to the scan.
Checked output of syminq and checked the subpaths for each dmpnode as above. All remained ENABLED and no change.
5) Moved on to issuing a forced LIP on the fibre path
a. Issuing the force LIP:
root@testserver01# pwd
/sys/class/fc_host/host0
root@testserver01# echo "1" > issue_lip
The command returned, and I monitored the messages log, waiting for the disabled-path messages to appear (as expected):
May 31 02:44:49 testserver01 kernel: VxVM vxdmp V-5-0-112 disabled path 8/0x80 belonging to the dmpnode 201/0x540
May 31 02:44:49 testserver01 kernel: VxVM vxdmp V-5-0-112 disabled path 67/0xa0 belonging to the dmpnode 201/0x5f0
May 31 02:44:49 testserver01 kernel: VxVM vxdmp V-5-0-112 disabled path 67/0xb0 belonging to the dmpnode 201/0x6c0
May 31 02:44:49 testserver01 kernel: VxVM vxdmp V-5-0-112 disabled path 128/0x60 belonging to the dmpnode 201/0x160
... etc
I checked the output of vxdmpadm getsubpaths for each device to confirm the paths had gone offline:
NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
================================================================================
sddw         ENABLED(A)   -          c0         EMC          EMC0             -
sdiv         ENABLED(A)   -          c1         EMC          EMC0             -
sddm
NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
================================================================================
sdbg         DISABLED     -          c0         EMC          EMC0             -
sdgq         ENABLED(A)   -          c1         EMC          EMC0             -
sddn
NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
================================================================================
sddz         ENABLED(A)   -          c0         EMC          EMC0             -
sdix         ENABLED(A)   -          c1         EMC          EMC0             -
sddo
NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
================================================================================
sdbh         DISABLED     -          c0         EMC          EMC0             -
sdgr         ENABLED(A)   -          c1         EMC          EMC0             -
sddp

Not all the paths were affected at once; after a few minutes' wait they had all gone down as expected, while the secondary paths remained ENABLED.
I waited a little longer to estimate how long all the paths took to go down - around 4 minutes.
b. Rescanning the fibre channel.
Before moving to the other path, I got Volume Manager to rescan the device bus to trigger DMP to wake up the DISABLED paths:
root@testserver01# vxdisk scandisks fabric
I waited until all the primary paths became ENABLED again, checking for the DMP enabled messages in the messages log:
May 31 02:49:43 testserver01 kernel: VxVM vxdmp V-5-0-148 enabled path 129/0x90 belonging to the dmpnode 201/0x10
May 31 02:49:43 testserver01 kernel: VxVM vxdmp V-5-0-148 enabled path 129/0x50 belonging to the dmpnode 201/0x20
May 31 02:49:43 testserver01 kernel: VxVM vxdmp V-5-0-148 enabled path 128/0x40 belonging to the dmpnode 201/0x30
May 31 02:49:43 testserver01 kernel: VxVM vxdmp V-5-0-148 enabled path 129/0x30 belonging to the dmpnode 201/0x40
.... etc
Also checked the vxdmpadm getsubpaths output until all primary paths returned to the ENABLED state.
I then did the same for the second host controller at /sys/class/fc_host/host1.
6) Checked for the new disk device:
Try syminq and/or symcfg discover

After this, you can extend the VxVM filesystem to the size you want, and re-import devices afterwards. Please refer to http://www.doxer.org/extending-filesystems-on-lvm-vxvmvxfs-how-to/ for more details.

panic cpu thread page_unlock is not locked issue when using centos xen to create solaris 10

May 25th, 2011 No comments

Don't panic.

You can allocate more memory to the Solaris virtual machine (e.g. 1024MB) and try again.

In the Sun Forums thread, they say that 609MB is the lowest you can go. Give it a little more memory if you can.

Use xming, xshell, putty, tightvnc to display linux gui on windows desktop (x11 forwarding when behind firewall)

May 24th, 2011 10 comments

Q1:How do I run X11 applications through Xming when there's no firewall?

Step 1 - Configure Xming

Let's assume that you want to run xclock on the Solaris/Linux server 192.168.0.3, and display the GUI on your PC whose IP is 192.168.0.4.

Firstly, download Xming from http://sourceforge.net/projects/xming/files/ and install it on your Windows PC.

After this, you need to add 192.168.0.3 (the Linux/Solaris host) to the allowed server list on your Windows PC. Edit X0.hosts, located in the Xming installation directory (for example, C:\Program Files\Xming\X0.hosts), and add a new entry containing 192.168.0.3, the IP address of the Linux/Solaris host that you want to run the X11 utility from.

Then restart Xming (C:\Program Files\Xming\xming.exe) on your Windows PC.

Step 2 - Connect to remote host, configure it, and run X11 application

Log in to the Linux/Solaris server 192.168.0.3. Set the environment variable DISPLAY to the IP address of your Windows PC, with :0 appended:

#export DISPLAY=192.168.0.4:0

Then you must allow X11 forwarding in the sshd configuration file. That is, set X11Forwarding to yes in /etc/ssh/sshd_config and restart your sshd daemon.
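As a sketch, that sshd change can be scripted; the helper name enable_x11_forwarding is mine, and it takes an optional file argument so you can try it on a scratch copy before touching the real /etc/ssh/sshd_config:

```shell
# enable_x11_forwarding: force "X11Forwarding yes" in an sshd_config file.
# Defaults to /etc/ssh/sshd_config; restart sshd afterwards for it to apply.
enable_x11_forwarding() {
    conf=${1:-/etc/ssh/sshd_config}
    if grep -q '^[#[:space:]]*X11Forwarding' "$conf"; then
        # Rewrite an existing (possibly commented-out) X11Forwarding line
        sed -i 's/^[#[:space:]]*X11Forwarding.*/X11Forwarding yes/' "$conf"
    else
        echo 'X11Forwarding yes' >> "$conf"
    fi
}
```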

Then, on the Solaris/Linux server (192.168.0.3), run an X11 program, such as:

/usr/bin/xclock #or /usr/openwin/bin/xclock on solaris

You will then see a clock GUI pop up on your Windows PC.

PS: You may need to install xorg-x11-xauth on the remote host if you hit errors starting xclock.

Q2: How do I run X11 applications from a remote host when that host is behind a firewall?

If the remote host is behind a firewall, the method above will not work, as the communication will be blocked unless a firewall exception is implemented. To run X11 applications from a remote host behind a firewall, follow the steps below:

Step 1 - Configure Xming

This step is the same as Step 1 in Q1, but I'll paste it here for your convenience:

Let's assume that you want to run xclock on the Solaris/Linux server 192.168.0.3, and display the GUI on your PC whose IP is 192.168.0.4.

Firstly, download Xming from http://sourceforge.net/projects/xming/files/ and install it on your Windows PC.

After this, you need to add 192.168.0.3 (the Linux/Solaris host) to the allowed server list on your Windows PC. Edit X0.hosts, located in the Xming installation directory (for example, C:\Program Files\Xming\X0.hosts), and add a new entry containing 192.168.0.3, which is the IP address of the Linux/Solaris host that you want to run the X11 utility from.

Then restart Xming (C:\Program Files\Xming\xming.exe) on your Windows PC.

Step 2 - Configure X11 forwarding on putty/xshell

For Xshell:

After entering the remote hostname and logon username in Xshell, go to the Tunneling tab of the Advanced SSH Options dialog box, check "Forward X11 Connections to:", select "X DISPLAY:", and enter "localhost:0.0" next to it.

(screenshot: xshell_x11_forwarding)

For Putty:

After entering the remote hostname and logon username in PuTTY, unfold "Connection" in the left pane, unfold "SSH", and select "X11". Then check "Enable X11 forwarding" and enter "localhost:0.0" next to "X display location".

(screenshot: putty_x11_forwarding)

Step 3 - Connect to remote host, configure it, and run X11 application

Log in to the Linux/Solaris server 192.168.0.3. Set the environment variable DISPLAY to localhost:0:

#export DISPLAY=localhost:0 #not 192.168.0.4:0 any more!

Then you must allow X11 forwarding in the sshd configuration file. That is, set X11Forwarding to yes in /etc/ssh/sshd_config and restart your sshd daemon.

Then, on the Solaris/Linux server (192.168.0.3), run an X11 program, such as:

/usr/bin/xclock #or /usr/openwin/bin/xclock on solaris

You will then see a clock GUI pop up on your Windows PC.

PS: You may need to install xorg-x11-xauth on the remote host if you hit errors starting xclock.

Q3: How do I connect to a remote host through a VNC client (such as TightVNC)?

In general, you first install and configure vnc-server on the remote host, then install the TightVNC client on your PC and connect to the remote host.

Here are the details of installing vnc-server on the remote host:

yum grouplist
yum groupinstall "X Window System" -y
yum groupinstall "GNOME Desktop Environment" -y
yum groupinstall "Graphical Internet" -y
yum groupinstall "Graphics" -y
yum install vnc-server
echo 'DESKTOP="GNOME"' > /etc/sysconfig/desktop
sed -i.bak '/VNCSERVERS=/d' /etc/sysconfig/vncservers
echo 'VNCSERVERS="1:root"' >> /etc/sysconfig/vncservers

mkdir -p /root/.vnc
vi /root/.vnc/xstartup

#!/bin/sh

# Uncomment the following two lines for normal desktop:
# unset SESSION_MANAGER
# exec /etc/X11/xinit/xinitrc

[ -x /etc/vnc/xstartup ] && exec /etc/vnc/xstartup
[ -r $HOME/.Xresources ] && xrdb $HOME/.Xresources
xsetroot -solid grey
vncconfig -iconic &
#xterm -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
#twm &
gnome-terminal &
gnome-session &

vncpasswd ~/.vnc/passwd #set password
chmod 755 ~/.vnc ; chmod 600 ~/.vnc/passwd ; chmod 755 ~/.vnc/xstartup

chkconfig --level 345 vncserver on
chkconfig --list | grep vncserver
service vncserver start
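One detail worth remembering when you point the TightVNC viewer at the host: VNC display :N listens on TCP port 5900+N, so the "1:root" display configured above answers on port 5901. The tiny helper below is just for illustration:

```shell
# vnc_port: map a VNC display number to the TCP port it listens on.
vnc_port() { echo $((5900 + $1)); }

# Connect the viewer to host:5901, or use the shorthand host:1.
```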

Q4: What if I want one Linux box acting as the X server and another Linux box as the X client, and I'm not sitting in front of the Linux X server (meaning I have to connect to the Linux X server through VNC)?

This may sound complex, but it's simple actually.

First, install vnc-server on the Linux X server, following the steps in Q3 above ("How do I connect to a remote host through a VNC client?").

Second, install the TightVNC viewer on the Windows box you sit at and connect to the Linux X server through it (xxx.xxx.xxx.xxx:1, for example). Run xhost + <linux X client> there to grant the client access to the X server.

Then, on the Linux X client, export DISPLAY to xxx.xxx.xxx.xxx:1, which is the Linux X server. Run an X program such as xclock, and you'll see the clock displayed in the TightVNC viewer on your Windows box.

PS:

1. You can change vncserver's resolution by editing /usr/bin/vncserver: change the default $geometry = "1024x768" to any value you like, for example $geometry = "1600x900". You can also control each user's VNC resolution by adding a line such as VNCSERVERARGS[1]="-geometry 1600x900" in /etc/sysconfig/vncservers.

2.For vnc server on ubuntu, you can refer to http://www.doxer.org/ubuntu-server-gnome-desktop-and-vncserver-configuration/

Analysis of the output of Solaris format -> verify

May 21st, 2011 No comments

Here's the output of the format -> verify command on my Solaris 10 box:

format> verify

Primary label contents:

Volume name = < >
ascii name =
pcyl = 2609
ncyl = 2607
acyl = 2
bcyl = 0
nhead = 255
nsect = 63
Part Tag Flag Cylinders Size Blocks
0 root wm 1 - 1306 10.00GB (1306/0/0) 20980890
1 var wm 1307 - 2351 8.01GB (1045/0/0) 16787925
2 backup wu 0 - 2606 19.97GB (2607/0/0) 41881455
3 stand wm 2352 - 2606 1.95GB (255/0/0) 4096575
4 unassigned wm 0 0 (0/0/0) 0
5 unassigned wm 0 0 (0/0/0) 0
6 unassigned wm 0 0 (0/0/0) 0
7 unassigned wm 0 0 (0/0/0) 0
8 boot wu 0 - 0 7.84MB (1/0/0) 16065
9 unassigned wm 0 0 (0/0/0) 0

Now, let's give it an analysis:

  • Part

Solaris x86 labels a disk with ten slices, numbered 0 through 9; slices 8 and 9 are reserved by Solaris.

  • Tag

This is used to indicate the purpose of the slice. Possible values are:

unassigned, boot, root, swap, usr, backup, stand, home, public, and private (the last two are used by Sun StorEdge).

  • Flag

wm - the slice is writable and mountable.

wu - the slice is writable and unmountable.

rm - the slice is read-only and mountable.

ru - the slice is read-only and unmountable.

  • Cylinders

This part shows the start and end cylinder number of the slice.

  • Size

The size of the slice.

  • Blocks

This shows the slice size as (cylinders/tracks/sectors), followed by the total number of blocks (sectors) in the slice.
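The Blocks column can be cross-checked from the label geometry above: blocks = cylinders x nhead x nsect. Slice 0, for instance, spans 1306 cylinders with nhead = 255 and nsect = 63:

```shell
# 1306 cylinders * 255 heads * 63 sectors/track = 20980890 blocks,
# matching the Blocks column for slice 0 above.
echo $((1306 * 255 * 63))
```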

Now, let's create a slice and mount the filesystem:

root@test / # format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
0. c1t0d0
/pci@0,0/pci15ad,1976@10/sd@0,0
1. c1t1d0
/pci@0,0/pci15ad,1976@10/sd@1,0
Specify disk (enter its number): 1 #select this disk
selecting c1t1d0
[disk formatted]

FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
fdisk - run the fdisk program
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
volname - set 8-character volume name
!<cmd> - execute <cmd>, then return
quit
format> partition #select partition to check and create new slice

PARTITION MENU:
0 - change `0' partition
1 - change `1' partition
2 - change `2' partition
3 - change `3' partition
4 - change `4' partition
5 - change `5' partition
6 - change `6' partition
7 - change `7' partition
select - select a predefined table
modify - modify a predefined partition table
name - name the current table
print - display the current table
label - write partition map and label to the disk
!<cmd> - execute <cmd>, then return
quit
partition> print #check slice topology
Current partition table (original):
Total disk cylinders available: 2607 + 2 (reserved cylinders)

Part Tag Flag Cylinders Size Blocks
0 root wm 1 - 1306 10.00GB (1306/0/0) 20980890
1 var wm 1307 - 2351 8.01GB (1045/0/0) 16787925
2 backup wu 0 - 2606 19.97GB (2607/0/0) 41881455
3 unassigned wm 0 0 (0/0/0) 0
4 unassigned wm 0 0 (0/0/0) 0
5 unassigned wm 0 0 (0/0/0) 0
6 unassigned wm 0 0 (0/0/0) 0
7 unassigned wm 0 0 (0/0/0) 0
8 boot wu 0 - 0 7.84MB (1/0/0) 16065
9 unassigned wm 0 0 (0/0/0) 0

partition> 3 #select an unassigned slice. It will be /dev/rdsk/c1t1d0s3 once the disk is labeled
Part Tag Flag Cylinders Size Blocks
3 unassigned wm 0 0 (0/0/0) 0

Enter partition id tag[unassigned]: stand
Enter partition permission flags[wm]:
Enter new starting cyl[1]: 2352
Enter partition size[0b, 0c, 2352e, 0.00mb, 0.00gb]: $
partition> label #write label to disk
Ready to label disk, continue? y

partition> name #name the current table
Enter table name (remember quotes): hah

partition> quit

FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
fdisk - run the fdisk program
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
volname - set 8-character volume name
!<cmd> - execute <cmd>, then return
quit
format> save #save new disk/partition definitions
Saving new disk and partition definitions
Enter file name["./format.dat"]:
format> quit

root@test / # newfs /dev/rdsk/c1t1d0s3 #create filesystem on newly created slice
newfs: construct a new file system /dev/rdsk/c1t1d0s3: (y/n)? y
Warning: 1474 sector(s) in last cylinder unallocated
/dev/rdsk/c1t1d0s3: 4096574 sectors in 667 cylinders of 48 tracks, 128 sectors
2000.3MB in 42 cyl groups (16 c/g, 48.00MB/g, 11648 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
3149856, 3248288, 3346720, 3445152, 3543584, 3642016, 3740448, 3838880,
3937312, 4035744
root@test / # fstyp /dev/dsk/c1t1d0s3 #check the filesystem type
ufs
root@test / # mkdir /hah
root@test / # mount /dev/dsk/c1t1d0s3 /hah #mount filesystem
root@test / # cd /hah/
root@test hah # touch aa #create a file to have a test
root@test hah # ls #finished, congratulations!
aa lost+found
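To have the new filesystem mounted automatically after a reboot, an /etc/vfstab entry along these lines would be added (a sketch; the fields are device to mount, device to fsck, mount point, FS type, fsck pass, mount at boot, and mount options):

```
/dev/dsk/c1t1d0s3  /dev/rdsk/c1t1d0s3  /hah  ufs  2  yes  -
```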

NetBackup wrong report - files not backed up while NetBackup reported them as backed up

May 20th, 2011 No comments

NetBackup reported:

The following "sns*" files in /ftp/ftpst/datain/sns_a/success have been archived:
/ftp/ftpst/datain/sns_a/success/sns_closed_09042011_180000_2.data
/ftp/ftpst/datain/sns_a/success/sns_closed_09042011_180000_1.data

However, these files were still in the directory.

Here is my understanding:

We have had similar issues with other hosts, which turned out to be more of a problem in the reporting of the script: it reports on the files it has selected for archiving rather than those that were successfully archived.

If NetBackup does not manage to back up the files to tape (for example because the drives are too busy), it does not delete the files, but there is nothing in the script to report that this has occurred.

Categories: Hardware, Storage Tags:

dmx srdf failover procedure

May 17th, 2011 No comments

1. Verify that all devices in the device group are in the "Synchronized" state:
# symrdf -g dg01 query
2. Once all devices are confirmed as being "Synchronized", issue the command to failover:
# symrdf -g dg01 failover -establish
Once complete, the storage status can be checked again:
# symrdf -g dg01 query
NOTE: Do NOT trust any symcli output on any Solaris 2.6 host when working with the DMX-4.
IMPORTANT: ensure that the Symmetrix device group contains ALL devices:
vxdisk -g dg01 list #from a recent Sun Explorer

3. Import storage
The disk group should be imported as per normal:
# vxdctl enable
# vxdg -Cf import dg01
# vxvol -g dg01 startall
# mountall -F vxfs

4. Verify storage
Ensure that the storage imported by the above commands appears to be correct by examining the output of the following commands:

# df -kl -F vxfs
# vxprint -g dg01 -v

Categories: Hardware, Storage Tags:

Steps for upgrading milestone to version 2.3

May 16th, 2011 No comments

Firstly, please refer to this page for prerequisite:

Motorola milestone(xt702) rooted and busybox installation

After reading that article through, you can do the following steps:

1. Go to the Recovery Console:
wipe data/factory reset
wipe cache partition
Then shut down the Milestone.

2. Run RSDLite on your PC and select Milestone_2.2.1_UK_Package.sbf.

3. Go to the bootloader, connect your Milestone to your PC, and click "start". When this finishes, reboot your Milestone.

4. Now put Milestone_V2.3.2_MuzisoftUpdate_B6.2-SignFile.zip into OpenRecovery\updates, then put OpenRecovery and update.zip on /sdcard. Wipe again (refer to step 1). In the Recovery Console, select "apply sdcard:update.zip" and press OK. Then select "root phone". After that, reboot the system; you'll find it has been rooted. Then select Milestone_V2.3.2_MuzisoftUpdate_B6.2-SignFile.zip.

5. Wipe again and reboot the Milestone. You'll find your system is now 2.3!

Categories: Misc Tags:

Motorola milestone(xt702) rooted and busybox installation

May 16th, 2011 No comments

1. Download update.zip, the Android SDK, busybox, the Motorola ADB driver, RSDLite, and OpenRecovery.

2. You need your system rooted for these operations. Download update.zip here and put it on your sdcard, then go to the Recovery Console. To reach the Recovery Console, you first need to know your bootloader version: press the up arrow of the four-direction key together with the power button to boot the system, and the bootloader version will be displayed.
I. If the version is 90.78, press x + the power button; when a triangle appears, press the volume-up key and the camera button to enter the Recovery Console.
II. If the version is 90.73/90.74, press the camera button and the power button; when a triangle appears, press the volume-up key and the camera button to enter the Recovery Console.
In the Recovery Console, select "apply sdcard:update.zip" and press OK. Then select "root phone". After that, reboot the system; you'll find it has been rooted.

3. Install the Motorola ADB driver, then connect the USB cable to your PC (Windows Media Sync mode). Set up your Milestone like this: Settings -> Applications -> Development -> check USB Debugging. You'll then see Motorola A853 info on your PC.

4. Install the Android SDK to C:\android; the layout looks like this:
C:\android
    add-ons
    platforms
    SDK Readme.txt
    SDK Setup.exe
    tools
5. Run cmd on your PC, unzip busybox, and put it in C:\android\tools, so it becomes C:\android\tools\busybox. In the cmd console, run the following commands:
cd c:\android\tools
adb push c:\android\tools\busybox /sdcard/busybox #This may take several minutes to complete.
Now run:
adb shell
Congratulations! You can now operate on your Milestone. And now you can try su to become root:
su
If that is not successful, you'll need to root your system again; please refer to step 2.

Categories: Misc Tags:

How to Reload IAM DNS Configuration

May 13th, 2011 No comments

Firstly, back up your zone files under /apps/bind/var/named/master:

#cp -R /apps/bind/var/named/master /apps/bind/var/named/master_bak

Now, run command to reload:

#rndc -k /apps/bind/etc/rndc.key reload

PS:

Here's the man page for rndc, rndc.conf, named, named.conf, ndc:

http://www.linuxmanpages.com/man8/rndc.8.php

And for the name service cache daemon, you may refer to the nscd man page.

Categories: IT Architecture, Linux, Systems Tags:

Clear error logs from fma before obu firmware patching

May 12th, 2011 No comments

1. Clear logs from fma:
To clear the FMA faults and error logs from Solaris:
a) Show faults in FMA
# fmadm faulty
NOTE: Do not use 'fmadm faulty -a' in this step. When you specify the -a option, all resource information cached by the Fault Manager is listed, including faults which have already been corrected or where no recovery action is needed (see the 'fmadm' man page). The listing also includes information for resources that may no longer be present in the system.
b) For each fault UUID listed in the 'fmadm faulty' run
# fmadm repair <uuid>
# fmadm faulty (to make sure the output is clean after repair)
c) Clear ereports and resource cache
# cd /var/fm/fmd
# rm e* f* c*/eft/* r*/*
d) Reset the fmd serd modules
# fmadm reset cpumem-diagnosis
# fmadm reset cpumem-retire
# fmadm reset eft
# fmadm reset io-retire
e) Reboot the system
To clear the FMA faults and error logs from Solaris without rebooting the system:
a) Stop the fmd:
# svcadm disable -s svc:/system/fmd:default
b) Remove all files from the FMA log directories. This is very specific to the files found in the FMA directories; all directories must be left intact.
# cd /var/fm/fmd
# find /var/fm/fmd -type f -exec ls {} \;
c) Check that only files within the /var/fm/fmd directory are identified then replace the ls with rm to remove them.
# find /var/fm/fmd -type f -exec rm {} \;
d) Restart fmd after the files are removed
# svcadm enable svc:/system/fmd:default
For more information about clearing logs please see Doc ID 1004229.1
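Steps b) and c) above can be wrapped in a small helper so that exactly the same file list is reviewed and then removed (a sketch; the function name fma_logs is mine - point it at a scratch directory first if you want to try it out):

```shell
# fma_logs DIR ACTION: run ACTION (ls to review, rm to remove) on every
# regular file under DIR, leaving the directory tree itself intact.
fma_logs() {
    dir=${1:-/var/fm/fmd}
    action=${2:-ls}
    find "$dir" -type f -exec "$action" {} \;
}

# Review first, then remove:
#   fma_logs /var/fm/fmd ls
#   fma_logs /var/fm/fmd rm
```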
2. Upgrade the SP and OBP firmware by installing patch 145673-02.

For how to do the firmware patch, please refer to http://www.doxer.org/obu-firmware-patching/

Recovering from temporary loss of san storage

May 9th, 2011 No comments

Typically, you will see entries in the messages file relating to some fibre channel paths going offline and possibly coming back online.
If all paths to the SAN storage are lost, Veritas will disable the disk groups; however, the filesystems will still be mounted.

df -k will return messages like "Cannot stat filesystem".

To recover, you need to

1/ Unmount any filesystems that are mounted from the disabled disk groups.
You may find that there are processes that were trying to write to files in these filesystems and that you cannot unmount; you will probably have to kill those processes. The only thing to be careful of is WebSphere processes: if a WebSphere system is impacted, call the WebSphere team out first before starting to unmount or kill processes.
For Red Hat systems, the umount behaviour is the opposite of Solaris, so do a lazy umount of the parent directory.
So, for /apps/test/results:
umount -l /apps/test
should take care of it.

2/ Once all affected filesystems have been unmounted, you can deport the disk groups.
3/ Import the disk groups.
From the vxdg man page:
When a disk group is imported, all disks in the disk group are stamped with the host's host ID. Typically, a disk group cannot be imported if any of its disks are stamped with a non-matching host ID. This provides a sanity check in cases where disks can be accessed from more than one host.
If it is certain that a disk is not in use by another host (such as because a disk group was not cleanly deported), the -C option can be used to clear the existing host ID on all disks in the disk group as part of the import. A host ID can also be cleared using vxdisk clearimport.

4/ Check volume status with vxprint -Ath and start the volumes

5/ Mount the filesystems, either from the entries in /etc/vfstab or /etc/fstab, or, if it's a VCS cluster, use VCS to bring the mount points online.
For any databases, e.g. Oracle, make sure things like /apps/ora are mounted before you try to mount the database filesystems; otherwise there is no ORACLE_HOME and the databases will not start.

6/ For databases, talk to the DBAs before starting any of the services; they may want to run some checks before starting the databases.
7/ Start any services that need to be started, either under VCS or manually from a start script. You may need help from the application support team.
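Putting the unmount, deport, import, and start steps together, the command sequence looks roughly like this (a sketch only; the disk group name appdg and the mount point are hypothetical, and on VCS systems the mounts should be brought online through VCS instead):

```shell
umount -l /apps/test        # step 1: lazy umount of the parent (Red Hat)
vxdg deport appdg           # step 2: deport the disabled disk group
vxdg -Cf import appdg       # step 3: -C clears the stale host ID stamps
vxvol -g appdg startall     # step 4: start the volumes (check vxprint -Ath first)
mount /apps/test            # step 5: remount from the fstab entry
```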

Categories: Hardware, Storage Tags:

cdsdisk in vxvm

May 9th, 2011 No comments

Confused by "auto:cdsdisk" in the output of vxdisk list?

For a disk to be accessible by multiple platforms, the disk must be consistently recognized by the platforms, and all platforms must be capable of performing I/O on the disk.

CDS disks contain specific content at specific locations to identify or control access to the disk on different platforms.

The same content and location are used on all CDS disks, independent of the platform on which the disks are initialized.

Categories: Hardware, Storage Tags:

Description of the Solaris Boot Archives bootadm

May 7th, 2011 No comments

Please download it here:
solaris10-boot-archive (bootadm, failsafe)

Network and Operating Systems Course Collection & VMware Course Collection download

May 7th, 2011 No comments

This download includes the Network and Operating Systems Course Collection and the VMware Course Collection; please click the following link to download:

online-learning courses download

Categories: IT Architecture Tags:

com.ibm.ws.exception.RuntimeError: com.ibm.ws.exception.RuntimeError: Unable to start the CoordinatorComponentImpl

May 6th, 2011 No comments

When I tried to start the nodeagent and app server of WebSphere, it said in /apps/WebSphere/Profiles/Node-stage01-04/logs/nodeagent/SystemErr.log:

[06/05/11 00:17:36:556 BST] 0000000a SystemErr     R com.ibm.ws.exception.RuntimeError: com.ibm.ws.exception.RuntimeErro
r: Unable to start the CoordinatorComponentImpl

Then I found this page: http://www-01.ibm.com/support/docview.wss?uid=swg21293285

I tried to look for the RMM jar file, but failed; there is no such file under install_root.

Then I re-ran startNode.sh twice and, magically, the problem resolved itself. Now the nodeagent and app server both start up fine.

Weird!

 

remove mnt resource of vcs

May 3rd, 2011 No comments

Target: remove mnt_prd-ora-grdctl from group SG_dbPRD

Here is the steps:

haconf -makerw
hares -delete mnt_prd-ora-grdctl
haconf -dump -makero
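It's also worth checking the resource's links and state before the hares -delete above, since a resource with dependencies cannot be removed cleanly until it is unlinked (a sketch using standard VCS commands):

```shell
hares -dep mnt_prd-ora-grdctl      # list parent/child dependencies; unlink any first
hares -state mnt_prd-ora-grdctl    # confirm the resource is offline everywhere
```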

configure syslog to redirect WebSphere Message Broker (mq) messages to file

April 29th, 2011 No comments

On Linux® and UNIX® systems, all WebSphere® Message Broker messages (other than those generated by the command line utilities) are sent to the syslog, so it is useful to redirect user messages to a separate file.

On UNIX, syslog entries are restricted in length and messages that are sent to the syslog are truncated by the new line character. To record a large amount of data in a log on UNIX, set the Destination property on the Trace node to File or User Trace instead of Local Error Log.

Before you create a broker on Linux or UNIX systems, configure the syslog daemon to redirect user messages to a file called user.log:

Log on as root.
Enter the following commands to create a file called user.log.
On UNIX systems, enter the command:

touch /var/adm/user.log
chown root:mqbrkrs /var/adm/user.log
chmod 640 /var/adm/user.log

On Linux, enter the command:

touch /var/log/user.log
chown root:mqbrkrs /var/log/user.log
chmod 640 /var/log/user.log

Add the following line to the /etc/syslog.conf file (on later versions of SUSE Linux, this is /etc/syslog-ng.conf) to redirect debug level messages to the file user.log:
On UNIX systems, add the line:

user.info /var/adm/user.log

On Linux, add the line:

user.info /var/log/user.log

You can use user.* - instead of user.info in the preceding examples.
* means that information, notice, warning, and debug messages are caught.
- means that syslog does not synchronize the file after writing to it;
you might see a performance gain, but you can lose some data if the computer fails immediately after it has written to the file.
Add a line to ensure that messages at the required level from the user facility are recorded. If you specify a level of info, all operational messages are recorded; these messages provide important information about the operation of the broker, and can be useful in diagnosing problems.
Restart the syslog daemon.
On AIX®, enter the command:

refresh -s syslogd

On HP-UX and Solaris, enter the command:

kill -HUP `cat /etc/syslog.pid`

On Linux, enter the command:

/etc/init.d/syslogd restart

or

/etc/rc.d/init.d/syslogd restart

for systems where rc.d is not a soft link
For other syslog options, see the documentation for your operating system.
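The per-platform restart commands above can be folded into one small dispatcher. This is just a sketch (the function name syslog_restart_cmd is mine, not a real utility), and it deliberately prints the command for the detected platform rather than running it:

```shell
# Hypothetical helper: print the syslog restart command for a platform,
# mirroring the per-OS commands listed above. It echoes rather than
# executes, so you can review the command before running it as root.
syslog_restart_cmd() {
  case "$1" in
    AIX)         echo 'refresh -s syslogd' ;;
    HP-UX|SunOS) echo 'kill -HUP `cat /etc/syslog.pid`' ;;
    Linux)       echo '/etc/init.d/syslogd restart' ;;
    *)           echo 'see your OS documentation' ;;
  esac
}

syslog_restart_cmd "$(uname -s)"
```

Once you trust the output, run the printed command as root.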

About other websphere message broker documents, please refer to http://publib.boulder.ibm.com/infocenter/wmbhelp/v6r0m0/index.jsp

Categories: IT Architecture, Linux, Systems Tags: , ,

Resolved:solaris patch panic – cannot start the system after patch

April 27th, 2011 No comments

Here goes the whole story:

Step 1. Patch with PCA, then reboot -- -r

Rebooting with command: boot -r

Boot device: /pci@1c,600000/scsi@2/disk@0,0:a  File and args: -r

SunOS Release 5.10 Version Generic_142900-13 64-bit

Copyright 1983-2010 Sun Microsystems, Inc.  All rights reserved.

Use is subject to license terms.

WARNING: mod_load: cannot load module 'sharefs'

WARNING: Cannot mount /etc/dfs/sharetab

 

Hardware watchdog enabled

/kernel/drv/sparcv9/ip: undefined symbol 'ddi_get_lbolt64'

WARNING: mod_load: cannot load module 'ip'

/kernel/fs/sparcv9/sockfs: undefined symbol 'sock_comm_create_function'

/kernel/fs/sparcv9/sockfs: undefined symbol 'smod_lookup_byname'

/kernel/fs/sparcv9/sockfs: undefined symbol 'sctp_disconnect'

/kernel/fs/sparcv9/sockfs: undefined symbol 'sctp_getsockname'

/kernel/fs/sparcv9/sockfs: undefined symbol 'nd_free'

/kernel/fs/sparcv9/sockfs: undefined symbol 'nd_load'

/kernel/fs/sparcv9/sockfs: undefined symbol 'UDP_WR'

 

Step 2. zfs Roll back

ok>boot -F failsafe

#zfs rollback rpool/ROOT/sol10_sparc@pre_patched.142900-13_04.03.2011

 

Step 3. Patch with PCA again, then halt. The boot archive is not updated after patching, so we need to remove the old boot_archive and rebuild it:

ok>boot -F failsafe

# mv /a/platform/`uname -i`/boot_archive /a/root/b_back

# /a/sbin/bootadm update-archive -R /a

# reboot

 

Step 4. Server is patched

root@solaris01~# uname -a

SunOS solaris01 5.10 Generic_144488-06 sun4u sparc SUNW,Sun-Fire-V240

Step 5. Restore the Zone

root@solaris01~# zoneadm -z solaris01sub01 halt

root@solaris01~# zoneadm -z solaris01sub01 detach

root@solaris01~# zoneadm -z solaris01sub01 attach -u

zoneadm: zone 'solaris01sub01': ERROR: attempt to downgrade package SUNWcsu, the source had patch 142053-03 which is not installed on this system

# SUNWcsu is the core package; we can't live without its upgrade

 

root@solaris01~# zoneadm list -civ

ID NAME             STATUS     PATH                           BRAND    IP

0 global           running    /                              native   shared

- solaris01sub01   configured /export/zones/solaris01sub01   native   shared

 

root@solaris01~# showrev -a|grep 142053-03

Patch: 142053-03 Obsoletes: 142485-01 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsu

Patch: 143954-04 Obsoletes: 142053-03, 142485-01, 142487-01, 142535-03 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsr

 

root@solaris01~# zoneadm -z solaris01sub01 attach -u -F

root@solaris01~# zoneadm -z solaris01sub01 boot

 

root@solaris01~# zlogin solaris01sub01

Last login: Sat Mar 26 05:15:27 on pts/2

Sun Microsystems Inc.   SunOS 5.10    Generic   January 2005

# The zone doesn't work

root@solaris01sub01/# svcs

STATE          STIME    FMRI

online          5:40:24 svc:/system/svc/restarter:default

...

offline         5:40:24 svc:/system/sysidtool:net

offline         5:40:25 svc:/network/nfs/status:default

...

offline         5:40:28 svc:/system/boot-archive-update:default

uninitialized   5:40:27 svc:/application/x11/xfs:default

...

uninitialized   5:40:28 svc:/network/rpc-100235_1/rpc_ticotsord:default

 

root@solaris01sub01/# zfs list

internal error: Unknown error

Abort (core dumped)

root@solaris01sub01/# logout

 

root@solaris01~# zoneadm list -civ

ID NAME             STATUS     PATH                           BRAND    IP

0 global           running    /                              native   shared

- solaris01sub01   configured /export/zones/solaris01sub01   native   shared

 

root@solaris01~# zfs rollback rpool/export/zones/solaris01sub01@id24_pre_AMpatch12

 

root@solaris01sub01/# showrev -a|grep  142053-03

Patch: 142053-03 Obsoletes: 142485-01 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsu

 

root@solaris01~# zlogin solaris01sub01

Last login: Sat Mar 26 05:55:32 on pts/2

Sun Microsystems Inc.   SunOS 5.10    Generic   January 2005

 

root@solaris01sub01/# unzip -q 143954-04.zip

root@solaris01sub01/# patchadd 143954-04

Validating patches...

 

Global patches.

 

0 Patch 143954-04 is for global zone only - cannot be installed on non-global zone.

 

No patches to install.

 

root@solaris01/datastore/Explorers/patches# showrev -a|grep 142053-03

Patch: 142053-03 Obsoletes: 142485-01 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsu

Patch: 143954-04 Obsoletes: 142053-03, 142485-01, 142487-01, 142535-03 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsr

 

root@solaris01/datastore/Explorers/patches# unzip -q 143954-04.zip

root@solaris01/datastore/Explorers/patches# patchadd 143954-04

Validating patches...

....

Patch packages installed:

SUNWcslr

SUNWcsr

SUNWcsu

 

Done!

root@solaris01/datastore/Explorers/patches# zlogin solaris01sub01

[Connected to zone 'solaris01sub01' pts/2]

Last login: Sat Mar 26 05:58:17 on pts/2

Sun Microsystems Inc.   SunOS 5.10    Generic   January 2005

 

# Now we have the new package

root@solaris01sub01/# showrev -a|grep  142053-03

Patch: 142053-03 Obsoletes: 142485-01 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsu

Patch: 143954-04 Obsoletes: 142053-03, 142485-01, 142487-01, 142535-03 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsr, SUNWcsu

 

root@solaris01/datastore/Explorers/patches# zoneadm -z solaris01sub01 halt

root@solaris01/datastore/Explorers/patches# zoneadm -z solaris01sub01 detach

 

# the upgrade goes very smoothly

root@solaris01/datastore/Explorers/patches# zoneadm -z solaris01sub01 attach -u

Getting the list of files to remove

Removing 481 files

Remove 11 of 11 packages

Installing 11639 files

Add 190 of 190 packages

Updating editable files

The file </var/sadm/system/logs/update_log> within the zone contains a log of the zone update.

root@solaris01/datastore/Explorers/patches# zoneadm list -civ

ID NAME           STATUS     PATH                           BRAND    IP

0 global           running    /                              native   shared

- solaris01sub01   installed  /export/zones/solaris01sub01   native   shared

root@solaris01/datastore/Explorers/patches# zoneadm -z solaris01sub01 boot

root@solaris01~# zoneadm list -civ

ID NAME             STATUS     PATH                        BRAND    IP

0 global           running    /                              native   shared

31 solaris01sub01   running    /export/zones/solaris01sub01   native   shared

 

root@solaris01~# zlogin solaris01sub01

svc:/system/sysidtool:net failed.

root@solaris01~# zlogin -C solaris01sub01

# Choose Term Type 3) VT100; then everything works.

Steps to create new volume of VxVM under solaris

April 27th, 2011 1 comment

1.Take snapshot of running processes, filesystem partitions, network connections:
#/usr/ucb/ps aauuxxww>/running_processes.2011.04.25
#df -k>/filesystem.2011.04.25
#netstat -rnv>/networking.2011.04.25

2.Check free space of dg(disk group), sector position of dm(disk media), and subdisk allocation:
#vxprint

3.Let's see the lengths and offsets of all subdisks under plexes:

vxprint -st

You can also use iostat -En to list the physical disks.

Check where free sectors lie on the disks:

#vxdg -g abinitio free

#vxassist -g abinitio maxsize

4.Create v(volume):
vxassist -g abinitio make ora2 5g

After this, you can see in vxprint:
v  ora2         fsgen        ENABLED  10485760 -        ACTIVE   -       -
pl ora2-01      ora2         ENABLED  10487040 -        ACTIVE   -       -
sd disk1-14     ora2-01      ENABLED  10487040 0        -        -       -

And,
#fstyp /dev/vx/dsk/abinitio/ora2
Unknown_fstyp (no matches)

Let's create filesystem for volume ora2:

Firstly, let's check the block size of other filesystem on the server:

#fstyp -v /dev/vx/rdsk/abinitio/F41|grep bsize
bsize  1024 size  10240 dsize  0  ninode 10240  nau 0
defiextsize 0  ilbsize 0  immedlen 96  ndaddr 10

It's 1024 Bytes.

Then, let's create the filesystem for volume ora2:

#mkfs -F vxfs -o bsize=1024,largefiles /dev/vx/rdsk/abinitio/ora2
version 6 layout
10485760 sectors, 5242880 blocks of size 1024, log size 16384 blocks
largefiles supported
After this, let's check it:
# fstyp /dev/vx/rdsk/abinitio/ora2
vxfs
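As a sanity check on the numbers above, the arithmetic works out like this (a 5 GB volume counted in 512-byte sectors, carved by mkfs into 1024-byte blocks):

```shell
# 5 GB = 5 * 1024 * 1024 KB, and each KB is two 512-byte sectors;
# mkfs groups those sectors into 1024-byte blocks.
sectors=$((5 * 1024 * 1024 * 2))
blocks=$((sectors * 512 / 1024))
echo "$sectors sectors, $blocks blocks"   # matches the mkfs output above
```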

5.Mount a File System:
#mkdir /ora2
#mount -F vxfs -o largefiles /dev/vx/dsk/abinitio/ora2 /ora2

Now, cd to /ora2 and create a file to test it:
# cd /ora2/
# touch aa
# ls -l
total 0
-rw-r--r--   1 root     other          0 Apr 25 02:42 aa
drwxr-xr-x   2 root     root          96 Apr 25 02:40 lost+found

If everything is OK, add an entry to /etc/vfstab:
/dev/vx/dsk/abinitio/ora2       /dev/vx/rdsk/abinitio/ora2 /ora2        vxfs    2       yes     largefiles

Categories: Hardware, Storage Tags:

Difference between dm, dmp, rdmp, v, pl, sd, lun, c, t, d, s

April 24th, 2011 No comments

dm - disk media

dmp - dynamic multipathing; a dmp node is the metadevice that represents all I/O paths to a single LUN (the block device nodes live under /dev/vx/dmp)

rdmp - the raw (character) device node for the same dmp metadevice, found under /dev/vx/rdmp; commands that need unbuffered access to the device (such as mkfs and fstyp) use this path

v - volume

pl - plex

sd - subdisk

lun - logical unit number

NOTE:

Disks (dm) must be added to a disk group (dg) under Veritas. A volume contains one or more plexes which in turn contain one or more subdisks.

And again, for the famous c, t, d, s issue:

From the host's perspective, the SCSI LUN is only part of the full SCSI address. The full device address is made up of:

  • c-part: controller ID of the host bus adapter,
  • t-part: target ID identifying the SCSI target on that bus,
  • d-part: disk ID identifying a LUN on that target,
  • s-part: slice ID identifying a specific slice on that disk.
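For a concrete Solaris example, a device name like c4t0d0s2 can be split back into those four parts with sed (parse_ctds is an illustrative name, not a real system command):

```shell
# Split a Solaris device name (cXtYdZsN) into its four address parts.
parse_ctds() {
  echo "$1" | sed -n 's/^c\([0-9][0-9]*\)t\([0-9][0-9]*\)d\([0-9][0-9]*\)s\([0-9][0-9]*\)$/controller=\1 target=\2 disk=\3 slice=\4/p'
}

parse_ctds c4t0d0s2   # -> controller=4 target=0 disk=0 slice=2
```

Anything that does not match the cXtYdZsN pattern produces no output.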

 

Resolved:[Load Manager Shared Memory]. Error is [28]: [No space left on device](for apache, pmserver etc. running on linux, solaris, unix)

April 23rd, 2011 No comments

This error may occur in pmserver, apache, oracle, rsync, up2date and many other services running on Linux, Solaris, and other UNIX systems, so it's a widespread and well-known problem; just search Google for the keyword "[Load Manager Shared Memory].  Error is [28]: [No space left on device]".
Now, let's take pmserver running on Solaris 10 as an example and walk through, step by step, how to solve this annoying problem.
Firstly, from "[No space left on device]" and "Load Manager Shared Memory", our first guess was a shortage of memory, but checking shows there is plenty of memory to allocate:

1.check the total memory size:

# /usr/sbin/prtconf |grep -i mem
Memory size: 32640 Megabytes
memory (driver not attached)
virtual-memory (driver not attached)

2.check application project memory size:

# su - sbintprd #as you have guessed, pmserver is running by user sbintprd in the box
$ id -p
uid=71269(sbintprd) gid=70772(sbintprd) projid=3(default)

This means that pmserver is running inside 'default' project. Then let's check the setting of "default" project:

# projects -l default
default
projid : 3
comment: ""
users  : (none)
groups : (none)
attribs: project.max-msg-ids=(privileged,256,deny)
project.max-shm-memory=(privileged,17179869184,deny)

# prctl -n project.max-shm-memory -i project default
project: 3: default
NAME    PRIVILEGE       VALUE    FLAG   ACTION                       RECIPIENT
project.max-shm-memory
privileged      16.0GB      -   deny                                 -
system          16.0EB    max   deny                                 -

16GB is available to the 'default' project. So where does the "shortage of memory" come from?

Let's bump up the max-shm-memory size by 2 GB to see what happens:

#prctl -n project.max-shm-memory -r -v 18gb -i project default

After this, we tried to bounce the pmserver, but the problem is still there:

#tail -f pmserver.log
INFO : LM_36070 [Fri Apr 22 22:19:42 2011] : (25218|1) The server is running on a host with 32 logical processors.
INFO : LM_36039 [Fri Apr 22 22:19:42 2011] : (25218|1) The maximum number of sessions that can run simultaneously is [10].
FATAL ERROR : CMN_1011 [Fri Apr 22 22:19:42 2011] : (25218|1) Error allocating system shared memory of [2000000] bytes for [Load Manager Shared Memory].  Error is [28]: [No space left on device]
FATAL ERROR : SF_34004 [Fri Apr 22 22:19:42 2011] : (25218|1) Server initialization failed.
INFO : SF_34014 [Fri Apr 22 22:19:42 2011] : (25218|1) Server shut down.

OK, then, we should think in other ways.

As we know, UNIX systems use System V shared memory segments for interprocess communication. We can use ipcs to check the information about active shared memory segments:

# ipcs -m|grep sbintprd
m  671088691   0          --rw------- sbintprd sbintprd

#ipcs -mA|grep sbintprd|wc -l
92

And each of them uses 2000000 bytes (the SEGSZ column):

IPC status from <running system> as of Sat Apr 23 03:29:51 BST 2011
T         ID      KEY        MODE        OWNER    GROUP  CREATOR   CGROUP NATTCH      SEGSZ  CPID  LPID   ATIME    DTIME    CTIME  ISMATTCH         PROJECT
Shared Memory:
m  671088691   0          --rw------- sbintprd sbintprd sbintprd sbintprd      1    2000000  7781 16109  3:28:35  3:28:50  2:01:43        0         default

Now we can conclude that the sbintprd user has over-allocated segments and is not freeing them up. So let's clear the shared memory:

#for i in `ipcs -m | grep prd | awk '{print $2}'`; do ipcrm -m $i; done
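Note that grepping for "prd" matches that string anywhere in the line; a stricter variant keys on the owner column instead. Here is a sketch (shm_ids_for_user is a made-up helper, and the sample file below stands in for live ipcs -m output):

```shell
# Extract shared-memory segment IDs owned by one user from saved
# `ipcs -m` output, matching the OWNER column exactly rather than
# grepping for a substring anywhere in the line.
shm_ids_for_user() {
  awk -v u="$2" '$1 == "m" && $5 == u { print $2 }' "$1"
}

# Sample standing in for `ipcs -m` output:
cat > /tmp/ipcs.sample <<'EOF'
m  671088691   0          --rw------- sbintprd sbintprd
m  671088692   0          --rw------- oracle   dba
EOF

shm_ids_for_user /tmp/ipcs.sample sbintprd   # -> 671088691
# Live use (destructive, removes the segments!):
#   for i in `ipcs -m | awk '$1 == "m" && $5 == "sbintprd" {print $2}'`; do ipcrm -m $i; done
```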

After this step, the pmserver started successfully. From the log we can see:

INFO : LM_36070 [Sat Apr 23 01:51:17 2011]
: (5979|1) The server is running on a host with 32 logical processors.
INFO : LM_36039 [Sat Apr 23 01:51:18 2011] : (5979|1) The maximum number of sessions that
can run simultaneously is [10].
INFO : CMN_1010 [Sat Apr 23 01:51:18 2011] : (5979|1) Allocated system shared memory [id =
469762275] of [2000000] bytes for [Load Manager Shared Memory].
INFO : LM_36095 [Sat Apr 23 01:51:50 2011] : (5979|1) Persistent session cache file
cleanup is scheduled to run on [Sun Apr 24 01:51:50 2011].
INFO : SF_34003 [Sat Apr 23 01:51:50 2011] : (5979|1) Server initialization completed.

Problem resolved!

The result of rm -rf / on linux(debian)

April 20th, 2011 No comments

210169:/# rm -rf /
rm: cannot remove root directory `/'

Haven't tested other distributions, maybe you can have a try? :D

Install curl utility on solaris

April 19th, 2011 1 comment

Firstly, download the curl source package from sunfreeware.com, unpack the tarball, and execute ./configure:
# ./configure
checking whether to enable maintainer-specific portions of Makefiles... no
checking whether to enable debug build options... no
checking whether to enable compiler optimizer... (assumed) yes
checking whether to enable strict compiler warnings... no
checking whether to enable compiler warnings as errors... no
checking whether to enable curl debug memory tracking... no
checking whether to enable c-ares for DNS lookups... no
checking for sed... /usr/bin/sed
checking for grep... /usr/bin/grep
checking for egrep... /usr/bin/egrep
checking for ar... /usr/local/bin/ar
checking for a BSD-compatible install... ./install-sh -c
checking whether build environment is sane... configure: error: newly created file is older than distributed files!
Check your system clock

This was because the system time was off and the timestamps in the source code were in the future. One workaround is to copy the files to another directory with cp, which resets the timestamps:

#cp -r curl-7.21.4 /

Then try to compile again.

After this, run make and make install. If you come across the problem:

configure: error: ar not found in PATH. Cannot continue without ar.

That's because you haven't installed GNU_Binutils on your solaris system.

You should download binutils package from sunfreeware.com and install it:

#gzip -d binutils-2.21-sol10-x86-local.gz
#pkgadd -d binutils-2.21-sol10-x86-local

After this step, if you still receive the error:
configure: error: no acceptable C compiler found in $PATH

and gcc is installed (check with find / -name gcc), then it's a problem with the PATH environment variable:
#export PATH=$PATH:/usr/sfw/bin  #you can of course make the change permanent by editing /etc/profile
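If /etc/profile gets sourced repeatedly, a small guard keeps PATH from growing with duplicates on every login. This is a sketch; append_path is a hypothetical helper, not part of Solaris:

```shell
# Append a directory to PATH only if it is not already present.
append_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                          # already in PATH, do nothing
    *) PATH="$PATH:$1"; export PATH ;;
  esac
}

append_path /usr/sfw/bin
append_path /usr/sfw/bin    # second call is a no-op
```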
Then,

#make
#make install
Finished!

Let's have a test of curl:
# which curl
/usr/local/bin/curl

# curl -I -L www.doxer.org
HTTP/1.1 200 OK
Date: Mon, 18 Apr 2011 01:36:14 GMT
Server: Apache/2.2.17 (Unix) mod_ssl/2.2.17 OpenSSL/0.9.8e-fips-rhel5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
X-Powered-By: PHP/5.2.9
X-Pingback: http://www.doxer.org/xmlrpc.php
Content-Type: text/html; charset=UTF-8
Connection: Keep-Alive

Using zpool clear to resolve zpool “cannot open” issue

April 17th, 2011 2 comments

I created a zfs pool named tank on one removable disk, and it worked fine before I rebooted.

Because I unplugged the disk before rebooting, zpool list returned FAULTED in the HEALTH column:

-bash-3.00# zpool list
NAME   SIZE  ALLOC   FREE    CAP  HEALTH  ALTROOT
tank      -      -      -      -  FAULTED  -

Using zpool status -x to check:

-bash-3.00# zpool status -x
pool: tank
state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
see: http://www.sun.com/msg/ZFS-8000-3C
scrub: none requested
config:

NAME        STATE     READ WRITE CKSUM
tank        UNAVAIL      0     0     0  insufficient replicas
c4t0d0    UNAVAIL      0     0     0  cannot open

c4t0d0 is the name of the removable disk. Using iostat -En, I can see that the disk is not connected to the system:

-bash-3.00# iostat -En
c0d1             Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: ST31000524AS    Revision:  Serial No:             9VP Size: 1000.20GB <1000202305536 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0
c1t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SAMSUNG  Product: CDRW/DVD SM-348B Revision: T509 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 2 Predictive Failure Analysis: 0

c0d1 is the hard disk, and c1t0d0 is SAMSUNG DVD.

Now, after plugging in the removable disk, running iostat -En again shows one more entry:

c4t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: Generic  Product: USB Disk         Revision: 9.02 Serial No:
Size: 20.00GB <20003880960 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 3 Predictive Failure Analysis: 0

Now that I've plugged in the removable disk that comprised the zfs pool, I can use zpool clear to let ZFS recheck its own status:

-bash-3.00# zpool clear tank

After this, using zpool status -x, we can see the pool is in healthy status:

-bash-3.00# zpool status -x
all pools are healthy

After getting zpool back to work, let's check the status of zfs:

-bash-3.00# zfs list

NAME                  USED  AVAIL  REFER  MOUNTPOINT
tank                  254K  18.2G    22K  /tank
tank/home             150K  18.2G    23K  /tank/home
tank/home/firsttry    106K  18.2G   106K  /tank/home/firsttry
tank/home/secondtry    21K  18.2G    21K  /export/secondtry

When I tried to cd to /tank/home, it prompted:

-bash-3.00# cd /tank/home
-bash: cd: /tank/home: No such file or directory

That's because the zfs filesystems haven't been mounted yet, so let's mount them all:

-bash-3.00# zfs mount -a

Ok, now we can cd to that directory:

-bash-3.00# cd /tank/home/
-bash-3.00# pwd
/tank/home

 

UNIX Printers not working status solution

April 17th, 2011 No comments

Sometimes, unix printers do not work due to faults in the print service. Usually you have to clean the queue and restart the printer spooler service.

For an example see below:

From the printer-hosting system, check the status of unix printers(all status information):

root@localhost# lpstat -t

printer PRINTER_NAME faulted printing PRINTER_NAME-27847. enabled since Feb 28 10:07 2009. available.

PRINTER_NAME: processing

root@localhost# ping PRINTER_NAME

PRINTER_NAME is alive

root@localhost# lpstat -p PRINTER_NAME #check the status of the PRINTER_NAME unix printer

printer PRINTER_NAME now printing PRINTER_NAME-27847. enabled since Tue Feb 24 09:35:32 2009. available.

root@localhost# lpc clean PRINTER_NAME

clean: PRINTER_NAME: ok-substitution

root@localhost# lpc restart PRINTER_NAME #start a new printer daemon

root@localhost# lpstat -p PRINTER_NAME

printer PRINTER_NAME idle. enabled since Sat Feb 28 10:10:36 2009. available.

root@localhost# lpstat -t | grep PRINTER_NAME

device for PRINTER_NAME: /dev/PRINTER_NAME

PRINTER_NAME accepting requests since Fri Jun 06 09:04:18 2008

printer PRINTER_NAME idle. enabled since Sat Feb 28 10:10:36 2009. available.

# All seems ok so print a test page

root@localhost# echo 'Hello World' | lp -d PRINTER_NAME

request id is PRINTER_NAME-27862 (standard input)

root@localhost# lpstat -p PRINTER_NAME

printer PRINTER_NAME idle. enabled since Sat Feb 28 10:10:36 2009. available.
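To script the check instead of eyeballing lpstat output, the status word can be pulled out with awk. This is only a sketch: printer_status is an illustrative name, and the sample file below stands in for live lpstat -p output:

```shell
# Extract the one-word status ("idle", "faulted", ...) for a printer
# from saved `lpstat -p` output.
printer_status() {
  awk -v p="$2" '$1 == "printer" && $2 == p { gsub(/\.$/, "", $3); print $3 }' "$1"
}

# Sample standing in for `lpstat -p` output:
cat > /tmp/lpstat.sample <<'EOF'
printer PRINTER_NAME idle. enabled since Sat Feb 28 10:10:36 2009. available.
EOF

printer_status /tmp/lpstat.sample PRINTER_NAME   # -> idle
```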

Categories: IT Architecture, Systems, Unix Tags:

openview Error getting OvCore ID solution

April 17th, 2011 No comments

In order to resolve the issue I had to do the following:

/opt/OV/bin/opcagt -stop
/opt/OV/bin/opcagt -kill
./ovcoreid -version
./ovcoreid -create
./ovcert -check
./ovcert -list
./ovcert -remove 62647e9a-b574-753b-023c-c23175d68bad
./ovcert -remove CA_1a50fab8-f092-750d-1d73-ff3268150d67
./ovcert -remove CA_3f49eabc-9514-752f-15c8-ff1a32bb2ec3
./ovcert -remove CA_ebc2af64-3bbc-750d-0528-c9520bf60692
/opt/OV/bin/OpC/install/opcactivate -srv test-cert_srv test

Then got error:

(xpl-273) Error occurred when loading configuration file '/var/opt/OV/conf/xpl/config/local_settings.ini'.
(xpl-272) Syntax error in line 0.
(xpl-95) Conversion to UTF16 failed.
(xpl-278) Processing update jobs skipped.

Then, do the following steps:

cd /var/opt/OV/conf/xpl/config/
mv local_settings.ini local_settings.ini_corrupted
/opt/OV/bin/OpC/install/opcactivate -srv test-cert_srv test
/opt/OV/bin/ovcert -certreq

bash-3.00# /opt/OV/bin/ovcert -list
+---------------------------------------------------------+
| Keystore Content |
+---------------------------------------------------------+
| Certificates: |
+---------------------------------------------------------+
| Trusted Certificates: |
+---------------------------------------------------------+
bash-3.00# /opt/OV/bin/ovc -status
ovcd OV Control CORE (17140) Running
ovbbccb OV Communication Broker CORE (17141) Running

Categories: Clouding, HA & HPC, IT Architecture Tags:

About swap file & swap space create/delete

April 17th, 2011 No comments

Best practice: The size of SWAP is usually set to twice the size of memory, but should not exceed 2G.

There are two ways to add space to SWAP:

Add swap partition using fdisk
Create swap file using mkswap

Let’s go through one by one.
1. Add swap partition using fdisk:
1)create a swap partition:
fdisk /dev/cciss/c0d0 (use m for help, p to print the table, n for a new partition, t to set type 82, w to write) (result: /dev/cciss/c0d0p6)
2)Format the partition:
mkswap -c -v1 /dev/cciss/c0d0p6
3)Modify /etc/fstab, add a line:
/dev/cciss/c0d0p6 swap swap defaults 0 0
4)Activate swap partition:
swapon /dev/cciss/c0d0p6 (or swapon -a to activate everything listed in /etc/fstab)
5) Check the result:
swapon -s OR free -m OR cat /proc/swaps
2.Create swap file using mkswap
1)dd if=/dev/zero of=/tmp/tmp.swap bs=1M count=100 (make a file of 100 MB)
2)mkswap /tmp/tmp.swap(set up a Linux swap area)
3)swapon /tmp/tmp.swap (activate the Linux swap area)
4)Modify /etc/fstab, add a line below:
/tmp/tmp.swap swap swap defaults 0 0
5)Check the result:
swapon -s OR free -m OR cat /proc/swaps
3.Delete swap area or swap file
1) swapoff /dev/cciss/c0d0p6 (or swapoff /tmp/tmp.swap for the swap file)
2) modify /etc/fstab and comment out the line you just added.
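The sizing rule at the top (twice the RAM, capped at 2G) is easy to script; swap_size_mb is just a sketch of that rule, not a standard utility:

```shell
# Apply the best-practice rule: swap = 2 * RAM, but never more than 2048 MB.
# Argument is RAM size in MB.
swap_size_mb() {
  size=$(( $1 * 2 ))
  if [ "$size" -gt 2048 ]; then size=2048; fi
  echo "$size"
}

swap_size_mb 512    # -> 1024
swap_size_mb 4096   # -> 2048
```

The result could feed straight into the dd count= argument when creating a swap file.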

NB:

If you want to add swap space on solaris zfs, please refer to: Extending and Adding SWAP space on solaris zfs and linux

Categories: IT Architecture, Linux, Systems, Unix Tags:

Symantec products reference manual and docs – AWESOME

April 17th, 2011 No comments

Please refer to:

https://sort.symantec.com/documents

Just find your destination and click on it (for example, symantec netbackup, symantec veritas, symantec filestore, symantec Veritas Cluster Server, etc.), and a PDF reference manual will be there for you. AWESOME, isn't it? :D

Categories: Hardware, Storage Tags:

EMC BCV database backup howto

April 17th, 2011 4 comments

What is a BCV ?
A BCV (Business Continuity Volume) is a snapshot of a set of EMC Symmetrix logical disk devices (luns).
The luns are grouped together into logical units known as symdgs and it is at the symdg level that bcv establishes work. This snapshot is taken at a block level.
When a snapshot (known as a bcv sync) is taken it can be either a full establish or an incremental establish.
• A full establish synchronizes all blocks across the devices.
• An incremental establish only synchronizes those blocks that have been changed since the last establish.

As the BCV is an exact replica of the ‘live’ data it can be accessed in the same way that the original data is (eg if the original data is a VxVM diskgroup, the BCV can also be imported as a vxdg on a different host)

The BCV does not necessarily have to be established to devices in the same Symmetrix Array as the live data. A bcv established to devices in a second array is known as a remote bcv (rbcv)

Essentially, BCVs are just mirrors of storage. Each BCV is a separate device that can be either attached or ‘split’. When BCV is attached to the standard volume, it is first synchronised with the master copy, and then all the changes are applied to both the master and BCV.
Fully mirrored drive sets


When you split the BCV, the changes are no longer propagated to it and the device can be imported on any other server connected to that storage array. You can then perform any operations with recent production data, like a backup, without impacting production performance.
The split-mirror set after the splitting process


The usual steps each backup consists of are:

1. Unmount and deport the BCVs from the hosts
2. Synchronisation of either local or remote symmetrix disk groups
3. Put database(s) into hot backup mode
4. Split the data symmetrix disk groups
5. Take the database(s) out of hot backup mode
6. Split the archivelog symmetrix diskgroup(s)
7. If required then mount updated BCV on server and backup to tape
8. For customer-managed live BCV splits there is also a replication job

These steps are divided into three jobs: step 1 is one job, steps 2 to 6 are another, and steps 7 and 8 are the third. The first and last jobs run on the NetBackup server; the middle job runs on the host where the database lives.

Investigation strategies
The best thing to do is to look at the log to see why the job failed. Then check the failure conditions; if they are no longer true, re-run the job. Investigate why the failure conditions appeared, try to correct them, and re-run the job.

Useful notes
1.Sometimes, BCV sync can be slow. This usually happens if a lot of data has changed or lots of new space was added, since the sync is a sector copy. The script should handle it gracefully as long as any progress in synchronisation is made, but sometimes the timeout set in the main script may still not be enough.
2.RMAN Running
If an RMAN backup job is running it is not possible to put the database into hot backup mode, so the job must fail. The job can be resubmitted once the RMAN job has completed.
The steps to fix this issue are:
1) Verify if the bcv is local or remote
2) Wait until all the symdgs established in the script have completed syncing
3) Split the symdgs
4) Verify the RMAN job has completed
5)Rerun the backup

Categories: Hardware, Storage Tags: