Archive

Author Archive

Nagios Check_nrpe/check_disk with error message DISK CRITICAL – /apps/yourxxx is not accessible: Permission denied

June 21st, 2011 1 comment

This usually happens because the user that nagios runs as has no read permission on the file system that check_disk is checking.

For example, suppose you received this alert:

DISK CRITICAL - /apps/vcc/logs/way is not accessible: Permission denied

You can then log on to your server as root and run:

/usr/local/nagios/libexec/check_disk -p /apps/vcc/logs/way

You may see:

DISK OK - free space: /apps/vcc/logs/way 823 MB (21% inode=90%);| /=2938MB;;3966;0;3966

But when you run the same command as the nagios user, you may see DISK CRITICAL again.

Resolution:

Grant the nagios user read permission on the filesystem/directory that had the problem.
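The failing stat() can be reproduced without Nagios. Below is a minimal sketch; check_readable is a hypothetical helper that mirrors the access check check_disk performs (the directory must exist and be readable and traversable by the plugin's user):

```shell
#!/bin/sh
# check_readable: succeed only if the directory can be entered and read,
# mirroring the stat() that check_disk performs on the mount point.
check_readable() {
    d=$1
    [ -d "$d" ] && [ -r "$d" ] && [ -x "$d" ]
}

# /tmp is used as a stand-in; substitute the mount point from the alert.
if check_readable /tmp; then
    echo "OK"
else
    echo "CRITICAL: not accessible"
fi
```

If this fails for the nagios user but works for root, grant o+rx (or appropriate group access) on each directory in the path.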

 

DNS flush on linux/windows/mac

June 17th, 2011 No comments
Here are the ways to flush the DNS cache on linux/windows/mac:

On Linux:

rndc flush

or

/etc/init.d/nscd restart

On Windows:

ipconfig /flushdns

On Mac:

sudo dscacheutil -flushcache

If you have Firefox installed, you can also install a DNS cache plug-in to flush the DNS cache for you automatically.
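The per-platform commands can be wrapped in a small dispatcher. flush_dns_cmd below is a hypothetical helper that only prints the appropriate command for the detected platform (the CYGWIN/MINGW branch is an approximation for Windows shells):

```shell
#!/bin/sh
# flush_dns_cmd: print the DNS-flush command for the current platform.
flush_dns_cmd() {
    case "$(uname -s)" in
        Linux)  echo "rndc flush" ;;                  # or: /etc/init.d/nscd restart
        Darwin) echo "sudo dscacheutil -flushcache" ;;
        CYGWIN*|MINGW*) echo "ipconfig /flushdns" ;;
        *)      echo "unknown platform" >&2; return 1 ;;
    esac
}

flush_dns_cmd
```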

     

    Categories: IT Architecture, Linux, Systems Tags:

    How to resize/shrink lvm home/root partition/filesystem dynamically

    June 8th, 2011 2 comments

When you're trying to extend the root file system in an LVM environment, here are the steps:

1. Extend the logical volume:

#lvextend -L500G /dev/VolGroup00/apps-yourapp

2. Grow the file system:

#resize2fs /dev/VolGroup00/apps-yourapp 500G

    Shrinking can’t be done that easily and requires an umount. To shrink the FS, do the following:

    umount <logical-volume-device>
    e2fsck -f <logical-volume-device>
    resize2fs <logical-volume-device> <new-size-minus-50MB>
    lvreduce -L <new-size> <logical-volume-device>
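The shrink sequence above can be sketched as a script. The volume name and sizes below are examples only, and the destructive commands are left commented out; the point is the 50MB safety margin, computed explicitly so the filesystem never ends up larger than the shrunken volume:

```shell
#!/bin/sh
# Sketch of the shrink sequence (hypothetical LV name and sizes).
LV=/dev/VolGroup00/apps-yourapp   # example volume from this post
NEW_MB=10240                      # desired new volume size: 10 GB
FS_MB=$((NEW_MB - 50))            # filesystem target: new size minus ~50MB

echo "shrink fs to ${FS_MB}M, then volume to ${NEW_MB}M"
# umount "$LV"
# e2fsck -f "$LV"
# resize2fs "$LV" "${FS_MB}M"
# lvreduce -L "${NEW_MB}M" "$LV"
```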

This procedure is for normal (non-root) file systems. But what should we do when we want to shrink the root/home partition/file system?

You need to boot into Linux rescue mode from a bootable device (such as the first CD of your Linux distro): type linux rescue at the boot prompt, and when a dialog asks whether to mount your system under /mnt/sysimage, select Skip.

    When you're in the shell of rescue mode, do the following steps:

    1.lvm vgchange -a y

    2.e2fsck -f /dev/VolGroup00/LogVol00

    3.resize2fs -f /dev/VolGroup00/LogVol00 500G

    4.lvm lvreduce -L500G /dev/VolGroup00/LogVol00

Then reboot your system. After it starts up, run df -h to check the result, and use vgs to see VSize and VFree.

    Enjoy.

     

     

    Interesting tty/pty/pts experiment

    June 8th, 2011 No comments

Now, open two tabs to the same server using Xshell, and in each tab (session), run tty:

    root@testserver# tty

    /dev/pts/7

    root@testserver# tty

    /dev/pts/25

First experiment - echo a string between ttys

    From /dev/pts/7, run:

root@testserver# echo hi > /dev/pts/25

Then, in the other tab (session), you will see:

root@testserver# hi

    To be continued...

    Grow/shrink vxfs file system(or volume) dynamically using vxresize

    June 5th, 2011 No comments

I've already used vxassist to create a disk group, make a file system, and mount it to the OS, and that partition is now in use. How can I grow/shrink the size of a VxVM filesystem dynamically?

    Three steps:

    1.Which disk group does the file system use?

In my scenario, /user is the mount point of volume user_vol, and that volume belongs to the andy_li disk group. Since I created them, I already know this. But in general, how can you tell which disk group a file system/volume belongs to?

    #df -h /user

    /dev/vx/dsk/andy_li/user_vol                      1.0G   18M  944M   2% /user

Now you can see that the /user file system belongs to the andy_li disk group.

2. Now let's check how much free space is left in the disk group that we can use for growing /user:

    #vxdg -g andy_li free

    GROUP        DISK         DEVICE       TAG          OFFSET    LENGTH    FLAGS

    andy_li      andy_li01    sdb          sdb          3121152   999168 -

999168 blocks (512-byte sectors), which is just under 500MB.
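The LENGTH column from vxdg free is in 512-byte sectors, so the conversion to MB is simple shell arithmetic:

```shell
#!/bin/sh
# Convert the LENGTH column (512-byte sectors) from `vxdg free` to MB.
blocks=999168
mb=$((blocks * 512 / 1024 / 1024))
echo "${blocks} blocks = ${mb} MB"
# prints: 999168 blocks = 487 MB
```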

    3. The last and most important thing is to grow the file system:

    /etc/vx/bin/vxresize -b -g andy_li user_vol +999168 alloc=andy_li01

    Ok, after this operation, let's check the file system's size again:

    #df -h /user/

    Filesystem            Size  Used Avail Use% Mounted on

    /dev/vx/dsk/andy_li/user_vol                      1.5G 18M  1.4G   2% /user

That's all. Conversely, you can use a minus (-) instead of a plus (+) to shrink the file system.

    Categories: Hardware, Storage Tags:

    Symantec Netbackup 7.1 reference manual(commands) download

    May 30th, 2011 No comments

    Extend filesystem in vxvm which connects to SAN fibre channel storage

    May 28th, 2011 No comments

    Firstly, please refer to http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Online_Storage_Reconfiguration_Guide/scanning-storage-interconnects.html for some pre-checking, like memory usage, sync etc.
Then, scan for the new disk on each HBA one by one (via issue_lip; scenario: fabric on Linux with Emulex HBAs), and check that ALL dmpnodes are in the ENABLED state before moving on to the next HBA. This is because scanning for new disks is expected to temporarily disable paths on that controller, so move on only when the paths are confirmed enabled again. Throughout the procedure, tail -f /var/log/messages will help.

1) Checked the messages file for any existing issues. Identified and eliminated concerns about I/O error messages which had occurred for some time:
    May 29 04:11:05 testserver01 kernel: end_request: I/O error, dev sdft, sector 0^M
    May 29 04:11:05 testserver01 kernel: Buffer I/O error on device sdft, logical block 0^M
    May 29 04:11:05 testserver01 kernel: end_request: I/O error, dev sdft, sector 0^M
    /var/log/messages.1:May 27 22:48:04 testserver01 kernel: end_request: I/O error, dev sdjt, sector 8
    /var/log/messages.1:May 27 22:48:04 testserver01 kernel: Buffer I/O error on device sdjt3, logical block 1
2) Saved some output for comparison later:
syminq -pdevfile > /var/tmp/syminq-pdevfile-prior
We expected the new device as hyper 2C27, so I looked for it in the output and, as expected, did not find it.
    vxdisk -o  alldgs list > /var/tmp/vxdisk-oalldgs.list
3) Checked the current state of the VM devices' DMP subpaths:
    for disk in `vxdisk -q list | awk '{print $1}'`
    do
    echo $disk
    vxdmpadm getsubpaths dmpnodename=${disk}
    done | tee -a /var/tmp/getsubpaths.out
Checked the output to make sure all but the root disk (expected) had two enabled paths:
NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
    ================================================================================
    cciss/c0d0   ENABLED(A)   -          c360       OTHER_DISKS  OTHER_DISKS
    sda
    NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
    ================================================================================
    sda          ENABLED(A)   -          c0         EMC          EMC0             -
    sddc         ENABLED(A)   -          c1         EMC          EMC0             -
    sdae
    NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
    ================================================================================
    sdeh         ENABLED(A)   -          c1         EMC          EMC0             -
    sdp          ENABLED(A)   -          c0         EMC          EMC0             -
    sdaf
    NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
    ================================================================================
    sdej         ENABLED(A)   -          c1         EMC          EMC0             -
    sdq          ENABLED(A)   -          c0         EMC          EMC0             -
    sdai
    NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
    ================================================================================
    sdai         ENABLED(A)   -          c0         EMC          EMC0             -
    sdfr         ENABLED(A)   -          c1         EMC          EMC0             -
    ... etc
    Ran other commands like
vxdg list (to ensure all diskgroups were enabled)
    df -k (no existing filesystem problems)
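Rather than eyeballing the loop's output, the ENABLED subpaths can be counted with awk. A canned two-path sample is used below so the parsing can be run standalone; anything other than two ENABLED paths per dmpnode (root disk aside) deserves a closer look:

```shell
#!/bin/sh
# count_enabled: count ENABLED subpath lines in vxdmpadm getsubpaths output.
# (DISABLED lines do not match, since "DISABLED" does not contain "ENABLED".)
count_enabled() {
    awk '/ENABLED/ { n++ } END { print n+0 }'
}

# Canned sample of one dmpnode's subpath listing:
count_enabled <<'EOF'
NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
================================================================================
sda          ENABLED(A)   -          c0         EMC          EMC0             -
sddc         ENABLED(A)   -          c1         EMC          EMC0             -
EOF
# prints: 2
```

In practice you would pipe `vxdmpadm getsubpaths dmpnodename=$disk` into it inside the loop above.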

    NOTE:

You can get your <dmpnodename> by running:
# vxdisk path | grep emcC5D3
Note that it is listed as DANAME (not SUBPATH).
4) Tried to scan the first SCSI bus:
    root@testserver01# pwd
    /sys/class/scsi_host/host0
    root@testserver01# echo '- - -' > scan
Waited to see if the scan would detect any new devices, and monitored the messages log for any messages relating to the scan.
Checked the output of syminq and checked the subpaths for each dmpnode as above. All paths remained ENABLED and there was no change.
5) Moved on to issuing a force LIP on the fibre path
a. Issuing the forcelip:
    root@testserver01# pwd
    /sys/class/fc_host/host0
    root@testserver01# echo "1" > issue_lip
    The command returned and I monitored the messages log. Waited for the disabled path messages to appear (as expected):
    May 31 02:44:49 testserver01 kernel: VxVM vxdmp V-5-0-112 disabled path 8/0x80 belonging to the dmpnode 201/0x540
    May 31 02:44:49 testserver01 kernel: VxVM vxdmp V-5-0-112 disabled path 67/0xa0 belonging to the dmpnode 201/0x5f0
    May 31 02:44:49 testserver01 kernel: VxVM vxdmp V-5-0-112 disabled path 67/0xb0 belonging to the dmpnode 201/0x6c0
    May 31 02:44:49 testserver01 kernel: VxVM vxdmp V-5-0-112 disabled path 128/0x60 belonging to the dmpnode 201/0x160
    ... etc
    I checked the output of vxdmpadm getsubpaths for each device to confirm the paths had gone offline:
    NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
    ================================================================================
    sddw         ENABLED(A)   -          c0         EMC          EMC0             -
    sdiv         ENABLED(A)   -          c1         EMC          EMC0             -
    sddm
    NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
    ================================================================================
    sdbg         DISABLED     -          c0         EMC          EMC0             -
    sdgq         ENABLED(A)   -          c1         EMC          EMC0             -
    sddn
    NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
    ================================================================================
    sddz         ENABLED(A)   -          c0         EMC          EMC0             -
    sdix         ENABLED(A)   -          c1         EMC          EMC0             -
    sddo
    NAME         STATE[A]   PATH-TYPE[M] CTLR-NAME  ENCLR-TYPE   ENCLR-NAME    ATTRS
    ================================================================================
    sdbh         DISABLED     -          c0         EMC          EMC0             -
    sdgr         ENABLED(A)   -          c1         EMC          EMC0             -
    sddp

Not all the paths were affected at once - a few minutes' wait confirmed they all went down as expected. The secondary path remained ENABLED, as expected.
I waited a little longer to get an estimate of how long all the paths took to go down - around 4 minutes.
b. Rescanning the fibre channel:
Before moving to the other path, I got Volume Manager to rescan the device bus to trigger DMP to wake up the DISABLED paths.
    root@testserver01# vxdisk scandisks fabric
I waited until all the primary paths became ENABLED again, checking for the DMP enabled messages in the messages log:
    May 31 02:49:43 testserver01 kernel: VxVM vxdmp V-5-0-148 enabled path 129/0x90 belonging to the dmpnode 201/0x10
    May 31 02:49:43 testserver01 kernel: VxVM vxdmp V-5-0-148 enabled path 129/0x50 belonging to the dmpnode 201/0x20
    May 31 02:49:43 testserver01 kernel: VxVM vxdmp V-5-0-148 enabled path 128/0x40 belonging to the dmpnode 201/0x30
    May 31 02:49:43 testserver01 kernel: VxVM vxdmp V-5-0-148 enabled path 129/0x30 belonging to the dmpnode 201/0x40
    .... etc
I also checked the vxdmpadm getsubpaths command until all primary paths returned to the ENABLED state.
I did the same for the second host controller at /sys/class/fc_host/host1.
6) Checking for the new disk device:
    Try syminq and/or symcfg disco

    After this, you can extend vxvm filesystem to the size you want, and re-import devices after that. Please refer to http://www.doxer.org/extending-filesystems-on-lvm-vxvmvxfs-how-to/ for more details.

    panic cpu thread page_unlock is not locked issue when using centos xen to create solaris 10

    May 25th, 2011 No comments

    Don't panic.

You can allocate more memory to the Solaris virtual machine (1024MB, for example) and try again.

In the Sun Forums thread, they say that 609 MB is the lowest you can go; give it a little more if you can.

    Use xming, xshell, putty, tightvnc to display linux gui on windows desktop (x11 forwarding when behind firewall)

    May 24th, 2011 10 comments

    Q1:How do I run X11 applications through Xming when there's no firewall?

    Step 1 - Configure Xming

    Let's assume that you want to run xclock on solaris/linux server 192.168.0.3, and want the gui display on your pc whose ip is 192.168.0.4.

    Firstly, download xming, install it on your windows pc system.

    You can go to http://sourceforge.net/projects/xming/files/ to download.

After this, you need to add 192.168.0.3 (linux/solaris) to the allowed server list on your Windows machine. Edit X0.hosts, which is located in the Xming installation directory (for example, C:\Program Files\Xming\X0.hosts), and add a new entry to it: 192.168.0.3, the IP address of the linux/solaris host that you want to run the X11 utility from.

    Then, restart xming(C:\Program Files\Xming\xming.exe) on your windows.

    Step 2 - Connect to remote host, configure it, and run X11 application

Log in to the linux/solaris server 192.168.0.3. Set the environment variable DISPLAY to the IP address of your Windows machine, with :0 appended:

    #export DISPLAY=192.168.0.4:0

    Then you must allow X11 forwarding in sshd configuration file. That is, set X11Forwarding to yes in /etc/ssh/sshd_config and restart your sshd daemon.
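If you prefer to make that sshd change non-interactively, here is a hedged sed sketch. It edits a throwaway copy of the config so the effect is visible; on a real host you would point CONF at /etc/ssh/sshd_config and then restart sshd (GNU sed/grep assumed):

```shell
#!/bin/sh
# Sketch: enable X11Forwarding in an sshd_config file non-interactively.
CONF=$(mktemp)
printf '#X11Forwarding no\nPort 22\n' > "$CONF"   # stand-in for /etc/ssh/sshd_config

# Replace any existing (possibly commented) X11Forwarding line, or append one.
if grep -q '^#\?X11Forwarding' "$CONF"; then
    sed -i 's/^#\?X11Forwarding.*/X11Forwarding yes/' "$CONF"
else
    echo 'X11Forwarding yes' >> "$CONF"
fi

grep '^X11Forwarding' "$CONF"
# then: service sshd restart  (or /etc/init.d/sshd restart)
```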

And on the solaris/linux server (192.168.0.3), run an X11 program, such as:

    /usr/bin/xclock #or /usr/openwin/bin/xclock on solaris

    You will then see a clock gui pop up in your windows pc.

PS: You may need to install xorg-x11-xauth on the remote host if you get an error starting xclock.

    Q2:How do I run X11 applications from remote host when that host is behind firewall?

If the remote host is behind a firewall, the method above will not work, as the communication will be blocked unless a firewall exception is implemented. To run X11 applications from a remote host behind a firewall, follow the steps below:

    Step 1 - Configure Xming

    This step is the same as step 1 in Q1, but I'll paste it here for your convenience:

    Let's assume that you want to run xclock on solaris/linux server 192.168.0.3, and want the gui display on your pc whose ip is 192.168.0.4.

    Firstly, download xming, install it on your windows pc system.

    You can go to http://sourceforge.net/projects/xming/files/ to download.

After this, you need to add 192.168.0.3 (linux/solaris) to the allowed server list on your Windows machine. Edit X0.hosts, which is located in the Xming installation directory (for example, C:\Program Files\Xming\X0.hosts), and add a new entry to it: 192.168.0.3, which is the IP address of the linux/solaris host that you want to run the X11 utility from.

    Then, restart xming(C:\Program Files\Xming\xming.exe) on your windows.

    Step 2 - Configure X11 forwarding on putty/xshell

    For Xshell:

After entering the remote hostname and logon username in Xshell, go to the Tunneling tab of the Advanced SSH Options dialog box, check "Forward X11 Connections to:", select "X DISPLAY:", and enter "localhost:0.0" next to it.

    xshell_x11_forwarding

    For Putty:

After entering the remote hostname and logon username in PuTTY, expand "Connection" in the left pane, expand "SSH", and then select "X11". Then check "Enable X11 forwarding" and enter "localhost:0.0" under "X display location".

    putty_x11_forwarding

    Step 3 - Connect to remote host, configure it, and run X11 application

Log in to the linux/solaris server 192.168.0.3. Set the environment variable DISPLAY to localhost:0:

    #export DISPLAY=localhost:0 #not 192.168.0.4:0 any more!

    Then you must allow X11 forwarding in sshd configuration file. That is, set X11Forwarding to yes in /etc/ssh/sshd_config and restart your sshd daemon.

And on the solaris/linux server (192.168.0.3), run an X11 program, such as:

    /usr/bin/xclock #or /usr/openwin/bin/xclock on solaris

    You will then see a clock gui pop up in your windows pc.

PS: You may need to install xorg-x11-xauth on the remote host if you get an error starting xclock.

    Q3:How do I connect to remote host through vnc client(such as tightvnc)?

In general, you first need to install and configure vnc-server on the remote host, then install the TightVNC client on your PC and connect to the remote host.

Details on installing vnc-server on the remote host are shown below:

    yum install gnome*

    yum grouplist
    yum groupinstall "X Window System" -y
    yum groupinstall "GNOME Desktop Environment" -y
    yum groupinstall "Graphical Internet" -y
    yum groupinstall "Graphics" -y
    yum install vnc-server
echo 'DESKTOP="GNOME"' > /etc/sysconfig/desktop
sed -i.bak '/VNCSERVERS=/d' /etc/sysconfig/vncservers
    echo "VNCSERVERS=\"1:root\"" >> /etc/sysconfig/vncservers

    mkdir -p /root/.vnc
vi /root/.vnc/xstartup

    #!/bin/sh

    # Uncomment the following two lines for normal desktop:
    # unset SESSION_MANAGER
    # exec /etc/X11/xinit/xinitrc

    [ -x /etc/vnc/xstartup ] && exec /etc/vnc/xstartup
    [ -r $HOME/.Xresources ] && xrdb $HOME/.Xresources
    xsetroot -solid grey
    vncconfig -iconic &
#xterm -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
    #twm &
    gnome-terminal &
    gnome-session &

    vncpasswd ~/.vnc/passwd #set password
    chmod 755 ~/.vnc ; chmod 600 ~/.vnc/passwd ; chmod 755 ~/.vnc/xstartup

chkconfig --level 345 vncserver on
chkconfig --list | grep vncserver
    service vncserver start

Q4:What if I want one linux box acting as X server, another linux box as X client, and I'm not sitting behind the Linux X server (meaning I have to connect to the Linux X server through VNC)?

    This may sound complex, but it's simple actually.

First, install vnc-server on the linux X server, following the steps in "Q3: How do I connect to remote host through vnc client(such as tightvnc)?" above.

Second, install the TightVNC viewer on the Windows box you sit behind and connect to the linux X server through it (xxx.xxx.xxx.xxx:1, for example). Run xhost + <linux X client> on the X server to allow access from the X client.

Then, on the linux X client, export DISPLAY to xxx.xxx.xxx.xxx:1 (the linux X server). Run an X program such as xclock, and you'll see the clock displayed in the X session shown in your Windows TightVNC viewer.

    PS:

1. You can change vncserver's resolution by editing /usr/bin/vncserver: change the default $geometry = "1024x768" to any resolution you like, for example $geometry = "1600x900". You can also control each user's vnc resolution by adding a line like VNCSERVERARGS[1]="-geometry 1600x900" in /etc/sysconfig/vncservers.

    2.For vnc server on ubuntu, you can refer to http://www.doxer.org/ubuntu-server-gnome-desktop-and-vncserver-configuration/

    Analysis of output by solaris format -> verify

    May 21st, 2011 No comments

Here's the output of the format -> verify command on my Solaris 10 box:

    format> verify

    Primary label contents:

    Volume name = < >
    ascii name =
    pcyl = 2609
    ncyl = 2607
    acyl = 2
    bcyl = 0
    nhead = 255
    nsect = 63
    Part Tag Flag Cylinders Size Blocks
    0 root wm 1 - 1306 10.00GB (1306/0/0) 20980890
    1 var wm 1307 - 2351 8.01GB (1045/0/0) 16787925
    2 backup wu 0 - 2606 19.97GB (2607/0/0) 41881455
    3 stand wm 2352 - 2606 1.95GB (255/0/0) 4096575
    4 unassigned wm 0 0 (0/0/0) 0
    5 unassigned wm 0 0 (0/0/0) 0
    6 unassigned wm 0 0 (0/0/0) 0
    7 unassigned wm 0 0 (0/0/0) 0
    8 boot wu 0 - 0 7.84MB (1/0/0) 16065
    9 unassigned wm 0 0 (0/0/0) 0

Now, let's analyze it:

    • Part

Solaris x86 has 10 slices (0-9) per disk; slices 8 and 9 are reserved by Solaris.

    • Tag

    This is used to indicate the purpose of the slice. Possible values are:

unassigned, boot, root, swap, usr, backup, stand, home, public, and private (the latter two are used by Sun StorEdge).

    • Flag

    wm - this slice is writable and mountable.

    wu - this slice is writable and unmountable.

    rm - this slice is readable and mountable.

    ru - this slice is readable and unmountable.

    • Cylinders

    This part shows the start and end cylinder number of the slice.

    • Size

    The size of the slice.

    • Blocks

This shows the slice size as (cylinders/tracks/sectors), followed by the total number of 512-byte blocks.
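You can check that the label's columns are mutually consistent: for slice 0 above, Blocks should equal cylinders x nhead x nsect, and Size is Blocks x 512 bytes:

```shell
#!/bin/sh
# Verify the label arithmetic for slice 0 from the output above:
# 1306 cylinders, nhead=255, nsect=63.
cyls=1306 nhead=255 nsect=63
blocks=$((cyls * nhead * nsect))
echo "blocks=$blocks"
# prints: blocks=20980890  (matches the Blocks column for slice 0)
echo "size=$((blocks * 512 / 1024 / 1024 / 1024))GB"
# integer GB; 20980890 * 512 bytes is the 10.00GB shown in the Size column
```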

    Now, let's create a slice and mount the filesystem:

    root@test / # format
    Searching for disks...done

    AVAILABLE DISK SELECTIONS:
    0. c1t0d0
    /pci@0,0/pci15ad,1976@10/sd@0,0
    1. c1t1d0
    /pci@0,0/pci15ad,1976@10/sd@1,0
    Specify disk (enter its number): 1 #select this disk
    selecting c1t1d0
    [disk formatted]

    FORMAT MENU:
    disk - select a disk
    type - select (define) a disk type
    partition - select (define) a partition table
    current - describe the current disk
    format - format and analyze the disk
    fdisk - run the fdisk program
    repair - repair a defective sector
    label - write label to the disk
    analyze - surface analysis
    defect - defect list management
    backup - search for backup labels
    verify - read and display labels
    save - save new disk/partition definitions
    inquiry - show vendor, product and revision
    volname - set 8-character volume name
    ! - execute , then return
    quit
    format> partition #select partition to check and create new slice

    PARTITION MENU:
    0 - change `0' partition
    1 - change `1' partition
    2 - change `2' partition
    3 - change `3' partition
    4 - change `4' partition
    5 - change `5' partition
    6 - change `6' partition
    7 - change `7' partition
    select - select a predefined table
    modify - modify a predefined partition table
    name - name the current table
    print - display the current table
    label - write partition map and label to the disk
    ! - execute , then return
    quit
    partition> print #check slice topology
    Current partition table (original):
    Total disk cylinders available: 2607 + 2 (reserved cylinders)

    Part Tag Flag Cylinders Size Blocks
    0 root wm 1 - 1306 10.00GB (1306/0/0) 20980890
    1 var wm 1307 - 2351 8.01GB (1045/0/0) 16787925
    2 backup wu 0 - 2606 19.97GB (2607/0/0) 41881455
    3 unassigned wm 0 0 (0/0/0) 0
    4 unassigned wm 0 0 (0/0/0) 0
    5 unassigned wm 0 0 (0/0/0) 0
    6 unassigned wm 0 0 (0/0/0) 0
    7 unassigned wm 0 0 (0/0/0) 0
    8 boot wu 0 - 0 7.84MB (1/0/0) 16065
    9 unassigned wm 0 0 (0/0/0) 0

    partition> 3 #select an unassigned slice. It will be /dev/rdsk/c1t1d0s3 after saving to format.dat
    Part Tag Flag Cylinders Size Blocks
    3 unassigned wm 0 0 (0/0/0) 0

    Enter partition id tag[unassigned]: stand
    Enter partition permission flags[wm]:
    Enter new starting cyl[1]: 2352
    Enter partition size[0b, 0c, 2352e, 0.00mb, 0.00gb]: $
    partition> label #write label to disk
    Ready to label disk, continue? y

    partition> name #name the current table
    Enter table name (remember quotes): hah

    partition> quit

    FORMAT MENU:
    disk - select a disk
    type - select (define) a disk type
    partition - select (define) a partition table
    current - describe the current disk
    format - format and analyze the disk
    fdisk - run the fdisk program
    repair - repair a defective sector
    label - write label to the disk
    analyze - surface analysis
    defect - defect list management
    backup - search for backup labels
    verify - read and display labels
    save - save new disk/partition definitions
    inquiry - show vendor, product and revision
    volname - set 8-character volume name
    ! - execute , then return
    quit
    format> save #save new disk/partition definitions
    Saving new disk and partition definitions
    Enter file name["./format.dat"]:
    format> quit

    root@test / # newfs /dev/rdsk/c1t1d0s3 #create filesystem on newly created slice
    newfs: construct a new file system /dev/rdsk/c1t1d0s3: (y/n)? y
    Warning: 1474 sector(s) in last cylinder unallocated
    /dev/rdsk/c1t1d0s3: 4096574 sectors in 667 cylinders of 48 tracks, 128 sectors
    2000.3MB in 42 cyl groups (16 c/g, 48.00MB/g, 11648 i/g)
    super-block backups (for fsck -F ufs -o b=#) at:
    32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
    3149856, 3248288, 3346720, 3445152, 3543584, 3642016, 3740448, 3838880,
    3937312, 4035744
    root@test / # fstyp /dev/dsk/c1t1d0s3 #check the filesystem type
    ufs
    root@test / # mkdir /hah
    root@test / # mount /dev/dsk/c1t1d0s3 /hah #mount filesystem
    root@test / # cd /hah/
    root@test hah # touch aa #create a file to have a test
    root@test hah # ls #finished, congratulations!
    aa lost+found

Netbackup wrong report – files not backed up while netbackup reported them as backed up

    May 20th, 2011 No comments

    Netbackup reported:

The following "sns*" files in /ftp/ftpst/datain/sns_a/success have been archived:
/ftp/ftpst/datain/sns_a/success/sns_closed_09042011_180000_2.data
/ftp/ftpst/datain/sns_a/success/sns_closed_09042011_180000_1.data

    However, these files were still within the directories.

Here's my understanding:

We have had similar issues with other hosts, which turned out to be more of an issue in the reporting of the script. It reports on the things it has selected for archiving rather than on what has actually been archived successfully.

    If netbackup does not manage to backup the files to tape because e.g. the drives are too busy then it doesn't delete the files but there is nothing in the script to report that this has occurred.

    Categories: Hardware, Storage Tags:

    dmx srdf failover procedure

    May 17th, 2011 No comments

1. Verify that all devices in the group are in the "Synchronized" state:
# symrdf -g dg01 query
2. Once all devices are confirmed as being "Synchronized", issue the command to failover:
# symrdf -g dg01 failover -establish
Once complete, the storage status can be checked again:
# symrdf -g livedsms_PRD query
NOTE: Do NOT trust any symcli output on any Solaris 2.6 host when working with the DMX-4.
IMPORTANT: ensure that the Symmetrix device group contains ALL devices:
vxdisk -g dg01 list #From a recent Sun Explorer

3. Import Storage
The vxdg should be imported as per normal:
# vxdctl enable
# vxdg -Cf import dg01
# vxvol -g dg01 startall
# mountall -F vxfs

4. Verify Storage
Ensure that the storage imported by the above commands appears to be correct by examining the output of the following commands:

# df -kl -F vxfs
# vxprint -g dg01 -v

    Categories: Hardware, Storage Tags:

    Steps for upgrading milestone to version 2.3

    May 16th, 2011 No comments

    Firstly, please refer to this page for prerequisite:

    Motorola milestone(xt702) rooted and busybox installation

    After reading that article through, you can do the following steps:

1. Go to the Recovery Console and do the following:

wipe data/factory reset
wipe cache partition

Then shut down the milestone.

    2.Run RSDLite on your pc, select Milestone_2.2.1_UK_Package.sbf.

3. Go to the Bootloader, connect your milestone to your pc, and click "start". When this finishes, reboot your milestone.

4. Now put Milestone_V2.3.2_MuzisoftUpdate_B6.2-SignFile.zip into OpenRecovery\updates, then put OpenRecovery and update.zip onto /sdcard. Wipe again (refer to step 1). In the Recovery Console, select "apply sdcard:update.zip" and press OK. Then select root phone, and after that reboot the system. You'll find your system has been rooted. Then select Milestone_V2.3.2_MuzisoftUpdate_B6.2-SignFile.zip.

5. Wipe again and reboot the milestone. Your system is now 2.3!

    Categories: Misc Tags:

    Motorola milestone(xt702) rooted and busybox installation

    May 16th, 2011 No comments

    1.Download update.zip & Android SDK & busybox & Motorola ADB driver & RSDLite & OpenRecovery

2. You need your system rooted for these operations. You can download update.zip here; put it on your sdcard, then go to the Recovery Console. To get to the Recovery Console, you first need to know your bootloader version: press the up arrow of the four-direction key together with the power button to boot the system, and the bootloader version will be displayed.

I. If the version is 90.78, press x + the power button; when a triangle appears, press the volume-up key and the camera button to enter the Recovery Console.

II. If the version is 90.73/90.74, press the camera button and the power button; when a triangle appears, press the volume-up key and the camera button to enter the Recovery Console.

In the Recovery Console, select "apply sdcard:update.zip" and press OK. Then select root phone, and after that reboot the system. You'll find your system has been rooted.

3. Install the Motorola ADB driver, then connect your phone to your PC via USB (Windows Media Sync mode). Set up your milestone like this: Settings -> Applications -> Development -> check USB Debugging. You'll then see the Motorola A853 device on your PC.

4. Install the Android SDK to C:\android; here's the layout:

C:\android
    add-ons
    platforms
    SDK Readme.txt
    SDK Setup.exe
    tools

5. Open cmd on your PC, unzip busybox, and put it in C:\android\tools, so it will be C:\android\tools\busybox. In the cmd console, run the following commands:

cd c:\android\tools
adb push c:\android\tools\busybox /sdcard/busybox #This may take several minutes to complete.

Now, run:

adb shell

Congratulations! You can now operate on your milestone. And now you can try su to become root:

su

If this is not successful, you'll need to root your system again; please refer to step 2.

    Categories: Misc Tags:

    How to Reload IAM DNS Configuration

    May 13th, 2011 No comments

    Firstly, backup your zone file under /apps/bind/var/named/master:

    #cp -R /apps/bind/var/named/master /apps/bind/var/named/master_bak

Now, run this command to reload:

#rndc -k /apps/bind/etc/rndc.key reload

    PS:

    Here's the man page for rndc, rndc.conf, named, named.conf, ndc:

    http://www.linuxmanpages.com/man8/rndc.8.php

And for the name service cache daemon, you may refer to the nscd man page.

    Categories: IT Architecture, Linux, Systems Tags:

    Clear error logs from fma before obu firmware patching

    May 12th, 2011 No comments

    1. Clear logs from fma:
    To clear the FMA faults and error logs from Solaris:
    a) Show faults in FMA
    # fmadm faulty
NOTE: Do not use 'fmadm faulty -a' in this step. When you specify the -a option, all resource information cached by the Fault Manager is listed, including faults which have already been corrected or where no recovery action is needed (see the 'fmadm' man page). The listing also includes information for resources that may no longer be present in the system.
    b) For each fault UUID listed in the 'fmadm faulty' run
    # fmadm repair <uuid>
    # fmadm faulty (to make sure the output is clean after repair)
    c) Clear ereports and resource cache
    # cd /var/fm/fmd
    # rm e* f* c*/eft/* r*/*
    d) Reset the fmd serd modules
    # fmadm reset cpumem-diagnosis
    # fmadm reset cpumem-retire
    # fmadm reset eft
    # fmadm reset io-retire
    e) Reboot the system
    To clear the FMA faults and error logs from Solaris without rebooting the system:
    a) Stop the fmd:
    # svcadm disable -s svc:/system/fmd:default
    b) Remove all files from the FMA log directories. This applies only to the files found in the FMA directories; all directories must be left intact.
    # cd /var/fm/fmd
    # find /var/fm/fmd -type f -exec ls {} \;
    c) Check that only files within the /var/fm/fmd directory are identified, then replace the ls with rm to remove them.
    # find /var/fm/fmd -type f -exec rm {} \;
    d) Restart fmd after the files are removed
    # svcadm enable svc:/system/fmd:default
    For more information about clearing logs please see Doc ID 1004229.1
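    Steps b and c above can be rehearsed in a scratch directory first: `find -type f` removes only files and leaves the directory tree intact, which is exactly what the procedure requires (the subdirectory names below are just illustrative, not the real fmd layout):

    ```shell
    # Scratch rehearsal: files go, directories stay (as steps b/c require).
    mkdir -p /tmp/fmd-demo/rsrc /tmp/fmd-demo/ckpt
    touch /tmp/fmd-demo/errlog /tmp/fmd-demo/rsrc/cache
    find /tmp/fmd-demo -type f -exec rm {} \;
    # No files remain, but the directory layout is untouched:
    find /tmp/fmd-demo -type f
    test -d /tmp/fmd-demo/rsrc && test -d /tmp/fmd-demo/ckpt && echo "directories preserved"
    ```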
    2. Upgrade the SP and OBP firmware by installing patch 145673-02.

    For how to do the firmware patching, please refer to http://www.doxer.org/obu-firmware-patching/

    Recovering from temporary loss of san storage

    May 9th, 2011 No comments

    Typically, you will see entries in the messages file relating to some fibre channel paths going offline and possibly coming back online.
    If all paths to the san storage are lost, Veritas will disable the disk groups; however, the filesystems will still be mounted.

    df -k will return messages like  "Cannot stat filesystem"

    To recover, you need to

    1/ Unmount any filesystems that are mounted from the disk groups that are disabled.
    You may find that there are processes that were trying to write to files in these filesystems, which prevents the unmount. You will probably have to kill those processes. The only thing to be careful of is WebSphere processes: if a WebSphere system is impacted, call the WebSphere team out first before starting to unmount filesystems or kill processes.
    On Red Hat systems, the unmount behaves the opposite way to Solaris, so do a lazy umount of the parent directory.
    So, for /apps/test/results:
    umount -l /apps/test
    should take care of it.

    2/ Once all filesystems that are affected have been unmounted, you can deport the vxdgs
    3/ Import the vxdgs
    From the vxdg man page
    When a disk group is imported, all disks in the disk group are stamped with the host's host ID. Typically, a disk group cannot be imported if any of its disks are stamped with a non-matching host ID. This provides a sanity check in cases where disks can be accessed from more than one host.
    If it is certain that a disk is not in use by another host (such as because a disk group was not cleanly deported), the -C option can be used to clear the existing host ID on all disks in the disk group as part of the import. A host ID can also be cleared using vxdisk clearimport.

    4/ Check volume status with vxprint -Ath and start the volumes

    5/ Mount the filesystems either from entries in /etc/vfstab or /etc/fstab or if its VCS cluster then use vcs to bring mount points only online.
    For any databases, e.g. oracle, make sure things like /apps/ora are mounted before you try to mount the database filesystems; otherwise there is no orahome and the dbs will not start.

    6/ If databases are involved, talk to the DBAs before starting any of the services; they may want to run some checks before starting the databases.
    7/ Start any services that need to be started, either under vcs or manually from the start script. You may need help from the application support team.

    Categories: Hardware, Storage Tags:

    cdsdisk in vxvm

    May 9th, 2011 No comments

    Confused by "auto:cdsdisk" from outputs of vxdisk list?

    For a disk to be accessible by multiple platforms, the disk must be consistently recognized by the platforms, and all platforms must be capable of performing I/O on the disk.

    CDS disks contain specific content at specific locations to identify or control access to the disk on different platforms.

    The same content and location are used on all CDS disks, independent of the platform on which the disks are initialized.

    Categories: Hardware, Storage Tags:

    Description of the Solaris Boot Archives bootadm

    May 7th, 2011 No comments

    Please download it here:
    solaris10-boot-archive (bootadm, failsafe)

    Network and Operating Systems Course Collection & VMware Course Collection download

    May 7th, 2011 No comments

    This download includes the Network and Operating Systems Course Collection and the VMware Course Collection; please click the following link to download:

    online-learning courses download

    Categories: IT Architecture Tags:

    com.ibm.ws.exception.RuntimeError: com.ibm.ws.exception.RuntimeError: Unable to start the CoordinatorComponentImpl

    May 6th, 2011 No comments

    When I tried to start nodeagent and app server of websphere, it said in /apps/WebSphere/Profiles/Node-stage01-04/logs/nodeagent/SystemErr.log:

    [06/05/11 00:17:36:556 BST] 0000000a SystemErr     R com.ibm.ws.exception.RuntimeError: com.ibm.ws.exception.RuntimeError: Unable to start the CoordinatorComponentImpl

    Then, I found this page: http://www-01.ibm.com/support/docview.wss?uid=swg21293285

    I tried to look for the RMM jar file, but failed; the file does not exist under install_root.

    Then, I re-ran startNode.sh twice, and magically, the problem resolved itself. Now the nodeagent and app server are both booting OK.

    Weird!

     

    remove mnt resource of vcs

    May 3rd, 2011 No comments

    Target: remove mnt_prd-ora-grdctl from group SG_dbPRD

    Here are the steps:

    haconf -makerw
    hares -delete mnt_prd-ora-grdctl
    haconf -dump -makero

    configure syslog to redirect WebSphere Message Broker (mq) messages to file

    April 29th, 2011 No comments

    On Linux® and UNIX® systems, all WebSphere® Message Broker messages (other than those generated by the command line utilities) are sent to the syslog, so it is useful to redirect user messages to a separate file.

    On UNIX, syslog entries are restricted in length, and messages sent to the syslog are truncated at the newline character. To record a large amount of data in a log on UNIX, set the Destination property on the Trace node to File or User Trace instead of Local Error Log.

    Before you create a broker on Linux or UNIX systems, configure the syslog daemon to redirect user messages to a file called user.log:

    Log on as root.
    Enter the following commands to create a file called user.log.
    On UNIX systems, enter the commands:

    touch /var/adm/user.log
    chown root:mqbrkrs /var/adm/user.log
    chmod 640 /var/adm/user.log

    On Linux, enter the commands:

    touch /var/log/user.log
    chown root:mqbrkrs /var/log/user.log
    chmod 640 /var/log/user.log

    Add the following line to the /etc/syslog.conf file (on later versions of SUSE Linux, this is /etc/syslog-ng.conf) to redirect debug level messages to the file user.log:
    On UNIX systems, enter the command:

    user.info /var/adm/user.log

    On Linux, enter the command:

    user.info /var/log/user.log

    You can use user.* - instead of user.info in the preceding examples.
    * means that information, notice, warning, and debug messages are caught
    - means that syslog does not synchronize the file after writing to it.
    You might experience a performance gain, but you can lose some data if the computer fails immediately after it has written to the file.
    Add a line to ensure that messages at the required level from the user facility are recorded. If you specify a level of info, all operational messages are recorded; these messages provide important information about the operation of the broker, and can be useful in diagnosing problems.
    Restart the syslog daemon.
    On AIX®, enter the command:

    refresh -s syslogd

    On HP-UX and Solaris, enter the command:

    kill -HUP `cat /etc/syslog.pid`

    On Linux, enter the command:

    /etc/init.d/syslogd restart

    or

    /etc/rc.d/init.d/syslogd restart

    for systems where rc.d is not a soft link
    For other syslog options, see the documentation for your operating system.
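    The HP-UX/Solaris restart above is the classic pid-file-plus-signal pattern, which you can see in miniature with any process (a background sleep stands in for syslogd here; unlike syslogd, which re-reads its configuration on HUP, plain sleep has no HUP handler and simply terminates):

    ```shell
    # Miniature of the `kill -HUP \`cat pidfile\`` pattern.
    sleep 60 &
    echo $! > /tmp/demo.pid
    kill -HUP `cat /tmp/demo.pid`
    wait `cat /tmp/demo.pid` 2>/dev/null || true
    # The process is gone; a daemon like syslogd would instead reload config:
    kill -0 `cat /tmp/demo.pid` 2>/dev/null || echo "signal delivered, process gone"
    ```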

    About other websphere message broker documents, please refer to http://publib.boulder.ibm.com/infocenter/wmbhelp/v6r0m0/index.jsp

    Categories: IT Architecture, Linux, Systems Tags: , ,

    Resolved:solaris patch panic – cannot start the system after patch

    April 27th, 2011 No comments

    Here goes the whole story:

    Step 1. Patch with PCA, then reboot -- -r

    Rebooting with command: boot -r

    Boot device: /pci@1c,600000/scsi@2/disk@0,0:a  File and args: -r

    SunOS Release 5.10 Version Generic_142900-13 64-bit

    Copyright 1983-2010 Sun Microsystems, Inc.  All rights reserved.

    Use is subject to license terms.

    WARNING: mod_load: cannot load module 'sharefs'

    WARNING: Cannot mount /etc/dfs/sharetab

     

    Hardware watchdog enabled

    /kernel/drv/sparcv9/ip: undefined symbol 'ddi_get_lbolt64'

    WARNING: mod_load: cannot load module 'ip'

    /kernel/fs/sparcv9/sockfs: undefined symbol 'sock_comm_create_function'

    /kernel/fs/sparcv9/sockfs: undefined symbol 'smod_lookup_byname'

    /kernel/fs/sparcv9/sockfs: undefined symbol 'sctp_disconnect'

    /kernel/fs/sparcv9/sockfs: undefined symbol 'sctp_getsockname'

    /kernel/fs/sparcv9/sockfs: undefined symbol 'nd_free'

    /kernel/fs/sparcv9/sockfs: undefined symbol 'nd_load'

    /kernel/fs/sparcv9/sockfs: undefined symbol 'UDP_WR'

     

    Step 2. ZFS rollback

    ok>boot -F failsafe

    #zfs rollback rpool/ROOT/sol10_sparc@pre_patched.142900-13_04.03.2011

     

    Step 3. Patch with PCA again, then "halt". The boot archive is not updated after patching, so we need to remove the boot_archive:

    ok>boot -F failsafe

    # mv /a/platform/`uname -i`/boot_archive /a/root/b_back

    # /a/sbin/bootadm update-archive -R /a

    # reboot

     

    Step 4. Server is patched

    root@solaris01~# uname -a

    SunOS solaris01 5.10 Generic_144488-06 sun4u sparc SUNW,Sun-Fire-V240

    Step 5. Restore the Zone

    root@solaris01~# zoneadm -z solaris01sub01 halt

    root@solaris01~# zoneadm -z solaris01sub01 detach

    root@solaris01~# zoneadm -z solaris01sub01 attach -u

    zoneadm: zone 'solaris01sub01': ERROR: attempt to downgrade package SUNWcsu, the source had patch 142053-03 which is not installed on this system

    # SUNWcsu is the core package; we can't live without its upgrade

     

    root@solaris01~# zoneadm list -civ

    ID NAME             STATUS     PATH                           BRAND    IP

    0 global           running    /                              native   shared

    - solaris01sub01   configured /export/zones/solaris01sub01   native   shared

     

    root@solaris01~# showrev -a|grep 142053-03

    Patch: 142053-03 Obsoletes: 142485-01 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsu

    Patch: 143954-04 Obsoletes: 142053-03, 142485-01, 142487-01, 142535-03 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsr

     

    root@solaris01~# zoneadm -z solaris01sub01 attach -u -F

    root@solaris01~# zoneadm -z solaris01sub01 boot

     

    root@solaris01~# zlogin solaris01sub01

    Last login: Sat Mar 26 05:15:27 on pts/2

    Sun Microsystems Inc.   SunOS 5.10    Generic   January 2005

    # The zone doesn't work

    root@solaris01sub01/# svcs

    STATE          STIME    FMRI

    online          5:40:24 svc:/system/svc/restarter:default

    ...

    offline         5:40:24 svc:/system/sysidtool:net

    offline         5:40:25 svc:/network/nfs/status:default

    ...

    offline         5:40:28 svc:/system/boot-archive-update:default

    uninitialized   5:40:27 svc:/application/x11/xfs:default

    ...

    uninitialized   5:40:28 svc:/network/rpc-100235_1/rpc_ticotsord:default

     

    root@solaris01sub01/# zfs list

    internal error: Unknown error

    Abort (core dumped)

    root@solaris01sub01/# logout

     

    root@solaris01~# zoneadm list -civ

    ID NAME             STATUS     PATH                           BRAND    IP

    0 global           running    /                              native   shared

    - solaris01sub01   configured /export/zones/solaris01sub01   native   shared

     

    root@solaris01~# zfs rollback rpool/export/zones/solaris01sub01@id24_pre_AMpatch12

     

    root@solaris01sub01/# showrev -a|grep  142053-03

    Patch: 142053-03 Obsoletes: 142485-01 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsu

     

    root@solaris01~# zlogin solaris01sub01

    Last login: Sat Mar 26 05:55:32 on pts/2

    Sun Microsystems Inc.   SunOS 5.10    Generic   January 2005

     

    root@solaris01sub01/# unzip -q 143954-04.zip

    root@solaris01sub01/# patchadd 143954-04

    Validating patches...

     

    Global patches.

     

    0 Patch 143954-04 is for global zone only - cannot be installed on non-global zone.

     

    No patches to install.

     

    root@solaris01/datastore/Explorers/patches# showrev -a|grep 142053-03

    Patch: 142053-03 Obsoletes: 142485-01 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsu

    Patch: 143954-04 Obsoletes: 142053-03, 142485-01, 142487-01, 142535-03 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsr

     

    root@solaris01/datastore/Explorers/patches# unzip -q 143954-04.zip

    root@solaris01/datastore/Explorers/patches# patchadd 143954-04

    Validating patches...

    ....

    Patch packages installed:

    SUNWcslr

    SUNWcsr

    SUNWcsu

     

    Done!

    root@solaris01/datastore/Explorers/patches# zlogin solaris01sub01

    [Connected to zone 'solaris01sub01' pts/2]

    Last login: Sat Mar 26 05:58:17 on pts/2

    Sun Microsystems Inc.   SunOS 5.10    Generic   January 2005

     

    # Now we have the new package

    root@solaris01sub01/# showrev -a|grep  142053-03

    Patch: 142053-03 Obsoletes: 142485-01 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsu

    Patch: 143954-04 Obsoletes: 142053-03, 142485-01, 142487-01, 142535-03 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsr, SUNWcsu

     

    root@solaris01/datastore/Explorers/patches# zoneadm -z solaris01sub01 halt

    root@solaris01/datastore/Explorers/patches# zoneadm -z solaris01sub01 detach

     

    # the upgrade goes very smoothly

    root@solaris01/datastore/Explorers/patches# zoneadm -z solaris01sub01 attach -u

    Getting the list of files to remove

    Removing 481 files

    Remove 11 of 11 packages

    Installing 11639 files

    Add 190 of 190 packages

    Updating editable files

    The file </var/sadm/system/logs/update_log> within the zone contains a log of the zone update.

    root@solaris01/datastore/Explorers/patches# zoneadm list -civ

    ID NAME           STATUS     PATH                           BRAND    IP

    0 global           running    /                              native   shared

    - solaris01sub01   installed  /export/zones/solaris01sub01   native   shared

    root@solaris01/datastore/Explorers/patches# zoneadm -z solaris01sub01 boot

    root@solaris01~# zoneadm list -civ

    ID NAME             STATUS     PATH                        BRAND    IP

    0 global           running    /                              native   shared

    31 solaris01sub01   running    /export/zones/solaris01sub01   native   shared

     

    root@solaris01~# zlogin solaris01sub01

    svc:/system/sysidtool:net failed.

    root@solaris01~# zlogin -C solaris01sub01

    # Choose Term Type 3) VT100; then everything goes well.
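    The obsolescence check that tripped up the first attach can be scripted against showrev output. A sketch, using a sample line from the showrev -a output above (the Requires: field is stripped first so only the Obsoletes: list is matched):

    ```shell
    # Does this installed-patch line show 142053-03 as obsoleted?
    line='Patch: 143954-04 Obsoletes: 142053-03, 142485-01, 142487-01, 142535-03 Requires: 127127-11, 139555-08 Incompatibles:  Packages: SUNWcslr, SUNWcsr'
    printf '%s\n' "$line" | sed 's/Requires:.*//' | grep -q 'Obsoletes:.*142053-03' \
      && echo "142053-03 is obsoleted"
    ```

    On a live box you would feed `showrev -a` itself through the same pipeline instead of a saved line.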

    Steps to create new volume of VxVM under solaris

    April 27th, 2011 1 comment

    1.Take snapshot of running processes, filesystem partitions, network connections:
    #/usr/ucb/ps aauuxxww>/running_processes.2011.04.25
    #df -k>/filesystem.2011.04.25
    #netstat -rnv>/networking.2011.04.25

    2.Check free space of dg(disk group), sector position of dm(disk media), and subdisk allocation:
    #vxprint

    3.Let's see the lengths and offsets of all subdisks under plexes:

    vxprint -st

    You can also use iostat -En.

    To see where the free sectors lie on the disks:

    #vxdg -g abinitio free

    #vxassist -g abinitio maxsize

    4.Create v(volume):
    vxassist -g abinitio make ora2 5g

    After this, you can see in vxprint:
    v  ora2         fsgen        ENABLED  10485760 -        ACTIVE   -       -
    pl ora2-01      ora2         ENABLED  10487040 -        ACTIVE   -       -
    sd disk1-14     ora2-01      ENABLED  10487040 0        -        -       -

    And,
    #fstyp /dev/vx/dsk/abinitio/ora2
    Unknown_fstyp (no matches)

    Let's create filesystem for volume ora2:

    Firstly, let's check the block size of other filesystem on the server:

    #fstyp -v /dev/vx/rdsk/abinitio/F41|grep bsize
    bsize  1024 size  10240 dsize  0  ninode 10240  nau 0
    defiextsize 0  ilbsize 0  immedlen 96  ndaddr 10

    It's 1024 Bytes.

    Then, let's create the filesystem for volume ora2:

    #mkfs -F vxfs -o bsize=1024,largefiles /dev/vx/rdsk/abinitio/ora2
    version 6 layout
    10485760 sectors, 5242880 blocks of size 1024, log size 16384 blocks
    largefiles supported
    After this, let's check it:
    # fstyp /dev/vx/rdsk/abinitio/ora2
    vxfs
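    As a sanity check, the numbers in the mkfs output above line up exactly with the 5 GB volume size:

    ```shell
    # 5 GB expressed in 512-byte sectors and in 1024-byte (bsize) blocks,
    # matching the "10485760 sectors, 5242880 blocks" line mkfs printed.
    bytes=$((5 * 1024 * 1024 * 1024))
    echo "$((bytes / 512)) sectors, $((bytes / 1024)) blocks"
    ```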

    5.Mount a File System:
    #mkdir /ora2
    #mount -F vxfs -o largefiles /dev/vx/dsk/abinitio/ora2 /ora2

    Now, cd to /ora2 and create a file to test it:
    # cd /ora2/
    # touch aa
    # ls -l
    total 0
    -rw-r--r--   1 root     other          0 Apr 25 02:42 aa
    drwxr-xr-x   2 root     root          96 Apr 25 02:40 lost+found

    If all are ok, you should add an entry into /etc/vfstab
    /dev/vx/dsk/abinitio/ora2       /dev/vx/rdsk/abinitio/ora2 /ora2        vxfs    2       yes     largefiles

    Categories: Hardware, Storage Tags:

    Difference between dm, dmp, rdmp, v, pl, sd, lun, c, t, d, s

    April 24th, 2011 No comments

    dm - disk media (the VxVM administrative name for a disk inside a disk group)

    dmp - Dynamic Multi-Pathing; /dev/vx/dmp holds the block device nodes that VxVM's multipathing driver presents

    rdmp - the raw (character-device) counterpart of dmp; /dev/vx/rdmp holds the raw DMP device nodes, which utilities such as mkfs and fstyp operate on

    v - volume

    pl - plex

    sd - subdisk

    lun - logical unit number

    NOTE:

    Disks (dm) must be added to a disk group (dg) under Veritas. A volume contains one or more plexes which in turn contain one or more subdisks.

    And again, for the famous c, t, d, s issue:

    From the computer's perspective, the SCSI LUN is only part of the full SCSI address. The full device address is made from the following parts:

    • c-part: controller ID of the host bus adapter,
    • t-part: target ID identifying the SCSI target on that bus,
    • d-part: disk ID identifying a LUN on that target,
    • s-part: slice ID identifying a specific slice on that disk.
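    As a quick illustration, a small sed one-liner (parse_ctds is a hypothetical helper, not part of any Solaris tool) can split such a name into its four parts:

    ```shell
    # Split a Solaris device name like c1t0d0s2 into its c/t/d/s parts.
    parse_ctds() {
      echo "$1" | sed -n 's/^c\([0-9]*\)t\([0-9]*\)d\([0-9]*\)s\([0-9]*\)$/controller=\1 target=\2 disk=\3 slice=\4/p'
    }
    parse_ctds c1t0d0s2   # controller=1 target=0 disk=0 slice=2
    ```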

     

    Resolved:[Load Manager Shared Memory]. Error is [28]: [No space left on device](for apache, pmserver etc. running on linux, solaris, unix)

    April 23rd, 2011 No comments

    This error may occur in pmserver, apache, oracle, rsync, up2date, and many other services running on Linux, Solaris, or other Unix systems, so it's a widespread problem if you search Google for the keyword: "[Load Manager Shared Memory].  Error is [28]: [No space left on device]".
    Now, let's take pmserver running on Solaris 10 as an example and demonstrate step by step how to solve this annoying problem.
    Firstly, from "[No space left on device]" and "Load Manager Shared Memory", we guessed it was caused by a shortage of memory, but after checking, we can see that there is enough memory to allocate:

    1.check the total memory size:

    # /usr/sbin/prtconf |grep -i mem
    Memory size: 32640 Megabytes
    memory (driver not attached)
    virtual-memory (driver not attached)

    2.check application project memory size:

    # su - sbintprd #as you have guessed, pmserver is run by user sbintprd on this box
    $ id -p
    uid=71269(sbintprd) gid=70772(sbintprd) projid=3(default)

    This means that pmserver is running inside 'default' project. Then let's check the setting of "default" project:

    # projects -l default
    default
    projid : 3
    comment: ""
    users  : (none)
    groups : (none)
    attribs: project.max-msg-ids=(privileged,256,deny)
    project.max-shm-memory=(privileged,17179869184,deny)

    # prctl -n project.max-shm-memory -i project default
    project: 3: default
    NAME    PRIVILEGE       VALUE    FLAG   ACTION                       RECIPIENT
    project.max-shm-memory
    privileged      16.0GB      -   deny                                 -
    system          16.0EB    max   deny                                 -

    16 GB is available to the 'default' project. So where does the memory shortage come from?

    Let's bump up the max-shm-memory size by 2 GB to see what happens:

    #prctl -n project.max-shm-memory -r -v 18gb -i project default

    After this, we tried to bounce the pmserver, but the problem is still there:

    #tail -f pmserver.log
    INFO : LM_36070 [Fri Apr 22 22:19:42 2011] : (25218|1) The server is running on a host with 32 logical processors.
    INFO : LM_36039 [Fri Apr 22 22:19:42 2011] : (25218|1) The maximum number of sessions that can run simultaneously is [10].
    FATAL ERROR : CMN_1011 [Fri Apr 22 22:19:42 2011] : (25218|1) Error allocating system shared memory of [2000000] bytes for [Load Manager Shared Memory].  Error is [28]: [No space left on device]
    FATAL ERROR : SF_34004 [Fri Apr 22 22:19:42 2011] : (25218|1) Server initialization failed.
    INFO : SF_34014 [Fri Apr 22 22:19:42 2011] : (25218|1) Server shut down.

    OK, then, we should think in other ways.

    As we know, Unix systems use System V shared memory between processes. We can use ipcs to check the information about active shared memory segments:

    # ipcs -m|grep sbintprd
    m  671088691   0          --rw------- sbintprd sbintprd

    #ipcs -mA|grep sbintprd|wc -l
    92

    And each of them holds a segment of 2000000 bytes:

    IPC status from <running system> as of Sat Apr 23 03:29:51 BST 2011
    T         ID      KEY        MODE        OWNER    GROUP  CREATOR   CGROUP NATTCH      SEGSZ  CPID  LPID   ATIME    DTIME    CTIME  ISMATTCH         PROJECT
    Shared Memory:
    m  671088691   0          --rw------- sbintprd sbintprd sbintprd sbintprd      1    2000000  7781 16109  3:28:35  3:28:50  2:01:43        0         default

    Now, we can conclude that the sbintprd user has over-allocated and is not freeing up the space. So let's clear the shared memory:

    #for i in `ipcs -m | grep prd | awk '{print $2}'`; do ipcrm -m $i; done
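    Before removing segments like this, it can help to total up how much memory an owner is actually holding. A sketch over saved `ipcs -ma`-style output (the sample line below is embedded for illustration; on a live box you would pipe ipcs directly):

    ```shell
    # Sum the SEGSZ column (field 10 in this layout) for one owner from a
    # saved `ipcs -ma` listing. The sample line matches the output above.
    cat > /tmp/ipcs.sample <<'EOF'
    m  671088691   0          --rw------- sbintprd sbintprd sbintprd sbintprd      1    2000000  7781 16109
    EOF
    awk '$1 == "m" && $5 == "sbintprd" {total += $10} END {print total+0 " bytes held"}' /tmp/ipcs.sample
    ```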

    After this step, the pmserver started successfully. From the log we can see:

    INFO : LM_36070 [Sat Apr 23 01:51:17 2011]
    : (5979|1) The server is running on a host with 32 logical processors.
    INFO : LM_36039 [Sat Apr 23 01:51:18 2011] : (5979|1) The maximum number of sessions that
    can run simultaneously is [10].
    INFO : CMN_1010 [Sat Apr 23 01:51:18 2011] : (5979|1) Allocated system shared memory [id =
    469762275] of [2000000] bytes for [Load Manager Shared Memory].
    INFO : LM_36095 [Sat Apr 23 01:51:50 2011] : (5979|1) Persistent session cache file
    cleanup is scheduled to run on [Sun Apr 24 01:51:50 2011].
    INFO : SF_34003 [Sat Apr 23 01:51:50 2011] : (5979|1) Server initialization completed.

    Problem resolved!

    The result of rm -rf / on linux(debian)

    April 20th, 2011 No comments

    210169:/# rm -rf /
    rm: cannot remove root directory `/'

    Haven't tested other distributions, maybe you can have a try? :D

    Install curl utility on solaris

    April 19th, 2011 1 comment

    Firstly, download the curl package from sunfreeware.com, unzip the tarball, and execute ./configure:
    # ./configure
    checking whether to enable maintainer-specific portions of Makefiles... no
    checking whether to enable debug build options... no
    checking whether to enable compiler optimizer... (assumed) yes
    checking whether to enable strict compiler warnings... no
    checking whether to enable compiler warnings as errors... no
    checking whether to enable curl debug memory tracking... no
    checking whether to enable c-ares for DNS lookups... no
    checking for sed... /usr/bin/sed
    checking for grep... /usr/bin/grep
    checking for egrep... /usr/bin/egrep
    checking for ar... /usr/local/bin/ar
    checking for a BSD-compatible install... ./install-sh -c
    checking whether build environment is sane... configure: error: newly created file is older than distributed files!
    Check your system clock

    This was because the system clock was off and the timestamps in the source tree were in the future. One workaround is to copy the files to another directory, which gives the copies fresh timestamps:

    #cp -r curl-7.21.4 /

    Then try to compile again.
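    Why the copy helps: cp without -p stamps each copied file with the current time, so timestamps that lay in the future disappear. A quick rehearsal in a scratch directory:

    ```shell
    # A file dated in the future (year 2030) loses its bogus timestamp
    # when copied, because cp (without -p) stamps the copy with "now".
    mkdir -p /tmp/ts-demo/src
    touch -t 203001010000 /tmp/ts-demo/src/configure
    cp -r /tmp/ts-demo/src /tmp/ts-demo/copy
    # The original still claims to be newer than the (freshly stamped) copy:
    [ /tmp/ts-demo/src/configure -nt /tmp/ts-demo/copy/configure ] && echo "copy has a current timestamp"
    ```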

    After this,  make and make install. If you come across the problem:

    configure: error: ar not found in PATH. Cannot continue without ar.

    That's because you haven't installed GNU Binutils on your Solaris system.

    You should download binutils package from sunfreeware.com and install it:

    #gzip -d binutils-2.21-sol10-x86-local.gz
    #pkgadd -d binutils-2.21-sol10-x86-local

    After this step, if you still receive alert:
    configure: error: no acceptable C compiler found in $PATH

    If you have installed gcc (we can check using find / -name gcc), then it's a problem with the PATH environment variable:
    #export PATH=$PATH:/usr/sfw/bin  #you can of course make the change permanent by editing /etc/profile
    Then,

    #make
    #make install
    Finished!

    Let's have a test of curl:
    # which curl
    /usr/local/bin/curl

    # curl -I -L www.doxer.org
    HTTP/1.1 200 OK
    Date: Mon, 18 Apr 2011 01:36:14 GMT
    Server: Apache/2.2.17 (Unix) mod_ssl/2.2.17 OpenSSL/0.9.8e-fips-rhel5 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
    X-Powered-By: PHP/5.2.9
    X-Pingback: http://www.doxer.org/xmlrpc.php
    Content-Type: text/html; charset=UTF-8
    Connection: Keep-Alive

    Using zpool clear to resolve zpool “cannot open” issue

    April 17th, 2011 2 comments

    I've created a zfs pool named tank using one removable disk, and it worked fine until I rebooted.

    As I unplugged the disk before rebooting, zpool list returned "FAULTED" in the HEALTH column:

    -bash-3.00# zpool list
    NAME   SIZE  ALLOC   FREE    CAP  HEALTH  ALTROOT
    tank      -      -      -      -  FAULTED  -

    Using zpool status -x to check:

    -bash-3.00# zpool status -x
    pool: tank
    state: UNAVAIL
    status: One or more devices could not be opened.  There are insufficient
    replicas for the pool to continue functioning.
    action: Attach the missing device and online it using 'zpool online'.
    see: http://www.sun.com/msg/ZFS-8000-3C
    scrub: none requested
    config:

    NAME        STATE     READ WRITE CKSUM
    tank        UNAVAIL      0     0     0  insufficient replicas
    c4t0d0    UNAVAIL      0     0     0  cannot open

    c4t0d0 is the name of the removable disk. Using iostat -En, I can see that the disk is not connected to the system:

    -bash-3.00# iostat -En
    c0d1             Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
    Model: ST31000524AS    Revision:  Serial No:             9VP Size: 1000.20GB <1000202305536 bytes>
    Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
    Illegal Request: 0
    c1t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
    Vendor: SAMSUNG  Product: CDRW/DVD SM-348B Revision: T509 Serial No:
    Size: 0.00GB <0 bytes>
    Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
    Illegal Request: 2 Predictive Failure Analysis: 0

    c0d1 is the hard disk, and c1t0d0 is SAMSUNG DVD.

    Now, after plugging the removable disk back in and running iostat -En again, I can see one more entry returned:

    c4t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
    Vendor: Generic  Product: USB Disk         Revision: 9.02 Serial No:
    Size: 20.00GB <20003880960 bytes>
    Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
    Illegal Request: 3 Predictive Failure Analysis: 0
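    If you script this kind of check, the size line can be pulled out of saved iostat -En output. A sketch (iostat_size is a hypothetical helper, not a system command; the sample text is taken from the output above):

    ```shell
    # Extract the reported size of one device from saved `iostat -En` output.
    iostat_size() {
      awk -v d="$1" '$1 == d {found=1} found && $1 == "Size:" {print $2; exit}' "$2"
    }
    cat > /tmp/iostat.sample <<'EOF'
    c4t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
    Vendor: Generic  Product: USB Disk         Revision: 9.02 Serial No:
    Size: 20.00GB <20003880960 bytes>
    EOF
    iostat_size c4t0d0 /tmp/iostat.sample   # prints 20.00GB
    ```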

    As I've plugged in the removable disk that makes up the zfs pool, I can now use zpool clear to have zfs recheck its own status:

    -bash-3.00# zpool clear tank

    After this, using zpool status -x, we can see the pool is in healthy status:

    -bash-3.00# zpool status -x
    all pools are healthy

    After getting zpool back to work, let's check the status of zfs:

    -bash-3.00# zfs list

    NAME                  USED  AVAIL  REFER  MOUNTPOINT
    tank                  254K  18.2G    22K  /tank
    tank/home             150K  18.2G    23K  /tank/home
    tank/home/firsttry    106K  18.2G   106K  /tank/home/firsttry
    tank/home/secondtry    21K  18.2G    21K  /export/secondtry

    When I tried to cd to /tank/home, it prompted:

    -bash-3.00# cd /tank/home
    -bash: cd: /tank/home: No such file or directory

    That's because the zfs filesystems have not been mounted yet; let's mount them all:

    -bash-3.00# zfs mount -a

    Ok, now we can cd to that directory:

    -bash-3.00# cd /tank/home/
    -bash-3.00# pwd
    /tank/home