Archive

Posts Tagged ‘storage’

Resolved – Your boot partition is on a disk using the GPT partitioning scheme but this machine cannot boot using GPT

June 12th, 2014 No comments

Today when I tried to install Oracle VM Server on one server, the following error occurred:

Your boot partition is on a disk using the GPT partitioning scheme but this machine cannot boot using GPT. This can happen if there is not enough space on your hard drive(s) for the installation.

So to continue with the installation, I had to find a way to erase the GPT partition table on the drive.

To do this, the first step is to boot into Linux rescue mode from the CD-ROM:

rescue

Then, checking with fdisk -l, I could see that /dev/sda was the only disk whose GPT label needed erasing. So I used dd if=/dev/zero of=/dev/sda bs=512 count=1 to erase the GPT table:

 

[screenshot: fdisk -l output and the dd command erasing the GPT table]

 

After this, running fdisk -l again showed that the partition table was gone:

[screenshot: fdisk -l showing the partition table is gone]
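Note that GPT also keeps a backup copy of its header and partition entries in the last sectors of the disk, while the dd command above only clears the first sector. If the installer still complains about GPT, the sketch below wipes both copies (assuming /dev/sda as above; double-check the device name, dd is destructive):

# zero the protective MBR, primary GPT header and partition entries (first 34 sectors)
dd if=/dev/zero of=/dev/sda bs=512 count=34
# zero the backup GPT header and partition entries (last 33 sectors)
SECTORS=$(blockdev --getsz /dev/sda)
dd if=/dev/zero of=/dev/sda bs=512 count=33 seek=$((SECTORS - 33))
# if sgdisk (from the gdisk package) is available in the rescue environment,
# "sgdisk --zap-all /dev/sda" does both in one command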

Then I restarted the OVS server installation. When the following message is prompted, select "No":

[screenshot: installer prompt, select "No"]

And select "Yes" when the message below is prompted, so that a new partition table can be created:

[screenshot: installer prompt, select "Yes"]

The steps after this were the normal ones, and the installation went smoothly.

Common storage multipath Path-Management Software

December 12th, 2013 No comments
Vendor            Path-Management Software      URL
Hewlett-Packard   AutoPath, SecurePath          www.hp.com
Microsoft         MPIO                          www.microsoft.com
Hitachi           Dynamic Link Manager          www.hds.com
EMC               PowerPath                     www.emc.com
IBM               RDAC, MultiPath Driver        www.ibm.com
Sun               MPXIO                         www.sun.com
VERITAS           Dynamic Multipathing (DMP)    www.veritas.com

SAN Terminology

September 13th, 2013 No comments
Term
Description
SCSI Target
A SCSI Target is a storage system end-point that provides a service of processing SCSI commands and I/O requests from an initiator. A SCSI Target is created by the storage system's administrator, and is identified by unique addressing methods. A SCSI Target, once configured, consists of zero or more logical units.
SCSI Initiator
A SCSI Initiator is an application or production system end-point that is capable of initiating a SCSI session, sending SCSI commands and I/O requests. SCSI Initiators are also identified by unique addressing methods (See SCSI Targets).
Logical Unit
A Logical Unit is a term used to describe a component in a storage system. It is uniquely numbered, which creates what is referred to as a Logical Unit Number, or LUN. A storage system, being highly configurable, may contain many LUNs. These LUNs, when associated with one or more SCSI Targets, form a unique SCSI device, a device that can be accessed by one or more SCSI Initiators.
iSCSI
Internet SCSI, a protocol for sharing SCSI based storage over IP networks.
iSER
iSCSI Extensions for RDMA, a protocol that maps the iSCSI protocol over a network that provides RDMA services (e.g. InfiniBand). The iSER protocol is transparently selected by the iSCSI subsystem, based on the presence of correctly configured IB hardware. In the CLI and BUI, all iSER-capable components (targets and initiators) are managed as iSCSI components.
FC
Fibre Channel, a protocol for sharing SCSI based storage over a storage area network (SAN), consisting of fiber-optic cables, FC switches and HBAs.
SRP
SCSI RDMA Protocol, a protocol for sharing SCSI based storage over a network that provides RDMA services (i.e. InfiniBand).
IQN
An iSCSI qualified name, the unique identifier of a device in an iSCSI network. iSCSI uses the form iqn.date.authority:uniqueid for IQNs. For example, the appliance may use the IQN: iqn.1986-03.com.sun:02:c7824a5b-f3ea-6038-c79d-ca443337d92c to identify one of its iSCSI targets. This name shows that this is an iSCSI device built by a company registered in March of 1986. The naming authority is just the DNS name of the company reversed, in this case, "com.sun". Everything following is a unique ID that Sun uses to identify the target.
Target portal
When using the iSCSI protocol, the target portal refers to the unique combination of an IP address and TCP port number by which an initiator can contact a target.
Target portal group
When using the iSCSI protocol, a target portal group is a collection of target portals. Target portal groups are managed transparently; each network interface has a corresponding target portal group with that interface's active addresses. Binding a target to an interface advertises that iSCSI target using the portal group associated with that interface.
CHAP
Challenge-handshake authentication protocol, a security protocol which can authenticate a target to an initiator, an initiator to a target, or both.
RADIUS
A system for using a centralized server to perform CHAP authentication on behalf of storage nodes.
Target group
A set of targets. LUNs are exported over all the targets in one specific target group.
Initiator group
A set of initiators. When an initiator group is associated with a LUN, only initiators from that group may access the LUN.
Categories: Hardware, SAN, Storage Tags: ,

iostat dm- mapping to physical device

July 30th, 2013 No comments

-bash-3.2# iostat -xn 2

avg-cpu: %user %nice %system %iowait %steal %idle
0.02 0.00 0.48 0.00 0.21 99.29

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-5 0.00 0.00 0.00 1949.00 0.00 129648.00 66.52 30.66 15.77 0.51 100.20
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-3 0.00 0.00 0.00 1139.00 0.00 88752.00 77.92 22.92 20.09 0.83 95.00

Device: rBlk_nor/s wBlk_nor/s rBlk_dir/s wBlk_dir/s rBlk_svr/s wBlk_svr/s rops/s wops/s
nas-host:/export/test/repo 0.00 0.00 0.00 218444.00 0.00 218332.00 3084.50 3084.50

Then how can we know which physical device dm-3 maps to?

-bash-3.2# cat /sys/block/dm-3/dev
253:3 #this is major, minor number of dm-3

 

-bash-3.2# dmsetup ls
dmnfs6 (253, 6)
dmnfs5 (253, 5)
dmnfs4 (253, 4)
dmnfs3 (253, 3)
dmnfs2 (253, 2)
dmnfs1 (253, 1)
dmnfs0 (253, 0)

Then we can find which share the device maps to:

[root@testhost ~]# cat /sys/block/dm-3/size
8916075 #it's 4T

[root@testhost ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 9.7G 3.4G 5.9G 37% /
/dev/sda1 190M 57M 124M 32% /boot
tmpfs 2.9G 0 2.9G 0% /dev/shm
none 2.9G 648K 2.9G 1% /var/lib/xenstored
sharehost:/export/Service-2 4.1T 2.1T 2.0T 52% /media

So now we know that it is NFS that caused the heavy I/O.
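More generally, when a dm-N device sits on top of other block devices (LVM volumes, dm-multipath maps), the mapping can be resolved directly with dmsetup; a quick sketch (device names are only examples, and lsblk is only present on newer systems):

dmsetup ls                                    # every device-mapper device with its (major, minor)
dmsetup deps                                  # the (major, minor) of the block devices each map is built from
ls -l /dev/mapper/                            # friendly names linked to the dm-N nodes
lsblk -o NAME,MAJ:MIN,TYPE,SIZE,MOUNTPOINT    # prints the whole device stack as a tree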

PS:

device mapper(dm_mod module, dmsetup ls) http://en.wikipedia.org/wiki/Device_mapper

multiple devices(software raid, mdraid, /proc/mdstat) http://linux.die.net/man/8/mdadm and https://raid.wiki.kernel.org/index.php/Linux_Raid

dmraid(fake raid) https://wiki.archlinux.org/index.php/Installing_with_Fake_RAID

DM-MPIO(DM-Multipathing, multipath/multipathd, dm_multipath module, combined with SAN) http://en.wikipedia.org/wiki/Linux_DM_Multipath

Categories: Hardware, Storage Tags:

perl script for getting sun zfs head project and share usage info

April 26th, 2013 No comments

Sometimes you would like to know, on a Sun ZFS head, which project occupies the most space, and which shares within that project occupy the most space.

Here's a Perl script to do this (it's a little cumbersome, but it works anyway):

#!/usr/bin/perl
use strict;
use warnings;
use Net::SSH::Perl;

my $host     = 'test-zfs-host';
my $user     = 'root';
my $password = 'password';

my $ssh = Net::SSH::Perl->new($host);
$ssh->login($user, $password);

# list all projects on the ZFS head
my ($stdout, $stderr, $exit) = $ssh->cmd("shares show");
my @std_arr      = split(/:/, $stdout);
my @projects_arr = split(/\n/, $std_arr[2]);

foreach (@projects_arr) {
    $_ =~ s/^\s+|\s+$//g;
}
shift @projects_arr;
pop @projects_arr;
pop @projects_arr;

# collect space_total for every project
my @space_projects;
foreach (@projects_arr) {
    my ($stdout2, $stderr2, $exit2) = $ssh->cmd("shares select $_ get");
    my @stdout_arr = split(/\n/, $stdout2);
    my $space_temp = join("\n", grep(/space_total/, @stdout_arr));
    push(@space_projects, "project " . $_ . $space_temp);
}

# collect per-share usage lines for every project
my @space_projects3;
foreach (@projects_arr) {
    my ($stdout3, $stderr3, $exit3) = $ssh->cmd("shares select $_ ls");
    my @stdout_arr3 = split(/\n/, $stdout3);
    my @space_temp3 = grep(/\/export\//, @stdout_arr3);
    my @space_temp4 = grep(!/mountpoint/, @space_temp3);
    push(@space_projects3, @space_temp4);
}

open(my $temp1, '>', '/var/tmp/temp1') or die("cannot open file temp1");
open(my $temp2, '>', '/var/tmp/temp2') or die("cannot open file temp2");
open(my $temp3, '>', '/var/tmp/temp3') or die("cannot open file temp3");
open(my $temp4, '>', '/var/tmp/temp4') or die("cannot open file temp4");

# bucket projects by unit (T/G/M/K) so numeric sorting works within each bucket
my @var_T;
my @var_G;
my @var_M;
my @var_K;

foreach (@space_projects) {
    if ($_ =~ /.+space_total\s+(.*)K/) {
        push(@var_K, $_);
    }
    elsif ($_ =~ /.+space_total\s+(.*)M/) {
        push(@var_M, $_);
    }
    elsif ($_ =~ /.+space_total\s+(.*)G/) {
        push(@var_G, $_);
    }
    elsif ($_ =~ /.+space_total\s+(.*)T/) {
        push(@var_T, $_);
    }
}

select $temp1;
foreach (@var_T) {
    print $_ . "\n";
}
close $temp1;

select $temp2;
foreach (@var_G) {
    print $_ . "\n";
}
close $temp2;

select $temp3;
foreach (@var_M) {
    print $_ . "\n";
}
close $temp3;

select $temp4;
foreach (@var_K) {
    print $_ . "\n";
}
close $temp4;

system("echo \"======zfs project usage info(Descending)======\"");
system("sort -r -n -k 5 /var/tmp/temp1;sort -r -n -k 5 /var/tmp/temp2;sort -r -n -k 5 /var/tmp/temp3;sort -r -n -k 5 /var/tmp/temp4");

open(my $temp5, '>', '/var/tmp/temp5') or die("cannot open file temp5");
open(my $temp6, '>', '/var/tmp/temp6') or die("cannot open file temp6");
open(my $temp7, '>', '/var/tmp/temp7') or die("cannot open file temp7");
open(my $temp8, '>', '/var/tmp/temp8') or die("cannot open file temp8");

# bucket shares by unit as well
my @var_T_2;
my @var_G_2;
my @var_M_2;
my @var_K_2;

foreach (@space_projects3) {
    if ($_ =~ /\s+.*K\s+.*/) {
        push(@var_K_2, $_);
    }
    elsif ($_ =~ /\s+.*M\s+.*/) {
        push(@var_M_2, $_);
    }
    elsif ($_ =~ /\s+.*G\s+.*/) {
        push(@var_G_2, $_);
    }
    elsif ($_ =~ /\s+.*T\s+.*/) {
        push(@var_T_2, $_);
    }
}

select $temp5;
foreach (@var_T_2) {
    print $_ . "\n";
}
close $temp5;

select $temp6;
foreach (@var_G_2) {
    print $_ . "\n";
}
close $temp6;

select $temp7;
foreach (@var_M_2) {
    print $_ . "\n";
}
close $temp7;

select $temp8;
foreach (@var_K_2) {
    print $_ . "\n";
}
close $temp8;

system("echo \"\n\n\n======zfs share usage info(Descending)======\"");
system("sort -r -n -k 2 /var/tmp/temp5;sort -r -n -k 2 /var/tmp/temp6;sort -r -n -k 2 /var/tmp/temp7;sort -r -n -k 2 /var/tmp/temp8");

The output would be like:

======zfs project usage info(Descending)======
project DC2_DMZ space_total = 7.68T
project dc2_c9testga space_total = 1.10T
project fa_trialadcf space_total = 277G
project fa_rehydration space_total = 266G
project common space_total = 10.0G
project NODE_8 space_total = 93K
project default space_total = 31K

 

======zfs share usage info(Descending)======
Service_Mid-2 1.44T /export/DC2_DMZ/Service_Mid-2
Service_Web 1.22T /export/DC2_DMZ/Service_Web
dc2_shared_idm 743G /export/DC2_DMZ/dc2_shared_idm
Infra_Web 400G /export/DC2_DMZ/Infra_Web
nuviaq_local02 988M /export/DC2_DMZ/nuviaq_local02
sftp_staging 127M /export/DC2_DMZ/sftp_staging
sftp_local03 14.8M /export/DC2_DMZ/sftp_local03
sftp_manager_local01 85K /export/DC2_DMZ/sftp_manager_local01

resolved – differences between zfs ARC L2ARC ZIL

January 31st, 2013 No comments
  • ARC

The ZFS ARC (adaptive replacement cache) is a very fast cache located in the server’s memory.

For example, our ZFS server with 12GB of RAM has 11GB dedicated to ARC, which means our ZFS server will be able to cache 11GB of the most accessed data. Any read requests for data in the cache can be served directly from the ARC memory cache instead of hitting the much slower hard drives. This creates a noticeable performance boost for data that is accessed frequently.
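On a general Solaris/OpenSolaris ZFS host (not the appliance's restricted shell), you can read the current ARC size and hit/miss counters from the arcstats kstat; a minimal sketch:

kstat -p zfs:0:arcstats:size                            # current ARC size in bytes
kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses      # counters for a rough hit-rate estimate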

  • L2ARC

As a general rule, you want to install as much RAM into the server as you can to make the ARC as big as possible. At some point, adding more memory becomes cost prohibitive. That is where the L2ARC becomes important. The L2ARC is the second-level adaptive replacement cache. The L2ARC is often called “cache drives” in ZFS systems.

L2ARC is a new layer between Disk and the cache (ARC) in main memory for ZFS. It uses dedicated storage devices to hold cached data. The main role of this cache is to boost the performance of random read workloads. The intended L2ARC devices include 10K/15K RPM disks like short-stroked disks, solid state disks (SSD), and other media with substantially faster read latency than disk.

  • ZIL

The ZIL (ZFS Intent Log) exists to improve the performance of synchronous writes. Synchronous writes are much slower than asynchronous writes, but they are more reliable. Essentially, the intent log of a file system is nothing more than insurance against power failures, a to-do list if you will, that keeps track of the stuff that needs to be updated on disk, even if the power fails (or something else happens that prevents the system from updating its disks).

To get better performance, use separate disks (SSDs) for the ZIL, e.g. zpool add pool log c2d0.
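For example, a hedged sketch of adding SSDs as a mirrored ZIL (SLOG) and as L2ARC cache devices; the pool name "tank" and the cXdN device names are placeholders:

# mirrored log devices: synchronous writes land on fast, redundant SSDs
zpool add tank log mirror c2d0 c2d1
# cache (L2ARC) devices: no redundancy needed, a failed cache device only loses cached data
zpool add tank cache c2d2 c2d3
# verify the layout and watch per-vdev I/O
zpool status tank
zpool iostat -v tank 5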

Here's a real example of ZFS ZIL/L2ARC/ARC on a Sun ZFS 7320 head:

test-zfs# zpool iostat -v exalogic
capacity operations bandwidth
pool alloc free read write read write
------------------------- ----- ----- ----- ----- ----- -----
exalogic 6.78T 17.7T 53 1.56K 991K 25.1M
mirror 772G 1.96T 6 133 111K 2.07M
c0t5000CCA01A5FDCACd0 - - 3 36 57.6K 2.07M #these are the physical disks
c0t5000CCA01A6F5CF4d0 - - 2 35 57.7K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A6F5D00d0 - - 2 36 56.2K 2.07M
c0t5000CCA01A6F64F4d0 - - 2 35 57.3K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A76A7B8d0 - - 2 36 56.3K 2.07M
c0t5000CCA01A746CCCd0 - - 2 36 56.8K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A749A88d0 - - 2 35 56.7K 2.07M
c0t5000CCA01A759E90d0 - - 2 35 56.1K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A767FDCd0 - - 2 35 56.1K 2.07M
c0t5000CCA01A782A40d0 - - 2 35 57.1K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A782D10d0 - - 2 35 57.2K 2.07M
c0t5000CCA01A7465F8d0 - - 2 35 56.3K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A7597FCd0 - - 2 35 57.6K 2.07M
c0t5000CCA01A7828F4d0 - - 2 35 56.2K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A7829ACd0 - - 2 35 57.1K 2.07M
c0t5000CCA01A78278Cd0 - - 2 35 57.4K 2.07M
mirror 772G 1.96T 6 133 111K 2.07M
c0t5000CCA01A736000d0 - - 3 35 57.3K 2.07M
c0t5000CCA01A738000d0 - - 2 35 57.3K 2.07M
c0t5000A72030061B82d0 224M 67.8G 0 98 1 1.62M #ZIL(SSD write cache, ZFS Intent Log)
c0t5000A72030061C70d0 224M 67.8G 0 98 1 1.62M
c0t5000A72030062135d0 223M 67.8G 0 98 1 1.62M
c0t5000A72030062146d0 224M 67.8G 0 98 1 1.62M
cache - - - - - -
c2t2d0 334G 143G 15 6 217K 652K #L2ARC(SSD cache drives)
c2t3d0 332G 145G 15 6 215K 649K
c2t4d0 333G 144G 11 6 169K 651K
c2t5d0 333G 144G 13 6 192K 650K
c2t2d0 - - 0 0 0 0
c2t3d0 - - 0 0 0 0
c2t4d0 - - 0 0 0 0
c2t5d0 - - 0 0 0 0

And as for ARC:

test-zfs:> status memory show
Memory:
Cache 63.4G bytes #ARC
Unused 17.3G bytes
Mgmt 561M bytes
Other 491M bytes
Kernel 14.3G bytes

sun zfs firmware upgrade howto

January 29th, 2013 No comments

This article is about upgrading the firmware of a Sun ZFS 7320 (it may work for other series of Sun ZFS heads too):

Categories: Hardware, NAS, SAN, Storage Tags:

zfs iops on nfs iscsi disk

January 5th, 2013 No comments

On the ZFS Storage 7000 series BUI, you may see statistics like the following:

This may seem quite weird: NFSv3 (3052) + iSCSI (1021) is larger than Disk (1583). Since the I/O for the NFSv3/iSCSI protocols eventually goes to disk, why are the IOPS for the two protocols larger than the Disk IOPS?

Here's the reason:

The NFSv3 and iSCSI operations are logical operations. These logical operations are combined/optimized by the Sun ZFS storage appliance before finally going to the disks as physical Disk operations.

PS:

1. For sequential access to disks (like VOD), disk throughput becomes the performance bottleneck rather than IOPS. In contrast, IOPS limits disk performance when the access pattern is random.

2. For NAS performance analysis, here are two good articles (in Chinese): http://goo.gl/Q2M7JE http://www.storageonline.com.cn/storage/nas/the-nas-performance-analysis-overview/

3. You may also wonder how Disk IOPS can be as high as 1583; this number is the sum across all disk controllers of the ZFS storage system. As ballpark figures, a single 7,200 rpm HDD typically delivers roughly 75-100 IOPS, a 10,000 rpm drive roughly 125-150, and a 15,000 rpm drive roughly 175-210, so an appliance with dozens of spindles plus SSD cache can easily reach four-digit IOPS in total.

Categories: Hardware, NAS, SAN, Storage Tags:

zfs shared lun storage set up for oracle RAC

January 4th, 2013 No comments
  • create iSCSI Target Group

Open the ZFS BUI, navigate through "Configuration" -> "SAN" -> "iSCSI Targets". Then create a new iSCSI Target by clicking the plus sign. Give it an alias, and then select the network interface (may be bond or LACP) you want to use (check it from "Configuration" -> "Network" and "Configuration" -> "Cluster"). After creating this iSCSI target, drag the newly created target to the right-side "iSCSI Target Groups" pane to create an iSCSI Target Group. You can give that iSCSI target group a name too. Note down the iSCSI Target Group's IQN; this is important for later operations. (Network interfaces: use the NAS interface. You can select multiple interfaces.)

  • create iSCSI Initiator Group

Before going on to the next step, we first need to get the iSCSI initiator IQN of each host we want a LUN allocated to. On each host, execute the following command to get the IQN for iSCSI on the Linux platform (you can edit this file first, for example to make the IQN name end with the hostname so it is easier to recognize in later LUN operations; do a /etc/init.d/iscsi restart after modifying initiatorname.iscsi):

[root@test-host ~]# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=<your host's iqn name>
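For instance, a small sketch of setting a recognizable initiator name that ends with the hostname (the IQN prefix here is only an example):

echo "InitiatorName=iqn.1988-12.com.oracle.$(hostname -s)" > /etc/iscsi/initiatorname.iscsi
/etc/init.d/iscsi restart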

Now go back to the ZFS BUI and navigate through "Configuration" -> "SAN" -> "Initiators". On the left side, click "iSCSI Initiators", then click the plus sign. Enter the IQN you got from the previous step and give it a name (do this for each host you want an iSCSI LUN allocated to). After this, drag the newly created iSCSI initiator(s) from the left side to form a new iSCSI Initiator Group on the right side (drag two items from the left to the same item on the right to form one group).


  • create shared LUNs for iSCSI Initiator Group

After this, we now need to create LUNs for the iSCSI Initiator Group (so that shared LUNs can be allocated; for example, Oracle RAC needs shared storage). Click the diskette icon on the just-created iSCSI Initiator Group, select the project you want the LUN allocated from, give it a name, and assign the volume size. Select the right target group you created before (you can also create a new one, e.g. RAC, in Shares).

PS: You can also go to "Shares" -> "Luns" now and create LUN(s) using the target group you created and the default initiator group. Note that one LUN needs one iSCSI target, so you should create more iSCSI targets and add them to the iSCSI target group if you want more LUNs.

  • scan shared LUNs from hosts

Now we're going to operate on the Linux hosts. On each host you want an iSCSI LUN allocated to, do the following steps:

iscsiadm -m discovery -t st -p <ip address of your zfs storage>(use cluster's ip if there's zfs cluster) #Discover available targets from a discovery portal
iscsiadm -m node -T <variable, iSCSI Target Group iqn> -p <ip address of your zfs storage> -l #Log into a specific target. Or use output from above command($1 is --portal, $3 is --targetname, -l is --login)
service iscsi restart

After these steps, your host(s) should now see the newly allocated iSCSI LUN(s); you can run fdisk -l to confirm.

PS:

Here's more about iscsiadm:

iscsiadm -m node -T targetname -p ipaddress -u #Log out of a specific target
iscsiadm -m node -T targetname -p ipaddress #Display information about a target
iscsiadm -m node -s -T targetname -p ipaddress #Display statistics about a target
iscsiadm -m session #Display list of all current sessions logged in
iscsiadm -m discovery -o show #View iSCSI database regarding discovery
iscsiadm -m node -o show #View iSCSI database regarding targets to log into
iscsiadm -m session -o show #View iSCSI database regarding sessions logged into
multipath -ll #View if the targets are multipathed (MPIO)
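To make the iSCSI login persistent across reboots, something like the following sketch can be used (the target IQN and portal IP are placeholders):

iscsiadm -m node -T <iSCSI Target Group iqn> -p <ip address of your zfs storage> --op update -n node.startup -v automatic
chkconfig iscsid on   #on RHEL/OEL, also make sure the iSCSI services start at boot
chkconfig iscsi on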

Here's the CLI equivalent of this article (aksh of the Oracle ZFS appliance):

shares project rac186187
set mountpoint=/export/rac186187
set quota=1T
set readonly=true
set default_permissions=777

set default_user=root

set default_group=root
set sharenfs="sec=sys,ro=test.example.com,rw=@10.240.72.0/21:@10.240.22.0/21,root=@10.240.72.0/21"
#get reservation
#get pool
#get snapdir #snapdir = visible
#get default_group #default_group = root
#get default_user #default_user = root
#get exported #exported = true
commit

configuration net interfaces list
configuration net datalinks list

configuration san iscsi targets create rac186187
set alias=rac186187
set interfaces=aggr93001
commit

configuration san iscsi targets list
configuration san iscsi targets groups create rac186187
set name=rac186187
set targets=iqn.1986-03.com.sun:02:0a3dec0f-b830-c17a-c957-dd2dc1755a16
commit

[cat /etc/iscsi/initiatorname.iscsi] -> iqn.1988-12.com.oracle.testhost186, iqn.1988-12.com.oracle.testhost187
configuration san iscsi initiators create testhost186
set alias=testhost186
set initiator=iqn.1988-12.com.oracle.testhost186
commit

configuration san iscsi initiators create testhost187
set alias=testhost187
set initiator=iqn.1988-12.com.oracle.testhost187
commit

configuration san iscsi initiators list
configuration san iscsi initiators groups create rac186187
set name=rac186187
set initiators=iqn.1988-12.com.oracle.testhost186,iqn.1988-12.com.oracle.testhost187
commit

shares select rac186187 #project must set readonly=false
lun rac186187
set volsize=500G
set targetgroup=rac186187
set initiatorgroup=rac186187
commit

Categories: Hardware, NAS, Storage Tags:

how to turn on hba flags connected to EMC arrays

October 3rd, 2012 No comments

As per EMC's recommendation, the following flags should be enabled for VMware ESX hosts; if not, there will be performance issues:

Common_Serial_Number(C)
SCSI_3(SC3)
SPC2_Protocol_Version(SPC2)

Here's the command that'll do the trick:

sudo symmask -sid <sid> set hba_flags on C,SPC2,SC3 -enable -wwn <port wwn> -dir <dir number> -p <port number>

Categories: Hardware, NAS, SAN, Storage Tags:

Resolved – Errors found during scanning of LUN allocated from IBM XIV array

October 2nd, 2012 No comments

So here's the story:
After the LUN (on an IBM XIV array) was allocated, we ran 'xiv_fc_admin -R' to make the LUN visible to the OS (testhost-db-clstr-vol_37 is the new LUN's volume name):
root@testhost01 # xiv_devlist -o device,vol_name,vol_id
XIV Devices
-------------------------------------------------------------------
Device Vol Name Vol Id
-------------------------------------------------------------------
/dev/dsk/c2t500173804EE40140d19s2 testhost-db-clstr-vol_37 1974
-------------------------------------------------------------------
/dev/dsk/c2t500173804EE40150d19s2 testhost-db-clstr-vol_37 1974
-------------------------------------------------------------------
/dev/dsk/c4t500173804EE40142d19s2 testhost-db-clstr-vol_37 1974
-------------------------------------------------------------------
/dev/dsk/c4t500173804EE40152d19s2 testhost-db-clstr-vol_37 1974
-------------------------------------------------------------------
...
...
...
/dev/vx/dmp/xiv0_16 testhost-db-clstr-vol_17 1922
...
...
...
Non-XIV Devices
--------------------
Device
--------------------
/dev/vx/dmp/disk_0
--------------------
/dev/vx/dmp/disk_1
--------------------
/dev/vx/dmp/disk_2
--------------------
/dev/vx/dmp/disk_3
--------------------

Then I ran 'vxdctl enable' in order to make the DMP device visible to the OS, but an error message was prompted:
root@testhost01 # vxdctl enable
VxVM vxdctl ERROR V-5-1-0 Data Corruption Protection Activated - User Corrective Action Needed
VxVM vxdctl INFO V-5-1-0 To recover, first ensure that the OS device tree is up to date (requires OS specific commands).
VxVM vxdctl INFO V-5-1-0 Then, execute 'vxdisk rm' on the following devices before reinitiating device discovery:
xiv0_18, xiv0_18, xiv0_18, xiv0_18

After this, the new LUN disappeared from the output of 'xiv_devlist -o device,vol_name,vol_id' (testhost-db-clstr-vol_37 disappeared), and xiv0_18 (the DMP device of the new LUN) turned into an 'Unreachable device', see below:

root@testhost01 # xiv_devlist -o device,vol_name,vol_id
XIV Devices
-----------------------------------------------------
Device Vol Name Vol Id
-----------------------------------------------------
...
...
...
Non-XIV Devices
--------------------
Device
--------------------
/dev/vx/dmp/disk_0
--------------------
/dev/vx/dmp/disk_1
--------------------
/dev/vx/dmp/disk_2
--------------------
/dev/vx/dmp/disk_3
--------------------
Unreachable devices: /dev/vx/dmp/xiv0_18
Also, 'vxdisk list' showed:
root@testhost01 # vxdisk list xiv0_18
Device: xiv0_18
devicetag: xiv0_18
type: auto
flags: error private autoconfig
pubpaths: block=/dev/vx/dmp/xiv0_18s2 char=/dev/vx/rdmp/xiv0_18s2
guid: -
udid: IBM%5F2810XIV%5F4EE4%5F07B6
site: -
Multipathing information:
numpaths: 4
c4t500173804EE40142d19s2 state=disabled
c4t500173804EE40152d19s2 state=disabled
c2t500173804EE40150d19s2 state=disabled
c2t500173804EE40140d19s2 state=disabled

I tried to format the new DMP device (xiv0_18), but it failed with the info below:
root@testhost01 # format -d /dev/vx/dmp/xiv0_18
Searching for disks...done

c2t500173804EE40140d19: configured with capacity of 48.06GB
c2t500173804EE40150d19: configured with capacity of 48.06GB
c4t500173804EE40142d19: configured with capacity of 48.06GB
c4t500173804EE40152d19: configured with capacity of 48.06GB
Unable to find specified disk '/dev/vx/dmp/xiv0_18'.

Also, 'vxdisksetup -i' failed with info below:
root@testhost01 # vxdisksetup -i /dev/vx/dmp/xiv0_18
prtvtoc: /dev/vx/rdmp/xiv0_18: No such device or address

And, 'xiv_fc_admin -R' failed with info below:
root@testhost01 # xiv_fc_admin -R
ERROR: Error during command execution: vxdctl enabled
====================================================
OK, that's all of the symptoms and the headache, here's the solution:
====================================================

1. Run 'xiv_fc_admin -R' ("ERROR: Error during command execution: vxdctl enabled" will be prompted; ignore it. This step scans for the new LUN). You can also run devfsadm -c disk (not actually needed).
2. Now exclude the problematic paths of the DMP device (you can check the paths with vxdisk list xiv0_18):
root@testhost01 # vxdmpadm exclude vxvm path=c4t500173804EE40142d19s2
root@testhost01 # vxdmpadm exclude vxvm path=c4t500173804EE40152d19s2
root@testhost01 # vxdmpadm exclude vxvm path=c2t500173804EE40150d19s2
root@testhost01 # vxdmpadm exclude vxvm path=c2t500173804EE40140d19s2
3. Now run 'vxdctl enable'; the following error message will NOT be shown:
VxVM vxdctl ERROR V-5-1-0 Data Corruption Protection Activated - User Corrective Action Needed
VxVM vxdctl INFO V-5-1-0 To recover, first ensure that the OS device tree is up to date (requires OS specific commands).
VxVM vxdctl INFO V-5-1-0 Then, execute 'vxdisk rm' on the following devices before reinitiating device discovery:
xiv0_18, xiv0_18, xiv0_18, xiv0_18
4. Now include the problematic paths:
root@testhost01 # vxdmpadm include vxvm path=c4t500173804EE40142d19s2
root@testhost01 # vxdmpadm include vxvm path=c4t500173804EE40152d19s2
root@testhost01 # vxdmpadm include vxvm path=c2t500173804EE40150d19s2
root@testhost01 # vxdmpadm include vxvm path=c2t500173804EE40140d19s2

5. Run 'vxdctl enable'. After this, you should now see the DMP device in output of 'xiv_devlist -o device,vol_name,vol_id'
root@testhost01 # xiv_devlist -o device,vol_name,vol_id
XIV Devices
-----------------------------------------------------
Device Vol Name Vol Id
-----------------------------------------------------
...
...
...
-----------------------------------------------------
/dev/vx/dmp/xiv0_18 testhost-db-clstr-vol_37 1974
-----------------------------------------------------
...
...
...
Non-XIV Devices
--------------------
Device
--------------------
/dev/vx/dmp/disk_0
--------------------
/dev/vx/dmp/disk_1
--------------------
/dev/vx/dmp/disk_2
--------------------
/dev/vx/dmp/disk_3
--------------------

6. 'vxdisk list' will now show the DMP device (xiv0_18) as 'auto - - nolabel', so obviously we should now label the DMP device:
root@testhost01 # format -d xiv0_18
Searching for disks...done

c2t500173804EE40140d19: configured with capacity of 48.06GB
c2t500173804EE40150d19: configured with capacity of 48.06GB
c4t500173804EE40142d19: configured with capacity of 48.06GB
c4t500173804EE40152d19: configured with capacity of 48.06GB
Unable to find specified disk 'xiv0_18'.

root@testhost01 # vxdisk list xiv0_18
Device: xiv0_18
devicetag: xiv0_18
type: auto
flags: nolabel private autoconfig
pubpaths: block=/dev/vx/dmp/xiv0_18 char=/dev/vx/rdmp/xiv0_18
guid: -
udid: IBM%5F2810XIV%5F4EE4%5F07B6
site: -
errno: Disk is not usable
Multipathing information:
numpaths: 4
c4t500173804EE40142d19s2 state=enabled
c4t500173804EE40152d19s2 state=enabled
c2t500173804EE40150d19s2 state=enabled
c2t500173804EE40140d19s2 state=enabled

root@testhost01 # vxdisksetup -i /dev/vx/dmp/xiv0_18
prtvtoc: /dev/vx/rdmp/xiv0_18: Unable to read Disk geometry errno = 0x16

Not again! But don't panic this time. Now run format for each subpath of the DMP device (the subpaths can be found in the output of vxdisk list xiv0_18), for example:
root@testhost01 # format c4t500173804EE40142d19s2

c4t500173804EE40142d19s2: configured with capacity of 48.06GB
selecting c4t500173804EE40142d19s2
[disk formatted]
FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
volname - set 8-character volume name
!<cmd> - execute <cmd>, then return
quit
format> label
Ready to label disk, continue? yes

format> save
Saving new disk and partition definitions
Enter file name["./format.dat"]:
format> quit

7. After the subpaths are labelled, run 'vxdctl enable' again. After this, you'll find the DMP device changed its state from 'auto - - nolabel' to 'auto:none - - online invalid', and vxdisk list no longer shows the DMP device as 'Disk is not usable':
root@testhost01 # vxdisk list xiv0_18
Device: xiv0_18
devicetag: xiv0_18
type: auto
info: format=none
flags: online ready private autoconfig invalid
pubpaths: block=/dev/vx/dmp/xiv0_18s2 char=/dev/vx/rdmp/xiv0_18s2
guid: -
udid: IBM%5F2810XIV%5F4EE4%5F07B6
site: -
Multipathing information:
numpaths: 4
c4t500173804EE40142d19s2 state=enabled
c4t500173804EE40152d19s2 state=enabled
c2t500173804EE40150d19s2 state=enabled
c2t500173804EE40140d19s2 state=enabled

8. To add the new DMP device to Disk Group, the following steps should be followed:
/usr/lib/vxvm/bin/vxdisksetup -i xiv0_18
vxdg -g <dg_name> adddisk <disk_name>=<device name>
/usr/sbin/vxassist -g <dg_name> maxgrow <vol name> alloc=<newly-add-luns>
/etc/vx/bin/vxresize -g <dg_name> -bx <vol name> <new size>
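For example, with hypothetical names filled in (disk group testdg, volume vol01, new disk media name testdg02), the commands above would look like:

/usr/lib/vxvm/bin/vxdisksetup -i xiv0_18
vxdg -g testdg adddisk testdg02=xiv0_18
/usr/sbin/vxassist -g testdg maxgrow vol01 alloc=testdg02   #reports how much vol01 can grow using the new disk
/etc/vx/bin/vxresize -g testdg -bx vol01 <new size>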

 

Categories: Hardware, SAN, Storage Tags:

thin provisioning aka virtual provisioning on EMC Symmetrix

July 28th, 2012 No comments

For basic information about thin provisioning, here are some excerpts from Wikipedia and the HDS site:

Thin provisioning is the act of using virtualization technology to give the appearance of more physical resource than is actually available. It relies on on-demand allocation of blocks of data versus the traditional method of allocating all the blocks up front. This methodology eliminates almost all whitespace which helps avoid the poor utilization rates, often as low as 10%, that occur in the traditional storage allocation method where large pools of storage capacity are allocated to individual servers but remain unused (not written to). This traditional model is often called "fat" or "thick" provisioning.

Thin provisioning simplifies application storage provisioning by allowing administrators to draw from a central virtual pool without immediately adding physical disks. When an application requires more storage capacity, the storage system automatically allocates the necessary physical storage. This just-in-time method of provisioning decouples the provisioning of storage to an application from the physical addition of capacity to the storage system.

The term thin provisioning is applied to disk later in this article, but could refer to an allocation scheme for any resource. For example, real memory in a computer is typically thin provisioned to running tasks with some form of address translation technology doing the virtualization. Each task believes that it has real memory allocated. The sum of the allocated virtual memory assigned to tasks is typically greater than the total of real memory.

The following article shows the steps to create a thin pool, add and remove components from the pool, and delete a thin pool:

http://software-cluster.blogspot.co.uk/2011/09/create-emc-symmetrix-thin-devices.html

And for more information about thin provisioning on EMC Symmetrix V-Max with Veritas Storage Foundation, the following PDF file may help you:

EMC Symmetrix V-Max with Veritas Storage Foundation.pdf

PS:

1. symcfg -sid 1234 list -datadev #list all TDAT devices (the thin data devices which make up the thin pool; the thin pool provides the actual physical storage to thin devices)
2. symcfg -sid 1234 list -tdev #list all TDEV devices (thin devices)

3. The following article may be useful if you encounter problems when trying to perform storage reclamation (VxVM vxdg ERROR V-5-1-16063 Disk d1 is used by one or more subdisks which are pending to be reclaimed):

http://www.symantec.com/business/support/index?page=content&id=TECH162709

 

 

Categories: Hardware, SAN, Storage Tags: ,

Resolved – VxVM vxconfigd ERROR V-5-1-0 Segmentation violation – core dumped

July 25th, 2012 2 comments

When I tried to import a Veritas disk group today using vxdg -C import doxerdg, the following error message was shown:

VxVM vxdg ERROR V-5-1-684 IPC failure: Configuration daemon is not accessible
return code of vxdg import command is 768

VxVM vxconfigd DEBUG V-5-1-0 IMPORT: Trying to import the disk group using configuration database copy from emc5_0490
VxVM vxconfigd ERROR V-5-1-0 Segmentation violation - core dumped

Then I used pstack to print the stack trace of the core dump file:

root # pstack /var/core/core_doxerorg_vxconfigd_0_0_1343173375_140
core 'core_doxerorg_vxconfigd_0_0_1343173375_14056' of 14056: vxconfigd
ff134658 strcmp (fefc04e8, 103fba8, 0, 0, 31313537, 31313737) + 238
001208bc da_find_diskid (103fba8, 0, 0, 0, 0, 0) + 13c
002427dc dm_get_da (58f068, 103f5f8, 0, 0, 68796573, 0) + 14c
0023f304 ssb_check_disks (58f068, 0, f37328, fffffffc, 4, 0) + 3f4
0018f8d8 dg_import_start (58f068, 9c2088, ffbfed3c, 4, 0, 0) + 25d8
00184ec0 dg_reimport (0, ffbfedf4, 0, 0, 0, 0) + 288
00189648 dg_recover_all (50000, 160d, 3ec1bc, 1, 8e67c8, 447ab4) + 2a8
001f2f5c mode_set (2, ffbff870, 0, 0, 0, 0) + b4c
001e0a80 setup_mode (2, 3e90d4, 4d5c3c, 0, 6c650000, 6c650000) + 18
001e09a0 startup (4d0da8, 0, 0, fffffffc, 0, 4d5bcc) + 3e0
001e0178 main (1, ffbffa7c, ffbffa84, 44f000, 0, 0) + 1a98
000936c8 _start (0, 0, 0, 0, 0, 0) + b8

Then I tried restarting vxconfigd, but it failed as well:

root@doxer#/sbin/vxconfigd -k -x syslog

VxVM vxconfigd ERROR V-5-1-0 Segmentation violation - core dumped

After reading the man page of vxconfigd, I decided to use -r reset to reset all Veritas Volume Manager configuration information stored in the kernel as part of startup processing. But before doing this, we need to unmount all VxVM volumes, as stated in the man page:

The reset fails if any volume devices are in use, or if an imported shared disk group exists.

After unmounting all VxVM partitions, I ran the following command:

vxconfigd -k -r reset

After this, the importing of DGs succeeded.
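A quick sketch to verify vxconfigd is healthy again before remounting everything:

vxdctl mode               #should report: mode: enabled
vxdg -C import doxerdg    #retry the import that failed before
vxdg list                 #the disk group should now show up as enabled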

Categories: Hardware, SAN, Storage Tags: ,

resolved – df Input/output error from veritas vxfs

July 10th, 2012 No comments

If you get errors like the following when running df on file systems that have Veritas VxFS as the underlying FS:

df: `/BCV/testdg': Input/output error
df: `/BCV/testdg/ora': Input/output error
df: `/BCV/testdg/ora/archivelog01': Input/output error
df: `/BCV/testdg/ora/gg': Input/output error

And when using vxdg list, you find the DGs are in disabled status:

testarc_PRD disabled 1275297639.26.doxer
testdb_PRD disabled 1275297624.24.doxer

Don't panic; to resolve this, you need to do the following (a command-level sketch follows the list):

1) Force umount the failed FSs
2) Deport and re-import the failed disk groups
3) Fix any plexes which are in the DISABLED FAILED state
4) Run fsck (VxFS) on the failed FSs
5) Remount the needed FSs
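Here is a hedged sketch of those steps using the disk group names from the vxdg list output above; the volume and mount point pairings are hypothetical, and on Linux use -t vxfs instead of -F vxfs:

umount -f /BCV/testdg/ora                                      # 1) force umount the failed FSs
vxdg deport testdb_PRD                                         # 2) deport ...
vxdg import testdb_PRD                                         #    ... and re-import the failed DG
vxrecover -g testdb_PRD -s                                     # 3) start the volumes / recover failed plexes
fsck -F vxfs /dev/vx/rdsk/testdb_PRD/oravol                    # 4) fsck the VxFS file system
mount -F vxfs /dev/vx/dsk/testdb_PRD/oravol /BCV/testdg/ora    # 5) remount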

Categories: Hardware, SAN, Storage Tags:

difference between SCSI ISCSI FCP FCoE FCIP NFS CIFS DAS NAS SAN iFCP

May 30th, 2012 No comments

Here are some differences between SCSI, iSCSI, FCP, FCoE, FCIP, NFS, CIFS, DAS, NAS, SAN and iFCP (excerpts from the Internet):

Most storage networks use the SCSI protocol for communication between servers and disk drive devices. A mapping layer to other protocols is used to form a network: Fibre Channel Protocol (FCP), the most prominent one, is a mapping of SCSI over Fibre Channel; Fibre Channel over Ethernet (FCoE); iSCSI, mapping of SCSI over TCP/IP.

 

A storage area network (SAN) is a dedicated network that provides access to consolidated, block level data storage. SANs are primarily used to make storage devices, such as disk arrays, tape libraries, and optical jukeboxes, accessible to servers so that the devices appear like locally attached devices to the operating system. Historically, data centers first created "islands" of SCSI disk arrays as direct-attached storage (DAS), each dedicated to an application, and visible as a number of "virtual hard drives" (i.e. LUNs). Operating systems maintain their own file systems on their own dedicated, non-shared LUNs, as though they were local to themselves. If multiple systems were simply to attempt to share a LUN, these would interfere with each other and quickly corrupt the data. Any planned sharing of data on different computers within a LUN requires advanced solutions, such as SAN file systems or clustered computing. Despite such issues, SANs help to increase storage capacity utilization, since multiple servers consolidate their private storage space onto the disk arrays.

Sharing storage usually simplifies storage administration and adds flexibility since cables and storage devices do not have to be physically moved to shift storage from one server to another. SANs also tend to enable more effective disaster recovery processes. A SAN could span a distant location containing a secondary storage array. This enables storage replication either implemented by disk array controllers, by server software, or by specialized SAN devices. Since IP WANs are often the least costly method of long-distance transport, the Fibre Channel over IP (FCIP) and iSCSI protocols have been developed to allow SAN extension over IP networks. The traditional physical SCSI layer could only support a few meters of distance - not nearly enough to ensure business continuance in a disaster.

More about FCIP is here http://en.wikipedia.org/wiki/Fibre_Channel_over_IP (still use FC protocol)

A competing technology to FCIP is known as iFCP. It uses routing instead of tunneling to enable connectivity of Fibre Channel networks over IP.

IP SAN uses TCP as a transport mechanism for storage over Ethernet, and iSCSI encapsulates SCSI commands into TCP packets, thus enabling the transport of I/O block data over IP networks.

Network-attached storage (NAS), in contrast to SAN, uses file-based protocols such as NFS or SMB/CIFS where it is clear that the storage is remote, and computers request a portion of an abstract file rather than a disk block. The key difference between direct-attached storage (DAS) and NAS is that DAS is simply an extension to an existing server and is not necessarily networked. NAS is designed as an easy and self-contained solution for sharing files over the network.

 

FCoE works with standard Ethernet cards, cables and switches to handle Fibre Channel traffic at the data link layer, using Ethernet frames to encapsulate, route, and transport FC frames across an Ethernet network from one switch with Fibre Channel ports and attached devices to another, similarly equipped switch.

 

When an end user or application sends a request, the operating system generates the appropriate SCSI commands and data request, which then go through encapsulation and, if necessary, encryption procedures. A packet header is added before the resulting IP packets are transmitted over an Ethernet connection. When a packet is received, it is decrypted (if it was encrypted before transmission), and disassembled, separating the SCSI commands and request. The SCSI commands are sent on to the SCSI controller, and from there to the SCSI storage device. Because iSCSI is bi-directional, the protocol can also be used to return data in response to the original request.

 

Fibre channel is more flexible; devices can be as far as ten kilometers (about six miles) apart if optical fiber is used as the physical medium. Optical fiber is not required for shorter distances, however, because Fibre Channel also works using coaxial cable and ordinary telephone twisted pair.

 

Network File System (NFS) is a distributed file system protocol originally developed by Sun Microsystems in 1984,[1] allowing a user on a client computer to access files over a network in a manner similar to how local storage is accessed. CIFS, by contrast, is its Windows-based counterpart used for file sharing.

Categories: Hardware, NAS, SAN, Storage Tags:

check lun0 is the first mapped LUN before rescan-scsi-bus.sh(sg3_utils) on centos linux

May 26th, 2012 No comments

rescan-scsi-bus.sh from the sg3_utils package scans all the SCSI buses on the system, updating the SCSI layer to reflect new devices on the bus. But in order for this to work, LUN 0 must be the first mapped logical unit. Here's an excerpt from the wiki page:

LUN 0: There is one LUN which is required to exist in every target: zero. The logical unit with LUN zero is special in that it must implement a few specific commands, most notably Report LUNs, which is how an initiator can find out all the other LUNs in the target. But LUN zero need not provide any other services, such as a storage volume.

To confirm that LUN 0 is the first mapped LUN, do the following check if you're using Symantec Storage Foundation on EMC Symmetrix storage:

# prints any FA port that does NOT have LUN address 000 mapped
syminq -pdevfile | awk '!/^#/ {print $1,$4,$5}' | sort -n | uniq | while read _sym _FA _port
do
    if [[ -z "$(symcfg -sid $_sym -fa $_FA -p $_port -addr list | awk '$NF=="000"')" ]]
    then
        print Sym $_sym, FA $_FA:$_port
    fi
done

If you see a line like the following in the symcfg output (LUN address 000 is mapped), it proves that LUN 0 is the first mapped LUN, and you can continue with the rescan-scsi-bus.sh script to scan the new LUN:

Symmetrix ID: 000287890217

Director Device Name Attr Address
---------------------- ----------------------------- ---- --------------
Ident Symbolic Port Sym Physical VBUS TID LUN
------ -------- ---- ---- ----------------------- ---- --- ---

FA-4A 04A 0 0000 c1t600604844A56CA43d0s* VCM 0 00 000
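Once LUN 0 is confirmed, the scan itself is straightforward; a short sketch (options vary between sg3_utils versions):

rescan-scsi-bus.sh           # scan the SCSI hosts for new LUNs
# some versions also support:
# rescan-scsi-bus.sh -l      # scan LUNs other than LUN 0 as well
# rescan-scsi-bus.sh -r      # also remove devices that have disappeared
fdisk -l                     # confirm the new LUN(s) are visible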

PS:

For more information on what a Logical Unit Number (LUN) is, you may refer to:

http://en.wikipedia.org/wiki/Logical_Unit_Number

Categories: Hardware, SAN, Storage Tags: