Archive

Archive for the ‘Hardware’ Category

resolved – differences between zfs ARC L2ARC ZIL

January 31st, 2013 No comments
  • ARC

zfs ARC(adaptive replacement cache) is a very fast cache located in the server’s memory.

For example, our ZFS server with 12GB of RAM has 11GB dedicated to ARC, which means our ZFS server will be able to cache 11GB of the most accessed data. Any read requests for data in the cache can be served directly from the ARC memory cache instead of hitting the much slower hard drives. This creates a noticeable performance boost for data that is accessed frequently.

  • L2ARC

As a general rule, you want to install as much RAM into the server as you can to make the ARC as big as possible. At some point, adding more memory is just cost prohibitive. That is where the L2ARC becomes important. The L2ARC is the second level adaptive replacement cache. The L2ARC is often called “cache drives” in the ZFS systems.

L2ARC is a new layer between Disk and the cache (ARC) in main memory for ZFS. It uses dedicated storage devices to hold cached data. The main role of this cache is to boost the performance of random read workloads. The intended L2ARC devices include 10K/15K RPM disks like short-stroked disks, solid state disks (SSD), and other media with substantially faster read latency than disk.

  • ZIL

ZIL(ZFS Intent Log) exists for performance improvement on synchronous writes. Synchronous write is very slow than asynchronous write, but it’s more stable. Essentially, the intent log of a file system is nothing more than an insurance against power failures, a to-do list if you will, that keeps track of the stuff that needs to be updated on disk, even if the power fails (or something else happens that prevents the system from updating its disks).

To get better performance, use separated disks(SSD) for ZIL, such as zpool add pool log c2d0.

Now I’m giving you an true example about zfs ZIL/L2ARC/ARC on SUN ZFS 7320 head:

test-zfs# zpool iostat -v exalogic
capacity operations bandwidth
pool alloc free read write read write
————————- —– —– —– —– —– —–
exalogic 6.78T 17.7T 53 1.56K 991K 25.1M
mirror 772G 1.96T 6 133 111K 2.07M
c0t5000CCA01A5FDCACd0 – - 3 36 57.6K 2.07M #these are the physical disks
c0t5000CCA01A6F5CF4d0 – - 2 35 57.7K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A6F5D00d0 – - 2 36 56.2K 2.07M
c0t5000CCA01A6F64F4d0 – - 2 35 57.3K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A76A7B8d0 – - 2 36 56.3K 2.07M
c0t5000CCA01A746CCCd0 – - 2 36 56.8K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A749A88d0 – - 2 35 56.7K 2.07M
c0t5000CCA01A759E90d0 – - 2 35 56.1K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A767FDCd0 – - 2 35 56.1K 2.07M
c0t5000CCA01A782A40d0 – - 2 35 57.1K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A782D10d0 – - 2 35 57.2K 2.07M
c0t5000CCA01A7465F8d0 – - 2 35 56.3K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A7597FCd0 – - 2 35 57.6K 2.07M
c0t5000CCA01A7828F4d0 – - 2 35 56.2K 2.07M
mirror 772G 1.96T 5 133 110K 2.07M
c0t5000CCA01A7829ACd0 – - 2 35 57.1K 2.07M
c0t5000CCA01A78278Cd0 – - 2 35 57.4K 2.07M
mirror 772G 1.96T 6 133 111K 2.07M
c0t5000CCA01A736000d0 – - 3 35 57.3K 2.07M
c0t5000CCA01A738000d0 – - 2 35 57.3K 2.07M
c0t5000A72030061B82d0 224M 67.8G 0 98 1 1.62M #ZIL(SSD write cache, ZFS Intent Log)
c0t5000A72030061C70d0 224M 67.8G 0 98 1 1.62M
c0t5000A72030062135d0 223M 67.8G 0 98 1 1.62M
c0t5000A72030062146d0 224M 67.8G 0 98 1 1.62M
cache – - – - – -
c2t2d0 334G 143G 15 6 217K 652K #L2ARC(SSD cache drives)
c2t3d0 332G 145G 15 6 215K 649K
c2t4d0 333G 144G 11 6 169K 651K
c2t5d0 333G 144G 13 6 192K 650K
c2t2d0 – - 0 0 0 0
c2t3d0 – - 0 0 0 0
c2t4d0 – - 0 0 0 0
c2t5d0 – - 0 0 0 0

And as for ARC:

test-zfs:> status memory show
Memory:
Cache 63.4G bytes #ARC
Unused 17.3G bytes
Mgmt 561M bytes
Other 491M bytes
Kernel 14.3G bytes

Categories: Kernel, NAS, SAN, Storage Tags: ,

sun zfs firmware upgrade howto

January 29th, 2013 No comments

This article is going to talk about upgrading firmware for sun zfs 7320(you may find other series of sun zfs heads works too):

Categories: NAS, SAN, Storage Tags:

perl script for monitoring sun zfs memory usage

January 16th, 2013 No comments

On zfs’s aksh, I can check memory usage with the following:

test-zfs:> status memory show
Memory:
Cache 719M bytes
Unused 15.0G bytes
Mgmt 210M bytes
Other 332M bytes
Kernel 7.79G bytes

So now I want to collect this memory usae information automatically for SNMP’s use. Here’s the steps:

cpan> o conf prerequisites_policy follow
cpan> o conf commit

Since the host is using proxy to get on the internet, so in /etc/wgetrc:

http_proxy = http://www-proxy.us.example.com:80/
ftp_proxy = http://www-proxy.us.example.com:80/
use_proxy = on

Now install the Net::SSH::Perl perl module:

PERL_MM_USE_DEFAULT=1 perl -MCPAN -e ‘install Net::SSH::Perl’

And to confirm that Net::SSH::Perl was installed, run the following command:

perl -e ‘use Net::SSH::Perl’ #no output is good, as it means the package was installed successfully

Now here goes the perl script to get the memory usage of sun zfs head:

[root@test-centos ~]# cat /var/tmp/mrtg/zfs-test-zfs-memory.pl
#!/usr/bin/perl
use strict;
use warnings;
use Net::SSH::Perl;
my $host = ‘test-zfs’;
my $user = ‘root’;
my $password = ‘password’;

my $ssh = Net::SSH::Perl->new($host);
$ssh->login($user,$password);
my ($stdout,$stderr,$exit) = $ssh->cmd(“status memory show”);
$ssh->cmd(“exit”);
if($stderr){
print “ErrorCode:$exit\n”;
print “ErrorMsg:$stderr”;
} else {
my @std_arr = split(/\n/, $stdout);
shift @std_arr;
foreach(@std_arr) {
if ($_ =~ /.+\b\s+(.+)M\sbytes/){
$_=$1/1024;
}
elsif($_ =~ /.+\b\s+(.+)G\sbytes/){
$_=$1;
}
else{}
}
foreach(@std_arr) {
print $_.”\n”;
}
}
exit $exit;

PS:
If you get the following error messages during installation of a perl module:

[root@test-centos ~]# perl -MCPAN -e ‘install SOAP::Lite’
CPAN: Storable loaded ok
CPAN: LWP::UserAgent loaded ok
Fetching with LWP:
ftp://ftp.perl.org/pub/CPAN/authors/01mailrc.txt.gz
LWP failed with code[500] message[LWP::Protocol::MyFTP: connect: Connection timed out]
Fetching with Net::FTP:
ftp://ftp.perl.org/pub/CPAN/authors/01mailrc.txt.gz

Trying with “/usr/bin/links -source” to get
ftp://ftp.perl.org/pub/CPAN/authors/01mailrc.txt.gz
ELinks: Connection timed out

Then you may have a check of whether you’re using proxy to get on the internet(run cpan > o conf init to re-configure cpan; later you should set /etc/wgetrc: http_proxy, ftp_proxy, use_proxy).

 

Categories: NAS, Perl, Storage Tags: ,

zfs iops on nfs iscsi disk

January 5th, 2013 No comments

On zfs storage 7000 series BUI, you may found the following statistic:

This may seem quite weird as you can see that, NFSv3(3052) + iSCSI(1021) is larger than Disk(1583). As iops for protocal NFSv3/iSCSI finally goes to Disk, so why iops for the two protocals is larger than Disk iops?

Here’s the reason:

Disk operations for NFSv3 and iSCSI are logical operations. These logical operations are then combined/optimized by sun zfs storage and then finally go to physical Disk operations.

PS:

You may also wonder why Disk iops can be as high as 1583. As this number is the sum of all disk controllers of the zfs storage system. Here’s some ballpark numbers for HDD iops:

Categories: Hardware, NAS, SAN, Storage Tags:

zfs shared lun stoage set up for oracle RAC

January 4th, 2013 No comments
  • create iSCSI Target Group

Open zfs BUI, navigate through “Configuration” -> “SAN” -> “iSCSI Targets”. Then create new iSCSI Target by clicking plus sign. Give it an alias, and then select the Network interface(may be bond or LACP) you want to use. After creating this iSCSI target, drag the newly created target to the right side “iSCSI Target Groups” to create one iSCSI Target Group. You can give that iSCSI target group an name too. Note down the iSCSI Target Group’s iqn, this is important for later operations.(Network interfaces:use NAS interface)

  • create iSCSI Initiator Group

Before going on the next step, we need first get the iSCSI initiator IQN for each hosts we want LUN allocated. On each host, execute the following command to get the iqn for iscsi on linux platform(You can edit this file before read it, for example, make iqn name ended with` hostname` so it’s easier for later operations on LUN<do a /etc/init.d/iscsi restart after your modification to initiatorname.iscsi>):

[root@test-host ~]# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=<your host’s iqn name>

Now go back to zfs BUI, navigate through “Configuration” -> “SAN” -> “Initiators”. On the left side, click “iSCSI Initiators”, then click plus sign on it. Enter IQN you get from previos step and give it an name.(do this for each host you want iSCSI LUN allocated). After this, drag the newly created iSCSI initiator(s) from left side to form new iSCSI Initiator Groups on the right side(drag two items from the left to the same item on the right to form an group).


  • create shared LUNs for iSCSI Initiator Group

After this, we need now create LUNs for iSCSI Initiator Group(so that shared lun can be allocated, for example, oracle RAC need shared storage). Click on diskette sign on the just created iSCSI Initiator Group,select the project you want the LUN allocated from, give it a name, and assign the volume size. Select the right target group you created before(you can also create a new one e.g. RAC in shares).

  • scan shared LUNs from hosts

Now we’re going to operate on linux hosts. On each host you want iSCSI LUN allocated, do the following steps:

iscsiadm -m discovery -t st -p <ip address of your zfs storage>(use cluster’s ip if there’s zfs cluster)
iscsiadm -m node -T <variable, iSCSI Target Group iqn> -p <ip address of your zfs storage> -l
service iscsi restart

After these steps, you host(s) should now see the newly allocated iSCSI LUN(s), you can run fdisk -l to confirm.

Good luck!

Categories: NAS, Storage Tags:

resolved – error: conflicting types for ‘pci_pcie_cap’ – Infiniband driver OFED installation

December 10th, 2012 No comments

OFED is an abbr for “OpenFabrics Enterprise Distribution”. When installing OFED-1.5.4.1 on a centos/RHEL 5.8 linux system, I met the following problem:

In file included from /var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5.4.1/drivers/infiniband/core/notice.c:37:
/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5.4.1/kernel_addons/backport/2.6.18-EL5.7/include/linux/pci.h:164: error: conflicting types for ‘pci_pcie_cap’
include/linux/pci.h:1015: error: previous definition of ‘pci_pcie_cap’ was here
make[4]: *** [/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5.4.1/drivers/infiniband/core/notice.o] Error 1
make[3]: *** [/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5.4.1/drivers/infiniband/core] Error 2
make[2]: *** [/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5.4.1/drivers/infiniband] Error 2
make[1]: *** [_module_/var/tmp/OFED_topdir/BUILD/ofa_kernel-1.5.4.1] Error 2
make[1]: Leaving directory `/usr/src/kernels/2.6.18-308.4.1.0.1.el5-x86_64′
make: *** [kernel] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.70159 (%build)

After googling and experimenting, I can tell you the definite resolution for this problem -> install another version of OFED, OFED-1.5.3.2. Go to the following official site to download: http://www.openfabrics.org/downloads/OFED/ofed-1.5.3/

PS:

1. Here’s the file of OFED installation howto: OFED installation README.txt

2.Some Infiniband commands that may be useful for you:

service openibd status/start/stop
lspci |grep -i infini
ibnetdiscover
ibhosts
ibswitches
ibv_devinfo #Verify the status of ports by using ibv_devinfo: all connected ports should report a “PORT_ACTIVE” state