Archive for the ‘Kernel’ Category

cpu hyperthreading vs dual core

May 14th, 2013

Note: This is from http://www.richweb.com/cpu_info

A hyperthreaded processor has the same number of function units as an older, non-hyperthreaded processor. It just has two execution contexts, so it can sometimes achieve better function unit utilization by letting more than one program execute concurrently. On the other hand, if you’re running two programs which compete for the same function units, there is little or no advantage to having both running “concurrently”: while one is using a function unit, the other is necessarily waiting for it.

A dual core processor literally has two times as many function units as a single-core processor, and can really run two programs concurrently, with no competition for function units.

Many dual core processors are built so that both cores share the same level 2 cache (some designs, such as the Pentium D, give each core its own). A dual processor (separate physical cpus) system differs in that each cpu has its own level 2 cache. This may sound like an advantage, and in some situations it can be, but in many cases a shared cache can be faster when the cores are working on the same or very similar tasks.

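When the source note was written, Hyperthreading looked like an older technology because Intel had dropped it from the Core 2 generation in favour of multiple cores. It returned with the Nehalem generation (Core i7 and newer Xeons), where it is combined with multiple cores. Hyperthreading can provide a marginal gain (around 10%) for some server workloads like mysql, but adding real cores is what delivers the large improvements.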

A dual core cpu running at 3.0GHz may be faster than a dual cpu (separate socket) system running at 3.0GHz for workloads that share data, because the cores can exchange data through the shared on-die cache instead of over the slower front-side bus.

The examples below detail how we determine what kind of cpu(s) are present.

The kernel data Linux exposes in /proc/cpuinfo will show each logical cpu with a unique processor number. A logical cpu can be a hyperthreading sibling, a shared core in a dual or quad core, or a separate physical cpu. We must look at the siblings, cpu cores and core id to tell the difference.

If the number of cores = the number of siblings for a given physical processor, then hyperthreading is OFF. If siblings is larger than cpu cores, hyperthreading is ON.

/bin/egrep 'processor|model name|cache size|core|sibling|physical' /proc/cpuinfo
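
The siblings vs. cpu cores rule above can also be checked with a few lines of code. This is only a minimal sketch: the function name summarize_cpuinfo is my own, it assumes the field names shown in the examples on this page, and it treats missing topology fields (as in Example 1) as a single core without hyperthreading.

```python
def summarize_cpuinfo(text):
    """Group logical CPUs by 'physical id' and compare siblings with cpu cores.

    `text` is the content of /proc/cpuinfo (or the egrep-filtered version).
    Returns {physical_id: {"logical", "cores", "siblings", "hyperthreading"}}.
    """
    packages = {}
    current = {}

    def flush(entry):
        if "processor" not in entry:
            return
        # Assumption: entries without topology fields are one single-core package.
        phys = entry.get("physical id", "0")
        pkg = packages.setdefault(phys, {
            "logical": 0,
            "cores": int(entry.get("cpu cores", 1)),
            "siblings": int(entry.get("siblings", 1)),
        })
        pkg["logical"] += 1

    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        # A new "processor" line starts the next logical CPU.
        if key == "processor" and current:
            flush(current)
            current = {}
        current[key] = value
    flush(current)

    for pkg in packages.values():
        # More siblings than cores means some logical CPUs are HT siblings.
        pkg["hyperthreading"] = pkg["siblings"] > pkg["cores"]
    return packages
```

On a Linux box it could be called as summarize_cpuinfo(open("/proc/cpuinfo").read()); each key of the result is one physical package (socket).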

 

Example 1: Single processor, 1 core, no Hyperthreading

processor	: 0
model name	: AMD Duron(tm) processor
cache size	: 64 KB

 

Example 2: Single processor, 1 core, Hyperthreading is enabled.

Notice how we have 2 siblings, but only 1 core. The physical cpu id is the same for both: 0.

processor	: 0
model name	: Intel(R) Pentium(R) 4 CPU 2.80GHz
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
processor	: 1
model name	: Intel(R) Pentium(R) 4 CPU 2.80GHz
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1

 

Example 3. Single socket Quad Core

Notice how each processor has its own core id. The number of siblings matches the number of cores, so there are no Hyperthreading siblings. Also notice the large L2 cache: 6 MB. The cache size field shows the cache available to each logical cpu; on this Xeon E5410 each 6 MB L2 cache is shared by a pair of cores, for 12 MB in the whole package.

processor	: 0
model name	: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
processor	: 1
model name	: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 4
processor	: 2
model name	: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 2
cpu cores	: 4
processor	: 3
model name	: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4

 

Example 3a. Single socket Dual Core

Again, each processor has its own core id and siblings equals cpu cores, so this is a dual core system without hyperthreading.

 

processor	: 0
model name	: Intel(R) Pentium(R) D CPU 3.00GHz
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
processor	: 1
model name	: Intel(R) Pentium(R) D CPU 3.00GHz
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2

 

Example 4. Dual Single core CPU, Hyperthreading ENABLED

This example shows that processors 0 and 2 share one physical cpu (physical id 0) and processors 1 and 3 share the other (physical id 3). The number of siblings is twice the number of cores, which is another clue that this is a system with hyperthreading enabled.

 

processor	: 0
model name	: Intel(R) Xeon(TM) CPU 3.60GHz
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
processor	: 1
model name	: Intel(R) Xeon(TM) CPU 3.60GHz
cache size	: 1024 KB
physical id	: 3
siblings	: 2
core id		: 0
cpu cores	: 1
processor	: 2
model name	: Intel(R) Xeon(TM) CPU 3.60GHz
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1
processor	: 3
model name	: Intel(R) Xeon(TM) CPU 3.60GHz
cache size	: 1024 KB
physical id	: 3
siblings	: 2
core id		: 0
cpu cores	: 1

 

Example 5. Dual CPU Dual Core No hyperthreading

Of the examples here this should be one of the most capable systems processor-wise. There are a total of 4 cores: 2 cores in each of 2 separate socketed physical cpus. Each core shares the 4 MB cache with its sibling core. The higher clock rate (3.0 GHz vs 2.33 GHz) should offer slightly better per-core performance than example 3.

 

processor	: 0
model name	: Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
cache size	: 4096 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
processor	: 1
model name	: Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
cache size	: 4096 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
processor	: 2
model name	: Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
cache size	: 4096 KB
physical id	: 3
siblings	: 2
core id		: 0
cpu cores	: 2
processor	: 3
model name	: Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
cache size	: 4096 KB
physical id	: 3
siblings	: 2
core id		: 1
cpu cores	: 2

PS:
For an explanation of the flags in Linux /proc/cpuinfo, you can refer to the following:
http://blog.incase.de/index.php/cpu-feature-flags-and-their-meanings/
Categories: Kernel, Linux

resolved – differences between zfs ARC L2ARC ZIL

January 31st, 2013
  • ARC

The ZFS ARC (adaptive replacement cache) is a very fast cache located in the server’s memory.

For example, our ZFS server with 12GB of RAM has 11GB dedicated to ARC, which means our ZFS server will be able to cache 11GB of the most accessed data. Any read requests for data in the cache can be served directly from the ARC memory cache instead of hitting the much slower hard drives. This creates a noticeable performance boost for data that is accessed frequently.

  • L2ARC

As a general rule, you want to install as much RAM into the server as you can to make the ARC as big as possible. At some point, adding more memory is just cost prohibitive. That is where the L2ARC becomes important. The L2ARC is the second level adaptive replacement cache. The L2ARC is often called “cache drives” in the ZFS systems.

The L2ARC is a layer between the disks and the cache (ARC) in main memory. It uses dedicated storage devices to hold cached data. The main role of this cache is to boost the performance of random read workloads. The intended L2ARC devices include 10K/15K RPM disks like short-stroked disks, solid state disks (SSD), and other media with substantially faster read latency than the pool disks.

  • ZIL

The ZIL (ZFS Intent Log) exists to improve the performance of synchronous writes. A synchronous write is much slower than an asynchronous write, but it is safer, because the caller is only told the write succeeded once the data is on stable storage. Essentially, the intent log of a file system is nothing more than insurance against power failures, a to-do list if you will, that keeps track of the stuff that needs to be updated on disk, even if the power fails (or something else happens that prevents the system from updating its disks).

To get better performance, use separate disks (SSD) for the ZIL, for example: zpool add pool log c2d0.

Now here is a real example of ZIL/L2ARC/ARC on a SUN ZFS 7320 head:

test-zfs# zpool iostat -v exalogic
                              capacity     operations     bandwidth
pool                        alloc   free   read  write   read  write
-------------------------   -----  -----  -----  -----  -----  -----
exalogic                    6.78T  17.7T     53  1.56K   991K  25.1M
  mirror                     772G  1.96T      6    133   111K  2.07M
    c0t5000CCA01A5FDCACd0       -      -      3     36  57.6K  2.07M  #these are the physical disks
    c0t5000CCA01A6F5CF4d0       -      -      2     35  57.7K  2.07M
  mirror                     772G  1.96T      5    133   110K  2.07M
    c0t5000CCA01A6F5D00d0       -      -      2     36  56.2K  2.07M
    c0t5000CCA01A6F64F4d0       -      -      2     35  57.3K  2.07M
  mirror                     772G  1.96T      5    133   110K  2.07M
    c0t5000CCA01A76A7B8d0       -      -      2     36  56.3K  2.07M
    c0t5000CCA01A746CCCd0       -      -      2     36  56.8K  2.07M
  mirror                     772G  1.96T      5    133   110K  2.07M
    c0t5000CCA01A749A88d0       -      -      2     35  56.7K  2.07M
    c0t5000CCA01A759E90d0       -      -      2     35  56.1K  2.07M
  mirror                     772G  1.96T      5    133   110K  2.07M
    c0t5000CCA01A767FDCd0       -      -      2     35  56.1K  2.07M
    c0t5000CCA01A782A40d0       -      -      2     35  57.1K  2.07M
  mirror                     772G  1.96T      5    133   110K  2.07M
    c0t5000CCA01A782D10d0       -      -      2     35  57.2K  2.07M
    c0t5000CCA01A7465F8d0       -      -      2     35  56.3K  2.07M
  mirror                     772G  1.96T      5    133   110K  2.07M
    c0t5000CCA01A7597FCd0       -      -      2     35  57.6K  2.07M
    c0t5000CCA01A7828F4d0       -      -      2     35  56.2K  2.07M
  mirror                     772G  1.96T      5    133   110K  2.07M
    c0t5000CCA01A7829ACd0       -      -      2     35  57.1K  2.07M
    c0t5000CCA01A78278Cd0       -      -      2     35  57.4K  2.07M
  mirror                     772G  1.96T      6    133   111K  2.07M
    c0t5000CCA01A736000d0       -      -      3     35  57.3K  2.07M
    c0t5000CCA01A738000d0       -      -      2     35  57.3K  2.07M
  c0t5000A72030061B82d0      224M  67.8G      0     98      1  1.62M  #ZIL(SSD write cache, ZFS Intent Log)
  c0t5000A72030061C70d0      224M  67.8G      0     98      1  1.62M
  c0t5000A72030062135d0      223M  67.8G      0     98      1  1.62M
  c0t5000A72030062146d0      224M  67.8G      0     98      1  1.62M
cache                           -      -      -      -      -      -
  c2t2d0                     334G   143G     15      6   217K   652K  #L2ARC(SSD cache drives)
  c2t3d0                     332G   145G     15      6   215K   649K
  c2t4d0                     333G   144G     11      6   169K   651K
  c2t5d0                     333G   144G     13      6   192K   650K
  c2t2d0                        -      -      0      0      0      0
  c2t3d0                        -      -      0      0      0      0
  c2t4d0                        -      -      0      0      0      0
  c2t5d0                        -      -      0      0      0      0

And as for ARC:

test-zfs:> status memory show
Memory:
Cache 63.4G bytes #ARC
Unused 17.3G bytes
Mgmt 561M bytes
Other 491M bytes
Kernel 14.3G bytes

Categories: Kernel, NAS, SAN, Storage

resolved – bnx2i dev eth0 does not support iscsi

September 19th, 2012

A weird incident occurred on a linux box. The box stopped responding to ping and ssh, although according to ifconfig and the /proc/net/bonding/bond0 file everything was running ok. After some googling, I found that the issue may be related to the NIC driver. I tried bringing the NICs down and up one by one, but got this error:

Bringing up loopback interface bond0: bnx2i: dev eth0 does not support iscsi

bnx2i: iSCSI not supported, dev=eth0

bonding: no command found in slaves file for bond bond0. Use +ifname or -ifname

At last, I tried restarting the whole network, i.e. /etc/init.d/network restart. That did the trick: networking was running ok again, and I could ping and ssh to the box without problems.

resolved – semget failed with status 28 failed oracle database starting up

August 2nd, 2012

Today we hit a problem with semaphores and were unable to start the oracle instances. Here’s the error message:

ORA-27154: post/wait create failed
ORA-27300: OS system dependent operation:semget failed with status: 28
ORA-27301: OS failure message: No space left on device
ORA-27302: failure occurred at: sskgpcreates

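It turns out the system had almost run out of semaphore arrays: 127 of the 128 allowed arrays were in use, so semget failed with status 28 (ENOSPC, “No space left on device”):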
#check limits of all IPC
root@doxer# ipcs -al

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 67108864
max total shared memory (kbytes) = 17179869184
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 1024000
max ops per semop call = 100
semaphore max value = 32767

------ Messages: Limits --------
max queues system wide = 16
max size of message (bytes) = 65536
default max size of queue (bytes) = 65536

#check summary of semaphores
root@doxer# ipcs -su

------ Semaphore Status --------
used arrays = 127
allocated semaphores = 16890

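To resolve this, we need to increase the maximum number of semaphore arrays (SEMMNI), which is the fourth field of kernel.sem:

root@doxer# cat /proc/sys/kernel/sem
250 1024000 100 128
                ^--- max number of arrays, needs to be increased

For example, sysctl -w kernel.sem="250 1024000 100 256" raises the limit to 256 at runtime (256 is just an example value); put the same setting into /etc/sysctl.conf to keep it across reboots.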
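
The four numbers in /proc/sys/kernel/sem are, in order, SEMMSL (max semaphores per array), SEMMNS (max semaphores system wide), SEMOPM (max ops per semop call) and SEMMNI (max number of arrays), which matches the ipcs output above. Here is a small sketch that parses that line and builds a new kernel.sem setting with a larger SEMMNI. The function names and the value 256 are only assumptions for illustration, so pick a limit that fits your own instances.

```python
def parse_kernel_sem(text):
    """Split the content of /proc/sys/kernel/sem into its four named limits."""
    semmsl, semmns, semopm, semmni = (int(x) for x in text.split())
    return {"SEMMSL": semmsl, "SEMMNS": semmns, "SEMOPM": semopm, "SEMMNI": semmni}


def raised_semmni_setting(text, new_semmni):
    """Return a sysctl.conf style line with only SEMMNI (4th field) raised."""
    limits = parse_kernel_sem(text)
    if new_semmni <= limits["SEMMNI"]:
        raise ValueError("new SEMMNI must be larger than the current value")
    limits["SEMMNI"] = new_semmni
    return "kernel.sem = {SEMMSL} {SEMMNS} {SEMOPM} {SEMMNI}".format(**limits)


# Values taken from the output shown in this post.
print(raised_semmni_setting("250 1024000 100 128", 256))
```

The sketch only prints the line; applying it is still a manual step (sysctl -w or /etc/sysctl.conf followed by sysctl -p).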

PS:

Here’s an example with toilets that describes differences between mutex and semaphore LOL http://koti.mbnet.fi/niclasw/MutexSemaphore.html

Categories: Kernel, Oracle DB

Resolved – bash /usr/bin/find Arg list too long

July 3rd, 2012

Have you ever hit an error like the following?

root@doxer# find /PRD/*/connectors/A01/QP*/*/logFiles/* -prune -name "*.log" -mtime +7 -type f | wc -l

bash: /usr/bin/find: Arg list too long

0

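The cause of the issue is a kernel limit on the total size of the arguments that can be passed to find (as well as ls and other utilities). Here the shell expands the wildcards into a huge list of file names before find even starts. ARG_MAX defines the maximum length, in bytes, of the arguments (plus environment) for a new process. You can get its value with: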

root@doxer# getconf ARG_MAX
1048320

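To quickly fix this, you can change into the directory first (replace the wildcards with concrete directory names such as subdir_NAME) and let find walk the directory itself, so the shell no longer expands every file name onto the command line. The "! -name ." part keeps -prune from discarding the starting directory itself:

cd /PRD/subdir_NAME/connectors/A01/QP*/*/logFiles/; find . ! -name . -prune -name "*.log" -mtime +7 -type f | wc -l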

11382
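
Another way around the limit is to do the wildcard expansion inside a single process, because ARG_MAX only applies when arguments are handed to a new process. The sketch below is my own illustration, not part of the original fix: the name count_old_logs is made up, and it mimics the find test (regular *.log files, -mtime +N) with Python’s glob module.

```python
import glob
import os
import time


def count_old_logs(pattern, days, now=None):
    """Count regular *.log files matching `pattern` older than `days` days.

    Like find's -mtime +N, the age is truncated to whole days first, so
    days=7 only matches files that are at least 8 full days old.
    """
    now = time.time() if now is None else now
    count = 0
    # iglob yields matches lazily inside this process, so no argument
    # list is ever passed to exec() and ARG_MAX does not come into play.
    for path in glob.iglob(pattern):
        if not path.endswith(".log"):
            continue
        # -type f: regular files only, symlinks are not followed.
        if os.path.islink(path) or not os.path.isfile(path):
            continue
        age_days = (now - os.path.getmtime(path)) // 86400
        if age_days > days:
            count += 1
    return count
```

For the case in this post the call would look like count_old_logs("/PRD/*/connectors/A01/QP*/*/logFiles/*", 7).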

PS:

  1. you can get all configuration values with getconf -a.
  2. For more solutions about the error “bash: /usr/bin/find: Arg list too long”, you can refer to http://www.in-ulm.de/~mascheck/various/argmax/
Categories: Kernel, Linux

ORA-00600 internal error caused by /tmp swap full

June 22nd, 2012

Today we encountered a problem where oracle stopped functioning. After some checking, we found the error was caused by /tmp running out of space. This was also confirmed by the OS logs:

Jun 20 17:43:59 tmpfs: [ID 518458 kern.warning] WARNING: /tmp: File system full, swap space limit exceeded

Oracle uses /tmp to compile PL/SQL code, so if there is no space left it is unable to compile or execute it, which causes functions, procedures, packages and triggers to time out. The same is described in oracle note ID 1389623.1.

So in order to prevent further occurrences of this error, we should increase /tmp on the system to at least 4 GB.

There is an Oracle parameter to change the default location of these temporary files (_ncomp_shared_objects_dir), but it’s not a dynamic parameter. Also, while there is a way to resize a tmpfs filesystem online, it is somewhat risky. So the best idea is to first bring down the Oracle DB on this host, then modify /etc/vfstab, and then reboot the whole system. This protects our data against the risk of corruption or loss, at the cost of some outage time.
So finally, here are the steps:
Amend the line in /etc/vfstab from:

swap - /tmp tmpfs - yes size=512m

To:

swap - /tmp tmpfs - yes size=4096m

Reboot the machine and bring up the Oracle DB.
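
To notice a filling /tmp before Oracle does, a simple free-space check could be run from cron. This is only a sketch of the idea: the function name tmp_has_headroom and the 1 GiB threshold are my own assumptions, not values from the oracle note.

```python
import shutil


def tmp_has_headroom(path="/tmp", min_free_bytes=1024 ** 3):
    """Return True if `path` still has at least `min_free_bytes` free.

    The 1 GiB default is an arbitrary example threshold.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free >= min_free_bytes
```

A cron job could call tmp_has_headroom() and send a mail or log a warning whenever it returns False.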